GRAPE is a unified group-theoretic framework for positional encoding that subsumes multiplicative mechanisms (like RoPE) and additive mechanisms (like ALiBi and FoX) under a single mathematical ...
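To make the two families named above concrete, here is a minimal sketch in plain Python of one multiplicative mechanism (a RoPE-style rotation of feature pairs) and one additive mechanism (an ALiBi-style linear distance bias). It does not reproduce GRAPE's own formulation. The function names `rope_rotate`, `alibi_bias`, and `dot`, the `base` value of 10000, and the sample vectors are illustrative assumptions.

```python
import math

def rope_rotate(x, pos, base=10000.0):
    """Multiplicative encoding (RoPE-style sketch): rotate each consecutive
    pair of features by an angle proportional to the position."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("feature dimension must be even")
    out = []
    for k in range(d // 2):
        theta = pos * base ** (-2.0 * k / d)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * k], x[2 * k + 1]
        out.extend([a * c - b * s, a * s + b * c])
    return out

def alibi_bias(i, j, slope):
    """Additive encoding (ALiBi-style sketch): a penalty added to the
    attention logit that grows linearly with the distance |i - j|."""
    return -slope * abs(i - j)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical query/key vectors, chosen only for illustration.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# With rotations, the score depends only on the relative offset (here 2),
# not on the absolute positions.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 3))
score_b = dot(rope_rotate(q, 12), rope_rotate(k, 10))

# With an additive bias, position enters as a term added to the raw logit.
logit = dot(q, k) + alibi_bias(3, 0, slope=0.5)
```

In this sketch the multiplicative variant acts on the query and key vectors before the dot product, while the additive variant leaves them untouched and shifts the logit afterwards. In practice ALiBi is applied together with a causal mask and a different slope per attention head, both of which are omitted here.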
From "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. of the IEEE Int. Symposium on Multimedia (ISM), Dec. 2021. Written by Evlampios Apostolidis, ...