Relative Positional Encoding for Transformers with Linear Complexity
Antoine Liutkus * 1 Ondřej Cífka * 2 Shih-Lun Wu 3 4 5 Umut Şimşekli 6 Yi-Hsuan Yang 3 5 Gaël Richard 2
Abstract
Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity. In the meantime, relative positional encoding (RPE) was proposed as beneficial for classical Transformers and consists in exploiting lags instead of absolute positions …

