Poolingformer: Long Document Modeling with Pooling Attention
Hang Zhang 1,2, Yeyun Gong 3, Yelong Shen 4, Weisheng Li 5, Jiancheng Lv 1, Nan Duan 3, Weizhu Chen 4
Abstract
In this paper, we introduce a two-level attention schema, Poolingformer,
for long document modeling. Its first level uses a smaller sliding window
pattern to aggregate information from neighbors. Its second level employs
a larger window to increase receptive fields with pooling attention to
r ...
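
The two-level schema described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The excerpt does not specify the pooling operator, the window sizes, or how the two levels are combined, so the mean pooling, the residual sum, the absence of learned projections, and all names and parameters below (`window_attention`, `pooling_attention`, `two_level_attention`, `small_radius`, `large_radius`, `pool`) are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(q, k, v, radius):
    """First level: each position attends to neighbors within `radius`."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        weights = softmax(q[i] @ k[lo:hi].T / np.sqrt(d))
        out[i] = weights @ v[lo:hi]
    return out

def pooling_attention(q, k, v, radius, pool):
    """Second level: a larger window whose keys and values are pooled.

    Mean pooling over chunks of `pool` positions is an assumption; the
    excerpt only says that pooling attention is applied.
    """
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        starts = range(lo, hi, pool)
        k_pooled = np.stack([k[s:min(s + pool, hi)].mean(axis=0) for s in starts])
        v_pooled = np.stack([v[s:min(s + pool, hi)].mean(axis=0) for s in starts])
        weights = softmax(q[i] @ k_pooled.T / np.sqrt(d))
        out[i] = weights @ v_pooled
    return out

def two_level_attention(x, small_radius=2, large_radius=8, pool=4):
    # Assumption: the second level reads the first level's output and the
    # two results are summed. No learned projections are used here.
    first = window_attention(x, x, x, small_radius)
    second = pooling_attention(first, first, first, large_radius, pool)
    return first + second

x = np.random.default_rng(0).normal(size=(16, 4))
y = two_level_attention(x)
```

In this sketch each query in the second level scores one pooled key per chunk of `pool` positions rather than one key per position in the larger window, which is the sense in which pooling widens the receptive field without a proportional increase in the number of attention scores.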













