Multi-Scale Vision Longformer:
A New Vision Transformer for High-Resolution Image Encoding
Pengchuan Zhang1 Xiyang Dai1 Jianwei Yang1 Bin Xiao1 Lu Yuan1
Lei Zhang2 Jianfeng Gao1
1 Microsoft Corporation {penzhan,xidai,jianwei.yang,bin.xiao,luyuan,jfgao}@microsoft.com
2 International Digital Economy Academy (IDEA) leizhang@idea.edu.cn
vision Transformer that can ...









