Who’s Waldo? Linking People Across Text and Images
Claire Yuqing Cui1 Apoorv Khandelwal1 Yoav Artzi1,2 Noah Snavely1,2 Hadar Averbuch-Elor1,2
1 Cornell University 2 Cornell Tech
{yc2296, ak2254, yoavartzi, snavely, hadarelor}@cornell.edu
Abstract
We present a task and benchmark dataset for person-centric
visual grounding, the problem of linking between people named
in a caption and people pictured in an image. In c ...









