VQA: Visual Question Answering
Authors: Aishwarya Agrawal, Jiasen Lu, Stanislaw Antol, Margaret Mitchell, C. Lawrence Zitnick, Devi Parikh, Dhruv Batra
Affiliations: 1 Virginia Tech
2 Microsoft Research
3 Facebook AI Research
4 Georgia Institute of Technology
Journal: International Journal of Computer Vision, 2017, Vol. 123 (1), pp. 4-31
Source database: Springer Nature Journal
DOI: 10.1007/s11263-016-0966-6
Keywords: Visual Question Answering
Abstract: We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We...
Full-text access: Springer Nature (partner)
Source journal impact factor: 3.623 (2012)
