Learning Image Representations Tied to Egomotion from Unlabeled Video
Authors: Dinesh Jayaraman, Kristen Grauman
Affiliation: The University of Texas at Austin
Journal: International Journal of Computer Vision, 2017, Vol. 125(1-3), pp. 136-161
Source Database: Springer Nature Journal
DOI: 10.1007/s11263-017-1001-2
Keywords: Feature Space; Convolutional Neural Network; Feature Learning; Temporal Coherence; Scene Recognition
Abstract (original language): Understanding how images of objects and scenes behave in response to specific egomotions is a crucial aspect of proper visual development, yet existing visual learning methods are conspicuously disconnected from the physical source of their images. We propose a new “embodied” visual learning paradigm, exploiting proprioceptive motor signals to train visual representations from egocentric video with no manual supervision. Specifically, we enforce that our learned features exhibit equivariance, i.e., they respond predictably to transformations associated with distinct egomotions. With three datasets, we show that our unsupervised feature learning approach significantly outperforms previous approaches on visual recognition and next-best-view prediction tasks. In the most challenging test, we...
Full-Text Access: Springer Nature (partner)
Impact Factor: 3.623 (2012)
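The equivariance idea in the abstract can be sketched concretely: for each egomotion class g, features z' of the frame after the motion should be a predictable function of the features z before it, e.g. a learned linear map M_g with z' ≈ M_g z. The following is a minimal NumPy sketch under that linear-map assumption, with synthetic stand-in features; all names and dimensions are illustrative, not taken from the paper.

```python
import numpy as np

# Hedged sketch: fit a per-egomotion linear map M_g so that features after
# the motion are predicted from features before it, then measure the
# equivariance error. Synthetic data stands in for real egocentric features.

rng = np.random.default_rng(0)
d = 16   # feature dimension (illustrative)
n = 200  # number of (before, after) frame pairs for one egomotion class

# Ground-truth map plus small noise generates the "after" features.
M_true = rng.standard_normal((d, d)) / np.sqrt(d)
Z = rng.standard_normal((n, d))            # features before the egomotion
Z_after = Z @ M_true.T + 0.01 * rng.standard_normal((n, d))

# Fit M_g by least squares: min_M sum_i ||M z_i - z'_i||^2.
M_fit, *_ = np.linalg.lstsq(Z, Z_after, rcond=None)
M_g = M_fit.T

# Equivariance loss: mean squared prediction error of the fitted map.
loss = float(np.mean((Z @ M_g.T - Z_after) ** 2))
print(loss)
```

In the paper this predictability is enforced as a training objective on CNN features rather than fit post hoc; the sketch only illustrates the "respond predictably to egomotion" constraint itself.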
