IARG - Image Analysis Reading Group

The group discusses topics related to image and signal analysis, covering both methods and applications, with special interest in machine learning approaches and medical image analysis.




IARG is an activity of the Machine Learning and Natural Language Processing research group within the Department of Computing, Macquarie University.


Vision and Language Learning: From Image Captioning and Visual Question Answering towards Embodied Agents

Peter Anderson

Abstract

Each time we ask for an object, describe a scene or follow directions, we are converting information between visual and linguistic representations. People do this with ease, typically without even noticing. Intelligent systems that perform useful tasks in unstructured situations, and interact with people, will also require this ability. In this talk, we will focus on the joint modeling of visual and linguistic information using deep neural networks. We will cover some recent advances in automatic image captioning, visual question answering (VQA), and vision and language navigation (VLN).

The material will be drawn from the following two papers to be presented at CVPR 2018: