Project

A Camera You Can Talk To

As computer science has advanced, expectations of intelligent agents have grown. People are no longer satisfied with a machine that can only follow the instructions it is given. We expect a machine to understand what it sees through its ‘eyes’ and to communicate with us as a person would. In this project, we study such intelligent agents by developing a novel scheme for image and video scene understanding. The agent observes a scene, and the objects and people in it, through a camera. It then processes the input signals, constructs internal representations, and keeps track of the main events over time. A user can ask questions about current and past video content through a dialog system. This topic belongs to the field of ‘multimodal representation learning’.

Date: 6 Mar 2020 → 6 Mar 2024
Keywords: Computer Vision
Disciplines: Computer vision
Project type: PhD project