CV-SUPER

CV-SUPER: Computer Vision for Scene Understanding from a First-Person Perspective

ERC Starting Grant Project ERC-2012-StG-307432

The goal of CV-SUPER is to create the technology to perform dynamic visual scene understanding from the perspective of a moving human observer. Briefly stated, we want to enable computers to see and understand what humans see when they navigate through busy inner-city locations. Our target scenario is dynamic visual scene understanding in public spaces, such as pedestrian zones, shopping malls, or other locations primarily designed for humans. CV-SUPER will develop computer vision algorithms that can observe the people populating those spaces, interpret and understand their actions and their interactions with other people and inanimate objects, and from this understanding derive predictions of their behavior within the next few seconds. In addition, we will develop methods to infer semantic properties of the observed environment and learn to recognize how these properties affect people’s actions. Supporting those tasks, we will develop a novel object recognition system design that scales to potentially hundreds of categories. Finally, we will bind all those components together in a dynamic 3D world model that shows the world’s current state and facilitates predictions of how this state will most likely change within the next few seconds. These are crucial capabilities for the creation of technical systems that may one day assist humans in their daily lives within such busy spaces, e.g., in the form of personal assistance devices for elderly or visually impaired people, or in the form of future generations of mobile service robots and intelligent vehicles.

This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 307432.