Activity and Attention in First-Person Vision

Speaker: Prof. James M. Rehg
Affiliation: Georgia Institute of Technology

Abstract: Recent progress in miniaturizing digital cameras and improving battery life has created a booming market for wearable cameras, exemplified by the success of consumer products such as GoPro and the adoption of body-worn cameras by law enforcement. Wearable cameras capture a first-person view of the scene, and the resulting video implicitly encodes the attention and movement of the user. The analysis of such video through First-Person Vision (FPV) provides new opportunities to model and analyze human behavior, create personalized records of visual experiences, and improve the treatment of a broad range of mental and physical health conditions.

This talk will describe current research progress in First-Person Vision, with a focus on methods for recognizing activities and for modeling and exploiting visual attention. For activities involving object manipulation, we present a model of the spatio-temporal relationship between the gaze point, the scene objects, and the action label. We demonstrate that attention can provide a powerful cue for recognition, and present an inference method for predicting gaze locations. We argue that standard feature models for action recognition in video need to be re-examined in the context of FPV. We will also introduce a model for group social interactions and demonstrate its use in retrieving moments of interest from large collections of first-person videos.

FPV is a key technology for the behavioral sciences, as it can be used to automatically and objectively measure human behavior in naturalistic settings, such as the home or classroom. We will present a method for automatically detecting moments of eye contact between an adult therapist and a child. This technology provides a new approach to measuring the response to treatment of children with autism who are receiving an intervention. We will briefly discuss applications to adult health-related behaviors and to visual exposure to advertising and other environmental cues.
Our long-term goal is to develop a new computational approach to measuring and analyzing social, communicative, and health-related behaviors in natural settings, which we refer to as behavioral imaging. We are developing the technology for behavioral imaging through two multi-site research centers funded by the NSF and the NIH. This is joint work with Drs. Agata Rozga and Alireza Fathi, and Ph.D. students Yin Li, Zhefan Ye, and Yun Zhang.

Biography: James M. Rehg (pronounced “ray”) is a Professor in the School of Interactive Computing at the Georgia Institute of Technology, where he is Director of the Center for Behavioral Imaging and co-Director of the Computational Perception Lab (CPL). He received his Ph.D. from CMU in 1995 and worked at the Cambridge Research Lab of DEC (and then Compaq) from 1995 to 2001, where he managed the computer vision research group. He received an NSF CAREER award in 2001 and a Raytheon Faculty Fellowship from Georgia Tech in 2005. He and his students have received best student paper awards at ICML 2005, BMVC 2010, Mobihealth 2014, and Face and Gesture 2015, and a 2013 Method of the Year Award from the journal Nature Methods. Dr. Rehg serves on the Editorial Board of the International Journal of Computer Vision; he served as Program co-Chair for ACCV 2012 and General co-Chair for CVPR 2009, and will serve as Program co-Chair for CVPR 2017. He has authored more than 100 peer-reviewed scientific papers and holds 25 issued US patents. His research interests include computer vision, machine learning, robot perception, and mobile health. Dr. Rehg was the lead PI on an NSF Expedition to develop the science and technology of Behavioral Imaging, the measurement and analysis of social and communicative behavior using multi-modal sensing, with applications to developmental disorders such as autism. He is currently the Deputy Director of the NIH Center of Excellence on Mobile Sensor Data-to-Knowledge (MD2K), which is developing novel on-body sensing and predictive analytics for improving health outcomes.

For more information contact Professors Suhas Diggavi & Mani Srivastava

Date(s) - Oct 17, 2016
12:30 pm - 1:30 pm

Location: EE-IV Shannon Room #54-134
420 Westwood Plaza - 5th Flr., Los Angeles CA 90095