Added Value of Gaze-Exploiting Semantic Representation
Neuroscience studies have shown that combining the gaze (first-person) view with a third-person perspective strongly influences how correctly human behaviors can be inferred. Given the importance of both first- and third-person observations for recognizing human behaviors, we propose a method that incorporates both into a technical system, improving on third-person observation alone and yielding a more robust human activity recognition system. First, we extend our semantic reasoning method by including gaze data and external observations as inputs to segment and infer human behaviors in complex real-world scenarios. From the obtained results, we then demonstrate that combining gaze and external input sources greatly enhances the recognition of human behaviors. Our findings have been applied to a humanoid robot to segment and recognize observed human activities online with better accuracy when both input sources are used; for example, activity recognition accuracy increases from 77% to 82% on our proposed pancake-making dataset. For completeness, we also evaluated our approach on a second dataset with a similar setup, the CMU-MMAC dataset. There, combining the external views with the gaze information improved activity recognition for the egg-scrambling scenario from 54% to 86%, demonstrating the benefit of incorporating gaze information to infer human behaviors across different datasets.
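To illustrate the idea of fusing the two input sources, the following is a minimal sketch (not the authors' implementation; all function and feature names are hypothetical) of how a gaze-derived "object of attention" can disambiguate hand cues taken from an external camera when inferring a coarse activity label with simple semantic rules:

```python
# Hypothetical sketch: semantic-rule activity inference combining an
# external-view hand state (moving? holding something?) with a
# gaze-derived object of attention, in the spirit of the rule-based
# reasoning described in the abstract.

def infer_activity(hand_moving, object_in_hand, object_of_attention):
    """Return a coarse activity label from hand + gaze cues."""
    if not hand_moving and object_in_hand is None:
        return "idle"
    if hand_moving and object_in_hand is None:
        # Gaze names the object the hand is reaching toward.
        return f"reach({object_of_attention})" if object_of_attention else "reach"
    if hand_moving and object_in_hand is not None:
        # Gaze on a second object suggests the hand's target location.
        if object_of_attention and object_of_attention != object_in_hand:
            return f"put({object_in_hand},{object_of_attention})"
        return f"take({object_in_hand})"
    return f"hold({object_in_hand})"

print(infer_activity(True, None, "pancake"))       # reach(pancake)
print(infer_activity(True, "spatula", "pancake"))  # put(spatula,pancake)
```

Without the gaze input, the first case would collapse to an unlabeled "reach"; the added value of the first-person view here is precisely the extra argument that grounds the action in a specific object.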
ACM Transactions on Interactive Intelligent Systems (TiiS) – Association for Computing Machinery
Published: Mar 23, 2017
Keywords: Robot learning by observation