In: Eye-tracking in Interaction: Studies on the role of eye gaze in dialogue
Edited by Geert Brône and Bert Oben
[Advances in Interaction Studies 10] 2018
► pp. 169–194
Chapter 8. Automatic analysis of in-the-wild mobile eye-tracking experiments using object, face and person detection
Published online: 13 November 2018
https://doi.org/10.1075/ais.10.08beu
Abstract
In this chapter, we discuss a novel method for the analysis of mobile eye-tracking data recorded in natural environments. Mobile eye-tracking systems generate large amounts of continuous data, making manual analysis extremely time-consuming. Existing solutions offered by commercial eye-tracking systems, such as marker-based analysis, reduce the manual labor but require experimental control, which makes real-life experiments practically unfeasible, and they generally apply only to the analysis of objects. Here, we present a novel method for processing mobile eye-tracking data, based on the integration of computer vision techniques. This approach allows us to automatically detect specific objects, faces and human bodies or body parts in the images captured by a mobile eye-tracker. By mapping the gaze data onto these detections, we gain insight into the visual behavior of the recorded participants. As an important step toward integrating this method into the analysis of multimodal interaction, we developed an output format that is compatible with annotation tools such as ELAN, so that our software can be combined with existing annotations. In this chapter we give an overview of relevant image-processing techniques and their application in interaction studies. We also present a thorough comparison between manual analysis and our automatic analysis, in both speed and accuracy, on challenging, real-life experiments.
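The core step described above, mapping a gaze point onto the detections returned for a scene-camera frame, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation; the `Detection` class and the smallest-box tie-breaking rule are assumptions for the sake of the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One detection in a scene-camera frame (hypothetical structure)."""
    label: str   # e.g. "face", "body", or a specific object name
    x: float     # top-left corner, in scene-camera pixel coordinates
    y: float
    w: float     # bounding-box width
    h: float     # bounding-box height


def gaze_target(gx: float, gy: float,
                detections: List[Detection]) -> Optional[str]:
    """Return the label of the detection containing the gaze point (gx, gy).

    When several boxes contain the point (e.g. a face nested inside a
    body), the smallest box wins, attributing the gaze to the most
    specific region. Returns None if the gaze hits no detection.
    """
    hits = [d for d in detections
            if d.x <= gx <= d.x + d.w and d.y <= gy <= d.y + d.h]
    if not hits:
        return None
    return min(hits, key=lambda d: d.w * d.h).label
```

Running this per frame yields a label (or `None`) per gaze sample, which can then be serialized as time-aligned annotation tiers for a tool such as ELAN.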
Article outline
- 1. Introduction
- 2. Related work
- 2.1 Object recognition techniques
- 2.2 Object detection techniques
- 3. Recognition and detection solution for mobile eye-tracking data: A technical description
- 3.1 Recognition of specific objects
- 3.2 Detection of faces and bodies
- 3.3 Person reidentification
- 4. Experimental results
- 4.1 Object recognition results
- 4.2 Results of face and body detections
- 4.3 Combined results of objects, face and body detection
- 5. Conclusion and future work