GenEAI: Generative AI meets Eye Tracking

ACM Symposium on Eye Tracking Research and Applications
May 26 – 29, Tokyo, Japan

Generative artificial intelligence has come into focus with applications in content generation, design, and predictive modeling. At the same time, eye-tracking technologies have provided unparalleled insights into human attention, perception, and behavior, enabling various applications in neuroscience, human-computer interaction (HCI), and marketing. Bringing these two powerful technologies together will open up new possibilities for understanding and improving human interaction with AI systems.

This workshop aims to explore the synergies between these two cutting-edge fields and provide a platform for researchers, practitioners, and developers to share new approaches, tools, and methods at their intersection. The workshop will also address ethical considerations and challenges that arise from this integration, including privacy concerns and the implications of using data on human gaze behavior as a source of information for generative algorithms.

Submission

Authors are invited to submit original work complying with the ETRA Short Paper format (max. 8 pages, plus any number of additional pages for references; abstract of max. 150 words). Please also ensure that the Author Guidelines for SIG-sponsored events (sigconf) are met prior to submission. GenEAI uses the Precision Conference System (PCS), via ETRA 2025, to handle the submission and reviewing process. All accepted papers will be published by ACM as part of the ETRA 2025 Workshop Proceedings (in the ACM DL).

Important Dates

  • Submission Deadline: March 17
  • Notification: March 24
  • Camera-Ready Deadline: March 31

Topics of Interest

  • Generative AI models informed by eye-tracking data for improved content generation and personalization.
  • Foundation models and their application to analyze eye-tracking data in new and efficient ways.
  • Real-time adaptive interfaces that combine generative AI and eye tracking for dynamic user experiences in AR/VR, gaming, and education.
  • Personalization of user interfaces and experiences based on gaze data.
  • Behavioral and cognitive insights gained by analyzing gaze data with generative AI, with applications in HCI, marketing, and psychology.
  • Visualization techniques for eye movement data, including spatio-temporal analysis and visual exploration of gaze patterns.
  • Ethical and privacy considerations in using eye-tracking data to inform generative AI models.
  • Applications of eye-tracking in challenging environments, including mobile devices, large displays, and mixed/virtual reality systems.
  • Integration of eye-tracking into LLMs/VLMs for more intuitive and responsive human-AI interaction systems.

GenEAI Workshop Schedule

The GenEAI workshop will take place on May 29th in Room 3, from 11:00 AM to 5:30 PM.

11:00 AM – Introduction and Welcome


11:10 AM – Keynote Talk 1

Dr. Xi Wang, “Decoding Human Behavior Through Gaze Patterns”

11:40 AM – Q&A (Keynote 1)


11:50 AM – Paper Presentations Session 1

11:50 AM – Wong et al., “Shifts in Doctors' Eye Movements Between Real and AI-Generated Medical Images”

12:05 PM – Nguyen et al., “Large Language Models and Eye Tracking for Learning Disorder Detection: Do We Still Need Machine Learning?”

12:20 PM – Q&A (Papers 1 & 2)


12:30 PM – 2:00 PM – Lunch Break


2:00 PM – Keynote Talk 2

Dr. Xucong Zhang, “Generative Models for Gaze Estimation”

2:30 PM – Q&A (Keynote 2)


2:40 PM – Paper Presentations Session 2

2:40 PM – Mardanbegi et al., “GazeLog: Optimizing Eye-Tracking with Fixation Keyframes & LLM Insights”

2:55 PM – Abdrabou et al., “From Gaze to Data: Privacy and Societal Challenges of Using Eye-tracking Data to Inform GenAI Models”

3:10 PM – Q&A (Papers 3 & 4)


3:20 PM – Keynote Talk 3

Dr. Stein Dolan, “‘Attention’ is All You Need”

3:50 PM – Q&A (Keynote 3)


4:00 PM – 4:30 PM – Coffee Break


4:30 PM – Paper Presentations Session 3

4:30 PM – Lohr et al., “Device-Specific Style Transfer of Eye-Tracking Signals”

4:45 PM – Q&A (Paper 5)


4:55 PM – Best Paper Award Announcement

5:10 PM – Closing Remarks & Acknowledgements

Keynote Speakers

Xi Wang, ETH Zürich
Decoding Human Behavior Through Gaze Patterns
In everyday interactions, a person’s gaze often reveals their focus, attention, and intentions. This talk delves into the wealth of information conveyed through gaze patterns, exploring how they reflect underlying cognitive processes. I will highlight our recent work using gaze patterns to predict future trajectories of ego-vehicles and future action sequences, demonstrating the potential of gaze analysis to anticipate movement and decision-making in dynamic environments.

Xi Wang is an established researcher in the Computer Vision and Geometry Lab with Prof. Marc Pollefeys at ETH Zurich, while continuing to work with Prof. Luc Van Gool at INSAIT. She is an incoming Junior Group Leader at TUM, funded by the Federal Ministry of Education and Research of Germany (BMBF). Xi was an ETH Postdoc Fellow in the Advanced Interactive Technologies lab led by Prof. Otmar Hilliges at ETH and a member of the virtual group of Ruth Rosenholtz at MIT. She completed her PhD in the Computer Graphics Group at TU Berlin, advised by Prof. Marc Alexa. During her PhD, she visited MIT, working in the Computational Perception & Cognition Group led by Aude Oliva, and interned at Adobe Research with Zoya Bylinskii and Aaron Hertzmann. Her research interests lie at the intersection of computer vision, graphics, and vision science. Her goal is to bring human common sense and behavior patterns into machine learning.
Xucong Zhang, TU Delft
Generative Models for Gaze Estimation
Generative models can be applied to gaze estimation tasks in several ways. This talk presents our work using generative models for gaze redirection, privacy protection, and gaze target detection. We used GAN, VAE, and NeRF models for gaze redirection with explicit 3D geometric control; the generated face images can also replace the user’s face with an obfuscated one to prevent the leak of private information. We further explored diffusion models for gaze target detection, using the Stable Diffusion model as a simple feature extractor. Our research shows that generative models can enhance the functionality and versatility of gaze-estimation-related tasks. The talk will cover the methods, challenges, and outcomes of our approach, providing insights into the potential and future directions of generative models in gaze estimation.
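For readers curious what using a diffusion model as "a simple feature extractor" can look like in practice, below is a minimal sketch (not the speaker's actual code) of pulling frozen features from a pretrained Stable Diffusion UNet via the diffusers library. The checkpoint name, noising timestep, and hooked mid-block are illustrative assumptions; a gaze target detector would train its own head on top of the cached feature maps.

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pretrained pipeline and use its frozen VAE + UNet as feature extractors.
    # Checkpoint is an assumption; any SD 1.x checkpoint would work the same way.
    pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")
    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe.to(device)

    features = {}
    def cache_mid_block(_module, _inputs, output):
        features["mid"] = output  # cache the UNet mid-block activations

    pipe.unet.mid_block.register_forward_hook(cache_mid_block)

    @torch.no_grad()
    def extract_features(image):
        """image: (B, 3, 512, 512) tensor scaled to [-1, 1]."""
        latents = pipe.vae.encode(image.to(device)).latent_dist.mode()
        latents = latents * pipe.vae.config.scaling_factor
        t = torch.tensor([50], device=device)  # light noising step (assumption)
        noisy = pipe.scheduler.add_noise(latents, torch.randn_like(latents), t)
        # Unconditional pass with an empty-prompt text embedding.
        ids = pipe.tokenizer([""], return_tensors="pt").input_ids.to(device)
        text_emb = pipe.text_encoder(ids)[0]
        pipe.unet(noisy, t, encoder_hidden_states=text_emb)
        return features["mid"]  # e.g. (B, 1280, 8, 8) for a 512x512 input

The forward hook sidesteps any need to modify the UNet itself: the image is lightly noised in latent space, pushed through one denoising step, and the intermediate activations are kept as features while the UNet's own output is discarded.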

Xucong Zhang is an assistant professor at TU Delft. From 2018 to 2021 he was a postdoc in the Advanced Interaction Technologies Lab at ETH Zurich in Switzerland, and from 2013 to 2018 he completed his PhD, summa cum laude, at the Max Planck Institute for Informatics in Germany. Xucong is known for his pioneering work on gaze estimation in real-world settings through the MPIIGaze dataset. He is also the primary developer of the ETH-XGaze dataset and has authored numerous further works on gaze estimation methods and their applications.
Stein Dolan, Meta
"Attention" is All You Need
Wearable AI offers a paradigm shift in the way we interact with technology. Instead of disruptive, repetitive interactions with diverse user interfaces, we will be able to communicate with our digital world seamlessly through everyday actions. This is achieved by capturing context from an egocentric viewpoint over time, and then pairing this context with generative AI to predict the user’s goal. Such goal predictions would enable AI interactions to take place at the right time and with the most appropriate actions. However, achieving this efficiently with intuitive interactions and reliably helpful results is not trivial. User attention is a crucial signal for improving the relevance of captured context. While eye tracking is not the sole indicator of user attention, it is often one of the most important signals. In this talk, we will introduce research that leverages eye tracking and other relevant signals to infer and utilize user attention in various scenarios. Our aim is to unlock efficient, intuitive, and helpful Wearable AI, empowering everyone to harness the superpowers of this emerging technology in their daily lives.

Stein Dolan is a Principal Technical Program Manager at Meta Reality Labs Research, where he works on pioneering research for contextual AI, augmented reality, and mixed reality. Stein’s current research focus is on system development for contextual artificial intelligence that builds from egocentric sensors, including eye tracking. Stein has been at Meta for four years and oversees projects focused on eye tracking systems, graphics systems, and display systems for AR and MR. Stein was part of the Seattle start-up community in the late 90s as a software developer and then spent 21 years at Microsoft working on cloud services and security systems. Stein holds a B.S. in engineering from the University of Washington and a J.D. from Seattle University.

Organizers

Prof. Dr. Arantzazu Villanueva

Public University of Navarre, Spain

(avilla@unavarra.es)
Arantzazu Villanueva Larre is a full professor in the area of Signal Theory and Communications within the Department of Electrical, Electronic, and Communication Engineering at the Public University of Navarre. She has taught courses on image processing, computer vision, and multimedia signal processing at both the undergraduate and master’s levels. Her main research focus is on image-based eye-tracking systems as a tool for human-computer interaction, and she also conducts research in medical image analysis. She has served as co-chair of several conferences, including the Communication by Gaze Interaction (COGAIN) conference in 2009 and the European Conference on Eye Movements (ECEM) in 2013, and has held roles such as Poster Chair and Doctoral Consortium Chair at various ETRA conferences. She has also organized workshops, such as "How an Eyetracker Works – Looking Inside the Black Box," held together with Jeff Pelz and Dixon Cleveland at ECEM 2013, and contributed to a workshop on IR Eye Safety in Brussels in 2008.
Prof. Dr. Enkelejda Kasneci

Technical University of Munich, Germany

(enkelejda.kasneci@tum.de)
Enkelejda Kasneci is a Distinguished Professor (“Liesel Beckmann Distinguished Professorship”) for Human-Centered Technologies for Learning at the School of Social Sciences & Technology of the Technical University of Munich, with a second affiliation at the School of Computation, Information and Technology. Her research focuses on human-centered technologies, emphasizing the crossroads between multimodal interaction and cutting-edge technological tools such as VR, AR, and eye-tracking methodologies. She is a member of the ETRA Steering Committee and served as general co-chair of ETRA 2022 and ETRA 2023.
Prof. Dr. Gjergji Kasneci

Technical University of Munich, Germany

(gjergji.kasneci@tum.de)
Gjergji Kasneci is a full professor of Responsible Data Science at the Technical University of Munich and a core member of the Munich Data Science Institute. He served as Chief Technology Officer at SCHUFA Holding AG from 2017 to 2022 and as an Honorary Professor at the University of Tübingen from 2018 to 2023. His research focuses on transparency, robustness, bias, and fairness in machine learning algorithms, together with the ethical, legal, and societal considerations of using artificial intelligence responsibly for the benefit of individuals and society. He is a Fellow of the Konrad Zuse School for Reliable AI and serves on several boards and program committees of renowned conferences, including AAAI, NeurIPS, xAI, and DSP.
Prof. Dr. Yusuke Sugano

University of Tokyo, Japan

(sugano@iis.u-tokyo.ac.jp)
Yusuke Sugano is an associate professor at the Institute of Industrial Science at the University of Tokyo. He was previously an associate professor at the Graduate School of Information Science and Technology, Osaka University, a postdoctoral researcher at the Max Planck Institute for Informatics, and a project research associate at the Institute of Industrial Science, the University of Tokyo. His research interests focus on computer vision and human-computer interaction. He has served as the General Chair for ETRA in both 2024 and 2025.
Dr. Yao Rong

Rice University, United States

(yao.rong@rice.edu)
Yao Rong is a Junior Fellow at the Rice Academy of Fellows, working in the Computer Science Department. She earned her Ph.D. in Computer Science from the Technical University of Munich in 2024. Her research focuses on integrating human factors into AI model design to enhance the user experience in human-AI interactions, and on advancing the trustworthiness of AI systems through the development of human-centered explainable AI techniques. She served as Diversity and Inclusion Chair from 2022 to 2024, organizing, for example, a workshop as part of a diversity event at ETRA 2023.
Talissa Stadler

Technical University of Munich, Germany

(office.hctl@sot.tum.de)
Responsible for administration and organization.