About Me

Hi! I am a PhD student in the Department of Computer Science and Engineering at the University of Michigan. I work with Dr. Joyce Chai as a member of the Situated Language and Embodied Dialogue (SLED) Lab. My research focuses on language understanding, cognitive development, and embodied agents. Specifically, I study how robots can take advantage of their ability to interact with the world around them to learn language during spoken interaction with others. Prior to joining the University of Michigan, I graduated from Boise State University with a B.S. in Computer Science. During my last two years at Boise State, I performed research advised by Dr. Casey Kennington as a member of the Spoken Language and Interactive Machines (SLIM) Lab.

Education

Boise State University, 2018 - 2023

Bachelor of Science, Computer Science

University of Michigan, 2023 - 2025

Master of Science, Computer Science and Engineering

University of Michigan, 2023 - 2028 (Expected)

Doctor of Philosophy, Computer Science and Engineering

Publications

Illustration of the symbol grounding mechanism through information aggregation. Lighter colors denote more salient attention, quantified by saliency scores, i.e., gradient × attention contributions to the loss (Wang et al., 2023). When predicting the next token, aggregate heads (Bick et al., 2025) emerge to exclusively link environmental tokens (visual or situational context; ⟨ENV⟩) to linguistic tokens (words in text; ⟨LAN⟩). These heads provide a mechanistic pathway for symbol grounding by mapping external environmental evidence into its linguistic form.
The Mechanistic Emergence of Symbol Grounding in Language Models
Shuyu Wu, Ziqiao Ma, Xiaoxi Luo, Yidong Huang, Josue Torres-Fonseca, Freda Shi, Joyce Chai
International Conference on Machine Learning 2026 -- Seoul, South Korea
Visualization of the SafetyALFRED evaluation pipeline. The environment is perturbed to introduce a hazard (1). Two separate instances of the same model then evaluate the scene: one identifies hazards as in a static QA setting (2a), while the other generates an embodied plan that must mitigate hazards before completing the task (2b). Alignment occurs when a hazard recognized in the QA setting is also mitigated in the embodied task (3).
SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models
Josue Torres-Fonseca, Naihao Deng, Yinpei Dai, Shane Storks, Yichi Zhang, Rada Mihalcea, Casey Kennington, Joyce Chai
Findings of the Association for Computational Linguistics 2026 -- San Diego, California, United States
Summary of upgraded and newly added modules in rrSDS 2.0
rrSDS 2.0: Incremental, Modular, Distributed, Multimodal Spoken Dialogue with Robotic Platforms
Anna Manaseryan, Porter Rigby, Brooke Matthews, Ryan Whetten, Catherine Henry, Josue Torres-Fonseca, Enoch Levandowsky, Casey Kennington
Proceedings of the Special Interest Group on Discourse and Dialogue 2025 -- Avignon, France
Schematic of the object permanence spoken dialogue system.
Symbol and Communicative Grounding through Object Permanence with a Mobile Robot
Josue Torres-Fonseca, Catherine Henry, Casey Kennington
Proceedings of the Special Interest Group on Discourse and Dialogue 2022 -- Edinburgh, UK
Process of collecting data from the perspective of a participant, with examples of logged functions, descriptions, and emotion ratings.
HADREB: Human Appraisals and (English) Descriptions of Robot Emotional Behaviors
Josue Torres-Fonseca, Casey Kennington
Proceedings of the Language Resources and Evaluation Conference 2022 -- Marseille, France