Breaking the Linear Barrier: A Multi-Modal LLM-Based System for Navigating Complex Web Content

Publication type: Article, Conference object
Date: 08 Jul 2025
Publisher: IEEE
Journal: 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC)

Authors: Gabriel Moterani; Wenjun Randy Lin

DOI: 10.1109/compsac65507.2025.00289, 10.5281/zenodo.19503524, 10.5281/zenodo.19503525

Abstract

Visually impaired users still face fundamental obstacles when interacting with complex, dynamic websites. Conventional screen readers expose pages in a strict linear order, offer little semantic context for visual media, and convey only limited information about the page content as a whole. This paper introduces a multi-modal accessibility framework that combines Large Language Models (LLMs), Computer Vision, and dynamic DOM manipulation to significantly enhance semantic clarity, non-linear navigation, and interaction richness. By interpreting visual and textual web content in context and adapting it into an intuitive, conversationally navigable interface, our method provides a foundation for visually impaired users to interact effectively with previously inaccessible or challenging digital experiences.

A functional prototype deployed in a modern web browser illustrates the system's ability to interact with diverse websites and tasks. The research team selected Canada's most visited websites to assess how well the system enhances contextual understanding of page content and enables navigation through pages and actions via a chat-driven interface. A comprehensive demonstration was carried out on a prominent ticketing site, where the system helped users gain a deeper understanding of the page while guiding them toward the successful purchase of concert tickets. By illustrating how vision-language and LLM reasoning can be coupled with low-level browser control, this work lays the groundwork for future efforts in performance optimization, large-scale evaluation, and personalization across diverse web contexts.

Related Organizations

Algoma University
Canada

Impact by BIP!

	selected citations	0
	popularity	Average
	influence	Average
	impulse	Average

Related to Research communities

UArctic