VISOR: An AI-Powered Guiding Shield for Vision
Shreya Shinde, Sneh Patel, Supriya V. Mahadevkar, Archana Y. Chaudhari, Anilkumar Gupta
Published 2025 in 2025 Modern Electronics Devices and Intelligent Communication Systems (MEDCOM)
ABSTRACT
Visually impaired people face significant challenges in perceiving and acting within their environment due to limited contextual information. Conventional aids such as white canes and GPS devices offer only partial solutions and cannot provide real-time descriptive guidance. With advances in multimodal artificial intelligence (AI), especially vision-language models, it is now feasible to develop smart, conversational systems that combine vision, language, and speech. This paper introduces VISOR (Guiding Shield for Vision), a multimodal AI-driven assistive system that processes visual input with computer vision, narrates scenes using vision-language models, and delivers natural audio output through speech synthesis. By integrating object detection, visual question answering, and speech interfaces, VISOR provides context-aware, real-time assistance to visually impaired users. The architecture is lightweight, portable, and modular, enabling a cost-effective solution that can scale in the future.
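The abstract describes a three-stage pipeline: visual input is passed through object detection, the detections are narrated by a vision-language component, and the narration is spoken aloud. The sketch below illustrates that data flow only; every function is a hypothetical stand-in (the paper's actual models and APIs are not given in this record), and the detector here returns fixed dummy detections.

```python
# Minimal sketch of a VISOR-style pipeline, assuming the three stages
# named in the abstract: detection -> scene narration -> speech output.
# All components are hypothetical placeholders, not the authors' code.

from dataclasses import dataclass
from typing import Any, List


@dataclass
class Detection:
    label: str
    confidence: float


def detect_objects(frame: Any) -> List[Detection]:
    # Placeholder: a real system would run an object detector on the frame.
    return [Detection("door", 0.92), Detection("chair", 0.87)]


def narrate_scene(detections: List[Detection]) -> str:
    # Placeholder for a vision-language model's scene description;
    # here we simply join confident detections into a sentence.
    labels = ", ".join(d.label for d in detections if d.confidence > 0.5)
    return f"I can see: {labels}."


def speak(text: str) -> str:
    # Placeholder: a real system would hand this string to a TTS engine.
    return text


def assist(frame: Any) -> str:
    # End-to-end pass: frame in, spoken description out.
    return speak(narrate_scene(detect_objects(frame)))


print(assist(frame=None))
```

The modular structure mirrors the abstract's claim of a lightweight, modular architecture: each stage can be swapped independently (e.g. a different detector or TTS engine) without touching the others.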
PUBLICATION RECORD
- Publication date: 2025-12-11