SSGR-AR: Semantic-Enhanced Scene Graph Reasoning for Robust Video Action Recognition

Daxu Shi, Fan Qi, Changsheng Xu

Published 2025 in 2025 IEEE International Conference on Knowledge Graph (ICKG)

ABSTRACT

Due to the inherent complexity of video data, video action recognition faces significant challenges in modeling spatial-temporal dynamics and handling diverse scene contexts. Although scene graph-based methods can effectively model interactions between entities, most existing approaches overlook the rich semantic information embedded within scene graphs. Additionally, integrating large language models (LLMs) for semantic enhancement often suffers from hallucination, which can introduce incorrect reasoning that misleads action recognition. To address these limitations, we propose SSGR-AR, a novel framework that represents videos structurally through scene graphs and constrains LLM reasoning using structured semantic paths derived from scene graph knowledge, ensuring controllable and reliable semantic enrichment. Moreover, we formulate entity alignment as a link prediction task and leverage a graph transformer to model the dynamic evolution of actions, thereby enhancing the model's capacity for long-term temporal reasoning. Experimental results on three widely used benchmark datasets show that our method outperforms state-of-the-art methods in both action recognition accuracy and generalization robustness.
