Vision Relation Transformer for Unbiased Scene Graph Generation

Gopika Sudhakaran, D. Dhami, K. Kersting, S. Roth

Published 2023 in IEEE International Conference on Computer Vision

ABSTRACT

Recent years have seen a growing interest in Scene Graph Generation (SGG), a comprehensive visual scene understanding task that aims to predict entity relationships using a relation encoder-decoder pipeline stacked on top of an object encoder-decoder backbone. Unfortunately, current SGG methods suffer from an information loss regarding the entities’ local-level cues during the relation encoding process. To mitigate this, we introduce the Vision rElation TransfOrmer (VETO), consisting of a novel local-level entity relation encoder. We further observe that many existing SGG methods claim to be unbiased, but are still biased towards either head or tail classes. To overcome this bias, we introduce a Mutually Exclusive ExperT (MEET) learning strategy that captures important relation features without bias towards head or tail classes. Experimental results on the VG and GQA datasets demonstrate that VETO + MEET boosts the predictive performance by up to 47% over the state of the art while being ∼10× smaller.

PUBLICATION RECORD

Publication year: 2023
Venue: IEEE International Conference on Computer Vision
Publication date: 2023-08-18
Fields of study: Computer Science, Engineering
Identifiers: DOI 10.1109/ICCV51070.2023.02000; arXiv 2308.09472
Source metadata: Semantic Scholar

REFERENCES

Fine-Grained Scene Graph Generation with Data Transfer
2022, influential reference
Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation
2022, influential reference
GTNet: Guided Transformer Network for Detecting Human-Object Interactions
2021, cited by this paper
Recovering the Unbiased Scene Graphs from the Biased Ones
2021, cited by this paper
Context-aware Scene Graph Generation with Seq2Seq Transformers
2021, cited by this paper
Energy-Based Learning for Scene Graph Generation
2021, cited by this paper
Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation
2021, influential reference
Learning of Visual Relations: The Devil is in the Tails
2021, cited by this paper
GPS-Net: Graph Property Sensing Network for Scene Graph Generation
2020, cited by this paper
VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
2020, cited by this paper
Bridging Knowledge Graphs to Generate Scene Graphs
2020, influential reference
Unbiased Scene Graph Generation From Biased Training
2020, influential reference
Learning to Recover 3D Scene Shape from a Single Image
2020, cited by this paper
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
2020, influential reference
CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
2020, cited by this paper
PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation
2020, cited by this paper
Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models
2020, cited by this paper
End-to-End Object Detection with Transformers
2020, cited by this paper
Relation Transformer Network
2020, cited by this paper
Exploring Context and Visual Pattern of Relationship for Scene Graph Generation
2019, cited by this paper
Knowledge-Embedded Routing Network for Scene Graph Generation
2019, cited by this paper
Scene Graph Generation With External Knowledge and Image Reconstruction
2019, cited by this paper
Improving Visual Relation Detection using Depth Maps
2019, influential reference
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
2019, influential reference
Stand-Alone Self-Attention in Vision Models
2019, cited by this paper
Learning to Detect Human-Object Interactions With Knowledge
2019, cited by this paper
VisualBERT: A Simple and Performant Baseline for Vision and Language
2019, cited by this paper
Graph R-CNN for Scene Graph Generation
2018, cited by this paper
Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition
2018, cited by this paper
Explainable and Explicit Visual Reasoning Over Scene Graphs
2018, cited by this paper
Learning to Compose Dynamic Tree Structures for Visual Contexts
2018, influential reference
Referring Relationships
2018, cited by this paper
LinkNet: Relational Embedding for Scene Graph
2018, cited by this paper
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
2018, cited by this paper
A Comprehensive Survey of Deep Learning for Image Captioning
2018, cited by this paper
Visual Relationship Prediction via Label Clustering and Incorporation of Depth Information
2018, cited by this paper
Attention Is All You Need
2017, cited by this paper
Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation
2017, cited by this paper
Neural Motifs: Scene Graph Parsing with Global Context
2017, influential reference
Graph Attention Networks
2017, cited by this paper
Scene Graph Generation by Iterative Message Passing
2017, cited by this paper
Detecting Visual Relationships with Deep Relational Networks
2017, cited by this paper
Natural Language Guided Visual Relationship Detection
2017, cited by this paper
Mask R-CNN
2017, cited by this paper
Scene Graph Generation from Objects, Phrases and Region Captions
2017, cited by this paper
Visual Translation Embedding Network for Visual Relation Detection
2017, cited by this paper
Feature Pyramid Networks for Object Detection
2016, cited by this paper
Aggregated Residual Transformations for Deep Neural Networks
2016, influential reference
Visual Relationship Detection with Language Priors
2016, cited by this paper
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
2016, cited by this paper
Deep Residual Learning for Image Recognition
2015, cited by this paper
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
2015, cited by this paper
VQA: Visual Question Answering
2015, cited by this paper
Adam: A Method for Stochastic Optimization
2014, cited by this paper
Question Answering
2010, cited by this paper

CITED BY

CDC: Enhancing Scene Graph Generation for IoST-Driven Social Behavioral Modeling With Cooperative Dual Classifier
2026, influential citation
DuoNet: Joint Optimization of Representation Learning and Prototype Classifier for Unbiased Scene Graph Generation
2026, cites this paper
Salience-SGG: Enhancing Unbiased Scene Graph Generation with Iterative Salience Estimation
2026, cites this paper
Universal Scene Graph Generation
2025, cites this paper
RelationLMM: Large Multimodal Model as Open and Versatile Visual Relationship Generalist
2025, cites this paper
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
2025, cites this paper
A Reverse Causal Framework to Mitigate Spurious Correlations for Debiasing Scene Graph Generation
2025, cites this paper
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation
2025, cites this paper
Front-door causal attention for unbiased panoptic scene graph generation
2025, cites this paper
Hybrid Reciprocal Transformer with Triplet Feature Alignment for Scene Graph Generation
2025, cites this paper
Reusing Attention for One-stage Lane Topology Understanding
2025, cites this paper
ART: Adaptive Relation Tuning for Generalized Relation Prediction
2025, influential citation
Query-guided predicate decoupling and prototype approximation learning for scene graph generation
2025, cites this paper
Interaction-Centric Knowledge Infusion and Transfer for Open-Vocabulary Scene Graph Generation
2025, cites this paper
Scene Graph Generation Based on Depth Information and Feature Enhancement
2024, cites this paper
Adaptive Feature Learning for Unbiased Scene Graph Generation
2024, cites this paper
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
2024, cites this paper
Leveraging Predicate and Triplet Learning for Scene Graph Generation
2024, influential citation
Semantic Diversity-aware Prototype-based Learning for Unbiased Scene Graph Generation
2024, cites this paper
DIAGen: Semantically Diverse Image Augmentation with Generative Models for Few-Shot Learning
2024, cites this paper
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation
2024, influential citation
Causal Intervention for Panoptic Scene Graph Generation
2024, cites this paper
EGTR: Extracting Graph from Transformer for Scene Graph Generation
2024, cites this paper
GGBDCA: Scene graph generation based on Global Gradient Balanced Distribution and Compound Attention
2024, cites this paper
DeiSAM: Segment Anything with Deictic Prompting
2024, influential citation
ALF: Adaptive Label Finetuning for Scene Graph Generation
2023, cites this paper