Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Ken Ziyu Liu, Hongwen Zhang, Zhenghao Chen, Zhiyong Wang, Wanli Ouyang

Published 2020 in Computer Vision and Pattern Recognition

ABSTRACT

Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D, based on which our model outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400.
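The two ideas named in the abstract can be illustrated with a small NumPy sketch. This record does not give the paper's exact formulation, so everything below is an assumption made for illustration: the function names k_hop_adjacency and windowed_adjacency are hypothetical, "disentangling" is read as giving each scale k its own binary adjacency that links a joint only to itself and to joints at shortest-path distance exactly k, and the "dense cross-spacetime edges" are read as copying the spatial adjacency (with self-loops) across every pair of frames in a window of tau frames.

```python
import numpy as np


def k_hop_adjacency(A, k):
    """Link each node to itself and to nodes at shortest-path distance exactly k.

    Assumed reading of "disentangled" multi-scale aggregation, not the
    authors' reference code.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    n = A.shape[0]
    eye = np.eye(n, dtype=int)
    if k == 0:
        return eye
    # Binary adjacency with self-loops: entry (i, j) of A_hat**k is nonzero
    # exactly when node j lies within k hops of node i.
    A_hat = np.minimum((A > 0).astype(int) + eye, 1)
    within_k = (np.linalg.matrix_power(A_hat, k) >= 1).astype(int)
    within_k_minus_1 = (np.linalg.matrix_power(A_hat, k - 1) >= 1).astype(int)
    # Keep only the nodes first reached at hop k, then restore self-loops.
    return within_k - within_k_minus_1 + eye


def windowed_adjacency(A, tau):
    """Tile the self-looped adjacency over a window of tau frames.

    Assumed reading of "dense cross-spacetime edges": every joint is linked
    to its spatial neighbors in all frames of the window, so the result has
    shape (tau * N, tau * N).
    """
    n = A.shape[0]
    A_hat = np.minimum((A > 0).astype(int) + np.eye(n, dtype=int), 1)
    return np.tile(A_hat, (tau, tau))


if __name__ == "__main__":
    # Toy 4-joint chain 0-1-2-3 (hypothetical skeleton).
    chain = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
    A_hat = chain + np.eye(4, dtype=int)
    print(np.linalg.matrix_power(A_hat, 2)[0])  # [2 2 1 0]
    print(k_hop_adjacency(chain, 2)[0])         # [1 0 1 0]
    print(windowed_adjacency(chain, 2).shape)   # (8, 8)
```

On the toy chain, the squared adjacency weights joint 0's immediate neighborhood more heavily than the joint two hops away, which is one way to picture the bias toward close joints that the abstract attributes to existing multi-scale operators; the k-hop matrix gives the distant joint the same weight as the self-loop.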

PUBLICATION RECORD

Publication year: 2020
Venue: Computer Vision and Pattern Recognition
Publication date: 2020-03-31
Fields of study: Computer Science
Identifiers: DOI 10.1109/cvpr42600.2020.00022; arXiv 2003.14111
Source metadata: Semantic Scholar

REFERENCES

Spatio-Temporal Graph Routing for Skeleton-Based Action Recognition (2019, influential reference)
Graph U-Nets (2019, cited by this paper)
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding (2019, cited by this paper)
Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks (2019, cited by this paper)
MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing (2019, influential reference)
Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition (2019, influential reference)
Skeleton-Based Action Recognition With Directed Graph Neural Networks (2019, influential reference)
Graph WaveNet for Deep Spatial-Temporal Graph Modeling (2019, cited by this paper)
An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition (2019, influential reference)
Simplifying Graph Convolutional Networks (2019, influential reference)
LanczosNet: Multi-Scale Deep Graph Convolutional Networks (2019, cited by this paper)
A Comprehensive Survey on Graph Neural Networks (2019, cited by this paper)
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition (2018, influential reference)
Memory Attention Networks for Skeleton-Based Action Recognition (2018, cited by this paper)
Optimized Skeleton-based Action Recognition via Sparsified Graph Regression (2018, influential reference)
Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation (2018, cited by this paper)
Deep Graph Infomax (2018, influential reference)
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition (2018, influential reference)
Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning (2018, cited by this paper)
Videos as Space-Time Region Graphs (2018, cited by this paper)
Hierarchical Graph Representation Learning with Differentiable Pooling (2018, influential reference)
Recognizing Human Actions as the Evolution of Pose Estimation Maps (2018, cited by this paper)
Adaptive Graph Convolutional Neural Networks (2018, influential reference)
Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition (2018, influential reference)
Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN (2018, cited by this paper)
Learning Clip Representations for Skeleton-Based 3D Action Recognition (2018, cited by this paper)
How Powerful are Graph Neural Networks? (2018, influential reference)
View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data (2017, cited by this paper)
Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks (2017, cited by this paper)
Graph Attention Networks (2017, cited by this paper)
Inductive Representation Learning on Large Graphs (2017, influential reference)
Spatio-temporal Graph Convolutional Neural Network: A Deep Learning Framework for Traffic Forecasting (2017, cited by this paper)
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks (2017, cited by this paper)
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour (2017, cited by this paper)
The Kinetics Human Action Video Dataset (2017, influential reference)
A Closer Look at Spatiotemporal Convolutions for Action Recognition (2017, cited by this paper)
Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields (2016, cited by this paper)
An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data (2016, cited by this paper)
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning (2016, cited by this paper)
Semi-Supervised Classification with Graph Convolutional Networks (2016, influential reference)
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (2016, cited by this paper)
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition (2016, cited by this paper)
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis (2016, influential reference)
Deep Residual Learning for Image Recognition (2015, cited by this paper)
Diffusion-Convolutional Neural Networks (2015, cited by this paper)
Deep Convolutional Networks on Graph-Structured Data (2015, cited by this paper)
Hierarchical recurrent neural network for skeleton based action recognition (2015, cited by this paper)
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015, cited by this paper)
Multi-Scale Context Aggregation by Dilated Convolutions (2015, cited by this paper)
Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group (2014, cited by this paper)
Learning Spatiotemporal Features with 3D Convolutional Networks (2014, cited by this paper)
Spectral Networks and Locally Connected Networks on Graphs (2013, influential reference)
Mining actionlet ensemble for action recognition with depth cameras (2012, cited by this paper)
Wavelets on Graphs via Spectral Graph Theory (2009, influential reference)

CITED BY

SkeFi: Cross-Modal Knowledge Transfer for Wireless Skeleton-Based Action Recognition (2026, influential citation)
Spatial-temporal channel topology-optimized graph transformer for skeleton-based wild terrestrial mammalian behavior recognition (2026, cites this paper)
RelPosGAR: Hierarchical Relative Position-Aware Interaction Modeling for Weakly Supervised Skeleton-Based Group Activity Recognition (2026, cites this paper)
Hierarchical joint contrastive learning with knowledge distillation for self-supervised 3D skeleton-based action recognition (2026, cites this paper)
Dynamic-static partitioning mask and multi-dimensional attention mechanism for skeleton-based action recognition (2026, cites this paper)
TE-STGCN: Topology enhanced spatio-temporal graph convolutional network for skeleton-based action recognition (2026, cites this paper)
ST-VA-AR: Learning Velocity-Aware Action Representations with Mixture of Spatiotemporal Attention (2026, cites this paper)
Local-Global Feature Fusion for Enhancing 3D Human Pose Estimation (2026, cites this paper)
Multimodal action recognition in human-robot collaborative assembly: A contrastive semantic query approach (2026, cites this paper)
ASMa: Asymmetric Spatio-temporal Masking for Skeleton Action Representation Learning (2026, cites this paper)
E2E-GNet: An End-to-End Skeleton-based Geometric Deep Neural Network for Human Motion Recognition (2026, cites this paper)
Affinity Contrastive Learning for Skeleton-Based Human Activity Understanding (2026, cites this paper)
Fine-to-coarse self-attention graph convolutional network for skeleton-based action recognition (2026, cites this paper)
Frequency-Aware Spatio-Temporal Topology Learning for Skeleton-Based Human Activity Recognition (2026, cites this paper)
SkelFormer: An adaptive hierarchical transformer-based approach on skeleton graphs for human action recognition in video sequences (2026, cites this paper)
Accurate Courtship-Related Social Behavior Automated Recognition in Zebrafish (2026, cites this paper)
Recognition of Daily Activities through Multi-Modal Deep Learning: A Video, Pose, and Object-Aware Approach for Ambient Assisted Living (2026, cites this paper)
FMFNet: A Faster Multimodal Fusion Network for action recognition via efficient modality compensation (2026, cites this paper)
Egocentric Hand Activity Video Dataset and Bidirectional Motion-Priors for Hand Action Recognition (2026, cites this paper)
3D human pose estimation-based action recognition method for complex industrial scenarios (2026, cites this paper)
TDSN-GCN: Transformerify Overall Structure Decaying Static Graph Embedding NAS-Guided GCN for Skeleton Action Recognition (2026, cites this paper)
MAR-GCN: A meta-action refinement graph convolutional network for skeleton-based human action recognition (2026, cites this paper)
Skarimva: Skeleton-based Action Recognition is a Multi-view Application (2026, cites this paper)
ImpSGNv2: Improved Semantic-Guided Network with Attention-based Graph Convolution (GCNs) For Skeleton-based Action Recognition (2025, cites this paper)
Multimodal Raga Classification from Vocal Performances with Disentanglement and Contrastive Loss (2025, cites this paper)
Towards understanding human actions through long-short-term semantic motion encoding (2025, cites this paper)
UniSTFormer: Unified Spatio-Temporal Lightweight Transformer for Efficient Skeleton-Based Action Recognition (2025, cites this paper)
Client-Unbiased Skeletal Action Recognizer in Federated Learning (2025, cites this paper)
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025, cites this paper)
AeroGen: Ground-to-Air Generalization for Action Recognition (2025, cites this paper)
Frame topology fusion-based hierarchical graph convolution for automatic assessment of physical rehabilitation exercises (2025, cites this paper)
LSTF-GCN: local spatio-temporal feature fusion graph convolutional network for skeleton-based action recognition (2025, cites this paper)
Learning Adaptive Node Selection with External Attention for Human Interaction Recognition (2025, cites this paper)
Error-Guided Pose Augmentation: Enhancing Rehabilitation Exercise Assessment through Targeted Data Generation (2025, cites this paper)
Multi-scale Self-Attention Convolutional Networks for Skeleton-Based Action Recognition (2025, influential citation)
Efficient Real-Time Fine-Grained Action Recognition over a Progressive and Hierarchical Classification Framework (2025, cites this paper)
Multi-Modal Graph Convolutional Network with Sinusoidal Encoding for Robust Human Action Segmentation (2025, cites this paper)
Automated violence monitoring system for real-time fistfight detection using deep learning-based temporal action localization (2025, cites this paper)
Skeleton tokenized graph transformer via the Joint Bone Graph for action recognition (2025, cites this paper)
LLMs Encounter Critical Elements Prompts: Semantically Guided Partial Supervision Skeleton-Based Action Recognition (2025, cites this paper)
Attention mechanism based multimodal feature fusion network for human action recognition (2025, cites this paper)
Automated motor-leg scoring in stroke via a stable graph causality debiasing model (2025, cites this paper)
ATD-GCN: A human activity recognition approach for human-robot collaboration based on adaptive skeleton tree-decomposition (2025, cites this paper)
Frequency-Aware Self-Supervised Group Activity Recognition with skeleton sequences (2025, cites this paper)
Snippet-Aware Transformer With Multiple Action Elements for Skeleton-Based Action Segmentation (2025, cites this paper)
Robust Understanding of Human-robot Social Interactions through Multimodal Distillation (2025, cites this paper)
SHARDeg: A Benchmark for Skeletal Human Action Recognition in Degraded Scenarios (2025, cites this paper)
GAMA-Pose: Graph-Aware Multi-Representation Aggregation for 3D Human Pose Estimation (2025, cites this paper)
A lightweight model LGCSPNet for sitting posture risk management applications (2025, cites this paper)
End-to-end pose-action recognition via implicit pose encoding and multi-scale skeleton modeling (2025, cites this paper)
Spatial–Temporal Transformer for Optimizing Human Health Through Skeleton-Based Body Sports Action Recognition (2025, influential citation)
Dynamic multi-stream graph neural networks for efficient interactive action recognition (2025, cites this paper)
Learning Pose-Aware Representations in Vision Transformers for Understanding Activities of Daily Living (2025, cites this paper)
Skor-Xg: Skeleton-Oriented Expected Goal Estimation in Soccer (2025, cites this paper)
Dynamic Adaptive Graph Convolution with Attention for Skeleton-Based Action Recognition (2025, influential citation)
Video-Based Human-Object Interaction Analysis for Patient Behavioral Monitoring (2025, cites this paper)
ODMTCNet: An Interpretable Multiview Deep Neural Network Architecture for Feature Representation (2025, cites this paper)
Enhancing action recognition in educational settings using AI-driven information systems for public health monitoring (2025, cites this paper)
Research on human skeleton behavior recognition method based on GCN-Transformer hybrid network architecture (2025, cites this paper)
Hybrid-Supervised Hypergraph-Enhanced Transformer for Micro-Gesture Based Emotion Recognition (2025, cites this paper)
LiteFat: Lightweight Spatio-Temporal Graph Learning for Real-Time Driver Fatigue Detection (2025, cites this paper)
Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition (2025, cites this paper)
Enhanced fine-grained relearning for skeleton-based action recognition (2025, cites this paper)
Towards Generalizing Temporal Action Segmentation to Unseen Views (2025, cites this paper)
Text-Derived Relational Graph-Enhanced Network for Skeleton-Based Action Segmentation (2025, cites this paper)
VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention (2025, cites this paper)
Event-Guided Video Transformer for End-to-End 3D Human Pose Estimation (2025, cites this paper)
Event-Guided Fusion-Mamba for Context-Aware 3D Human Pose Estimation (2025, cites this paper)
A Systematic Review on Vision-Based Proactive Human Assembly Intention Recognition for Human-Centric Smart Manufacturing in Industry 5.0 (2025, cites this paper)
Temporal refinement with channel topology attention network for skeleton-based action recognition (2025, cites this paper)
TCD-GCN-Light: A Lightweight Temporal-Channel Decoupling Graph Convolutional Network for human early action prediction based on channel fusion (2025, cites this paper)
Theater Scene Description for Human-Scene Interaction (2025, influential citation)
Soccer-CLIP: Vision Language Model for Soccer Action Spotting (2025, cites this paper)
Latent space improved masked reconstruction model for human skeleton-based action recognition (2025, cites this paper)
Online Hand Gesture Recognition Using Semantically Interpretable Attention Mechanism (2025, cites this paper)
Sparse and Dense: Learning Confusion Representation Network for 3-D Action Recognition (2025, influential citation)
Jointly Understand Your Command and Intention: Reciprocal Co-Evolution Between Scene-Aware 3D Human Motion Synthesis and Analysis (2025, cites this paper)
SeamFit: Towards Practical Smart Clothing for Automatic Exercise Logging (2025, cites this paper)
Dual Multi-Scale GCN with Deformable Temporal Kernel for Skeleton-based Action Recognition (2025, cites this paper)
SkeletonMix: A Mixup-Based Data Augmentation Framework for Skeleton-Based Action Recognition (2025, cites this paper)
Multi Activity Sequence Alignment via Implicit Clustering (2025, cites this paper)
Semantics-Assisted Training Graph Convolution Network for Skeleton-Based Action Recognition (2025, cites this paper)
Infant Action Generative Modeling (2025, cites this paper)
SSTAR: Skeleton-Based Spatio-Temporal Action Recognition for Intelligent Video Surveillance and Suicide Prevention in Metro Stations (2025, influential citation)
Fall recognition using a three stream spatio temporal GCN model with adaptive feature aggregation (2025, cites this paper)
Action Recognition in Real-World Ambient Assisted Living Environment (2025, cites this paper)
Cross-Scale Spatial Refinement Graph Convolutional Network for Skeleton-Based Action Recognition (2025, cites this paper)
MAF-Net: A multimodal data fusion approach for human action recognition (2025, cites this paper)
SkeletonX: Data-Efficient Skeleton-Based Action Recognition via Cross-Sample Feature Aggregation (2025, cites this paper)
Robust 2D Skeleton Action Recognition via Decoupling and Distilling 3D Latent Features (2025, cites this paper)
Multi-stage human motion prediction algorithm based on spatiotemporal graph convolution (2025, cites this paper)
CoCoDiff: Diversifying Skeleton Action Features via Coarse-Fine Text-Co-Guided Latent Diffusion (2025, cites this paper)
Empowering Deaf Communication: "Deep Learning Framework for Sign to Speech Translation" (2025, cites this paper)
Structural Topology Refinement Network for Skeleton-Based Action Recognition (2025, cites this paper)
MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation (2025, cites this paper)
The Journey of Action Recognition (2025, cites this paper)
Heterogeneous modal collaborative training network for human action recognition (2025, cites this paper)
Selectable receptive-field graph convolution networks for skeleton-based action recognition (2025, cites this paper)
InterHandNet: Capturing Two-hand Interaction for Robust Hand-washing Activity Recognition (2025, cites this paper)
Remote Sensing Surveillance Using Multilevel Feature Fusion and Deep Neural Network (2025, cites this paper)