Dynamic Concept Composition for Zero-Example Event Detection

Xiaojun Chang, Yi Yang, Guodong Long, Chengqi Zhang, Alexander Hauptmann

Published 2016 in AAAI Conference on Artificial Intelligence

ABSTRACT

In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars. In principle, zero-shot learning makes it possible to train an event detection model based on the assumption that events (e.g., "birthday party") can be described by multiple mid-level semantic concepts (e.g., "blowing candle", "birthday cake"). Towards this goal, we first pre-train a bundle of concept classifiers using data from other sources. Then we evaluate the semantic correlation of each concept with the event of interest and select the relevant concept classifiers, which are applied to all test videos to obtain multiple prediction score vectors. While most existing systems combine the predictions of the concept classifiers with fixed weights, we propose to learn the optimal weights of the concept classifiers for each test video by exploring a set of videos available online with free-form text descriptions of their content. To validate the effectiveness of the proposed approach, we have conducted extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets. The experimental results confirm the superiority of the proposed approach.
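The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only the fixed-weight fusion baseline that the abstract contrasts against, and it assumes that semantic correlation is measured by cosine similarity between embedding vectors. The function names (`cosine`, `select_concepts`, `fuse_fixed`), the toy embeddings, and the classifier scores are all hypothetical. The per-video weight learning that the paper proposes is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_concepts(event_vec, concept_vecs, k):
    """Rank concepts by similarity to the event and keep the top k as (name, similarity)."""
    scored = [(name, cosine(event_vec, vec)) for name, vec in concept_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def fuse_fixed(video_scores, selected):
    """Fixed-weight fusion: similarity-weighted average of the selected concept scores."""
    total = sum(weight for _, weight in selected)
    if total == 0:
        return 0.0
    return sum(weight * video_scores[name] for name, weight in selected) / total


# Hypothetical toy embeddings for an event and three concepts (assumed values).
event_vec = [1.0, 0.0, 1.0]
concept_vecs = {
    "blowing candle": [1.0, 0.0, 1.0],
    "birthday cake": [1.0, 0.0, 0.0],
    "riding bike": [0.0, 1.0, 0.0],
}

# Keep the two concepts most related to the event.
selected = select_concepts(event_vec, concept_vecs, k=2)

# Hypothetical concept-classifier outputs for one test video.
video_scores = {"blowing candle": 0.9, "birthday cake": 0.6, "riding bike": 0.1}
event_score = fuse_fixed(video_scores, selected)
print(selected, event_score)
```

In this sketch every test video is scored with the same weights. The approach described in the abstract differs precisely at this step, replacing the shared weights with weights learned for each test video.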

PUBLICATION RECORD

Publication year
2016
Venue
AAAI Conference on Artificial Intelligence
Publication date
2016-01-14
Fields of study
Computer Science
Identifiers
DOI: 10.1609/aaai.v30i1.10474
arXiv: 1601.03679
Source metadata
Semantic Scholar

REFERENCES

Large-Scale Multi-View Spectral Clustering via Bipartite Graph
2015, cited by this paper
Multimedia Event Detection
2015, cited by this paper
Searching Persuasively: Joint Event Detection and Evidence Recounting with Limited Supervision
2015, influential reference
Learning Sample Specific Weights for Late Fusion
2015, influential reference
Event Oriented Dictionary Learning for Complex Event Detection
2015, cited by this paper
Complex Event Detection via Event Oriented Dictionary Learning
2015, cited by this paper
Semantic Concept Discovery for Large-Scale Zero-Shot Event Detection
2015, influential reference
Complex Event Detection using Semantic Saliency and Nearly-Isotonic SVM
2015, cited by this paper
DISCOVER: Discovering Important Segments for Classification of Video Events and Recounting
2014, cited by this paper
Temporal Sequence Modeling for Video Event Detection
2014, cited by this paper
Video Event Detection by Inferring Temporal Instance Labels
2014, influential reference
Self-Paced Learning with Diversity
2014, influential reference
Composite Concept Discovery for Zero-Shot Video Event Detection
2014, cited by this paper
COSTA: Co-Occurrence Statistics for Zero-Shot Classification
2014, cited by this paper
Event-Driven Semantic Concept Discovery by Exploiting Weakly Tagged Internet Images
2014, cited by this paper
Instructional Videos for Unsupervised Harvesting and Learning of Action Examples
2014, cited by this paper
Large-Scale Video Classification with Convolutional Neural Networks
2014, cited by this paper
Clustering and projected clustering with adaptive neighbors
2014, cited by this paper
Dynamic Pooling for Complex Event Recognition
2013, cited by this paper
On Decomposing the Proximal Map
2013, cited by this paper
Sample-Specific Late Fusion for Visual Category Recognition
2013, influential reference
Multi-attribute Queries: To Merge or Not to Merge?
2013, cited by this paper
Searching informative concept banks for video event detection
2013, cited by this paper
Zero-shot video retrieval using content and concepts
2013, cited by this paper
Recommendations for video event recognition using concept vocabularies
2013, cited by this paper
Distributed Representations of Words and Phrases and their Compositionality
2013, cited by this paper
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
2012, cited by this paper
Knowledge adaptation for ad hoc multimedia event detection with few exemplars
2012, cited by this paper
Scene Aligned Pooling for Complex Video Recognition
2012, cited by this paper
Sparse Support Vector Infinite Push
2012, cited by this paper
Distinctive Image Features from Scale-Invariant Keypoints (abstract by Matthijs Dorst)
2011, cited by this paper
Consumer video understanding: a benchmark database and an evaluation of human and machine performance
2011, cited by this paper
TRECVID 2011 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms, and Metrics
2011, cited by this paper
Improving the Fisher Kernel for Large-Scale Image Classification
2010, influential reference
Learning mid-level features for recognition
2010, cited by this paper
Notes on the OpenSURF Library
2009, cited by this paper
Learning to detect unseen object classes by between-class attribute transfer
2009, cited by this paper
Zero-shot Learning with Semantic Output Codes
2009, influential reference
Parallel Support Vector Machines: The Cascade SVM
2004, influential reference
Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
2001, cited by this paper
Cognition and Categorization
1980, cited by this paper
Action Recognition with Improved Trajectories (author manuscript, published in International Conference on Computer Vision, 2013)
year unknown, influential reference

CITED BY

Leveraging Transformers for Weakly Supervised Object Localization in Unconstrained Videos
2024, cites this paper
A Systematic Review of Event-Matching Methods for Complex Event Detection in Video Streams
2024, cites this paper
CoLo-CAM: Class Activation Mapping for Object Co-Localization in Weakly-Labeled Unconstrained Videos
2023, cites this paper
Semantics Guided Contrastive Learning of Transformers for Zero-shot Temporal Activity Detection
2023, cites this paper
Zero-Shot Video Event Detection With High-Order Semantic Concept Discovery and Matching
2022, influential citation
TCAM: Temporal Class Activation Maps for Object Localization in Weakly-Labeled Unconstrained Videos
2022, cites this paper
Object Priors for Classifying and Localizing Unseen Actions
2021, cites this paper
Zero-Shot Action Recognition from Diverse Object-Scene Compositions
2021, cites this paper
Semantics-Guided Contrastive Network for Zero-Shot Object Detection
2021, cites this paper
ZSTAD: Zero-Shot Temporal Activity Detection
2020, cites this paper
Grounding Visual Concepts for Zero-Shot Event Detection and Event Captioning
2020, cites this paper
Shuffled ImageNet Banks for Video Event Detection and Search
2020, cites this paper
Exploiting Mid-Level Semantics for Large-Scale Complex Video Classification
2019, cites this paper
Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent
2019, cites this paper
Learning to rank images for complex queries in concept-based search
2018, cites this paper
Stable and orthogonal local discriminant embedding using trace ratio criterion for dimensionality reduction
2018, cites this paper
Watching a Small Portion could be as Good as Watching All: Towards Efficient Video Classification
2018, cites this paper
Structured Summarization of Social Web for Smart Emergency Services by Uncertain Concept Graph
2018, cites this paper
From Text to Video: Exploiting Mid-Level Semantics for Large-Scale Video Classification
2018, cites this paper
Determining the best attributes for surveillance video keywords generation
2018, cites this paper
Positive Negative Positive Negative Positive Negative Meaningful Subspace 2 D 3 Boxy 3 D 3 Boxy
2018, cites this paper
Semantic Reasoning in Zero Example Video Event Retrieval
2017, influential citation
Exploring Commonality and Individuality for Multi-Modal Curriculum Learning
2017, cites this paper
Recent Advances in Zero-Shot Recognition: Toward Data-Efficient Understanding of Visual Content
2017, cites this paper
Unified Embedding and Metric Learning for Zero-Exemplar Event Detection
2017, influential citation
SPFTN: A Self-Paced Fine-Tuning Network for Segmenting Objects in Weakly Labelled Videos
2017, cites this paper
Leveraging Weak Semantic Relevance for Complex Video Event Classification
2017, cites this paper
Complex event detection via attention-based video representation and classification
2017, cites this paper
An efficient multi-feature SVM solver for complex event detection
2017, cites this paper
Zero-Shot Action Recognition with Error-Correcting Output Codes
2017, cites this paper
A Weight-Adaptive Laplacian Embedding for Graph-Based Clustering
2017, cites this paper
LGA: latent genre aware micro-video recommendation on social media
2017, cites this paper
Unsupervised 2D Dimensionality Reduction with Adaptive Structure Learning
2017, cites this paper
Concept Language Models and Event-based Concept Number Selection for Zero-example Event Detection
2017, cites this paper
The Many Shades of Negativity
2017, cites this paper
Face detection of golden monkeys via regional color quantization and incremental self-paced curriculum learning
2017, cites this paper
Event-based media processing and analysis: A survey of the literature
2016, cites this paper
A Learning-based Frame Pooling Model For Event Detection
2016, cites this paper
ARGIS-based outdoor underground pipeline information system
2016, cites this paper
Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation
2016, cites this paper
What is the best way for extracting meaningful attributes from pictures?
2016, cites this paper
Unsupervised automatic attribute discovery method via multi-graph clustering
2016, cites this paper
They are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers
2016, cites this paper
Exploring semantic concepts for complex event analysis in unconstrained video clips
2016, cites this paper
Automatic and quantitative evaluation of attribute discovery methods
2016, cites this paper
Determining the best attributes for surveillance video keywords generation
2016, cites this paper
Robust Automatic Target Recognition Algorithm for Large-Scene SAR Images and Its Adaptability Analysis on Speckle
2016, cites this paper
A novel learning-based frame pooling method for event detection
2016, cites this paper
Beyond Semantic Attributes: Discrete Latent Attributes Learning for Zero-Shot Recognition
2016, cites this paper
REM-Net: Recursive Erasure Memory Network for Explanation Refinement on Commonsense Question Answering
year unknown, cites this paper