Feedforward object-vision models only tolerate small image variations compared to human

M. Ghodrati, Amirhossein Farzmahdi, Karim Rajaei, R. Ebrahimpour, Seyed-Mahdi Khaligh-Razavi

Published 2014 in Frontiers in Computational Neuroscience

ABSTRACT

Invariant object recognition is a remarkable ability of the primate visual system, and its underlying mechanism has long been under intense investigation. Computational modeling is a valuable tool for understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that building sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with that of human observers shows that the models perform similarly to humans in categorization tasks only at low levels of image variation. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex.

PUBLICATION RECORD

Publication year
2014
Venue
Frontiers in Computational Neuroscience
Publication date
2014-07-18
Fields of study
Medicine, Computer Science
Identifiers
DOI 10.3389/fncom.2014.00074; PMID 25100986; PMCID 4103258
Source metadata
Semantic Scholar, PubMed

REFERENCES

Performance-optimized hierarchical models predict neural responses in higher visual cortex
2014influential reference
Decoding neural representational spaces using multivariate pattern analysis.
2014cited by this paper
Do we understand high-level vision?
2014influential reference
The dynamics of invariant object recognition in the human visual system.
2014cited by this paper
Resolving human object recognition in space and time
2014influential reference
A Toolbox for Representational Similarity Analysis
2014cited by this paper
Anatomy of hierarchy: Feedforward and feedback pathways in macaque visual cortex
2013cited by this paper
Unsupervised Learning of Invariant Representations in Hierarchical Architectures
2013cited by this paper
Population-code representations of natural images across human visual areas
2013cited by this paper
Newborn chickens generate invariant object representations at the onset of visual object experience
2013cited by this paper
Multifeatural Shape Processing in Rats Engaged in Invariant Visual Object Recognition
2013cited by this paper
Recurrent Processing during Object Recognition
2013cited by this paper
Trade-off between curvature tuning and position invariance in visual area V4
2013cited by this paper
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
2013cited by this paper
Representational dynamics of object vision: the first 1000 ms.
2013cited by this paper
Shape Similarity, Better than Semantic Membership, Accounts for the Structure of Visual Object Representations in a Population of Monkey Inferotemporal Neurons
2013cited by this paper
Detecting meaning in RSVP at 13 ms per picture
2013cited by this paper
A functional and perceptual signature of the second visual area in primates
2013cited by this paper
Learning invariant representations and applications to face verification
2013cited by this paper
Toward a unified theory of visual area V4.
2012cited by this paper
A Stable Biologically Motivated Learning Mechanism for Visual Feature Extraction to Handle Facial Categorization
2012cited by this paper
ImageNet classification with deep convolutional neural networks
2012cited by this paper
Balanced Increases in Selectivity and Tolerance Produce Constant Sparseness along the Ventral Visual Stream
2012cited by this paper
How does the brain solve visual object recognition?
2012cited by this paper
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet
2012cited by this paper
The Limits of Feedforward Vision: Recurrent Processing Promotes Robust Object Recognition when Objects Are Degraded
2012cited by this paper
How Can Selection of Biologically Inspired Features Improve the Performance of a Robust Object Recognition Model?
2012influential reference
Incremental Learning by Message Passing in Hierarchical Temporal Memory
2012cited by this paper
The Characteristics and Limits of Rapid Visual Categorization
2011cited by this paper
Comparing state-of-the-art visual features on invariant object recognition tasks
2011cited by this paper
The Timing of Visual Object Categorization
2011cited by this paper
High temporal resolution decoding of object position and category.
2011cited by this paper
Recurrent Processing in V1/V2 Contributes to Categorization of Natural Scenes
2011cited by this paper
Encoding and decoding in fMRI
2011cited by this paper
A neural model of the temporal dynamics of figure-ground segregation in motion perception
2010cited by this paper
Renovation of Journal of Visualization
2010cited by this paper
On the limits of feed-forward processing in visual object recognition
2010cited by this paper
Selectivity and Tolerance (“Invariance”) Both Increase as Visual Information Propagates from Cortical Area V4 to IT
2010cited by this paper
Functional Compartmentalization and Viewpoint Generalization Within the Macaque Face-Processing System
2010cited by this paper
What is the best multi-stage architecture for object recognition?
2009cited by this paper
Relating Population-Code Representations between Man, Monkey, and Computational Models
2009cited by this paper
Temporal Constraints
2009cited by this paper
Matching categorical object representations in inferior temporal cortex of man and monkey.
2008cited by this paper
Representational Similarity Analysis – Connecting the Branches of Systems Neuroscience
2008influential reference
Why is Real-World Visual Object Recognition Hard?
2008influential reference
Feedforward and Recurrent Processing in Scene Segmentation: Electroencephalography and Functional Magnetic Resonance Imaging
2008cited by this paper
Trade-Off between Object Selectivity and Tolerance in Monkey Inferotemporal Cortex
2007cited by this paper
Object category structure in response patterns of neuronal population in monkey inferior temporal cortex.
2007cited by this paper
A feedforward architecture accounts for rapid categorization
2007influential reference
Untangling invariant object recognition.
2007cited by this paper
Visual object recognition: do we know more now than we did 20 years ago?
2007cited by this paper
Robust Object Recognition with Cortex-Like Mechanisms
2007influential reference
Object selectivity of local field potentials and spikes in the macaque inferior temporal cortex.
2006cited by this paper
Multiclass Object Recognition with Sparse, Localized Features
2006cited by this paper
Ultra-rapid object detection with saccadic eye movements: visual processing speed revisited.
2006cited by this paper
Do We Know What the Early Visual System Does?
2005cited by this paper
Fast Readout of Object Identity from Macaque Inferior Temporal Cortex
2005cited by this paper
The time course of visual processing: backward masking and natural scene categorisation.
2005cited by this paper
Figure–ground segregation requires two distinct periods of activity in V1: a transcranial magnetic stimulation study
2005cited by this paper
Figure-Ground Segregation in a Recurrent Network Architecture
2002cited by this paper
Visual features of intermediate complexity and their use in classification
2002cited by this paper
The Time Course of Visual Processing: From Early Perception to Decision-Making
2001cited by this paper
Distributed and Overlapping Representations of Faces and Objects in Ventral Temporal Cortex
2001cited by this paper
The distinct modes of vision offered by feedforward and recurrent processing.
2000cited by this paper
Temporal constraints on the grouping of contour segments into spatially extended objects.
1999cited by this paper
Separate processing dynamics for texture elements, boundaries and surfaces in primary visual cortex of the macaque monkey.
1999cited by this paper
Hierarchical models of object recognition in cortex
1999cited by this paper
Convolutional networks for images, speech, and time series
1998cited by this paper
The Psychophysics Toolbox.
1997cited by this paper
The VideoToolbox software for visual psychophysics: transforming numbers into movies.
1997cited by this paper
Invariant face and object recognition in the visual system.
1997cited by this paper
Visual object recognition.
1996cited by this paper
Inferotemporal cortex and object vision.
1996cited by this paper
Speed of processing in the human visual system
1996influential reference
The neurophysiology of figure-ground segregation in primary visual cortex
1995cited by this paper
Distributed hierarchical processing in the primate cerebral cortex.
1991cited by this paper
Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors
1976cited by this paper
Recognition memory for a rapid sequence of pictures.
1969cited by this paper
Receptive fields and functional architecture of monkey striate cortex
1968cited by this paper
Receptive fields, binocular interaction and functional architecture in the cat's visual cortex
1962cited by this paper
Vision: Are Models of Object Recognition Catching up with the Brain?
year unknowncited by this paper

CITED BY

Predictive coding narrows the gap between convolutional networks and human brain function in misspelled-word reading
2026cites this paper
Unveiling the content of frontal feedback in challenging object recognition
2025cites this paper
Measuring Error Alignment for Decision-Making Systems
2024cites this paper
Spiking representation learning for associative memories
2024cites this paper
Multimodal contrastive learning for brain-machine fusion: From brain-in-the-loop modeling to brain-out-of-the-loop application
2024cites this paper
A fully spiking coupled model of a deep neural network and a recurrent attractor explains dynamics of decision making in an object recognition task
2024cites this paper
Recurrent issues with deep neural network models of visual recognition
2024cites this paper
Self-Calibrating Vicinal Risk Minimisation for Model Calibration
2024influential citation
Convolutional neural networks for vision neuroscience: significance, developments, and outstanding issues
2023cites this paper
Layerwise complexity-matched learning yields an improved model of cortical area V2
2023cites this paper
Self-attention in vision transformers performs perceptual grouping, not attention
2023cites this paper
Deeper neural network models better reflect how humans cope with contrast variation in object recognition.
2023cites this paper
Resolving the neural mechanism of core object recognition in space and time: A computational approach.
2022cites this paper
Informative neural representations of unseen contents during higher-order processing in human brains and deep artificial networks
2021cites this paper
The Comparison of Environmental Constraints Changes on Quiet Eye Factors during Performance Skill of Throw Targeting
2021cites this paper
Task‐dependent neural representations of visual object categories
2021cites this paper
A temporal hierarchical feedforward model explains both the time and the accuracy of object recognition
2021cites this paper
Challenges and Opportunities of End-to-End Learning in Medical Image Classification
2020influential citation
Beyond accuracy: quantifying trial-by-trial behaviour of CNNs and humans by measuring error consistency
2020cites this paper
Convolutional Neural Networks as a Model of the Visual System: Past, Present, and Future
2020cites this paper
Distinguishing mirror from glass: A “big data” approach to material perception
2019cites this paper
What causes the dip in object recognition rotation functions
2019cites this paper
Beyond Core Object Recognition: Recurrent processes account for object recognition under occlusion
2019cites this paper
Beyond core object recognition: Recurrent processes account for object recognition under occlusion
2018cites this paper
A temporal neural network model for object recognition using a biologically plausible decision making layer
2018cites this paper
Domain Adaptation for Deviating Acquisition Protocols in CNN-based Lesion Classification on Diffusion-Weighted MR Images
2018cites this paper
Large-Scale, High-Resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-Art Deep Artificial Neural Networks
2018cites this paper
Three-stage processing of category and variation information by entangled interactive mechanisms of peri-occipital and peri-frontal cortices
2018influential citation
Comparing computational models of vision to human behaviour
2018influential citation
Towards a Theory of Computation in the Visual Cortex
2017cites this paper
Invariant object recognition is a personalized selection of invariant features in humans, not simply explained by hierarchical feed-forward vision models
2017influential citation
Hard-wired feed-forward visual mechanisms of the brain compensate for affine variations in object recognition.
2017cites this paper
Models of visual categorization.
2016cites this paper
Humans and Deep Networks Largely Agree on Which Kinds of Variation Make Object Recognition Harder
2016cites this paper
How popular CNNs perform in real applications of face recognition
2016cites this paper
Incorporating Prototype Theory in Convolutional Neural Networks
2016cites this paper
The Role of Typicality in Object Classification: Improving The Generalization Capacity of Convolutional Neural Networks
2016cites this paper
STDP-based spiking deep neural networks for object recognition
2016cites this paper
Bio-inspired unsupervised learning of visual features leads to robust invariant object recognition
2015cites this paper
Hierarchical Object Representations in the Visual Cortex and Computer Vision
2015cites this paper
Measuring and Understanding Sensory Representations within Deep Networks Using a Numerical Optimization Framework
2015cites this paper
A specialized face-processing network consistent with the representational geometry of monkey face patches
2015cites this paper
Editorial: Hierarchical Object Representations in the Visual Cortex and Computer Vision
2015cites this paper
Creating a Human Similarity Ratings Benchmark Database for Artificial Neural Networks
2015cites this paper
Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition
2015influential citation
Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation
2014cites this paper
What you need to know about the state-of-the-art computational models of object-vision: A tour through the models
2014cites this paper
Fixed versus mixed RSA: Explaining visual representations by fixed and mixed feature sets from shallow and deep computational models
2014cites this paper