Deep learning models fail to capture the configural nature of human shape perception

Published 2022 in iScience

ABSTRACT

A hallmark of human object perception is sensitivity to the holistic configuration of the local shape features of an object. Deep convolutional neural networks (DCNNs) are currently the dominant models for object recognition processing in the visual cortex, but do they capture this configural sensitivity? To answer this question, we employed a dataset of animal silhouettes and created a variant of this dataset that disrupts the configuration of each object while preserving local features. While human performance was impacted by this manipulation, DCNN performance was not, indicating insensitivity to object configuration. Modifications to training and architecture to make networks more brain-like did not lead to configural processing, and none of the networks were able to accurately predict trial-by-trial human object judgements. We speculate that to match human configural sensitivity, networks must be trained to solve a broader range of object tasks beyond category recognition.
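
The abstract's comparison of network and human responses on a trial-by-trial basis can be illustrated with a chance-corrected agreement score. The sketch below computes Cohen's kappa between two response sequences. It is an assumption that kappa is a suitable measure here; the abstract does not say which metric the authors used. The function name `cohens_kappa` and the response lists are hypothetical and invented purely for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of trials with identical responses.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both raters responded independently,
    # each with their own marginal response frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters gave the same single label on every trial
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical per-trial judgements (not data from the paper).
human = ["animal", "animal", "not", "not", "animal", "not"]
model = ["animal", "animal", "animal", "animal", "animal", "not"]
kappa = cohens_kappa(human, model)
print(f"kappa = {kappa:.3f}")
```

Raw percent agreement can look high when one response dominates, so a chance-corrected score is often a more conservative way to ask whether a model predicts individual human judgements rather than just overall accuracy.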

PUBLICATION RECORD

Publication year
2022
Venue
iScience
Publication date
2022-08-01
Fields of study
Medicine, Computer Science, Psychology
Identifiers
DOI 10.1016/j.isci.2022.104913 PMID 36060067 PMCID 9429800
Source metadata
Semantic Scholar, PubMed

CITED BY

SPECTRA-Net: Spatiotemporal edge-preserving contextual reinforcement architecture for adaptive crowd behavior recognition
2026cites this paper
Behavioral differences between humans and machines arise early in visual processing.
2026cites this paper
Review of Image Analysis Using Deep Learning Method Applied to the Road Context in Degraded Weather Conditions
2026cites this paper
Potential role of developmental experience in the emergence of the parvo-magno distinction
2025cites this paper
Learning Object Focused Attention
2025influential citation
Alignment and Adversarial Robustness: Are More Human-Like Models More Secure?
2025cites this paper
Brain-Model Evaluations Need the NeuroAI Turing Test
2025cites this paper
Connecting the dots - Recognition of artificial and natural shapes relies on representing points of high information
2025cites this paper
Visual Anagrams Reveal Hidden Differences in Holistic Shape Processing Across Vision Models
2025cites this paper
MindSet: Vision. A toolbox for testing DNNs on key psychological experiments
2024influential citation
Population encoding of stimulus features along the visual hierarchy
2024cites this paper
Teaching deep networks to see shape: Lessons from a simplified visual world
2024influential citation
Editorial: Perceptual organization in computer and biological vision
2024cites this paper
Configural processing as an optimized strategy for robust object recognition in neural networks
2024cites this paper
Newly sighted perceivers and the relation between sight and touch
2024cites this paper
Deep convolutional neural networks are sensitive to face configuration
2024cites this paper
A computational deep learning investigation of animacy perception in the human brain
2024cites this paper
Human shape perception spontaneously discovers the biological origin of novel, but natural, stimuli
2024cites this paper
Image curvature assessment in the presence of distractors
2024cites this paper
On the importance of severely testing deep learning models of cognition
2023cites this paper
Does resistance to Style-Transfer equal Shape Bias? Evaluating Shape Bias by Distorted Shape
2023cites this paper
Meaningful Communication but not Superficial Anthropomorphism Facilitates Human-Automation Trust Calibration: The Human-Automation Trust Expectation Model (HATEM)
2023cites this paper
Does training with blurred images bring convolutional neural networks closer to humans with respect to robust object recognition and internal representations?
2023cites this paper
Image Memorability Prediction with Vision Transformers
2023cites this paper
Does resistance to style-transfer equal Global Shape Bias? Measuring network sensitivity to global shape configuration
2023cites this paper
Configural relations in humans and deep convolutional neural networks
2023cites this paper
What Is a Preferred Retinal Locus?
2023cites this paper
Shape-selective processing in deep networks: integrating the evidence on perceptual integration
2023cites this paper
Classifying Malignancy in Prostate Glandular Structures from Biopsy Scans with Deep Learning
2023cites this paper
Population encoding of stimulus features along the visual hierarchy
2023cites this paper
Gestalt theory: A revolution put on pause? Prospects for a paradigm shift in the psychological sciences
2023cites this paper
Drawing as a versatile cognitive tool
2023cites this paper
Deep problems with neural network models of human vision
2022cites this paper
A novel feature-scrambling approach reveals the capacity of convolutional neural networks to learn spatial relations
2022cites this paper
The best game in town: The reemergence of the language-of-thought hypothesis across the cognitive sciences
2022cites this paper
Does the brain's ventral visual pathway compute object shape?
2022cites this paper
SpIRL: Spatially-aware image representation learning under the supervision of relative position descriptors
year unknowncites this paper
Inferring DNN-Brain Alignment Using Representational Similarity Analyses Can Be Problematic
year unknowncites this paper