Chain-of-thought reasoning and scratchpads have emerged as critical tools for enhancing the computational capabilities of transformers. While theoretical results show that polynomial-length scratchpads can extend transformers' expressivity from $TC^0$ to $PTIME$, the required scratchpad length remains poorly understood. Empirical evidence suggests that transformers need scratchpads even for many problems in $TC^0$, such as Parity or Multiplication, challenging optimistic bounds derived from circuit complexity. In this work, we initiate the study of systematic lower bounds on the number of chain-of-thought steps across different algorithmic problems, in the hard-attention regime. We study a variety of algorithmic problems and provide bounds that are tight up to logarithmic factors. Overall, these results contribute to an emerging understanding of the power and limitations of chain-of-thought reasoning.
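The Parity example mentioned above can be made concrete with a small sketch. This is our own illustration, not the paper's construction: each chain-of-thought step emits one intermediate bit, so the task decomposes into trivial local updates, whereas a model that must answer in a single step has no such running state.

```python
def parity_with_scratchpad(bits):
    """Compute Parity while emitting one intermediate bit per step.

    The returned list plays the role of a chain-of-thought scratchpad:
    step i only needs the previous intermediate bit and input bit i.
    """
    scratchpad = []
    acc = 0
    for b in bits:
        acc ^= b                 # each step folds in a single input bit
        scratchpad.append(acc)   # the "chain of thought" so far
    return scratchpad, acc

steps, answer = parity_with_scratchpad([1, 0, 1, 1])
print(steps)   # intermediate parities: [1, 1, 0, 1]
print(answer)  # final parity: 1
```

The point of the sketch is that with a scratchpad of length $n$, each step is a constant-size local computation; the lower-bound question is how far this step count can be reduced.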
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn
Published 2025 in International Conference on Machine Learning
PUBLICATION RECORD
- Publication year: 2025
- Venue: International Conference on Machine Learning
- Publication date: 2025-02-04
- Fields of study: Computer Science
- Source metadata: Semantic Scholar