On learning in agent-centered search

Nathan R. Sturtevant, V. Bulitko, Y. Björnsson

Published 2010 in Adaptive Agents and Multi-Agent Systems

ABSTRACT

Since the introduction of the LRTA* algorithm, real-time heuristic search algorithms have generally followed the same plan-act-learn cycle: an agent plans one or several actions based on locally available information, executes them, and then updates (i.e., learns) its heuristic function. Algorithm evaluation has been almost exclusively empirical, with results that are often domain-specific and incomparable across papers. Even when unification and cross-algorithm comparisons have been carried out within a single paper, there was no understanding of how efficient the learning process was relative to a theoretical optimum. This paper addresses the problem with two primary contributions. First, we formally define a lower bound on the amount of learning any heuristic-learning algorithm needs to do. This bound is based on the notion of heuristic depressions and gives us a domain-independent measure of learning efficiency across different algorithms. Second, using this measure, we propose to learn "costs-so-far" (g-costs) instead of "costs-to-go" (h-costs). This allows us to quickly identify redundant paths and dead-end states, leading to an asymptotic performance improvement as well as 1--2 orders of magnitude convergence speed-ups in practice.
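The plan-act-learn cycle described above can be sketched in a few lines. The following is a minimal illustration of one LRTA*-style step (plan over the local neighborhood, raise the heuristic, move), not the paper's g-cost method; the graph, edge costs, and zero initial heuristic are illustrative assumptions.

```python
def lrta_star_step(state, neighbors, cost, h):
    """One plan-act-learn iteration: pick the neighbor minimizing
    c(s, s') + h(s'), raise h(state) to that value (learn), and move."""
    best = min(neighbors[state], key=lambda s2: cost[(state, s2)] + h[s2])
    h[state] = max(h[state], cost[(state, best)] + h[best])  # learn: h never decreases
    return best                                              # act: move to best neighbor

# Tiny hypothetical example: a path graph a - b - c (goal), unit edge costs.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": []}
cost = {("a", "b"): 1, ("b", "a"): 1, ("b", "c"): 1}
h = {"a": 0, "b": 0, "c": 0}  # admissible zero heuristic

s = "a"
trajectory = [s]
while s != "c":
    s = lrta_star_step(s, neighbors, cost, h)
    trajectory.append(s)

print(trajectory)  # ['a', 'b', 'c']
print(h)           # learned heuristic: {'a': 1, 'b': 1, 'c': 0}
```

The total increase in `h` over a run is the kind of "amount of learning" the paper's lower bound measures; learning g-costs instead changes what is stored at each state, not the cycle itself.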
