Controlling the learning process of real-time heuristic search

Masashi Shimbo, Toru Ishida

Research output: Contribution to journal › Article

68 Citations (Scopus)

Abstract

Real-time search provides an attractive framework for intelligent autonomous agents, as it allows us to model an agent's ability to improve its performance through experience. However, the behavior of real-time search agents is far from rational during the learning (convergence) process, in that they fail to balance the efforts to achieve a short-term goal (i.e., to safely arrive at a goal state in the present problem solving trial) and a long-term goal (to find better solutions through repeated trials). As a remedy, we introduce two techniques for controlling the amount of exploration, both overall and per trial. The weighted real-time search reduces the overall amount of exploration and accelerates convergence. It sacrifices admissibility but provides a nontrivial bound on the converged solution cost. The real-time search with upper bounds insures solution quality in each trial when the state space is undirected. These techniques result in a convergence process more stable compared with that of the Learning Real-Time A* algorithm.
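The repeated-trial learning process and the idea of weighting can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an LRTA*-style agent with one-step lookahead, and it assumes that "weighting" means multiplying the initial heuristic values by a factor `weight` of at least 1. The graph `GRAPH`, the initial heuristic `H0`, and all function names are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

State = Hashable
Graph = Dict[State, Dict[State, float]]  # state -> {neighbor: edge cost}


def run_trial(graph: Graph, start: State, goal: State,
              h: Dict[State, float], max_steps: int = 10_000
              ) -> Tuple[List[State], bool]:
    """One LRTA*-style trial with one-step lookahead.

    Updates h in place and returns (path, changed), where changed reports
    whether any heuristic value was raised during the trial.
    """
    path, state, changed = [start], start, False
    for _ in range(max_steps):
        if state == goal:
            return path, changed
        # Pick the neighbor minimizing edge cost + current estimate.
        nxt, f = min(((n, c + h[n]) for n, c in graph[state].items()),
                     key=lambda pair: pair[1])
        # Raise the estimate of the current state if the lookahead says so.
        if f > h[state]:
            h[state] = f
            changed = True
        state = nxt
        path.append(state)
    raise RuntimeError("step limit reached before the goal")


def path_cost(graph: Graph, path: List[State]) -> float:
    """Sum of edge costs along a path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def learn(graph: Graph, start: State, goal: State, h0: Dict[State, float],
          weight: float = 1.0, max_trials: int = 1000
          ) -> Tuple[List[State], int]:
    """Repeat trials until one finishes without changing any estimate.

    ASSUMPTION: weighting is modeled by scaling the initial heuristic.
    Returns the path of the last trial and the number of trials used.
    """
    h = {s: weight * v for s, v in h0.items()}
    for trial in range(1, max_trials + 1):
        path, changed = run_trial(graph, start, goal, h)
        if not changed:
            return path, trial
    raise RuntimeError("no convergence within max_trials")


# Hypothetical undirected state space; the cheapest route A -> G costs 2.
GRAPH: Graph = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "G": 1},
    "C": {"A": 1, "G": 3},
    "G": {"B": 1, "C": 3},
}
H0 = {"A": 1, "B": 1, "C": 0, "G": 0}  # admissible initial estimates

for w in (1.0, 2.0):
    final_path, trials = learn(GRAPH, "A", "G", H0, weight=w)
    print(w, final_path, path_cost(GRAPH, final_path), trials)
```

On this toy graph the first trial detours through `C` before the estimates are corrected, which mirrors the unstable early behavior the abstract describes. The example is too small to show any acceleration from weighting.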

Original language: English
Pages (from-to): 1-41
Number of pages: 41
Journal: Artificial Intelligence
Volume: 146
Issue number: 1
DOI: 10.1016/S0004-3702(03)00012-2
Publication status: Published - 2003 May 1
Externally published: Yes

Keywords

  • Adaptive learning
  • Convergence process
  • Rational agent
  • Real-time heuristic search
  • Resource-boundedness

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence

Cite this

Controlling the learning process of real-time heuristic search. / Shimbo, Masashi; Ishida, Toru.

In: Artificial Intelligence, Vol. 146, No. 1, 01.05.2003, p. 1-41.

Shimbo, Masashi ; Ishida, Toru. / Controlling the learning process of real-time heuristic search. In: Artificial Intelligence. 2003 ; Vol. 146, No. 1. pp. 1-41.
@article{ef38fc2631834acca14d34300224f78b,
title = "Controlling the learning process of real-time heuristic search",
abstract = "Real-time search provides an attractive framework for intelligent autonomous agents, as it allows us to model an agent's ability to improve its performance through experience. However, the behavior of real-time search agents is far from rational during the learning (convergence) process, in that they fail to balance the efforts to achieve a short-term goal (i.e., to safely arrive at a goal state in the present problem solving trial) and a long-term goal (to find better solutions through repeated trials). As a remedy, we introduce two techniques for controlling the amount of exploration, both overall and per trial. The weighted real-time search reduces the overall amount of exploration and accelerates convergence. It sacrifices admissibility but provides a nontrivial bound on the converged solution cost. The real-time search with upper bounds insures solution quality in each trial when the state space is undirected. These techniques result in a convergence process more stable compared with that of the Learning Real-Time A* algorithm.",
keywords = "Adaptive learning, Convergence process, Rational agent, Real-time heuristic search, Resource-boundedness",
author = "Masashi Shimbo and Toru Ishida",
year = "2003",
month = "5",
day = "1",
doi = "10.1016/S0004-3702(03)00012-2",
language = "English",
volume = "146",
pages = "1--41",
journal = "Artificial Intelligence",
issn = "0004-3702",
publisher = "Elsevier",
number = "1",
}

TY - JOUR

T1 - Controlling the learning process of real-time heuristic search

AU - Shimbo, Masashi

AU - Ishida, Toru

PY - 2003/5/1

Y1 - 2003/5/1

N2 - Real-time search provides an attractive framework for intelligent autonomous agents, as it allows us to model an agent's ability to improve its performance through experience. However, the behavior of real-time search agents is far from rational during the learning (convergence) process, in that they fail to balance the efforts to achieve a short-term goal (i.e., to safely arrive at a goal state in the present problem solving trial) and a long-term goal (to find better solutions through repeated trials). As a remedy, we introduce two techniques for controlling the amount of exploration, both overall and per trial. The weighted real-time search reduces the overall amount of exploration and accelerates convergence. It sacrifices admissibility but provides a nontrivial bound on the converged solution cost. The real-time search with upper bounds insures solution quality in each trial when the state space is undirected. These techniques result in a convergence process more stable compared with that of the Learning Real-Time A* algorithm.

KW - Adaptive learning

KW - Convergence process

KW - Rational agent

KW - Real-time heuristic search

KW - Resource-boundedness

UR - http://www.scopus.com/inward/record.url?scp=0037405586&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037405586&partnerID=8YFLogxK

U2 - 10.1016/S0004-3702(03)00012-2

DO - 10.1016/S0004-3702(03)00012-2

M3 - Article

AN - SCOPUS:0037405586

VL - 146

SP - 1

EP - 41

JO - Artificial Intelligence

JF - Artificial Intelligence

SN - 0004-3702

IS - 1

ER -