Improving the learning efficiencies of realtime search

Toru Ishida, Masashi Shimbo

Research output: Contribution to conference (Paper)

15 Citations (Scopus)

Abstract

The capability of learning is one of the salient features of realtime search algorithms such as LRTA*. The major impediment is, however, the instability of the solution quality during convergence: (1) they try to find all optimal solutions even after obtaining fairly good solutions, and (2) they tend to move towards unexplored areas thus failing to balance exploration and exploitation. We propose and analyze two new realtime search algorithms to stabilize the convergence process. ε-search (weighted realtime search) allows suboptimal solutions with ε error to reduce the total amount of learning performed. δ-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, δ-search can better control the tradeoff between exploration and exploitation.
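The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It implements a textbook LRTA*-style trial (one-step lookahead, raise the estimate of the current state, move to the best neighbor) and repeats trials until the estimates stop changing. The `eps` parameter scales the initial heuristic by (1 + eps), which is one common way to realize a weighted realtime search; whether this matches the paper's ε-search exactly is an assumption. The graph, the heuristic table `H0`, and all function names are hypothetical and chosen for illustration. δ-search is not sketched, because the abstract does not say how the upper bounds are used.

```python
from math import inf
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # state -> {neighbor: edge cost}


def lrta_star_trial(graph: Graph, h: Dict[str, float], start: str,
                    goal: str, max_steps: int = 10_000) -> List[str]:
    """Run one LRTA*-style trial from start to goal, updating h in place."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == goal:
            return path
        # One-step lookahead: f(y) = edge cost + current estimate of y.
        best_next, best_f = None, inf
        for nxt, cost in graph[state].items():
            f = cost + h[nxt]
            if f < best_f:
                best_next, best_f = nxt, f
        if best_next is None:
            raise ValueError(f"state {state!r} has no successors")
        # Learning step: never lower the estimate of the current state.
        h[state] = max(h[state], best_f)
        state = best_next
        path.append(state)
    raise RuntimeError("step limit reached before the goal")


def path_cost(graph: Graph, path: List[str]) -> float:
    """Sum the edge costs along a path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def solve(graph: Graph, h0: Dict[str, float], start: str, goal: str,
          eps: float = 0.0, max_trials: int = 1000) -> Tuple[List[str], float, int]:
    """Repeat trials until one trial leaves every estimate unchanged."""
    # Assumption: weighting is applied by inflating the initial heuristic.
    h = {s: (1.0 + eps) * v for s, v in h0.items()}
    for trial in range(1, max_trials + 1):
        before = dict(h)
        path = lrta_star_trial(graph, h, start, goal)
        if h == before:
            return path, path_cost(graph, path), trial
    raise RuntimeError("estimates did not converge")


# Hypothetical four-state graph; the optimal route S -> A -> G costs 2.
GRAPH: Graph = {
    "S": {"A": 1.0, "B": 1.0},
    "A": {"S": 1.0, "G": 1.0},
    "B": {"S": 1.0, "G": 3.0},
    "G": {},
}
H0 = {"S": 1.0, "A": 1.0, "B": 1.0, "G": 0.0}  # admissible estimates

if __name__ == "__main__":
    for eps in (0.0, 1.0):
        print(eps, solve(GRAPH, H0, "S", "G", eps=eps))
```

With `eps=0.0` the sketch behaves like plain LRTA* and keeps learning until the estimates are stable. A larger `eps` starts from inflated estimates and accepts a solution that may be suboptimal in exchange for less learning.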

Original language: English
Pages: 305-310
Number of pages: 6
Publication status: Published - 1996 Dec 1
Externally published: Yes
Event: Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2) - Portland, OR, USA
Duration: 1996 Aug 4 to 1996 Aug 8

Conference

Conference: Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2)
City: Portland, OR, USA
Period: 96/8/4 to 96/8/8

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Ishida, T., & Shimbo, M. (1996). Improving the learning efficiencies of realtime search. 305-310. Paper presented at Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2), Portland, OR, USA.

Improving the learning efficiencies of realtime search. / Ishida, Toru; Shimbo, Masashi.

1996. 305-310. Paper presented at Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2), Portland, OR, USA.

Ishida, T & Shimbo, M 1996, 'Improving the learning efficiencies of realtime search', Paper presented at Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2), Portland, OR, USA, 96/8/4 - 96/8/8, pp. 305-310.
Ishida T, Shimbo M. Improving the learning efficiencies of realtime search. 1996. Paper presented at Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2), Portland, OR, USA.
Ishida, Toru ; Shimbo, Masashi. / Improving the learning efficiencies of realtime search. Paper presented at Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2), Portland, OR, USA. 6 p.
@conference{789b74cf58a245bb98495ddb8e81be67,
title = "Improving the learning efficiencies of realtime search",
abstract = "The capability of learning is one of the salient features of realtime search algorithms such as LRTA*. The major impediment is, however, the instability of the solution quality during convergence: (1) they try to find all optimal solutions even after obtaining fairly good solutions, and (2) they tend to move towards unexplored areas thus failing to balance exploration and exploitation. We propose and analyze two new realtime search algorithms to stabilize the convergence process. ε-search (weighted realtime search) allows suboptimal solutions with ε error to reduce the total amount of learning performed. δ-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, δ-search can better control the tradeoff between exploration and exploitation.",
author = "Toru Ishida and Masashi Shimbo",
year = "1996",
month = "12",
day = "1",
language = "English",
pages = "305--310",
note = "Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2) ; Conference date: 04-08-1996 Through 08-08-1996",

}

TY - CONF

T1 - Improving the learning efficiencies of realtime search

AU - Ishida, Toru

AU - Shimbo, Masashi

PY - 1996/12/1

Y1 - 1996/12/1

N2 - The capability of learning is one of the salient features of realtime search algorithms such as LRTA*. The major impediment is, however, the instability of the solution quality during convergence: (1) they try to find all optimal solutions even after obtaining fairly good solutions, and (2) they tend to move towards unexplored areas thus failing to balance exploration and exploitation. We propose and analyze two new realtime search algorithms to stabilize the convergence process. ε-search (weighted realtime search) allows suboptimal solutions with ε error to reduce the total amount of learning performed. δ-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, δ-search can better control the tradeoff between exploration and exploitation.

UR - http://www.scopus.com/inward/record.url?scp=0030362555&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0030362555&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:0030362555

SP - 305

EP - 310

ER -