Improving the learning efficiencies of realtime search

Toru Ishida*, Masashi Shimbo

*Corresponding author for this work

Research output: Paper (peer-reviewed)

16 Citations (Scopus)

Abstract

The capability of learning is one of the salient features of realtime search algorithms such as LRTA*. The major impediment is, however, the instability of the solution quality during convergence: (1) they try to find all optimal solutions even after obtaining fairly good solutions, and (2) they tend to move towards unexplored areas thus failing to balance exploration and exploitation. We propose and analyze two new realtime search algorithms to stabilize the convergence process. ε-search (weighted realtime search) allows suboptimal solutions with ε error to reduce the total amount of learning performed. δ-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, δ-search can better control the tradeoff between exploration and exploitation.
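The abstract does not spell out the algorithms, so the following is only a minimal Python sketch of a textbook-style LRTA* trial loop, not the authors' code. It shows how repeated trials refine the cost estimates `h` until the solution stabilizes. The `weight` parameter, which multiplies the initial heuristic by a factor playing the role of (1 + ε), is an assumption about how a weighted variant might be set up; the upper bounds used by δ-search are not modeled. The toy graph, the function names, and the `max`-based update rule are all illustrative choices.

```python
from typing import Callable, Dict, Hashable, List, Tuple

State = Hashable
# Adjacency list: state -> list of (successor, edge cost). Hypothetical toy format.
Graph = Dict[State, List[Tuple[State, float]]]

# Hypothetical example: the cheap first edge A-B leads towards an expensive B-D edge.
# The optimal path is A-C-D with cost 4.
EXAMPLE_GRAPH: Graph = {
    "A": [("B", 1.0), ("C", 2.0)],
    "B": [("A", 1.0), ("D", 10.0)],
    "C": [("A", 2.0), ("D", 2.0)],
    "D": [("B", 10.0), ("C", 2.0)],
}


def lrta_star_trial(
    graph: Graph,
    start: State,
    goal: State,
    h: Dict[State, float],
    h0: Callable[[State], float],
    weight: float = 1.0,
    max_steps: int = 10_000,
) -> float:
    """Run one trial from start to goal, updating h in place; return the path cost."""

    def estimate(s: State) -> float:
        # Lazily initialise unseen states from the (optionally weighted) heuristic.
        # Assumption: weight = 1 + epsilon gives a weighted variant.
        if s not in h:
            h[s] = weight * h0(s)
        return h[s]

    state, cost = start, 0.0
    for _ in range(max_steps):
        if state == goal:
            return cost
        # Look one step ahead and pick the successor minimising edge cost + estimate.
        successor, step_cost = min(
            graph[state], key=lambda edge: edge[1] + estimate(edge[0])
        )
        # Learning step: raise the estimate of the current state, never lower it.
        h[state] = max(estimate(state), step_cost + estimate(successor))
        cost += step_cost
        state = successor
    raise RuntimeError("goal not reached within max_steps")


def repeated_trials(
    graph: Graph,
    start: State,
    goal: State,
    h0: Callable[[State], float],
    weight: float = 1.0,
    trials: int = 5,
) -> Tuple[List[float], Dict[State, float]]:
    """Run several trials that share one table of learned estimates."""
    h: Dict[State, float] = {}
    costs = [
        lrta_star_trial(graph, start, goal, h, h0, weight) for _ in range(trials)
    ]
    return costs, h
```

For instance, `repeated_trials(EXAMPLE_GRAPH, "A", "D", lambda s: 0.0)` yields per-trial costs of 6, 6, 4, 4, 4 on this toy graph: the first two trials wander through B before the learned estimates steer the search onto the optimal path.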

Original language: English
Pages: 305-310
Number of pages: 6
Publication status: Published - Dec 1, 1996
Externally published: Yes
Event: Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2) - Portland, OR, USA
Duration: Aug 4, 1996 → Aug 8, 1996

Conference

Conference: Proceedings of the 1996 13th National Conference on Artificial Intelligence, AAAI 96. Part 1 (of 2)
City: Portland, OR, USA
Period: 96/8/4 → 96/8/8

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
