Learning and relearning of target decision strategies in continuous coordinated cleaning tasks with shallow coordination

Keisuke Yoneda, Ayumi Sugiyama, Chihiro Kato, Toshiharu Sugawara

    Research output: Contribution to journal › Article

    4 Citations (Scopus)

    Abstract

    We propose a method for the autonomous learning of target decision strategies for coordination in the continuous cleaning domain. With ongoing advances in computer and sensor technologies, we can expect robot applications that cover large areas, which often require coordinated or cooperative activities by multiple robots. In this paper, we focus on cleaning tasks performed by multiple robots, or by agents, i.e., the programs that control the robots. We assume situations in which agents do not directly exchange deep, complicated internal information or reasoning results, such as plans, strategies, and long-term targets, for sophisticated coordination. Instead, they exchange only superficial information, such as the locations of other agents (obtained through deployed equipment), for shallow coordination, and individually learn appropriate strategies by observing how much dirt and dust they have vacuumed up in a multi-agent environment. We first discuss a preliminary method that improves coordinated activity by autonomously learning which cleaning strategy to select when deciding the targets to move to and clean. Although this improved cleaning efficiency, we observed that performance degraded when agents continued to learn strategies, because too many agents converged on the same strategy (over-selection). In addition, the preliminary method assumed that information about which regions of the environment easily became dirty was given in advance. We therefore propose an extended method that augments the preliminary method with (1) environmental learning to identify which places are likely to become dirty and (2) autonomous relearning, based on self-monitoring of the amount of vacuumed dirt, to prevent strategies from being over-selected. We experimentally evaluated the proposed method by comparing its performance with that of agent regimes using a single strategy and with that of the preliminary method. The experimental results revealed that the proposed method enabled agents to select target decision strategies and, when necessary, to abandon their current strategies from their own perspectives, resulting in appropriate combinations of multiple strategies. We also found that dirt accumulation in the environment was learned effectively.
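    The two learning components described in the abstract — environmental learning of dirt accumulation and strategy relearning triggered by self-monitoring of vacuumed dirt — can be sketched as follows. This is an illustrative approximation only: the class, update rules (exponential moving averages, epsilon-greedy selection), and the `drop_threshold` relearning trigger are assumptions for exposition, not the authors' actual algorithm.

    ```python
    import random

    class CleaningAgent:
        """Hypothetical sketch of an agent that (1) learns per-cell dirt
        accumulation from observation and (2) selects among target decision
        strategies, resetting its learned preferences when its vacuumed-dirt
        performance drops sharply (e.g. due to over-selection)."""

        def __init__(self, strategies, grid_cells, alpha=0.1, epsilon=0.1,
                     drop_threshold=0.7):
            self.strategies = strategies                    # candidate target decision strategies
            self.value = {s: 0.0 for s in strategies}       # learned value per strategy
            self.dirt_rate = {c: 0.0 for c in grid_cells}   # learned dirt-accumulation estimates
            self.alpha = alpha                              # learning rate
            self.epsilon = epsilon                          # exploration probability
            self.drop_threshold = drop_threshold            # relearn when performance falls below this fraction
            self.best_observed = 0.0

        def observe_dirt(self, cell, amount):
            """Environmental learning: moving-average estimate of dirt found per visit."""
            self.dirt_rate[cell] += self.alpha * (amount - self.dirt_rate[cell])

        def select_strategy(self):
            """Epsilon-greedy choice among target decision strategies."""
            if random.random() < self.epsilon:
                return random.choice(self.strategies)
            return max(self.strategies, key=lambda s: self.value[s])

        def update(self, strategy, vacuumed):
            """Learn strategy value from vacuumed dirt; if performance degrades
            well below the best seen, abandon current preferences and relearn."""
            self.value[strategy] += self.alpha * (vacuumed - self.value[strategy])
            self.best_observed = max(self.best_observed, vacuumed)
            if self.best_observed > 0 and vacuumed < self.drop_threshold * self.best_observed:
                self.value = {s: 0.0 for s in self.strategies}  # trigger relearning
                self.best_observed = vacuumed
    ```

    In this sketch, a sustained drop in vacuumed dirt (which the paper attributes to many agents over-selecting the same strategy) resets the agent's strategy values, so the population can rediscover an appropriate mix of strategies.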

    Original language: English
    Pages (from-to): 279-294
    Number of pages: 16
    Journal: Web Intelligence
    Volume: 13
    Issue number: 4
    DOI: 10.3233/WEB-150326
    ISSN: 2405-6456
    Publisher: IOS Press
    Publication status: Published - 2015 Nov 18


    Keywords

    • coordination
    • learning
    • Multi-robot sweeping
    • robot patrolling

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Computer Networks and Communications
    • Software

    Cite this

    Learning and relearning of target decision strategies in continuous coordinated cleaning tasks with shallow coordination. / Yoneda, Keisuke; Sugiyama, Ayumi; Kato, Chihiro; Sugawara, Toshiharu.

    In: Web Intelligence, Vol. 13, No. 4, 18.11.2015, p. 279-294.

