Publication

Parallel reinforcement learning with minimal communication overhead for IoT environments

Journal Contribution - Journal Article

Many Internet of Things (IoT) applications require a distributed architecture for decision making, because no centralized system is available, because connectivity to such a system is failure-prone, or because the latency imposed by contacting it is too high for real-time applications. Often, these IoT applications fall in the domain of Reinforcement Learning (RL), e.g., autonomous robot navigation in Smart Factories and Traffic Signal Control in Smart Cities. However, RL-based applications require a long learning time. To overcome this limitation and scale with the number of agents, Parallel Reinforcement Learning (PRL) algorithms run multiple RL agents in parallel on distributed environments. However, deploying PRL algorithms in such environments entails a communication overhead that increases the actual execution time. State-of-the-art PRL algorithms are designed to reduce the learning time while assuming no (or limited) communication overhead. In this work, we present a novel partitioning algorithm that minimizes the communication overhead of PRL running in IoT environments. To the best of our knowledge, this is the first work that addresses the communication overhead of distributing PRL algorithms without requiring any a priori knowledge about the structure of the problem. The proposed algorithm combines a dynamic state partitioning strategy, which exploits the agent’s exploration capabilities to build partition knowledge while learning, with an efficient mapping of agents to partitions, which reduces the communication among agents. Performance evaluations show that the proposed algorithm can achieve almost no communication among PRL agents at the converged state.
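
The abstract's point that the mapping of states to agents drives communication can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a hypothetical 1-D chain of states, a random walk standing in for an exploring agent, and a simple cost model in which every transition between states owned by different agents costs one message. All names (`count_cross_partition`, `random_walk`, `interleaved`, `contiguous`) are invented for illustration.

```python
import random


def count_cross_partition(trajectory, owner):
    """Count transitions whose source and destination states have different owners.

    Assumed cost model (not from the paper): each such transition needs one message.
    """
    return sum(1 for s, t in zip(trajectory, trajectory[1:]) if owner[s] != owner[t])


def random_walk(n_states, steps, seed=0):
    """Seeded random walk on a chain of states 0..n_states-1, clipped at both ends."""
    rng = random.Random(seed)
    s = n_states // 2
    path = [s]
    for _ in range(steps):
        s = min(n_states - 1, max(0, s + rng.choice((-1, 1))))
        path.append(s)
    return path


n_states, n_agents = 20, 4
path = random_walk(n_states, 1000)

# Structure-blind mapping: neighbouring states always belong to different agents.
interleaved = {s: s % n_agents for s in range(n_states)}
# Locality-aware mapping: each agent owns one contiguous block of five states.
contiguous = {s: s * n_agents // n_states for s in range(n_states)}

cross_interleaved = count_cross_partition(path, interleaved)
cross_contiguous = count_cross_partition(path, contiguous)
print("interleaved:", cross_interleaved, "contiguous:", cross_contiguous)
```

In this toy setting the contiguous mapping only pays for a message when the walk crosses one of the three block boundaries, whereas the interleaved mapping pays on every actual move. The sketch fixes both mappings in advance; the paper instead builds its partition knowledge during learning.
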
Journal: IEEE Internet of Things Journal
ISSN: 2327-4662
Volume: 7
Pages: 1387 - 1400
Publication year: 2020
Keywords: A1 Journal article
BOF-keylabel: yes
BOF-publication weight: 10
CSS-citation score: 1
Authors from: Higher Education
Accessibility: Closed