
Project

Resource-Constrained Machine Learning across the Computing Continuum

Networked sensing systems and ML/AI applications are unlocking a world of possibilities, fueled by vast data streams ranging from industrial monitoring to e-Health applications. While ML applications were traditionally deployed solely on cloud-based infrastructure, accuracy-centric cloud models are struggling to keep pace with the ubiquity and real-time demands of today's applications. Latency-critical applications like NLP chatbots, facial recognition, and voice assistants are pushing the boundaries, demanding timely responses alongside accuracy. Centralized learning also raises privacy concerns, as captured data might contain unintended personal information, and users lose control over it after collection. From the perspective of service providers, the costs of ML/AI inference to serve user queries are rising and are expected to constitute 90% of total cloud infrastructure costs. Thus, the advantages of distributing training and inference of ML/AI applications across mobile, edge and IoT devices closer to the data source are clear from both the consumer and provider perspectives.

The push towards edge, mobile and IoT devices comes with its own set of challenges, namely: (i) the devices are constrained in memory and processing capacity compared to cloud infrastructure, (ii) IoT and mobile devices have a limited energy budget, relying on small batteries or energy harvesting, and can therefore perform only limited tasks, and (iii) the connection to cloud infrastructure is often slow, not always reliable, and rarely provided over high-bandwidth networks. This broad set of challenges manifests itself in different ways across the ML/AI application deployment paradigms, namely (i) distributed and on-device model training, (ii) on-device inference and (iii) distribution of intelligence across the computing continuum. This dissertation highlights the challenges in these paradigms and addresses them with three main contributions.

The first contribution is an adaptive framework for context-aware inference in mobile/cloud architectures. It deploys modular software components adaptively to match application goals and context (network, memory), optimizing model placement between mobile and cloud. Evaluated with a real-world food image recognition application provided by our industrial partner, our proposed solution adapts at runtime to changing goals, reducing server load and boosting scalability compared to static deployments.
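To make the placement logic concrete, the following is a minimal sketch of how such a context-aware decision could look; the names (Context, choose_placement) and all thresholds are illustrative assumptions, not the framework's actual API.

```python
from dataclasses import dataclass

@dataclass
class Context:
    uplink_mbps: float      # measured uplink bandwidth to the cloud
    free_memory_mb: float   # free memory on the mobile device
    latency_goal_ms: float  # application response-time goal

ON_DEVICE_MODEL_MB = 40     # assumed footprint of the compressed on-device model
IMAGE_KB = 200              # assumed size of one uploaded food image
CLOUD_INFERENCE_MS = 50     # assumed server-side inference time
DEVICE_INFERENCE_MS = 300   # assumed on-device inference time

def choose_placement(ctx: Context) -> str:
    """Decide where the recognition model should run for the next request."""
    upload_ms = (IMAGE_KB * 8) / (ctx.uplink_mbps * 1000) * 1000
    cloud_response_ms = upload_ms + CLOUD_INFERENCE_MS
    if ctx.free_memory_mb < ON_DEVICE_MODEL_MB:
        return "cloud"  # the model does not fit on the device
    # Offload only when it is expected to be faster and still meets the goal.
    if cloud_response_ms <= min(DEVICE_INFERENCE_MS, ctx.latency_goal_ms):
        return "cloud"
    return "device"

# Example: a slow uplink pushes inference onto the device.
print(choose_placement(Context(uplink_mbps=0.5, free_memory_mb=512, latency_goal_ms=500)))
```

A controller running such a check periodically can download or unload the model component as the observed context and application goals change.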

The second contribution is FaultBit, a versatile toolkit for fault detection in industrial machinery based on signals collected from IoT devices. FaultBit leverages machine learning and deep system reconfiguration of the IoT devices and sensors to achieve efficient data collection, compression, and classification over bandwidth-limited networks. Evaluation across use cases demonstrates high accuracy (comparable to that achieved with raw data), multi-year battery life, and a substantial reduction in network bandwidth requirements.
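The core trade-off exploited here, sending compact features instead of raw waveforms, can be illustrated with a short sketch; the band-energy feature, window size and byte counts below are illustrative assumptions rather than FaultBit's actual pipeline.

```python
import numpy as np

FS = 4000          # assumed sampling rate of the vibration sensor (Hz)
N_BANDS = 16       # number of spectral bands kept per window

def compress_window(samples: np.ndarray, n_bands: int = N_BANDS) -> np.ndarray:
    """Reduce one window of raw samples to a small band-energy feature vector."""
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    bands = np.array_split(spectrum, n_bands)
    # Log-scaled energy per band: a few floats instead of thousands of samples.
    return np.log1p(np.array([band.sum() for band in bands])).astype(np.float32)

# One second of raw data (4000 float32 samples, 16 kB) becomes 16 floats (64 bytes).
window = np.random.randn(FS).astype(np.float32)
features = compress_window(window)
print(features.nbytes, "bytes sent instead of", window.nbytes)

# At the edge or in the cloud, a lightweight classifier (e.g. logistic regression
# or a small tree ensemble) maps the feature vector to healthy / fault classes.
```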

The final contribution of this thesis presents a standalone on-device learning approach that tackles the limited energy budget of IoT devices by optimizing energy harvesting. This is accomplished by predicting the solar irradiance value of each time slot with a weighting factor that is learned specifically for that slot. This approach improves irradiance prediction accuracy by up to 13% per time slot and by 4.5% on average per day. The low power overhead of on-device training is more than offset by these energy gains.
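A minimal sketch of what a per-slot learned weighting factor might look like is given below; the blending rule, the gradient update and the normalisation are assumptions made for illustration, not the exact scheme used in the thesis.

```python
N_SLOTS = 48            # assumed number of time slots per day (30-minute slots)
LEARNING_RATE = 0.1

# One weighting factor per slot, learned on-device; irradiance values are
# assumed normalised to [0, 1] relative to the panel's maximum.
alpha = [0.5] * N_SLOTS
slot_mean = [0.0] * N_SLOTS   # long-term mean irradiance per slot

def predict(slot: int, prev_day_value: float) -> float:
    """Blend yesterday's value for this slot with the long-term per-slot mean."""
    return alpha[slot] * prev_day_value + (1.0 - alpha[slot]) * slot_mean[slot]

def update(slot: int, prev_day_value: float, observed: float) -> None:
    """Once the slot's true irradiance is known, adjust that slot's weight."""
    error = predict(slot, prev_day_value) - observed
    # Gradient step on the squared error with respect to alpha[slot].
    grad = error * (prev_day_value - slot_mean[slot])
    alpha[slot] = min(1.0, max(0.0, alpha[slot] - LEARNING_RATE * grad))
    # Track the long-term per-slot mean with a slow exponential moving average.
    slot_mean[slot] = 0.95 * slot_mean[slot] + 0.05 * observed
```

In this sketch each slot keeps its own weight, so slots whose irradiance is well predicted by the previous day can learn a high weight, while noisier slots fall back on the long-term mean.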

The contributions of this thesis facilitate the deployment of ML/AI applications across the computing continuum and optimize these deployments, paving the way for multi-year lifetimes of IoT devices, energy neutrality for energy-harvesting applications, and adaptive reconfiguration in the dynamic scenarios encountered when deploying applications on heterogeneous devices across the computing continuum.

Date: 15 Feb 2018 → 22 Apr 2024
Keywords: Internet of Things, e-health, data processing, scalability, adaptability
Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences
Project type: PhD project