Nicholas Birschbach, “Energy Efficient AI on Embedded Systems”
Mentor: Shuaiqi Shen, Electrical Engineering
Poster #221
Big data is shaping the future of computing, and Artificial Intelligence (AI) is increasingly used to process it. In applications such as autonomous transportation and advanced manufacturing, it is impractical to always centralize the vast amount of collected data on cloud servers. Because of privacy concerns, limited network bandwidth, and response delay, there is an urgent demand to process data where it is collected, and AI on embedded systems is a promising solution. However, embedded systems lack the computing power required to run computationally intensive AI algorithms such as deep neural networks, and those that are battery-powered can hardly sustain complex AI learning over the long term. Accurate, low-power AI models are therefore imperative for processing large datasets on resource-constrained devices such as smartphones, medical equipment, and industrial robots. To address these challenges, our project uses Federated Learning (FL) to simplify AI model training on embedded systems while maintaining the desired model accuracy. FL trains AI models on decentralized embedded devices without exchanging the local data samples that the devices hold. Instead, each device sends its updated model parameters to a central server, which aggregates them with the contributions from other devices to collaboratively train a global model. This process repeats iteratively, allowing the model to learn from diverse, distributed datasets for improved accuracy and generalization. In addition, this project investigates how the scarce computing power and memory of embedded devices constrain training, in order to balance the tradeoffs among model complexity, the number of connected devices, and other factors and so achieve efficient and accurate FL.
Finally, a comprehensive FL testbed, consisting of a central server and multiple embedded devices that perform image processing tasks on collected real-world data streams, is developed to validate the proposed AI simplification methods.
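
The server-side aggregation step described above can be illustrated with a minimal sketch. The abstract does not name the aggregation rule used in the project, so this example assumes plain federated averaging, in which each device's parameters are weighted by its number of local samples. The function name `federated_average` and its arguments are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-device parameter vectors into one global vector.

    Assumption: sample-count-weighted averaging (federated averaging).
    Only parameters reach the server; raw local data never leaves a device.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per device update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all devices must send vectors of equal length")
        # Each device contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            global_weights[i] += w * n_samples / total
    return global_weights


if __name__ == "__main__":
    # Two hypothetical devices holding 1 and 3 local samples.
    updates = [[1.0, 2.0], [3.0, 4.0]]
    sizes = [1, 3]
    print(federated_average(updates, sizes))
```

In one training round under this assumption, the server would broadcast the returned vector to the devices, each device would train locally on its own data, and the resulting parameters would be averaged again.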