The Challenges of Building AI into Edge AI Apps
Mobile and embedded applications are advancing continuously, and Artificial Intelligence (AI) is poised to touch our lives in a significant way. Driven by research on the energy consumption of Deep Learning (DL) and by growing customer demand, numerous applications are emerging, such as smartphone apps, smart homes, and self-driving vehicles. These demands also motivate products to process AI enhancements on the Edge device itself rather than rely solely on Cloud-connected support.
Developers of Edge AI apps face serious challenges. First, they need to ensure algorithm readiness while meeting real-time constraints and protecting data privacy. Second, device makers have to guarantee framework robustness, interconnection, and system security while keeping power consumption and thermal behavior under control. To shorten time-to-market and enhance the user experience, we break AI application development down into three levels, in the hope of shedding some light on these challenges.
Algorithm Level: The accuracy of the AI algorithm or model is the most important metric for an app. At the same time, developers must fully understand the application scenario and the design constraints of the target device. Take an autonomous vehicle as an example: the latency tolerance and the frame-rate requirement are tight, because the app's functions depend on them. If the computation cannot be parallelized and accelerated well enough to meet the real-time constraint, a crash or accident may happen. In addition to timing, memory utilization, power consumption, and thermal behavior are tightly bound to the complexity of the AI model. These factors therefore need to be evaluated early in development.
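The frame-rate and latency constraints described above can be checked with simple arithmetic early in development. The following is a minimal sketch, assuming frames are processed one at a time (no pipelining); the function names and the numbers (a 30 fps camera, a 50 ms latency tolerance) are hypothetical and chosen only for illustration.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def meets_realtime(inference_ms: float, target_fps: float,
                   max_latency_ms: float) -> bool:
    """True if one inference fits both the frame period and the latency tolerance."""
    return inference_ms <= min(frame_budget_ms(target_fps), max_latency_ms)


# Hypothetical numbers: a 30 fps camera and a 50 ms latency tolerance.
print(round(frame_budget_ms(30.0), 2))   # per-frame budget in ms
print(meets_realtime(25.0, 30.0, 50.0))  # a 25 ms model fits the budget
print(meets_realtime(40.0, 30.0, 50.0))  # a 40 ms model misses the frame period
```

If inference is pipelined across processors, throughput and latency have to be checked separately, so this single comparison would no longer be sufficient.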
After evaluating the complexity and accuracy of the AI model, the developer may decide either to switch to a smaller model or to reduce the model's size while still meeting the accuracy criterion. A more compact model also speeds up future application downloads and over-the-air updates. Although network reduction and quantization techniques can squeeze models further, they prolong the development cycle. Well-integrated tools may help the developer reduce this overhead.
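To make the quantization idea concrete, here is a minimal sketch of symmetric per-tensor 8-bit quantization in plain Python. It is one common scheme, not necessarily the one a given toolchain uses, and the weight values are hypothetical. Storing each weight in one byte instead of four shrinks the weights by a factor of four, at the cost of a rounding error of at most half the scale per weight.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # hypothetical float32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case error introduced by rounding to the nearest integer code.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

fp32_bytes = 4 * len(weights)  # 4 bytes per float32 weight
int8_bytes = 1 * len(weights)  # 1 byte per int8 weight

print(q)
print(f"scale={scale:.6f} max_err={max_err:.6f}")
print(f"size: {fp32_bytes} bytes -> {int8_bytes} bytes")
```

Whether the resulting accuracy loss is acceptable has to be measured on the real model and data, which is part of the extra development time mentioned above.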
Platform Level: To realize the algorithm, the AI model is integrated with the Edge AI platform framework. The maturity of the software-hardware co-design largely determines how efficiently the model performs on the platform. The key factors are the heterogeneous processors (CPU, GPU, and APU) and the memory hierarchy. [An APU is a dedicated AI processor designed to accelerate matrix processing and general neural network operations.]
With the AI model as input, the platform designer optimizes task scheduling and task pipelining to achieve a high degree of parallelism. The processors compute over different data partitions (tiles placed in different memory banks). With proper data reuse and scheduling, the platform can lower power consumption and latency by reducing and hiding redundant reads and writes to SRAM and DRAM. This can be achieved with hardware-oriented platform design.
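The effect of tiling and data reuse on memory traffic can be illustrated with a toy model. The sketch below multiplies two square matrices twice, once element by element and once tile by tile, and counts how many values each version fetches from (simulated) off-chip memory. The cost model is an assumption made for illustration: the naive version fetches every operand from DRAM, while the tiled version fetches each tile once into on-chip SRAM and reuses it. Real platforms also use caches, multiple banks, and parallel processors, none of which are modeled here.

```python
def matmul_naive(a, b):
    """Multiply n x n matrices, counting one DRAM load per operand use."""
    n = len(a)
    loads = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc += a[i][k] * b[k][j]
                loads += 2  # one element of A and one of B fetched from DRAM
            c[i][j] = acc
    return c, loads


def matmul_tiled(a, b, tile):
    """Same product, but operands are loaded tile by tile and reused on chip."""
    n = len(a)
    if n % tile != 0:
        raise ValueError("matrix size must be a multiple of the tile size")
    loads = 0
    c = [[0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # Copy one tile of A and one tile of B into simulated SRAM.
                a_tile = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                loads += 2 * tile * tile
                # Every value in the tiles is reused `tile` times from SRAM.
                for i in range(tile):
                    for j in range(tile):
                        acc = 0
                        for k in range(tile):
                            acc += a_tile[i][k] * b_tile[k][j]
                        c[i0 + i][j0 + j] += acc
    return c, loads


n = 8
a = [[i + j for j in range(n)] for i in range(n)]
b = [[(i * j) % 5 for j in range(n)] for i in range(n)]

c_naive, loads_naive = matmul_naive(a, b)
c_tiled, loads_tiled = matmul_tiled(a, b, tile=4)

print(c_naive == c_tiled)        # both versions compute the same product
print(loads_naive, loads_tiled)  # DRAM loads under this cost model
```

Under this cost model the DRAM traffic falls by a factor equal to the tile size, which is the kind of saving that data reuse is meant to deliver in both power and latency.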
System Level: To build an end-to-end app, the developer needs to examine the overall data path. Taking the autonomous vehicle as an example again, the integration and placement of sensors, displays, the in-car network (e.g., the Controller Area Network (CAN) bus), storage, and power supply are crucial. In addition, the design must be reviewed against safety and security standards (covering passengers and other road users), and for robustness to external attacks, system-level failures (e.g., random failures from overheating), and other malfunctions of the app. Moreover, the end-to-end latency, including transmission and signal-processing latency, should be fed back into the algorithm design. Last but not least, the user interface and user experience are optimization topics in their own right.
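Feeding the end-to-end latency back into the algorithm design amounts to working out how much time is left for the AI model once the other stages of the data path are accounted for. The sketch below shows that subtraction; the stage names and every number in it are hypothetical placeholders, not measurements from any real vehicle.

```python
def inference_budget_ms(end_to_end_ms, fixed_stages_ms):
    """Time left for the AI model after the fixed pipeline stages are subtracted."""
    remaining = end_to_end_ms - sum(fixed_stages_ms.values())
    if remaining <= 0:
        raise ValueError("fixed stages already exceed the end-to-end budget")
    return remaining


# Hypothetical per-stage latencies in milliseconds.
stages = {
    "sensor_capture": 10.0,
    "can_bus_transfer": 5.0,
    "signal_processing": 15.0,
    "actuation": 10.0,
}

# Hypothetical end-to-end requirement of 100 ms.
budget = inference_budget_ms(100.0, stages)
print(f"time left for model inference: {budget} ms")
```

The resulting figure becomes the latency target that the algorithm-level evaluation has to meet.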