Proceedings Article | 4 October 2023
KEYWORDS: Performance modeling, Internet of things, Machine learning, Education and training, Power consumption, Computer hardware, Design and modelling, Computer simulations
Model cards organize and present information about a model's training data, hyperparameters, and behavior, such as predictive accuracy and bias, for machine learning (ML) algorithms or models. Consumers use these cards to determine whether a model is well suited to a particular use case. However, the current design of model cards does not include resource-utilization metrics, which are important when models are being optimized for specific hardware. The main objective of this study was to determine which set of hardware-centric performance metrics provides the most value to users when comparing algorithms for Internet of Things (IoT) edge devices. The Department of Defense (DoD) can use these metrics to select the best model for key mission tasks within Command and Control (C2) missions that require AI/ML deployment on edge devices. Our study focused on finding correlations between resource availability and machine learning models' computational footprint using three key metrics: energy consumption, execution time, and memory utilization. The data were gathered from simulated environments hosted in Docker containers (i.e., lightweight, stand-alone, executable software packages that include everything needed to run an application) with various memory sizes. Two series of experiments were performed. The first series simulated environments with 256 MB to 8 GB of available memory (in steps of powers of 2). For each memory size, we deployed 50 different containers, calculated the metrics from each container, and recorded the average result of each metric. We used two ML models from Microsoft's EdgeML library that are designed to run efficiently on edge devices: Bonsai, a decision-tree-based model, and ProtoNN, a k-nearest-neighbors (kNN)-based model.
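The memory sweep and per-size averaging described above can be sketched as follows. This is a minimal illustration, not the study's actual harness: the `collect_metric` callable is a hypothetical stub standing in for launching a memory-limited Docker container and reading one metric (energy, execution time, or memory) from it.

```python
def series_one_mb():
    """First experiment series: 256 MB to 8 GB (8192 MB), in powers of 2."""
    return [256 * 2**i for i in range(6)]

def series_two_mb():
    """Second experiment series: 250 MB to 950 MB, in steps of 50 MB."""
    return list(range(250, 951, 50))

def average_metric(memory_mb, collect_metric, runs=50):
    """Average one metric over `runs` container deployments at a given
    memory limit. `collect_metric(memory_mb)` is a hypothetical stand-in
    for running a container at that limit and measuring it."""
    samples = [collect_metric(memory_mb) for _ in range(runs)]
    return sum(samples) / len(samples)
```

For example, `average_metric(256, measure_energy)` would average a (hypothetical) energy reading over 50 container runs at a 256 MB limit, matching the per-size averaging used in both series.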
Using Spearman's correlation, we found that none of the metrics showed promising correlations for the Bonsai model, while the ProtoNN model demonstrated a correlation of 0.5 for its computation time. To investigate the lack of adequate results, we ran a second series of experiments in a similar fashion but with smaller memory sizes, from 250 MB to 950 MB (in steps of 50 MB). Using Spearman's correlation for the Bonsai model with smaller memory sizes, we found that RAM energy consumption showed the most promising correlation, with a value of 0.75. For the ProtoNN model with smaller memory sizes, both RAM energy consumption and total energy consumption showed promising correlation values of 0.63 and 0.71, respectively.
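Spearman's rank correlation, used for all of the comparisons above, is available off the shelf as `scipy.stats.spearmanr`; a minimal, tie-aware pure-Python version (assuming non-constant inputs of equal length) looks like:

```python
def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank
    transforms of x and y, with tied values given their average rank."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            # extend j over a run of tied values
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1
            avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie run
            for k in range(i, j + 1):
                r[order[k]] = avg_rank
            i = j + 1
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)  # assumes neither input is constant
```

A perfectly monotone relationship between memory size and a metric yields a coefficient of 1.0 (or -1.0 if decreasing); values such as the 0.75 reported for Bonsai's RAM energy consumption indicate a strong but imperfect monotone association.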