
Advancements in cloud computing have boosted Machine Learning as a Service (MLaaS), highlighting the challenge of scheduling tasks under latency and deadline constraints. Neural network compression reduces latency and energy consumption in data centers at the cost of some accuracy loss, which aligns with efforts to minimize cloud computing's carbon footprint. This paper investigates the Deadline Scheduling with Compressible Tasks - Energy Aware (DSCT-EA) problem, which addresses the scheduling of compressible machine learning tasks on several machines with different speeds and energy efficiencies, under an energy budget constraint. Solving DSCT-EA involves determining both the machine on which each task will be processed and its processing time, a problem that has been proven to be NP-hard. We formulate DSCT-EA as a Mixed-Integer Programming (MIP) problem and also provide an approximation algorithm for solving it. Extensive experiments demonstrate the efficacy of our approach and its superiority over traditional scheduling techniques: it saves up to 70% of the energy budget on image classification tasks while losing only 2% of accuracy compared to running without compression.
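To make the structure of the problem concrete, the following is a minimal sketch of a toy instance solved by exhaustive search. It is not the paper's MIP formulation or its approximation algorithm. Every name and number in it is hypothetical, and it assumes that compression is modelled as a small set of discrete (work, accuracy) options per task, that tasks on a machine run in earliest-deadline-first order, and that the objective is total accuracy; the abstract does not specify any of these details.

```python
from itertools import product

# Hypothetical toy instance; all values are illustrative, not from the paper.
# Each task: (deadline in seconds, [(work in GFLOP, accuracy), ...]).
# Options are ordered from uncompressed to most compressed.
TASKS = [
    (2.0, [(4.0, 0.90), (2.0, 0.88), (1.0, 0.80)]),
    (3.0, [(6.0, 0.92), (3.0, 0.90), (1.5, 0.82)]),
    (3.0, [(4.0, 0.90), (2.0, 0.88), (1.0, 0.80)]),
]
# Each machine: (speed in GFLOP/s, power draw in watts).
MACHINES = [(2.0, 10.0), (4.0, 30.0)]


def evaluate(assignment, levels, energy_budget):
    """Return the total accuracy of a candidate, or None if it is infeasible.

    assignment[t] is the machine index of task t and levels[t] is the index
    of its compression option. Tasks are processed in deadline order.
    """
    finish = [0.0] * len(MACHINES)  # running completion time per machine
    energy = 0.0
    total_accuracy = 0.0
    for t in sorted(range(len(TASKS)), key=lambda i: TASKS[i][0]):
        deadline, options = TASKS[t]
        work, accuracy = options[levels[t]]
        machine = assignment[t]
        speed, power = MACHINES[machine]
        duration = work / speed
        finish[machine] += duration
        if finish[machine] > deadline:
            return None  # deadline missed
        energy += power * duration
        total_accuracy += accuracy
    if energy > energy_budget:
        return None  # energy budget exceeded
    return total_accuracy


def best_schedule(energy_budget):
    """Enumerate all candidates and return (accuracy, assignment, levels).

    Returns None when no candidate meets the deadlines and the budget.
    The search is exponential in the number of tasks, so it only suits
    tiny instances.
    """
    best = None
    level_ranges = [range(len(options)) for _, options in TASKS]
    for assignment in product(range(len(MACHINES)), repeat=len(TASKS)):
        for levels in product(*level_ranges):
            accuracy = evaluate(assignment, levels, energy_budget)
            if accuracy is not None and (best is None or accuracy > best[0]):
                best = (accuracy, assignment, levels)
    return best


if __name__ == "__main__":
    for budget in (1000.0, 40.0):
        print(budget, best_schedule(budget))
```

Tightening `energy_budget` forces the search toward more compressed options or toward the machine that spends less energy per unit of work, which is the accuracy-versus-energy trade-off the abstract describes. Exhaustive enumeration is used here only because the instance is tiny.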
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-NI] Computer Science [cs]/Networking and Internet Architecture [cs.NI], energy budget, [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS], scheduling, deadlines, [INFO] Computer Science [cs], neural network compression, [INFO.INFO-RO] Computer Science [cs]/Operations Research [math.OC]
