Towards Greener DoD HPC

Samsi, Siddharth (MIT Lincoln Laboratory)

McDonald, Joseph
Frey, Nathan
Bestor, David
Jones, Michael
Gadepally, Vijay

Other: High Performance Computing, Datacenter Energy Usage

Recent advances in AI have come at a significant cost to the environment. For example, according to the estimates of [1], training the popular natural language processing model BERT consumes roughly as much energy as the average US household uses in two months. Further, the increasing compute and energy budgets needed to achieve modern gains in AI systems are neither ethically nor economically sustainable [2]. This observation has also been made by the DoD, which included Energy Efficiency Targets for Data Centers in the 2022 NDAA (Sec. 2921). While the traditional focus has been on making algorithms and systems more efficient to lessen energy consumption, the unbounded nature of these problems implies that such solutions may not address the elasticity problem, in which people respond to energy reduction efforts by solving more or larger problems (e.g., pushing from 93% to 94% accuracy) that end up using more energy.

Our research on fast and ethical AI is pursued through three thrusts. The first, intervention, focuses on technology that reduces the resources required for a computation, either explicitly or under the hood. For example, naïve interventions such as power capping can consistently reduce overall energy usage by 15-20% with relatively low performance impact across a number of power-hungry AI applications. Other interventions can dynamically allocate AI and scientific workloads to hardware platforms tuned for the problem at hand. The second thrust, instrumentation, looks at low-overhead technology embedded in AI applications that gives feedback to coders, system administrators, and policy makers on resource utilization, for example by highlighting portions of an application with significant energy usage or by presenting users with the carbon impact of their research.
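The power-capping intervention mentioned above can be sketched in a few lines. The following is a minimal illustration, assuming an NVIDIA GPU managed through NVML (via the pynvml bindings); the clamping helper `clamp_power_cap` is a hypothetical name introduced here, and the pure-Python clamping logic can be exercised without a GPU.

```python
def clamp_power_cap(requested, lo, hi):
    """Clamp a requested power cap to the device-supported range.

    Unit-agnostic: pass all three values in the same unit
    (NVML reports its constraints in milliwatts).
    """
    return max(lo, min(hi, requested))


def apply_power_cap(requested_watts, gpu_index=0):
    """Apply a clamped power cap to one GPU.

    Requires the pynvml package, an NVIDIA GPU, and sufficient
    privileges to change the power management limit.
    """
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    # NVML constraints and limits are expressed in milliwatts.
    lo_mw, hi_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
    cap_mw = clamp_power_cap(requested_watts * 1000, lo_mw, hi_mw)
    pynvml.nvmlDeviceSetPowerManagementLimit(handle, cap_mw)
    pynvml.nvmlShutdown()
```

In practice, a site-wide cap of this kind (e.g., ~60-70% of a GPU's default limit) is what yields the 15-20% energy reduction cited above.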
The third thrust, designed to overcome the elasticity problem, looks at the incentives and institutions needed to promote and empower users to make "power efficient" choices through policy and behavioral tools. This talk outlines our research vision along with initial results from each of these thrusts.
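The low-overhead instrumentation described in the second thrust can be sketched as a small energy meter that integrates sampled power over the duration of a code region. This is an illustrative sketch, not our actual tooling: `EnergyMeter` and `read_power_w` are hypothetical names, and the injected power reader would in practice wrap a hardware sensor (e.g., NVML's `nvmlDeviceGetPowerUsage`, which reports milliwatts).

```python
import time


class EnergyMeter:
    """Estimate energy (joules) for a code region via a trapezoidal
    approximation: average of start/end power draw times elapsed time.

    read_power_w is any zero-argument callable returning watts, so the
    same code works against NVML, RAPL, or a stub for testing.
    """

    def __init__(self, read_power_w):
        self.read_power_w = read_power_w
        self.energy_j = 0.0

    def __enter__(self):
        self._t0 = time.monotonic()
        self._p0 = self.read_power_w()
        return self

    def __exit__(self, *exc):
        elapsed_s = time.monotonic() - self._t0
        self.energy_j = 0.5 * (self._p0 + self.read_power_w()) * elapsed_s
        return False


# Usage with a stubbed sensor that always reports 100 W:
with EnergyMeter(lambda: 100.0) as meter:
    time.sleep(0.05)  # stand-in for a training step
# meter.energy_j is now roughly 100 W x 0.05 s = 5 J
```

Feedback of this kind, attached to individual functions or jobs, is what lets users and administrators see where the energy actually goes.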

[1] Strubell, Emma, et al. "Energy and Policy Considerations for Deep Learning in NLP." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
[2] Thompson, Neil, et al. "Deep Learning's Diminishing Returns." IEEE Spectrum, 2021.