No one really knows how much energy AI is using

Full power behind AI’s green ambitions


No one really knows how much energy artificial intelligence is using, but the need to find out is getting more urgent.

It’s bonanza time in the land of computing, especially if you have bet large on artificial intelligence (AI). Opening his company’s autumn technology conference, Nvidia CEO Jensen Huang claimed: “Computing is advancing at incredible speeds. The engine propelling this rocket is accelerated computing, and its fuel is AI.”

Huang has good reason to be optimistic about the future of AI-driven computing. A decade ago, researchers at the Swiss research institute IDSIA took the deep-learning concepts developed by a small group led by Geoffrey Hinton, professor of computer science at the University of Toronto, and found they could speed up the processing using the parallel computing units inside graphics processing units (GPUs) originally developed to run 3D games. After training a deep neural network on road signs, they found the model could pick up such tiny hints in the shapes that it could read signs whose surfaces were almost completely bleached.

Not only did the IDSIA work help show how much deep learning would propel AI development, it underlined how important specialised accelerators would be in driving the revolution that followed – and, to an increasing extent, Nvidia’s revenues. The company’s top-end GPUs are now designed explicitly for speeding up AI models running in data centres rather than first-person shooters. GTC (Nvidia’s GPU Technology Conference) was not short of speakers promoting AI and high-speed computing as a way of improving life and combating climate change. The GPU-maker is far from alone: it has become something of an article of faith that information and computing technologies (ICT) will deliver overall savings in energy consumption through better planning.

Peter Herweck, CEO of Aveva, was typical of many executives in the ICT sector when he characterised the situation earlier this year for the World Economic Forum (WEF): “The key to unlocking a net-zero future in industry is transforming the way industrial teams work through digitalisation. Higher efficiency and more ambitious sustainability objectives are enabled today by technologies that provide real-time data to optimise and better automate industrial processes and energy management.”

The reasoning is simple: the more you know about a process, the more you can optimise it. AI delivers a mechanism for improving this understanding by finding many more patterns in data that lead to efficiency improvements. Two years ago, research by Ricardo Vinuesa, associate professor at KTH Royal Institute of Technology, Stockholm, and co-workers identified numerous situations where AI and similar data-driven algorithms could help meet close to 80 per cent of the 170 targets of the 2030 Agenda for Sustainable Development. At the same time, he warns that meeting a third of those targets could just as easily be made harder through the expansion of AI.

Examples of positive effects are not difficult to find, particularly in industrial control. An example that many have used over the past couple of decades is that of motors, where direct digital control proves far less wasteful than letting basic AC motors spin continuously and relying on gearboxes to control the speed of conveyors and grinders.
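A rough illustration of why, with illustrative numbers rather than figures from the article: for centrifugal fan and pump loads, the well-known affinity laws say shaft power scales with the cube of speed, so slowing the motor electronically saves far more energy than running it flat out and throttling mechanically.

```python
# Affinity cube law for centrifugal fan/pump loads: shaft power scales
# with the cube of speed, which is why electronic speed control beats
# running a motor at full speed behind a gearbox or damper.
def relative_power(speed_fraction: float) -> float:
    """Power drawn relative to full speed, by the cube law."""
    return speed_fraction ** 3

for s in (1.0, 0.8, 0.5):
    print(f"{s:.0%} speed -> {relative_power(s):.1%} of full power")
```

Running at 80 per cent speed draws only about half the full-speed power, which is the kind of saving fine-grained digital control unlocks.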

Fine-grained control can extend across an entire facility. Industrial giant Siemens is keen to build entire digital twins of factories and to use AI to manage processes, and this year signed a partnership with Nvidia to help that process along. Nvidia itself has demonstrated digital twins of warehouses, using a combination of AI and simulation to calculate the best layouts for shelves and to change in real-time how conveyor robots move around the space so that they do not accidentally block each other.

Though it can achieve energy savings in the mechanical systems it helps control, AI’s rampant growth raises questions over how big the net savings to society could be once the energy consumption of the part-cloud, part-local ICT systems is factored in. One major problem is working out just how much the computing consumes in the first place.

One eye-watering estimate hit the headlines in 2019 when researcher Emma Strubell and colleagues at the University of Massachusetts, Amherst found that developing one of the largest neural networks of the time, consuming some 650MWh of electricity, would emit as much carbon dioxide as five petrol cars do over their entire lifetimes. It was a statistic that has been repeated many times since in keynotes and talks on the environmental cost of AI, though often with one key detail missing.

Google engineers argued in a paper published in March this year that the headline figure that often gets quoted does not distinguish between the extensive rounds of training performed on different variants of a model before one shows enough promise to be trained fully, and the far cheaper final training run itself – though the Strubell paper did identify this difference. Even then, the energy cost of targeted training can still easily be counted in megawatt-hours.

At first the compute requirements for neural-network training were fairly modest. Even today, the kinds of models used for simple computer vision applications can remain fairly light in terms of energy. In their experiments to test the energy demands of AI, a team from the Allen Institute for AI in Seattle, working with Strubell and others, found even a fairly large version of the DenseNet neural network introduced in 2017 for computer vision took about 40Wh of energy to train in about half an hour, about the same as recharging 10 mobile phones.
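A back-of-envelope check of those numbers – the per-charge battery figure here is an assumed round value chosen to match the article’s comparison, not from the source:

```python
# Energy bookkeeping for the DenseNet figure quoted above:
# average power = energy / time, and the phone comparison follows
# from an assumed ~4Wh of energy delivered per full charge.
densenet_wh = 40        # training energy quoted in the article
train_hours = 0.5       # roughly half an hour
phone_charge_wh = 4     # assumed energy per phone charge (illustrative)

avg_power_w = densenet_wh / train_hours
phone_charges = densenet_wh / phone_charge_wh
print(f"average draw: {avg_power_w:.0f} W")
print(f"equivalent phone charges: {phone_charges:.0f}")
```

Forty watt-hours over half an hour implies an average draw of about 80W – roughly one bright incandescent bulb.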

The numbers shoot up dramatically for the huge models that have become famous for their apparent ability to understand written text and to connect words to images. Researchers at Stanford University’s Institute for Human-Centred AI regard these natural-language processing (NLP) engines as so important they decided to call them ‘foundation models’.

Based on the Transformer structure originally developed by Google Brain computer scientists, these networks can take days to train even on arrays of top-end GPUs. BERT-small, which is considerably smaller than one of the headline-grabbing neural networks like OpenAI’s GPT-3 or DALL-E, took the Allen Institute team a day and a half to train on eight Nvidia V100 GPUs, for a total energy cost roughly a thousand times higher than DenseNet, at 37kWh.

The Allen Institute group was unable to commit the resources to test the training of a larger six-billion-parameter model, which is about 30 times smaller than GPT-3, but estimated it would need 103.5MWh to complete the job. OpenAI reported that it needed close to 1.3GWh to train GPT-3.
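Putting the training-energy figures quoted above on a common scale shows how many orders of magnitude separate them; the numbers are the article’s, the ratios are simple division:

```python
# The training-energy figures quoted above, converted to watt-hours
# and compared against DenseNet as a baseline.
figures_wh = {
    "DenseNet (2017 vision model)": 40.0,      # 40 Wh
    "BERT-small": 37e3,                        # 37 kWh
    "6B-parameter model (estimate)": 103.5e6,  # 103.5 MWh
    "GPT-3": 1.3e9,                            # ~1.3 GWh
}
base = figures_wh["DenseNet (2017 vision model)"]
for name, wh in figures_wh.items():
    print(f"{name}: {wh:,.0f} Wh (~{wh / base:,.0f}x DenseNet)")
```

From the small vision model to GPT-3 is a jump of more than seven orders of magnitude.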

Working against this rising demand for energy is the improved efficiency of accelerators designed specifically for machine learning. A 2020 paper by OpenAI claimed algorithmic advances and the use of accelerators on AlexNet, a predecessor of DenseNet, cut the computational cost of training by a factor of 44 over a period of seven years. Similarly, in a paper published in the spring, Google Brain engineers pointed to the advances made in dedicated neural-network accelerators such as the company’s own Tensor Processing Units. They argued their one-trillion-parameter GLaM language model took about a third of the energy needed to train GPT-3. The resulting demand was still equivalent to the daily electricity supply to a small town.
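A rough cross-check of the ‘small town’ comparison, taking GLaM as a third of the GPT-3 figure quoted above; the per-household consumption is an assumed round value of about 10kWh per day, not from the source:

```python
# Sketch: how many household-days of electricity the GLaM training
# run represents, under an assumed 10 kWh/day per household.
gpt3_mwh = 1300              # ~1.3 GWh quoted for GPT-3
glam_mwh = gpt3_mwh / 3      # GLaM: about a third of that
household_kwh_per_day = 10   # assumed, illustrative
households = glam_mwh * 1000 / household_kwh_per_day
print(f"GLaM estimate: ~{glam_mwh:.0f} MWh")
print(f"= one day's electricity for ~{households:,.0f} households")
```

Even the more efficient model, in other words, consumed what tens of thousands of households would use in a day.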

Though it happens more often than full-blown architecture searches, a factor in training’s favour is that it is not an everyday process. Some applications such as automated driving may call for regular, even daily updates of models that are then downloaded to a fleet of vehicles. In many cases, the re-training will likely be a more restricted and less energy-intensive process and will be amortised across a large fleet.

Similarly, Nvidia wants to sell customers access to its own supersized NLP engine, Megatron-BERT, but it does not expect customers to fully train it. Foundation models benefit from being amenable to a fine-tuning process for specific tasks that consumes far less data and compute time than the original training. Typically, this process involves 10 per cent of the energy that goes into the initial training.

Most of the time a neural network model will be used for inferencing: analysing inputs based on what it has already been trained on. Amazon Web Services claimed in 2020 that inference accounted for 90 per cent of the infrastructure costs of AI. However, comparatively little research has gone into the energy usage of AI inferencing, though results have shown consumption to be highly variable.

“In our carbon-footprint characterisation, inference dominates universal language models’ overall carbon footprint, whereas for deep-learning recommendation tasks, the footprints between training and inference are a roughly equal split,” Meta research scientist Carole-Jean Wu said at the MLSys conference in the summer.

One big contributor to differences in energy lies in model accuracy, as a 2021 study by post-doctoral researcher Fernando Martínez-Plumed at the Valencian Research Institute for AI (VRAIN) and colleagues found. The top-performing computer-vision model of 2012 needed around two billion floating-point operations (2GFLOPs) to perform a single pass, but had only a 60 per cent chance of being correct when presented with a test image – a lower accuracy than the best available models today. A much larger, 90 per cent accurate model from 2021 needed more than 3,000GFLOPs; a modern model with 80 per cent accuracy reduced that overhead to 100GFLOPs.
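The diminishing returns in that comparison are easier to see tabulated; the numbers are the article’s, the labels are shorthand:

```python
# Accuracy versus compute per forward pass, from the VRAIN figures
# quoted above. Each extra step in accuracy costs far more compute
# than the last: 60->80% multiplies compute by 50, 80->90% by 30.
models = [
    ("2012 top model", 60, 2),          # 60% accurate, 2 GFLOPs
    ("2021 efficient model", 80, 100),  # 80% accurate, 100 GFLOPs
    ("2021 top model", 90, 3000),       # 90% accurate, 3,000 GFLOPs
]
prev = None
for name, acc, gflops in models:
    step = f" ({gflops / prev:.0f}x the previous step)" if prev else ""
    print(f"{name}: {acc}% accurate, {gflops:,} GFLOPs{step}")
    prev = gflops
```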


As with training, successive generations of accelerator have helped prevent inferencing energy growing too far and have provided ways to deploy complex models more cheaply. Stripped-down data formats, such as 16-bit types designed specifically for neural networks used in place of the conventional 32-bit IEEE-standard format, could push efficiency measured in billions of floating-point operations per second per watt (GFLOPS/W) to 1,000, compared to 100 or less for full-precision calculations, according to the VRAIN analysis. On top of that, overall hardware efficiency has improved: even at the same precision, accelerators and GPUs in 2021 provided 10 times more GFLOPS/W than those available in 2011.
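Because GFLOPS/W reduces dimensionally to GFLOPs per joule, those efficiency figures translate directly into energy per inference; a sketch using the 100-GFLOP model from the VRAIN comparison:

```python
# Energy per inference = work (GFLOPs) / efficiency (GFLOPS per watt),
# since GFLOPS/W is GFLOPs per second per watt, i.e. GFLOPs per joule.
work_gflops = 100   # the 100-GFLOP model cited above
for label, gflops_per_w in [("32-bit full precision", 100),
                            ("16-bit reduced precision", 1000)]:
    joules = work_gflops / gflops_per_w
    print(f"{label}: {joules:.1f} J per inference")
```

A tenfold gain in GFLOPS/W means a tenfold cut in joules per query, which matters when a model is run millions of times a day.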

Higher-level restructuring pays off even more. At MLSys, Wu outlined the process Meta’s AI operations use to try to cut down the number of cycles each model needs. The techniques include looking again at the structure of foundation models and slashing the amount of memory they need, which translates into large power savings. She said the approach was not designed to minimise the carbon footprint but has provided a way to make savings.

The VRAIN researchers found that, thanks to acceleration and other shortcuts, the models that make it into production – and so are far more likely to be replicated around the world in thousands or millions of instances – tend to be more efficient than those that grab the headlines on release. But savings can come with a sting in the tail.

“Improved efficiency can translate into more uses. This is also known as Jevons’ paradox: efficiency improvements can encourage higher uses, leading to even higher resource consumption overall. So, despite the fact that we can achieve higher performance efficiency, the overall footprint of machine-learning tasks continues to rise over time,” says Wu.


More efficient hardware may simply push AI into feeding various forms of Jevons’ paradox. The rise of autonomous vehicles could make transport more efficient and less polluting, a process that will be helped by electrification. But, some researchers argue, it could also lead to increased usage and even substitution for other, possibly less polluting, modes of transport.

In 2020, ETH Zurich academic Vlad Coroama and Daniel Pargman, who works at KTH, came up with the phrase “skill rebound” to describe this form of Jevons’ paradox: a situation where people locked out of using something find themselves able to use it once it has been automated. With self-driving vehicles, children and the elderly could call up a car they cannot drive today, as could people who want to work while they travel to their destination. Road usage and even air travel, made possible by a new generation of personal aircraft, could easily increase, potentially displacing more energy-efficient but individually less convenient choices. How much so is hard to gauge right now, as is the overall contribution of ICT to global energy use and carbon footprint.

A 2021 study by a team at Lancaster University estimated ICT’s contribution to global greenhouse gas emissions to be between 2 and 3 per cent, possibly up to 4 per cent. Some believe this proportion could easily reach 20 per cent in the coming years. But, in reality, no one yet knows. The Lancaster researchers and others have called for much greater use of measurement and transparency by large ICT users, and some teams have released tools to help the process.

“We need to encourage the ICT industry to address its own emissions and other environmental impacts,” says Lancaster University lecturer Kelly Widdicks. “And we need to take a more cohesive approach to this, considering the full lifecycle and all scopes of emission rather than the current focus on efficiencies and use-phase emissions. Exact measurement is very difficult to do and rebound effects make ICT’s impacts even more tricky to estimate, but the sector mustn’t delay efforts to reduce its impacts by trying to get accurate numeric values on its emissions.”
