
Sustainable software – how to close the metrics gap

Measuring the environmental impact of your software is a complicated task indeed. In the first of a two-part blog, we look at how best to obtain estimates of software energy use – and how these can provide the basis for improvements in sustainability.

Energy estimates matter

In the quest for more sustainable software, the pursuit of accuracy can often prevent us from making real progress. It's crucial to remember that approximations and estimates are valuable steps toward improvement. By focusing on actionable insights rather than perfect data, software professionals can make meaningful contributions to sustainability.

In the first part of this two-part blog, I look at how obtaining reliable estimates of your software's energy use gives you the basis for reasonable, actionable carbon metrics that can drive sustainability improvements.

Adam Coles: estimation is key

Why measurement is essential

Modern software development relies on a feedback loop of research and data analysis to continually improve and produce high-quality, efficient software. Much like metrics for availability or error rates, we need sustainability metrics to ensure that we’re making our software as sustainable as possible.

Software and hardware have significant environmental footprints, primarily through energy use and embedded carbon. We should also consider water usage, land impact and social effects. As responsible software professionals, we must strive to understand and minimise these impacts, even when they are hidden behind the abstraction of the cloud.

If you cannot measure it, you cannot improve it.

Lord Kelvin

What exactly are we measuring?

For those of us who agonise about the amount of recycling we do, or that flight to Europe, we’re used to thinking about our impact in terms of CO2e per year. So being responsible for 10 tonnes of CO2e output is bad, while less than 1 tonne is great. We think about things in a similar way when working out our corporate carbon reduction plans. How do we do the same for software?

Energy is directly related to carbon. Certainly the more energy you use running your software, the more CO2e you’ll also create, with the amount varying based on the energy mix available to your data centre.

If you can estimate the carbon intensity of the grid at the time you’re running (eg via the Electricity Maps API) and you know how much energy your software is consuming, then you can get a measure of how much CO2e you’ve emitted. Easy, right? Well, it is if your computer is attached to one of these (pictured).
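As a minimal sketch of that calculation, the Python below multiplies an energy figure by a grid carbon intensity. The function name and the example numbers are illustrative assumptions, not values taken from any real grid or API.

```python
def emissions_g_co2e(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Estimate emissions in grams of CO2e.

    energy_kwh: energy used by the software, in kilowatt-hours.
    intensity_g_per_kwh: grid carbon intensity, in gCO2e per kWh.
    """
    if energy_kwh < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and intensity must be non-negative")
    return energy_kwh * intensity_g_per_kwh


# Illustrative figures only: 2.5 kWh on a grid at 200 gCO2e/kWh.
example_grams = emissions_g_co2e(2.5, 200.0)
print(f"{example_grams:.0f} gCO2e")
```

Dividing the result by 1,000,000 converts grams to tonnes, which makes it comparable with the per-year figures mentioned earlier.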

When I first looked into this subject I was amazed at how many professionals were advocating running code on a reference computer attached to an energy meter – but it’s actually one of the easiest and most accurate ways to measure the energy use of software (once you’ve calibrated/accounted for the energy use of the rest of the OS/hardware).
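One way to account for the rest of the OS and hardware is to record the machine's idle power first and subtract it from the readings taken while your software runs. The sketch below assumes a meter that reports watts at a fixed sampling interval; the function name and sample readings are hypothetical.

```python
def software_energy_wh(loaded_watts: list[float], idle_watts: float,
                       interval_s: float) -> float:
    """Energy attributable to the software, in watt-hours.

    loaded_watts: meter readings (watts) taken while the software runs.
    idle_watts: baseline reading (watts) with the software not running.
    interval_s: seconds between consecutive readings.
    """
    # Power above the idle baseline, never below zero, integrated over time.
    joules = sum(max(w - idle_watts, 0.0) * interval_s for w in loaded_watts)
    return joules / 3600.0  # 1 Wh = 3600 J


# Hypothetical readings: one sample per minute for three minutes.
print(software_energy_wh([60.0, 70.0, 80.0], idle_watts=50.0, interval_s=60.0))
```

Running the same measurement before and after a code change gives a simple like-for-like comparison on the reference machine.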

For some teams this will be a decent way of working – try the software on the reference machine and at least you can see when changes result in changes to energy use. However this won’t give you an idea of the real energy use once that software is deployed to the distributed, autoscaled environments most of us are now using.

Image of an energy measurement device in wall plug socket

Challenges in the cloud

The cloud is essentially a network of physical data centres. Measuring the environmental impact of cloud services is challenging because of the opacity in how data is managed and the varying reporting standards of cloud providers.

Ideally each provider would supply you with easy to consume, highly detailed and accurate carbon use figures that would make it easy for software creators to quickly measure and improve their products, but unfortunately, for a variety of technical and commercial reasons, we are not there yet.

Mostly the hyperscalers’ carbon reporting tools take the form of carbon footprint calculators, using the GHG protocol scopes to give you an after-the-fact idea of how much carbon your service as a whole is emitting.

While some providers give you useful figures, none of them could be said to be comprehensive, timely or even accurate. Instead of relying on this self-reporting, we can also use open source tools such as the Cloud Carbon Footprint tool created by Thoughtworks which can give you more realistic figures but is based on a variety of crowdsourced and unofficial datasets.

Carbon footprint tools are useful in establishing a baseline figure for an existing service, but are less useful for teams who are developing or modifying a service in real time and need to see the impact of the changes they make.

Getting more detailed

What if you need real-time information about your software? Unfortunately this is where it gets more complicated and relies on an element of DIY.

The base requirement continues to be energy use – if you can measure energy then you can estimate carbon. But even energy is not something most services report, because most of the time software is running on a virtual machine, in a container or as part of a managed service on completely unknown hardware.

Luckily in many cases we do have a useful figure that has some correlation to energy – CPU utilisation. Measuring and displaying CPU usage is something DevOps engineers are very used to doing – to manage performance, trigger auto-scaling etc. In general, the higher the CPU usage of a process, the higher the power drain on the underlying hardware (though this isn’t a simple, directly proportional, relationship due to the fact that CPUs work more efficiently at higher utilisation).
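A common simplification is to interpolate linearly between a machine's idle power and its power at full load, which is similar in spirit to the approach used by open source estimators such as Cloud Carbon Footprint. The sketch below makes that assumption; the min and max watt figures are hypothetical placeholders, and real values would come from a hardware dataset.

```python
def estimate_watts(utilisation: float, min_watts: float, max_watts: float) -> float:
    """Approximate average power draw from CPU utilisation (0.0 to 1.0).

    Assumes power rises linearly from min_watts (idle) to max_watts (100%).
    This is a simplification: the real curve is not a straight line.
    """
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0.0 and 1.0")
    return min_watts + utilisation * (max_watts - min_watts)


def estimate_energy_kwh(utilisation: float, hours: float,
                        min_watts: float, max_watts: float) -> float:
    """Energy in kWh for a period at a given average utilisation."""
    return estimate_watts(utilisation, min_watts, max_watts) * hours / 1000.0


# Hypothetical figures: 1 W idle, 4 W at full load, 50% average over 24 hours.
print(estimate_energy_kwh(0.5, 24.0, min_watts=1.0, max_watts=4.0))
```

Note that the idle floor means a machine at 0% utilisation still draws power, which is one reason utilisation and energy are not directly proportional.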

Note that CPU utilisation is probably the most important figure in traditional workloads, but when it comes to AI you may find GPU usage is more important.

So great, now we can easily calculate the energy from CPU metrics, right? Again, this is where the cloud and modern systems complicate matters. Your reported CPU usage may apply to anything from a dedicated, reserved server instance to a VM, a Kubernetes node or a serverless function.

And, for some of your resources, eg vendor-managed SaaS products such as managed databases, you may not be able to get these figures at all. So the CPU figures your component reports may relate to some small percentage of actual hardware CPU load.

The people who know best how to relate those virtual CPU figures to actual energy and/or carbon use are the vendors – and depending on vendor and your level of trust you might find their proprietary carbon tools give you useful information (again, in that after-the-fact form). But to self-calculate energy use, and then carbon use, we have to turn to estimation.

Luckily for most of us, much of the groundwork to enable that estimation has been done by organisations with a keen interest in this area. Using detective work to identify and rate common hardware used by datacentres, academic research into hardware performance and analysis of common software loads, a number of organisations have surveyed and documented vendor hardware and created digital models and databases that can be used to create credible estimates of energy use based on supplied figures.

Examples include:

You can use these data sources to get credible estimates of your software’s energy use from its CPU use – this is still not necessarily an easy process but can result in good enough output. That means you can estimate how much carbon your software is producing, either by multiplying energy by grid emission factor or by obtaining real time/historical estimates of grid carbon intensity, which will give you either average or dynamic figures for CO2e produced by energy in kWh.
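To illustrate the difference between average and dynamic figures, the sketch below computes emissions both ways from hourly energy estimates. All names and numbers are made up for illustration; real intensity values would come from a grid data provider.

```python
def emissions_average(hourly_kwh: list[float], average_intensity: float) -> float:
    """gCO2e using a single average emission factor (gCO2e per kWh)."""
    return sum(hourly_kwh) * average_intensity


def emissions_dynamic(hourly_kwh: list[float],
                      hourly_intensity: list[float]) -> float:
    """gCO2e using a separate carbon intensity for each hour."""
    if len(hourly_kwh) != len(hourly_intensity):
        raise ValueError("need one intensity value per hour of energy data")
    return sum(e * i for e, i in zip(hourly_kwh, hourly_intensity))


# Illustrative data: the busiest hour coincides with the dirtiest grid.
energy = [1.0, 1.0, 2.0]           # kWh per hour
intensity = [100.0, 200.0, 300.0]  # gCO2e per kWh, averaging 200

print(emissions_average(energy, 200.0))      # 800.0
print(emissions_dynamic(energy, intensity))  # 900.0
```

In this made-up case the dynamic figure is higher because most of the energy is used when the grid is at its most carbon intensive, which an average factor hides.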

Note that you’ll have to carry out one more step here, as not all of the energy consumed in a datacentre is used to power your hardware. Energy is also used to power cooling, lighting etc and some of that is your responsibility, so you’ll also need to multiply by the vendor’s published Power Usage Effectiveness (PUE) figures.
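The PUE adjustment is a single multiplication, sketched below. The PUE value of 1.2 is an assumed example, not a figure published by any particular vendor.

```python
def facility_energy_kwh(it_energy_kwh: float, pue: float) -> float:
    """Scale hardware (IT) energy up to total datacentre energy.

    PUE is total facility energy divided by IT equipment energy,
    so it can never be lower than 1.0.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be lower than 1.0")
    return it_energy_kwh * pue


# Assumed example: 10 kWh of estimated hardware energy at a PUE of 1.2.
total_kwh = facility_energy_kwh(10.0, 1.2)
print(f"{total_kwh:.1f} kWh including cooling, lighting and other overheads")
```

The adjusted energy figure is the one to multiply by grid carbon intensity when estimating CO2e.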

Phew, we got there! We finally have some figures for carbon usage that you, as software service providers, can use to understand and improve the carbon efficiency of your products.

Key takeaways

  • It’s vital that we obtain useful metrics in order to make our software more sustainable
  • We can get some estimates easily using vendor carbon calculators, though the quality is questionable and the data is often incomplete
  • If we want more real-time estimates of carbon use that can drive software development improvements, we need to use estimation techniques, first to get figures for energy use and then to extrapolate to carbon output

In the second part of this blog I’ll be looking at how new tools such as the Impact Framework can make it easier to obtain and automate your carbon metrics, how we also need to include embedded sustainability impacts – and how we can use the same techniques to measure the impact of software on other areas, such as water use.


Authors

Adam Coles

Head of Sustainable Services
