Dark data: "The dirty secret of the ICT sector"
Poor storage practices are a massive drain on the world's resources - but there's a simple fix.
Advances in data storage and analytics mean businesses are now more incentivised than ever to gather and keep information about every aspect of their operations. But while this data has the potential to unlock great value, it has a hidden cost, too.
Storing unstructured and unused data generates millions of tonnes of greenhouse gases every year, as well as consuming huge amounts of water and taking up vast swathes of land; but businesses and data centres can make a major and immediate difference with one simple step: taking control of data storage.
"Dark data accounts for about 54% of data that's stored around the world," says Aoife Foley - IEEE senior member and a reader in the School of Mechanical and Aerospace Engineering at Queen's University Belfast. "We're talking about emails, customer calls, video recordings - a whole load of junk that's never used."
More than half of the world's data going unused? It's a startling figure - especially when you consider how storage is growing. Since 2016 the total amount of data stored worldwide has climbed from 2 exabits (Eb) to 6Eb in 2022, and is on track to reach 11Eb by 2025.
For reference, an exabit is one quintillion bits - that's 18 zeroes - or 1 billion gigabits.
"All that data comes at a cost. Data centres are storing huge amounts of data around the world that sits there for no function. In essence, that's just rubbish: 54% of it. Another 32% of it is redundant or abandoned data: it's data that did its function, it's probably a duplicate, it's sitting somewhere else, and it may have been required at one point in time, but it no longer is. And it has little or no value to the organisation... Even if they wanted to find it, they have no proper system in place to dig it out."
Foley and her colleagues found that only around 14% of stored data is critical - that is, bank records, online inventory management, and healthcare data. But to get to it, organisations are forced to wade through the sludge of dark data clogging up their servers.
Everything in a box
Foley comes from an engineering background - a strictly ordered industry where everything must adhere to certain ISO standards, from labelling charts to storing them. The lack of comparable standardisation is one of the biggest issues facing the IT sector today, she believes.
"When you look at the way the data is all gathered, it's very unstructured. As an engineer, to me, that's the antithesis of what we need to do. Everything is structured, everything goes in a box, everything is filed away carefully. If you go into a solicitor's office, a bank, anything, everybody has a number, everything is carefully recorded. But the problem is all of this glut of rubbish is going with the recording and the backup and the storage."
The answer is simple, but not easy: she believes data storage needs to be regulated by the International Organization for Standardization (ISO), much as engineering already is.
"The same thing needs to be done in different sectors: the legal sector, the medical sector, the commercial sector. They all need to have the same filing systems and they need to get rid of the rubbish."
Environmental impact
In their research, Foley and her colleagues looked at the countries with the most data centre activity globally: the UK, Germany, France, Russia, Ireland, South Africa, Italy, Switzerland, the Netherlands, USA, China and Brazil. What they found was massive variance: not only in power, water and land use, but in how much of that resource was given over to dark data.
"That growth [in dark data] is huge. The emissions and the energy footprint associated with that, aside from the land, it's sort of the dirty secret of the ICT sector, isn't it? Nobody wants to talk about it, because they don't know what to do with it."
The environmental cost of doing business online - and storing dark data - varies hugely around the world. Per gigabyte of data use, the CO2 equivalent (CO2e) footprint ranges from 28g to 63g; water use from 0.1 to 35 litres; and land use from 0.7cm² to 20cm².
Much of the variance is down to the local environment: Brazil, for example, is very hot, so water use for data centre cooling is well above the global average; but the country scores well on carbon footprint, with a good mix of renewables powering the grid.
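To put the per-gigabyte ranges quoted above in concrete terms, here is a minimal Python sketch that scales them to a given volume of data. It is an illustration only, not the researchers' method: the function names and the 10,000GB example estate are hypothetical, and it assumes that the article's per-gigabyte figures and its 54% dark data share can simply be multiplied through, which real footprints may not allow.

```python
# Per-gigabyte footprint ranges quoted in the article, as (low, high).
CO2E_GRAMS_PER_GB = (28.0, 63.0)
WATER_LITRES_PER_GB = (0.1, 35.0)
LAND_CM2_PER_GB = (0.7, 20.0)

# Share of stored data the article describes as dark.
DARK_SHARE = 0.54


def footprint_range(gigabytes, per_gb_range):
    """Scale a (low, high) per-gigabyte range to a given volume."""
    low, high = per_gb_range
    return (gigabytes * low, gigabytes * high)


def dark_data_footprint(total_gigabytes, dark_share=DARK_SHARE):
    """Estimate footprint ranges for the dark portion of a data estate.

    Assumption: the footprint scales linearly with volume.
    """
    dark_gb = total_gigabytes * dark_share
    return {
        "dark_gb": dark_gb,
        "co2e_g": footprint_range(dark_gb, CO2E_GRAMS_PER_GB),
        "water_l": footprint_range(dark_gb, WATER_LITRES_PER_GB),
        "land_cm2": footprint_range(dark_gb, LAND_CM2_PER_GB),
    }


if __name__ == "__main__":
    # Hypothetical estate of 10,000GB; the size is an illustrative assumption.
    for name, value in dark_data_footprint(10_000).items():
        print(name, value)
```

The width of each range is the point: the same volume of dark data can carry a very different footprint depending on where it is stored.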
France was one of the few countries in the research to score below the global median on carbon, water and land footprints dedicated to dark data. In fact, the country has one of the highest percentages of clean and identifiable business-critical and redundant data storage.
Why is that? Simply, "they're better at filing than we are [in the UK and Ireland]." That means less storage space - and energy, land and water - dedicated to storing dark data.
The problem is that few people think of something as simple as more efficient filing as part of the solution to climate change. Few people consider the effect of their digital filing ineptitude on their energy and carbon footprint. Instead, the digital landfill just keeps on growing as people add more and more data to the pile.
"There's no such thing as a free lunch. The likes of Dropbox or Google Docs or Apple, where they give you a certain amount of free storage - is that an appropriate, sustainable business proposition? Is it a sustainable service?"
An inconvenient truth
While regulation could address some of these issues, it's ultimately an unlikely answer. There is simply no political will to do something about dark data, says Foley, partly because of the economic strength of the tech industry.
Governments that tell companies like PayPal or Amazon to clean up their excess storage - probably by dictating to customers what they can and can't keep - could find themselves facing the loss of these major employers. Although Ireland has many favourable qualities for the data centre industry, it wouldn't be impossible to replace.
"This sector is like the goose that laid the golden egg, it's the gift that keeps on giving. But you have to ask yourself, when will they start to wake up to the fact that we're going to reach a point that 20% of energy demand on the planet is just from data? It's a bit of a worry.
"And that's the problem for Ireland. EirGrid, the national grid operator, has worked out that basically by 2030, up to 32% of the energy demand on the island of Ireland will come from data... So, that's the interesting thing for Ireland, and that's the problem for the Government. Damned if they do, damned if they don't. They're not going to hit their emissions reduction targets if they let this continue."
What can we do?
If governments have the incentive, but little will, to drive change, the responsibility falls to companies and individuals. It is up to everyone - especially IT leaders, with control over business data - to take more control over storage.
The benefits are not purely environmental. Better storage and archiving practices mean less time, energy and money spent retrieving data, and more effective data mining.
In the short term, take a hard look at your servers and determine what data you are storing that you no longer actually need: the dark data. In the medium term, write policies and processes to eliminate unstructured data, and check regularly for compliance.
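As a starting point for that short-term audit, the sketch below walks a directory tree and reports two kinds of candidate: files that have not been modified for a long time, and groups of byte-identical duplicates. It is a rough illustration rather than a recommended tool. The `audit` function, the two-year threshold and the use of modification time as a proxy for "no longer needed" are all assumptions, and it only reports; deciding what to delete or archive remains a policy question.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(root, stale_after_days=730, now=None):
    """Walk `root` and report stale files and groups of identical files.

    Returns (stale, duplicates): `stale` is a list of (path, size_in_bytes)
    for files last modified before the cutoff; `duplicates` is a list of
    sorted path lists whose contents hash to the same digest.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400  # seconds per day
    stale = []
    by_digest = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
                digest = file_digest(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if info.st_mtime < cutoff:
                stale.append((path, info.st_size))
            by_digest[digest].append(path)
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return stale, duplicates


if __name__ == "__main__":
    # "." is a placeholder; point this at the share or volume under review.
    stale_files, duplicate_groups = audit(".")
    print(f"{len(stale_files)} stale files, {len(duplicate_groups)} duplicate groups")
```

Hashing every file is slow on large volumes; a production audit would more likely group files by size first and hash only the collisions.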
Not only will these steps "safeguard the environment," they'll also give a better view of your own processes - and bring your company closer to its green targets.
"Corporate entities around the world...part of their profit is being tied into reaching and achieving sustainability targets in their own organisations. So, it'll all filter down if everybody starts doing it. But I think waiting for governments to do it is a bit of a problem, because the [Irish] Government doesn't want to upset the public, the voters, and it doesn't want to lose jobs from the ICT sector."
Without demand, "data centres aren't going to change their behaviour," despite rising energy costs - which will certainly be passed on to customers. Foley believes piling the pressure on is "a moral responsibility."
"Filtering out of the dark data and unnecessary information is a moral responsibility for individuals and organisations around the world. That can change the data centre business, really.
"It's like in the '70s, when we had landfills, and water pollution and rubbish... It's the same in the digital world: it's a dirty, filthy landfill. And that's what we need to ask ourselves: is it morally correct?"
Data collection and analytics have reached a level the world has never seen before; but like any new development, they have also brought new problems in the form of digital - and actual - pollution. The faster we act to curtail it, the faster we can derive actual value from the information we're storing.