The true cost of unstructured 'dark data' in the GDPR era

Kazoup's Johan Holder warns that unmanaged, unstructured data will pose major risks to organisations when the GDPR comes into force in just nine months

The EU's General Data Protection Regulation (GDPR) could finally make organisations take decisive action on what they've been ignoring for decades: 'dark data', spinning on disc, costing more money to store and protect than any organisation truly appreciates.

If the cost of storing a file was only the storage infrastructure it sat on, then it wouldn't be such a concern.

Dark data is the data that accumulates through both automated and manual processes, but which remains invisible to the business: idle, unanalysed and without any clear owner.

Most organisations simply buy more storage to throw at the problem and then ignore it. But in the new era of GDPR, unstructured data can no longer be forgotten, ignored and kept in an unstructured state; we need to start looking at intelligent ways of quantifying, categorising and analysing it.

That way, we can make informed decisions about whether to keep it, delete it or store it in a searchable format for future retrieval when required.

More than 80% of enterprise data is unstructured

Eighty per cent or more of enterprise data today is unstructured, according to analysts' consensus. Thirty years ago that would have been 80 per cent of a very small amount, but as the world becomes more connected, it's also generating exponentially more information.

Data is flooding in from a multitude of sources - some known, and some invisible. The 80 per cent is becoming a very large amount of data that needs to be addressed.

However, organisations have neither the time, the resources nor the technology in place to manage all that data effectively, let alone benefit from it. The trouble is, while big data and analytics remain in vogue, neither the volume of data produced nor the impulse to store it all will change.

In pursuit of business intelligence, many organisations are hoarding - often unconsciously - useless data. That means more important insights will be lost in endless exabytes of inaccessible (dark) data. Organisations are not only missing out on potential insights while they struggle to manage redundant dark data; they are also spending time, energy and, most important of all, money trying to keep dark data under control.

The security consideration of unmanaged data

As organisations lock down sensitive information in structured systems, it is becoming more difficult for hackers to obtain access. Cyber criminals are now turning their attention to the unstructured data, often perceived as less critical and, usually, less well managed and understood.

GDPR requirements for unstructured data

Most of the advice dished out for getting organisations ready for GDPR focuses on the structured data held, mostly in CRM and ERP systems.

For structured data, it is relatively simple to put in processes for data discovery and removal. But GDPR doesn't apply just to data that happens to be in a structured format; it applies to all personal data, whatever its format.

Consider how much more difficult it is to search and remove unstructured employee data, such as a scanned passport, for example.

This is where dark data becomes a much bigger problem. No longer is that scanned passport sitting innocuously on a file server, slowly becoming 'dark' and not providing value to the business. Now, a user demand for the removal of that data will have a significant cost in man-hours for an exhaustive search.

Or, in the worst case, if that data is revealed in a breach, you run the risk of significantly higher fines under GDPR.

So, we need to ask ourselves the question: is the visibility of the data we hold more important than ever before?

When the first proposals for the GDPR were released on 25 January 2012, few paid much attention to them. Following much to-ing, fro-ing and negotiating, the regulation was adopted on 27 April 2016 and will become enforceable in its entirety from 25 May 2018.

With this date fast approaching, organisations can no longer afford not to know: ignorance will be no defence, and every organisation must make sure it implements an effective data governance strategy.

The GDPR aims to address the data protection challenges that have resulted from "rapid technological developments and globalisation" by unifying rules and rights across the EU. GDPR is significant for a number of reasons, but most notably because the maximum financial penalty for an infringement is increasing from the current UK ceiling of £500,000 to €20m or four per cent of annual worldwide turnover, whichever is higher.
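To see how that ceiling scales with company size, here is a minimal sketch of the arithmetic. It assumes the upper tier of fines is the greater of the two figures quoted above; the function name is ours, purely for illustration, and the result is a theoretical maximum rather than a prediction of any actual penalty.

```python
def max_gdpr_fine_eur(annual_worldwide_turnover_eur: float) -> float:
    """Theoretical upper-tier ceiling: the greater of EUR 20m or 4% of turnover."""
    fixed_cap_eur = 20_000_000
    # 4% of turnover, written as * 4 / 100 to avoid float rounding on 0.04
    turnover_cap_eur = annual_worldwide_turnover_eur * 4 / 100
    return max(fixed_cap_eur, turnover_cap_eur)


# A business turning over EUR 100m is still exposed to the EUR 20m figure,
# while one turning over EUR 1bn faces a ceiling of EUR 40m.
small = max_gdpr_fine_eur(100_000_000)
large = max_gdpr_fine_eur(1_000_000_000)
```

The point for smaller organisations is that the fixed figure, not the percentage, sets their exposure.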

Research carried out by NCC Group suggests that the fines from the Information Commissioner's Office (ICO) against British companies for data breaches in 2016 could have been £69m rather than £880,500 if maximum fines under the pending GDPR had been applied.

Indeed, the fine of £400,000 levied against internet service provider TalkTalk for security shortcomings that helped hackers to access customer data could have been as high as £59m under the GDPR.

This legislation will also give consumers the ability to manage who has their data and what they do with it. As individuals, we should be delighted. But as organisations, we need to fully understand the data we hold, or at least be able to quickly produce such data if requested.

The invisible file containing a credit card number, passport or National Insurance number could potentially cost an organisation millions if it is not managed properly or removed appropriately on request - and evidenced. The new regulations must not be treated lightly; organisations that do so will be left behind.
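As a rough illustration of what finding that invisible file involves, the sketch below scans extracted text for strings that look like payment card numbers or National Insurance numbers. The patterns are deliberately simplified assumptions, not validated formats, the function names are hypothetical, and a scanned passport image would first need OCR, which is not shown here.

```python
import re

# Simplified, illustrative patterns (assumptions, not authoritative formats)
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
NINO_RE = re.compile(r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum, used to discard digit runs that cannot be card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_personal_data(text: str) -> dict:
    """Return candidate card numbers and NI numbers found in a block of text."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            cards.append(digits)
    ninos = [match.group() for match in NINO_RE.finditer(text)]
    return {"card_numbers": cards, "ni_numbers": ninos}


sample = "HR note: card 4111 1111 1111 1111, NI QQ 12 34 56 C"
hits = find_personal_data(sample)
```

Pattern matching of this kind only flags candidates for review; it will miss data in images and unusual layouts, which is exactly why unindexed file shares are so costly to search by hand.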

Gartner estimates that the average cost of storing a single terabyte of file data is as much as $3,351 a year. The cost of one terabyte of mismanaged file data under the GDPR, however, will be far greater.

How to comply with GDPR

We can either ignore it and hope for the best, or we can tackle the problem from the top down. So here's where to start:

1. Implement a GDPR programme

Designate a Data Protection Officer, if required, or someone who can take responsibility for data protection compliance. They should have a mandate that is not only regulatory but also focused on driving business innovation. A budget is also required to implement the right tools, processes and standardisation to manage the right information effectively.

2. Awareness training and documentation

Make sure that decision makers and key people in the organisation are aware that the law is changing. Document what personal data you hold, where it came from and who you share it with. Make sure you have the right procedures in place to detect, report and investigate a personal data breach, as well as deal with data access requests.

3. Automate your data lifecycle

Create a self-sustaining data lifecycle based on retention policies that reflect the value of individual files to your business. Use hybrid cloud storage across the data lifecycle and automate the management and movement of file data across local and cloud services.

4. Empower your staff

Give your team the tools to find, discover, prepare, analyse, share and store information in a controlled way. Take steps to improve business agility, speed up value realisation and make business people accountable for their data.
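To make the third step more concrete, here is a minimal sketch of a retention rule that decides what should happen to a file based on its age and whether it holds personal data. The thresholds, action names and function name are hypothetical placeholders; real values would come from your own retention policy and legal advice.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, for illustration only
ARCHIVE_AFTER = timedelta(days=365)              # move off primary storage
REVIEW_AFTER = timedelta(days=365 * 6)           # general files
REVIEW_PERSONAL_AFTER = timedelta(days=365 * 2)  # files holding personal data


def retention_action(last_modified: datetime, now: datetime,
                     has_personal_data: bool = False) -> str:
    """Pick a lifecycle action for one file from its age and sensitivity."""
    age = now - last_modified
    review_after = REVIEW_PERSONAL_AFTER if has_personal_data else REVIEW_AFTER
    if age >= review_after:
        return "review_for_deletion"
    if age >= ARCHIVE_AFTER:
        return "archive_to_cloud"
    return "keep_on_primary"


now = datetime(2018, 5, 25)
recent = retention_action(datetime(2018, 1, 1), now)
old_general = retention_action(datetime(2016, 1, 1), now)
old_personal = retention_action(datetime(2016, 1, 1), now, has_personal_data=True)
```

Run on a schedule against file metadata, a rule like this is what turns a written retention policy into something that operates without anyone having to remember it.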

Under the regime inaugurated by the GDPR, unmanaged and unstructured data can no longer be left to wither. Everything needs to be intelligently discoverable, regardless of whether it sits onsite or in the cloud.

"Your data, your responsibility" is a clause now embedded in the majority of, if not all, public cloud contracts. It is important to understand this in order to have an effective data protection strategy in place. It needs to be more than just words on a page this time round.

And, of course, if you suffer a data breach, the ICO will come knocking, and if it is not impressed you could be in for a financially painful fine on top of the costs of dealing with the fallout from the breach.

Johan Holder is co-founder and CEO of Kazoup, a cloud-based file management company that can help organisations organise files over Office 365, Slack, Google Drive and many other services and applications.