Generative AI and data ethics: Just because you can, does it mean you should?

Natalie Cramp & Alistair Dent of Profusion


Natalie Cramp and Alistair Dent of data consultancy Profusion set out the impact of generative AI on data ethics on day 2 of the Computing Cybersecurity Festival.

Given that around 25 million people every day use ChatGPT or other generative models, the widely acknowledged inaccuracies in the datasets these models were trained on, highlighted by researchers such as Timnit Gebru, have some serious implications. The failure of vendors such as Gebru's former employer Google to acknowledge these shortcomings, and the failure of the tech industry to self-regulate, presents a problem for the rest of us.

"We clearly cannot trust the people putting these models out into the world to mark their own homework," said Natalie Cramp, CEO of Profusion. "So then you look to the law, turn to compliance. But technology moves a lot faster than the law so where do we go?"

According to Cramp, cybersecurity professionals can play a vital role in helping their employers strike a pragmatic balance: seizing the opportunities presented by AI-driven technology while reducing the risks it creates.

"That's because we have to go beyond compliance. This is not about following a set of rules. It's about making a series of judgement calls, and they're going to be different in every single organisation depending on what you do."

Cramp gave the example of an algorithm used to predict when someone is in the market for a new car, so that marketing efforts aren't wasted on people who aren't looking, while those who are can be invited for a test drive. Most people are probably reasonably happy to have their data used in this way. But if a gambling company uses data in the same way to predict when people are likely to gamble, ethically you're in very different territory.

Data ethics and security

Alistair Dent, Chief Strategy Officer, went on to explore the overlap between questions of ethics and questions of security, breaking the challenge into four distinct areas.

"The first is our data supply chain. Almost every platform that we use, almost every partner that we engage with, is creating data that we end up using. We have to ask if our suppliers and partners are ethically using the data that ends up benefiting us. Understanding that supply chain is critical to being able to make ethical decisions.

"The second is the use of encryption. Right up to governmental level around the world, there are questions about the use of encryption for ethical reasons and balancing the expectation of privacy against the ability to know how platforms are being used and misused."

The third aspect is accessibility. Dent continued:

"There's an accessibility versus security question. Different people have different requirements of the workplace and adding additional barriers and thresholds for security reasons can disadvantage people who struggle to get through those barriers. How do we try and balance the ethical question of how best to enable them to do their job to the best of their abilities with the security question?

"Finally, understanding the lifecycle management of our data. When we're trying to go beyond compliance, we need to think about where this data is going. How is it being stored? What's happening to it when we're done with it? How is it being destroyed?"

People-first approach

"Most security issues are created not by maliciousness but by human error," Cramp reminded the audience. "It's really the same in data. One thing that cybersecurity and other tech leaders can do is to ensure that a wide range of voices are heard in these debates."

This avoids scenarios where algorithms make mistakes based on historically biased data, and there isn't a human being in the loop to spot it. Other aspects to consider are data literacy and the data culture among employees.

"Generative AI is in your organisation," Cramp emphasised. "People are using it, and if they do not have enough of a baseline level of knowledge to make informed decisions about how they use data, what they give away, how they store it and so on, then we're opening ourselves up to a lot of risk.

"It should be part of training every single year. In the same way as we put security and compliance training in place, we need to include data ethics."

The final people principle is accountability. Who is accountable for data ethics in your organisation, and does everyone know who that person is?

Cramp finished her talk with a list of four tests that leaders can apply to answer the question, "We can, but should we?"

The first is the sunlight test: if the thought of what you're planning being made public makes you uncomfortable, you might want to think again. The second is accountability, and the third is disadvantage: does your plan potentially disadvantage a particular cohort?

"The final and probably most important test," Cramp concluded, "is whether this benefits the customer or just you. Data should be a value exchange. If it just benefits you, you probably shouldn't be doing it."

The Profusion Good Data Guide will be available from Friday.