February 10, 2022

AI Bias: Shifting from Old Misconceptions to New Practical Ways to Mitigate Business Risks And Drive Impact

Nicolai Baldin is joined by Dr. Ansgar Koene, the Global AI Ethics and Regulatory Leader at Ernst & Young (EY). Together they examine the ethics and the potential risks to the business of artificial intelligence, how to define those risks and what happens when a business fails to properly manage the risks involved in developing and using AI. They discuss how much data plays into this and how to mitigate those threats.

The full recording of their discussion is available here.

Ansgar, let's start with you telling us a bit about your work and your background.

Ansgar: I'm the Global AI Ethics and Regulatory Leader at Ernst & Young. The function was initially created in the Global Innovation team but is now part of the Global Public Policy team. This reflects how important innovation is becoming for policymakers, who need to anticipate the direction in which these technologies are going and the impact that they're having on society.

In addition to my role at EY, I'm also a Senior Research Fellow at the University of Nottingham, where I've been involved in many research projects related to how these technologies are impacting society, and also how people (especially young people) understand these technologies and interact with them. Through that work, I also engage with various civil society organizations, a primary one being the 5 Rights Foundation (focused on the rights of young people online).

Most of my current work focuses on how these technologies impact people, society, and businesses, on how we can create a good environment where the benefits of these technologies can come out, and where we can address the potential problems that might arise from them. Part of working in this space is also engaging with the development of standards. This includes working with the British Standards Institute and the European standards bodies to try to provide clear guidance on what best practice is and how developers and deployers of these technologies can implement it, to make sure that they're doing so in the best possible way.

My background is in Engineering, with a Master's degree in Electrical Engineering and Control Systems, where I focused on robotics. My Ph.D. was in computational neuroscience, looking at how we can use control systems thinking to better understand how biological networks, neural networks, machine learning, and other learning systems function, and bringing those two areas together. I spent 15 years as an academic working on the intersection between robotics, neuroscience, psychology, and AI. Gradually, through working in that space, I became more focused on the impacts of this technology on society, and that transitioned into the focus that I currently have on the public policy side.

Could you introduce the concept of ethical AI and AI bias, please? 

Ansgar: Yes. When we're talking about ethical AI, we're trying to make sure that we are thinking about AI, not just in terms of whether it’s performing a particular task, such as measuring the accuracy of an outcome, but rather we're thinking beyond that to what are the actual implications that this has on the immediate stakeholders that are being engaged with, and also the wider society. Bias is a clear example of one of the types of ethical concerns that arise from AI.

It's important to be clear about what we mean when we say “bias” in AI systems, because it depends on how you interpret the task of the system. You can interpret any AI system as being biased. A completely unbiased system would just produce uniformly random outcomes, which are useless. What we mean when we say we want to avoid bias, or mitigate unintentional biases, is that the outcomes should differentiate, so that we get different outcomes for different kinds of inputs, but that the differentiation must be based on things that are relevant to the task.

Let’s take an example that highlights how something could be considered biased from one perspective, but might not be biased in a particular application domain. Is an AI system that prefers giving certain jobs to taller people rather than to shorter people biased? Yes and no. It depends. If there is a functional reason why a job is more suitable for tall people, then this is not an unjustified bias, but rather the AI system performing its task. If, however, we cannot give a clear rationale for why this differentiation makes sense for the purpose of the task, then this would be considered an unjustified bias and something that we're trying to remove.

Do you think that there is any misalignment between the way the concept is being addressed in the scientific community and some finance and healthcare organizations? Is there potential to further educate on the topic? 

Ansgar: Certainly. For starters, there are many conversations about completely unbiased datasets, which is something that doesn't make any sense if you haven't clarified what the data set will be used for and how it's being assessed. There’s a need for having an actual understanding of what the AI system is designed to do: the intersection between explainability and addressing bias issues needs to be brought forward more in some of the conversations. 

When it comes to how certain business sectors are applying it, there is a difference in how much experience some have in thinking about bias issues. In the insurance sector, for instance, which has a long history of using statistics of certain population groups, etc. to make decisions, there is a greater understanding of where bias might come from and the need to provide explanations about why a particular differentiation makes sense.

In other sectors that have not traditionally used these data- and statistics-driven approaches, there's less experience and knowledge of how to deal with it. There's also a more naive approach: trying to throw enough data at the problem to get rid of bias, or trying to remove all data bias without having defined in advance what the data is going to be used for.

So in my opinion, there is still some learning necessary, both in how certain industries are using these technologies and in ensuring that policymakers have a good understanding of how they operate, since policymakers are now reaching the point of discussing how best to regulate and support them.

Nicolai: We also see several companies being able to scan different tables, columns, and databases to understand the different concerns at the data level as well. As you mentioned, data bias and AI bias depend on many variables, such as different legislation in different parts of the world. For that reason, it's important to be able to scan very quickly, understand the current situation, and make decisions based on it. We see huge potential and a market opportunity in simply providing a bit more clarity on the topic and giving business units some very simple reporting on AI risk, data risk, and data bias.

What do you think are the most interesting questions we are trying to find answers to collaboratively right now?

Ansgar: First, on the topic of bias. I think one of the areas of debate is how we should approach this. The traditional approach to avoiding unintended discrimination has been “you're not allowed to collect certain data about people”. For example, to avoid gender discrimination, you're not allowed to ask what the gender of the person is.

Increasingly, we are seeing that this is not a good approach to take when it comes to avoiding gender bias in AI systems, because there are just so many other factors that correlate with somebody's gender. In my opinion, the better approach is to actually include the factor of gender in the data set in order to enable the assessment of whether the system is being unintentionally biased along this dimension. But this means there's a need to reassess how our legislation in this space is formulated, particularly in certain domains such as credit risk assessments.

Currently, companies are not willing to collect such data, as they are afraid of not complying with the letter of the law. There is a need to rethink this and we see some of that debate already happening, for instance, the EU’s AI Act, where there is a clause around collecting certain personal and sensitive data to mitigate concerns of unintentional discrimination, due to correlations with secondary factors. So that's one element where there's some new thinking that needs to come in, because of the power these technologies have to find correlations. 
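
The point about correlated factors can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not something described in the discussion: the data is made up, `part_time` is a hypothetical proxy that correlates with gender, and `gender_blind_model` is a toy decision rule that never reads the gender field. The sketch suggests that withholding gender from the model does not remove the disparity, and that the disparity can only be measured because the gender column was kept for auditing.

```python
import random


def selection_rates(records, group_key, decision_key):
    """Return the share of positive decisions for each group."""
    totals, positives = {}, {}
    for row in records:
        group = row[group_key]
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(row[decision_key])
    return {g: positives[g] / totals[g] for g in totals}


def make_applicants(n=10_000, seed=0):
    """Made-up applicants; 'part_time' is a hypothetical proxy for gender."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gender = rng.choice(["female", "male"])
        # Assumed correlation: part-time work is more common for one group.
        p_part_time = 0.6 if gender == "female" else 0.2
        rows.append({"gender": gender, "part_time": rng.random() < p_part_time})
    return rows


def gender_blind_model(row):
    """Toy rule that never looks at 'gender', only at the proxy."""
    return not row["part_time"]


applicants = make_applicants()
for row in applicants:
    row["approved"] = gender_blind_model(row)

# The audit is only possible because 'gender' was collected and kept.
rates = selection_rates(applicants, "gender", "approved")
gap = rates["male"] - rates["female"]
print(rates)
print("approval gap:", round(gap, 2))
```

With these assumed probabilities the approval rate differs markedly between the two groups even though the rule is gender-blind, which is the kind of unintentional bias the audit is meant to surface.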

Nicolai: I agree. We need to make sure that we put data into a system that has different balances and classes, to understand and evaluate the potential impact on the system from a bias point of view.

Also, you mentioned there is quite some work being done around artificial data, to try and collect more data and put it into the systems. So it's clear that we can simulate a given behavior and test for bias. 

Do you see companies moving towards such applications?

Ansgar: We do see companies talking about this, and exploring how to do it. We also see some companies coming onto the market to offer synthetic data. The challenge is that you always need to be clear as to what it is exactly you want to achieve. So, first of all, what are the main drivers for us trying to move towards synthetic data as opposed to trying to collect more real data? 

Well, one aspect is that for certain minority groups, there may simply not be enough data available to create a robust machine learning model. And simply collecting more data of the general type would just increase the bias that you have towards the majority data group. So that doesn't work.

There may even be cases where minority groups, for historical reasons, are very reluctant to hand over data. It has not historically been in their interest to cooperate in data collection. So there is a need to come up with a different way to create sufficient datasets.

How do you create sufficient data sets? Well, one of the important elements (and a recurring theme throughout) is the discussion about how to address ethics. You need to be very clear as to what you want to achieve, what the context is, and where this AI system will be used. If you're creating synthetic data, you're taking some of the original data and then creating new instances, which are randomized combinations of the original data that reproduce the key statistics that were in the actual data that you collected.

But in order to know the key statistics, you need to know what it is that you are working towards. What factors are going to be important, and therefore what are the elements where you need to be recapturing those statistics? You cannot recapture all possible statistics that were in the data because that would mean having real new data. That's an intrinsic challenge with synthetic data, but it is a challenge that is very addressable. A recurring theme in a lot of the ethics discussions is to be clear about: where are you actually using this system, who are the stakeholders, which community is going to be impacted by it? And therefore you’ll know the community whose statistics you need to be capturing in the right way. 
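
One very simple way to picture "reproducing the key statistics" is the following Python sketch. It is not a description of any particular synthetic data product: the "real" data is made up, and the choice of key statistics (means, standard deviations, and one correlation) and the Gaussian sampler are assumptions for illustration. The sketch fits those statistics on the original data, samples new rows that match them, and then shows that a statistic that was not chosen (here, the minimum income) is not preserved.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_key_statistics(xs, ys):
    """The 'key statistics' chosen here: means, std devs, correlation."""
    return {
        "mean_x": statistics.fmean(xs), "std_x": statistics.pstdev(xs),
        "mean_y": statistics.fmean(ys), "std_y": statistics.pstdev(ys),
        "corr": pearson(xs, ys),
    }


def sample_synthetic(stats, n, rng):
    """Draw n synthetic (x, y) pairs from a bivariate normal matching stats."""
    r = stats["corr"]
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(stats["mean_x"] + stats["std_x"] * z1)
        ys.append(stats["mean_y"]
                  + stats["std_y"] * (r * z1 + math.sqrt(1 - r * r) * z2))
    return xs, ys


rng = random.Random(1)
# Made-up "real" data: age in years, income in thousands with skewed noise.
real_age = [rng.uniform(20, 60) for _ in range(5000)]
real_income = [20 + 0.5 * a + rng.expovariate(0.1) for a in real_age]

real_stats = fit_key_statistics(real_age, real_income)
syn_age, syn_income = sample_synthetic(real_stats, 5000, rng)
syn_stats = fit_key_statistics(syn_age, syn_income)

for name in real_stats:
    print(f"{name}: real={real_stats[name]:.2f} synthetic={syn_stats[name]:.2f}")

# A statistic that was NOT chosen is not preserved: the made-up real incomes
# never drop below 30, while the Gaussian synthetic incomes do.
print("min income:", round(min(real_income), 1), round(min(syn_income), 1))
```

Which statistics to fit is exactly the decision that depends on knowing the intended use and the affected community; a different use case would call for a different set.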

Nicolai: I agree. We've spent over three years in the area of synthetic data, and I understand that there is some misconception and misunderstanding of the concept. We've seen companies using it quite badly, and there are a number of academic papers on the topic as well. That's why I think it's very important to work collaboratively on resolving those issues and improving understanding.

Another topic is historic data. We've seen many companies using historic data to understand AI risk. But, as you mentioned, it may not always be a solution, because historic data may not reflect reality if it does not have enough representation of different groups. So you need to somehow augment and improve it before testing and before using it to assess AI risk.

What are common practices to resolve key misconceptions when it comes to the impact of historic data on AI models and AI risk?

Ansgar: Firstly, historic data is always going to be the starting point, as that data reflects the present and the past. It cannot be data that reflects the future because this hasn’t happened yet. We are always dealing with historic data in one way or another. The challenge is making sure that you understand that data sufficiently, and whether that data reflects the kind of condition that you want to capture. 

One issue is that historic data is biased in the way in which it was captured. An example of that is if we are collecting data through the internet and people's use of internet services, it’s going to be skewed towards those who use the internet more. Populations that don't interact as much online will not be reflected in the data set. That is a skewing of the data based on the population groups who are being captured in it. We see a similar problem in recruitment data. For instance, if it is data that reflects who historically has applied for these jobs, and if the job was historically male-dominated then the dataset is going to be male-dominated. That means the quality of the inferences that you can make about the male population will be better than the quality of the inference that you can make about the female population.
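
The difference in inference quality between a well-represented and an under-represented group can be made concrete with a small Python sketch. The applicant counts and the success rate below are hypothetical, chosen only for illustration, and the sketch uses the standard formula for the standard error of an estimated proportion under the assumption of independent samples.

```python
import math


def standard_error(rate, n):
    """Standard error of an estimated proportion from n independent samples."""
    return math.sqrt(rate * (1 - rate) / n)


# Hypothetical historic applicant counts for a male-dominated job.
historic_counts = {"male": 9000, "female": 1000}
assumed_rate = 0.3  # assumed true success rate, identical for both groups

errors = {g: standard_error(assumed_rate, n) for g, n in historic_counts.items()}
ratio = errors["female"] / errors["male"]

# Doubling the dataset with the same mix shrinks both errors but keeps the gap.
doubled = {g: standard_error(assumed_rate, 2 * n)
           for g, n in historic_counts.items()}
ratio_doubled = doubled["female"] / doubled["male"]

print(errors)
print("female/male error ratio:", ratio, "after doubling:", ratio_doubled)
```

Under these assumptions the estimate for the smaller group is three times less precise, and collecting more data in the same proportions does not change that ratio.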

The other question is about what kind of bias we have in the decision-making that was made historically. There is a huge gap between the society that we want to have versus the society that we currently have. This has always been the case, and as our society has progressed from what it is, to what it wants to be, usually what it wants to be has moved further ahead. But this means that the historical data that we are looking at would entrench the same kinds of biases that we had in historical decision-making. 

We would have to create new data to fill in the gaps. It means there is a need to take responsibility for how you are collecting and using the data. However, if, for instance, we take the example of news content recommendations, does the company providing the recommendations have the moral authority to say what you should see, even if it isn't the dominant news story at that moment? We get into complicated feedback loop problems because the producer of the recommendation system is influencing what is visible. You need to be aware of those factors to be able to address them.

You cannot simply collect data and feed it into the machine learning system without looking at it, cleaning it, addressing it, and making sure that you're producing the decisions that you are looking for. What that means is you need to be clear about what you're trying to achieve. This is an interesting discussion that has also been coming up in academia. Can machine learning lead us to a post-theory space where we don't need to have theoretical models of how things operate, or cause-and-effect models, and where we can simply throw the data in, do the correlations, and operate based on statistics? If you do that, you will entrench existing problems and will not be able to get the outcomes that we as a society want.

What are some of the incentives for financial services, insurance, and healthcare organizations to invest in better understanding AI and data bias, and what impact would you expect it to have? 

Ansgar: Firstly, you need to understand the data in the right way in order to produce meaningful predictions and to understand where your system is potentially vulnerable to changes. There is also a need to address the regulatory requirements and to be able to show that your models are reliable and that you understand what is going on. There is still some conversation happening between the regulators and the industry as to what exactly proving the resilience of the models means in the context of AI.

Ultimately, business risk comes down to long-term assurance that the business is going to be able to grow. That means not just short-term quick profit, but also how we are building reputation, respect, and trustworthiness: ensuring that we are clear as to how our AI systems operate, that we can explain why they are performing in a certain way, and that we address unintended bias in the systems.

Finally, what world problem could AI easily solve? 

Ansgar: AI can play an important role in helping to solve a lot of our problems, including climate change and social justice, but it is not going to be the silver bullet that will solve things on its own, as is generally the case. AI can provide part of the puzzle, but it will only really work if the other parts of the puzzle also exist. I think that is a very important element that we need to keep at the front of our minds as we try to address any kind of problem.