
Should we let AI decide?
Getting the artificial intelligence process right the first time around enables it to be the best it can be.
Women were finding it tough to get jobs at Amazon. Not, it turned out, due to conscious discrimination in the HR department, but instead because of bias inherent in the AI-driven selection algorithm.
The problem is that not all artificial intelligence (AI) systems are equally good. Ideally, they start off good and become better, but equally they can start off bad and become worse. What makes an algorithm good is whether it learns ‘fairly’, removing any human bias from its operation. If it doesn’t recognise this bias and have the ability to counter it, any new data reinforces that bias and the effect is amplified. Indeed, removing human bias from systems and allowing an AI program to be not just artificially intelligent, but actually intelligent of its own accord, is the next step for AI.
Bias can be introduced unintentionally when the AI makes an inference from a data set without any context. This sort of scenario occurs because the vast majority of AI’s applications today use deep learning to find patterns in data.
The result is that bias is introduced on the basis of past experience. And although deep-learning models are tested for performance before they are deployed, they are not tested for downstream effects such as unintentional bias.
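The kind of check that catches such a downstream effect is straightforward to sketch. The snippet below is a minimal, hypothetical illustration – synthetic data, invented column names and scikit-learn’s logistic regression standing in for a real recruitment model – of auditing a trained model’s selection rate across groups, something standard accuracy testing would not surface:

```python
# A minimal sketch of a post-training fairness check: compare the model's
# positive-prediction ("shortlist") rate across groups. The data, column
# names and model are illustrative, not any real recruiter's system.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1_000
# Synthetic applicants; 'years_experience' is correlated with 'gender'
# purely to mimic a biased historical data set.
gender = rng.integers(0, 2, n)                       # 0 = female, 1 = male
years_experience = rng.normal(5 + 2 * gender, 2, n)
hired_historically = (years_experience + rng.normal(0, 1, n) > 6).astype(int)

X = pd.DataFrame({"years_experience": years_experience})
model = LogisticRegression().fit(X, hired_historically)
shortlisted = model.predict(X)

# Demographic-parity style audit: selection rate per group.
for g, label in [(0, "female"), (1, "male")]:
    rate = shortlisted[gender == g].mean()
    print(f"shortlist rate ({label}): {rate:.2f}")
# A large gap between the two rates is the 'downstream effect' that a plain
# accuracy score on historical labels would never reveal.
```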
Numerous high-profile cases exist, including the aforementioned one at Amazon. The algorithm discriminated against women simply because it looked at historical recruitment patterns, where more men than women had been employed. That trend was continued so that the AI inferred that men were a better fit for the job, favouring male candidates.
A group called Fairness, Accountability, and Transparency in Machine Learning looks at this. It brings together researchers and other interested parties on an annual basis and one of its purposes is to examine how the complexity of machine learning may reduce the justification for consequential decisions to “the algorithm made me do it”.
The key is to know what question is being put to the AI. Being able to frame that question and decide what the AI needs to achieve is the starting point. For example, a company that wants to boost its profit margins might introduce bias simply because the system is optimised to increase profit rather than to be fair. That can have a downstream impact of unintentional bias, so the issue is taking it into consideration from the outset – defining what the AI needs to do and what the consequences of that might be.
James Luke, distinguished engineer at IBM, comments: “There are many types of algorithms and the issue is sometimes working out what will work best where. Often different components work together. For example, take a driverless car: there you need to have audio, visual, verbal, planning, GPS etc. There is a variety of applications, from mere sorting and image classification to those that are more decision-based such as whether to speed up or slow down.”
Igor Carron, CEO and co-founder at LightOn, a company dedicated to making artificial intelligence computations faster and more power-efficient, adds: “The holy grail of machine learning and deep learning is trying to figure out how, with a mix of good data and good algorithm, one can get the very best result. On the algorithm side, it is a subject of intense interest from researchers all around the world.”
Carron cites a paper, ‘Understanding deep learning requires rethinking generalisation’, which has already been cited 800 times in AI academic literature and is only two years old.
The paper says that despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance, so getting the basics right works.
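That gap between training and test performance can be measured directly for any trained model. A minimal sketch, using scikit-learn’s small digits data set and a small network rather than the large models the paper studies:

```python
# Minimal sketch: measure the generalisation gap (train accuracy minus test
# accuracy) for a small neural network. Data set and settings are
# illustrative only, not those used in the paper Carron cites.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                    random_state=0).fit(X_train, y_train)

train_acc = net.score(X_train, y_train)
test_acc = net.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"generalisation gap: {train_acc - test_acc:.3f}")
```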
Having selected the right algorithm, it then needs to be transparent and its actions explained; if a system can say how it arrived at a decision then it can learn from its mistakes and the programming can be changed to flag up potential hot spots.
Carron says: “Part of these generalisation and fairness issues have also triggered other fields of research such as those around explainability of algorithm: Can we provide a reasoning as to why a certain result was obtained by a specific algorithm?”
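For simple models, that reasoning can be surfaced directly. The sketch below is a hypothetical illustration – invented feature names and a plain logistic regression, not any production explainability tool – of reporting how much each input pushed a single decision one way or the other:

```python
# Sketch of a simple explanation: for a linear model, the contribution of
# each feature to one decision is just coefficient * feature value.
# Feature names and data are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["income", "debt_ratio", "years_at_address"]
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) + rng.normal(0, 0.5, 500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

applicant = X[0]
decision = model.predict([applicant])[0]
contributions = model.coef_[0] * applicant

print(f"decision: {'approve' if decision else 'decline'}")
for name, c in sorted(zip(feature_names, contributions),
                      key=lambda t: -abs(t[1])):
    print(f"  {name}: {c:+.2f}")
# Each line answers 'why': which inputs pushed the score up or down,
# and by how much.
```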
But if the algorithm is the right fit for the problem, is transparent, can explain itself and is programmed to know what it doesn’t know, then this moves things on. Indeed, in this sense the issue might not actually be about determining the limits of the free thought of deep neural networks (DNNs), but rather about being transparent, being able to explain them and recognising that there are known unknowns.
IBM’s Luke says: “DNNs are all about good system engineering and design. It’s about determining what the end outcome will be – will people’s lives be affected? As in the case of a driverless vehicle? Or is it for something more creative where difference can be encouraged and tolerated. The need is definitely to move things on so that the computer can identify what it doesn’t know and raise the flag itself.”
Cases
A look at the evolution of AI helps here. More or less everyone now has email spam filters, and many organisations, such as banks and retailers, regularly use natural language processing (NLP)-enabled chatbots to enhance customer service as part of an overall offering.
New data from Juniper Research found the global number of successful retail chatbot interactions will reach 22 billion by 2023, up from an estimated 2.6 billion in 2019.
The past two years have seen a massive jump in the art of the possible, and the result is that we are starting to see more everyday use cases for AI.
The US Defense Advanced Research Projects Agency (Darpa) funded a programme to facilitate better understanding of these issues. One of its outcomes has been to say specifically that unless the AI can explain itself and the reasoning behind its conclusions, its use cases will remain limited. This is because, unless the user can have confidence in past decisions, there can be no confidence in future decisions that have partially been made by learning from the past.
The trust element is so important given that the use cases for AI are becoming more and more impactful. For example, if the end user in the HSBC use case (see box) cannot have confidence when looking for items of interest in massive multimedia data sets, then the AI becomes pointless. If the user cannot trust the autonomous car to make the right decisions, or to stop when it cannot decide, then it too becomes somewhat pointless.
Advancing things further, if the AI can be set up to know when it does not know, then correctability becomes the next thing to tackle. The AI can learn from its mistakes and import new knowledge into its overall context.
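Both ideas – raising the flag and learning from the correction – can be sketched in a few lines. The snippet below is illustrative only: an assumed confidence threshold, synthetic data and scikit-learn’s SGDClassifier standing in for a production system:

```python
# Sketch: abstain when the model is unsure, ask a human, and learn from the
# correction. Threshold and data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = SGDClassifier(loss="log_loss", random_state=0)
model.partial_fit(X[:800], y[:800], classes=[0, 1])

CONFIDENCE_THRESHOLD = 0.8  # below this, escalate to a human

deferred = 0
for x, true_label in zip(X[800:], y[800:]):
    proba = model.predict_proba([x])[0]
    if proba.max() < CONFIDENCE_THRESHOLD:
        # 'Raise the flag': the true label stands in for a human operator's answer.
        deferred += 1
        model.partial_fit([x], [true_label])   # correctability: learn from the correction
print(f"deferred to a human on {deferred} of 200 cases")
```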
Alex Kwiatkowski, principal industry consultant, global banking practice, at data analytics business SAS, says: “It is about evolution at the appropriate speed. Banks might use complicated algorithms to identify fraud, but there need to be flags to add human oversight. Cars might be driverless, but they also need to know when they have reached the limits of their contextual awareness. In a way, this is the exciting transformational thing about AI... we’re not just setting fixed limits for its capabilities, we’re setting limits so that it can learn from itself and continually evolve. There is no hard stop to these limits.”
Importance of data
Hence the need to support the AI not just with the right algorithm, fit for purpose, but also with the right data, to give it the best possible chance of being effective.
Luke explains: “Choosing the right algorithm and the learning subsystem is crucial, but the other thing is to ensure that the data set it is working with is fit for purpose and representative of what the system will encounter. For example, a weather predictor set up to work in the UK would not work with the same data in Australia, as the temperature ranges are so different. The key thing to know here is that the algorithm must be working off the same data as in the area where it was initially trained.”
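A minimal sketch of that kind of guard, using invented temperature figures, is to compare incoming data against the range the model was trained on before trusting its predictions:

```python
# Sketch: guard against running a model on data unlike its training data.
# The temperature figures below are invented for illustration.
import numpy as np

uk_training_temps = np.random.default_rng(3).normal(11, 6, 5000)   # deg C
train_low, train_high = np.percentile(uk_training_temps, [1, 99])

def fraction_out_of_range(new_temps, tolerance=2.0):
    """Return the fraction of inputs outside the range the model was trained on."""
    new_temps = np.asarray(new_temps)
    outside = (new_temps < train_low - tolerance) | (new_temps > train_high + tolerance)
    return outside.mean()

australian_summer = [31.0, 35.5, 38.2, 40.1, 29.8]
if fraction_out_of_range(australian_summer) > 0.1:
    print("Warning: inputs fall outside the training distribution; "
          "predictions should not be trusted without retraining.")
```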
Ideally, the application would be a process that is already being carried out, where the past can be used to inform future decisions. The computational speed provides scale and the AI provides greater accuracy.
A good example of this is insurance underwriting, where introducing more data sets to the underwriting process gives a better understanding of the nature of claims and the supporting evidence around them. This would enable the same team of humans to be more productive. A new example is HSBC using AI to improve its fraud detection. Here, the system uses data to recognise behaviour that is not normal and flag it (see box).
Kwiatkowski says: “The SAS Data for Good initiative does a lot of work around this, be it medical or looking at the natural environment. It’s about making sure that the data is the very best it can be to be used for the greater good. This will be transformational in terms of identifying triggers for disease, cures for previously incurable diseases, environmental issues, preventing infant mortality and the like, but only if the system has the right data to work with.”
Data preparation also comes into this equation. This is where the framework is set for which attributes the algorithm should consider. This is often called the ‘art’ of deep learning because the decisions made at this stage have a significant impact on the end result and the accuracy with which the AI can predict outcomes. Again, the unintentional impacts of biases that might be introduced as a result of the chosen frameworks are less easy to see and need to be monitored.
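In practice, that framework is often no more than an explicit list of the attributes the model is allowed to see. The sketch below uses invented column names to show the decision being made deliberately rather than by default:

```python
# Sketch: data preparation decides which attributes the model may learn from.
# Column names are invented; the point is that the choice itself is a
# modelling decision that can introduce or remove bias.
import pandas as pd

raw = pd.DataFrame({
    "salary_requested": [40_000, 55_000, 62_000],
    "years_experience": [3, 8, 10],
    "postcode":         ["M1", "SW1", "LS2"],   # possible proxy for protected traits
    "gender":           ["F", "M", "F"],        # protected attribute
    "hired":            [0, 1, 1],
})

# Explicitly name what the model is allowed to consider, rather than passing
# everything through by default.
ALLOWED_FEATURES = ["salary_requested", "years_experience"]
EXCLUDED = [c for c in raw.columns if c not in ALLOWED_FEATURES + ["hired"]]

X = raw[ALLOWED_FEATURES]
y = raw["hired"]
print("excluded attributes (monitor for proxies):", EXCLUDED)
```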
Indeed, to get the very best from AI, it seems that the right data, and engineering to give it parameters and boundaries from which to learn, are central to its continued development. In this sense, the debate is moving on from the art of the possible to the desirability of what is possible, coinciding with the realisation that these limits need to be set in order for the AI to be trustworthy and useful.
There is no doubt that AI is a massive force for good, both for society as a whole and for business. But, in delivery, it is imperative to embrace the right data and engineering to support the AI in its performance.
Kwiatkowski sums this up well: “We need to remember that we are still at a fairly early stage in that we have moved on from the art of the possible and now we are looking more at where the useful use cases are and how we should manage them to get the best but also impose limits on them. We need to be ready, aim, fire – not ready, fire, aim.”
HSBC
HSBC is another good example of harnessing the right data to provide meaningful and accurate results when it comes to fraud prevention. It recently won Celent’s Model Bank 2019 Risk Management Award for its initiative.
It has created a Hadoop data lake, which sits underneath a machine-learning platform, with the goal of providing a new, centralised approach to identifying potentially fraudulent activity.
Craig Beattie, analyst at Celent, says: “Data lakes create a ‘good enough’ environment – this is in contrast to where you might have a single golden copy of data in a warehouse that has pretty much reached perfection, but has taken time to get there. The data lake, meanwhile, is pretty much real-time, and this is important for fraud as any latency means that fraud is only spotted after the event.”
The platform provides investigators with the ability to access and analyse internal and external data from various sources across jurisdictions, presenting a contextual and comprehensive view of customer activities and relationships. This approach represents a significant step forward in the fight against financial crime.
Beattie says: “There are AI models that allow the AI to tell you when they find something and then there are those that work around how the fraud is coming into the system. Both are about identifying patterns to stop the crime from happening.”
Key to the platform working, however, is having the ‘good enough’ data from multiple sources and also that the platform can evolve and learn. “Evolution is very important in that fraud rings constantly adapt their behaviour so the AI needs to know what a ‘normal’ pattern of behaviour around a certain transaction looks like. This is exception-based machine learning,” Beattie adds.
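HSBC’s platform is proprietary, but the exception-based idea Beattie describes – learn what ‘normal’ looks like and flag departures from it – can be sketched with an off-the-shelf anomaly detector and invented transaction data:

```python
# Sketch of exception-based detection: fit a model of 'normal' transaction
# behaviour and flag departures from it. Not HSBC's system; data invented.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(4)
# Normal behaviour: modest amounts, familiar hours (features: amount, hour).
normal = np.column_stack([
    rng.lognormal(mean=3.5, sigma=0.6, size=5000),   # typical amounts
    rng.normal(14, 4, 5000) % 24,                    # daytime-heavy hours
])

detector = IsolationForest(contamination=0.01, random_state=0).fit(normal)

new_transactions = np.array([
    [45.0, 13.0],      # lunchtime purchase
    [9500.0, 3.0],     # large amount at 3am
])
flags = detector.predict(new_transactions)   # -1 = exception, 1 = normal
for txn, flag in zip(new_transactions, flags):
    if flag == -1:
        print(f"Flag for investigation: amount={txn[0]:.2f}, hour={txn[1]:.0f}")
```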
Driverless cars
A good example of needing the right data and the right algorithm lies in the emergent driverless cars sector. Here, the onus is on the system knowing when it does not know something and asking for help, and then learning from that.
The problem within driverless cars arises when an autonomous system has ‘learned’ from training examples that don’t match what is actually happening in the real world.
In one instance, the AI identified what was in fact an ambulance as an ordinary white van, when the car should have pulled over. This was an event that the car had not previously been exposed to. Applying a different type of algorithm, the Dawid-Skene algorithm, changes things.
A paper from the Massachusetts Institute of Technology describes how researchers applied this algorithm to aggregate data and probability calculations more accurately, identifying both blind spots and safe situations and assigning each situation a confidence level.
Most importantly, the algorithm can learn. So even in a situation where it is 90 per cent comfortable with the presence of certain markers, it might flag to the system that there is enough ambiguity to require human help. This is a better way of doing things than simply having a system that has come across a very similar situation before and behaved correctly nine times out of the previous ten.
This is also a better practice than simply tallying the acceptable and unacceptable actions for each situation. If the system performed correct actions nine times out of ten in the ambulance situation, for instance, a simple majority vote would label that situation as safe.
Co-author of the research paper Ramya Ramakrishnan explains how the model helps autonomous systems better know what they don’t know: “Because unacceptable actions are far rarer than acceptable actions, the system will eventually learn to predict all situations as safe, which can be extremely dangerous.”
This is dealt with by producing a heat map, where each situation that the system has been trained with is assigned a low to high probability of being a blind spot. This identifies risk. The system can then use the risk rating in real-life situations to behave more cautiously by using the risk context with other factors in a given situation.
For example, if the learned model predicts a situation to be a blind spot, and that situation has also been assigned a high probability of being one, the system can query a human for the acceptable action, allowing for safer execution.
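At a high level, that mechanism can be sketched as a look-up of the learned blind-spot probability followed by a decision to defer. The situations, probabilities and threshold below are invented, and this is a simplification of the paper’s model rather than a reimplementation of it:

```python
# High-level sketch of using a blind-spot 'heat map': each situation type
# carries a learned probability of being a blind spot; above a threshold,
# the system asks a human instead of acting. Values are invented and this
# is a simplification of the approach described in the MIT paper.
BLIND_SPOT_PROBABILITY = {           # the learned heat map
    "white_van_ahead": 0.90,         # e.g. could be an ambulance
    "pedestrian_at_crossing": 0.05,
    "clear_motorway": 0.01,
}
QUERY_THRESHOLD = 0.5

def choose_action(situation, planned_action, ask_human):
    """Defer to a human when the situation is likely to be a blind spot."""
    risk = BLIND_SPOT_PROBABILITY.get(situation, 1.0)  # unknown = assume risky
    if risk >= QUERY_THRESHOLD:
        return ask_human(situation)          # safer execution via human input
    return planned_action                    # act autonomously

# Example: the system defers on the ambiguous van, not on the clear motorway.
print(choose_action("white_van_ahead", "continue", lambda s: "pull_over"))
print(choose_action("clear_motorway", "continue", lambda s: "pull_over"))
```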
In this instance the right algorithm has been applied, allowing the AI to know what it doesn’t know and therefore ask for help and, again, learn continuously from that.