Artificial Intelligence (AI) has revolutionized the way we live and work, and the pace of that revolution seems to quicken every day. But as AI systems take on more critical applications, the risk of bias grows with them. Bias in AI can lead to discrimination and unequal outcomes for certain groups of people, and addressing it is essential to the ethical use of AI.
Bias occurs when an AI system absorbs biased information or assumptions, typically through its training data, and produces biased results as a consequence. For example, facial recognition algorithms have been shown to be biased against people with darker skin tones, and language models like OpenAI’s GPT-3 have been shown to perpetuate gender and racial biases.
But AI bias has been reported across a wide range of applications. Predictive policing algorithms intended to inform law enforcement of potential criminal activity have been shown to perpetuate racial biases and reinforce existing inequalities. Talent recruitment algorithms that assess job candidates based on historical data have been criticized for reinforcing gender and racial biases and for penalizing candidates based on factors unrelated to their qualifications. Medical diagnosis algorithms have produced decisions skewed by gender, ethnicity, socioeconomic status, monetary kickbacks for prescribing drugs, and the difficulty of treating a disease, leading to unequal treatment and outcomes for patients. Educational technology that uses AI for student assessment or to predict academic success can perpetuate biases rooted in teacher expectations, student background, and other factors. AI algorithms used by banks and lenders have been shown to discriminate against certain demographic groups, such as individuals from low-income or minority backgrounds, leading to unequal access to loans and credit. And algorithms used in online advertising can discriminate against certain groups, such as women and people of color, by showing them fewer job ads or by pricing them out of certain markets.
The consequences of bias in AI can be significant and far-reaching. For example, biased facial recognition algorithms can lead to false arrests and wrongful convictions, on top of broader discrimination against certain groups of people. It is altogether too easy to envision a near future in which police shoot an innocent civilian because a biased AI algorithm flagged the victim as a dangerous threat.
But we needn’t look to the future to see the problem of AI bias. It’s happening now. The Justice Department has begun scrutinizing an AI program called the Allegheny Family Screening Tool, which is used by a Pittsburgh-area child protective services agency. According to the Associated Press, the tool has already led to discrimination against families with disabilities.
AI bias, in short, is a present-day problem, and it’s up to the organizations that build and deploy AI to make it stop.
To avoid AI bias, business and government organizations need AI systems that are understandable, explainable, editable, and auditable. What they have now are anything but: neural networks are black boxes, inscrutable even to their own developers. A neural network can never be reliably confirmed to be unbiased; it can merely be shown not to have been biased in the past, according to some particular, predefined measure. Neural networks, at their core, face an unsolved and unsolvable problem of induction: “all swans are white” is a valid induction until you encounter a black swan. All neural networks seem well-aligned until, suddenly, they aren’t.
Diveplane Reactor™, however, can help organizations move toward bias-free algorithms by making the bias in the data itself understandable and addressable. Unlike neural networks, which are model-based, Diveplane Reactor™ is an instance-based Understandable AI®.
Model-based learning and instance-based learning are two different approaches to training machine learning algorithms. In model-based learning, the algorithm builds a model from a given set of training data and then uses that model, rather than the data itself, to make predictions on new, unseen data. The model attempts to mimic the correlations in the data and can then be used for prediction and decision making. Neural networks are the most popular form of model-based learning today.
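As a concrete illustration of the paradigm, consider a small scikit-learn neural network; this is a generic, minimal sketch of model-based learning, not Diveplane’s technology:

```python
# A minimal, generic sketch of model-based learning (illustration only,
# not Diveplane's technology): the training data is distilled into a
# fixed set of fitted parameters, and only those parameters are
# consulted at prediction time.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
model.fit(X, y)  # the data is compressed into opaque weights

print(model.predict(X[:5]))  # predictions come from the weights alone;
                             # the original instances are no longer used
```

Once trained, the weights carry all the decision-making, which is precisely why the reasoning behind any single prediction is so hard to recover.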
Instance-based learning, on the other hand, loads the training examples into a specialized database and interpolates or extrapolates from them to make predictions on new, unseen data. The algorithm does not build a separate model; instead, it uses a similarity measure between the new data and the stored training examples to determine which examples are most relevant, and bases its prediction on their outputs. Instance-based learning fell out of favor when neural networks soared in popularity, but as the unsolvable flaws of neural networks have become more apparent, its merits, amplified by recent advances in the approach, have come back into view.
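The classic textbook example of instance-based learning is k-nearest neighbors. A minimal sketch, again as a generic illustration rather than Diveplane’s similarity measure:

```python
# A minimal sketch of instance-based learning using k-nearest neighbors
# (the textbook example of the paradigm, not Diveplane's measure): the
# training instances themselves are stored, and every prediction can be
# traced back to concrete stored examples.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)  # "training" simply indexes the instances

distances, neighbor_ids = knn.kneighbors(X[:1])
print(knn.predict(X[:1]))  # the predicted label...
print(neighbor_ids)        # ...and the exact training rows that drove it
```

Note the contrast with the previous sketch: here the prediction points directly back at specific rows of training data, which is what makes the approach inherently auditable.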
Diveplane Reactor™ is an instance-based platform that, instead of using an arbitrary similarity or distance measure, uses mathematical surprisal, the negative log of a probability, as its “distance.” Everything is grounded in the probability that two values are the same, and interpolations are simply expected values. In other words, the less surprising it would be for a known data element to match a new data element, the more similar the AI rates the two instances. This measure helps ensure the system makes accurate predictions, and it has been empirically validated as highly accurate compared to other popular ML algorithms across hundreds of open-source data sets. The same approach can analyze relationships in any direction and across any subset of features, making it an effective tool for finding the proxy relationships within the data that cause bias, and for quickly and easily identifying and editing the training data and features that drive biased decisions.
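The details of Reactor’s measure are Diveplane’s own, but the underlying idea can be sketched. Surprisal is defined as −log p, so assuming, purely for illustration, a simple Gaussian model of per-feature deviations, a surprisal “distance” looks like this:

```python
# A minimal sketch of surprisal as a distance (illustration only;
# Diveplane's actual formulation is more sophisticated). We assume each
# feature's deviations follow a known Gaussian, so the surprisal of
# "these two values are really the same" is -log p(observed gap).
import numpy as np
from scipy.stats import norm

def surprisal_distance(a, b, sigma):
    """Surprisal (in nats) of the gap between a and b, if both were
    noisy measurements of the same underlying instance."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    p = norm.pdf(diff, loc=0.0, scale=sigma)  # per-feature densities
    return float(np.sum(-np.log(p)))          # low surprisal => similar

sigma = np.array([1.0, 0.5])      # assumed per-feature noise scales
x_known = np.array([2.0, 3.0])
x_new = np.array([2.1, 3.1])
print(surprisal_distance(x_known, x_new, sigma))  # small gap, low surprisal
```

Because the “distance” is a probability statement rather than an arbitrary geometric measure, similar instances are literally the ones most likely to be the same, and expected values fall out naturally as interpolations.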
Because it is instance-based rather than model-based, Diveplane’s AI never hides its decision-making behind an inscrutable model. For Diveplane Reactor™, the data is the model, and that data is fully transparent to data scientists and users alike. Since Diveplane’s technology allows users to understand the reasoning behind AI decisions, it offers a clear way to check for potential biases and data drift, not just in decisions that have already been made, but in decisions that have yet to be made.
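To make the auditing idea concrete, here is a generic sketch, continuing the k-NN stand-in above rather than Reactor’s actual API, of checking whether the instances driving a decision skew toward a hypothetical protected group, and of “editing the model” by editing the data:

```python
# Generic sketch of an instance-level bias audit (a k-NN stand-in, not
# Diveplane Reactor's API). Because each prediction is traceable to
# specific training rows, we can inspect those rows for a suspicious
# concentration of a hypothetical protected attribute; editing the
# "model" is then just editing the data.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = rng.integers(0, 2, size=500)
protected = rng.integers(0, 2, size=500)  # hypothetical protected attribute

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
_, neighbor_ids = knn.kneighbors(X[:1])
drivers = neighbor_ids[0]  # training rows behind this one decision

rate = protected[drivers].mean()
print(f"protected-group share among driving instances: {rate:.0%}")

# If a driving instance looks biased, remove or correct it and refit:
keep = np.setdiff1d(np.arange(len(X)), drivers[:1])
knn_fixed = KNeighborsClassifier(n_neighbors=5).fit(X[keep], y[keep])
```

In a model-based system, a comparable fix would mean retraining and hoping the bias was gone; in an instance-based system, the offending data can be inspected, corrected, or removed directly.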