In recent years, artificial intelligence (AI) has been used to solve complex analytical problems and has been introduced into a growing number of fields in ways never before thought possible. Machine learning (ML) and deep learning algorithms are now deployed in specific cases to replace humans or to mimic human functions. These algorithms are designed to process innumerable input parameters, assess the various possibilities over multiple iterations, and identify the best solution.
All of these steps occur within fractions of a second. Consequently, these scenarios call for highly accurate and precise models that leave almost no room for error. How confident and comfortable are we when machines take over the responsibilities that we thrust upon them?
As organizations, we tend to focus mostly on the accuracy and results of the model, giving little or no importance to the inherent bias of the data scientist who develops the algorithm. What is the significance and impact of these biases?
With complex models and a myriad of critical factors involved, a tiny assumption made about the data might spiral out and have a cascading effect on the final outcome. The inference may seem harmless at first sight because it appears to be backed by evidence, even though it may stem from a bias formed in the programmer’s mind in early childhood.
What is Bias?
Biases affect nearly every part of our daily lives. They are the prejudices and inclinations we hold in favour of or against certain things, people, and groups relative to others. There are two types of bias, each with its own downside. Conscious bias happens when we believe our acquired intelligence to be factual data instead of treating it as preconceived notions formed by our socio-cultural environments. Unconscious bias can pose a more serious threat because it mostly goes unobserved by the person who holds it, and so is less likely to be detected later.
How Vulnerable Are Algorithms to Error?
Machines by themselves do not possess bias. Artificial intelligence learns from logic built into algorithms, which serve as fundamental stepping stones and grow in sophistication as time progresses. These algorithms are logical decisions made by developers and are incidentally governed by the developers’ habits and patterns. These patterns are ingrained in our behaviour and have become an integral part of us. In data science, human bias exists in understanding the data, creating algorithms, and interpreting results. In projects where developers build programs over a period of time, it helps to go back to designs that have already been fixed and recheck them for any unintentional biases left behind, instead of taking them for granted on the strength of high-performance results.
Once these biases are unintentionally incorporated while developing the AI, they become part of its structure and are extended to the ML models built from the system. With AI systems being used in multiple applications and in a multitude of ways, the developer becomes responsible for any mishaps and their implications.
How Do We Solve for This?
By becoming conscious of our cognitive biases and of methods to counter them, we will be better placed to develop robust AI systems. Also, by exposing ourselves to different situations and interacting with people from diverse backgrounds, we add layers to our acquired knowledge. This helps us gain varied perspectives and avoid common data fallacies.
At Fiind Inc., we spend considerable time and effort to ensure we eliminate bias from the model by:
1. Understanding and Identifying Diversity Gaps in Training Data
We spend a significant part of the modelling timeline on data exploration to look out for potential bias and skewness in the data. With multiple sets of eyes on the data, we try changing the way the data is presented, since different methods of visualization are critical to spotting these anomalies. For example, while analyzing sales data from an e-commerce platform, we found a natural bias in the buying pattern based on demographics. We handled this by assigning weights to regions based on their sales volume.
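The region weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production code: the records and region names are hypothetical, "sales volume" is taken to mean the number of sales records per region, and the weights are assumed to be inversely proportional to that volume so that every region carries the same total weight.

```python
from collections import Counter


def region_weights(regions):
    """Return one weight per region so that each region has equal total weight.

    Assumption: weight = total_records / (number_of_regions * records_in_region),
    i.e. regions with more sales records receive a smaller per-record weight.
    """
    counts = Counter(regions)
    n_regions = len(counts)
    total = len(regions)
    return {region: total / (n_regions * count) for region, count in counts.items()}


# Hypothetical sales records: (region, order_value)
sales = [("north", 120.0), ("north", 80.0), ("north", 100.0), ("south", 90.0)]

weights = region_weights([region for region, _ in sales])

# One weight per record, ready to pass to a model that accepts sample weights.
sample_weights = [weights[region] for region, _ in sales]
```

With these weights the three "north" records together count as much as the single "south" record, so a model trained with them is less likely to simply learn the buying pattern of the largest region.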
2. Selecting Models and Understanding How the Algorithm Works
Our models are tailor-made for the available data, as no single model works well every time. At Fiind, we make sure to understand how each algorithm works and how it arrives at the end result. This, in turn, helps us find bias in the algorithms. When updating a model with enhancements, it is best not to ignore the variables previously discarded from the existing model; instead, restart the iteration from the entire list of variables so that bias is not carried forward.
3. Monitoring Performance Using Real Data
Although many models show promising signs and work as expected in controlled test environments, they are more likely to spin out of control in real-world scenarios. While analyzing conversion rates for a product, we found a statistical bias towards retail industries compared to other industries. We handled this by re-sampling the data so that the model would not amplify that signal. It is advisable to scale up using real-time data to avoid unforeseen surprises.
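One simple way to re-sample data like this is to downsample the over-represented group. The sketch below is a hypothetical illustration, not the exact procedure we used: the lead records, the `industry` field, and the choice to shrink every industry to the size of the smallest one are all assumptions made for the example.

```python
import random
from collections import defaultdict


def downsample_by_group(rows, key, seed=0):
    """Randomly downsample every group to the size of the smallest group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    # Assumption: the smallest group sets the target size for all groups.
    target = min(len(members) for members in groups.values())

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, target))
    return balanced


# Hypothetical leads, heavily skewed towards retail.
leads = (
    [{"industry": "retail", "converted": i % 2 == 0} for i in range(8)]
    + [{"industry": "finance", "converted": i % 3 == 0} for i in range(3)]
    + [{"industry": "healthcare", "converted": False} for _ in range(3)]
)

balanced = downsample_by_group(leads, key="industry")
```

Downsampling discards data, so with small datasets it may be preferable to oversample the smaller groups or to use per-record weights instead; which option fits best depends on the data at hand.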
By being aware of our personal preferences and keeping the customer’s goodwill in mind, we will be able to eliminate the bias that gets baked into AI systems and ML models. This awareness can help produce algorithms that stay true to their original intent and purpose.