Preventing bias in recruitment algorithms | Avoiding bias with AI

Can AI get it wrong? When approached correctly, AI can provide a fair, consistent and inclusive approach to recruiting. Here's how to avoid bias with recruitment AI.

Recently, we heard that Amazon scrapped its AI tool that was reportedly demonstrating bias against female job candidates. How? It was taught.

Amazon’s model was trained on data submitted by applicants over a 10-year period, most of them men. Because it learned only from historical applicant data, which reflects the gender balance in the tech sector, the algorithm was in effect taught to favour male candidates.

Algorithms are everywhere

When done well, algorithms provide a fair, objective, consistent, data-driven and inclusive approach to recruiting. They help to get the right people in the right role quickly and at volume.

“Algorithms are everywhere. Sorting and separating the winners from the losers”, says Cathy O'Neil in a provocative TED Talk in which she talks of ‘secret and destructive data laundering’ by data scientists. “What if the algorithms are wrong?” she asks.

I am a data scientist. My goals are to champion candidates, to fill our client organisations with the best people and to offer a level playing field for doing so. But as we’ve seen in the Amazon case, algorithms can be flawed. That’s why it’s so important to understand what’s going into your model.

Here’s what we do at LaunchPad to prevent bias in our algorithms:

1. Build a model using data from representative samples

As the Amazon example shows, a representative sample isn’t simply ALL past applicants. The model needs to be built using data from candidates from all demographics in fair and representative proportions. For example, if 80% of your training data comes from male candidates and 20% from female candidates, there is a high probability that you’ll create a biased model that unintentionally favours male candidates. A better way to build the model would be to get these ratios closer to 50:50. If you’re unable to do this, the imbalance needs to be taken into account when testing the model, as described in the next point.
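
As a rough illustration of the 80:20 example, the sketch below measures each group's share of a training set and computes per-record weights that make every group count equally. Reweighting is one common way to approximate a 50:50 balance when more data cannot be collected; it is an assumption here, not a description of LaunchPad's method, and the function names `group_shares` and `balancing_weights` are hypothetical.

```python
from collections import Counter


def group_shares(groups):
    """Return the fraction of records that belong to each group."""
    counts = Counter(groups)
    total = len(groups)
    return {group: count / total for group, count in counts.items()}


def balancing_weights(groups):
    """Return one weight per record so that every group carries equal total weight.

    Each record in group g gets total / (number_of_groups * size_of_g).
    """
    counts = Counter(groups)
    total = len(groups)
    n_groups = len(counts)
    return [total / (n_groups * counts[group]) for group in groups]


# Toy data mirroring the 80% / 20% split described above (not real applicants).
sample = ["male"] * 80 + ["female"] * 20

shares = group_shares(sample)
weights = balancing_weights(sample)

# Total weight carried by each group after reweighting.
weighted_totals = Counter()
for group, weight in zip(sample, weights):
    weighted_totals[group] += weight

print(shares)
print(dict(weighted_totals))
```

With these weights the 80 male records and the 20 female records each contribute the same total weight, so neither group dominates training simply because it is larger.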

2. Test the model – and post-check the model

When we build our algorithms, we never use any demographic data, so that right-fit candidates can be flagged without any demographic characteristics attached. We also post-check the results generated by the model to ensure that candidates from protected characteristic groups are not being discriminated against. For example, if very few applicants with a disability apply for roles, we ensure that they’re not discriminated against when they do apply. The fact that there have not been many candidates with a disability in the past would not affect the model.
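
One way such a post-check could look is sketched below: compare the rate at which the model flags candidates in each group and compute each group's rate relative to the highest one. The 0.8 threshold follows the widely used "four-fifths" rule of thumb and is an assumption, as are the function names and the toy records; the article does not say which test LaunchPad applies.

```python
from collections import defaultdict


def selection_rates(records):
    """records: iterable of (group, was_selected) pairs. Returns rate per group."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def impact_ratios(rates):
    """Divide each group's selection rate by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        # Nobody was selected in any group, so there is no disparity to report.
        return {group: 1.0 for group in rates}
    return {group: rate / best for group, rate in rates.items()}


def flag_groups(ratios, threshold=0.8):
    """Return the groups whose impact ratio falls below the threshold."""
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Toy model output (not real data): 50 candidates without a disclosed
# disability, 20 flagged as right-fit; 10 candidates with one, 2 flagged.
records = (
    [("no_disability", True)] * 20 + [("no_disability", False)] * 30
    + [("disability", True)] * 2 + [("disability", False)] * 8
)

rates = selection_rates(records)
ratios = impact_ratios(rates)
flagged = flag_groups(ratios)
print(rates, ratios, flagged)
```

For small groups a handful of decisions can swing the ratio, so a flag from a check like this is better read as a prompt for closer review than as proof of discrimination.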

3. Champion the candidate

Our job is to champion the candidate. It’s easier for us to do this when we can collect demographic data about candidates to use in testing and make sure the model is working well. Because of fears about bias, people are often reluctant to share sensitive information. An important part of our process is communicating to candidates about why we ask demographic questions and how the data will be used. We need representation from a wide range of candidates to help the model learn.

4. Analyse only the factors that matter

During an interview, a human may not be able to discount factors that don’t matter. Bias and unconscious bias can creep in and lead to inconsistent decision-making. To minimise this, we conduct a job analysis and design questions and review criteria that reflect it, so that only the relevant aspects of the job are assessed. This creates a level playing field as everyone is assessed on only the behaviours and characteristics that matter most to a specific job.

5. Be aware of proxy data

Even when companies remove sensitive data like ethnicity, biased decisions can still be made based on proxy data. For example, postcode can be a proxy for income or ethnicity. Our data scientists are well aware of data that may unintentionally approximate protected demographic characteristics, and we make sure that we do not include this type of data in our models.
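
A simple screen for proxy data is to ask how much better a protected characteristic can be guessed once a candidate feature is known. The sketch below compares the accuracy of always guessing the most common group with the accuracy of guessing the most common group within each feature value. This is a minimal, assumed approach on made-up data; `proxy_strength` is a hypothetical helper, and the in-sample figure it returns will overstate the effect for features with many rare values.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, protected_values):
    """Return (baseline_accuracy, accuracy_given_feature).

    baseline: accuracy of always guessing the most common protected group.
    given feature: accuracy of guessing the most common protected group
    among the records that share each feature value.
    """
    n = len(feature_values)
    baseline = Counter(protected_values).most_common(1)[0][1] / n

    by_value = defaultdict(Counter)
    for feature, protected in zip(feature_values, protected_values):
        by_value[feature][protected] += 1

    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / n


# Toy data (not real): two postcode areas and an anonymised protected group.
postcodes = ["AB1"] * 4 + ["CD2"] * 4
groups = ["x", "x", "x", "y", "y", "y", "y", "x"]

baseline, with_postcode = proxy_strength(postcodes, groups)
print(baseline, with_postcode)
```

A large gap between the two numbers suggests the feature carries information about the protected characteristic and is a candidate for exclusion from the model.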

Algorithms are not perfect but neither are people

Amazon may have scrapped one algorithm but with 563,100 employees and still recruiting, it’s going to need another one. Amazon brands itself as a market-leading tech operation, so living its brand by using tech in recruitment makes sense. Plus, from a practical point of view, Amazon will need to use tech to manage the volume of applications.

As much as we look to blame the algorithms for poor decision-making, biases begin long before the job application stage. Who is attracted to apply? Who is qualified to apply? Some biases are embedded in society and are changing slowly, and some of this change is a result of algorithms effectively stripping out bias.

O’Neil says in her TED Talk that data scientists should be “translators of ethical discussions” rather than arbiters of the truth. Algorithms can help us translate the reality of what’s going on - as long as we’ve provided the model with the right information in the first place.

The practices listed above ensure we know exactly what goes into our algorithms and, therefore, what outcomes the algorithms should drive.

Interested in how algorithms and machine learning can enhance your recruitment process? Find out about our predictive solution LaunchPad PREDICT.