The practice of recruiters deploying artificial intelligence to scan and sort candidate resumes has become fairly common over the past few years. AI makes such tasks, previously undertaken by HR staff, more streamlined and efficient by being able to summarize large quantities of data, highlight desirable traits and spotlight red flags.
At the same time, numerous organizations representing the disability community have warned of the potential of the technology to discriminate against and exclude job seekers with disabilities due to superficial variations in how their resumes may appear when contrasted with the general population.
Now, researchers at the University of Washington have identified a fascinating new layer to these exclusionary dynamics by interrogating OpenAI’s ChatGPT on how references to disability influence how it ranks job candidate resumes.
To begin their investigation researchers from UW’s Paul G. Allen School of Computer Science & Engineering used one of the study authors’ publicly available CVs as a control. The team then modified the control CV to create six variations with each one citing different disability-related credentials ranging from scholarships and awards to memberships on a diversity, equity and inclusion panel or student organization. Running ChatGPT’s GPT-4 model ten times over to rank the modified CVs against the original version for a real-life “student researcher” job listing at a large software company – the results proved both eye-opening and deflating.
In virtually any other sphere, awards and participation in panels should be recognized as a net positive but due to their association with disability in this experiment across 60 trials, ChatGPT only ranked the disability-modified CVs ahead of the control one quarter of the time. This is in spite of the fact that, aside from the disability-related modifications, all other parts of the CV remained identical to the original.
Delving deeper
Of course, the beauty of a large language model like GPT-4 is the capacity for users to engage in human-like back-and-forth conversations with the chat interface and ask it more about how it reached its conclusions. In this experiment, ChatGPT appeared to make several discriminatory suppositions such as that an autism leadership award was likely to have “less emphasis on leadership roles.” It also determined that a candidate with depression had “additional focus on DEI and personal challenges,” which “detract from the core technical and research-oriented aspects of the role” even though no such challenges were explicitly detailed.
Explaining the uneasy relationship between AI algorithms and disability during a recent interview, Ariana Aboulafia the Center for Democracy and Technology’s Policy Counsel for Disability Rights in Technology says, “Algorithms and algorithmic systems are based on pattern recognition and a lot of folks with disabilities exist outside of a pattern.”
She continues, “These algorithmic systems may, to a certain extent, be inherently incompatible with creating an output that’s not discriminatory against people with disabilities.”
Commenting on the UW project specifically, the project’s lead author Kate Glazko said, “Ranking resumes with AI is starting to proliferate, yet there’s not much research behind whether it’s safe and effective…. For a disabled job seeker, there’s always this question when you submit a resume of whether you should include disability credentials. I think disabled people consider that even when humans are the reviewers.
“People need to be aware of the system’s biases when using AI for these real-world tasks,” added Glazko.
The human touch
Nevertheless, the UW research did offer a glimmer of hope. The researchers were able to make the disability-related activity-modified CVs rank higher by using GPT-4’s editor function which allows users to add further customizations to the tool. In this instance, they asked GPT-4 to not exhibit ableist bias and to work with disability justice and DEI principles. With this tweak, bias for all but one of the disabilities tested improved with depression being the exception. CVs associated with deafness, blindness, cerebral palsy, autism and the general term “disability” all improved but only three ranked higher than resumes that didn’t mention disability. Overall, this system ranked the disability-modified CVs higher than the control CV 37 times out of 60 after GPT-4 was instructed to be more inclusive.
This suggests that awareness amongst recruiters of the limitations of AI and having tools that can be trained and customized on DEI principles can be part of what remains a complex challenge when it comes to enhancing inclusivity in AI.
Another aspect is growing our understanding of this new and emerging area through more specific research as Senior research author and Allen School professor Jennifer Mankhoff explained:
“It is so important that we study and document these biases,” Mankoff said. “We’ve learned a lot from and will hopefully contribute back to a larger conversation — not only regarding disability but also other minoritized identities — around making sure technology is implemented and deployed in ways that are equitable and fair.”
Aboulafia firmly agrees. Emphasizing that, “There’s always questions of multiple marginalization. So, it’s important to recognize that a straight cisgender, white disabled man is unlikely to have the same experiences with systems and technology as a disabled queer woman of color.”
Aboulafia is a huge exponent of codesigning with the disability community for both building out data sets and auditing tools but acknowledges the limitation that each individual with a disability “can only really speak out to their own lived experience.”
“It can be useful to include people with a disability rights or disability justice background,” Aboulafia says.
“There are just as many ways to be disabled as there are people with disabilities and so having a background in disability rights and justice and coming at things from that framework can help a lot with more cross-disability advocacy.”
Despite being unfathomably complex under the hood, generative AI at its front end is becoming more human-like. Maximizing its potential would appear to be, in large part, about asking it the right questions. Building a more disability-inclusive AI future may be less about talking to computers but simply liaising with the right humans at the right time and taking a moment to truly listen to what they have to say.