“[W]hile overall, educators should be cautious when using AI, it can also be beneficial.”
Source: NMSU Newsroom
Courtesy Photo Caption: Members of a new research collaborative based at New Mexico State University’s College of Health, Education and Social Transformation are exploring the use of artificial intelligence in education and ties to racial bias. The collaborative includes faculty at NMSU, the University of Texas at El Paso and other universities in the region. NMSU faculty involved in the collaborative include (from left) Mariana Alvidrez, Melissa Warr, Amanda Peel and David Rutledge. (NMSU photo by Josh Bachman)
The use of artificial intelligence in education has become a hot-button topic. Research on AI and its ties to racial bias has led to the founding of a new research collaborative based at New Mexico State University’s College of Health, Education and Social Transformation. The collaborative, which involves faculty at NMSU, the University of Texas at El Paso and other universities in the region, explores bias in the educational uses of generative AI.
One example of generative AI is the popular ChatGPT, which allows users to create text, visual and audio content using data it is trained on. Melissa Warr, assistant professor in the School of Teacher Preparation, Administration and Leadership, is leading a research study exploring racial bias in ChatGPT and the implications for the use of such tools in education.
The study, co-authored by Nicole Jakubczyk Oster of Arizona State University and Roger Isaac of NMSU, found that when ChatGPT was presented with identical student writing samples along with student demographics, it gave different scores to students whose racial indicators were implied rather than explicitly stated, compared to the scores it gave when student descriptors stated race outright.
“I used the same passage every time – the only thing that changed was how I described the student,” Warr says. “The resulting score patterns show significant but implicit bias.”
Warr said that when race is stated directly, the bias is not clear, but when she described a student as attending an “inner-city public school” or an “elite private school,” ChatGPT gave lower scores to the inner-city public-school students. She also found bias when students were described as liking rap music rather than classical music: scoring clearly favored students who stated a preference for classical music.
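The audit protocol Warr describes can be sketched in code. The sketch below is an illustration only, not the study’s actual pipeline: the passage, descriptors, trial count and the `score_essay` stub are all hypothetical stand-ins (a real audit would replace the stub with a call to a language model and then test whether the per-descriptor score means differ).

```python
# Minimal sketch of a grading-bias audit: grade the SAME passage under
# different student descriptors and compare the average scores.
# All names and values here are illustrative, not from the study.

PASSAGE = "The same student essay text is reused in every trial."

DESCRIPTORS = [
    "a student at an elite private school who enjoys classical music",
    "a student at an inner-city public school who enjoys rap music",
]


def score_essay(passage: str, descriptor: str) -> float:
    """Hypothetical stand-in for an LLM grading call (0-100 score).

    Deterministic stub so the sketch runs offline; an unbiased grader
    would ignore the descriptor, as this stub does.
    """
    return 85.0


def audit(trials: int = 10) -> dict:
    """Run repeated trials per descriptor and average the scores."""
    results = {}
    for descriptor in DESCRIPTORS:
        scores = [score_essay(PASSAGE, descriptor) for _ in range(trials)]
        results[descriptor] = sum(scores) / len(scores)
    return results


means = audit()
# With an unbiased grader, the spread between descriptor means is zero;
# a persistent nonzero spread on identical text is the signature of bias.
spread = max(means.values()) - min(means.values())
```

Because only the descriptor varies while the passage stays fixed, any systematic difference in the averaged scores can be attributed to the descriptor rather than the writing itself, which mirrors the controlled design Warr describes.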
More recently, Warr has found that newer AI models, like ChatGPT-4o, show even more bias than previous models.
“When directly labeling race, there is bias – it’s just not in the expected direction,” Warr says. “It changes the tone of the feedback it gives depending on the race, so it uses less confident language and more analytical language with White students, for example. A key finding is that the feedback text takes on a more authoritative tone with Black and Hispanic students.”
Building on the study’s ongoing findings, Warr helped establish the Biased AI in Systems of Education, or BAIS, research collaborative. Six NMSU faculty members make up the collaborative, along with faculty from Arizona State University, the University of Texas at El Paso, Universidad Autónoma de Ciudad Juárez, the University of Central Arkansas and Utah Valley University, plus a representative from the Spencer Foundation. They are focusing not just on bias in generative AI, but on what educators can do about it to encourage critically informed use.
“Learning about generative AI has been full of surprises as it is an emerging and evolving technology, with further updates being developed as we speak,” says Nicole Oster, an ASU graduate student who is part of the collaborative and has worked with Warr on her research.
“When I was an educator, ChatGPT had only recently been made public, so my experiences learning about generative AI were new to me. At the same time, I was not surprised that ChatGPT exhibits bias as I had enough awareness about the characteristics of (large language models) and other instances of algorithmic bias in past technologies to anticipate this.”
Oster says that while overall, educators should be cautious when using AI, it can also be beneficial.
“While our findings suggest teachers should avoid or be very cautious of using generative AI to grade or provide feedback on student work, there are many other uses of generative AI that can promote equity in education,” Oster says. “As a former teacher of emergent multilingual learners, two uses about which I am particularly excited are translating between languages and modifying the reading level of texts. Other popular uses that have saved teachers time include making guided notes, study guides and quizzes. One more potentially impactful use is to augment our ability to explain content to learners in engaging, novel and developmentally appropriate ways.”
Researchers are planning a summer workshop for teachers to help them design curriculum to address the issues.
To learn more about Warr’s research, visit https://melissa-warr.com/.
A version of this story was originally published in the fall 2024 issue of Pinnacle, the NMSU College of HEST’s magazine. To read more, visit https://pinnacle.nmsu.edu/.