Using Large Language Models for Qualitative Analysis Can Introduce Serious Bias
Sociological Methods & Research (IF 6.5), Pub Date: 2025-05-27, DOI: 10.1177/00491241251338246
Julian Ashwin, Aditya Chhabra, Vijayendra Rao

Large language models (LLMs) are quickly becoming ubiquitous, but their implications for social science research are not yet well understood. We ask whether LLMs can help code and analyse large-N qualitative data from open-ended interviews, with an application to transcripts of interviews with Rohingya refugees and their Bengali hosts in Bangladesh. We find that using LLMs to annotate and code text can introduce bias that can lead to misleading inferences. By bias we mean that the errors LLMs make in coding interview transcripts are not random with respect to the characteristics of the interview subjects. Training simpler supervised models on high-quality human codes leads to less measurement error and bias than LLM annotations. Given that high-quality codes are necessary to assess whether an LLM introduces bias, we argue that it may be preferable to train a bespoke model on a subset of transcripts coded by trained sociologists rather than use an LLM.
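To make the abstract's notion of bias concrete, below is a minimal Python sketch of the kind of check it describes: compare LLM-assigned codes against trained-coder ("gold") codes and test whether the error indicator is independent of a subject characteristic, then fit the simpler supervised baseline the authors favor on a human-coded subset. Everything here is hypothetical, not the authors' replication code: the column names, the simulated data, the error rates, and the TF-IDF-plus-logistic-regression baseline are all illustrative assumptions.

```python
# Hypothetical sketch of a bias check for LLM coding errors; all data simulated.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Simulated corpus: one row per interview transcript.
df = pd.DataFrame({
    "group": rng.choice(["refugee", "host"], size=n),  # subject characteristic
    "human_code": rng.integers(0, 2, size=n),          # trained-coder gold label
})
# Toy transcript text whose vocabulary tracks the gold code.
vocab = {0: ["shelter", "ration", "camp"], 1: ["work", "market", "wage"]}
df["text"] = [" ".join(rng.choice(vocab[y], size=6)) for y in df["human_code"]]

# Simulate an LLM coder whose error rate differs by group -- the failure mode
# the paper calls bias (errors non-random in subject characteristics).
err_p = np.where(df["group"] == "refugee", 0.30, 0.10)
flip = rng.random(n) < err_p
df["llm_code"] = np.where(flip, 1 - df["human_code"], df["human_code"])

# Bias check: is the LLM's error indicator independent of group?
df["llm_error"] = (df["llm_code"] != df["human_code"]).astype(int)
table = pd.crosstab(df["group"], df["llm_error"])
chi2, p, _, _ = chi2_contingency(table)
print(table)
print(f"chi-square = {chi2:.1f}, p = {p:.4f}")  # small p => errors depend on group

# Alternative the abstract favors: train a simple supervised classifier on a
# human-coded subset, then run the same error-by-group check on held-out data.
train, test = train_test_split(df, test_size=0.5, random_state=0,
                               stratify=df["group"])
vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(train["text"]),
                               train["human_code"])
test_err = (clf.predict(vec.transform(test["text"])) != test["human_code"]).astype(int)
print(pd.crosstab(test["group"], test_err))  # errors should not track group
```

The design point the sketch illustrates: the chi-square test is only possible because gold human codes exist, which is the abstract's argument that high-quality human coding is needed anyway, at which point a bespoke supervised model trained on those codes may be preferable to the LLM.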
