Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages · The Large Language Model Bible Contribute to LLM-Bible

Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages

Ghosh Sourojit, Caliskan Aylin. Arxiv 2023

[Paper]    
Applications Ethics And Bias GPT Model Architecture Prompting Tools

In this multicultural age, language translation is one of the most performed tasks, and it is becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims to be proficient in such translation tasks and in this paper, we put that claim to the test. Specifically, we examine ChatGPT’s accuracy in translating between English and languages that exclusively use gender-neutral pronouns. We center this study around Bengali, the 7\(^{th}\) most spoken language globally, but also generalize our findings across five other languages: Farsi, Malay, Tagalog, Thai, and Turkish. We find that ChatGPT perpetuates gender defaults and stereotypes assigned to certain occupations (e.g. man = doctor, woman = nurse) or actions (e.g. woman = cook, man = go to work), as it converts gender-neutral pronouns in languages to he' or she’. We also observe ChatGPT completely failing to translate the English gender-neutral pronoun `they’ into equivalent gender-neutral pronouns in other languages, as it produces translations that are incoherent and incorrect. While it does respect and provide appropriately gender-marked versions of Bengali words when prompted with gender information in English, ChatGPT appears to confer a higher respect to men than to women in the same occupation. We conclude that ChatGPT exhibits the same gender biases which have been demonstrated for tools like Google Translate or MS Translator, as we provide recommendations for a human centered approach for future designers of AIs that perform language translation to better accommodate such low-resource languages.

Similar Work