Members of the Human-Robot Interaction (HRI) and Artificial Intelligence (AI)
communities have proposed Large Language Models (LLMs) as a promising resource
for robotics tasks such as natural language interactions, performing household
and workplace tasks, approximating common sense reasoning, and modeling humans.
However, recent research has raised concerns about the potential for LLMs to
produce discriminatory outcomes and unsafe behaviors in real-world robot
experiments and applications. To address these concerns, we conduct an
HRI-based evaluation of discrimination and safety criteria on several
highly-rated LLMs. Our evaluation reveals that LLMs currently lack robustness
when encountering people across a diverse range of protected identity
characteristics (e.g., race, gender, disability status, nationality, religion,
and their intersections), producing biased outputs consistent with directly
discriminatory outcomes -- e.g., 'gypsy' and 'mute' people are labeled
untrustworthy, but not 'european' or 'able-bodied' people. Furthermore, we test
models in settings with unconstrained natural language (open vocabulary)
inputs, and find they fail to act safely, generating responses that accept
dangerous, violent, or unlawful instructions -- such as incident-causing
misstatements, taking people’s mobility aids, and sexual predation. Our results
underscore the urgent need for systematic, routine, and comprehensive risk
assessments and assurances to improve outcomes and ensure LLMs only operate on
robots when it is safe, effective, and just to do so. Data and code will be
made available.