Anthropocentric bias in language model evaluation

Charles Rathkopf; Raphaël Millière

Authors

Charles Rathkopf, Forschungszentrum Jülich
Raphaël Millière

Keywords:

language models, cognitive evaluation, anthropocentric bias, metalinguistic prompting, grammaticality judgment, auxiliary oversight, mechanistic chauvinism, test-time scaling, latent competence

Abstract

Evaluating the cognitive capacities of large language models (LLMs) requires overcoming
not only anthropomorphic but also anthropocentric biases. This article identifies two types of
anthropocentric bias that have been neglected: overlooking how auxiliary factors can impede
LLM performance despite competence (auxiliary oversight), and dismissing LLM mechanistic
strategies that differ from those of humans as not genuinely competent (mechanistic chauvinism).
Mitigating these biases requires an empirical, iterative approach to mapping cognitive tasks to
LLM-specific capacities and mechanisms, achieved by supplementing behavioral experiments
with mechanistic studies.
