Are Your Metrics Doing Too Much Work?

Evaluation is a constant feature of academic life: hiring, promotion, grant review, annual reviews, program rankings. Wherever evaluation happens at scale, metrics inevitably follow. But are the metrics we rely on actually measuring what we think they are?

The Problem with Familiar Favorites

Two of the most widely used metrics, the Journal Impact Factor (JIF) and the h-index, illustrate the core challenge. The JIF was designed to compare citation rates across journals, not to evaluate individual papers or researchers. When applied to people, it masks enormous variation within journals and rewards venue prestige over actual contribution. The h-index blends productivity and citation impact, but it grows with career length, varies by database and obscures how impact is distributed. Two very different careers can produce the same number.
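To make that last point concrete, here is a minimal sketch in Python of the standard h-index definition: the largest h such that h papers each have at least h citations. The two citation lists are invented for illustration and do not describe real researchers.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical careers (made-up citation counts per paper).
career_a = [120, 85, 40, 6, 5, 1, 0]  # a few very highly cited papers
career_b = [7, 6, 6, 5, 5, 2, 1]      # steady, modestly cited output

print(h_index(career_a), sum(career_a))  # same h-index, 257 total citations
print(h_index(career_b), sum(career_b))  # same h-index, 32 total citations
```

Both invented careers share an h-index of 5 even though one has roughly eight times the total citations of the other, which is the kind of difference the single number hides.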

Despite these limitations, quantitative indicators are deeply embedded in U.S. academic evaluation. An analysis of promotion and tenure documents from 129 North American institutions found that 97 percent used quantitative criteria explicitly and 40 percent directly referenced the JIF; 87 percent of those mentions were framed positively (McKiernan et al., 2019). A study of U.S. evaluator practices found that although most evaluators describe themselves as "responsible" metric users, familiar indicators continue to shape and justify decisions more than they tend to acknowledge (Rushforth and de Rijcke, 2024).

What Responsible Use Looks Like

Stronger metrics are field-aware, make their assumptions visible and are designed to support expert judgment rather than replace it. Field-Weighted Citation Impact (FWCI) and percentile-based citation indicators reduce misleading cross-field comparisons while being transparent about their limits.
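As a rough illustration of why field normalization matters, the sketch below computes an FWCI-style ratio: a paper's citations divided by the average citations expected for comparable papers (same field, publication year and document type). The function name and the baseline figures are assumptions made up for this example; real expected values come from the citation database that supplies the indicator.

```python
def field_weighted_impact(citations, expected_citations):
    """Ratio of actual citations to the expected average for comparable papers."""
    if expected_citations <= 0:
        raise ValueError("expected_citations must be positive")
    return citations / expected_citations


# Hypothetical papers; the expected baselines are invented for illustration.
math_paper = field_weighted_impact(citations=12, expected_citations=4.0)
biology_paper = field_weighted_impact(citations=30, expected_citations=40.0)

print(math_paper)     # above 1.0: cited more than comparable papers
print(biology_paper)  # below 1.0: cited less than comparable papers
```

In this invented case the paper with fewer raw citations scores higher once field norms are taken into account. A value of 1.0 means a paper is cited about as often as expected for its peers.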

Beyond metric choice, narrative approaches offer a promising complement. The Royal Society's Résumé for Researchers asks faculty to describe contributions in their own words, surfacing mentorship, team science and career interruptions that publication lists obscure. Narrative bibliometrics treat metrics as evidence within a larger story, rather than as the story itself.

What Evaluation Often Leaves Out

Mentoring, DEI work, patient advocacy, accessibility labor and community partnerships rarely appear in quantitative indicators, yet institutions depend on them. A survey of 181 researchers found that 59 percent felt evaluated using methods misaligned with their work, and 60 percent encountered non-transparent criteria; lack of recognition for diverse contributions was cited far more often than metric misuse itself (Muhonen and Himanen, 2025). This invisible labor falls disproportionately on BIPOC, women and disabled faculty already facing well-documented evaluation biases (Hirshfield and Joseph, 2012; Guarino and Borden, 2017).

The Galter Library Metrics and Impact Core Can Help

Responsible metrics practice starts with a simple pause: What is this metric standing in for, and what might it be leaving out?

The Galter Library Metrics and Impact Core can help faculty navigate these questions, whether you're constructing a contextualized citation profile, preparing promotion materials, developing a narrative CV, or exploring values-based approaches to research assessment.

To learn more, visit the Galter website.

References

Guarino, C. M., & Borden, V. M. H. (2017). Faculty service loads and gender: Are women taking care of the academic family? Research in Higher Education, 58, 672–694. https://doi.org/10.1007/s11162-017-9454-2

Hirshfield, L. E., & Joseph, T. D. (2012). 'We need a woman, we need a black woman': Gender, race, and identity taxation in the academy. Gender and Education, 24(2), 213–227. https://doi.org/10.1080/09540253.2011.606208

McKiernan, E. C., et al. (2019). Use of the Journal Impact Factor in academic review, promotion, and tenure evaluations. eLife, 8, e47338. https://doi.org/10.7554/eLife.47338

Muhonen, R., & Himanen, L. (2025). Evaluation as a source of unhappiness in academia — unpacking the boundaries of responsible research assessment. Research Evaluation, 34. https://doi.org/10.1093/reseval/rvaf034

Rushforth, A., & de Rijcke, S. (2024). Practicing responsible research assessment: Qualitative study of faculty hiring, promotion, and tenure assessments in the United States. Research Evaluation, 33, rvae007. https://doi.org/10.1093/reseval/rvae007

Karen Gutzman leads Research Assessment and Communications at Galter Health Sciences Library, Feinberg School of Medicine, Northwestern University.
