4 October 2016
by Simon Marginson

Do rankings drive better performance?

Global ranking is still only 13 years old but has already installed itself as a permanent part of international higher education and has deeply transformed the sector.

Global ranking is inevitable. People inside and outside the sector want to understand higher education, and ranking is the simplest way to do so. It maps the pecking order and underpins partnership strategies. It guides investors in research capacity. It shapes the life decisions of many thousands of cross-border students and faculty—despite the patchy quality of much of the data, and the perverse effects of all rankings, good or bad.

Global ranking has remade global higher education as a relational environment, magnifying some potentials in that environment, and blocking others. It has done so in three ways.

First, competition. Ranking has burned into the global consciousness the idea of higher education as a competitive market of universities and countries. This competition is about research performance, the main driver of ranking outcomes, and about reputation.

Second, hierarchy. Ranking is a core element of the system of valuation, whereby unequal weights are assigned to knowledge and to the credentials that graduates take into national and global labor markets. Through ranking universities become more tightly connected to the political economy, the labor markets and the unequal societies in which they sit.

Third, performance. Ranking has installed a performance economy that controls behavior, driving an often frenetic culture of continuous improvement in each institution.

Unequal competition

There are naturally competitive elements in research and in graduate labor markets. But ranking gives competition a more powerful and pristine form, embedding it in indicators and incentives. It makes competition the principal strategy for many university rectors, presidents and vice-chancellors. Solidarity and cooperation within systems are weakened.

We continue to cooperate, regardless of ranking. The metrics include intellectual collaboration in publishing, though this is often explained as self-interest (joint publication expands citation rates). But the point is that a large and increasing share of the remarkable collective resources in global higher education is allocated to mutual conflict.

Cooperation is further hampered by the hierarchy of value formed in ranking. Though research and learning flow freely across borders they are not equally valued. There is a clear status hierarchy. What defines this hierarchy is not a global system for valuing credentials or learning. There is no global system for credentials. We don’t measure learning on a comparative basis. What systematizes the global hierarchy is the process of codifying, rating and ranking knowledge, summarized and spread everywhere by global ranking.

Knowledge is ordered by journal metrics and hierarchies, publication metrics, citation metrics and hierarchies, and crowned by rankings, which are largely based on research. Research performance is the whole content of the Shanghai ARWU, the Leiden ranking and Scimago, and more than two-thirds of the Times Higher Education ranking. Rankings translate the status economy in research into an institutional hierarchy, determining the value of each knowledge producer and so determining the value of what they produce. Knowledge metrics and ranking recycle the dominance of the strongest universities.

Better performance?

What about performance improvement? This is the ultimate rationale for competition. If ranking is grounded in real university performance, and measures the important things about universities, then a better ranking means improved performance. If every university strives for a higher rank, all must be lifting performance. Is this what happens? Yes and no.

The potential is there, for a virtuous circle between ranking, strategy, efforts to improve, better performance, then back to better ranking, and so on. But there are problems.

Only some university activities are included in ranking. There is no virtuous circle for teaching and learning, a big gap in the performance driver. Many research metrics are inside the virtuous circle, but not those for the humanities, the humanistic social sciences and most professional disciplines; and all scholarly work published outside English is excluded.

What about science? There, some rankings drive performance and others do not. Rankings that rest on coherent metrics for publication and citation drive more and better research outputs, all else being equal (e.g. ARWU, Leiden, Scimago). Since 2003 research-based rankings have contributed to increased investment in university scientific capacity and elevated research outputs within institutional strategy.

The picture is more mixed with the Times Higher and QS rankings. To the extent that they draw on strong research metrics, there is the potential for a virtuous circle. Taken alone, the QS indicator for citations per faculty, and the Times Higher indicators for citations and for research volume, potentially have this effect. ‘Potentially’, because the incentives are blunted: the research-based indicators are buried within combined multi-indicators.

The internationalization indicators generate incentives to increase students and faculty from abroad, and joint publications, but these are minor within the total ranking, and again the performance incentive is buried among the other elements in the multi-indicators used.

Therefore a university may improve its citations per faculty performance, or improve its internationalization numbers, but watch its ranking go down because of what happened in the reputational surveys, which constitute a large slab of both the Times Higher and the QS rankings but are decoupled from real performance. Surveys contain data about opinions about performance, not data about performance. The link between effort, improvement and ranking, essential to the virtuous circle, is broken.
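To see how this decoupling can play out in simple arithmetic, here is a minimal sketch in Python of a weighted composite score. The indicator names, weights and figures are entirely hypothetical, invented only to illustrate the argument; they are not the actual QS or Times Higher methodology.

```python
# Toy composite ranking score. Weights and figures are hypothetical,
# chosen only to illustrate the argument; not any ranker's real method.

WEIGHTS = {
    "citations_per_faculty": 0.20,  # research-based indicator
    "internationalization": 0.10,   # minor weight within the composite
    "reputation_survey": 0.50,      # opinion about performance, not performance
    "other_indicators": 0.20,
}

def composite(scores):
    """Weighted sum of indicator scores, each on a 0-100 scale."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# Year 1: baseline position.
year1 = {"citations_per_faculty": 60, "internationalization": 50,
         "reputation_survey": 70, "other_indicators": 65}

# Year 2: real performance improves, but the survey result drifts down.
year2 = {"citations_per_faculty": 70, "internationalization": 58,
         "reputation_survey": 62, "other_indicators": 65}

print(round(composite(year1), 1))  # 65.0
print(round(composite(year2), 1))  # 63.8 -- composite falls despite better real performance
```

The improvement in citations and internationalization is real, but the heavily weighted survey movement swamps it: the broken link between effort and ranking described above.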

The same happens when the ranking position changes because of small shifts in methodology. Again, there is no coherent link between effort, performance and ranking.

Wait on, you might say, reputation matters to students. The value of degrees is affected by the pecking order. That’s right. And a reputational hierarchy based on surveys, by itself, uncontaminated by other factors, does tell us something important. But a reputational ranking alone, while interesting, cannot drive continually improving performance in real terms. It can only drive a position-and-marketing game. In the end, reputation must be grounded in real performance to consistently benefit stakeholders and the public good.

The point can be made by analogy. The winner of the World Cup in football is determined by who scores the most goals within the allotted time on the field. Now suppose FIFA changes the rules. Instead of rewarding final performance alone, the most goals scored, it decides to give 50% of the weight to goals and 50% to the team believed to be the best, as measured by survey. We would all have less trust in the result, wouldn’t we?

Multi-indicator rankings provide a large data set, but because the link between effort in each area and the ranking outcome is not transparent, they cannot coherently drive performance. The incentives pull in different directions and the effects are invisible. In ARWU the different indicators correlate fairly well; they pull in the same direction and share common performance drivers. But QS and Times Higher use heterogeneous indicators.

On the other hand, if the multi-indicator rankings were disaggregated, the individual indicators could effectively drive performance improvement. Then at least ranking competition would be directed towards better outcomes, not reputation for its own sake.
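As a rough sketch of what disaggregation could look like, the fragment below (reusing the hypothetical figures from the earlier sketch) reports each indicator in its own right, year on year, so that improvement or decline on each is visible rather than being buried in a single composite.

```python
# Disaggregated reporting: each hypothetical indicator is compared year on year
# on its own terms, instead of being folded into one composite score.

year1 = {"citations_per_faculty": 60, "internationalization": 50,
         "reputation_survey": 70, "other_indicators": 65}
year2 = {"citations_per_faculty": 70, "internationalization": 58,
         "reputation_survey": 62, "other_indicators": 65}

for indicator, old in year1.items():
    new = year2[indicator]
    print(f"{indicator}: {old} -> {new} ({new - old:+d})")
```

Presented this way, the incentive attaches to each indicator separately: the gains in citations and internationalization stand on their own, rather than being cancelled out by an unrelated movement in the survey.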