Final Observations of Canadian University Rankings: A Misadventure Now Over Two Decades Long

In November 2012, Maclean’s published its 21st annual rankings of Canadian universities. Indeed, the ranking of universities has become a popular exercise with which to assess and promote higher education in North America. The approach is similar to that used by publications such as Consumer Reports: goods or services are assigned scores on rational parameters and then given relative rank standings. Rankings of universities continue to be advertised annually as required reading for prospective students (and parents)—for example, to locate “top profs.”

The annual ranking data have, to date, elicited almost no formal statistical or quantitative evaluation. We have now, however, analyzed the ranking data used in the Maclean’s system for every year of publication since its inception in 1990. This planned sequence of studies, for which many references are now available in the literature, has been carried out annually by Prof. Ken Cramer, Dept. of Psychology, University of Windsor, and me.

Because many Canadian schools have now withdrawn their active cooperation in supplying information to Maclean’s, the data underlying the annual rankings are now drawn largely from publicly available sources such as Statistics Canada. Six main measures continue to be used: Student Body (indices of students’ past performance); Classes (indices of class size and the percentage of classes taught by tenured faculty); Faculty (indices of faculty members’ academic qualifications); Finances (indices of budget parameters and student services); Library (indices assessing holdings); and Reputation (indices based on alumni support and a reputational survey that does not include student respondents). The total number of indices has recently been reduced to 14, 13, and 13 for Medical/Doctoral, Comprehensive, and Undergraduate schools, respectively.

Key Observations From Annual Data Analyses

Using the 2010 ranking data alone as a representative example and reference point, and using Spearman rho (rank-based) correlations, which assess the level of association between two rank-based variables, we find that many indices are actually unrelated to final rankings. For each university type, as in all previous studies, many of the rho correlations are actually negative; that is, higher final rankings are correlated with lower rankings on several indices, and vice versa. For Medical/Doctoral universities, 6 of the 14 possible correlations with rank (42%) were statistically significant, i.e., at the conventional criterion of less than five chances in 100 of the correlation occurring by chance. For Comprehensive universities, 4 of the 13 correlations with rank (30%) were significant, and for Undergraduate universities, 5 of 13 (38%) were significant. Also, although the indices are conceptually similar across Maclean’s three university types, inspection of their intercorrelations in 2010 shows again that they correlate weakly and unpredictably with each other. In practical terms, the data therefore seldom allow students or others to use the indices as logical or reliable indicators, either of final rank standings or of each other.
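For readers unfamiliar with the statistic, the rank-based correlation described above can be sketched briefly. The example below is illustrative only: the school names are omitted and both sets of ranks are invented, not drawn from the Maclean’s data.

```python
# Illustrative sketch: Spearman rho between final rank standings and a
# school's rank on one hypothetical index. All ranks below are invented.

def spearman_rho(x, y):
    """Spearman rho for two lists of ranks with no ties:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)),
    where d_i is the difference between a school's two ranks."""
    n = len(x)
    d_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_sq / (n * (n ** 2 - 1))

# Final rank standings of six hypothetical universities (1 = highest).
final_rank = [1, 2, 3, 4, 5, 6]

# Ranks of the same six schools on a single index (e.g., class size).
# Note the weak, partly inverted ordering relative to final rank.
index_rank = [2, 5, 1, 6, 3, 4]

print(round(spearman_rho(final_rank, index_rank), 3))  # a weak positive rho
```

A rho near zero, or a negative rho, is the pattern reported above: knowing a school’s standing on the index tells a reader little about its final rank.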

As we have done for other years, we also assessed to what extent lower-ranking universities in 2010 differed from higher-ranking ones in terms of the indices in the Maclean’s system. The top and bottom subgroups (halves) of the universities, within each type, were therefore compared using the Wilcoxon Rank Sum test (Mann-Whitney U-test), which examines the significance of differences in ranked data on a specified index, taken from two independent samples (here, the top and bottom halves of universities). For all universities pooled together, 9 of these 40 comparisons (22%) were significant at the .05 level.

For Medical/Doctoral universities, the top and bottom groups (halves) differed significantly on 2 of the 14 individual indices (14%). For Comprehensive universities, the top versus bottom halves differed significantly on 3 of 13 (23%), and for Undergraduate universities, on 4 of 13 (30%). Thus, collapsing over the three university types, the top and bottom halves did not differ significantly on 31 (78%) of the 40 individual comparisons. For most comparisons, higher-ranking universities were therefore little or no different from those of lower rank.
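The top-versus-bottom comparison can be sketched with the U statistic itself. The index scores below are invented for illustration; a real comparison would use each half’s actual scores on a Maclean’s index and would refer the resulting U to critical values or a normal approximation.

```python
# Illustrative sketch of the Mann-Whitney U statistic used above.
# The index scores below are invented, not taken from the Maclean's data.

def mann_whitney_u(top, bottom):
    """U statistic for the 'top' sample: the number of (top, bottom)
    pairs in which the top-half score exceeds the bottom-half score,
    with ties counting one half."""
    u = 0.0
    for t in top:
        for b in bottom:
            if t > b:
                u += 1.0
            elif t == b:
                u += 0.5
    return u

# Hypothetical scores on one index for the top and bottom halves of a
# university type (higher = better). Note the considerable overlap.
top_half = [72, 68, 75, 70]
bottom_half = [69, 66, 71, 64]

print(mann_whitney_u(top_half, bottom_half))  # out of 16 possible pairs
```

When U falls near half the number of possible pairs, as here, the two halves overlap substantially, which is the typical non-significant pattern reported above.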

A vertical rank ordering of schools tends to exaggerate apparent differences and mask similarities and overlap. We thus employed cluster analysis, using Ward’s method, to examine interrelationships and similarities among the universities in the 2010 rankings, across the three university types. This procedure identifies clusters, or families, of schools which are empirically similar, and separates those which are dissimilar, based on their overall pattern of scores on the individual Maclean’s indices. For each annual analysis, we have always found that the relationships within and between clusters (i.e., groupings of empirically similar schools) were not clearly reflective of rank differences between higher- and lower-standing universities, or of differences within or across the three university types. In several cases, “unlikely” pairs or groups of schools were seen to be empirically similar, that is, in terms of their pattern of scores on the indices contributing to their final ranks. In effect, then, schools of different characteristics, programs, missions, types, and rank standings may nevertheless show commonality in their pattern of scores on a particular set of indices.
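The clustering step can also be sketched. The code below is a minimal hand-rolled version of agglomerative clustering under Ward’s minimum-variance criterion; the two-dimensional “index profiles” are invented, whereas a real analysis would use each school’s full vector of Maclean’s index scores.

```python
# Minimal sketch of Ward's agglomerative clustering. The school profiles
# below are invented two-dimensional stand-ins for full index-score vectors.

def ward_cluster(points, n_clusters):
    """Repeatedly merge the two clusters whose union least increases the
    total within-cluster sum of squares (Ward's criterion):
    cost = (n_a * n_b / (n_a + n_b)) * ||centroid_a - centroid_b||^2."""
    clusters = [[p] for p in points]

    def centroid(c):
        dims = len(c[0])
        return [sum(p[d] for p in c) / len(c) for d in range(dims)]

    def merge_cost(a, b):
        ca, cb = centroid(a), centroid(b)
        dist_sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return (len(a) * len(b)) / (len(a) + len(b)) * dist_sq

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = merge_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Five hypothetical schools described by two standardized index scores.
schools = [(0.1, 0.2), (0.2, 0.1), (2.0, 2.1), (2.1, 1.9), (5.0, 5.1)]
print(ward_cluster(schools, 3))
```

The point of the procedure, for the analyses above, is that the resulting families are driven by similarity of score profiles, so schools of quite different rank standings or types can land in the same cluster.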

The 2011 rankings were published in November 2011. In these, Brock, Ryerson, and Wilfrid Laurier were moved by Maclean’s from the Undergraduate to the Comprehensive category.

In this sample, the basic observations were highly similar to those of all previous years. For all university types combined, 23 of 40 possible rho correlations (57%) between indices and final rank standings were significant.

For all university types combined, Mann-Whitney U-tests showed that 12 of 40 (30%) comparisons between the top and bottom halves of universities were significant at the .05 level.

Finally, a cluster analysis again identified several clusters and sub-clusters, each containing several family members whose coexistence seemed improbable a priori, but which were empirically similar in terms of their pattern of scores on the underlying indices.

Conclusions

Aside from the formalities of statistical comparison, there are broader, and recurring, issues relevant to ranking exercises such as the Maclean’s system.

One issue is that the various indices and measures have repeatedly shown only partial overlap with the reasons for university attendance and selection typically reported and actually employed by undergraduate students themselves.

A second issue, though dependent somewhat on the nature of the measures used and how research questions are asked, is that the annual rankings have not strongly reflected available studies of student satisfaction. Students have often indicated high levels of satisfaction and loyalty toward their own institutions, and higher-ranking institutions, interestingly, have often done relatively poorly on this type of measure. Moreover, should university rankings be based on student satisfaction indices generally, few if any major differences in university rank ordering would likely be observed. Indeed, twenty-one years of annual rankings, when their properties are carefully examined, have taught us that they cannot be used to identify a singular “best” school for a particular person.

A third issue, therefore, is that rank-based data alone cannot reflect individual missions or programs unique to particular schools—nor does it seem reasonable to expect lay readers or students in general to meaningfully compare schools using specialized indices such as university finances, library holdings, faculty characteristics, or parameters concerning faculty research grants. Similarly, Stephen Trachtenberg, President Emeritus of George Washington University (cited by CNN.com, Feb. 9, 2012), commented that the 2012 annual report of national college rankings published by U.S. News and World Report represented a “racket”—and also asked “whether vichyssoise is better than chicken soup with matzoh balls, or just more satisfying to some people than others? Frequently, it’s a matter of taste more than nutrition.”

A final matter concerns the unintended effects of ranking comparisons on the quality of a university’s academic and intellectual spirit, as these are experienced and perceived by students. These side effects raise the possibility that rankings may help to generate yet another form of the educational self-fulfilling prophecy, indeed one which will negatively affect the lives of many students.

The student, supposedly the “consumer,” is therefore an increasingly vulnerable customer and target, caught between the ideals of higher education and annual financially driven exercises and marketing campaigns—many now increasingly undertaken or tacitly supported by universities themselves. Despite the uncertainties of ranking exercises observed annually since 1990, institutions do not hesitate to utilize and strategically exploit this material, again perhaps driven by the general idiom of hubris, that is, of competition, self-importance, and self-promotion.

Our 21 years of examining Canadian university rankings have now concluded. In the author’s opinion, the essential portrayal of academic matters initiated by Maclean’s in 1990 has changed little, and neither has its underlying view of higher education as a type of financial issue and investment. According to Maclean’s (2011), earning a master’s degree, for example, will improve one’s wages by about 4.1%. The current system thus continues to emphasize financial and other generic factors which can be directly measured—something like ranking humans in terms of height, weight, or bank accounts—but also continues to exclude factors in which various schools may excel in the context of their particular circumstances, resources, educational vision, and service to unique populations. Periodically we find some mention of unique programs at particular locations, such as those for Aboriginal or foreign students. Yet, after 21 years, the present system still includes no criteria with which to meaningfully represent or credit these programs. We have also observed, following several analyses undertaken principally by Prof. Ken Cramer at the University of Windsor, that the variance attributable to schools’ stable standings in resources and reputation is typically far greater than that attributable to variability in rank over time; that is, despite some variation in certain indices or ranks from year to year, schools tend to maintain their basic hierarchy of ordering.

A last academic matter remains for the future: the question of whether rankings and their promoters will continue to emphasize competition, oversimplification, and identification of the supposedly less fit. In Universities in the Marketplace (2003), former Harvard president Derek Bok wondered whether universities will continue to sell their souls if the price is right. Bok’s answer was yes, but, in the present context of university rankings and their effects, the question is perhaps still open, and a different answer may be possible over another twenty-one years.

Stewart Page, Ph.D., is University Professor Emeritus of Psychology, University of Windsor. References, and details of research and analyses summarized herein, are available upon request.