GFC and the USRIs: How Should Teaching Evaluations at the University of Alberta Be Used?

This is a brief post meant to stimulate conversation about a question faculty and other instructors across campus should be asking: in a decade in which studies have repeatedly indicated that student evaluations of teaching involve bias against certain groups of instructors, how should our teaching evaluations at the University of Alberta, the USRIs (Universal Student Ratings of Instruction), be used?

The issue was on the floor Monday at the first 2017-18 meeting of the General Faculties Council (GFC) in relation to a report tabled by the Committee on the Learning Environment (CLE) in response to a 30 May 2016 motion of GFC. The problem is that neither the formal paperwork for the motion nor the report itself cited the motion correctly. As a result of an amendment I moved at that 30 May 2016 meeting, the motion included a crucial word that should have been a touchstone for the committee’s work. The word omitted from the formal paperwork and from every mention of the motion in the report is “non-discriminatory”:

THAT the General Faculties Council, on the recommendation of the GFC Executive Committee, request that the GFC Committee on the Learning Environment report by 30 April 2017, on research into the use of student rating mechanisms of instruction in university courses. This will be informed by a critical review of the University of Alberta’s existing Universal Student Ratings of Instruction (USRIs) and their use for assessment and evaluation of teaching as well as a broad review of possible methods of multifaceted assessment and evaluation of teaching. The ultimate objective will be to satisfy the Institutional Strategic Plan: For the Public Good strategy to: Provide robust supports, tools, and training to develop and assess teaching quality, using qualitative and quantitative criteria that are fair, equitable, non-discriminatory and meaningful across disciplines. CARRIED

The paragraph on page 4 of the CLE report dealing with studies indicating that student evaluations involve a gender bias reads as follows:

● Gender: The literature in this category is extensive and conflicted. Numerous articles in this subcategory report gender differences or no differences in student evaluations of teaching. For example, Boring, Ottoboni, and Stark (2016) concluded that student ratings are “biased against female instructors by an amount that is large and statistically significant.” On the other hand, Wright and Jenkins-Guarieri (2012) conducted a meta-analysis of 193 studies and concluded that student evaluations appear to be free from gender bias. The University of Alberta TSQS conducted descriptive analyses and the results showed there is no apparent difference between scores for males (N = 18,576, Mdn = 4.53) and females (N = 13,679, Mdn = 4.57) for statement 211 (“overall the instructor was excellent”).

In my remarks at GFC I noted that it should concern us that this paragraph is cursory in its treatment of the possibility that student evaluations of teaching involve gender discrimination. I also find the description of the “literature” as “extensive and conflicted” odd given that the table in the report clearly shows that studies indicating gender bias are far more numerous than those finding none. But how about that last sentence?! What do you make of that?

I didn’t mention the sentence in my prepared remarks, partly because I aimed to keep those remarks to no more than two minutes. Over a hundred people sit on GFC. On such a weighty matter I assumed that many colleagues would want to speak. Instead, the presentation team from CLE defended their position by citing that sentence. I cannot for the life of me see how that sentence shows anything. To say that the median scores are virtually identical for men and women tells us nothing about how those scores are achieved or how they are distributed. It surely obscures much. Right? But I’m just a Shakespearean . . . .
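To make the point concrete, here is a minimal sketch in Python. The numbers are invented purely for illustration and have nothing to do with the actual USRI data; the point is simply that a median comparison is blind to the shape of the distributions being compared.

```python
import random
import statistics

# Invented weights, purely for illustration; not USRI data.
random.seed(1)

# Group A: ratings cluster at 4 and 5.
group_a = random.choices([3, 4, 5], weights=[10, 50, 40], k=10000)

# Group B: the same median, but a heavy low tail of 1s and 2s,
# the kind of polarization a median comparison cannot detect.
group_b = random.choices([1, 2, 3, 4, 5], weights=[10, 8, 7, 35, 40], k=10000)

for name, ratings in (("A", group_a), ("B", group_b)):
    low = 100 * sum(r <= 2 for r in ratings) / len(ratings)
    print(name,
          "median:", statistics.median(ratings),
          "mean: %.2f" % statistics.mean(ratings),
          "share of 1s and 2s: %.0f%%" % low)
```

Both groups report a median of 4, yet one receives nearly a fifth of its ratings at the bottom of the scale. A summative process that looks only at medians would treat these two patterns as equivalent.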

GFC, by the way, was being asked to endorse a set of recommendations in which our USRIs would continue to be used for summative purposes — that is, for merit, tenure, and promotion decisions. I stated that we need to take seriously the statement on page 10 of the report that indicates the reservations of some chairs: “Some department chairs expressed concerns around biases, validity, and the potential for misinterpretation of USRI results for summative purposes of promotion and tenure decisions.”

This, in my view, is exactly the issue. Instructors at the University of Alberta need to receive formal feedback from their students about their courses. Formative feedback on their teaching from their students is important. And we could seek that feedback by more sophisticated means than we currently do. But with a growing body of research indicating that student evaluations of teaching involve bias — the most significant studies are about bias in the assessment of instructors who are women — it would not be responsible for GFC to continue to endorse the use of USRIs for “summative purposes.” I cited the conclusion of the Boring, Ottoboni, and Stark study to this effect. Its final statement reads as follows:

[T]he onus should be on universities that rely on SET [student evaluations of teaching] for employment decisions to provide convincing affirmative evidence that such reliance does not have disparate impact on women, underrepresented minorities, or other protected groups. . . . Absent such specific evidence, SET should not be used for personnel decisions.

[my emphases]

The issue has now entered the international mainstream in the form of an article in last week’s Economist, which you can read here. The Economist discusses a study published last fall by another team of researchers, Mengel, Sauermann, and Zölitz. Their study is not mentioned in the CLE report. In the Economist it is discussed under the heading “Academic Sexism.”

At GFC, President Turpin invited someone to move a postponement of consideration of the CLE’s recommendations. The matter will presumably return at the next meeting of GFC, scheduled for 30 October 2017. Can I hear from you before then? Especially about that darn sentence: The University of Alberta TSQS conducted descriptive analyses and the results showed there is no apparent difference between scores for males (N = 18,576, Mdn = 4.53) and females (N = 13,679, Mdn = 4.57) for statement 211 (“overall the instructor was excellent”). I have heard some very scathing things about this statement from people whose disciplines make their critique significant, but I’d like to hear more.
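For readers who want a more principled way to compare the two groups than eyeballing medians: Boring, Ottoboni, and Stark used nonparametric permutation tests in their 2016 analysis. Here is a generic two-sample version in Python. It is a sketch of the general technique only, not the authors’ actual code; the function name and inputs are my own invention, and it says nothing about the USRI data itself.

```python
import random
import statistics

def permutation_test(a, b, reps=10000, seed=0):
    """Two-sided permutation test on the difference in mean ratings.

    Under the null hypothesis that group labels are exchangeable
    (i.e., group membership makes no difference), reshuffling the
    labels should produce differences as extreme as the observed
    one about p of the time.
    """
    rng = random.Random(seed)
    observed = statistics.mean(a) - statistics.mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            hits += 1
    return hits / reps  # estimated two-sided p-value
```

Unlike a bare comparison of medians, a test of this kind asks whether the whole pattern of scores is plausible under the assumption that the instructor’s group membership makes no difference.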

Shall I also ask Boring (Institut d’études politiques de Paris), Ottoboni (Berkeley), Stark (Berkeley), Mengel (University of Essex), Sauermann (Stockholm University), and Zölitz (Institute on Behaviour and Inequality, Bonn, Germany), what they make of it? ; )

Oh, and for now you can read CLE’s report in full here.

3 Responses to GFC and the USRIs: How Should Teaching Evaluations at the University of Alberta Be Used?

  1. There are many problems with this USRI report that I will discuss at a later time. However, I would like to point out one of the most glaring errors right now. In discussing gender bias in evaluations, the authors of the report described the findings of the Wright and Jenkins-Guarieri (2012) article, stating on page 4 of the Summary Report that “Wright and Jenkins-Guarieri (2012) conducted a meta-analysis of 193 studies and concluded that student evaluations appear to be free from gender bias.” However, if you read beyond the abstract of the Wright and Jenkins-Guarieri (2012) article, you will see that it is a meta-analysis of meta-analyses that includes only one meta-analysis related to gender, a 1993 study covering 28 studies (not the 193 cited in the USRI report). Wright and Jenkins-Guarieri (2012) even noted that “the meta-analysis largely included studies from the 1970s, and more current research is needed before making conclusive statements based on gender” (p. 693). Overall, I was quite sad to see such a lack of attention to research and detail in this report, with additional factual errors that go beyond the discussion of gender bias.

  2. Arts Squared says:

    Thanks, Michelle. Very much look forward to hearing further! In the meantime I thought I should note (in relation to one of the comments that I’ve received offline) that it is true that possible discriminatory aspects are not the only issue with our USRIs. The anonymity of student evaluations (to cite just one other issue) is also a problem. And as I noted at GFC last week, there are better models elsewhere for the form that student evaluations might take if they are to be better mechanisms for soliciting formative feedback.

  3. Arts Squared says:

    Further on the larger problems of student evaluations: “There is no current literature of which the current author is aware that convincingly argues for the position that SET scores are measures of instructor teaching competence, and no specific evidence of content validity that suggests that the instrument used at many universities measures teaching competence” (Hornstein, 2017 at http://www.tandfonline.com/doi/full/10.1080/2331186X.2017.1304016).
