Who Should Evaluate Teachers?

Integrating peer review systems into teacher evaluations can lead to improved teacher effectiveness, academic achievement, and collaboration among colleagues.

By Andrew Boryga

July 14, 2023

Despite large-scale reforms over the past decade, teacher evaluation systems have “failed to improve student outcomes,” a widely cited 2021 study by Brown University researchers concluded.

The new evaluation systems, researchers wrote, still relied on a top-heavy structure, which created “large demands on administrators’ time to conduct frequent observations and complete considerable paperwork, displacing other more potentially productive activities.” Many districts also placed “unrealistic expectations” on administrators to provide feedback to teachers, “narrowing the scope, depth, and quality of feedback teachers received.”

One of the few bright spots in the report, in fact, was the example of cities like Cincinnati, which implemented peer evaluation systems in which teachers were evaluated by “experienced, expert teachers” in addition to school principals, a system that improved student achievement in math.

The findings align with what teachers say they want from evaluations. According to a 2022 National Council on Teacher Quality (NCTQ) report, studies over the past decade show that teachers perceive evaluations to be “more meaningful” and see greater improvement in their practice when evaluators are trained on smartly constructed observation rubrics, have more classroom experience, and have experience with “the content their evaluees are teaching.”

The system probably can’t be scrapped entirely; it’s important for school leaders to have a window into classrooms and a clear sense of how individual pedagogical decisions are aligned with the school’s mission. But there remain real questions about the nature and scope of the leaders’ role in evaluations, and whether the process as a whole should be more distributed, with greater contributions from other stakeholders, like teachers and students.

A 2016 report by the Brookings Institution, for example, argued that top-down systems would be more efficient if formal, and heavily weighted, observations were replaced with “checks” by school leaders to ensure that teachers have classrooms under control and are teaching in a way that aligns with the school mission and values. “This kind of minimal observation, analogous to car inspections, would be less taxing but still yield useful information.” Meanwhile, the report adds, incorporating more factors into the teacher evaluation process—improvements in student performance, findings from student surveys, and feedback from colleagues, for example—would give a fuller picture of teacher effectiveness and reduce noise in the system.

If decades of research call the top-down systems into question, what are the real alternatives? After all, it’s teachers who are on the hook—and it’s teachers who know the ins and outs of the craft and are well positioned to deliver transformative classroom feedback. With that in mind, we scanned the research and identified alternative, evidence-based methods of evaluation for consideration.

Collegial Feedback

A 2021 study makes a strong case against top-down evaluation systems, revealing that peer review by colleagues can prove to be more academically enriching to students and more useful to teachers.

Researchers randomly assigned 1,300 math and English teachers in the U.K. to peer observation over a period of two years. Teachers used detailed rubrics to deliver feedback on classroom environment, instruction, planning, and assessment to their peers.

The evaluations that teachers received didn’t only improve their practice; they also improved academic achievement. According to the study, students saw improvements in math and English that were the equivalent of “adding two to four weeks of additional class time to the school year,” when compared with their counterparts in schools that conducted “business as usual” evaluations, which were handled by school leaders and did not involve any observations by peers.

Observers also stood to benefit from the process. According to the study, peer reviewers learned new evaluation skills and were able to use those new skills for self-evaluation. “Skill improvement may also explain improvements for observer teachers even though they were not actually scored,” the researchers wrote.

The Value of Rubrics

What’s good for students is good for teachers. Rubrics provide structure, reduce unintentional bias, and help evaluators give detailed, relevant, and targeted feedback on teaching practice. According to the same 2021 study referenced above, “Being scored with a rubric creates new information likely helpful in deciding where to direct effort.” If observed and scored on a rubric by peers, the researchers write, teachers not only get a sense of their own strengths and weaknesses, but also may “learn or infer information about how their own performance compares with that of other teachers.”

Rubrics, the researchers contend, should be organized into key sections to evaluate a teacher’s practice, with several standards articulated within each section, such as whether a teacher creates an effective learning culture characterized by high expectations for students, whether they have solid classroom procedures and time management strategies in place, and whether they can effectively manage student behavior.

But passing out evaluation rubrics isn’t enough. To make teachers confident in the integrity of the process, training observers on how to use the rubrics is required, according to NCTQ. Research shows that “teacher observation scores were more reliable and student learning improved when teachers were observed by evaluators who had been trained on the observation rubric.”

A Picture’s Worth a Thousand Words

Part of the problem with more traditional evaluation processes is that they are a one-shot affair, requiring teachers to perform under unusual, high-stress circumstances. Video, on the other hand, is a powerful, asynchronous tool that can be leveraged for evaluations, writes educator Michael Moody. “Teachers can record themselves and submit videos to be viewed later by evaluators and/or peers for observation and coaching.” Videos can also provide teachers with an opportunity to self-evaluate and tinker with their teaching practice as they work on the feedback they’ve received.

In a recent study, researchers found that when teachers videotaped themselves delivering a lesson and then discussed it with a feedback partner, they reported more positive feelings about the evaluation process and remained in the profession at a higher rate than peers not selected for videotaped observations.

In addition, researchers noted, video observations are more efficient, allowing observers, including busy school leaders, to provide targeted feedback on their own schedule, without having to make space for time-consuming classroom visits.

The Role of Students

A 2010 report from the Gates Foundation analyzed nearly 3,000 teachers in urban school districts and looked at multiple sources of data to test new approaches to measuring effective teaching. Part of that data included student surveys. Researchers concluded that student feedback and student achievement “do seem to point in the same direction, with teachers performing better on one measure tending to perform better on the other measures.”

According to the researchers, “Students seem to know effective teaching when they experience it.” Their research showed that positive student perceptions of a teacher were not merely self-serving: Teachers who were highly rated by students drove significant academic progress in other classes as well. The researchers noted that the most important data points were related to the students’ perception of a teacher’s ability to control a classroom and challenge students with rigorous work.

To get clear directional feedback, the researchers recommended surveys that present students with a wide range of statements and allow them to indicate their level of agreement, from “totally untrue” to “totally true.” Some statements, like “My teacher seems to know if something is bothering me,” are related to how much a teacher seems to care about students and their work, while others, like “My teacher knows when the class understands, and when we do not,” are related to how well a teacher can clarify and explain difficult concepts. Meanwhile, prompts like “My teacher pushes us to think hard about the things we read” are aimed at assessing the level of rigor in the classroom.
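As a rough illustration of how such a survey could be tallied, the sketch below averages agreement levels by category. It is only a sketch under stated assumptions: the article names just the two endpoints of the scale, so the three middle labels, the 1-to-5 numeric mapping, the category names ("care", "clarify", "rigor"), and the function name `category_means` are all hypothetical and not taken from the report.

```python
from collections import defaultdict
from statistics import mean

# Assumed 5-point scale: the article gives only the endpoints
# ("totally untrue" and "totally true"); the middle labels are hypothetical.
SCALE = {
    "totally untrue": 1,
    "mostly untrue": 2,
    "somewhat": 3,
    "mostly true": 4,
    "totally true": 5,
}

# Statements quoted in the article, tagged with hypothetical category labels.
STATEMENTS = {
    "My teacher seems to know if something is bothering me": "care",
    "My teacher knows when the class understands, and when we do not": "clarify",
    "My teacher pushes us to think hard about the things we read": "rigor",
}


def category_means(responses):
    """Average the numeric agreement level for each category.

    `responses` is a list with one dict per student, mapping each
    statement to the agreement label that student chose.
    """
    by_category = defaultdict(list)
    for student in responses:
        for statement, answer in student.items():
            by_category[STATEMENTS[statement]].append(SCALE[answer.lower()])
    return {category: mean(values) for category, values in by_category.items()}
```

A per-category average like this gives only directional information. It says nothing about why students answered as they did, which is one reason such surveys are treated as a complement to other measures.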

Student surveys are imperfect tools, of course. Scores on surveys can be negatively influenced by students who take umbrage at work they find too challenging, refuse to take the surveys seriously, or don’t trust that their answers will be kept confidential. This is why the researchers argued that student surveys represent a “valuable complement” and “inexpensive way” to supplement other, more costly, performance measures and indicators, such as classroom observations.

Group Reflection Using Teaching Squares

Teaching squares are a peer observation tool developed by educator Anne Wessely and frequently used in higher education. The approach calls for a group of four instructors, ideally from different disciplines, who observe each other’s teaching over a sequence of weeks or months.

In a 2017 report, researchers from the Centre for Teaching Support & Innovation at the University of Toronto wrote that group members are not meant to conduct formal evaluations of their colleagues, but rather to focus observations around a few stated goals the teacher might have—such as stimulating more classroom discussion and student engagement or creating smoother transitions between activities.

After each member of the group has been observed by their peers, the entire group meets to debrief, share notes, and discuss strategies.

The aim of the group is to enhance each individual’s teaching and learning through a “structured process of classroom observation, reflection and discussion,” according to the researchers from the University of Toronto. Such observations and discussions—which could be recorded as part of the yearly evaluation process—might focus around a teacher’s desire to arouse curiosity in students and leave them “wanting to know more” at the end of a lesson, according to a teaching squares guide from the University of Central Florida. Alternatively, the group might focus on how one member ties assessments to learning objectives or how well they structure and organize group work.

During class visits, observers record what they see. The debrief meeting at the end of the process is meant to create a space where best practices and alternative methods can be exchanged among peers and across disciplines.