Study looks ‘under the hood’ of new teacher-evaluation systems

More and more states are adopting new teacher-evaluation systems in response to a growing consensus that improved teacher quality can spell improved student achievement. The idea is that measuring how teachers perform in the classroom is the first step toward helping those teachers get better. But so far, there’s little consensus on the best ways to make those measurements.

Many states are moving quickly to launch new systems, in some cases without much thought or work to ensure that the methods are sound. One teacher-evaluation expert, Charlotte Danielson, has warned of a wave of lawsuits in places that don’t proceed with extreme caution.

A new report out today looks at the early adopters of new teacher-evaluation systems—and how they differ—as a way to help states and districts consider both innovations and potential missteps. The report, “Measuring Teacher Effectiveness: A Look ‘Under the Hood’ of Teacher Evaluation in 10 Sites,” was commissioned by ConnCAN, 50CAN and Public Impact, all education advocacy groups that support overhauling how teachers have been evaluated historically.

Most states and districts have agreed to use multiple ways to rate teachers, usually a combination of classroom observations and student test scores. From there, they often diverge.

The use of standardized test scores to evaluate teachers has received a great deal of attention, but one of the bigger challenges that states and districts face is how to rate teachers whose students don’t take such tests. In most school systems, the vast majority of teachers (up to 88 percent, for instance, in the District of Columbia) do not teach subjects or grade levels covered by standardized tests. The study found that the early adopters have come up with different ways to measure how the students of these teachers are progressing. In some instances, districts are simply adding more standardized tests. Elsewhere, teachers will be graded based on portfolios or teacher-created assessments.

In several cases, more than one measure of how a teacher has affected students is taken into account. “The rationale is that no single measure is perfect, but combining multiple measures diminishes the weaknesses of any particular measure,” the report’s authors say.

Daniela Doyle and Jiye Grace Han of Public Impact, who co-wrote the report, also found differences in how student “growth” on standardized tests is measured; some of the more esoteric details of the calculations have been hotly debated among researchers.

When it comes to classroom observations, “there was a surprising amount of consensus,” according to Doyle and Han. But leaders of the various sites studied in their report made different decisions about who conducts the observations and how often they occur. (In our own reporting on efforts to overhaul teacher evaluations, we have found nuances among different systems that do seem to matter.)

“Measuring Teacher Effectiveness” doesn’t endorse any particular method: “None of these systems claims to have cracked the code for teacher evaluation,” the authors write. Instead, they focus on in-depth comparisons. What might seem like insignificant details to outside observers will loom large for the teachers and students these systems are meant to help.