“‘Where are the academics who are in favor of value-added?’ Here they are, with persuasive reasoning.”
So concludes the press release that today announced a new Brookings report, “Evaluating Teachers: The Important Role of Value-Added.” The report was co-authored by Steve Glazerman (Mathematica Policy Research), Susanna Loeb (Stanford), Dan Goldhaber (University of Washington-Bothell), Doug Staiger (Dartmouth), Stephen Raudenbush (University of Chicago) and Russ Whitehurst (Brookings).
The report can be read in some ways as a direct reply to the EPI brief, “Problems with the Use of Student Test Scores to Evaluate Teachers,” released in August. The EPI brief was co-authored by an equally distinguished set of scholars, including Linda Darling-Hammond, Helen Ladd, Diane Ravitch, Richard Rothstein and Lorrie Shepard.
The Brookings paper also addresses recent reports on value-added measures from the Educational Testing Service, the Institute of Education Sciences, the National Academies and RAND. The authors acknowledge some common ground with what others have argued — no serious scholar has said that value-added measures should be the sole factor in teacher evaluations, and few favor publishing individual teachers’ value-added scores — but they also highlight their disagreements:
“There are three problems with these reports. First, they often set up an impossible test that is not the objective of any specific teacher evaluation system, such as using a single year of test score growth to produce a rank ordered list of teachers for a high stakes decision such as tenure. Any practical application of value-added measures should make use of confidence intervals in order to avoid false precision, and should include multiple years of value-added data in combination with other sources of information to increase reliability and validity. Second, they often ignore the fact that all decision-making systems have classification error. The goal is to minimize the most costly classification mistakes, not eliminate all of them. Third, they focus too much on one type of classification error, the type that negatively affects the interests of individual teachers.”
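To make the confidence-interval point concrete, here is a minimal sketch of our own, not anything from the report; the effect size and noise level are invented for illustration. It simulates noisy single-year value-added estimates for one hypothetical teacher and shows how averaging several years narrows the confidence interval around the estimate, which is exactly the “false precision” a single year invites.

```python
# Illustrative sketch only: effect size and noise level are invented,
# not taken from the Brookings report. Averaging several years of
# value-added data narrows the confidence interval around a teacher's
# estimate, avoiding the "false precision" of a single year.
import random
import statistics

random.seed(42)

TRUE_EFFECT = 0.10    # hypothetical teacher's true value-added, in test-score SDs
YEAR_NOISE_SD = 0.20  # hypothetical noise in any single year's estimate

def estimate(n_years: int) -> tuple[float, float]:
    """Average n_years of noisy annual estimates; return (mean, 95% CI half-width)."""
    yearly = [random.gauss(TRUE_EFFECT, YEAR_NOISE_SD) for _ in range(n_years)]
    se = YEAR_NOISE_SD / n_years ** 0.5  # standard error of the multi-year mean
    return statistics.fmean(yearly), 1.96 * se

for years in (1, 3, 5):
    mean, half_width = estimate(years)
    print(f"{years} year(s): estimate {mean:+.2f}, 95% CI ±{half_width:.2f}")
```

Running this, the interval around a one-year estimate is wide enough to swamp the true effect, while the five-year interval is less than half as wide, which is the report's case for combining multiple years of data.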
On this last point, the Brookings authors explain that “the interests of students and the interests of teachers in classification errors are not always congruent, and … a system that generates a fairly high rate of false negatives could still produce better outcomes for students by raising the overall quality of the teacher workforce.”
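The trade-off the authors describe can be seen in a toy simulation. The sketch below is ours, not theirs, and every parameter (the noise level, the dismissal cutoff, the quality of replacement hires) is invented. It screens out the bottom decile of teachers on a noisy measure and shows that even when a sizable share of those dismissed were not truly bottom-decile performers (false negatives), the average true effectiveness of the workforce still rises.

```python
# Illustrative sketch only: every parameter here (noise level, cutoff,
# replacement quality) is invented, not taken from the Brookings report.
# Teachers are screened on a noisy measure; many of those dismissed were
# not truly bottom-decile performers (false negatives), yet the average
# true effectiveness of the workforce still rises.
import random
import statistics

random.seed(0)

N = 100_000
NOISE_SD = 0.7          # hypothetical measurement noise (true effects have SD 1)
CUTOFF_PCT = 0.10       # dismiss the bottom 10% by measured value-added
REPLACEMENT_MEAN = 0.0  # hypothetical average effectiveness of replacement hires

true_eff = [random.gauss(0, 1) for _ in range(N)]
measured = [t + random.gauss(0, NOISE_SD) for t in true_eff]

measured_cutoff = sorted(measured)[int(N * CUTOFF_PCT)]
kept = [t for t, m in zip(true_eff, measured) if m >= measured_cutoff]
dismissed = [t for t, m in zip(true_eff, measured) if m < measured_cutoff]

# False negatives: dismissed teachers who were not truly in the bottom decile.
true_cutoff = sorted(true_eff)[int(N * CUTOFF_PCT)]
false_neg_rate = sum(1 for t in dismissed if t > true_cutoff) / len(dismissed)

before = statistics.fmean(true_eff)
after = statistics.fmean(kept + [REPLACEMENT_MEAN] * len(dismissed))

print(f"share of dismissals that were false negatives: {false_neg_rate:.0%}")
print(f"average true effectiveness: before {before:+.3f}, after {after:+.3f}")
```

Under these made-up numbers, roughly half of the dismissals are false negatives, yet the workforce average improves, which is the congruence point the authors are making: a screen costly to some individual teachers can still leave students better off on average.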
The authors readily acknowledge that value-added measures aren’t perfect — but they don’t take that to mean value-added measures should be thrown out: “It is not a perfect system of measurement, but it can complement observational measures, parent feedback, and personal reflections on teaching far better than any available alternative. It can be used to help guide resources to where they are needed most, to identify teachers’ strengths and weaknesses, and to put a spotlight on the critical role of teachers in learning.”
The key questions, then, are when and how value-added data are used. On whether these data belong in teacher evaluations, the authors' answer is clear (yes); on whether they belong in high-stakes personnel decisions, it is more tentative (maybe).
These lines, from the conclusion, provide a useful summary of the Brookings paper: “When teacher evaluation that incorporates value-added data is compared against an abstract ideal, it can easily be found wanting in that it provides only a fuzzy signal. But when it is compared to performance assessment in other fields or to evaluations of teachers based on other sources of information, it looks respectable and appears to provide the best signal we’ve got. Teachers differ dramatically in their performance, with large consequences for students. Staffing policies that ignore this lose one of the strongest levers for lifting the performance of schools and students. That is why there is great interest in establishing teacher evaluation systems that meaningfully differentiate performance. Teaching is a complex task and value-added captures only a portion of the impact of differences in teacher effectiveness. Thus high stakes decisions based on value-added measures of teacher performance will be imperfect. We do not advocate using value-added measures alone when making decisions about hiring, firing, tenure, compensation, placement, or developing teachers, but surely value-added information ought to be in the mix given the empirical evidence that it predicts more about what students will learn from the teachers to which they are assigned than any other source of information.”
For in-depth materials on the timely topic of evaluating teachers – including links to important reports and news coverage on value-added measures – see our relevant “Go Deep” section.