Tests as teacher-training tools: Two views from winners

U.S. Secretary of Education Arne Duncan today announced the department’s picks for two plans to create new and improved tests to replace the fill-in-the-bubble versions that most states have long used. The grant competition was billed as a mini-Race to the Top, with only three contestants. The one loser was a smaller group of states proposing low-stakes high-school assessments; the two winners will split $330 million to develop their proposals for high-stakes and interim K-12 assessments.

The winning proposals are fairly similar. There’s a focus on performance assessments – tests in which students go beyond choosing answers from a multiple-choice list and produce things like essays, science experiments or speeches. One of the interesting — if subtle — distinctions, however, is how each proposes using the tests to improve teacher effectiveness.

The SMARTER proposal (that’s the name, not my endorsement) emphasizes the use of teachers as test-scorers. It suggests that teacher-scoring is not only a way to get around the shortcomings of computer-scoring, but also a way to improve teaching.

As the SMARTER Balanced Assessment Consortium put it: “The Consortium recognizes and values the professional development opportunity inherent in the use of teacher scorers. We value teacher scoring because of its potential to help teachers internalize the performance standards and buy into the scoring process.”

The SMARTER proposal, which was submitted on behalf of 31 states and is 1,294 pages long, won less money from the U.S. Department of Education, although it has more states participating in it than the other winning consortium.

In the Partnership for Assessment of Readiness for College and Careers (PARCC) proposal, which won $170 million and received a higher score than the SMARTER proposal, states can choose whether to use teacher-scorers. Twenty-five states signed on to the PARCC proposal, including eight states that were also party to the SMARTER proposal. The PARCC consortium left some flexibility in its 1,609-page proposal because, in its words, “these assessments will be used for the purposes of teacher and school leader evaluations.” So the fear seems to be that tests scored by teachers might be less reliable, since teachers know that student scores could affect their own job security.

SMARTER says it gets around this potential pitfall by requiring that teachers score only student work from other states.

And there’s a precedent, said Joe Willhoft, an assistant superintendent of public schools in Washington, the lead state in SMARTER: “Many states in our consortium have used teacher scoring in the past and it has been very successful.”

Teacher scoring is also common overseas in high-performing countries, according to Linda Darling-Hammond, one of the advisers to the SMARTER consortium and an expert on performance assessments. For an in-depth look at how it’s worked elsewhere, see this research series that Darling-Hammond headed up. (Disclosure: I did some copyediting for Stanford on several of the papers.)

Not that the PARCC consortium doesn’t see the tests as a way to increase both student achievement and teacher effectiveness. As Mitchell Chester, the Massachusetts schools commissioner, said, these are the “twin goals” driving the creation of the new tests.

The two different proposals set up a potentially interesting experiment on how teaching might be improved by tests, though. One set of states (SMARTER) will go forward involving teachers more intimately in the process of testing, while the other (PARCC) will hold teachers more at arm’s length.

“We all know that teacher involvement is critical,” said Duncan’s chief of staff, Joanne Weiss. In the next few years – the tests are expected to go live in 2014-15 – we may be able to see how critical.