Healthier Testing Made Easy: The Idea of Authentic Assessment

Tests don’t just measure absorption of facts. They teach what we value.

April 3, 2006

Credit: Thomas Reis

Here's a radical idea: We need more assessment, not less.

Seem crazy? Substitute feedback for assessment, and you'll better understand what I mean. The point of assessment in education is to advance learning, not to merely audit absorption of facts. That's true whether we're talking about that fourth-period pop quiz, the school play, or the state test. No one ever mastered a complicated idea or skill the first -- or fifth -- time. To reach any genuine standard, we need lots of trials, errors, and adjustments based on feedback.

Think of assessment, then, as information for improving.

This idea takes a while to get used to if you teach, test, and move on. The research could not be clearer, though: Increasing formative assessment is the key to improvement on tests of all kinds, including traditional ones. And more "authentic" and comprehensive forms of assessment provide not only significant gains on conventional tests but also more useful feedback (because the tasks are more realistic).

What do I mean by "authentic assessment"? It's simply performances and product requirements that are faithful to real-world demands, opportunities, and constraints. The students are tested on their ability to "do" the subject in context, to transfer their learning effectively.

The best assessment is thus "educative," not onerous. The tasks educate learners about the kinds of challenges adults actually face, and the use of feedback is built into the process. In the real world, that's how we learn and are assessed: on our ability to learn from results.

Good feedback and opportunities to use it are extremely important in this scenario. In their seminal report Inside the Black Box: Raising Standards Through Classroom Assessment, British researchers Paul Black and Dylan Wiliam showed that improving the quality of classroom feedback offers the greatest performance gains of any single instructional approach. "Formative assessment is an essential component," they wrote, and its development can raise standards of achievement. "We know of no other way of raising standards for which such a strong prima facie case can be made."

This just makes sense. The more you teach without finding out who understands the information and who doesn't, the greater the likelihood that only already-proficient students will succeed.

Richard J. Light, Walter H. Gale Professor of Education at Harvard University, buttressed these findings in his book Making the Most of College: Students Speak Their Minds:

"The big point -- it comes up over and over again as crucial -- is the importance of quick and detailed feedback. Students overwhelmingly report that the single most important ingredient for making a course effective is getting rapid response on assignments and quizzes. ... An overwhelming majority are convinced that their best learning takes place when they have a chance to submit an early version of their work, get detailed feedback and criticism, and then hand in a final revised version. ... Students improve and are engaged when they receive feedback (and opportunities to use it) on realistic tasks requiring transfer at the heart of learning goals and real-world demands."

Understanding as Transfer

A good education makes knowledge, skill, and ideas useful. Assessment should determine whether you can use your learning, not merely whether you learned stuff.

Achieving transferability means you have learned how to adapt prior learning to novel and important situations. In an education for understanding, learners are constantly challenged to take various ideas and resources (such as content) they encounter and become adept at applying them to increasingly complicated contexts.

When I was a soccer coach, I learned the hard way about transfer and the need to better assess for it. The practice drills did not seem to transfer into fluid, flexible, and fluent game performance. It often appeared, in fact, as if all the work in practice were for naught, as players either wandered around purposelessly or reacted only to the most obvious immediate needs.

The epiphany came during a game, from the mouth of a player. In my increasing frustration, I started yelling, "Give and go!" "Three on two!" "Use it, use it -- all the drills we worked on!" At that point, the player stopped dribbling in the middle of the field and yelled back, "I can't see it now! The other team won't line up like the drill for me!"

That's both a clear picture of the problem and the road to the solution: too many sideline drills of an isolated skill, and not enough testing of it; too great a gap between what the simplified drill was teaching and testing and what the performance demands.

As the authors of How People Learn: Brain, Mind, Experience, and School put it:

"A major goal of schooling is to prepare students for flexible adaptation to new problems and settings. ... Many classroom activities... focus on facts or details rather than larger themes of causes and consequences. ... Understanding how and when to put knowledge to use... is an important characteristic of expertise. Learning in multiple contexts most likely affects this aspect of transfer."

Fair enough, you may say, but what about the tests we are obligated to administer? Shouldn't we just mimic their format? Oddly enough, the answer is no.

Consider: Once a year, we go to the doctor for a physical exam. The doctor performs a few tests, which yield a few useful indicators of one's health.

Now, suppose we are terribly concerned about the final numbers. What we might do, in our panicky state prior to each physical, is practice for it and focus all our energy on it. If our doctor knew of our actions, her response would surely be, "Whoa! You have mixed up cause and effect. The best way to 'pass' your physical is to live a healthful life on a regular basis -- exercise, decrease fat intake, get sufficient sleep, avoid tobacco, etc."

Note that none of the elements of true healthfulness -- your diet, fitness regimen, or management of stress -- are directly tested; doctors use indirect indicators of blood pressure, weight, skin tone and color, and so on. Thus, the effects of your healthful regimen will be reflected in the test indicators.

Like doctors with their patients, state education agencies give schools an annual checkup via such testing that serves as a proxy for real performance. A state test, like a physical, consists of indicators -- a set of items that sample indirectly from the broader domain of the content supposedly addressed through a local educational regimen based on the standards. Our job is to teach to the standards, not the test. Many educators forget this.

High Standards

The local task is to honor the standards, and the state evaluates local work against those standards through its tests. And all state standards identify the kinds of authentic work that should occur in instruction and assessment locally. Here are a few examples:

According to the Georgia Performance Standard ELA4W2, "The student produces a response to literature that

ENGAGES the reader by establishing a context, creating a speaker's voice, and otherwise developing reader interest.
ADVANCES a judgment that is interpretive, evaluative, or reflective.
SUPPORTS judgments through references to the text, other works, authors, or nonprint media, or references to personal knowledge.
DEMONSTRATES an understanding of the literary work.
EXCLUDES extraneous details and inappropriate information.
PROVIDES a sense of closure to the writing."

The Vermont standard H&SS7-8:1 requires that "students initiate an inquiry by asking focusing and probing questions that will lead to independent research and incorporate concepts of personal, community, or global relevance (e.g., What are causes of low voter turnout?)."

Like a physical, the annual test can be useful even if the test questions seem superficial. All the test maker need do is show a correlation between a set of right answers on items and a related set of results on more complicated performances. Further, because the phrasing of the questions is unknown, most of the test questions involve mini-transfer: If you really understand the topic, you should have no trouble handling a question that looks a little different from the questions the teacher asked. If you learned only by rote, however, a novel question will stump you.

None of us likes the overemphasis on these tests, which provide precious little useful and timely feedback (especially in states where the test isn't made public -- most states, unfortunately). But that doesn't mean the test is worthless. So, though it would seem silly to practice the physical exam as a way to be healthy, this is in effect what many teachers do (and are encouraged to do). Educators end up focusing locally on simple tests of "plug-and-chug," or recall, in the belief that this is the best preparation for standardized tests. The format of the test, in other words, misleads us into doing the equivalent of practicing for a physical, instead of teaching for meaning and transfer.

"Intellectual Work" Works

The good news? You can have your cake and eat it, too. Research by Fred M. Newmann and his colleagues on "intellectual work" (previously called "authentic achievement") showed how more real-world and complex performance assessment improves student achievement as measured by national and state tests. Researchers analyzed classroom writing and mathematics assignments in grades three, six, and eight over the course of three years. In addition, they evaluated student work generated by the various assignments. Finally, the researchers examined correlations among the nature of classroom assignments, the quality of student work, and scores on standardized tests.

Assignments were rated according to the degree to which they required "authentic" intellectual work, which "involves original application of knowledge and skills, rather than just routine use of facts and procedures. It also entails disciplined inquiry into the details of a particular problem and results in a product or presentation that has meaning or value beyond success in school."

This study concluded that "students who received assignments requiring more challenging intellectual work also achieved greater than average gains on the Iowa Tests of Basic Skills in reading and mathematics, and demonstrated higher performance in reading, mathematics, and writing on the Illinois Goals Assessment Program. Contrary to some expectations, we found high-quality assignments in some very disadvantaged Chicago classrooms and [found] that all students in these classes benefited from exposure to such instruction. We conclude, therefore, [that] assignments calling for more authentic intellectual work actually improve student scores on conventional tests."

Doesn't this finding again just reflect common sense? The more students receive challenging, interesting work demands, the better they do on simple measures.

Assess, Don't Audit

A good local assessment system does more than audit performance. It is deliberately designed to model authentic work and to improve performance. The aim of teaching is not to master state tests, but to meet worthy intellectual standards. We must recapture the primary aim of assessment: to help students better learn and teachers to better instruct.

All current state and national tests merely audit student performance using simplified indirect test items. Each is a mere once-a-year checkup, like a doctor's physical. Sadly, fearful educators understandably feel driven to teach to the test instead of working to ensure that students meet genuine academic standards (and letting the test results follow naturally). Local faculties unwittingly mimic the audit function locally instead of building a robust feedback system. All other needs for data, such as accountability testing and program evaluation, come second and must not be allowed to pervert the system, as so often happens now.

Assessment tasks must model and demand important real-world work. Focused and accountable teaching requires ongoing assessment of the core tasks that embody the aims of schooling: whether students can wisely transfer knowledge with understanding in simulations of complex adult intellectual tasks. Only by ensuring that the assessment system models such (genuine) performance will student achievement and teaching be improved over time. And only if that system holds all teachers responsible for results (as opposed to only those administering high-stakes testing in four of the twelve years of schooling) can it improve.

Students are entitled to a more educative and user-friendly assessment system. They deserve far more feedback -- and opportunities to use it -- as part of the local assessment process. Those tasks should recur, as in the visual and performing arts and in sports, so there are many chances to get good at vital work. When assessment properly focuses teaching and learning in this way, student self-assessment and self-adjustment become a critical part of all instruction and are themselves assessed.

Tests don't just measure; they teach what we value. Should we provide practice on traditional tests? Of course. But we need to stop thinking like the naive coach who thinks all those drills are sufficient for mastering the game of transfer. Unsatisfying (and sometimes unacceptable) results will continue until we no longer see assessment as mere typical testing and as what you do after teaching and learning are over.

Grant Wiggins, president of Authentic Education, in Hopewell, New Jersey, is coauthor, with Jay McTighe, of Understanding by Design.

Get Started

Authentic Education
Black, Paul, and Dylan Wiliam, "Inside the Black Box: Raising Standards Through Classroom Assessment," Phi Delta Kappan, 80 (2): 139(9)
Bransford, John, Ann L. Brown, and Rodney R. Cocking, eds., How People Learn: Brain, Mind, Experience, and School (National Academies Press, 2000)
Newmann, Fred M., Anthony S. Bryk, and Jenny K. Nagaoka, "Authentic Intellectual Work and Standardized Tests: Conflict or Coexistence?" (Consortium on Chicago School Research, University of Chicago, 2001)