Assessment

What Does the Research Say About Testing?

There’s too much testing in schools, most teachers agree, but well-designed classroom tests and quizzes can improve student recall and retention.

By Carly Berwick

October 25, 2019

Your content has been saved!

Go to My Saved Content.

For many teachers, the image of students sitting in silence filling out bubbles, computing mathematical equations, or writing timed essays causes an intensely negative reaction.

Since the passage of the No Child Left Behind Act (NCLB) in 2002 and its 2015 update, the Every Student Succeeds Act (ESSA), every third through eighth grader in U.S. public schools now takes tests calibrated to state standards, with the aggregate results made public. In a study of the nation’s largest urban school districts, students took an average of 112 standardized tests between pre-K and grade 12.

This annual testing ritual can take time from genuine learning, say many educators, and puts pressure on the least advantaged districts to focus on test prep—not to mention adding airless, stultifying hours of proctoring to teachers’ lives. “Tests don’t explicitly teach anything. Teachers do,” writes Jose Vilson, a middle school math teacher in New York City. Instead of standardized tests, students “should have tests created by teachers with the goal of learning more about the students’ abilities and interests,” echoes Meena Negandhi, math coordinator at the French American Academy in Jersey City, New Jersey.

The pushback on high-stakes testing has also accelerated a national conversation about how students truly learn and retain information. Over the past decade and a half, educators have been moving away from traditional testing—particularly multiple choice tests—and turning to hands-on projects and competency-based assessments that focus on goals such as critical thinking and mastery rather than rote memorization.

But educators shouldn’t give up on traditional classroom tests so quickly. Research has found that tests can be valuable tools to help students learn, if designed and administered with format, timing, and content in mind—and a clear purpose to improve student learning.

Not All Tests Are Bad

One of the most useful kinds of tests are the least time-consuming: quick, easy practice quizzes on recently taught content. Tests can be especially beneficial if they are given frequently and provide near-immediate feedback to help students improve. This retrieval practice can be as simple as asking students to write down two to four facts from the prior day or giving them a brief quiz on a previous class lesson.

Retrieval practice works because it helps students retain information in a better way than simply studying material, according to research. While reviewing concepts can help students become more familiar with a topic, information is quickly forgotten without more active learning strategies like frequent practice quizzes.

But to reduce anxiety and stereotype threat—the fear of conforming to a negative stereotype about a group that one belongs to—retrieval-type practice tests also need to be low-stakes (with minor to no grades) and administered up to three times before a final summative effort to be most effective.

Timing also matters. Students are able to do fine on high-stakes assessment tests if they take them shortly after they study. But a week or more after studying, students retain much less information and will do much worse on major assessments—especially if they’ve had no practice tests in between.

A 2006 study found that students who had brief retrieval tests before a high-stakes test remembered 60 percent of material, while those who only studied remembered 40 percent. Additionally, in a 2009 study, eighth graders who took a practice test halfway through the year remembered 10 percent more facts on a U.S. history final at the end of the year than peers who studied but took no practice test.

Short, low-stakes tests also help teachers gauge how well students understand the material and what they need to reteach. This is effective when tests are formative—that is, designed for immediate feedback so that students and teachers can see students’ areas of strength and weakness and address areas for growth. Summative tests, such as a final exam that measures how much was learned but offers no opportunities for a student to improve, have been found to be less effective.

Testing Format Matters

Teachers should tread carefully with test design, however, as not all tests help students retain information. Though multiple choice tests are relatively easy to create, they can contain misleading answer choices—that are either ambiguous or vague—or offer the infamous all-, some-, or none-of-the-above choices, which tend to encourage guessing.

While educators often rely on open-ended questions, such short-answer questions, because they seem to offer a genuine window into student thinking, research shows that there is no difference between multiple choice and constructed response questions in terms of demonstrating what students have learned.

In the end, well-constructed multiple choice tests, with clear questions and plausible answers (and no all- or none-of-the-above choices), can be a useful way to assess students’ understanding of material, particularly if the answers are quickly reviewed by the teacher.

All students do not do equally well on multiple choice tests, however. Girls tend to do less well than boys and perform better on questions with open-ended answers, according to a 2018 study by Stanford University’s Sean Reardon, which found that test format alone accounts for 25 percent of the gender difference in performance in both reading and math. Researchers hypothesize that one explanation for the gender difference on high-stakes tests is risk aversion, meaning girls tend to guess less.

Giving more time for fewer, more complex or richer testing questions can also increase performance, in part because it reduces anxiety. Research shows that simply introducing a time limit on a test can cause students to experience stress, so instead of emphasizing speed, teachers should encourage students to think deeply about the problems they’re solving.

Setting the Right Testing Conditions

Test achievement often reflects outside conditions, and how students do on tests can be shifted substantially by comments they hear and what they receive as feedback from teachers.

When teachers tell disadvantaged high school students that an upcoming assessment may be a challenge and that challenge helps the brain grow, students persist more, leading to higher grades, according to 2015 research from Stanford professor David Paunesku. Conversely, simply saying that some students are good at a task without including a growth-mindset message or the explanation that it’s because they are smart harms children’s performance—even when the task is as simple as drawing shapes.

Also harmful to student motivation are data walls displaying student scores or assessments. While data walls might be useful for educators, a 2014 study found that displaying them in classrooms led students to compare status rather than improve work.

The most positive impact on testing comes from peer or instructor comments that give the student the ability to revise or correct. For example, questions like, “Can you tell me more about what you mean?” or “Can you find evidence for that?” can encourage students to improve engagement with their work. Perhaps not surprisingly, students do well when given multiple chances to learn and improve—and when they’re encouraged to believe that they can.