Grades can wield enormous power over students. Course grades can influence students’ self-esteem and self-worth and impact how they understand their own identity. Grades can also be influential in admissions processes that have a long-term impact on a student’s trajectory.
Other fields that face similar challenges can offer guidance on best practices for grading. In particular, test validity studies offer a useful framework for formatively assessing our grading methods.
Consider Grade Validity
Creating a meaningful grade is difficult. A grade should communicate something important about a student. It should measure subject mastery, growth, or another element that a teacher deems useful. At the same time, measuring analytic reasoning or reading comprehension is inherently subjective and complex. Without direct measures, teachers rely on indirect evidence such as tests, assignments, and observations.
This challenge is not unique to education. Psychometrics, the science of measuring mental capacities, has struggled with it for decades. In response to these challenges, psychometrics developed a concept of validity. As it applies to course grades, it asks “to what degree—if at all—on the basis of evidence and rationales, should the [grade] be interpreted and used in the manner proposed?” In simple terms, does a grade mean what we say it means?
Educators already employ a variety of strategies to alleviate grade validity concerns, but psychometrics offers some insights that may further strengthen these efforts. Here are three steps, derived from those insights, that can improve the validity of grades.
Define the meaning: The first step to improving validity is defining a grade’s intended meaning. In my own experience as a student and teacher, I’ve encountered quite a variety of answers to the deceptively complex question “What does this grade mean?” Some answers included a student’s positional rank among his or her peers, content mastery, participation or effort, growth, or some combination. Educators will have their own answer to this question, and their institutions may define or limit their answer, but a clear construction of the intended meaning creates a useful reference point for assessment.
To help decide the meaning of your grade, ask yourself: Who are your primary stakeholders, and how will they use the grade? For example, an elementary school teacher may expect parents to use grades to check their student’s understanding of the material. In that case, a content mastery meaning would fit best. Alternatively, a high school teacher may expect colleges to use grades to differentiate student applications. In that case, a relative performance meaning would better align with the expected use. There may be competing uses, but avoid the pitfall of selecting multiple meanings. A grade should communicate one clear message.
Include all relevant criteria: With a clear meaning in mind, the next step is to ask whether the grading method incorporates all and only relevant criteria. A grade becomes more valid the more relevant evidence it includes and the more irrelevant evidence it excludes. For example, a math teacher who grades for content mastery should ensure that every math unit contributes to the final grade. The more units she excludes, the greater the distance between what the grade is intended to mean (mastery of all units) and what it actually means (mastery of some units).
Consider revising grading practices that drop test scores. Dropping tests often excludes relevant information from the final grade. Avoid nonacademic factors such as behavior and attendance because they are often irrelevant to the grade’s meaning. Administrative penalties may also fail to reflect a grade’s intended meaning. Students who submit late work or forget to write their name need help with developing professionalism, but these missteps rarely say anything about mastery.
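To make the arithmetic concrete, here is a minimal sketch of how these choices can shift a grade that is meant to signal content mastery. The unit names, scores, and the 10-point late penalty are all hypothetical, chosen only for illustration; they do not come from any real gradebook or policy.

```python
from statistics import mean

# Hypothetical mastery scores (percent) for one student, one per math unit.
unit_scores = {"fractions": 92, "decimals": 88, "ratios": 55, "geometry": 90}


def mastery_grade(scores):
    """Average every unit score, so the grade reflects all units."""
    return mean(scores.values())


def grade_dropping_lowest(scores):
    """Average after dropping the single lowest unit score."""
    kept = sorted(scores.values())[1:]
    return mean(kept)


def grade_with_late_penalty(scores, penalty_points):
    """Subtract an administrative penalty from the all-units average."""
    return mastery_grade(scores) - penalty_points


all_units = mastery_grade(unit_scores)
dropped = grade_dropping_lowest(unit_scores)
penalized = grade_with_late_penalty(unit_scores, penalty_points=10)

print(f"All units included:  {all_units:.2f}")
print(f"Lowest unit dropped: {dropped:.2f}")
print(f"With late penalty:   {penalized:.2f}")
```

In this made-up case, the same student receives three different numbers. Dropping the lowest unit hides a weak result in ratios, while the penalty lowers the grade for a reason unrelated to mastery. Only the first figure uses all of the mastery evidence and nothing else.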
Consider the consequences: The final step is to consider how the grade will impact students. Validity is not blind to real-world consequences, and even the most carefully constructed grade can suffer from validity issues if it harms students. A teacher who grades on relative performance and awards only one A may accurately differentiate students, but the process may create unintended problems such as students becoming excessively competitive and disincentivized from studying together.
It’s difficult to predict unintended social consequences, but asking questions can help:
- How will the grading method impact student relationships? Some practices may create too competitive an environment and discourage communal learning.
- How will the grade affect students’ desire for lifelong learning? In practice, this may mean allowing test retakes so that students can reclaim their self-confidence and desire to learn, or it could mean adopting a no-zero policy to help avoid students’ failing and giving up on a subject.
Whatever meaning we want grades to have, our method of constructing them should reference that meaning and respect their impact on students’ lives.