Finding common ground with grading systems
We've seen the differences. Now what are the similarities?
As David and I write and engage with others about grading, there’s definitely a sense that the time is coming, and maybe is already here, for a wholesale change in how we grade in higher education. David wrote last week about the profusion of alternative grading techniques that are out there, and I think the sheer variety signifies a deep and widespread desire to make this change. People are realizing that reforming assessment and grading can have outsized results in improving higher education as a whole. It’s one of those places where 20% of the effort will produce 80% of the results.
But the variety can also be overwhelming. Instructors might say, I want to change my grading practice, but should I go with specifications grading? Standards-based grading? Ungrading? Contract grading? Most real-life approaches to alternative grading don’t fit neatly into any of those boxes, and often none of these general categories will be a perfect fit to your students in your classes. And how are we supposed to keep up with all these terms? Do you have to be an expert even to get started?
It seems smarter to focus on the overall ideas that unify these different approaches. So this week, rather than introduce another kind of grading practice, we’re going to pull back to a higher altitude, try to distill what all these ideas have in common, and come up with a general framework for these practices. Not a “definition” of anything (there are still too many idiosyncrasies and varied practices to hope for something that’s both precise and general) but instead a map, with room for interpretation, that stakes out some of the common ground that we seem to be walking together.
Despite the differences in the ways that all these grading practices are worked out in real classrooms, what do they seem to have in common? Here’s what I see:
Student work is evaluated against clearly defined and context-appropriate standards for what constitutes “acceptable work”. In other words, the systems are rooted in students knowing what acceptable work looks like, using standards that are professionally appropriate but scaled to the level of the student. Standards-based grading and specifications grading are obviously built on this principle (just look at the names). Ungrading advocates might disagree (see Alfie Kohn’s famous essay “The Trouble with Rubrics”). But even when ungrading, although you might not use a concrete rubric, you are still making decisions about whether student work is “good enough” or not. Presumably those decisions aren’t just made by “gut feel” (which is one way of saying “personal bias”) but through standards that you, as a content expert, believe are appropriate for determining quality. In other words, we’re all using standards. Ethics and common decency would say we should externalize those and be up-front with students about it, and so that’s part of the system.
Student work, when evaluated, is given helpful, actionable feedback that the student can and should use to learn and improve their work. Feedback is the beating heart of all of these practices. Traditional grading looks at student work, assigns a number or a letter to it — and that’s all. It gives student work the silent treatment. In all these alternative practices, instead, the students’ work opens up a conversation and initiates a feedback loop.
Student work doesn’t have to receive a mark, but if it does, the mark is a progress indicator and not an arbitrary number. The alternative practices we’ve mentioned here all share the realization that marks, if given, are just at-a-glance summaries of what the feedback says, nothing more. They are there primarily for convenience and for entry into a gradebook. In particular, these grading practices do not pretend that numbers assigned to student work (75%, 8/10, etc.) are numerical data. They are not. They are categorical data disguised in numerical form, like zip codes, and the statistical contortions used by traditional grading to convert those numbers into letter grades are fundamentally irrelevant and merely give the illusion of objectivity. (“Objectivity theater” is how it’s been described.) It would probably be better to dispense with marks altogether, as ungrading typically does, given their tendency to distract and demotivate students. But if we must put marks in a gradebook, they should be informative categorical data rather than fake numerical data.
Students can revise, resubmit, or reattempt work without penalty, using the feedback they receive, until the standards are met or exceeded. All of these alternative frameworks are predicated on feedback loops. This seems to be their defining and essential ingredient. They don’t only have clear and appropriate standards and regular streams of feedback: They also allow students to combine their work, the standards, and the feedback and then try again. It’s in the trying again that grading turns into growth. And we don’t penalize this, because what kind of person penalizes growth?
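To make the third observation above a little more concrete, here is a minimal sketch of the difference between averaging points and tallying categorical marks. Everything in it is made up for illustration: the level names, the two students, and the pairing of point scores with marks are all assumptions, not part of any particular grading system.

```python
from collections import Counter
from statistics import mean

# Hypothetical mark levels, ordered from least to most progress.
# These names are illustrative only.
LEVELS = ["Not yet", "Revisable", "Meets standard", "Exceeds standard"]


def summarize(marks):
    """Tally categorical marks by level instead of averaging them."""
    counts = Counter(marks)
    return {level: counts.get(level, 0) for level in LEVELS}


# Two hypothetical students whose point scores average to the same number...
points_a = [70, 70, 70, 70, 70]
points_b = [40, 40, 90, 90, 90]

# ...but whose categorical records (assumed here) tell different stories.
marks_a = ["Revisable"] * 5
marks_b = ["Not yet", "Not yet",
           "Meets standard", "Meets standard", "Meets standard"]

print(mean(points_a), mean(points_b))  # the averages are identical
print(summarize(marks_a))              # five assignments still need revision
print(summarize(marks_b))              # three standards met, two not yet
```

The averages hide exactly the thing a student would want to know: which pieces of work meet the standard and which need another attempt. The tally keeps that information visible.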
Not a definition
There is a temptation at this point to look to the four observations I’ve just made and turn them into a definition of a general category of grading, with a special name, of which SBG, specifications grading, etc. are all instances. (David and I are mathematicians, after all — abstraction is what we do.) But I am going to resist that temptation, and I think you should too, for two reasons.
First, definitions are exclusionary by nature. When you define a thing, you draw a line between instances of that thing and non-instances of it, and the “canonical” instances tend to receive pride of place. This is OK in some situations (e.g. defining terms in mathematics so you can meaningfully prove theorems about them) but in other situations, especially education, it tends to be highly counterproductive because it locks people out unnecessarily. If you’re thinking of instituting a grading system that involves a lot of feedback and revision, but for whatever reason you still want to assign points to things, you shouldn’t feel left out of this conversation or pressured to do things a different way because a definition said so. If you’re an ungrader and feel that some of the observations above don’t quite fit what you’re trying to accomplish, you should still feel welcome at the table and able to have a real conversation about student success with someone who does specifications grading.
Second, definitions of educational ideas, in my experience, tend to derail people’s focus. I learned this when writing my flipped learning book. Flipped learning at the time needed an operational definition that made it possible for people to do research about it and made it OK for instructors not to use video. So I came up with one, but a lot of faculty stopped asking good questions about flipped learning (What’s the best way to use class time if I’m not lecturing?) and instead focused on whether what they were doing was “real” flipped learning or not. So rather than give a definition of “Proficiency Grading” or “Awesome Grading” or whatever you might want to call it, let’s just not, for now, and focus instead on how best to do whatever it is we are describing here.
Four Pillars (beta version)
So we are setting up a big tent with a lot of room underneath for anybody who wants to think about the sort of grading approaches being described here. Stealing shamelessly from our friends in the IBL community (specifically the “pillars of IBL teaching”), I’d like to close here by visualizing this “tent” as a building with four pillars.
(A graphic designer I am not.) As advertised, this is a beta version, not in any way guaranteed to be complete or even correct. In fact David has already informed me that I need to work on this some more. (I mean, are those pillars even touching the pediment? What kind of physics are we using here? — DC) But that’s what the comment section is for, and anyway I think it’s more useful than a definition of a term.
In fact, what I hope is that in the near future, what we’re describing here won’t need a special term: it will just be “grading”, and grading using these practices will be so normative that it’s the departures from these practices that will need special terminology.
Since we first wrote this post, we have written more detailed posts on each pillar. Check them out!
Pillar 1: How to write standards (and a follow-up: What does it mean to meet a standard?)
Pillar 2: The care and feeding of helpful feedback
Pillar 3: Giving marks that indicate progress (follow-up: More reasons to avoid using numbers for grades)
Pillar 4: The heart of the feedback loop: Reattempts without penalty
Every definition of flipped learning up to that point had stated that students must watch videos prior to class. This was even used as one of the exclusion criteria in one of the most cited early research reviews on flipped learning at the time. It was a dumb criterion to have, so I fought back with my own definition. Fortunately I don’t think grading suffers from that kind of issue for now. But more precise definitions might be necessary in the future for research purposes; we’ll see.
In fact, it has been called something before: Mastery grading, or sometimes “mastery-based grading”. There are several issues with this term, none of which I am going to discuss here and now. The point is to focus, for now, on the thing itself rather than what the thing is called.
For a long time I’ve said the same thing about flipped classrooms (“Eventually we’ll just call it ‘the classroom’”) but Sharona Krinsky, our friend and the main driver of the annual Grading Conference, is the one who’s said this the most about grading.