Using collaborative grading as a new professor
Benefits and challenges of moving away from percentage-based grading
Today’s guest post is from Noel Warford. Noel is a visiting assistant professor of Computer Science at Oberlin College in Oberlin, Ohio, about thirty minutes west of Cleveland. He uses alternative grading in classes at all levels of the curriculum, but has especially experimented with this in upper-level classes on human-computer interaction. Noel received his Ph.D. from the University of Maryland, where he specialized in digital safety for at-risk users. He is also a professional pipe organist and worked as a church musician for about a decade.
In spring 2024, I started teaching at Oberlin College, shortly after defending my dissertation. I was returning to my alma mater after a well-timed conversation with an old professor of mine, who is now my colleague and department chair. Due to the timing of the semester and my defense, I had one month to prepare for two classes - our introductory computer science course and a new course at the institution, Introduction to Human-Centered Security, the focus of this article.
Oberlin College is a small[1] liberal-arts college in Northeast Ohio that has a storied history of activism and radical acceptance[2]. It partners closely with Oberlin Conservatory, an undergraduate-only music conservatory. Introduction to Human-Centered Security is an upper-level elective for the Computer Science major, one of the largest majors on campus. Despite the number of majors, most upper-level courses (including the one I am discussing today) are capped at 20-25 students, typically in their last two years of the degree. I will be comparing two versions of the course: my first semester, taught with a traditional grading system, and my second, taught with a collaborative grading framework.
The first version: traditional grading
The first version of this course was traditionally graded, with a roughly 25% split across four types of assignments: Reading Engagements (i.e., class prep assignments), a 20-minute presentation on a paper in the field, a semester-long project wherein small groups replicated an existing study in human-centered security at a small scale, and participation (based mostly on attendance).
To be blunt, this assessment scheme did not work for me or the students. Below are several of the questions both the students and I found ourselves asking over the course of the semester:
Why is a 20-minute presentation weighted the same as a semester-long project?
What happens to my grade if a group member doesn't pull their weight? Will the rest of the group be punished?
I had a very strict 'no late submissions' policy for Reading Engagements - what were students supposed to do if they got very sick or had a major life event interrupt their academic schedule?
How, in fact, did I evaluate participation? My syllabus included language like "Your class participation grade will be based on the quantity and quality of your contributions in class" - what does that mean?
I often found myself floundering both when students asked these questions of me and when I asked them of myself. I also felt that I should not modify the syllabus or make exceptions to course policies (like the strict deadlines for Reading Engagements), as I was worried that doing so would decrease the students’ perception of the fairness of the class.
Most of these reflections on difficulty are post-hoc. During the semester, I thought things were going well, despite the above questions — I largely attributed this to the natural difficulty of teaching for the first time. However, harsh student evaluations made it clear that, despite my perception, many students were frustrated with the course. As a result, I realized that I wanted to make some changes to improve things for next time.
The second version: collaborative grading
Over the summer of 2024, still reeling from poor student evaluations, I happened across the Grading for Growth book. I wish I had a tidy origin story for where and how I found it, but I don't remember! Nevertheless, the promises of an alternative grading system were enticing, so I set about revising my class entirely, moving to a collaborative grading model.
In this new version of the class, I shifted entirely to giving feedback rather than points. The assignments for this version of the class were as follows:
Reading Engagements. These were effort-based assignments to be done ahead of each class. The basic instructions for this assignment were to show me that the student had engaged in some way with the reading. They were free to use any medium they preferred. One student exclusively turned in voice memos all semester.
Concept Application. This was a single assignment where students applied course concepts from the first few weeks of class (such as analytic frameworks of privacy and security) to three different scenarios. I gave written feedback and asked for revisions if students clearly misunderstood or underexplained core concepts.
General Audience Communication. This was a midterm group assignment where the students created material that explained a security or privacy concept to a lay audience. I gave written feedback to each group, but as many of them made videos or other time-intensive communications, I did not ask for revisions here.
Activists’ Privacy Guide. This was a final project where the students constructed a privacy guide for political activists as an entire class (i.e., the final product was one big guide to which every student contributed). This guide was meant to translate digital-safety concepts into actionable steps activists could take to protect themselves online. I gave several rounds of written feedback on components of the work completed by individual students and small groups, as well as on the final product, which was produced at the class level.
Over the course of the semester, students completed two structured reflections (inspired by Susan Blum's examples in Ungrading) to determine both their midterm and final grades. Both reflections were extensive, with the final one being longer by virtue of covering the entire semester rather than just the first half. For each reflection, students assembled a portfolio of their assignments, answered some short- and long-answer questions, and then individually discussed their reflection and portfolio with me; in that conversation, we came to a consensus on their grade. The final discussions took place during our final exam period. I provided narrative criteria for the grade levels, quoted from the syllabus as follows:
A student who earns an A can clearly and concisely explain usable security concepts and apply them to real scholarly problems. They demonstrate creativity and clarity of communication when applying usable security concepts to novel problems, and complete all major projects in a manner which exceeds expectations[3]. They demonstrate excellent group work skills, both in their contributions to projects and in group management.
A student who earns a grade of B has demonstrated a solid understanding of usable security concepts and has completed all major projects. Their project materials are well-constructed but do not exceed expectations for the course. They demonstrate good group work skills, both in contributions and group management, or do remarkably well in one of those two categories while doing much less in the other (e.g., a student who does a great job of making sure everyone stays on task but doesn't contribute much work themselves).
A student who earns a grade of C has demonstrated a basic understanding of usable security concepts and has made a reasonable attempt at all major assignments. The student has contributed to group work in a modest way.
A D in this class indicates a good-faith, but unsuccessful, effort to earn a C. An F is given in cases of no evidence of meaningful progress and usually results from a near-complete disengagement from the course.
At the start of the semester, many students worried that they would not feel motivated to do any work for the class, which I suspect is common for students encountering collaborative grading for the first time. Some, however, expressed cautious optimism about the benefits of having less grade pressure. We revisited my assessment policy more than once during the semester, as I felt a little uncertain myself employing this strategy for the first time.
Surprises
This shift to collaborative grading had two main unexpected outcomes. The first was the ability to course-correct mid-semester. Originally, I had planned two assignments that I ended up removing from the schedule entirely: an individually completed literature review on a human-centered security topic of the students’ choice, and a grant proposal assignment where groups would propose new research projects. After the 2024 U.S. presidential election, the students and I decided collaboratively to work on a digital-safety resource for activists (described above) instead of the grant proposal project I had originally planned. Although it was my prerogative to change assignments in my traditionally graded course, doing so would have required me to think through questions like “what percentage should this new assignment take up?” and “how can I manage student perceptions of the fairness of their grade calculation?” Since the second version of the class was collaboratively graded, I found it much smoother to make a dramatic change mid-semester.
The second was an enormous increase in the completion of low-stakes work. In both semesters, I assigned Reading Engagements, but with different submission policies. In semester 1, I was very strict: no Reading Engagements could be turned in after that class period, and they made up 25% of the grade. If a student missed one, tough - they lost the associated points. In semester 2, since marks on Reading Engagements no longer counted as a percentage of the grade, I felt I could be much more lenient on deadlines, accepting them until the end of the semester. This did not add much extra load on my end, since I did not give detailed feedback on this assignment. The big surprise for me here was how often students went back to old readings and completed them. Many students, now relying on internal motivation, found that they actually noticed what they were missing by not doing the readings, and several were able to bring concepts from later in the course into their reflections.
Lessons Learned
Overall, I consider my transition to collaborative grading a success, but with some serious caveats. The students appreciated the flexibility, and the class as a whole felt much less adversarial in round two. In addition, I noticed a large increase in engagement, in terms of both work submitted and class discussion. Students repeatedly emphasized the value of the mindset shift toward internal rather than external motivation for getting things done in my class.
The first challenge I think I will face when teaching a collaboratively graded class in the future is finding a better way to assess understanding. Although students were generally good at assessing their level of engagement with the course, I don’t think I provided them enough feedback on their understanding of core concepts. When it came time for the final reflection, I noticed that even students whose understanding of course content I knew to be shakier, whether from my feedback on their Concept Application assignment or from individual conversations, rated themselves as having a high understanding of the course material. In the future, I want to find a way to explicitly connect feedback on mid-semester assignments to that final evaluation.
The second core challenge will be to make sure that I am clear about expectations, both to my students and to myself. The narrative grade criteria reference meeting or exceeding expectations for student work. However, while writing this article, I realized that I never actually clearly communicated those expectations, which is embarrassing in hindsight! My internal expectations were, generally, that students should demonstrate their ability to understand and apply various core frameworks from the discipline of human-centered security, and to analyze and evaluate the validity and results of scientific work in the field. However, I was not clear to them or, frankly, to myself about how they should demonstrate this. I believe that my assignments as written did, in fact, get them to practice some of these skills. But I still have a lot of work to do as a teacher to make sure I am evaluating these expectations and communicating them to my students.
I would encourage other readers of this publication to carefully consider these challenges in collaborative grading, and I would love to chat with anyone who has found good ways to address them!
[1] In comparison to most schools! It is quite large (~3000 students) compared to many other liberal arts colleges.
[2] Oberlin was the first American school of higher education to admit Black students (1835) and women (1837).
[3] More on my expectations in the “Lessons Learned” section at the end of this post!