Today we bring you a guest post written by two of our colleagues and conference co-organizers. Sharona Krinsky and Robert Bosley have been working and presenting together on Grading for Growth for the last six years. They both teach in the Mathematics department at California State University, Los Angeles, and are organizers of The Grading Conference for Higher Ed STEM and grade 7-12 STEM faculty. Sharona holds a master’s degree in Mathematics from The Ohio State University, and Bosley holds a master’s degree in Curriculum and Instructional Design of Mathematics from the University of Texas at Arlington.
In their post Finding Common Ground with Grading Systems, Robert Talbert and David Clark write that the first piece of common ground among all alternative grading systems is that “Student work is evaluated against clearly defined and context-appropriate standards for what constitutes ‘acceptable work’.” One of the questions we hear most often is: Who decides what is “acceptable work”? This is easily done with rubrics by an individual instructor working independently in a course that stands alone. But how do we do it in a group setting, such as a course with multiple graders or a coordinated course? Is the use of rubrics enough to ensure common grading practice among members of the group?
In his 2011 study, Hunter Brimi [1] asked 90 high school teachers, each of whom had received nearly 20 hours of training in a writing assessment program, to grade the same student paper using a 100-point percentage scale as the rubric. Among the 73 teachers who responded, scores ranged from 50 to 96. This was with teachers trained in writing assessment! If teachers trained for over 20 hours cannot give the same paper the same grade, how can graders in the same course or a coordinated course be expected to do so? In this post, we will describe how we bring together all the instructors from a course to determine acceptable work through a process called normed grading.
One possible solution is what we have done with the Cal State LA coordinated Quantitative Reasoning with Statistics general education course, Math 1090/1092. A brief history: in 2017, then-chancellor of the California State University system Timothy White issued Executive Order 1110, part of which called for the elimination of remedial mathematics courses throughout the CSU system, a system of 23 universities that serves nearly 500,000 students in the state of California. At California State University, Los Angeles, Sharona was tapped as one of the instructors responsible for redesigning Math 1090 - Quantitative Reasoning with Statistics. The purpose of this course is to provide a general education course in which students of any major become critical consumers of statistics. As part of the redesign, Sharona and her co-coordinator, Dr. Silvia Heubach, decided to adopt standards-based grading as the grading system for the course.
At the same time, we (Sharona and Bosley) were co-leading a dual-enrollment version of the Math 1090 course through a local non-profit, College Bridge. Students from local area high schools were enrolled in Math 1090 at Cal State LA, and the courses were taught on site at the high schools in a team-teaching environment pairing a high school teacher with a university instructor. As part of that program, the entire team of teachers and instructors met regularly to grade course exams together. This process, called normed grading, is well known in the K-12 educational environment.
Fast forward to fall 2018, when the redesigned Math 1090 course launched at Cal State LA, with over 2,000 students enrolled in the program in individual sections of no more than 25 students. We had over 20 instructors teaching the newly redesigned course, most of whom had never taught a standards-based course. The focus on student success meant ensuring that the student experience was similar regardless of which section a student was in, and that students were able to succeed across the board. Central to that similarity was instructor consistency in what constituted “acceptable work” for determining whether students had met a given standard.
With Bosley’s experience with normed grading in the K-12 environment, Sharona’s experience teaching standards-based classes, and Silvia’s experience mentoring instructors and graduate students, we were able to design and implement a process whereby all the instructors came together in weekly meetings to decide as a group what was considered “acceptable work” for each standard on each assessment. Our normed grading process uses the following steps:
1. Each assessment is aligned to the specific learning outcome (standard) being assessed. Over the semesters of teaching this class, we have had ongoing conversations to home in on exactly what content we want to see our students demonstrate that they know or can work with, and we have iteratively rewritten our assessments to enable students to highlight the content knowledge and/or skills we are looking for.
2. We decide on the appropriate type of rubric for the class/standard/assessment. In the case of Math 1090, we are using a four-level rubric where the top two levels are both considered acceptable and the bottom two levels are not yet acceptable.
3. We meet and grade several pieces of student work to determine where the line between acceptable and not yet acceptable is crossed. Each instructor decides individually whether a given piece of work meets their definition of acceptable, and then all instructors reveal their determinations simultaneously. If there is disagreement, a detailed discussion follows. This is the heart of the norming process: all instructors look at the exact same (anonymized) student work and have robust conversations about why that particular work does or does not meet our expectations. We look for work that sits on the borderline between acceptable and not acceptable, and come to a consensus about where that line is drawn. (A minimal sketch of this step appears after this list.)
4. We continue this process iteratively until we reach consensus. Once we have consensus about what acceptable work looks like, each instructor grades their own classes, with spot checking by the coordinator for consistency. Over the course of the program, only once has Sharona spotted an inconsistency between the agreed-upon norm and an individual instructor’s grading.
5. Instructors provide feedback to the coordinators about how well the assessment measures the intended outcome, so we can loop back and improve the assessment for the future.
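To make the simultaneous-reveal step concrete, here is a minimal illustrative sketch in Python. It is not software we actually use, and the instructor labels, sample names, and levels are hypothetical; it simply assumes the four-level rubric described above, maps levels 3-4 to “acceptable,” and flags any anonymized sample on which the instructors’ determinations split, since those are the pieces that trigger the detailed group discussion.

```python
# Hypothetical sketch of tallying one norming round (not our actual tooling).

ACCEPTABLE_LEVELS = {3, 4}  # top two levels of the four-level rubric count as acceptable


def is_acceptable(level: int) -> bool:
    """Map a four-level rubric score to the binary acceptable / not-yet decision."""
    return level in ACCEPTABLE_LEVELS


def norming_round(ratings: dict[str, dict[str, int]]) -> list[str]:
    """Given the rubric level each instructor privately assigned to each anonymized
    work sample, return the samples whose acceptable/not-yet determinations disagree
    and therefore need a group discussion."""
    needs_discussion = []
    for sample, by_instructor in ratings.items():
        decisions = {is_acceptable(level) for level in by_instructor.values()}
        if len(decisions) > 1:  # instructors split across the acceptable line
            needs_discussion.append(sample)
    return needs_discussion


# Hypothetical data: sample -> instructor -> rubric level.
ratings = {
    "sample_A": {"instructor_1": 4, "instructor_2": 3, "instructor_3": 3},  # consensus: acceptable
    "sample_B": {"instructor_1": 3, "instructor_2": 2, "instructor_3": 2},  # split: discuss
}

print(norming_round(ratings))  # -> ['sample_B']
```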
So, what is “acceptable work”? In our case, acceptable work is defined by the group of instructors teaching the course, at the level of each assessment. This definition is communicated to students through several avenues. The primary method is specific, individualized feedback provided on assessments, supplemented with detailed descriptions within the Student Learning Outcomes and example problems written in alignment with the assessments. Additionally, as a group we continually refine our assessment questions to give students better guidance on providing evidence of the knowledge, understanding, and competency we are looking for.
How do we make sure that we have common grading in a group? By working together to grade specific student work simultaneously, discuss our determinations of what is acceptable, and come to a consensus.
As of this writing, we have begun our eighth semester teaching the redesigned course. Our group of instructors has stabilized, with most teaching the course multiple times. We continue to refine and improve all aspects of the course, and we strive to enhance the opportunities for our students to succeed. By coming together as a team, our overall understanding of what we are teaching our students, why we are teaching this material, and how students can demonstrate success is greatly enhanced. [2]
[1] Brimi, Hunter (2011). Reliability of Grading High School Work in English. https://doi.org/10.7275/j531-fz38
[2] For more information about our Quantitative Reasoning with Statistics redesign, please see our article: Silvia Heubach & Sharona Krinsky (2020). Implementing Mastery-Based Grading at Scale in Introductory Statistics. PRIMUS, 30(8-10), 1054-1070. https://doi.org/10.1080/10511970.2019.1700576