“Prologue: Mallory’s Dilemma” excerpted from Grading for Equity: What It Is, Why It Matters, and How It Can Transform Schools and Classrooms by Joe Feldman. Thousand Oaks, CA: Corwin, 2019.
This is the first article in a two-part series about equitable grading practices. This article sets up some of the challenges. In part two, learn how teachers are addressing this issue.
By Joe Feldman
The data couldn’t be possible. Actually, it shouldn’t be possible.
Mallory had just completed her first year as principal of Centennial College Prep Middle School, a new public charter school in Huntington Park, California. As a young, white woman leading a school that served nearly all Latino students, many living below the poverty line, Mallory had approached her job humbly, not immediately pushing initiatives and changing policies to align to her own personal vision (what she called the “new sheriff in town approach”). Instead, her priority was to first understand her school community: its context, history, strengths, and needs. She had watched, listened, and built relationships with her faculty, students, and their families. She had visited classrooms, reviewed teachers’ lesson plans, and studied the school’s statistics: attendance percentages, disciplinary referrals, and test scores.
Whether the data she reviewed was “hard” data like test scores or “soft” data like her observations of teacher-student dynamics in classrooms, Mallory kept a sharp lookout for how the school could be made more equitable. Mallory’s vision was that students should have equal opportunities for success regardless of their ethnicity, first language, gender, income, or special needs. She paid attention to patterns of unequal achievement or opportunity in her school. For example, were boys being referred more frequently to the office? Were poorer students showing a common weakness on a strand of skills on the writing assessment? Did students who received special education services have a higher rate of absenteeism?
But that wasn’t all. To Mallory, one of the most important indications of a high-quality, equitable school is that students are successful regardless of their teacher.
One teacher’s students shouldn’t learn different material or be less prepared for the next grade than another teacher’s students. Fortunately, based on her classroom visits and other data, Mallory found that although teachers approached their work in ways that reflected their individual backgrounds and personalities, students’ learning experiences were generally consistent across classrooms. Students in the same course taught by two different teachers—such as Ms. Thompson’s and Ms. Richardson’s sixth-grade English classes—were learning the same skills, reading the same books and essays, getting the same homework, receiving similar support, and taking the same tests. Mallory was confident that regardless of their sixth-grade teacher, students would be similarly prepared for seventh-grade English.
Since teachers were aligned with what and how they were teaching, and because the school didn’t track students or create unbalanced classes where one sixth-grade English class would be stronger than the others, Mallory reasoned that by all accounts the performance of students should be comparable across teachers of the same course. In other words, the rate of As, Bs, Cs, Ds, and Fs in any course should be relatively similar for each teacher of that course. But that wasn’t happening. Strange things were showing up in the data.
Take, for example, her school’s sixth-grade math and English classes, each taught by three different teachers:
If you were a student in two of the three teachers’ math classes, you had about a 20 percent chance of getting a D or F, but if you were in the third teacher’s math class, you had a 0 percent chance of getting a D or F. In the English classes, taught by three different teachers including Ms. Richardson and Ms. Thompson, the range of D and F rates (4 percent, 22 percent, and 35 percent) was even more dramatic. Mallory double-checked the grade data, then double-checked that students in the classes weren’t significantly different; in other words, one teacher’s students as a group didn’t have lower standardized test scores or higher rates of absences. No, the groups of students were similar; the only difference among the classes seemed to be the chances of receiving a particular grade.
Mallory put on her detective hat and considered, investigated, and then rejected several explanations: No substantive differences in instruction. Teachers were using the same curriculum with the same tests and even scored those tests as a team to ensure fairness and uniform evaluation. Mallory scoured students’ previous test scores and grades, with no indication of drastically different profiles of the classes as a whole. No substantive difference in the classroom physically—it wasn’t as if one classroom had a broken thermostat or was closer to a noisy playground. What was even odder was that students with identical standardized test scores received different grades depending on their teacher. The teachers were teaching similarly, the students were demonstrating similar achievement, but the grades showed inconsistency. This data seemed unexplainable, impossible, and grossly inequitable.
On a lark, Mallory looked at the syllabus for each class—each teacher of a course had created her own personalized version—and it shocked her. Each teacher’s syllabus began with a similar introduction to the course content and description of important materials for the class, but then it was as if each teacher was in an entirely different school:
- One teacher accepted no homework after the attendance bell rang, some deducted points if homework was late (although the amount deducted ranged from a few points to two letter grades’ worth), and another accepted work beyond the due date up until the end of the quarter, with no penalty.
- One teacher gave each daily homework assignment a grade of 10 percent or 100 percent based on how much of the homework was completed and correct, and allowed students who had received 10 percent up to one week to correct mistakes. Another gave full credit for an assignment if the student showed effort to complete it, regardless of whether answers were correct.
- One teacher reduced points on an assignment if the student didn’t completely and correctly write her or his first and last name, along with the title of the assignment. Another subtracted points if an assignment was submitted on notebook paper that had ripped holes or ripped edges.
- Most teachers organized their gradebook by grouping types of assignments into categories (Homework, Classwork, Tests, etc.), and weighted each category to denote its importance (Homework = 30% of the grade; Tests = 70%). However, no teacher had the same weightings for any categories. For example, the weight of tests ranged from 40 percent to 70 percent of a student’s grade.
- Some teachers had only three categories of assignments (Tests, Classwork, and Homework), while others included categories that seemed more subjective, such as Citizenship, Participation, and Effort. There was no explanation in the syllabus of how these subjective categories were calculated or on what they were based.
- Other teachers didn’t use percentage weights at all, but assigned different point values to different assignments. For example, Homework assignments might be 5 to 10 points each, with tests worth 100 points.
Teachers’ different grading policies made it possible for two students with the same academic performance to receive different grades. What particularly confused and concerned Mallory was that some teachers were grading students on criteria that seemed to have nothing to do with their academic achievement — such as whether their paper had intact holes or had the proper heading — and others were basing parts of students’ grades entirely on subjective criteria, such as effort, that were susceptible to teachers’ implicit biases. This grade data that couldn’t be possible suddenly was.
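To make the arithmetic concrete, here is a minimal sketch of how category weights alone can change a grade. Everything in it is a hypothetical illustration rather than data from Mallory’s school: the student’s scores, the two teachers’ weightings (chosen to fall within the 40 to 70 percent range for tests described above), the lumping of homework and classwork into one category, and the conventional 90/80/70/60 letter cutoffs are all assumptions.

```python
# Hypothetical category averages for one student, in percent.
# Illustrative numbers only; they are not taken from the school's data.
student = {"Homework and classwork": 50, "Tests": 90}

# Two hypothetical gradebooks. Weights are whole percentages that sum to 100.
teacher_a = {"Homework and classwork": 30, "Tests": 70}
teacher_b = {"Homework and classwork": 60, "Tests": 40}


def weighted_grade(scores, weights):
    """Return the weighted average of the category scores, in percent."""
    return sum(scores[c] * weights[c] for c in weights) / 100


def letter(percent):
    """Map a percent to a letter on an assumed 90/80/70/60 scale."""
    for cutoff, mark in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percent >= cutoff:
            return mark
    return "F"


grade_a = weighted_grade(student, teacher_a)
grade_b = weighted_grade(student, teacher_b)
print(grade_a, letter(grade_a))  # 78.0 C
print(grade_b, letter(grade_b))  # 66.0 D
```

Under these assumed numbers, the same student with the same work earns a C from one teacher and a D from the other, purely because of how the categories are weighted.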
A few days later, something happened that changed Mallory’s confusion to concern. Maria, a shy but earnest eighth grader, came to her office nearly in tears. Last year as a seventh grader, she had received a B in math, her most challenging subject, but this year was barely passing with a D. What was really frustrating Maria was that even though she often handed in homework assignments late or incomplete — she had after-school responsibilities at home in addition to dance class three times a week — she consistently performed well on every exam. She obviously had learned the math and had shown it when it mattered most, and though last year this type of performance had earned her a B, her teacher this year gave zeros for late or incomplete homework, resulting in her D. Maria was feeling a crisis of confidence: Other students copied to get their homework in on time for the homework points, which Maria had resisted, but would she have no other choice? Had last year’s teacher lied to her about her math skills? Was she not as good at math as she thought? Or was this year’s teacher out to get her?
To Mallory, no longer were her teachers’ inconsistent policies a theoretical dilemma. The school had spent months planning and coordinating to make sure teachers in the math department were using a sequenced curriculum and that each teacher was preparing students to be ready for the next year, a practice called “vertical alignment.” Yet teachers’ different approaches to grading were undermining all of it, sending confusing messages about learning and affecting students’ grades and promotion rates, their beliefs about school, and even their self-image.
Mallory had to talk to her teachers about what was happening. The prior year, she had broached many conversations — some quite difficult and uncomfortable — with her teachers about curriculum, teaching strategies, job responsibilities, even evaluation. Surely, she assumed, they would be as astonished as she was when they saw the data and would reconsider how they graded.
But now came her second shock: When she began a discussion of grades with her teachers, it was like poking a hornet’s nest. Nothing prepared her for the volatility of conversations about teachers’ grading practices. Many of her teachers, previously open to exploring new ideas about nearly every aspect of their work, reacted with defensiveness and adamant justification. Teachers with higher failure rates argued proudly that their grading reflected higher standards, that they were the “real teachers.” A teacher with low failure rates explained that he was the only teacher who cared enough to give students retakes and second chances. One teacher simply refused to discuss the topic, citing her state’s Education Code that protected teachers from administrators’ pressure to change or overwrite grades. One teacher began to cry, confessing that she had never received any training or support on how to grade and feared that she was grading students unfairly. Conversations about grading weren’t like conversations about classroom management or assessment design, which teachers approached with openness and in deference to research. Instead, teachers talked about grading in a language of morals, about the “real world” and their beliefs about students; grading seemed to tap directly into the deepest sense of who teachers were in their classroom.
When she talked about these grading problems with principals of other schools, Mallory was surprised and dismayed to learn that grading varied by teacher in every school. This phenomenon was widespread, even the norm. Teachers thoughtfully and intentionally were creating policies that they believed, in their most thoughtful professional judgment, would promote learning. Yet they were doing so independently and often contradicting each other, yielding in each school a patchwork of well-intentioned but ultimately idiosyncratic approaches to evaluating and reporting student performance. Even when a department or a group of teachers made agreements — for example, to have homework count for no more than 40 percent of a grade — teachers’ other unique policies and practices, such as whether homework would be accepted after the due date, made their attempts at consistency seem halfhearted and ineffectual.