False Formatives

I just presented a poster in Chicago at the NCME Special Conference on Classroom Assessment (Piantaggini, 2024). While I had some rough details for a proposed dissertation study, the focus of discussion with scholars who stopped by was my new assessment model and the theoretical framework that brought me to it. The message I got was “I think you’re onto something,” so I’m sharing my work here to get more eyes on it. Please contact me with any embarrassingly scathing criticism. Otherwise, reply publicly with any other thoughts or questions. After all, this is my blog, not peer review!

So, in this blog post, I’ll describe the model you see above, and how I got there, starting with a major dilemma I identified when reviewing literature on classroom assessment: confusion over grading formative assessments…

Current Reading: Flexible Deadlines ≠ “No Deadlines” (i.e., Extensions vs. Reassessing)

One concern with flexible deadlines is that, in the absence of late work penalties, students will wait until the absolute last, last, last, last, LAST possible moment to turn in their assignments. The fear is that this will create a ton of extra work for the teacher, and that students will not develop time management skills since there are no consequences (i.e., a lower grade or reduced points)…because all students in traditional points-based grading systems turn in ALL of their assignments on time, right? And then they graduate and become college students who continue to turn in ALL of their assignments on time, right? And then they graduate and become employees who complete ALL of their tasks on time while being adults who get ALL of their errands done on time, right? All because of low grades and reduced points in school…right? This belief has prevailed despite the lack of empirical evidence to support it. Granted, the fear does seem to play out in some cases when flexible deadlines are misused, or when some other assessment policy gets in the way. Nonetheless, for any change to take place, this belief must be addressed…

Current Reading: Assessing Students Not Standards (Jung, 2024)

Given over 20 years of schools attempting to implement standards-based grading (SBG), Lee Ann Jung’s 2024 release, Assessing Students Not Standards, offers a refreshing alternative. Is it part of a post-SBG era? Maybe. There are a lot of SBG concepts that are universally good, and the message is clear from researchers and teachers: let’s keep those. But there’s more. We can rebuild SBG. We have the experience. We can make SBG better than it was. Better, stronger, faster.

Another message is getting clearer, too, and seems right at home with the ungrading movement. Jung states, “we need to grade better, but we also need to grade less. A lot less” (p. 20). This aligns with my own research exploring 1) ways to reduce summative grading, and 2) formative grading alternatives (i.e., practices that let formatives remain formative). So, let’s get into some stuff in the book…

Current Reading: Grades Discourage Students Seeking A Challenge

Susan Harter was interested in the effects of extrinsic rewards on children. In 1978, she found that grades influenced whether students chose to complete tasks at a higher difficulty. In this study, 6th graders were assigned to either a “game” group or a “grade” group and were asked to solve 8 anagram puzzles. The children in the game group were told to choose one of four difficulty levels. Those in the grade group were given the additional instruction that they would be graded on the number of correctly solved anagrams (i.e., 8 = A, 6 = B, 4 = C, 2 = D, and 0 = F).

Unsurprisingly, the game group (i.e., ungraded) chose the more challenging tasks, while the grade group chose the less challenging tasks. Surprisingly, when children were asked which tasks they might have chosen if they were in the other group, the responses matched. That is, the game group said that if they had been graded they would have chosen easier tasks, and the grade group said that if they had not been graded they would have chosen harder tasks. In other words, each group confirmed that being graded discourages taking risks.

As if the negative effects of grades on learning weren’t clear enough from those findings alone, Harter also recorded the children…smiling! Yes, smiling. It turns out that children in the game group smiled more after solving each puzzle than children in the grade group.

This all makes sense.

In my conversations with various stakeholders, I keep hearing that high-achieving students are overly concerned with their GPA. They will do anything to maintain it, which includes opting for classes that offer an “easy A.” Sadly, these students are often stressed out, motivated by high-stakes rewards, and tend to enjoy themselves less. They don’t do much smiling, and tend to be unhappy with their academics. On the other end, struggling students aren’t motivated by grades at all! For them, these rewards are actually punishments (i.e., low grade, after low grade, after low grade). They hardly smile, and tend to be unhappy because there’s often very little hope of improving, especially in grading systems that include zeros and average low scores into the course grade!

Families are another important stakeholder group. They often view grades as a motivator that gets their students working hard, and striving to do better. But do they know grades are likely keeping their star student from going above and beyond? I wonder what they would think about Harter’s findings. After all, from the surface, it’s almost impossible to tell that grades are negatively impacting students. That is, when the focus is on achievement, the process of how students get there tends to be ignored. High grades might give family members the sense that their student is being challenged, but are they? Who’s to say they aren’t doing just enough to get those high grades, and nothing more once they’re achieved? Who’s to say they’re actually taking the kind of intellectual risks their families want? And are they even enjoying learning? Harter’s study suggests otherwise.

To summarize Harter’s findings: grades discouraged children from taking risks, and grades made solving the puzzles less fun. This is just one more study showing how grades get in the way of learning. Furthermore, Harter’s findings support my work in searching for ways to 1) reduce summative grading (i.e., grade fewer assessments, less often), and 2) use formative grading alternatives (i.e., practices that allow formatives to remain ungraded).

Reference
Harter, S. (1978). Pleasure derived from challenge and the effects of receiving grades on children’s difficulty level choices. Child Development, 49(3), 788–799. https://doi.org/10.2307/1128249

Multi-day Assessments

I *cannot* find a source for this, but I knew I had seen something like it in a textbook before, and Google Images came to the rescue. Know it? Let me know!

I now get to see a decent amount of teaching by pre-service students, as well as current teachers in the field—something every educator would benefit from, yet is almost never built into teaching schedules, sadly. One thing I overheard last fall was something like “after writing your conclusion, be sure to submit your lab; I’ll be reviewing these over the weekend,” and then the class began a brand new unit.

It occurred to me that the teacher wasn’t going to finish reviewing students’ work for days—outside of contractual hours, no less—which means the teacher wasn’t going to discover any struggling individuals (or groups) until long after anything could be done about it, like the two boys I saw in the back of the room who had fairly blank lab reports. In other words, this isn’t an example of the timely feedback that would have otherwise improved learning, which affects both teacher and student.

In this particular case, when the teacher found out that a certain number of students didn’t understand Unit 1 content, what was the plan? Pause the current Unit 2, then go back?! Why did they go on to Unit 2 in the first place?! I found myself wondering: “What could be done with the lab report to avoid all this? How might we break up the assignment so that all the feedback is timely, and there’s no moving forward only to fall back?” Let’s take a cue from some graduate work…

Current Reading: Zeros = -6.0!!!!

**Updated 4.6.24 w/ quantitative results on minimum 50 grading**

We know that the 100-point scale has a staggering 60 points that fall within the F range, then just 10 points for each letter grade above. This major imbalance means that averaging zeros into a student’s course grade often has disastrous results, making it nearly impossible for students to climb out of that rut.
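The arithmetic behind that imbalance is easy to sketch. Here’s an illustrative example (the scores are made up, not data from any study or from Reeves) comparing one averaged-in zero on the 100-point scale against a “minimum 50” floor:

```python
# Illustrative sketch (made-up scores): how a single zero skews an average
# on the 100-point scale, versus flooring the lowest possible grade at 50.
scores = [90, 85, 88, 0]                     # three strong scores plus one zero
floored = [max(s, 50) for s in scores]       # the "minimum 50" alternative

average = sum(scores) / len(scores)          # 65.75: a low D, despite A/B work
average_min50 = sum(floored) / len(floored)  # 78.25: a C+

print(average, average_min50)
```

One missing assignment drags a near-A student down two full letter grades; the “minimum 50” floor keeps the penalty proportional to the 10-point width of every other grade band.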

Still, the argument over zeros is, surprisingly, ongoing, with advocates in plenty of schools everywhere invoking the old “something for nothing” myth when alternatives are suggested, like setting the lowest possible grade at 50 (i.e., “minimum 50”). In other words, teachers are still unconvinced that they need to stop using zeros. Well, we’re heading back 20 years to when Doug Reeves (2004) used a 4.0 grading scale example to show exactly how utterly absurd and destructive zeros are in practice. This is perhaps the most compelling mathematical case against the zero I’ve come across yet…

Current Reading: Retakes—When They Do And Don’t Make Sense

My review of the assessment literature has continued, and now includes two major findings:

  1. Grading is a summative function (i.e., formative assessments should not be graded).
    (Black et al., 2004; Black & Wiliam, 1998; Bloom, 1968; Boston, 2002; Brookhart, 2004; Chen & Bonner, 2017; Dixson & Worrell, 2016; Frisbie & Waltman, 1992; Hughes, 2011; Koenka & Anderman, 2019; O’Connor et al., 2018; O’Connor & Wormeli, 2011; Peters & Buckmiller, 2014; Reedy, 1995; Sadler, 1989; Shepard et al., 2018; Shepard, 2019; Townsley, 2022)
  2. Findings from an overwhelming number of researchers spanning 120 years suggest that grades hinder learning (re: reliability issues, ineffectiveness compared to feedback, or other negative associations).
    (Black et al., 2004; Black & Wiliam, 1998; Brimi, 2011; Brookhart et al., 2016; Butler & Nisan, 1986; Butler, 1987; Cardelle & Corno, 1981; Cizek et al., 1996; Crooks, 1933; Crooks, 1988; Dewey, 1903; Elawar & Corno, 1985; Ferguson, 2013; Guberman, 2021; Harlen, 2005; Hattie & Timperley, 2007; Johnson, 1911; Koenka et al., 2021; Koenka, 2022; Kohn, 2011; Lichty & Retallick, 2017; Mandouit & Hattie, 2023; McLaughlin, 1992; Meyer, 1908; Newton et al., 2020; O’Connor et al., 2018; Page, 1958; Peters & Buckmiller, 2014; Rugg, 1918; Shepard et al., 2018; Shepard, 2019; Starch, 1913; Steward & White, 1976; Stiggins, 1994; Tannock, 2015; Wisniewski et al., 2020)

In other words, 1) any assessment that a teacher grades automatically becomes summative, even if they call it “formative” (I’m referring to these as false formatives), and perhaps more importantly, 2) grades get in the way of learning. These findings suggest that the best way to support learning is by a) limiting grading to true summative assessments given at the end of the grading period (e.g., quarter, trimester, semester, academic year), and b) using alternatives to grading formative assessments, since grading otherwise effectively makes them summative. Therefore, my next stage of reviewing literature focuses on reducing summative grading and exploring formative grading alternatives (i.e., so they remain formative). For now, one of those practices *might* be retakes, which have been on my mind ever since I saw a tweet from @JoeFeldman. Given the findings above, which establish a theoretical framework to study grading, let’s take a look at how retakes are used now, and how they could be used in the future, if we even need them at all…

CI Assessments

I was recently asked a very good question about how to change one’s assessments to align more with CI. By that, we’re talking about comprehension-based language teaching (CLT) that prioritizes comprehensible input (CI) in the Latin classroom. First, it helps to think in terms of what standards were being assessed beforehand, even if they weren’t explicitly called “standards.” These old standards were mostly discrete skills you’d expect to find in tests accompanying popular textbooks, like vocabulary recall, derivative knowledge, grammar identification, and cultural trivia. New standards based on CI—whatever they are—will have meaning at the core. My suggestion is to focus on assessing comprehension of Latin, because that’s more than enough to ask for. One benefit of this standard is that it has those old discrete skills embedded within something larger and more meaningful that you can assess (i.e., comprehension). Let’s look at how each of the old standards is contained within assessing comprehension…

Current Reading: Formative Assessment

I’m going to start sharing some findings in a series called “Current Reading” as part of a lit review I’m doing on assessment and grading; nothing too fancy or cerebral, but definitely more than blog post ideas.

Why the announcement?!

On the one hand, this is not new. I’ve shared plenty of direct quotes and sources in my blog posts in the past. Also, consider this a symptom of being steeped in academia once again. I’m reading hundreds of pages of research a week, and it’s important to digest and keep track of studies that support my own research. This includes knowing who wrote about what, and when. On the other hand, second language acquisition (SLA) researcher Bill VanPatten mentioned something online recently when I shared a 2020 post with a summary of CI non-examples. His comment was that the ideas in that post were familiar ones throughout the field. That’s completely true; I never claimed they were *my* original thoughts. Like many of my posts written to pass along information, that 2020 summary doesn’t include citations to any particular study. It’s a collection of ideas that have consensus in the SLA community, and that lack of citations was intentional, not an oversight.

Why intentional? For nearly all of my blog’s 12-year history, I never wrote for the academic community that would be interested in that kind of stuff. I was writing for other teachers. I sometimes added just a shorthand author and year (e.g., Feldman, 2018) to some statements, which would give most people what they needed to track down the original—if they really wanted to read that original! In my experience, though, most teachers don’t read research, so I haven’t bothered much with bibliographies. Since I’m no longer teaching, and I’m now using bibliographies a lot more these days, I do want to make a clear distinction between posts of the past and posts moving forward. Granted, my posts are still actually written for teachers, make no mistake! My degree program is Teacher Education and School Improvement (TESI), and I’m still sharing ideas for practical implementation. The one difference is that they’ll now include more breadcrumbs for everyone to follow—myself included. After all, there has been no better way for me, personally, to consolidate thoughts and work through concepts than by writing these blog posts. You might benefit as well. Now, for the good stuff…
