Redefining Assessment: Beyond AI Detectors in the Classroom
Dec 06, 2023
Written by Matthäus Huelse
As an EdTech coach, I've had the opportunity to engage in many enriching conversations with teachers, administrators, superintendents, and fellow EdTech coaches. The vast majority of the time, these discussions revolve around maximizing the benefits of AI tools in an educational setting. However, today I want to shift the focus to one of the more intricate and, perhaps, more critical aspects – the limitations of AI detectors in educational assessments and the pressing need to explore alternative methods.
The objective here isn't to present definitive answers but to spark a conversation. Society and social norms change slowly – a process I often compare to the slow but steady turning of a large ship. Much like with the early internet, we did not know the rules of social conduct we would need when Facebook and MySpace were in their infancy. (Did I just age myself?) We do not know the rules we will establish in the future around AI, as we are literally debating them while trying to learn what AI can actually do. These discussions are vital (akin to the ones you might find yourself having at a Thanksgiving dinner table), especially amidst the current headlines about OpenAI and the ongoing debates surrounding their leadership – a separate ethical conundrum I won’t delve into here. I will stay in my lane, and I will keep to what I know.
I often get asked about my preferred AI detector, and while I can list a few, I find myself hesitant to fully endorse any. Initially, I saw AI detectors as a necessary defense against AI-generated content, but their limitations have become increasingly apparent. Turnitin, for instance, has acknowledged a 1% false positive rate for its AI detector. To put this into context: at Vanderbilt University, where roughly 75,000 papers are submitted to Turnitin in a year, about 750 could have been incorrectly labeled as AI-written. Worse, AI detectors have been found to flag text written by non-native English speakers disproportionately often; international students, for example, face a higher rate of false positives, possibly because of the formal vocabulary they acquire while learning English as a second language. These concerns led institutions like Vanderbilt to disable Turnitin's AI detection tool, citing the best interests of students and faculty (Vanderbilt University's statement on AI Detection).
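If you want to run the same numbers for your own institution, the estimate is simply submission volume multiplied by the false positive rate. A minimal sketch (the function name is my own; the 1% rate and 75,000-paper volume are the Turnitin and Vanderbilt figures cited above):

```python
def expected_false_positives(submissions: int, false_positive_rate: float) -> int:
    """Expected number of honest papers incorrectly flagged as AI-written."""
    return round(submissions * false_positive_rate)

# Figures cited above: Turnitin's acknowledged 1% false positive rate,
# applied to Vanderbilt's roughly 75,000 annual submissions.
papers_per_year = 75_000
fp_rate = 0.01

print(expected_false_positives(papers_per_year, fp_rate))  # 750
```

Swap in your own school's submission count to see how many students per year could be wrongly accused even if the advertised error rate holds.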
This brings us to a crucial realization: AI detectors are not future-proof, and they will never be infallible. Whether it's the evolution of technology rendering detectors obsolete, or the same human creativity that once gave us watches with hidden calculators now pushing the boundaries of academic integrity, we cannot rely solely on these tools. The question then arises: if a student is suspected of using AI, how do we verify their work? The fact that the common remedies, such as oral assessments and in-person demonstrations, are themselves alternative assessments suggests that those alternatives might be more effective than the essay in the first place.
This dilemma reminds me of a recent episode of our Restart Recharge podcast, where my co-host Katie and I delved into the Mann Gulch fire – a powerful example of adaptability and quick thinking in a crisis. In that incident, firefighters facing a rapidly spreading fire had to make a life-saving decision: drop their tools and devise an unprecedented plan to escape. This story is not just about survival but also about the necessity of thinking outside the box and being willing to let go of traditional methods when they no longer serve us. In education, as we confront the challenges of integrating AI, we might find ourselves in a similar situation. Are we holding onto tools and methods that are no longer effective? Maybe, like those firefighters, it's time for us to reassess our strategies, drop what's not working, and embrace new, innovative approaches to student assessment. This could mean exploring alternative assessment methods or incorporating AI tools in a way that complements, not replaces, our professional judgment. There are more lessons to be learned from this conversation, and if you are interested, tune in and listen to our final episode of the season (320 - Redefining Professional Development).
Perhaps it's time to reconsider our reliance on traditional tools like the essay. Was it ever truly the gold standard for assessing skills beyond writing? We struggle to meet the demands of our profession, often weighed down by various duties and constraints. Maybe this is the time to reassess and look at something new for the next calendar year or even the rest of the current school year. If you need inspiration, check out my previous blog post about different AI tools and applications.
While discussing the challenges associated with AI detectors, it's essential to acknowledge their role in the educational landscape. I'm not fundamentally against AI detectors; however, I believe they should not be the final verdict in assessing academic integrity. These tools serve as one piece of the puzzle, helping to flag potential AI-generated content, but they can't be relied upon as the sole judge.
It's time for us to rethink our assessment strategies. I encourage fellow educators to explore new ways to evaluate student learning. Let's be bold and innovative, stepping beyond traditional essays and tests. And while we're at it, why not see how AI can lend a hand? It's not about replacing our judgment, but enhancing it. So, let's dive into training and discussions on AI, not just to keep up, but to lead the way in creating fairer, more effective methods of assessing our students.