We Must Stop The Obsession With Detecting AI

Detecting the use of AI tools is a frequent concern across many organisations I speak to—whether it's schools checking homework, grant-makers scrutinising applications, or recruiters screening CVs for AI-generated content. If you’re trying to detect AI use, then please know this:

  • Research shows that AI detection tools can be inaccurate. 

  • Worse, they can introduce bias against people who have English as a second language, and they can perpetuate the digital divide.

  • Fundamentally, in many cases, it’s probably irrelevant anyway.

This is a classic ‘Other AI Frontier’ problem, where AI tools meet organisations that have previously had little reason to engage with them, and where leaders need to react swiftly to the change. It’s no surprise that ‘detect and penalise’ is the go-to response. But we must shift this to ‘embrace and adapt’.

AI Detectors Can Be Fallible – and May Lead to Discriminatory Outcomes

There is much research to suggest that AI detectors are unreliable. This recent TES article from Jack Dougall is a great example, along with a number of sources in this Ethan Mollick Substack, and even OpenAI themselves. It’s clear that AI detection tools can misclassify both AI and human work, often unfairly flagging work by people for whom English is a second language.

The consequences of this are concerning. If you don’t have English as your first language, and/or you have access to fewer digital resources, you are more likely to be flagged and penalised for AI use, potentially exacerbating existing inequalities.

This doesn’t mean we should completely bin the detectors. Research by Mike Perkins et al identifies many of the same issues, but also argues that if used as part of a non-punitive approach, these detection tools can contribute to learning and academic integrity.

This isn’t unique to education

Working with organisations across different sectors gives us the benefit of taking learning from one to the other. While schoolchildren cheating on homework may be what first comes to mind, the problem reaches into any environment where we assess people or organisations. Recruiters in all organisations are facing a tide of AI-generated applications.

And anyone distributing grant funding will be facing similar challenges. Having spent over a decade running programmes that distributed millions of pounds of public money, I know how much care goes into making sure that the organisations being funded will spend that money wisely.

Yet I continually hear people say things like "All the AI-assisted applications are rubbish". What they really mean here is, "All the applications that I can tell are written by AI are rubbish." This distinction matters. The problem isn’t AI itself, but the lazy or thoughtless use of it.

Should we even care?

In many instances, we should actually be looking to reward the people or organisations who are willing to work co-intelligently with AI. 

Organisations should prioritise recruiting people who can work responsibly with AI. Recruitment processes should reward applicants who showcase their personal values and creativity, rather than those who repeat company jargon or industry buzzwords.

Frequently when I invite people in organisations to share their AI use, I hear how beneficial it has been in supporting someone with dyslexia. I want to employ people like this, not penalise them for using AI. 

If you’re distributing grant funding, it makes sense to incentivise organisations who are going to spend our money efficiently by working responsibly with AI. So make this part of the assessment.

And students are going to need the skills to work with these technologies, to help them to enhance their own learning and prepare them for the workplaces they will enter.

All of these are reasons why we must move away from a 'detect and penalise' approach to an ‘embrace and adapt’ attitude. 

Embrace: Set Expectations, Be Transparent

Firstly, we need to embrace the fact that people and organisations will look to use AI in multiple ways. We can do so by being overt about the level of AI use that we expect to see in a piece of work, or a funding application, or even a CV.

There are, after all, 50 shades of generative AI use, ranging from tools like Grammarly for minor edits, to full-scale, automated content generation, with many hybrid uses in between.

In education, for instance, Leon Furze’s AI Assessment Scale offers a structured way to set expectations for AI use. It empowers teachers and students with a clear language for discussing the appropriate use of AI in tasks, moving beyond a binary yes/no approach.

Being similarly open about expectations for its use in grant funding, or in job applications, is a great first step. Creating transparency in one direction will inevitably invite it back in the other direction, encouraging applicants to share how they have used AI in the process.

This will move us forwards and open some powerful conversations. But it still won’t stop the determined from trying to short-circuit the process.

Adapt: What Are We Really Trying to Test?

Time (and money) that would have been invested in fruitless AI detection efforts should instead be spent thinking clearly about what we are actually trying to assess and how that might be circumvented with AI, and then designing the assessment process accordingly.

This Financial Times article highlights the use of generative AI to game psychometric tests, with people who had access to paid-for tiers of the generative AI tools performing significantly better, skewing the available talent pool towards people from higher socio-economic backgrounds and exacerbating the digital divide.

This inevitably leads us to needing a multi-stage assessment process, with written work focused on things that AI finds harder to replicate… personal stories, real motivations, or nuanced human judgement. And then using in-person assessments to test the kinds of knowledge and understanding that AI tools find much easier to ‘cheat’ on.

This AI In Education podcast episode with Philip Dawson explores this in far more detail than I could possibly do justice to, and introduces the wonderful metaphor of a ‘Swiss cheese’ approach to assessment: accept that AI will create holes in any single assessment, but if we use multiple layers whose holes are in different places, we can still create a solid assessment overall.

Moving Forward: From Detection to Discussion

When asked, 'What are you doing about AI?', the answer shouldn’t be, 'We’ve implemented an AI detection tool.' It’s time to shift the conversation toward open and transparent discussions on AI use. Are we assessing what truly matters, or relying on flawed tools to sidestep deeper understanding of AI’s impact on our work and creativity?

A change of mindset from ‘detect and penalise’ to ‘embrace and adapt’ is essential both in education and in the workplace, and will create the environment for AI-inspired innovation and creativity to shine through.

Are you operating at the Other AI Frontier?

I set up AIConfident with the express purpose of supporting leaders of organisations operating at the Other AI Frontier, where AI meets organisations that have previously had little reason to understand it.

We help leaders to build the organisational cultures that instil confidence with AI technologies, enabling you both to innovate and to respond to how others are using AI in your environment.

We’re a social enterprise set up both to help the largest organisations and to share that learning with the smallest businesses, charities and schools. If you’re operating at the ‘Other AI Frontier’, trying to work out what all this change means for you, get in touch to find out how we can help.
