How to detect text which has been written by ChatGPT

'One problem with trying to understand ChatGPT's impact on writing is that we are chasing a moving target.' Photo: Getty Images

Analysis: AI-generated text tends to be more predictable than human-written text, and it favours certain words, phrases and emojis

It's three years since ChatGPT was unleashed onto the world, disrupting any industry where writing (and reading) is a thing. Almost as soon as it was released, efforts to detect its fingerprints in text began. Detection methods can be split into local methods, which try to ascertain whether a particular piece of writing is AI-generated, and global methods, which don’t work on the level of an individual article, but rather look for linguistic trends across AI-generated texts.

A few obvious signs that an article has been AI-generated are made-up references or the accidental inclusion of phrases such as "As an AI language model". Most of the time, however, more sophisticated methods are needed. Some of these methods are based on a statistical measure called perplexity, which essentially measures how surprising a sequence of words is. AI-generated text tends to have lower perplexity, that is, to be more predictable, than human-written text.
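
To make the idea of perplexity concrete, here is a minimal sketch in Python. It assumes a toy word-frequency (unigram) model with add-one smoothing, which is far simpler than the large language models real detectors rely on; the function names `train_unigram` and `perplexity` are illustrative, not taken from any detection tool.

```python
import math
from collections import Counter

def train_unigram(corpus_tokens):
    """Build a toy unigram model: each word's probability is its smoothed frequency."""
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    vocab_size = len(counts) + 1  # one extra slot reserved for unseen words

    def prob(word):
        # Add-one smoothing so that unseen words still get a small probability
        return (counts[word] + 1) / (total + vocab_size)

    return prob

def perplexity(tokens, prob):
    """Exponential of the average negative log-probability per word."""
    if not tokens:
        raise ValueError("need at least one token")
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))

# Toy reference corpus (an assumption for illustration only)
prob = train_unigram("the cat sat on the mat the cat sat on the rug".split())

predictable = perplexity(["the", "cat", "sat"], prob)
surprising = perplexity(["zebra", "quantum", "rug"], prob)
print(predictable < surprising)  # familiar words score lower than unfamiliar ones
```

The lower the score, the less the model was surprised by the text. A detector built on this idea would flag text whose perplexity falls below some threshold, which is also why predictable human writing can be misclassified.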

From RTÉ Radio 1's Brendan O'Connor Show, can AI write better poetry than a poet?

Other methods are based on the concept of watermarking, a process that hides a signal in the text as it is being generated. Others use machine learning algorithms to learn patterns that can distinguish between AI-generated and human-written text. Although they are improving all the time, none of these methods is yet reliable enough to be used in practice. The main problem is their potential for false positives: a student could be falsely accused of using AI to complete an assignment.

The global approach to detecting AI’s presence in text is to try to find words, phrases or syntactic patterns that are associated with AI-generated writing. There are two main ways of doing this: comparing text written before and after 2022, looking for strange spikes in the usage of particular words or phrases, and/or comparing texts that we know to have been written by a person with texts that we know to have been AI-generated.
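
The before-and-after comparison can be sketched in a few lines of Python. This is a simplified illustration, not the method used in any particular study: the function names, the `min_ratio` threshold and the two tiny sample corpora are all assumptions made for the example.

```python
import re
from collections import Counter

def word_counts(texts):
    """Count lowercase word occurrences across a list of texts."""
    tokens = [w for t in texts for w in re.findall(r"[a-z']+", t.lower())]
    return Counter(tokens), len(tokens)

def usage_spikes(before_texts, after_texts, min_ratio=2.0):
    """Return (word, ratio) pairs whose relative frequency rose by at least min_ratio."""
    b_counts, b_total = word_counts(before_texts)
    a_counts, a_total = word_counts(after_texts)
    spikes = {}
    for word, a_count in a_counts.items():
        # Add-one smoothing so words absent from the earlier corpus do not divide by zero
        b_rate = (b_counts[word] + 1) / (b_total + 1)
        a_rate = (a_count + 1) / (a_total + 1)
        ratio = a_rate / b_rate
        if ratio >= min_ratio:
            spikes[word] = ratio
    return sorted(spikes.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical miniature corpora standing in for pre-2022 and post-2022 text
before = ["We examine the data and explore the results"]
after = ["We delve into the data and delve into the results"]

spikes = usage_spikes(before, after)
print([word for word, _ in spikes])
```

On real data the corpora would need to be large and matched by genre, and every flagged word would still need a human to check whether something other than AI explains the rise.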

Sometimes dramatic increases in a word’s popularity can be explained by world events, e.g. pandemic-related words such as omicron. Sometimes, though, there isn’t an obvious explanation, suggesting that ChatGPT or other language models might be at play. The phrase "I rise to speak", used by American politicians, has seen a noticeable surge in popularity amongst British politicians, according to an analysis of recent speeches in the British parliament. Perhaps the best-known discovery from this line of study has been ChatGPT’s apparent fondness for the word ‘delve’ in scientific writing.

From RTÉ Radio 1's News At One, newly-released ChatGPT-5 claims to be 'PhD-level'

One problem with trying to understand ChatGPT’s impact on writing is that we are chasing a moving target. The models that underlie ChatGPT change every few months, and the companies developing these models are always trying to make them more human-like. So if ‘delve’ is a signifier of AI-generated text, the AI models can be tweaked so that responses containing ‘delve’ are no longer preferred; or ChatGPT users can include instructions in their prompts to avoid the word.

To illustrate this point, a recent Washington Post study, which analysed over 300,000 ChatGPT messages from June 2024 to July 2025, found that the use of ‘delve’ by ChatGPT is declining. At the same time, generative AI is changing human writing. Many people are suspicious of AI and may avoid words that they know are associated with it when writing. Others may find themselves using these words more often because they are being subtly influenced by the AI-generated articles they are reading. It is not easy to tease these different factors apart.

So what are ChatGPT’s new favourite words? According to the Washington Post study, ChatGPT’s new favourites are ‘core’ and ‘modern’. Emojis are also popular, particularly the brain 🧠 and checkmark ✅ emojis. A whopping 70% of all messages analysed contained an emoji, with about a third containing ✅. The phrase ‘not just X, but Y’ is on the rise, as are informal contractions like ‘it’s’ and ‘you’re’. The dash punctuation symbol (—) continues to grow in popularity.
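
Figures like these come from counting how many messages contain a given marker. The Python sketch below shows one rough way to do that; it is an assumption about how such counts could be made, not the Washington Post's actual method, and the marker list, the regular expression and the sample messages are invented for illustration.

```python
import re

# Rough pattern for "not just X, but Y" within a single sentence (an approximation)
NOT_JUST_BUT = re.compile(r"\bnot just\b[^.!?]*?\bbut\b", re.IGNORECASE)

# Single-character markers mentioned in the article; "\u2014" is the long dash
MARKERS = {"checkmark": "\u2705", "brain": "\U0001F9E0", "long dash": "\u2014"}

def marker_shares(messages):
    """Return the share of messages containing each marker at least once."""
    n = len(messages)
    if n == 0:
        return {}
    shares = {
        name: sum(symbol in m for m in messages) / n
        for name, symbol in MARKERS.items()
    }
    shares["not just X, but Y"] = sum(bool(NOT_JUST_BUT.search(m)) for m in messages) / n
    return shares

# Hypothetical sample messages
messages = [
    "Done \u2705",
    "It's not just fast, but reliable.",
    "Plain text with no markers",
]
shares = marker_shares(messages)
print(shares)
```

Each share is the fraction of messages in which the marker appears, so a value of 0.33 for the checkmark would correspond to "about a third" of messages.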

So how can we know for certain that something has been written by a person? All we can do is to continue to delve into the core research of this thoroughly modern conundrum ✅🧠


The views expressed here are those of the author and do not represent or reflect the views of RTÉ