Analysis: We designed the Irish grammar test to end all Irish grammar tests to answer this question. The results were not great.
More and more public and private services are integrating AI into their operations, and government agencies are even exploring it as a possible silver bullet for Irish language-related challenges. But can AI even understand Irish?
To answer the question we designed the Irish grammar test to end all Irish grammar tests. This essentially involved creating Irish sentences that explore different grammatical features such as the tuiseal ginideach or verb conjugation. We then prompted AI models with multiple choice questions where they were presented with a grammatically correct and an incorrect sentence.
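For readers who want to see the shape of such a test, here is a minimal sketch in Python of scoring a two-option grammar quiz. It is an illustration under stated assumptions, not the study's actual code: the names `MinimalPair`, `accuracy` and `choose` are hypothetical, and the two example pairs are the ones quoted later in this article, with the first form of each taken to be the grammatical one.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """One test item: two near-identical sentences, only one grammatical."""
    feature: str
    correct: str
    incorrect: str


def accuracy(pairs, choose, seed=0):
    """Share of items where `choose` picks the grammatical option.

    `choose(option_a, option_b)` is a hypothetical stand-in for a model or a
    human respondent; it returns 0 or 1, the index of the option it judges
    to be correct.
    """
    rng = random.Random(seed)
    hits = 0
    for pair in pairs:
        options = [pair.correct, pair.incorrect]
        rng.shuffle(options)  # vary the position of the correct option
        picked = options[choose(options[0], options[1])]
        hits += picked == pair.correct
    return hits / len(pairs)


# Example items taken from this article (assumed labelling: first form correct).
PAIRS = [
    MinimalPair("genitive plural", "bun na sléibhte", "bun na tsléibhte"),
    MinimalPair("autonomous verb, past tense",
                "baineadh geit aisti inné", "bhaineadh geit aisti inné"),
]


def always_first(option_a, option_b):
    """A deliberately naive respondent that always picks the first option."""
    return 0


print(accuracy(PAIRS, always_first))
```

Shuffling the order of the two options means a respondent that always answers "the first one" is not rewarded by accident; over many items it should land near 50%.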
This collaboration between researchers at UL and UCC is different from determining how good the AI models are at tasks like writing Irish homework, translating TikTok captions, or describing Irish memes. Our approach assesses the grammatical building blocks, if any, that these AI models have learned about the language itself.
Why is this important?
The race to inject AI technology into every walk of life seldom pauses to reflect on the implications. Until now, there's been no comprehensive analysis of how well AI models actually understand Irish. This gap indicates that the Irish language, like many of the world’s 7,000 living languages, is often treated as an afterthought.
The consequences are clear. Rolling out AI models like Claude or ChatGPT without evaluating their language skills risks eroding the quality of the language itself. It relegates Gaeilgeoirí to an inferior experience, while majority-language speakers (like English speakers) reap the productivity gains from the technology.
Our experiments found that fluent Gaeilgeoirí outperform both proprietary and open-source models across all grammatical tasks. Interestingly, the best model, OpenAI’s GPT-5, performs 17% worse than fluent speakers. Alarmingly, open-source models perform roughly 40% worse than fluent speakers at the same tasks. In fact, their performance is comparable to random chance, i.e., a random coin toss could be as accurate at choosing the grammatically correct sentences.
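The coin-toss comparison can be made concrete with a little arithmetic. The sketch below is an assumption-laden illustration rather than the study's analysis: the function name `p_at_least` is hypothetical and the counts are invented for the example. It computes how likely it is to get at least a given number of two-option questions right by guessing alone.

```python
from math import comb


def p_at_least(k, n, p=0.5):
    """Probability of k or more correct answers out of n by pure guessing.

    With two options per question, each guess is right with probability
    p = 0.5, so the number of correct answers follows a binomial distribution.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts for illustration only, not results from the study.
questions = 10
correct = 6
print(p_at_least(correct, questions))
```

If that probability is large, a score of that size is what a coin toss would often produce, so it says little about whether the model has learned the grammar.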
Why is it worrying?
This means proprietary models will likely be prioritised for Irish language AI. Using proprietary models typically involves sending data to the tech giants that own them. This effectively grants companies broad control over the data, enabling them to sell, analyse, or repurpose it without clear limits. Should Gaeilgeoirí have to surrender their data to these corporations in order to be included in public services, or to benefit from AI tools that enhance productivity?
Interestingly, Gaeilgeoirí and the AI models differ in the kinds of mistakes they make. Gaeilgeoirí struggled the most with sentences involving nouns and cases. For example: bun na sléibhte vs. bun na tsléibhte ("bottom of the mountains"). This example requires recognising the rules of the tuiseal ginideach iolra (the genitive plural case).
On the other hand, the AI models struggled the most with verbs and negation. For example: baineadh geit aisti inné vs. bhaineadh geit aisti inné ("A fright was gotten out of her yesterday"). This example requires understanding that the briathar saor (the autonomous verb) is used without lenition (séimhiú, adding an h after the initial consonant) in the past tense.
Irish is far less represented in models like ChatGPT than English, largely because there’s so little digital data available for training. Interestingly, countries of a similar size, such as Bulgaria, have more resources available for their languages than are available for Gaeilge. Irish was suppressed under English rule, and yet, a century after independence, the legacy of colonisation still shapes how equally our languages are served by technology.
Why should we care?
If government organisations and companies inject this technology into their services as is, flawed or erroneous AI-generated text could erode and water down the quality of Irish being learned and used. Also, if English speakers can use AI to speed up paperwork, write their emails and more, should Gaeilgeoirí be left lagging behind in the dark ages, quill in hand?
This study offers an insight into how AI models understand the general Caighdeán Oifigiúil rather than any specific dialect. Assessing the models on their capacity to handle features like synthetic verb forms in Kerry Irish, or other distinct features in Conamara and Ulster Irish, would likely yield even poorer results, given the limited resources available for those dialects.
With President Catherine Connolly aiming to make Irish the working language of the Áras; Irish-language influencers like An Maolchathach and Laura Pakenham engaging people in a new way; films like Kneecap and An Cailín Ciúin bringing international attention to the language; and roughly one million people actively learning Irish and five million people having started learning outside the country on Duolingo alone, it is more important than ever to support the language. More conversations that include Gaeilgeoirí are needed when solutions to these challenges are discussed.
What can be done?
We could ignore the problems and allow AI models to water down Irish literacy, exploit the personal data of Gaeilgeoirí and deepen the digital divide between Irish and English speakers. Or we could take another path, one that empowers Gaeilgeoirí and researchers by including them in the rollout of this technology. For starters, public, government-funded datasets should be shared openly with prospective researchers, not gatekept by institutions.
Capturing the Irish language in the form of a chatbot would be the ultimate act of language preservation. No matter what happens to the language, distilling its features into a conversational AI could allow people to learn and converse in it for generations to come.
Put simply, using existing AI tools for Irish will not be a silver bullet for solving the language inequality challenges that exist in this country.
The views expressed here are those of the author and do not represent or reflect the views of RTÉ