
How did Irish authors' books end up in Meta's AI data?

Sally Rooney is one of the Irish authors whose work has been 'scraped' to train Meta's AI programme

We’ve been here before. Remember the ‘You wouldn’t steal a car’ piracy ads in the early noughties, ushered in by the increase in dodgy DVD copies of the latest Hollywood blockbusters? In the 1980s and 1990s, picking up pirated VHS films or cassette tapes of the biggest pop acts at the local Sunday market was the norm, their terrible quality offset by a kind of ‘wink-wink, nudge-nudge’ acceptance.

Later, there was outrage that the likes of Napster meant that you could download (i.e. steal) the music of practically any artist you could think of, and then came Spotify - the streaming service that upended the music industry and left many musicians literally unable to make a living from their art.

As a music fan, the free (or very reasonable, if you’re a subscriber) access to the back catalogues of thousands of artists is undeniably amazing. On the other hand, it unquestionably devalued music, making a commodity out of something that an artist or artists had poured their heart and soul (well, at least most of the time) into.

The same could even be said for newspaper paywalls: how often have you seen someone on social media complain that an article is ‘behind the paywall’, not considering the time, effort and costs that go into writing that article, song or story?

Sadly, it means that there is now an expectation that certain art forms should be free to consume. Here we are, decades later, having much the same conversations. Perhaps that’s one of the reasons why there has been so little kerfuffle about the revelation that LibGen has essentially 'stolen' the work of hundreds of authors - many of them Irish - under the guise of ‘progress’. You may once have thought that it would be impossible to pirate a book.

If you didn’t know, LibGen, or Library Genesis to give it its full title, has been in the news in recent weeks after it was revealed that Meta CEO Mark Zuckerberg approved the company’s use of the ‘shadow library’ data set (i.e. a pirate library) to train Meta’s AI programme. What this means is that, essentially, millions of copyrighted books - 7.5 million, in fact, alongside 81 million research papers - have been ‘pirated’ on LibGen and used by Meta, for free, without the permission of, or compensation for, any of the affected authors. What’s more, The Great Book Swindle was only disclosed on a wider scale after The Atlantic published a searchable database of all the titles that had been stolen, following a copyright infringement case taken against Meta by several US-based authors including comedian Sarah Silverman and author/journalist Ta-Nehisi Coates.


The audacity is breathtaking, but sadly unsurprising.

Some of the Irish authors affected include Aoife Barry, whose book Social Capital was an insightful look at the perils of Big Tech, social media and its effect on society, particularly in Ireland - the irony is not lost. In a recent Sunday Times column, she wrote about the sinister impact that AI training is currently having on the publishing industry, and how she initially turned down an offer by her own publisher, HarperCollins, to have Social Capital used in AI training by an unnamed company. "The livelihoods of artists and the value of their work has never been a true consideration for the current generation of today's big tech AI pioneers," she wrote. "Instead they are focused on growing their own business, their own success, and their own profits. All artists are learning that in the current climate their work is only of value if it can be used, and even then they are not guaranteed to be asked for permission or even compensated for that."

Author Aoife Barry

Works by some of the biggest Irish writers of their generation also feature on LibGen, including Sally Rooney, John Banville, Colm Tóibín, Claire Keegan and many others. There are even numerous titles by JK Rowling and Stephen King in there. The question is, who among the most successful authors of all time - i.e. those with the deepest pockets - will do their bit and stand up to the likes of Meta?

Irish-based author Vanessa Fox O’Loughlin, whose pen name is Sam Blake, is chair of the Society of Authors. She is also the founder of Inkwell Group, a consultancy service for writers, and recently voiced her disgust at the revelation that her own work had been pilfered by LibGen, calling it "illegal, shocking, and utterly devastating for writers".

"A book can take a year or longer to write," she pointed out. "Meta has stolen books so that their AI can reproduce creative content, potentially putting these same authors out of business."

There are protests planned in various locations, including one outside Meta’s London HQ this weekend, where well-known names including Kate Mosse (Labyrinth), Tracy Chevalier (Girl with a Pearl Earring) and more plan to lead a demonstration. Numerous well-known authors, including Kazuo Ishiguro, Richard Osman and Val McDermid, have signed a letter from the Society of Authors that will be hand-delivered on the day, while a Change.org petition has already gathered almost 20,000 signatures in just a few days.

In Ireland, the Irish Writers’ Union has launched its own petition, writing an open letter to Minister for Trade Promotion, Artificial Intelligence and Digital Transformation Niamh Smyth and encouraging its members to make a "formal, legal complaint to Meta" demanding immediate cessation of their copyrighted work being used without permission, and compensation for past unauthorised use of their work in AI training programs.

It’s hard to know where or how this blatant theft will end. Much like the music industry, it seems that the billion-dollar companies hold all the cards, and the rights of authors are being steamrolled into oblivion with no trace of accountability.

AI can undoubtedly be an impressive force in society, but why should it come at the expense of our artists? Once again, it is those within the creative arts who suffer at the hands of that aforementioned ‘progress’. American inventor Charles F. Kettering once sagely said, "The price of progress is trouble." Ain’t that the truth.


The views expressed here are those of the author and do not represent or reflect the views of RTÉ
