Can the machines speak your language?

Opinion: technology and AI have brought unprecedented change to the translation industry, yet there is still huge demand for human translators

The application of neural networks to automatic translation in the past couple of years has brought about systems that can translate more fluently than ever. Expectations for neural machine translation (NMT) have soared, particularly since Google published a paper that promised to bridge the gap between human and machine translation quality.

Growing translation industry enthusiasm for machine translation (MT) means that over 80 percent of translation companies now deploy it. Media articles about AI and technological unemployment for translators have become more common and sci-fi translation products such as Skype Translator and Google Pixel Buds have been launched. At present, these products attempt to combine three imperfect technologies (automatic speech recognition, machine translation and a text-to-speech service), but they can aid low-risk communication. For many, the implication is clear: the big MT and AI breakthrough is just around the corner.

The problem is, the big breakthrough has been just around the corner for quite some time. At the well-publicised public demonstration of an English-Russian MT system at Georgetown University in Washington DC in 1954, one professor predicted that their system would learn to translate whole books accurately within five years.

In the following decade, a critical report on MT progress by the US Automatic Language Processing Advisory Committee dampened expectations for MT. Subsequent research has focused on producing systems for very specific scenarios that may augment the translator rather than replace them. The state of the art has moved on from the early rule-based MT systems, which relied on dictionaries and hand-coded grammatical rules. We then had statistical MT, which uses previous translations to compute the most likely correct translation and monolingual text to compute how likely a sentence or phrase is to appear in the target language. Now, we have NMT.
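
The statistical approach described above can be sketched in a few lines of Python. This is only a toy illustration, not how any production system was built: the candidate translations, the counts and the tiny monolingual corpus are all invented, and the names (`TRANSLATION_COUNTS`, `language_model_logprob`, `best_translation`) are hypothetical. It assumes a simple bigram language model with add-one smoothing, which is one of several possible choices.

```python
import math
from collections import Counter

# All data below is invented for illustration.
# "Translation model": how often each candidate was paired with the source
# phrase in (imaginary, noisy) previous translations.
TRANSLATION_COUNTS = {
    "the house small": 5,
    "the small house": 4,
    "the little home": 2,
}

# "Language model" data: an (imaginary) monolingual target-language corpus.
MONOLINGUAL_CORPUS = [
    "the small house is old",
    "a small house",
    "the small dog",
    "the house is small",
]


def bigrams(tokens):
    """Return adjacent word pairs from a list of tokens."""
    return list(zip(tokens, tokens[1:]))


BIGRAM_COUNTS = Counter()
CONTEXT_COUNTS = Counter()
VOCAB = set()
for sentence in MONOLINGUAL_CORPUS:
    words = sentence.split()
    VOCAB.update(words)
    for prev, word in bigrams(["<s>"] + words):
        BIGRAM_COUNTS[(prev, word)] += 1
        CONTEXT_COUNTS[prev] += 1


def language_model_logprob(sentence):
    """Log-probability of a sentence under a bigram model with add-one smoothing."""
    total = 0.0
    for prev, word in bigrams(["<s>"] + sentence.split()):
        p = (BIGRAM_COUNTS[(prev, word)] + 1) / (CONTEXT_COUNTS[prev] + len(VOCAB))
        total += math.log(p)
    return total


def translation_model_logprob(candidate):
    """Log of the candidate's relative frequency among previous translations."""
    return math.log(TRANSLATION_COUNTS[candidate] / sum(TRANSLATION_COUNTS.values()))


def best_translation():
    """Pick the candidate with the highest combined score."""
    return max(
        TRANSLATION_COUNTS,
        key=lambda c: translation_model_logprob(c) + language_model_logprob(c),
    )


print(best_translation())
```

In this invented data the translation counts alone slightly favour "the house small", but the language model penalises that word order, so the combined score selects "the small house".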

But as MT systems have improved, human translators haven’t been replaced. In fact, both machine and human translation are doing better than ever. There are now more translators employed than ever before, as the explosion of digital content means more material to translate. More words per day than ever before are translated using MT (Google MT averaged 143 billion words per day in 2016), and MT is also proving useful as a productivity tool for translators, who correct errors in its output (an activity few of them enjoy).

The situation for professional translators is not perfect. Most translators work on a freelance basis, which appeals to some but does not suit many others. Add to this the constant pressure on price, abstract measures of quality, fears of being replaced by AI and the fast pace of change in the profession, and it’s clear why would-be translators are sometimes put off.

However, there is currently a shortage of translators, with well-paying institutions like the UN and European Commission struggling to fill some roles (particularly for Irish). The US Bureau of Labor Statistics and other research institutions believe that the demand for translation and interpreting makes it one of the fastest-growing careers.

Rather than fearing technology, it’s important for translators to understand what they can expect from it, especially NMT. Neural networks were first proposed in the 1940s, but it’s only recently that sufficient computing power has become available to put them into practice. NMT systems are computationally expensive and require weeks of training. This training requires vast amounts of bilingual data (aligned source and target sentences that have been translated by humans) and also requires humans for validation. There are various automatic measures of MT quality, but few correlate closely with human judgement.
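
To give a rough sense of what an automatic quality measure looks like, here is a minimal Python sketch of clipped n-gram precision, the kind of word-overlap count that metrics such as BLEU build on. It is an illustration only, not a full metric: the function names and the example sentences are invented for this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference, so repeating a matching word does not inflate the score.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    if not cand_counts:
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / sum(cand_counts.values())


# Invented example: a reasonable paraphrase shares few words with the reference.
reference = "the cat sits on the mat"
paraphrase = "the feline rests upon the rug"
print(clipped_precision(paraphrase, reference))
```

The paraphrase matches the reference only on the two occurrences of "the", so it scores one third even though a human reader might judge it acceptable. That gap is one simple way to see why overlap-based scores and human judgement can diverge.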

An NMT system involves one neural network wherein words from a sentence are encoded, processed through a further layer or two (the "deep learning" part) and the most statistically likely word is finally output at the decoding stage. Words are produced one by one until the translated sentence is complete. Another neural network allows the system to deal with variable sentence length, while yet another incorporates context from the training data, source text and target words produced so far.
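
The word-by-word output step described above can be sketched as a simple loop. In this Python illustration a hand-written lookup table stands in for the trained network: the table `NEXT_WORD_PROBS`, its probabilities and the function `greedy_decode` are all hypothetical, and a real system would compute the scores from the encoded source sentence rather than look them up. Always taking the single most likely word (greedy decoding) is an assumption made here for simplicity.

```python
# Hypothetical stand-in for a trained network: maps the target words produced
# so far to a probability distribution over the next word. The numbers are
# invented; "</s>" marks the end of the sentence.
NEXT_WORD_PROBS = {
    (): {"the": 0.7, "a": 0.3},
    ("the",): {"cat": 0.6, "dog": 0.4},
    ("the", "cat"): {"sleeps": 0.8, "</s>": 0.2},
    ("the", "cat", "sleeps"): {"</s>": 0.9, "now": 0.1},
}


def greedy_decode(max_len=10):
    """Produce words one by one, always taking the most likely next word."""
    output = []
    for _ in range(max_len):
        # Unknown contexts fall back to ending the sentence.
        dist = NEXT_WORD_PROBS.get(tuple(output), {"</s>": 1.0})
        word = max(dist, key=dist.get)
        if word == "</s>":
            break
        output.append(word)
    return output


print(" ".join(greedy_decode()))
```

Each choice depends only on the scores given the words already produced, which hints at why the output tends to read fluently even when an individual word is wrong.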

This is where NMT proves valuable, usually producing words in the correct context. This is also why errors may be difficult to spot, as target sentences look fluent even if some words have been translated incorrectly. These mistakes are difficult to predict and are one reason why MT is not generally recommended for texts that are high-risk or those that have a long shelf-life (printed texts, for example). Tests have shown that NMT systems can produce useful literary translations, but in most circumstances it would be unwise to publish machine translations without, at the very least, a human review.

There are other unsolved problems with NMT, such as inconsistent use of terminology, the occasional production of words that don’t exist, retention of metadata and the difficulty of creating systems for a specific field due to the massive amount of data required for training. Many translators had successfully incorporated statistical MT into their workflow, but nobody is sure how best to work with NMT as a translation aid.

Then, there are the more general problems with AI, such as bias in training data being reproduced in translations, and the difficulties of paying royalties on previous translations used for training. The translation industry is further complicated here, as rights of ownership of translations and databases of previous work are commonly disregarded.

For the near future at least, human translators are still very much needed. This applies especially to those with a broad skillset and subject-matter expertise who can tailor an appropriate translation process for clients, who elect to work with MT and who can understand what can and cannot be expected of MT. MT is increasingly serving a very useful purpose too, helping its users get a rough idea of the meaning of text in another language, providing fast translations of low-value material, and augmenting human translators.

The views expressed here are those of the author and do not represent or reflect the views of RTÉ

Joss Moorkens