Aids in Polish Translations
Artificial intelligence is transforming and improving the translation industry. Today, there are many machine translation (MT) applications and computer-assisted translation (CAT) systems that support human translators in their work. Machine-translation tools draw on databases of billions of words, phrases and expressions, with the aim of producing translations that are as accurate and natural as text written by a human.
The truth is that we had to wait quite a long time before seeing efficient machine translations, which require many modern technologies, including artificial intelligence (AI) and the Internet. Today, machine-translation systems are complex applications that rely on machine-learning technology to enable them to independently translate text from the source language to the target language. To translate efficiently, they require enormous databases containing hundreds of millions of words, phrases, expressions and sentences. Using this data, MT applications can then ‘decide’ which translation is the most appropriate and accurate.
Naturally, these systems are not (and probably never will be) perfect. Human translators are still necessary if you want to obtain correct and meaningful translations. However, this is an interesting subject, so we decided to take a closer look at it.
How does machine translation work?
One of the most modern and complex MT systems is Microsoft Translator (also called Bing Translator), which will serve as our example of what machine translation looks like. In its early days, Microsoft Translator was based entirely on the Statistical Machine Translation (SMT) model: it searched for the most probable translation, that is, statistically, the one most frequently found in its database.
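To make the idea of 'the most frequently found translation' concrete, here is a minimal sketch in Python. It is not how Microsoft Translator is implemented: the tiny list of word pairs and the function name most_probable_translation are made up for illustration, and a real SMT system works with far larger data and more elaborate probability models.

```python
from collections import Counter

# Hypothetical toy "database": (Polish source, English translation) pairs,
# as they might be observed in previously translated texts.
ALIGNED_PAIRS = [
    ("zamek", "castle"),
    ("zamek", "lock"),
    ("zamek", "castle"),
    ("zamek", "zipper"),
    ("nigdy", "never"),
    ("nigdy", "never"),
]


def most_probable_translation(source, pairs):
    """Return (translation, relative frequency) for the translation of
    `source` seen most often in `pairs`, or None if `source` is unknown."""
    counts = Counter(target for src, target in pairs if src == source)
    if not counts:
        return None
    target, count = counts.most_common(1)[0]
    return target, count / sum(counts.values())


print(most_probable_translation("zamek", ALIGNED_PAIRS))  # ('castle', 0.5)
```

Note that the sketch always answers 'castle' for zamek, even in a sentence about a door or a jacket. That is exactly the weakness of relying on frequency alone.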
(This is a good place for a little digression: what exactly might such a database consist of? It could include, for example, the entire Wikipedia database. To give you an idea of the scale, English Wikipedia alone contains about 3.5 billion words.)
Currently, however, Microsoft is focusing on a technology called Neural Machine Translation (NMT). This does not mean that SMT (statistical translation) has been abandoned. The new technology is designed to complement it, and thus significantly increase the likelihood of achieving a precise and accurate translation.
NMT has been developed since 2016. Granted, a huge database is still required, but the new system takes into consideration not only the frequency of a given translation, but also the context in which it appears in the text.
Microsoft Translator translates entire fragments of text (including idioms) in many language pairs. The system first ‘reads’ the source text and then decodes it to create the best translation possible. It translates not just the words but entire phrases and expressions, and also takes the context into consideration.
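The difference that context makes can be illustrated with a very rough sketch. Please note that this is not a neural network and not Microsoft's method: it is a simple keyword-overlap heuristic, and the cue words, the CANDIDATES table and the function pick_by_context are all assumptions invented for this example. It only shows why looking at the surrounding words can change which translation is chosen.

```python
# Hypothetical candidate translations of the Polish word "zamek",
# each paired with a few cue words that tend to appear nearby.
CANDIDATES = {
    "castle": {"król", "wieża", "średniowieczny"},
    "lock": {"klucz", "drzwi", "drzwiach"},
    "zipper": {"kurtka", "kurtce", "spodnie"},
}


def pick_by_context(sentence_words, candidates):
    """Pick the candidate whose cue words overlap most with the sentence.

    If no cue word matches, the first candidate in the table is returned.
    """
    words = set(sentence_words)
    return max(candidates, key=lambda c: len(candidates[c] & words))


sentence = "klucz nie pasuje do zamka w drzwiach".split()
print(pick_by_context(sentence, CANDIDATES))  # lock
```

A real NMT engine learns such associations from its corpora rather than from a hand-written table, but the principle is the same: the same source word can lead to different translations depending on what surrounds it.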
MT applications use databases called ‘corpora’. As a rule of thumb, the better your corpora, the better the results you’ll obtain. In general, the results produced by free MT applications are inferior to those produced by paid ones, because the free tools draw on large, general corpora without verifying the context. Paid MT tools usually have high-quality corpora for a given field, which is why their accuracy is much greater. We will tackle this subject in the next blog post.
The privacy issue
We need to address the elephant in the room: the issue of the privacy of machine-translated content. Generally speaking, companies owning MT tools allow you to use their translation systems for free, but, in exchange, they will keep your translated text for their purposes. This is an especially critical issue when you’re translating highly confidential texts or personal data.
For this reason, most LSPs (Language Service Providers) prohibit the use of free MT tools: the content submitted for translation is saved and stored by the tool’s provider.
Microsoft Translator and Google Translate
As we mentioned earlier, Microsoft Translator is one of the main machine-translation applications. It also supports Polish, but, like all such tools, it’s not perfect. For instance, when you go to the Translator website, it informs you that the entire content has been translated into your language (in our case, Polish) using Translator. However, the first thing a Polish-speaking user will notice is the mistakes made by the machine translator. Whatever was intended by ‘you can be the first knowledge about new languages’, the translation into Polish is, we can all agree, far from ideal!
There is also Google Translate, another global machine translation tool, which supports 109 languages, including Polish. (Microsoft Translator supports ‘only’ 65 languages.) According to Wikipedia:
‘In November 2016, Google announced that Google Translate would switch to a neural machine translation engine – Google Neural Machine Translation (GNMT) – which translates “whole sentences at a time, rather than just piece by piece. It uses this broader context to help it figure out the most relevant translation, which then rearranges and adjusts to be more like a human speaking with proper grammar”. Originally only enabled for a few languages in 2016, GNMT is used in all 109 languages in the Google Translate roster as of 2020.’
So, as we can see, Google has taken a road similar to Microsoft’s. However, this tool has its problems, too. It handles Polish–English translations quite well, but the results with other language combinations are much worse! For example, when translating from Polish to Spanish, Google Translate renders the Polish word nigdy (never) as siempre, which means always.
Other machine translation tools
Next in line is IBM Watson Language Translator. IBM Watson is an artificial-intelligence supercomputer, designed to bring AI into almost any industry, for example advertising, education, IoT (Internet of Things) and financial solutions. Based on it, IBM has created its own machine-translation tool: IBM Watson Language Translator. Just like Google Translate and Microsoft Translator, it uses neural machine translation to improve translation speed and accuracy. IBM Watson supports 30 language pairs, but for Polish, it primarily focuses on translations from and into English.
Systran Machine Translator is worth mentioning because it’s a free solution. Anyone can translate a text from Polish into English and from English into Polish. Just visit translate.systran.net. It’s combined with a dictionary, so you can also check the meaning of each word. However, there is a 5000-character limit for each portion of the text.
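If your text is longer than the 5000-character limit, you need to divide it into portions before pasting it in. The sketch below shows one possible way of doing this in Python. The function split_into_chunks is our own illustration and not part of any Systran tool; it simply tries to break the text at spaces so that words stay intact.

```python
def split_into_chunks(text, limit=5000):
    """Split `text` into pieces of at most `limit` characters,
    breaking at a space where possible so that words stay whole."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Look for the last space that still keeps the chunk within the limit.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: fall back to a hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


long_text = "word " * 3000  # about 15,000 characters
for number, chunk in enumerate(split_into_chunks(long_text), start=1):
    print(number, len(chunk))
```

Breaking at spaces is the simplest option. For real documents, breaking at sentence or paragraph boundaries would probably give the MT engine more context to work with.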
Now, let’s talk about another significant aid in translators’ work: CAT tools.
Computer-assisted translation tools do not translate on their own, but merely help translators work efficiently. Although these tools are a crucial support, they only partially automate the translation process, relying on terminology databases and on translation memories built from previous translations. CAT tools aid translators and linguists in editing, managing and storing translations, but the translation itself is done by the human translator. That said, the vast majority of CAT tools do have add-ons that give access to machine translation.
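To give a feel for how a translation memory helps, here is a minimal sketch of a 'fuzzy match' lookup in Python. It does not reproduce any particular CAT tool: the sample memory, the function tm_lookup and the 0.75 threshold are assumptions chosen for illustration, and the similarity score comes from Python's standard difflib module rather than from a commercial matching algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: previously translated segments.
TRANSLATION_MEMORY = {
    "Kliknij przycisk Zapisz.": "Click the Save button.",
    "Plik nie został znaleziony.": "The file was not found.",
}


def tm_lookup(segment, memory, threshold=0.75):
    """Return (stored translation, similarity score) for the stored segment
    most similar to `segment`, or None if nothing reaches `threshold`."""
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_source is None or best_score < threshold:
        return None
    return memory[best_source], round(best_score, 2)


# An exact match is returned with a score of 1.0.
print(tm_lookup("Kliknij przycisk Zapisz.", TRANSLATION_MEMORY))
# A similar segment may come back as a fuzzy match for the translator to edit.
print(tm_lookup("Kliknij przycisk Anuluj.", TRANSLATION_MEMORY))
```

The suggestion is only a starting point: the translator still decides whether to accept it, edit it or translate the segment from scratch.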
According to research from 2013, the three most popular CAT tools are SDL Trados, Wordfast, and MemoQ. In Poland, nine out of ten translators use at least one CAT tool. According to SDL Trados, over 250,000 translators use their software globally. One of its interesting features is a machine-translation engine, which supports translations from German into Polish, from Polish into English and from English into Polish. Wordfast and MemoQ also support Polish, as either the source or target language.
To sum up, although machine-translation applications are developing very rapidly, it is clear that efforts are concentrated mostly on the English language. Two exceptions to this are Google Translate and Microsoft Translator, both of which support at least several dozen language pairs, including Polish. In terms of Polish machine translations, Polish–English is the most accurate and provides the highest quality and consistency of the text. In any event, the machine-translated text still needs to be verified by a human translator. That’s why our company offers a service called MTPE – Machine Translation Post-Editing. If you have a machine-translated text or document and want to polish it up, then drop us a line: it’s one of our primary services!
Meanwhile, we’re still waiting for a machine translation tool of Polish origin …