News, CAT, Languages, Translations

What is Neural Machine Translation (NMT)?

Last year, professionals talked a great deal about NMT. But what is NMT?

Neural machine translation (NMT) is a machine translation approach that uses a large artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model.

Until recently, all machine translation products (websites or apps) were based on algorithms that use statistical methods to try to guess the best possible translation for a given word. This technology is called statistical machine translation.

However, one of the limitations of statistical machine translation is that it only translates words within the context of a few words before and after the translated word. For short sentences, it works pretty well. For longer ones, the translation quality can vary.

Now we have a new machine learning technology called deep learning or deep neural networks, one that tries to mimic how the human brain works (at least partially).

At a high level, neural network translation works in two stages:

— A first stage models the word that needs to be translated based on the context of this word (and its possible translations) within the full sentence, whether the sentence is 5 words or 20 words long.

— A second stage then translates this word model (not the word itself but the model the neural network has built of it), within the context of the sentence, into the other language.

One way to think about neural network-based translation is to imagine a fluent speaker of another language who sees a word, say “dog”. This creates the image of a dog in his or her brain, and this image is then associated with, for instance, “le chien” in French. The neural network would intrinsically know that the word “chien” is masculine in French (“le” not “la”). But if the sentence were “the dog just gave birth to six puppies”, it would picture the same dog with puppies nursing and would then automatically use “la chienne” (the female form of “le chien”) when translating the sentence.
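The two stages and the dog example above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the two-number "meanings" (dog-ness, female-ness), the tiny vocabularies and the function names `encode`, `decode` and `translate_word` are all invented for illustration, and nothing here is a trained neural network or Microsoft's implementation.

```python
# Toy illustration only: hand-made 2-number "meanings", not a trained neural network.
# Each vector is [dog-ness, female-ness]; all values are invented for this example.
SOURCE_VECTORS = {
    "dog": [1.0, 0.0],
    "birth": [0.0, 1.0],
    "puppies": [0.0, 1.0],
}
TARGET_VECTORS = {
    "le chien": [1.0, 0.0],
    "la chienne": [1.0, 1.0],
}


def encode(word, sentence):
    """Stage 1: model the word in the context of the full sentence."""
    context = list(SOURCE_VECTORS.get(word, [0.0, 0.0]))
    for other in sentence.lower().split():
        if other != word:
            vec = SOURCE_VECTORS.get(other, [0.0, 0.0])
            context = [c + v for c, v in zip(context, vec)]
    return context


def decode(context):
    """Stage 2: pick the target phrase whose vector is closest to the word model."""
    def distance(phrase):
        return sum((c - t) ** 2 for c, t in zip(context, TARGET_VECTORS[phrase]))
    return min(TARGET_VECTORS, key=distance)


def translate_word(word, sentence):
    return decode(encode(word, sentence))


print(translate_word("dog", "the dog is sleeping"))                     # le chien
print(translate_word("dog", "the dog just gave birth to six puppies"))  # la chienne
```

The point of the sketch is that the second stage never sees the bare word "dog": it only sees the model built from the whole sentence, which is why the same word can come out as “le chien” or “la chienne”.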

Because of this approach, sentences generated by neural network-based machine translation are usually not only better than those from statistical machine translation but also sound more fluent and natural, as if a human had translated them and not a machine.

Source: Microsoft

Languages, Translations

Some interesting facts about translation history – 1

The history of translation (as distinct from oral interpreting) must have started soon after the development of writing, the expression of language with letters or other marks. The earliest records of writing date to Egyptian glyphs from about 3400 BCE. However, the earliest records of translation do not appear until nearly a thousand years later, as bilingual or even trilingual inscriptions. The earliest of these date back to about 2500 BCE in the form of bilingual vocabularies in Sumerian. Some of the tablets recorded financial data, while another group contained ritual and literary texts. A later example is a bilingual Greek-Aramaic inscription from the third century BCE with a version of some of Ashoka’s edicts that was found in Kandahar, Afghanistan.

Perhaps the best-known example of these multilingual inscriptions is the Rosetta Stone, which bears a decree issued in 196 BCE in three scripts: Ancient Egyptian hieroglyphs, Demotic (Egyptian) script and Ancient Greek. The text of the decree is essentially the same in all three scripts, and although slightly earlier bilingual and trilingual inscriptions have been found, the Rosetta Stone was the key to our current understanding of ancient Egyptian culture. It is now held by the British Museum in London.

CAT, Languages, Translations

Machine and manual translation – 1

Machine translation, often loosely grouped with computer-aided translation (CAT), is basically the use of software programs that have been specifically designed to translate both spoken and written texts from one language to another.

Trados is one of several computer-assisted translation tools (CAT tools). Its primary function is to let translators reuse translations. SDL purchased Trados a few years ago, and its products are now generally branded “SDL/Trados”. One advantage of using machine translation is that you can easily produce documents in several languages. In my view, it is most useful for specialized texts (medical, technical, legal). Any CAT tool is good for the parts of texts that repeat: if you have to translate extracts from business registers, school-leaving certificates, birth certificates, legal records or other such documents, a CAT tool will serve you well. One extremely good thing is that Trados keeps the original formatting, so you usually don’t have to deal with the visual form of a document.

Another advantage of CAT tools is terminology handling. When several people work on a large project, the tool helps them use the same terminology and thus increases the overall quality of the translated documents. A good MultiTerm glossary can be extremely useful for legal documents, too. If you are dealing with repetitive texts that are crawling with specific terminology, this software is the tool for you. A further advantage is that with CAT tools you cannot accidentally leave out a sentence, something that can all too easily happen when overwriting the source text. Trados is also an excellent way to review other people’s texts.
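To make the idea of reusing translations more concrete, here is a minimal sketch of a translation memory in Python. It is not how Trados works internally (I don't know its internals); the class `TranslationMemory`, its methods and the 0.75 similarity threshold are hypothetical names and values chosen for illustration. The similarity score comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Hypothetical minimal translation memory: source segment -> stored translation."""

    def __init__(self):
        self.segments = {}

    def add(self, source, target):
        self.segments[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (stored_source, translation, score) for the best match, or None."""
        best = None
        for stored_source, target in self.segments.items():
            score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
            if best is None or score > best[2]:
                best = (stored_source, target, score)
        if best is not None and best[2] >= threshold:
            return best
        return None


tm = TranslationMemory()
tm.add("Date of birth:", "Date de naissance :")
tm.add("Place of birth:", "Lieu de naissance :")

print(tm.lookup("Date of birth:"))      # exact repeat, score 1.0
print(tm.lookup("Date of the birth:"))  # near repeat, offered as a fuzzy match
print(tm.lookup("Marital status:"))     # nothing similar enough is stored: None
```

This is why repetitive documents such as certificates benefit most: the more segments repeat, the more often the lookup returns a translation that only needs checking rather than retyping.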

The main disadvantage of using machine translation is its cost, although the investment can pay for itself in a very short time. I have heard several people say that Trados (or any CAT tool) is no good for creative texts, literary translation and the like. Another disadvantage is that the accuracy of the translated text depends on the word order of the original text.

Nuances, cultural differences, and very local vocabulary need to be translated by a person. Machine translation follows systematic and formal rules, so it cannot focus on context to resolve ambiguity, nor can it draw on experience or mental outlook as a human translator can.

Languages, Translations

Online translation tools

Free online translation is powered by Google Translate, Microsoft Translator, Babylon Translator and other machine translation systems.

Google Translate is a statistical multilingual machine translation service provided by Google Inc. to translate written text from one language into another. Google Translate supports 66 languages, which makes 4,290 language combinations.
Microsoft Translator uses a machine translation platform developed by Microsoft Research as its backend translation software and translates between 38 languages. Microsoft developed a translation portal as part of its Bing services to translate texts or entire web pages into different languages. When translating an entire web page, users can browse the original web page text and the translation in parallel, supported by synchronized highlights, scrolling, navigation and language detection.
Babylon Translator integrates Language Weaver’s Enterprise Translation model, which delivers high-speed automated translation technology. Babylon Translator supports 32 languages and offers translation of words, phrases, and texts. Users can translate full sentences and translate from virtually any language to any other.
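The number of language combinations quoted above for Google Translate follows from simple counting: each of the 66 languages can be the source for any of the 65 others. A short Python check, with a hypothetical helper name `count_language_pairs` and a made-up four-language sample used only to cross-check the formula:

```python
from itertools import permutations


def count_language_pairs(language_count):
    """Each language can be the source for every other language: n * (n - 1)."""
    return language_count * (language_count - 1)


# Cross-check the formula by enumerating ordered pairs for a small, made-up set.
sample = ["en", "fr", "de", "es"]
assert len(list(permutations(sample, 2))) == count_language_pairs(len(sample))

print(count_language_pairs(66))  # 4290
```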

On a basic level, machine translation performs simple substitution of words in one natural language for words in another, but that alone usually cannot produce a good translation of a text, because recognition of whole phrases and their closest counterparts in the target language is needed. Solving this problem with corpus and statistical techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies.
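The difference between substituting single words and recognizing whole phrases can be shown with a small sketch. The two lookup tables and the function names below are invented for this example and are far smaller than anything a real system uses; the phrase-aware version simply prefers the longest known phrase at each position.

```python
# Tiny hand-written English-to-French tables, invented for this example.
WORD_TABLE = {"i": "je", "want": "veux", "a": "un", "hot": "chaud", "dog": "chien"}
PHRASE_TABLE = {("hot", "dog"): "hot-dog"}


def substitute_words(sentence):
    """Naive approach: replace each word independently."""
    return " ".join(WORD_TABLE.get(w, w) for w in sentence.lower().split())


def substitute_phrases(sentence, max_len=3):
    """Prefer the longest known phrase at each position, then fall back to single words."""
    words = sentence.lower().split()
    output, i = [], 0
    while i < len(words):
        for size in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + size])
            if chunk in PHRASE_TABLE:
                output.append(PHRASE_TABLE[chunk])
                i += size
                break
        else:
            # No phrase starts here, so translate the single word.
            output.append(WORD_TABLE.get(words[i], words[i]))
            i += 1
    return " ".join(output)


print(substitute_words("I want a hot dog"))    # je veux un chaud chien
print(substitute_phrases("I want a hot dog"))  # je veux un hot-dog
```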

There are several approaches to machine translation technology.

Statistical machine translation
The statistical machine translation approach generates translations using statistical methods based on bilingual texts. It uses existing source and target language translations (done by human translators) to find patterns, which it then uses to build rules for translating between those languages. The accuracy of the statistical approach improves as more bilingual texts are used. Where such corpora are available, impressive results can be achieved when translating texts of a similar kind, but such corpora are still very rare.
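One classic way of finding such patterns is word alignment. The sketch below is a simplified version of IBM Model 1 trained with expectation-maximization (no NULL word, no smoothing); the post itself does not name a specific algorithm, so treat this as one possible illustration. The three-sentence English-German corpus and the function names are made up for the example.

```python
from collections import defaultdict

# Tiny invented parallel corpus: (English sentence, German sentence).
CORPUS = [
    ("the house", "das Haus"),
    ("the book", "das Buch"),
    ("a book", "ein Buch"),
]


def train_word_model(corpus, iterations=10):
    """Estimate t[(f, e)] = P(foreign word f | English word e), simplified IBM Model 1."""
    pairs = [(e.split(), f.split()) for e, f in corpus]
    # Start with the same score for every word pair that co-occurs in a sentence pair.
    t = {(f, e): 1.0 for e_words, f_words in pairs for f in f_words for e in e_words}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each foreign word among the English words in its sentence.
        for e_words, f_words in pairs:
            for f in f_words:
                norm = sum(t[(f, e)] for e in e_words)
                for e in e_words:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # M-step: turn the expected counts back into probabilities.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def best_translation(t, english_word):
    candidates = {f: p for (f, e), p in t.items() if e == english_word}
    return max(candidates, key=candidates.get)


model = train_word_model(CORPUS)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(model, word))
```

Nobody tells the program that "book" means "Buch"; it works this out because "Buch" is the only word that appears in both sentences containing "book". With only three sentences this is fragile, which is why the approach needs large bilingual corpora.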

Rule-based machine translation
Machine translation systems work with natural language, a data set that is infinitely varying, ambiguous and structurally complex. To translate adequately, a rule-based machine translation system must encode knowledge of hundreds of syntactic patterns, variations, and exceptions, as well as the relationships among these patterns. The system must also include a dictionary and specific semantic knowledge about the usage of tens of thousands of words.
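As a rough idea of what "encoding syntactic patterns and a dictionary" means, here is a toy rule-based translator for English noun phrases of the form "the [adjective] noun". The four-word lexicon, the two rules and the function name are all invented for illustration; a real system would need vastly more of each, and this sketch ignores details such as elision ("l'") and adjective agreement.

```python
# Hand-written lexicon, invented for this example:
# English word -> (part of speech, French word, grammatical gender).
LEXICON = {
    "the": ("DET", None, None),
    "red": ("ADJ", "rouge", None),
    "car": ("NOUN", "voiture", "f"),
    "book": ("NOUN", "livre", "m"),
}
DEFINITE_ARTICLE = {"m": "le", "f": "la"}


def translate_noun_phrase(phrase):
    """Translate 'the [adjective] noun' using two explicit rules."""
    tagged = [(word, *LEXICON[word]) for word in phrase.lower().split()]
    noun = next(entry for entry in tagged if entry[1] == "NOUN")
    adjectives = [entry[2] for entry in tagged if entry[1] == "ADJ"]
    output = []
    # Rule 1: the article agrees with the gender of the noun.
    if any(entry[1] == "DET" for entry in tagged):
        output.append(DEFINITE_ARTICLE[noun[3]])
    # Rule 2: the adjectives in this lexicon follow the noun in French.
    output.append(noun[2])
    output.extend(adjectives)
    return " ".join(output)


print(translate_noun_phrase("the red car"))   # la voiture rouge
print(translate_noun_phrase("the red book"))  # le livre rouge
```

Every new construction needs a new hand-written rule and every new word a new lexicon entry, which is exactly why rule-based systems end up with hundreds of patterns and tens of thousands of dictionary entries.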


Language learning tips

There are many reasons to learn a foreign language, from working in another country to discovering your roots, through intellectual curiosity, travel, and communication.

Which language should I learn?
Once you have decided to learn a language, you may not be quite sure which language to choose. To some extent, your choice depends on your reasons for learning a language. For example, if you’d like to communicate with as many people as possible, learning such languages as Mandarin Chinese, Spanish, French, Russian or Arabic would enable you to do so.

What materials and tools do I need to study a language?

There’s a wide range of materials and tools available to help you with your language studies, including language courses, dictionaries, grammar books, phrasebooks, online lessons…

How can I find time to study a language?

Finding time to study a language can be quite a challenge. You may think that you don’t really have enough of it, but it’s surprising how many spare moments you have during a typical day, and how they can add up to a useful amount of study.

What’s the best way to study?

After choosing a language, you can start thinking about how you’re going to study it. For some popular languages, there’s a wealth of materials available. For lesser-studied languages, the choice can be more limited. If courses are available in your area, it might help you to attend them, or you may prefer to study on your own, or to have individual lessons.

Learning pronunciation

Learning the pronunciation of a language is a very important part of your studies. It doesn’t matter so much if you just want to read and/or write the language, but if you want to speak a language well, as I’m sure you do, pay particular attention to the pronunciation and review it regularly.

Learning vocabulary

Building up your vocabulary in a foreign language can take many years. Learning words in context from written and spoken material is probably the most effective way to do this. You could also try learning words in a more systematic way – perhaps a certain number of words every day.

Learning grammar

Familiarity with the grammar of a language enables you to understand it, and also to construct your own phrases and sentences. It’s not essential to know all the grammatical terminology or to understand why words change, as long as you’re able to apply the relevant changes when necessary.

Learning alphabets and other writing systems

If the language you’re learning is written with a different alphabet or other type of writing system, learning it is well worth the effort. Some alphabets, such as Cyrillic and Greek, can be learnt without too much difficulty. Others, such as Devanagari and Thai, are more challenging.

Learning Chinese characters

If you’re learning one of the languages that use Chinese characters, such as Chinese, Japanese or Korean, you’re faced with quite a challenge. However, there are some techniques you can use to help you learn all those funny little pictures and symbols.

Careers using languages

What kind of jobs and careers are available to students of languages?