Category: Computer tools for translators

Getting to grips with post-editing machine translation

To meet the growing demand for translation, post-editing of machine translation output (PEMT) is increasingly being adopted as a mainstream alternative working method. The compelling reason behind this trend is the widely reported increase in productivity compared with human translation, together with a comparable and sometimes higher level of quality. The skills required for post-editing are different from those needed for editing author-written texts and from those required for translation. This workshop aims to familiarize attendees with post-editing methods by analysing the typical mistakes of both neural and statistical machine translation (MT). It also provides some insight into why certain errors occur in raw MT output through a presentation of the historical development of the technology. It will conclude with a discussion of when PEMT should and should not be used and how raw MT output can be improved through preparatory steps.


Half-day practical course on post-editing

PART 1
After looking at various standard industry guidelines for light and full post-editing, half the attendees will translate short texts from various languages into English or vice versa, and the other half will carry out full post-editing on machine-translated versions of the same texts. The two groups will then come together in pairs according to language combination and, together with the speaker, compare the results in terms of both productivity increase and overall quality.

PART 2
In the second part of the Lab, all the attendees will receive a machine-translated text to post-edit from Italian or Spanish into English, or vice versa. While they are doing so, they will also be asked to use any knowledge they may have of how machine translation works to attempt a preliminary categorization of the errors they find. The speaker will then present an analysis of the errors in the raw outputs, as well as other typical errors which occur, in order to provide practical tips for post-editors. Most of the error types analysed are language-independent and attendees who do not normally work with Spanish or Italian but are familiar with a Neo-Latin language are still likely to find the practical exercise useful.



Is full post-editing advisable for creative texts too?

TranslatingEurope Workshop 2019 in Milan

A comparison of human translation and post-edited machine translation shows that certain turns of phrase, expressions and choices of terms occur more frequently in the latter than in the former. This implies that post-edited texts are, on average, poorer in the variety and inventiveness typical of human translation, and any attempt to eliminate what are to all intents and purposes machine translation markers would require additional post-editing effort and cancel out most of the time savings and economic advantages. Naturally, variety and inventiveness are not always desirable features in a translation. Nevertheless, there are many types of text in which homogenization and uniformity would make the translation less interesting to read and less intellectually stimulating. In these cases, failure to eliminate these markers may in the long run lead to lexical impoverishment of the target language.

This presentation illustrates the risks associated with the indiscriminate use of post-edited machine translation, so that LSPs are in a position to judge when it is appropriate to use it.


Is full post-editing of machine translation output a pipe dream?

Comparison shows that certain turns of phrase, expressions and choices of words occur with greater frequency in post-edited machine translation output than they do in human translation. This implies that post-edited texts, on average, lack the variety and inventiveness of human translation, and any attempt to eliminate what are effectively machine translation markers would require additional post-editing effort and nullify most, if not all, of the time and cost-saving advantages. Of course variety and inventiveness are not always desirable features. Nevertheless, there are various kinds of text where homogenization and uniformity would make the translation less interesting to read and less intellectually stimulating. In such cases, failure to eradicate these markers may eventually lead to lexical impoverishment of the target language.

This talk will illustrate the risks involved in using post-edited machine translation output indiscriminately and put the translator in a position to explain when its use might be detrimental.


Raw Output Evaluator, a freeware tool for manually assessing MT raw output

Raw Output Evaluator is a freeware tool, which runs under Microsoft Windows. It allows quality evaluators to compare and manually assess raw outputs from different machine translation engines. The outputs may be assessed in comparison to each other and to other translations of the same input source text, and in absolute terms using standard industry metrics or ones designed specifically by the evaluators themselves. The errors found may be highlighted using various colours. Thanks to a built-in stopwatch, the same program can also be used as a simple post-editing tool in order to compare the time required to post-edit MT output with how long it takes to produce an unaided human translation of the same input text. The MT outputs may be imported into the tool in a variety of formats, or pasted in from the PC Clipboard. The project files created by the tool may also be exported and re-imported in several file formats. Raw Output Evaluator was developed for use during a postgraduate course module on machine translation and post-editing.
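
The arithmetic behind a timing comparison of this kind can be sketched in a few lines of Python. This is only an illustration and not how Raw Output Evaluator itself works: the word count, the timings and the helper names `words_per_hour` and `time_saving` are all hypothetical.

```python
def words_per_hour(word_count: int, seconds: float) -> float:
    """Throughput in source words per hour for one timed task."""
    return word_count * 3600 / seconds


def time_saving(ht_seconds: float, pemt_seconds: float) -> float:
    """Fraction of the human translation time saved by post-editing."""
    return (ht_seconds - pemt_seconds) / ht_seconds


# Invented stopwatch readings for the same 300-word source text.
source_words = 300
ht_seconds = 1800.0    # unaided human translation: 30 minutes
pemt_seconds = 1200.0  # post-editing raw MT output: 20 minutes

ht_speed = words_per_hour(source_words, ht_seconds)
pemt_speed = words_per_hour(source_words, pemt_seconds)
saving = time_saving(ht_seconds, pemt_seconds)

print(f"HT:   {ht_speed:.0f} words/hour")
print(f"PEMT: {pemt_speed:.0f} words/hour")
print(f"Time saved by post-editing: {saving:.0%}")
```

A comparison like this is only meaningful when both timings refer to the same input text, as in the description above.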


Tutto quello che avresti voluto sempre sapere sul post-editing ma che avevi paura di chiedere* (Everything you always wanted to know about post-editing but were afraid to ask)

Or, an example of why post-editors are needed, in a single title

Programme
    • A brief history of machine translation
    • How things work: machine translation
      (or, machine translation for dummies)
    • Post-editing guidelines
    • Post-editing versus human translation: a challenge
    • The stink of machine translation
    • Identifying and classifying machine translation errors
    • Techniques for improving the quality of raw output

There will be two practical exercises from English into Italian, to be carried out on participants' own laptops or tablets, which they are invited to bring along (with power cables if necessary). No particular software is required other than an ordinary word processor, but at least a fair knowledge of English is needed.

* The main Italian title is the raw output of a well-known neural machine translation engine, without post-editing.


Machine Translation Markers in Post-Edited Machine Translation Output

Photo courtesy of Sarah Bawa Mason

The author has conducted an experiment for two consecutive years with postgraduate university students in which half do an unaided human translation (HT) and the other half post-edit machine translation output (PEMT). Comparison of the texts produced shows, rather unsurprisingly, that post-editors faced with an acceptable solution tend not to edit it, even when, as is often the case, more than 60% of the translators tackling the same text prefer a range of other solutions. As a consequence, certain turns of phrase, expressions and choices of words occur with greater frequency in PEMT than in HT, making it theoretically possible to design tests to tell them apart. To verify this, the author successfully carried out one such test on a small group of professional translators. This implies that PEMT may lack the variety and inventiveness of HT, and consequently may not actually reach the same standard. It is evident that the additional post-editing effort required to eliminate what are effectively MT markers is likely to nullify a great deal, if not all, of the time and cost-saving advantages of PEMT. However, the author argues that failure to eradicate these markers may eventually lead to lexical impoverishment of the target language.
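
The kind of frequency comparison described above could be sketched as follows. This is a toy illustration only: the sample sentences, the candidate marker phrase and the helper `share_containing` are all hypothetical and are not taken from the author's experiment or test.

```python
def share_containing(texts: list[str], phrase: str) -> float:
    """Fraction of texts that contain the phrase (case-insensitive)."""
    if not texts:
        return 0.0
    hits = sum(phrase.lower() in text.lower() for text in texts)
    return hits / len(texts)


# Invented toy data: four versions of the same sentence per group.
ht_versions = [
    "The measure is due to take effect next year.",
    "The measure will come into force next year.",
    "The provision becomes effective next year.",
    "Next year the measure will start to apply.",
]
pemt_versions = [
    "The measure will come into force next year.",
    "The measure will come into force next year.",
    "The measure will come into force from next year.",
    "The provision will enter into force next year.",
]

candidate_marker = "come into force"
ht_share = share_containing(ht_versions, candidate_marker)
pemt_share = share_containing(pemt_versions, candidate_marker)

print(f"HT versions containing the phrase:   {ht_share:.0%}")
print(f"PEMT versions containing the phrase: {pemt_share:.0%}")
```

A phrase that is perfectly acceptable in itself but turns up in a much larger share of the post-edited versions than of the human translations is the sort of candidate a test for telling the two apart might look at.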

Read the full academic paper.

Download the presentation.
Translating and the Computer 40


Raw Output Evaluator, a freeware tool for manually assessing raw outputs from different machine translation engines


Photo courtesy of Sarah Bawa Mason


Read the full academic paper.

Download the presentation.
Translating and the Computer 40


The stink of machine translation

Listen to Lisa Agostini’s interview with me about my presentation. Thanks to MET and Julian Mayers (Yada Yada) for their permission to post the interview here.

Post-editors are asked to do either light post-editing, to get rid of the worst machine translation errors, or full post-editing, to bring the output up to the same standard as human translation.

But is full post-editing in reality a pipe dream?

The speaker has conducted an experiment for two years running with groups of postgraduate university students in which half do an unaided human translation and the other half post-edit machine translation output. Comparison of the texts produced shows that certain turns of phrase, expressions and choices of words occur with greater frequency in the post-edited machine translation output than they do in human translation. This is easily explained by two facts: even neural machine translation systems seem to choose the statistically most frequent solution, even when that solution occurs less often than all the other possible solutions combined, and post-editors faced with an acceptable solution tend not to edit it. This, however, implies that post-edited machine translation output, on average, lacks the variety and inventiveness of human translation, and therefore does not in fact reach the same standard. It is evident that the additional post-editing effort required to eliminate what are effectively machine translation markers would nullify most, if not all, of the time and cost-saving advantages of post-edited machine translation. On the other hand, failure to eradicate these markers may eventually lead to lexical and syntactic impoverishment of the target language.
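
The frequency argument above can be illustrated with a minimal sketch. The candidate renderings and their counts are invented, and `pick_most_frequent` is a hypothetical stand-in for the engine's choice, not a description of how any real machine translation system is implemented.

```python
from collections import Counter

# Hypothetical counts of how often human translators chose each rendering
# of one source phrase (invented numbers, for illustration only).
human_choices = Counter({
    "rendering A": 40,
    "rendering B": 25,
    "rendering C": 20,
    "rendering D": 15,
})


def pick_most_frequent(choices: Counter) -> str:
    """Return the single most frequent rendering (stand-in for the engine's pick)."""
    return choices.most_common(1)[0][0]


def share(choices: Counter, rendering: str) -> float:
    """Fraction of all choices accounted for by one rendering."""
    return choices[rendering] / sum(choices.values())


mt_pick = pick_most_frequent(human_choices)
mt_share = share(human_choices, mt_pick)
others_share = 1.0 - mt_share

print(f"Engine pick: {mt_pick} ({mt_share:.0%} of human choices)")
print(f"All other renderings combined: {others_share:.0%}")
```

In this toy example the most frequent rendering accounts for only 40% of the human choices. If every post-editor leaves that acceptable pick untouched, it appears in all the post-edited texts, while the other renderings that together make up 60% of the human choices disappear.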

The speaker provides examples of post-editing and translation from English into Italian. However, with the aid of some back-translations, the mechanisms at play should be equally clear to non-Italian speakers, particularly if they are familiar with other Neo-Latin languages.
