Building a Custom Machine Translation Engine as part of a Postgraduate University Course: a Case Study
In 2015, I was asked to design a postgraduate course on machine translation (MT) and post-editing. Following a preliminary theoretical part, the module concentrated on the building and practical use of custom machine translation (CMT) engines. This was a particularly ambitious proposition since it was not certain that students with undergraduate degrees in languages, translation and interpreting, without particular knowledge of computer science or computational linguistics, would succeed in assembling the necessary corpora and building a CMT engine. This paper looks at how the task was successfully achieved using KantanMT to build the CMT engines and Wordfast Anywhere to convert and align the training data.
The course was clearly a success since all students were able to train a working CMT engine and assess its output. The majority agreed that their raw CMT engine output was better than Google Translate’s for the kinds of text the engine was trained on, and better than the raw output (pre-translation) from a translation memory tool.
There was some initial scepticism among the students regarding the actual usefulness of MT, but the mood had clearly changed by the end of the course, with virtually all students agreeing that post-edited MT has a legitimate role to play.
Translating and the Computer 39: proceedings. Asling: International Society for Advancement in Language Technology, 16-17 November 2017; pp. 35-39 (ISBN 978-2-9701095-3-2).