Easy for humans, hard for computers

Machine translation can do impressive things, as I’ve noted before. However, there are some things humans do easily that would be very difficult (perhaps impossible) to build into MT programs.

I’m part of an ongoing machine translation post-editing (MTPE) job for a client who needs to translate large amounts of customer feedback that come in every day. In theory, MTPE is perfect for this because it can handle a high volume of text cheaply, and the client does not need the translations to be beautifully written – they just need to get a basic understanding of the customer feedback.

However, because this is feedback from the general public, it has a lot of little problems that turn out to be big problems for the MT:

  1. Spelling errors

The general public is really lazy about capitalization, which is a pity because capitalization is the only thing that lets MT know if a German-speaking writer means “they” (sie) or “you [formal]” (Sie). At least 50% of the people who type this feedback write “sie” when they should be writing “Sie,” and the MT has no choice but to translate it as “they,” because it’s ultimately just a pattern-matching machine. But as a human, it’s usually easy for me to tell from the context which one the writer means.
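A tool can't make that call, but it can at least point a post-editor at the spots where it needs to be made. Here is a minimal Python sketch, assuming the feedback arrives as plain strings; the function name `flag_lowercase_sie` and the sample comment are made up for illustration. It only flags candidates, since a lowercase "sie" is often perfectly correct ("she" or "they").

```python
import re

# Case-sensitive on purpose: only the lowercase spelling is in doubt here.
LOWERCASE_SIE = re.compile(r"\bsie\b")


def flag_lowercase_sie(comment, window=20):
    """Return a snippet of context around each lowercase "sie" in the comment."""
    snippets = []
    for match in LOWERCASE_SIE.finditer(comment):
        start = max(0, match.start() - window)
        end = min(len(comment), match.end() + window)
        snippets.append(comment[start:end])
    return snippets


# Hypothetical feedback: the first "sie" almost certainly means formal "Sie".
comment = "Können sie mir bitte helfen? Sie haben mein Paket verloren."
for snippet in flag_lowercase_sie(comment):
    print("check by hand:", snippet)
```

Deciding which reading is right is still the human's job; the sketch just shortens the search.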

Typos are another source of misunderstanding. In one of my jobs, a respondent had typed just two letters in reverse order, turning “Zahnbürste” (toothbrush) into “Zahnbrüste,” which the MT interpreted as “dental breasts.” Not a bad guess for a machine, but humans can see right away what kind of error this is and what the correct translation should be.
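The mechanical half of that fix is easy to sketch: swap each pair of neighboring letters and see whether a known word comes out. The Python below is an illustration only. The function name `transposition_candidates` and the three-word vocabulary are hypothetical, and a real word list would be far larger.

```python
import unicodedata


def transposition_candidates(word, vocabulary):
    """Return vocabulary words reachable by swapping two adjacent letters."""
    # Normalize so that "ü" is one character, not "u" plus a combining mark.
    word = unicodedata.normalize("NFC", word)
    known = {unicodedata.normalize("NFC", entry) for entry in vocabulary}
    candidates = []
    for i in range(len(word) - 1):
        swapped = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        if swapped != word and swapped in known:
            candidates.append(swapped)
    return candidates


# Hypothetical mini-vocabulary for illustration.
vocabulary = {"Zahnbürste", "Zahnpasta", "Zahnseide"}
print(transposition_candidates("Zahnbrüste", vocabulary))
```

The hard half is knowing when to run a check like this at all. “Zahn” and “Brüste” are both real German words, so nothing about the misspelled compound looks wrong to a machine. It takes context to notice that a typo has happened in the first place.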

Major typos just show up on the target side unchanged because the MT can’t interpret them at all. 90% of the time I can figure them out quickly.

  2. Autocorrect

There’s no way for the MT to figure out that when someone writes to a company to complain about substandard “girls,” they actually meant “forks.” Humans are good at context clues.

  3. Poorly expressed ideas

If you’re not using punctuation, you don’t speak the language you’re typing in very well, or you’re just too angry to construct a coherent sentence, the MT is going to translate your comment as a crazy word salad, e.g.:

“9 months prior has I will go to the Media and the way in which you handled this üb he England, I can also not the process started I do I go to bild newspaper in the order to the internet to my attorney I have never before a customer hang 1 ¾ year, I repeat it’s totally shameless I have sent back the questionnaire now and receive a reminder, the [sic], they know even if the eingeganen not an is; it’s a bottomLess outrageous only pass is its motto it’s not my problem it’s really shameless two weeks, I give you still danns 14 days, I give you still then”

Comments from people who struggle with German show up now and again in this job. One of them began “Where whore does it say you can’t use this product on tattooed skin?” At first I thought the commenter was insulting the customer service rep (“Where, whore, does it say…”) but I concluded it was probably something like, “Where the hell does it say…” based on a series of judgements MT doesn’t have the mental flexibility or experience for.

So never fear, human translators: even if MT takes over most of the translation business, there will always be work for you.

[This post was updated on 9/4/2018 to include some good examples that crossed my path.]

