Category: Machine translation

  • Senta spinnt

    The Milwaukee Symphony Orchestra put on an excellent concert performance of Der fliegende Holländer (The Flying Dutchman) last week. Soprano Melody Moore brought the house down with her powerful singing and her spirited characterization of Senta.

    Speaking as someone with a degree in Complaining About Wagner, I must admit I really like this opera. It’s probably my favorite Wagner opera — and yes, I know that means I’ll never win the Wagner Snob of the Year award, but that’s OK. Superfans who attain enlightenment by boring themselves to death at Tristan und Isolde or Parsifal are welcome to it.

    One line jumped out at me from the spinning scene (go to 46:00 if it doesn’t take you straight there):

    Mary, who’s trying to keep the girls on task, says to Senta:

    Du böses Kind! Wenn du nicht spinnst,
    vom Schatz du kein Geschenk gewinnst.

    (You naughty girl, if you don’t spin,
    you’ll get no gift from your sweetheart.)

    It’s mildly amusing because in modern colloquial German, if someone “spins” it means they’re crazy. Du spinnst = you spin = you’re nuts! (Whereas in Wagner’s time a crazy person was “toll” but nowadays that would mean they’re cool.)

    I wondered how the best free machine translator, namely DeepL, would handle this. Behold:

    If you’re not crazy, from the treasure you don’t win a gift.

    And Google Translate says: If you are not crazy, from the treasure you win no gift. (wrong and awkward)

    This is a reasonable error for MT to make because it’s entirely possible that every single time it’s encountered “du spinnst” in a text, “you’re crazy” has been the correct translation. But it’s still wrong. I tried giving it some more context in case it had been programmed to recognize names from classic literature:

    MARY (to Senta): You evil child! If you are not crazy, from the treasure you don’t win a gift.

    No luck. You’ve probably also noticed the other big error here: “Schatz” does mean “treasure,” but in this context it means “sweetheart.” In a similar vein, DeepL translates “I love my sweetie pie” as “Ich liebe meinen süßen Kuchen” (literally, “I love my sweet cake”), with “süße Torte” and “süße Pastete” (roughly “sweet tart” and “sweet pâté”) as equally clueless alternatives.
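
    (A side note for the technically inclined: you can run this kind of test in bulk instead of pasting lines into the web interface. Below is a minimal sketch using what I understand to be DeepL’s official Python client – the auth key is obviously a placeholder, and the translations you get back will depend on whatever model is live that day.)

    # Minimal sketch: feed the couplet to DeepL with and without stage
    # context, using the `deepl` Python package (pip install deepl).
    # The auth key is a placeholder; outputs will vary with the current model.
    import deepl

    translator = deepl.Translator("YOUR-DEEPL-AUTH-KEY")  # placeholder key

    tests = [
        # The bare couplet from the spinning scene
        "Wenn du nicht spinnst, vom Schatz du kein Geschenk gewinnst.",
        # The same couplet with speaker and addressee named, to see whether
        # context from the libretto nudges the translation anywhere useful
        "MARY (zu Senta): Du böses Kind! Wenn du nicht spinnst, "
        "vom Schatz du kein Geschenk gewinnst.",
    ]

    for source in tests:
        result = translator.translate_text(source, source_lang="DE", target_lang="EN-US")
        print(source)
        print("  ->", result.text)
        print()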

    It’s interesting to consider how much effort you, as a human, have to expend to understand what is going on in this scene and determine what the correct translation of “du spinnst” would be — practically none. Whereas the best MT, despite its speed, has no idea what a spinning wheel is, no concept of how and why words acquire new meanings over time, and no ability to think, “Is this a proverb? Is it from a fairy tale? An opera? I’d better check.”

    And actually, you might come out of The Flying Dutchman thinking Senta’s a little crazy. But DeepL doesn’t think at all.

  • These pants.

    More avant-garde poetry brought to you by a machine translation system that is still learning the ropes:

    I put my son on my son,
    no. 6,
    and he scratched herself,
    thought it was all about this pants,
    I thought that it comes from these pants.
    I am still sending a picture of these pants,
    I am still sending pictures of these pants,
    I am still sending pictures of these pants,
    I still send photos of them.
    Maybe the pants came from the pants.

    (This output is dramatically different from the source, which was longer and didn’t mention pants or pictures nearly this much. The poor grammar and punctuation in the source probably threw the machine into a panic.)

  • Easy for humans, hard for computers

    Machine translation can do impressive things, as I’ve noted before. However, there are some things humans do easily that would be very difficult (perhaps impossible) to build into MT programs.

    I’m part of an ongoing machine translation post-editing (MTPE) job for a client who needs to translate large amounts of customer feedback coming in every day. In theory, MTPE is perfect for this because it can handle a high volume of text cheaply, and the client does not need the translations to be beautifully written – they just need to get a basic understanding of the customer feedback.

    However, because this is feedback from the general public, it has a lot of little problems that turn out to be big problems for the MT:

    1. Spelling errors

    The general public is really lazy about capitalization, which is a pity because capitalization is the only thing that lets the MT know whether a German-speaking writer means “they” (sie) or “you [formal]” (Sie). At least 50% of the people who type this feedback write “sie” when they should be writing “Sie,” and the MT has no choice but to translate it as “they,” because it’s ultimately just a pattern-matching machine. As a human, though, I can usually tell from the context which one the writer means.

    Typos are another source of misunderstanding. In one of my jobs a respondent had typed just two letters in reverse order, turning “Zahnbürste” (toothbrush) into “Zahnbrüste,” which the MT interpreted as “dental breasts.” Not a bad guess for a machine, but humans can see right away what kind of error this is and what the correct translation should be.

    Major typos just show up on the target side unchanged because the MT can’t interpret them at all. 90% of the time I can figure them out quickly.

    2. Autocorrect

    There’s no way for the MT to figure out when someone writes a company to complain about substandard “girls” that they actually meant “forks.” Humans are good at context clues.

    3. Poorly expressed ideas

    If you’re not using punctuation, don’t speak the language you’re typing in very well, or are just too angry to construct a coherent sentence, the MT is going to translate your comment as a crazy word salad, e.g.:

    “9 months prior has I will go to the Media and the way in which you handled this üb he England, I can also not the process started I do I go to bild newspaper in the order to the internet to my attorney I have never before a customer hang 1 ¾ year, I repeat it’s totally shameless I have sent back the questionnaire now and receive a reminder, the [sic], they know even if the eingeganen not an is; it’s a bottomLess outrageous only pass is its motto it’s not my problem it’s really shameless two weeks, I give you still danns 14 days, I give you still then”

    Comments from people who struggle with German show up now and again in this job. One of them began “Where whore does it say you can’t use this product on tattooed skin?” At first I thought the commenter was insulting the customer service rep (“Where, whore, does it say…”) but I concluded it was probably something like, “Where the hell does it say…” based on a series of judgements MT doesn’t have the mental flexibility or experience for.

    So never fear, human translators: even if MT takes over most of the translation business, there will always be work for you.

    [This post was updated on 9/4/2018 to include some good examples that crossed my path.]

  • If this translation is wrong, I don’t want to be right

    Recently, while slumming it at the Extremely Cheap Translation Service, I ran across two texts that were strange in the same way.

    The first was a German text: “Hallo meine hübsche Dame, wie machst du diesen schönen Morgen?”, which was already translated as “Hello my pretty lady, how are you doing this fine morning?” (checking existing translations into English is my job at the Extremely Cheap Translation Service).

    And… yeah… that is what it says. Except that it doesn’t, because that’s not how you ask someone how they’re doing in German. “Wie machst du” is a literal translation of the English “How are you doing?”, and you don’t address a Dame with the informal “du” either. So as a German sentence, this thing was a mess. But my job was to approve the English translation, and yes, the English made perfect sense in English. It did fail to reflect the fact that the German didn’t make sense in German because it was too English; to capture that, it should perhaps have read, “Hello my pretty lady, how goes it to You this fine morning?” Anyway, I approved it, but I would have liked to know who put this through and why. If they were back-translating to check that their German sentence was correct, they got the wrong impression.

    A few weeks later it happened again, at length and in French. This time it was advertising copy about a winter vest. The French side was essentially a literal translation of English, only sometimes it was worse than literal. For example:

    “Ce veste volonté toujours tirer par.” (“This vest will always pull through.”)

    The “worse than literal” part here is that “will” is “volonté,” which is the noun “will,” as in “Thy will be done.” The slightly less bad but still really bad thing about this sentence is that “tirer” means “pull” and “par” means “through” but if you know anything about languages, you know that kind of translation is not going to work. And keep in mind, this was a job where the source text was French, so what I was supposed to do was evaluate the English target text.

    And the English side was perfect. That’s actually a little weird, because the MT software used by the Extremely Cheap Translation Service is programmed to translate normal French into English, so it seems like this bizarre Franglais should have resulted in something confusing. It could be that it came out weird and the first human editor (there are always two human editors) guessed the client’s intentions and rewrote it.

    This time I flagged it – it was long and seemed to have a more serious destiny than that little German sentence – and wrote a note about how the target text was fine but the source text was nonsense. Incidentally, that’s about all the attention you can expect from your editors at the Extremely Cheap Translation Service.

    ETA: Just to be clear, the ECTS didn’t do anything wrong here. Their clients were just being weird.