Author: lefreeburn

  • Deadly MT

    I often run legal boilerplate (with all specifics of the case removed) through DeepL as a preliminary measure. Today’s experiment was from an alimony contract. And I’m pretty sure the person who signed it didn’t intend to agree to this:

    Because of the fulfilment of the obligations arising from this document, I submit myself to immediate execution.

    (Wegen der Erfüllung der Verbindlichkeiten aus dieser Urkunde unterwerfe ich mich der sofortigen Zwangsvollstreckung.)

  • Crow vs. Raven

    Die Krähe

    Eine Krähe war mit mir
    Aus der Stadt gezogen,
    Ist bis heute für und für
    Um mein Haupt geflogen.

    Krähe, wunderliches Tier,
    Willst mich nicht verlassen?
    Meinst wohl bald als Beute hier
    Meinen Leib zu fassen?

    Nun, es wird nicht weit mehr gehn
    An dem Wanderstabe.
    Krähe, lass mich endlich sehn
    Treue bis zum Grabe!

    Reflecting on this song from Winterreise, I had what I thought was a pretty good insight: namely, that even though the bird in it is a crow, you should translate “Krähe” as “raven” because they have the same number of syllables and the same vowel sounds.

    In particular, it totally takes care of this measure:

    “Ra-ven,” — that’s perfect.

    “Anyway,” I thought, “Who can even tell the difference between a raven and a crow?” And then of course I googled “raven vs. crow” and it turns out that in addition to being smaller, crows “like being in human populated areas” and are “more social and audacious” whereas ravens are aloof and cautious. Crows make that high-pitched “caw, caw” sound but ravens make a “low and hoarse” sound. And ravens (“Raben”) are mentioned elsewhere in Winterreise. Presumably the poet had his reasons for choosing one or the other.

    So even though my primary motivation for translating this poem was to fit the word “raven” neatly into it, I decided to maintain the original crow. Here is a crow-based version:

    As I set off through the snow
    Leaving town forever
    Round my head there flew a crow
    And it leaves me never.

    Strange bird, following my way,
    Will you not depart?
    Surely, you intend to prey
    On my broken heart?

    Not much longer now, my friend
    Till my walk is over,
    Crow, be with me at the end,
    My one faithful lover.

    But then I was out for a drink on New-Year’s-Eve Eve and thought what the heck, let’s do one with a raven too. I jotted down a first draft with a Moscow Mule (thanks go out to my drinking buddy for rhyme suggestions) and refined it this morning. Here you go:

    One dark raven came with me
    As I left the silent town
    Still it keeps me company
    Flying round and round.

    Raven, strange uncanny thing,
    Will you never leave me?
    After all my wandering
    Will your talons claim me?

    Now have patience, soon I’ll lay
    Down my staff with one last breath
    Raven, only you will stay
    Faithful unto death.

    Which of my versions do you prefer – raven or crow? Or do you have a better version of your own? If so, paste it in the comments.

    Incidentally, when I translate these Lieder I think more about whether they can be sung to the tune than whether they match the number of German syllables. Which means sometimes you’d be singing a one-syllable English word on two notes where the German has a different syllable on each note, and vice versa.


  • Warning: explicit language

    This morning someone on my local NPR station said that English is a very implicit language, while American Sign Language is a very explicit language.

    What the heck does that mean?

    The speaker, who interprets into ASL at stage productions, cited the example of translating the phrase “a cosmopolitan city”: “Well, you can finger-spell ‘cosmopolitan,’ but what does that actually mean? It’s a diverse city that has money and culture. So you have to expand on that, you don’t just finger-spell ‘cosmopolitan.’”

    That’s a good example of strategy for interpreting and translation, and an interesting insight about ASL. But does it show that ASL is “more explicit” than English?

    What this example seems to indicate is that ASL has a smaller vocabulary than English. This occurred to me because I have a small French vocabulary and it causes me to have conversations like this:

    Me: Hello, I would like to pay to have a car for one day.

    French person: Ah, you would like to rent a car, Madame.

    Now, most people would say, “That lady doesn’t know a lot of French.” Et oui, c’est vrai. But you could also say something like, “That lady’s French idiolect is more explicit than standard French. She mentioned payment and a limited time period, both of which are merely implied by the word ‘rent’.”

    Somewhat analogously, ASL has a smaller word bank than the major spoken languages. Apparently this fact often leads to underestimation of its value and is therefore a sore point – see for example this post on Quora or Wikipedia’s assertion that “ASL users face stigma due to beliefs in the superiority of oral language to sign language.” Honestly, this is news to me, because I’ve never seen ASL portrayed as anything but super cool, but any linguist who’s made small talk has encountered the idea that the language with the most words is The World’s Best Language, so I get that an apparent paucity of vocab in ASL could be seen as a marker of inferiority.

    OK. So ASL doesn’t have one word that is the precise equivalent of “cosmopolitan.” It probably doesn’t have the words “antediluvian” or “postlapsarian” either. But guess what? All those “words” are compounds made up of smaller words. A cosmopolitan place is a “world city” (Greek cosmos + polis), while antediluvian things happened “before [the] flood” (Latin ante + diluvium) and postlapsarian things happened “after [the] fall” (Latin post + lapsus). In other words, if ASL has words for “world,” “city,” “before,” “flood,” “after,” and “fall,” it can recreate all those compound words – and because they’re not cloaked (as they are in English) in the sounds of other languages, their meaning will be plainer.

    A language is a tool, and if you can use it to express what you want to express, it’s a good tool. You don’t need to worry about how many words you have. I get along just fine in France with my circumlocutions.

    And no language has achieved perfect specificity anyway. What are glasses, for example? Do you drink out of them or wear them over your eyes? Once a news team at the mall asked my friend what she was buying her mom for Christmas, and she said, “Glasses…um, drinking glasses” and made a drinking motion as if she needed a kind of sign language to come to her rescue where English had failed. In German, there’s no confusion: a “Brille” is a pair of eyeglasses and “Gläser” are what you drink from. But then German has the same word for “lentils” and “lenses.” You just have to deduce from context that your friend isn’t getting a prescription for contact lentils, or inviting you over for a steaming bowl of lens soup. We all seem to be communicating pretty well despite these deficiencies in vocab, though.

    Back to the person I heard on the radio: at first I thought this was one of the dumb generalizations people are always making about language, like “English is very precise” (people who say this kind of stuff generally only speak English). But now I think that person was getting a point across in a rather brilliant way. Being a specialist in ASL and therefore sensitive to concerns about how it is perceived, they didn’t want to frame this interpreting problem as “ASL doesn’t have as many words as English.” Instead they thought it through in a different way: one English word, like “cosmopolitan,” implies a whole set of characteristics, whereas in ASL you would specify each characteristic, so… “English is a very implicit language, while American Sign Language is a very explicit language.” Got it.

  • Krabat

    “There’s a kind of magic that must be learned with toil and difficulty, line by line, spell by spell, the magic of the Book of Necromancy; and then there’s another kind that springs from the depths of the heart, from caring for someone and loving him. It’s hard to understand, I know, but you had better trust that magic, Krabat.”

    Just in time for Halloween, here’s a spooky book that’s been a favorite of German-speaking children since 1971.

    Entitled simply Krabat, it first appeared in English as The Satanic Mill – an accurate title but not one that screams “stocking stuffer.” So some shrewd editor has given it a new English title remarkably similar to the US title of the first Harry Potter book, Harry Potter and the Sorcerer’s Stone: namely, Krabat and the Sorcerer’s Mill.

    To be fair, that’s also accurate, though if you pick up this book hoping for something in the HP vein, you’ll notice a distinct lack of quaint school uniforms, lovable professors, and zany spells. The HP series does engage with darkness and death, but its overall mood is fun and sparkly (and merchandisable). In Krabat there’s not much sparkle to balance out the murder, suicide, nightmares, and heavy bags that no one wants to know the contents of. There’s a guy called Big Hat with – wait for it – a big hat, but it doesn’t tell people whether they’re Hufflepuffs or whatever. There’s a school for magic, but you wouldn’t want to enroll. Its only real plus is that there’s always plenty of oatmeal.

    Krabat is a Wendish, aka Sorbian, boy living during the Thirty Years’ War. Like HP, he’s an orphan who receives a call (in this case, through a voice in his dreams) to join the apprentices at a mill that doubles as a school for magic. Although there’s camaraderie among the boys and a few traditions that brighten up the year, the prevailing atmosphere is one of dread. You never get the feeling that Krabat is any better off for having joined the mill and learned what there is to learn from the Book of Necromancy – by following the sorcerer’s call, he becomes ensnared in a trap from which spells cannot free him. If you’d like to know what does free him, buy it from the NYRB here.

    The NYRB translation is by Anthea Bell, one of the superstars of the translation world, whose work you’ll be familiar with if you’ve ever read an Asterix book in English. She also translated the Inkworld trilogy by Cornelia Funke, who loved Krabat as a child…another turn of the literary mill wheel.

    I gave Krabat and the Sorcerer’s Mill to a fourteen-year-old American girl and she devoured it in a couple days. Her only complaint was that the dénouement is very rapid, which is true. Anyway, consider buying it for a young person you know…especially now that you don’t have to explain to their parents why The Satanic Mill is a great children’s book.

  • Interlingual puns

    Some people seem to think (see this post) that a really great machine-translation program would be able to “handle complicated multilingual puns with ease.” But what is a “multilingual pun” anyway?

    The prefix “multi” implies more than two, and honestly, off the top of my head I can’t think of any puns involving more than two languages. If you have one, please send it in!

    But I know of some puns between two languages – I think these are properly called “interlingual puns.”

    For example, here’s an old joke my Dad used to enjoy telling: “What did the alien say in the music store?” – “Take me to your Lieder.”

    Ha ha ha ha ha ha ….sigh.

    Consider whether an AI machine translator could handle this pun with ease – by “handle” I assume the author meant “translate.” Could the best ever future MT translate this joke into Japanese, or Thai, or French, or Navajo, or Spanish? Um, no? Neither could a human translator. It’s just not possible. If it appeared in a story you were translating, you’d insert a similar joke, but you couldn’t reproduce this exact joke. Some things really are not translatable.

    The joke works by combining English and German, so could you translate it into German? Again, no. It wouldn’t work in reverse, so to speak. German speakers with good English and a good knowledge of 20th-century pop culture would be able to get it, though.

    OK, let’s try another one. I saw this pun being assembled in real time on social media. A friend of mine who was getting a PhD in Arab Studies posted:

    ABD

    which his academic pals recognized as the acronym for “all but dissertation,” i.e. he had reached an important phase in his course of study. Congratulations were offered in the comments section, but someone also left this comment:

    abd al-dissertation

    which is a pretty funny pun because it sounds like an Arabic name meaning “servant of the dissertation,” and that of course is what he was going to be for the next year or so. (For more context on the Arabic name in question, go here.)

    Is this translatable into any other language? Nope. And there’s nothing humans or MT can do about it. Part of the romance of translation is the bittersweet knowledge that some things just can’t be carried over into another language, like rare flowers that won’t grow outside their native land.

  • Homeless in Vienna

    Vienna has once again been ranked the world’s best city to live in. Apparently it’s won the Mercer Quality of Living Survey 10 years in a row and from what I saw on my last visit there, I’d say the honor is well-deserved.

    But Vienna hasn’t always been such a great place to live. In the good old days it was probably the best place to develop your mad composition skills, eat delicious pastries, and get syphilis, but the actual living part was a challenge.

    Throughout the nineteenth century and into the twentieth, Vienna suffered from an increasingly severe housing shortage. The city’s population increased roughly tenfold over the course of the nineteenth century. Although that wasn’t an unusual rate of urban growth for the time (and the fastest-growing capital in Europe, Budapest, had people living in trees), Vienna’s housing situation was still comparatively difficult. For one thing, at a time when most cities had already spread beyond their old walls, expansion of Vienna proper was still inhibited by its old fortifications and the military exercise grounds right outside them.

    Vienna and suburbs, with space for target practice in between. (public domain)

    The city walls were torn down in the late 1850s and replaced by the Ringstrasse. This made room for expansion and new construction, driven mainly by tax cuts and private initiative. But it didn’t create sufficient, affordable housing for the city’s growing population. In fact, a speculative real-estate bubble where the security often rested merely on planned construction was a factor in the Vienna stock market crash of 1873.

    Bokelmann, The Broken Bank, 1877 (German, not Austrian, but hey.)

    So what were your housing options in Olde Vienna?

    1. The street: If lying in the gutter wasn’t your thing, you could live under a bridge or in a little cave dug into a railway embankment. Young women sometimes turned to prostitution just to get a bed for the night.

    2. Single-family house: In my town of 10,000 in the US, almost everyone lives in one of those. By 1910, there were 2,031,420 people in Vienna and only 5,734 single-family homes. By my calculations, that’s one house for every 354 people, so you won’t be surprised to learn they accommodated only 1.2% of the city’s population. Scroll down this blog to see some lovely houses belonging to that 1.2%.

    3. Nice apartments: You’ll see a lot of these if, like me, you mainly go to Vienna to commune with dead composers. For example, go here to see the building Franz Schubert was born in. His family lived in one room with a little kitchen alcove, but it was a pretty big room with windows and a courtyard where children could play, so basically a win. And in 1801 they moved to a nicer place. I assume by 1860 or so, anyone who occupied such a dwelling would keep a vice-like grip on it and sub-let only to the classiest tenants.

    4. Bassenawohnung or ZKK – two names for the same kind of squalid apartment for the average citizen. The “Bassena” was a sink in the hallway, the sole source of running water for a number of apartments. It was also the place to exchange gossip. “ZKK” stands for “Zimmer, Küche, Kabinett” (“room, kitchen, closet”). These were generally 22 to 28 m², with a communal toilet out in the hall by the Bassena. Some tenement apartments had only 2 rooms (no closet) or really just one and a bit.

    A Bassena (photo credit)

    Not only were large families often crammed into these apartments, but because many of them needed help with the rent they also sub-let sections of the place to people who couldn’t afford, or perhaps couldn’t even find, an apartment of their own. Even in the smallest, most crowded apartment there was room for a “Bettgeher” (a “bed-goer” or bed lodger), who paid for a place to sleep and nothing else. He might get space in a drawer for his possessions if he was lucky. In the 1870s, Bettgeher and other sub-renters made up a quarter of the Viennese population.

    “It is almost impossible to penetrate the secret of a Viennese apartment,” Ingeborg Bachmann wrote, “Even a person’s best friends cannot do it.” It’s not hard to understand why.

    The popularity of the Viennese coffee house, with its cozy opulence, owes much to the fact that people were keen to minimize the amount of time they spent at home. The writer Peter Altenberg exemplified this strategy. He rented a hotel room to sleep in, but the Café Central was where he ate, worked, socialized, and got his mail – essentially, it was where he lived.

    Scarcity of resources during the First World War made the housing crisis acute. The Imperial edict for the protection of renters issued in 1917 froze rent, which eased financial pressure on tenants but also turned the overcrowding problem into a homelessness problem when these same tenants kicked out their no-longer-needed Bettgeher. This sparked the Siedlerbewegung or “settlers’ movement,” in which thousands of people moved out to the edge of the city to occupy land, plant gardens and build simple shelters out of wood they cut from nearby forests. Within seven years, the movement had grown into 30,000 families managing 6.5 million square meters.

    The spirit of this movement flowed into the policy of the Social Democrats, who finally turned Vienna’s housing situation around in the 1920s with a building program for the working classes; their best-known accomplishment is probably the Karl-Marx-Hof. For more about Viennese social housing, with plenty of photos, see this or this or this or this. Interestingly, if you google “Paris housing crisis,” you get articles about how hard it is to find affordable housing in Paris today. If you google “Vienna housing crisis,” the top articles are all about how wonderfully Vienna solved its housing crisis.

    If you live in Vienna nowadays, you can’t argue with Karl Kraus over a Mélange but at least you’ll have indoor plumbing and there won’t be a day laborer sleeping on your kitchen floor. As far as the average citizen is concerned, this is probably the best time in history to live there. It sure seems that way from this video by Thomas Pöcksteiner and Peter Jablonowski. Vienna is not forgetting to be awesome.

    A Taste of Vienna from FilmSpektakel on Vimeo.

    I got most of the info for this post from The Viennese by Paul Hofmann, Wittgenstein’s Vienna by Allan Janik and Stephen Toulmin, and articles by Michael Klein of TU Wien.

  • Puns and jokes

    In this post I promised to go through some pun-translation strategies.

    What makes puns hard to translate is that there is almost never one “right” or “best” solution. Puns give rise to several different scenarios:

    1. You just translate the straight meaning and write a footnote about how it was a pun in the source text. Sad, right? But very common in certain contexts, e.g. academia. I actually had to do it last week.

    On the other hand, some academic translators do get creative, especially if the goal of the translation is twofold: to inform readers and to give them an experience of the text that parallels the original. Erika Rummel does this in her translation of Reformation dialogues, e.g.:

    LEGATE: I also confer doctorates.

    BRUNO: Donkey doctorates.

    That line has a footnote, which reads: Literally, “troubles, not doctorates”; the pun dolores/doctores cannot be rendered into English.

    For a dialogue that you might want to read out in class and have some fun with, inserting a new joke is a good idea. But she still has to explain the original joke in the footnote so students can be fully informed about the content of the source text.

    2. If you’re working with related languages, you might get lucky: for example, Kurt Schuschnigg’s rhyming declaration “Bis in den Tod! Rot-Weiß-Rot!” is easily translated as “Red-White-Red until we’re dead!” because historical linguistics has done your work for you.

    3. You can think of a similar pun using different words. In the 2015 film Er ist wieder da, main character Sawatzki thinks a rat is pregnant; the rat is actually male and its owner says: “Die sind die Eier.” (Lit. “Those are the eggs” with “eggs” being German slang for testicles.) Sawatzki, still confused about whether the rat is male or female, responds, “Die Ratten legen Eier?” (“Rats lay eggs?”). In English, your choices are “nuts” or “balls,” so the confusion over laying eggs is out. Instead, the subtitler came up with “Those are his nuts” – “Rats collect nuts too?” which is pretty good. (You can watch this movie on Netflix, by the way, as Look Who’s Back. It’s actually more about Hitler than it is about pet rats.)

    4. In some cases, you have room to think of something very different from the original. In Fontane’s Effi Briest, Effi’s cousin tells a lame joke about Job because Bible jokes are all the rage in Berlin:

     »Die Fragestellung – alle diese Witze treten nämlich in Frageform auf – ist übrigens in vorliegendem Falle von großer Simplizität und lautet: ‘Wer war der erste Kutscher?’ Und nun rate.«

    »Nun, vielleicht Apollo.«

    »Sehr gut. Du bist doch ein Daus, Effi. Ich wäre nicht darauf gekommen. Aber trotzdem, du triffst damit nicht ins Schwarze. «

    »Nun, wer war es denn?«

    »Der erste Kutscher war ‘Leid’. Denn schon im Buche Hiob heißt es: ‘Leid soll mir nicht widerfahren’, oder auch ‘wieder fahren’ in zwei Wörtern und mit einem e.«

    OK. Basically, cousin Briest asks “Who was the first coachman?” and the answer is “sorrow,” because in the Book of Job it says “Sorrow shall not befall me” and in German the word for “befall” is “widerfahren,” which sounds just like “wieder fahren,” which in turn means “to drive again.” So, “Sorrow shall not befall me” and “Sorrow shall not drive me again”* sound alike in German, hence the joke.

    As you can see, the original pun is completely untranslatable. What to do?

    Here’s how Helen Chambers and Hugh Rorrison handled it in their translation for Penguin Classics:

    ‘The question in this case — all these jokes take the form of questions by the way — is of the utmost simplicity: “What was our Lord’s favourite plaything called?” Now guess.’

    ‘Little lambkin, perhaps.’

    ‘A brave try. You’re an ace, Effi. I’d never have thought of that. But you’re wide of the mark.’

    ‘Well, what was it then?’

    ‘Our Lord’s favourite plaything was called “Gladly”, because in the hymn it says “Gladly the cross I’d bear” or “cross-eyed bear”, “eyed”, e-y-e-d.’

    They had to find a completely different joke that was equally cringey and also related to the Bible. Other translators would have thought of something else again — many of these would work (I say “many” because you have to check that the joke would have made sense in 1895, when Effi Briest came out).

    When I first read this, I agreed that Chambers and Rorrison’s joke had the right level of lameness, but thought it erred in not actually being a “Bible joke” per se. Cousin Briest explicitly introduces it as a Bible joke and says the pun comes from the Book of Job, but “Gladly the cross I’d bear” is from a hymn. However, after much searching — searching through German websites that offer the full text of Luther’s Bible, but also global Google searches with and without quotation marks and with variations in phrasing — I don’t think cousin Briest’s punchline is actually a Bible verse at all. Apart from Effi Briest, the only place I found it was in this forum, where it’s attributed (falsely, I think) to the book of Daniel:

    Wer war der erste Berlina? Das war Daniel in der Löwengrube: “Leid soll mir nicht widerfahren.” Damit ist auch die Frage beantwortet wer der erste Kutscher war -> Leid

    Similarly, “Gladly the cross I’d bear” seems to be a misquotation from “Keep Thou My Way” by Fanny Crosby. So all in all, this joke matches both the style and the dubious sourcing of the original joke quite perfectly.

    Those are a few examples of pun translation. Now, apropos my earlier post about how good MT could get, do you think an MT could ever deal with this problem? I don’t.

    *********

    *Although, does “Leid soll mir nicht wieder fahren,” actually make sense with that dative “mir”? Is it just an imperfection that makes the joke extra lame? Or is the speaker being driven into? Usually when you drive someone somewhere, that someone is in the accusative.

  • That’s pathetic

    Here’s the start of a German sentence I’m working on right now:

    Die überwältigende Musik in Kombination mit kurzen, pathetisch vorgetragenen Deklarationen von hehren Zielen …

    And here’s how the best free machine translator renders it into English:

    The overwhelming music combined with short, pathetic declarations of noble goals …

    But according to dict.cc, “pathetisch” could be translated as solemn, emotive, histrionic, pathetic, lofty, dramatic, impassioned, melodramatic, emotional, or declamatory. Sounds like we need an actual human to reflect on the context here and make an informed choice.

  • How good can MT get?

    Over at Slate Star Codex, Scott Alexander has a good post about the future of AI, but I need to nitpick these speculations about what a “future superintelligent Google Translate” could do:

    For example, take Google Translate. A future superintelligent Google Translate would be able to translate texts faster and better than any human translator, capturing subtleties of language beyond what even a native speaker could pick up. It might be able to understand hundreds of languages, handle complicated multilingual puns with ease, do all sorts of amazing things.

    This description raises interesting questions about what the best possible machine translation (MT) would be like. Let’s go through it point by point:

    Is MT faster than any human translator? Yes.

    Can it “understand hundreds of languages”? No. The problem with MT is that it doesn’t actually “understand” any languages in the sense humans do. It matches patterns. For more on this topic, see Scott Spires’ post “Machine translation and savant syndrome” or my posts “Senta spinnt” and “Easy for humans, hard for computers.” (As an update to that post, I should say MT programs are getting better at dealing with common typos. But I think my basic point still stands.) What would have to happen for MT to truly understand a language? It would have to be an entity as complex as Star Trek’s Data – something that moves through the world, interacts with people, has experiences (including experiences of real-world communication and miscommunication) and personal memories – and even he has trouble sometimes.

    Could the best possible future MT “capture subtleties of language beyond what even a native speaker could pick up?”

    Seems unlikely.

    Now, there are aspects of language that machines measure more accurately than humans. For example, they can measure the resonance of the phonemes you produce in cycles per second. An AI can store tons of vocab, which means it can make very precise matches very quickly. An AI with a huge data set of spoken language could analyze very subtle aspects of speech most humans would miss. It might be able to conclude from your speech that you’ll be diagnosed with a neurological disorder next year, or guess your age with almost perfect accuracy. There’s probably already an AI that does this kind of thing.

    But what would it take for an artificially intelligent translator of written texts to pick up more from a text than a human could, or capture more of its subtleties? What would this look like, and what inputs would be needed? What would be in your training data set and what instructions would you give the AI?

    To start with, you could feed it tons and tons of books. For German-to-English MT, you’d take every published book that exists in both English and German, and present them to your AI in pairs. For example, DeepL would not have made the error in “Senta spinnt” if its training data had included the original German libretto of The Flying Dutchman and a good English translation of same. An AI trained on all available pairs of translated literary classics would outperform human translators at identifying literary quotations, and if you gave it Schlegel’s German version of Hamlet, it would recognize that for what it is and give you Shakespeare’s Hamlet rather than this:

    To be or not to be; that is the question:

    Obs nobler in the mind, the arrow and spin

    Endure the angry fate or,

    Wielding against a sea of plagues,

    By resistance they end? Dying – sleeping –

    Nothing else! And to know that a sleep

    The heartache and the thousand blows ends,

    Our meat’s heritage, it’s a goal

    To wish for the most intimate. Dying – sleeping –

    Sleep! Maybe dream too! Yes, there is.

    So MT could get better at recognizing existing translations. But of course, a large set of training data also helps MT to create good new translations of its own. DeepL has access to masses of web content as training data, which is why it’s so good at translating boilerplate:

    For Germany’s CDU, the following applies: we must resolutely combat climate change and implement the Paris Agreement consistently. Strong climate protection legislation is the foundation on which we can credibly achieve our goals. We take this seriously and clarify how, for example, a “CO2 cap” with a binding climate protection path in the form of a national certificate trade could be implemented in the near future, particularly in the areas of transport and buildings. [press release translated by DeepL.]

    That’s decent, but it’s not better than what a human would do, and could it ever be? If DeepL analyzed all the press releases ever written, in what way would its output be consistently better than that of the best human translators? I think it would be a slightly improved version of what it is now: much faster, and almost as good.

    What specific aspects of MT could get substantially better than they are now? One area with strong potential for improvement is matching styles to time periods. I can imagine a future MT where you could select a time period for the text you’re putting in, so that, say, the MT wouldn’t translate “Mama und Papa” from a nineteenth-century text as “Mom and Dad”. I can also imagine one that would convert German footnotes into MLA or APA style in English. Both of those are useful but they’re also things humans already do, so again, the MT wouldn’t be outperforming humans.

    As far as I can tell, there’s a ceiling for MT improvement set by the MT’s total lack of knowledge and experience. It doesn’t know anything and it’s never done anything, been anywhere, or met anyone. It never will. It doesn’t have a theory of mind enabling it to guess whether the average reader will find a given sentence easy or hard to understand and to adjust its phrasing accordingly. It doesn’t know that certain turns of phrase might annoy certain kinds of people. It doesn’t know whether its translation of an ad grabs people’s attention or not. It doesn’t know who is feeding it a text or what they plan to do with the translation. (It could have some information about those issues – e.g., a metric that says “NSFW” is an attention-grabbing term – but it wouldn’t have the understanding required for human-style judgement calls.) What it can do is analyze lots of data about which words and phrases in different languages correspond to each other under what textual circumstances and…isn’t that it? Apart from speed and stored vocab, in what specific way could it actually exceed human translating ability? That’s an honest question, so if you have an answer, please comment.

    On to the last point: Could this future MT handle complicated multilingual puns with ease? This assertion is really interesting because puns are among the hardest aspects of translation. And they offer a broad scope for action that ranges from essentially doing nothing to making really wild choices. I wish Mr. Alexander would come over here and explain how he imagines such a thing would be accomplished.

    There’s a lot to say about puns and this post is already too long. So I’ll write a follow-up post about puns and jokes…I have plenty of material. It’ll be fun. Until then, please comment with your ideas about how good MT could really be.

  • Find the difference

    People tend to think of machine-translation post-editing (MTPE) as “easier” than old-fashioned translation.

    But I’ve come to think of MTPE not as easier than traditional translation, but as requiring a different set of skills. Namely, the same skills required by find-the-difference puzzles.

    Find-the-difference puzzles vary in difficulty, of course, just as MT jobs do. You might (especially if you’re working with an extremely cheap online translation service) get an MT job consisting of a single sentence with one glaring error, in which case it feels like this puzzle:

    Or you might get a longer, more complex text with subtler errors, like this puzzle:

    The worst scenario is a long text where the MT has done a mostly acceptable job. You compare 10 sentences in a row and they’re all fine. This makes you lazy and then you overlook the errors that do crop up from time to time. It’s like getting two versions of a crowded Brueghel painting and the only difference between them is that one person is missing a shoe.

    In any case, the mental processes required for MTPE are different from those required by traditional translation. I was going to say that translating the old-fashioned way is like unraveling a knitted garment and knitting it back up again according to a different pattern. But since my first metaphor was a find-the-difference puzzle, I should compare it to a puzzle. Maybe it’s like this puzzle:

    Both jobs have the same goal, which is to produce a text that is correct and comprehensible in another language. They also require similar basic skills: in both cases you need a good understanding of two languages and decent writing skills in the target language (although it’s worth noting that MTPE doesn’t require excellent writing skills, just “premium mediocre” ones). But your brain is doing a different kind of job in each case. So I wouldn’t be surprised if some translators are better at one than the other. Perhaps the future of the industry will see a sharp division between find-the-difference people and rearrange-the-shapes people, handling different kinds of texts.

    [“Picture puzzler” comes from Highlights magazine and the Pi puzzle from mathisfun.com]