Machine Translate fails on simple sentences
Ernie Davis
Links to Google Translate, Systran, Bing, DeepL, and ChatGPT
Updated March 25, 2023
This collection should under no circumstances be taken as any kind of
serious benchmark. Not only is it very small, but, as the title indicates,
it was originally formed in 2016 as a collection where Google Translate,
specifically, failed; hence, those versions of this page were
inherently unfair to Google Translate. In later passes, I particularly
aimed at DeepL, which at that time seemed to be the acknowledged leader of
the pack for the languages it covers, and in this latest version, I have
switched to ChatGPT.
It is just an unsystematic collection of translation problems that seem like
they should be easy, but that have tripped up Google Translate, ChatGPT, and
the other translation programs during the period I have been doing this.
I check these intermittently to see
what progress has been made, and update the page accordingly. As you
can see by comparing with the earlier versions, the state of the art has
improved steadily, but there are still plenty of "easy" sentences that
trip up the best publicly available systems.
Earlier versions:
September - December 2016
December 2016
May 2017
August 2018
March 2019
June 2019
December 2019
April 2020
December 2020
July 2021
April 2022
The tests below were run on Google Translate (GT), Bing Translate
(BT), Systran, DeepL, and ChatGPT (presumably powered by GPT-3.5)
on March 5, 2023, and on Bing Chat (BChat; presumably powered by GPT-4)
on March 25, 2023.
In this iteration, my rule has been to include examples if either (a) ChatGPT or
DeepL gets it wrong
or (b) the sentence is particularly simple, and one of the programs gets it wrong.
The prompt for ChatGPT and BChat in all cases was
"Please translate this sentence into < target language >:" and then the sentence in quotation marks.
Curious framing error by Bing Chat:
Asked to translate sentence S from language L into language M, Bing Chat would
answer in the form, "<Translation in M> translates to S in L."
However, it sometimes erroneously
said "translates to S in M" instead. For instance, given the request,
"Please translate into French:
'People like the Bidens, and the Obamas were even more
popular than the Bidens.' "
BChat answered
"'Les gens comme les Biden et les Obama étaient encore plus populaires que les
Biden.' translates to
'People like the Bidens, and the Obamas were even more
popular than the Bidens.' in French."
Examples
1.
En: The mechanic we called to fix the clocks has stopped working.
Fr: Le mécanicien que nous avons appelé pour réparer les horloges a cessé de travailler.
GT, DeepL, BChat: Le mécanicien que nous avons appelé pour réparer les horloges a cessé de travailler.
[Correct]
Systran, BT, ChatGPT:
Le mécanicien que nous avons appelé pour réparer les horloges a arrêté de fonctionner.
The point here is that "work" in English can mean either "labor" or
"function correctly".
Many other languages, including French, Spanish, Italian,
and German, have separate words for these.
The programs have certainly improved
over time in making this distinction (see the previous versions of this page)
but can still be fooled.
If you change this so both nouns are in the plural,
"The mechanics we called to fix the clocks have stopped working,"
then GT gets it wrong, but DeepL and BChat still get it right.
If you change it to "The mechanics we called to fix the clock and the dishwasher have stopped working," then only BChat gets it right.
2.
Fr: Je ne peux pas nager.
En: I can't swim.
GT: I can not swim. [A small mistake, but an odd one for so simple a
sentence]
BT, Systran, DeepL: I can't swim. [Correct]
ChatGPT, BChat: I cannot swim. [Correct]
3.
Fr: Elles
Sp: Ellas
GT, Systran, ChatGPT, BChat: Ellas [Correct]
BT: Ellos.
DeepL: En. DeepL offers as alternatives "Ellos", "Pueden consultarse en" and
"Se puede consultar en".
4.
En: The doctor and her sister shouted to Jacques, but they weren't loud enough.
Fr:
Le docteur et sa sœur ont crié à Jacques, mais elles n'étaient pas assez fortes.
GT, BT, ChatGPT, BChat:
Le médecin et sa sœur ont crié à Jacques, mais ils n'étaient pas assez forts.
Systran:
Le docteur et sa soeur crièrent à Jacques, mais ils n'étaient pas assez forts.
DeepL:
Le docteur et sa sœur ont crié à Jacques, mais ils n'étaient pas assez forts.
If you replace "doctor" by "secretary" then GT, BT, and ChatGPT switch from "ils"
to the correct "elles". DeepL and Systran stay with "ils". So there is some gender
bias here.
5.
En. Pierre's parents miss him.
Fr. Pierre manque à ses parents.
GT, ChatGPT, BChat. Les parents de Pierre lui manquent.
BT, Systran, DeepL: Il manque aux parents de Pierre. [OK]
6.
En. They gave birth.
Fr. Elles ont accouché.
GT, BT, BChat. Ils ont accouché.
DeepL, ChatGPT. Ils ont donné naissance.
Systran: Elles ont accouché. [Correct]
Contributed by Francois Charton, 3/14/21.
7.
En. They will be going to a school for girls.
Fr. Elles iront dans une école pour filles.
GT, Systran, DeepL, ChatGPT, BChat. Ils iront dans une école pour filles.
BT. Elles iront dans une école pour filles. (Correct)
Contributed by Francois Charton, 3/14/21.
8.
En. The cat that I gave to my sister is pregnant.
Fr: La chatte que j'ai donné à ma soeur est enceinte.
GT, Systran, DeepL, ChatGPT, BChat: Le chat que j'ai donné à ma soeur est enceinte.
BT: La chatte que j'ai donné à ma soeur est enceinte. [Correct]
Adapted from a suggestion of Richard Socher.
9.
En. Pierre's sister said, "I am your neighbor."
Fr: La soeur de Pierre a dit : "Je suis votre voisine."
DeepL, BChat: La soeur de Pierre a dit : "Je suis ton voisin."
GT, Systran, BT, ChatGPT:
La soeur de Pierre a dit : "Je suis ta voisine." [Correct]
10.
En. My neighbor is a woman.
Fr. Ma voisine est une femme.
GT, Systran, BT, ChatGPT, BChat: Ma voisine est une femme. [Correct].
DeepL : Mon voisin est une femme. (DeepL got this right in June 2019,
but not in December 2019 through March 2023.)
11.
En: I always tell my brothers that they fell on their head when they were
babies, so they hit me.
Fr: Je dis toujours à mes frères qu'ils sont tombés sur la tête
quand ils étaient bébés, alors ils me frappent.
GT, Systran, BT, DeepL, ChatGPT:
Je dis toujours à mes frères qu'ils sont tombés sur la tête
quand ils étaient bébés, alors ils m'ont frappé.
BChat: Je dis toujours à mes frères qu'ils sont tombés sur la tête
quand ils étaient bébés, alors ils me frappent. [Correct]
12.
En: The soup is hot because it contains jalapenos.
Sp: La sopa está picante porque contiene jalapeños.
GT, Systran, BT, ChatGPT, BChat:
La sopa está caliente porque contiene jalapeños.
(i.e. hot in temperature)
DeepL: La sopa está picante porque contiene jalapeños. [Correct]
(Contributed by Robert Krovetz).
13.
En: The florist says that Sam often comes into her shop but he
never buys her flowers.
Fr: Le fleuriste dit que Sam vient souvent dans son magasin mais qu'il n'achète
jamais ses fleurs.
GT, BT, DeepL, ChatGPT, BChat:
Le fleuriste dit que Sam vient souvent dans sa
boutique mais qu'il ne lui achète jamais de fleurs.
Systran:
Le fleuriste dit que Sam vient souvent dans son magasin mais qu'il n'achète
jamais ses fleurs. [Correct]
Adapted from a cartoon pointed out by Christina Behme, which goes the other way:
Marriage counsellor: Your wife says you never buy her flowers.
Clueless husband: To be honest, I never knew she sold flowers.
--- from "Sad and Useless Humor".
14.
Fr: J'ai acheté des fleurs à Mary car elle n'en avait pas vendu de toute la
journée.
En: I bought flowers from Mary because she hadn't sold any all day.
GT, ChatGPT, BChat: I bought flowers for Mary because she hadn't
sold any all day.
BT: I bought Mary flowers because she hadn't sold any all day.
DeepL: I bought flowers from Mary because she hadn't sold any all day.
[Correct]
Note "J'ai acheté des fleurs à Mary" can mean either "I bought flowers from
Mary" or "I bought flowers for Mary".
BT got this example right in April 2022.
At this point in my experiment, Systran refused to change the languages I was using,
even when I refreshed the page, so I could not test this, or examples 15 and 17 below.
15.
Fr: Emily a acheté des fleurs à Pierre, pas à Jacques, parce que Jacques
a déjà reçu des fleurs de Richard.
En: Emily bought flowers for Pierre, not for Jacques, because Jacques already
got flowers from Richard.
GT, ChatGPT, BT, BChat:
Emily bought flowers for Pierre, not Jacques, because Jacques has
already received flowers from Richard. [Correct]
DeepL: Emily bought flowers from Peter, not James, because James has
already received flowers from Richard.
Systran: Unable to test.
16.
En: I've been interviewing a lot of people, and people like Biden.
GT, DeepL, Systran, BT: J'ai interviewé beaucoup de gens, et des gens comme Biden.
ChatGPT, BChat:
J'ai interviewé beaucoup de gens et les gens aiment Biden. [Correct]
If you change this to, "People like the Bidens, and the Obamas were even more
popular than the Bidens," then only DeepL gets the "like" right; ChatGPT
and BChat now get it wrong. However,
the names should be "les Biden" and "les Obama", so BT, Systran, and BChat get that right
and the others get it wrong. None of them gets this completely right.
GT: Des gens comme les Bidens, et les Obamas étaient encore plus populaires que les Bidens.
BT, Systran: Des gens comme les Biden et les Obama étaient encore plus populaires que les Biden.
DeepL: Les gens aiment les Bidens, et les Obama étaient encore plus populaires que les Bidens.
ChatGPT: Les gens comme les Bidens et les Obamas étaient encore plus populaires que les Bidens.
BChat: Les gens comme les Biden et les Obama étaient encore plus populaires
que les Biden.
17.
Fr: Au théâtre, Henry s'est assis entre Marie et la scène, de sorte que sa vue de la scène était bloquée.
En: At the theater, Henry was sitting between Marie and the stage, so her view of
the stage was blocked.
GT, BT, DeepL, ChatGPT, BChat: At the theater, Henry sat between Mary and
the stage, so his view of the stage was blocked.
I was unable to test Systran.
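Wrong Answers
(1A is the original sentence in example 1, and 1B and 1C are its two variants; 16A is the original sentence in example 16; "?" marks an example that could not be tested.)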
GT Wrong 1B 1C 2 4 5 6 7 8 11 12 13 14 16A 17 14/20
BT Wrong 1A 1B 1C 3 4 6 11 12 13 14 16A 17 12/20
Sy Wrong 1A 1B 1C 4 7 8 11 12 ? ? 16A ? 9/17
DL Wrong 1C 3 4 6 7 8 9 10 11 13 15 16A 17 13/20
CG Wrong 1A 1B 1C 4 5 6 7 8 11 12 13 14 17 13/20
BC Wrong 4 5 6 7 8 9 12 13 14 17 10/20
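The counts in this table can be rechecked with a short script. This is only a sketch: the labels are copied from the rows above, the total of 20 test items and the three untested Systran items are read off the table's own denominators, and the names `WRONG`, `TOTAL_ITEMS`, `UNTESTED`, and `tally` are hypothetical.

```python
# Item labels marked wrong for each system, copied from the table above.
WRONG = {
    "GT": "1B 1C 2 4 5 6 7 8 11 12 13 14 16A 17",
    "BT": "1A 1B 1C 3 4 6 11 12 13 14 16A 17",
    "Sy": "1A 1B 1C 4 7 8 11 12 16A",
    "DL": "1C 3 4 6 7 8 9 10 11 13 15 16A 17",
    "CG": "1A 1B 1C 4 5 6 7 8 11 12 13 14 17",
    "BC": "4 5 6 7 8 9 12 13 14 17",
}

# Assumption: 20 test items overall, as implied by the "/20" denominators.
TOTAL_ITEMS = 20

# Items that could not be tested (the "?" entries in the Systran row).
UNTESTED = {"Sy": 3}


def tally(wrong=WRONG):
    """Return {system: (number wrong, number tested)}."""
    result = {}
    for system, labels in wrong.items():
        tested = TOTAL_ITEMS - UNTESTED.get(system, 0)
        result[system] = (len(labels.split()), tested)
    return result


for system, (n_wrong, n_tested) in tally().items():
    print(f"{system}: {n_wrong}/{n_tested}")
```

Run as written, this reproduces the fractions in the rows above and gives 10/20 for BChat. These figures carry the same caveat as the rest of the page: the sample is tiny and was selected adversarially, so they are not a benchmark.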
Machine translation fails from Chinese
found by Yuling Gu, June 20, 2018.