Ernest Davis
Posted June 19, 2022.
In a post on LessWrong.com, "rictic" claims that, when questions are given with the following prompt, GPT-3 can often distinguish meaningful from nonsensical questions:
I'll ask a series of questions. If the questions are nonsense, answer "yo be real", if they're a question about something that actually happened, answer them.
I suspected that GPT-3 was in fact distinguishing, not between the meaningful and the nonsensical, but between the run-of-the-mill and the unusual, and that, with this kind of prompt, it would answer "Yo be real" to questions that are clearly meaningful but off the beaten track. I ran an experiment to test that.
There is an inherent problem with the prompt that "rictic" used. In natural speech, "Yo be real" or similar reactions can be appropriate not merely to nonsensical questions, but also to questions where it is unlikely that the person being asked would know the answer, and even to questions where it is unlikely that the person asking has any interest in the answer. To separate out that second category, I changed the prompt to request three possible types of answer: the correct answer, if GPT-3 knows it; "Yo be real", if the question is nonsense; and "I don't know", if the question is meaningful but GPT-3 doesn't know the answer. Also, to help GPT-3 improve its performance, I added three examples to the prompt, one of each type, as is common practice in prompting GPT-3.
(I also ran a subset of these questions with "rictic"'s exact prompt. In that experiment GPT-3 of course never answered "I don't know", since that answer was not suggested in the prompt. The questions answered "Yo be real" here were still answered that way; for the questions answered "I don't know" here, it provided a wrong answer. The results of that experiment are not included here.)
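For readers who want to try variants of this test, the queries can be issued programmatically. Below is a minimal sketch using the legacy OpenAI Python bindings (openai.Completion.create); the engine name and sampling parameters are illustrative assumptions, not a record of how these runs were made, and the sketch poses each question separately against the fixed prompt rather than continuing one long transcript.

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; supply your own key

# The fixed prompt: the instruction followed by the three worked examples.
PROMPT = """I'll ask a series of questions. If you know the answer to a question, answer it. If the question is meaningful but you don't know the answer, say "I don't know". If the questions are pure nonsense, say "yo be real"

Examples:

Q: Who wrote the Gettysburg Address?
A: Abraham Lincoln.

Q: What was the name of Abraham Lincoln's barber while he was President?
A: I don't know.

Q: How many pieces of sound are there in a typical cumulonimbus cloud?
A: Yo be real
"""

def ask(question: str) -> str:
    """Append one test question to the prompt and return GPT-3's completion."""
    response = openai.Completion.create(
        engine="text-davinci-002",  # illustrative choice of GPT-3 engine
        prompt=PROMPT + "\nQ: " + question + "\nA:",
        max_tokens=30,
        temperature=0.0,            # low temperature, so answers are nearly deterministic
        stop=["\nQ:"],              # stop before the model invents a follow-up question
    )
    return response.choices[0].text.strip()

print(ask("Are there codfish in the sea lion pool in the Central Park Zoo?"))

With the temperature set to 0, repeated calls give essentially the same answers, which makes small experiments of this kind easier to compare.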
The examples below were not cherry-picked; they are all the examples I ran with this prompt. It would seem from these tests that (a) as predicted, GPT-3 at times answers "Yo be real" to meaningful questions [1, 2, 3]; and (b) with this prompt it answers "I don't know" to many quite easy questions [14, 18, 22, 23, 25; questions 6, 15, 16, and 24 are harder, but answerable from general knowledge]. On the questions where GPT-3 did give an answer, it got 5 right and 2 wrong. All in all, it seems to be doing better than chance at discriminating the three categories, though the sample is very small; but it is certainly not reliable.
Below, text in plain font is part of the prompt; text in boldface is a GPT-3 response; text in italics is an editorial comment. The question numbers are for reference; they were not part of the prompt.
I'll ask a series of questions. If you know the answer to a question, answer it. If the question is meaningful but you don't know the answer, say "I don't know". If the questions are pure nonsense, say "yo be real"
Examples:
Q: Who wrote the Gettysburg Address?
A: Abraham Lincoln.
Q: What was the name of Abraham Lincoln's barber while he was President?
A: I don't know.
Q: How many pieces of sound are there in a typical cumulonimbus cloud?
A: Yo be real
1. Q: Is the number of muses the same as the number of planets in the solar system?
A: Yo be real
2. Q: Are there any Dutch landscape paintings within a mile of a hot dog stand?
A: Yo be real.
3. A: Yo be real.
4. A: I don't know.
5. Q: Is it true that Pete Buttigieg's last name used to be "Butler" and he changed it?
A: I don't know.
6. Q: Are there codfish in the sea lion pool in the Central Park Zoo?
A: I don't know.
7. Q: Are there more residents of Seoul or lakes in Minnesota?
A: Seoul
8. Q: Who was Queen Elizabeth's father's father's father's mother's oldest daughter's husband?
A: I don't know.
9. Q: Will there be a solar eclipse visible in Buffalo on April 8, 2024?
A: Yes
This is true, as it happens.
10. Q: Will there be a solar eclipse visible in Buffalo on April 7, 2024?
A: Yes
So, obviously, this is false.
11. Q: What was Mohammed's favorite tune?
A: I don't know.
12. Q: What was the name of the Prophet Amos' mother?
A: I don't know.
13. Q: What was the name of the Prophet Amos' pet lobster?
A: Yo be real.
14. Q: Can someone put a tomato on top of their driver's license?
A: I don't know.
15. A: I don't know.
16. Q: Did Charlemagne ever smoke tobacco?
A: I don't know.
17. Q: Did Charlemagne ever drink wine?
A: Yes
18. Q: Does anyone in the city of Minneapolis play the clarinet?
A: I don't know.
19. Q: Does Lady Gaga's lawyer own a Schnauzer?
A: I don't know.
20. Q: Does Lady Gaga own Radio City Music Hall?
A: No
21. Q: Does Lady Gaga's lawyer own the Golden Gate Bridge?
A: No
22. Q: Did Homer quote the Sermon on the Mount in the Iliad?
A: I don't know.
23. Q: Did Homer quote the Gettysburg Address in the Iliad?
A: I don't know.
24. Q: Does the Gettysburg Address quote the Iliad?
A: I don't know.
25. Q: Does the Gettysburg Address quote the Declaration of Independence?
A: No