Rebooting AI
Building Artificial Intelligence We Can Trust

Gary Marcus and Ernest Davis

Updates

This web page contains updates on the authors' experiments described in Rebooting AI and a list of relevant books and articles that have been published since the text of Rebooting AI was finalized in January 2019.

Authors' experiments

More recent results on the authors' experiments described in the book.

p. 34. Google Image search for "mother and son" and for "professor". As of August 2, 2019:

Of the 100 top images returned for "mother and son", 69 were white, 26 were non-white or mixed race, 5 were indeterminate (line drawing or silhouette).

Of the 100 top images for the query "mother", 85 were white women, 12 were non-white women, 2 were men, and one was a female baby. Now, about 1/3 of these were in fact pictures of Jennifer Lawrence or Michelle Pfeiffer in the horror film mother!, but that's just a different kind of lopsidedness.

Of the 100 top images for the query "professor", 76 are men, and 24 are women. And of the 24 women, 2 are models and 2 are Mary Poppins. (A few of the men are also models.) In the top 50, there are only 4 women, only 1 of whom is an actual professor (2 are models and 1 is Mary Poppins).

p. 70. Questions to "Google Talk to Books"
Where did Harry Potter meet Hermione Granger? As of August 2, 2019: Of the 20 passages: 6 refer to the Harry Potter stories; most of the rest are about unrelated people or characters named "Harry". None are from the actual Harry Potter books. Only one of the 20 passages mentions Hermione Granger. None of the passages returned answers the question, and there is no reason to suppose that the question is answered in any of the books.

Were the Allies justified in continuing to blockade Germany after World War I? As of August 2, 2019: Of the 20 passages returned, 3 correctly deal with the period after World War I. None of the passages returned discusses the issue explicitly; however, one of the books returned certainly discusses the issue close to the passage excerpted, and a second may include a discussion of the issue.

What were the seven Horcruxes in Harry Potter? As of August 2, 2019: Google Talk to Books is now doing comparatively well with this. 2 of the passages returned answer the question, and all but 3 are from books that have to do with Harry Potter. I suspect that the improvement is due to the fact that new books have been published with an explicit list.

Who was the oldest Supreme Court justice in 1980? As of August 2, 2019: All 20 passages have to do with the Supreme Court but, as it happens, none of them even mentions Justice Brennan, who is the correct answer to the question.

Who betrayed his teacher for 30 pieces of silver? As of August 2, 2019: Google Talk to Books has actually deteriorated on this question. In April 2018, six of the answers referred to Judas Iscariot; as of August 2019, only three do.

Who betrayed his teacher for 30 coins? As of August 2, 2019: Only one returned passage refers to Judas.

Who sold out his teacher for 30 coins? As of August 2, 2019: None of the returned passages refers to Judas.

p. 72. Asking the model on the Allen Institute web site questions about the Almanzo story.
How much money was in the pocketbook?
What was in the pocketbook?
Who owns the pocketbook?
Who found the pocketbook?

As of August 10, 2019: The results are only slightly changed from those reported in the book. The first question is answered correctly, "Fifteen hundred dollars."
The second answer is meaningless: "He opened it and hurriedly counted the money."
The third question is answered correctly: "Mr. Thompson".
The fourth question is answered incorrectly: "Mr. Thompson".

p. 79. Asking Google, "What is the capital of Mississippi?" and "How much is 1.36 euros in rupees?"
The book reports that Google gave the correct answer. It continues to give the correct answer as of August 2, 2019.

p. 79. Asking Google "Who is currently on the Supreme Court?"
As of July 19, 2019, Google gives the correct list. Also, if you ask it "Who was on the Supreme Court in YEAR?," and specify a year when some famous case was decided, it gives the correct answer. For instance, if you ask "Who was on the Supreme Court in 1973?" Google answers with the list of judges who decided Roe v. Wade. However, if you ask about some other year, then Google is at sea; if you ask "Who was on the Supreme Court in 1980?" then the highlighted answer is "John Jay" (the first Chief Justice) together with a link to the Wikipedia page that lists everyone ever on the court.

To the question "How many women are currently on the Supreme Court?" Google answers with a snippet starting "4 have been women" i.e. ignoring "currently" and including Sandra Day O'Connor.

p. 79. Asking Google, "When was the first bridge ever built?" As of August 2, 2019, the answer is the same passage about the first iron and steel bridge described in the text.

p. 80. Questions to Alexa:
Is Donald Trump a person? Is an Audi a vehicle? Is an Edsel a vehicle? Can an Audi use gas? Can an Audi drive from New York to California? Is a shark a vehicle?

p. 81. Question to Siri: Hey Siri find a fast food restaurant that's not McDonald's

p. 82. Questions to Wolfram Alpha: What is the weight of a cubic foot of gold? How far is Biloxi, Mississippi from Kolkata? What is the edge length of an icosahedron with an edge length of 2.3 meter? How far is the border of Mexico from San Diego? What is the volume of an icosahedron whose edges are 2.3 meters long? Was Ella Fitzgerald alive in 1960?

As of August 2, 2019, the results are unchanged from those reported in the book: The first three questions are answered correctly, the last three incorrectly.

p. 85. Asking Google Translate to translate Je mange un avocat pour le déjeuner. from French to English.

As of August 2, 2019, Google Translate produces the translation, "I eat a lawyer for lunch." So does the DeepL translation program, which is currently the best of the publicly available machine translation programs (though it handles only 9 languages as compared to Google Translate's 103).

As mentioned in the Endnotes, Ernie Davis maintains a collection of errors by machine translation programs on seemingly simple sentences.

p. 88. Asking Google Translate to translate The electrician whom we called to fix the telephone works on Sundays from English to French.

As discussed in the footnote, as of March 2019, Google Translate translated the sentence "The electrician whom we called to fix the telephone works on Sunday" correctly. However, at that time, if the sentence was placed in quotation marks or in parentheses, or if "electrician" was changed to "engineer", the translation was no longer correct.

As of August 2019: Google Translate translates the sentence "The electrician whom we called to fix the telephone works on Sunday" correctly. The translation is still correct if the sentence is put in quotation marks or if "works" is changed to "does not work." However, if the sentence is put in parentheses, or "electrician" is changed to "engineer", or "does not work" is changed to "didn't work", then Google Translate mistranslates it. DeepL gets all these right translating into French, but it gets the sentences "The electrician who we called to fix the phone does not work any more" and "The plumber who we called to fix the dishwasher does not work any more" wrong.

p. 135. Asking Amazon Rekognition to label the two details from the image of Julia Child's kitchen. As of August 3, 2019, this has significantly improved. Amazon Rekognition now identifies the left-hand image as "Furniture" and "Chair" with 99.9% confidence, as "Tabletop" with 75% confidence and as "Dining Table" with 70.5% confidence. The right-hand image, however, is labelled as "Food", "Sweet", and "Confectionary" with 86.1% confidence and as "Accessory", "Tie", and "Accessories" with 66.9% confidence.

As of August 3, 2019, Amazon Rekognition labels the image below as "Human" with 79.8% confidence.

Recent books

Kate Crawford, Atlas of AI, Yale University Press, 2021.

Melanie Mitchell, Artificial Intelligence: A Guide for Thinking Humans, Basic Books, 2019.

Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control, Viking, 2019.

Janelle Shane, You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It's Making the World a Weirder Place, Voracious Press, 2019.

Recent articles

Particularly pertinent articles that have appeared since the text of the book was finalized.

Ajunwa, Ifeoma. 2019. Beware of Automated Hiring. The New York Times, Oct. 8, 2019.

Albert-Deitch, Cameron. 2019. Tesla's 'Smart Summon' Feature is Causing Problems in Parking Lots, and Regulators are Paying Attention. Inc. October 4, 2019.

Aschwanden, Christie. 2020. Artificial Intelligence Makes Bad Medicine Even Worse. Wired, January 10, 2020.

Bandler, James, Patricia Callahan, Doris Burke, Ken Bensinger, and Caroline O’Donovan. 2019. Inside Documents Show How Amazon Chose Speed Over Safety in Building Its Delivery Network. ProPublica and BuzzFeed News, December 23, 2019.

Barber, Gregory. 2019. Artificial Intelligence Confronts a 'Reproducibility' Crisis. WIRED, September 16, 2019.

Bender, Emily and Alexander Koller. 2020. Climbing toward NLU: On Meaning, Form, and Understanding in the Age of Data, ACL-2020.

Boudette, Neal E. 2019. Despite High Hopes, Self-Driving Cars Are 'Way in the Future.' The New York Times, July 17, 2019.

Cai, Fangyu. 2019. Adversarial Patch on Hat Fools SOTA Facial Recognition. Synced, August 29, 2019.

Chen, Angela and Karen Hao. 2020. Emotion AI researchers say overblown claims give their work a bad name. Technology Review, February 14, 2020.

Church, Kenneth, Annika Schoene, John Ortega, Raman Chandrasekar, and Valia Kordoni. 2022. Emerging trends: Unfair, biased, addictive, dangerous, deadly, and insanely profitable. Natural Language Engineering, 1-26.

Dacrema, Maurizio Ferrari, Paolo Cremonesi, and Dieter Jannach. 2019. Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches. arXiv preprint, arXiv 1907.06902v1.

Dale, Robert. 2020. GPT-3: What's it good for. Towards Data Science, Dec. 8, 2020.

Davies, Alex. 2019. GM's Cruise Rolls back its target for self-driving cars. Wired, July 24, 2019.

Evans, Will. 2019. Behind the Smiles: Amazon's internal injury records expose the true toll of its relentless drive for speed. Reveal News, November 25, 2019.

Evans, Will. 2020. How Amazon hid its safety crisis. Reveal News, September 29, 2020.

Feathers, Todd. 2019. Flawed Algorithms are Grading Millions of Students' Essays. Vice.com, August 20, 2019.

Fink, Sheri. 2019. This High-Tech Solution to Disaster Response May Be Too Good to Be True. The New York Times, August 9, 2019.

Fisher, Christine. 2019. Google reportedly disbands review panel monitoring DeepMind Health AI. engadget, April 15, 2019.

Fisher, Max and Amanda Taub. 2019. How YouTube Radicalized Brazil. The New York Times, August 11, 2019.

Flaherty, Katie. 2019. A RoboCop, a park, and a fight: How expectations about robots are clashing with reality. NBC News, Sept. 27, 2019.

Gandhi, Kanishk, and Brenden Lake. 2019. Mutual exclusivity as a challenge for neural networks. arXiv preprint, arXiv 1906.10197.

Ghaffary, Shirin. 2019. Robots aren't taking warehouse employees' jobs, they're making their work harder. Vox, October 22, 2019.

Ghosh, R. 2019. Robots out, humans in: Boeing assigns 777 assembly job back to humans after robotics failure. International Business Times, November 15, 2019.

Giles, Martin. 2019. Triton is the world's most murderous malware, and it's spreading. Technology Review, March 5, 2019.

Gross, Judah Ari. 2019. Moscow blamed for disruption of GPS systems at Ben Gurion Airport. Times of Israel, June 27, 2019.

Harwell, Drew. 2019. Google’s search tool falsely called Mueller report ‘fiction’. Washington Post, June 10, 2019.

Heaven, Douglas. 2019. Why deep-learning AIs are so easy to fool. Nature, October 9, 2019.

Heaven, Will Douglas. 2021. Hundreds of AI tools have been built to catch covid. None of them helped. MIT Technology Review, July 30, 2021.

Heinzerling, Benjamin. 2019. NLP's Clever Hans Moment has Arrived: A review of Timothy Niven and Hung-Yu Kao, 2019: Probing Neural Network Comprehension of Natural Language Arguments. Heinzerling blog, July 21, 2019.

Hendrycks, Dan, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song. 2019. Natural Adversarial Examples. arXiv preprint, arXiv 1907.07174.

Hill, Kashmir. 2020. The secretive company that might end privacy as we know it. New York Times, January 18, 2020.

Hodson, Hal. 2019. DeepMind and Google: The battle to control artificial intelligence. 1843: Stories of an Extraordinary World, April/May 2019.

Houser, Kristin. 2019. Amazon Claims Rekognition can now Detect Fear. Futurism, August 14, 2019.

Hymas, Charles. 2019. AI used for first time in job interviews in UK to find best applicants. The Telegraph, September 27, 2019.

Johnson, Carolyn. 2019. Racial bias in a medical algorithm favors white patients over sicker black patients. Washington Post, October 24, 2019.

Kapoor, Sayash and Arvind Narayanan. 2022. Leakage and the Reproducibility Crisis in ML-Based Science, arXiv preprint 2207.07048.

Knight, Will. 2019. An AI Pioneer Wants His Algorithms to Understand the 'Why', WIRED, October 8, 2019.

Lake, Brenden and Gregory Murphy. 2020. Word meaning in mind and machines. arXiv preprint, 2008.01766.

Lu, Hongjing, Gennady Erlikhman, and Philip J. Kellman. 2018. Deep convolutional networks do not classify based on global object shape. PLOS Computational Biology, December 7, 2018.

Mac, Ryan. 2021. Facebook Apologizes After A.I. Puts 'Primates' Label on Video of Black Men. The New York Times, September 3, 2021.

Marcus, Gary. 2020. The Next Decade in AI: Four Steps Toward Robust Artificial Intelligence. arXiv preprint, arXiv 2002.06177.

Marcus, Gary and Ernest Davis. 2019. Are Neural Networks About to Reinvent Physics? Nautilus, November 21, 2019.

Marcus, Gary and Ernest Davis. 2020. GPT-3, Bloviator: OpenAI's language generator has no idea what it's talking about. Technology Review, August 22, 2020.

Marshall, Aarian. 2019. Self-Driving Trucks are Ready to do Business in Texas. WIRED, August 6, 2019.

Marshall, Aarian. 2019. Feds Say Tesla Autopilot is Partly to Blame for a 2018 Crash. WIRED, September 4, 2019.

Marshall, Aarian and Alex Davies. 2019. Uber’s Self-Driving Car Didn’t Know Pedestrians Could Jaywalk. WIRED, November 5, 2019.

McMahon, Bryan. 2020. How the police use AI to track and identify you. The Gradient, October 3, 2020.

Menn, Joseph. 2019. Microsoft turned down facial-recognition sales on human rights concerns. Reuters, April 16, 2019.

Merchant, Brian. 2019. There is absolutely no reason to trust the safety record of Tesla's Autopilot System. Gizmodo, May 31, 2019.

Metz, Cade. 2019. We Teach A.I. Systems Everything, Including Our Biases. The New York Times, November 11, 2019.

Metz, Cade and Adam Satariano. 2020. An Algorithm that grants freedom, or takes it away. The New York Times, February 6, 2020.

Metz, Cade and Natasha Singer. 2019. A.I. Experts Question Amazon's Facial-Recognition Technology. The New York Times, April 3, 2019.

Mozur, Paul. 2019. One Month, 500,000 Face Scans: How China is using A.I. to Profile a Minority. The New York Times, April 14, 2019.

Ongweso Jr, Edward. 2019. Racial Bias in AI Isn’t Getting Better and Neither Are Researchers’ Excuses. Motherboard Tech by Vice. July 29, 2019.

Pavlus, John. 2021. Same or Different? The Question Flummoxes Neural Networks. Quanta Magazine, June 23, 2021.

Rousseau, Anne-Laure, Clément Baudelaire, and Kevin Riviera. 2020. Doctor GPT-3: Hype or reality? Nabla, October 27, 2020.

Schöller, Christoph, Vincent Aravantinos, Florian Lay, and Alois Knoll. 2019. The Simpler the Better: Constant Velocity for Pedestrian Motion Prediction. arXiv preprint, arXiv 1903.07933.

Shane, Janelle. 2019. Is That a Giraffe or a Cockroach? Slate, November 5, 2019.

Shead, Sam. 2021. Amazon's Alexa assistant told a child to do a potentially lethal challenge. CNBC, December 29, 2021.

Silverman, Craig. 2019. How to Game Google to Make Negative Results Disappear. BuzzFeed News, June 27, 2019.

Simonite, Tom. 2019. The Best Algorithms Struggle to Recognize Black Faces Equally. WIRED, July 22, 2019.

Speer, Robyn. 2017. ConceptNet Numberbatch 17.04: better, less-stereotyped word vectors. ConceptNet blog, April 24, 2017.

Sung, Morgan. 2019. The AI Renaissance portrait generator isn't great at painting people of color. Mashable, July 23, 2019.

Thomas, Rachel and Chris Wiggins. 2019. A Conversation about Tech Ethics with the New York Times Chief Data Scientist. fast.ai March 4, 2019.

Thompson, Clive. 2020. AI, the Transcription Economy, and the Future of Work. Wired. February 10, 2020.

Torbati, Yeganeh. 2019. Google Says Google Translate Can’t Replace Human Translators. Immigration Officials Have Used It to Vet Refugees. ProPublica, September 26, 2019.

Vincent, James. 2019. The Problem with AI Ethics. The Verge, April 3, 2019.

Vincent, James. 2019. This colorful printed patch makes you pretty much invisible to AI. The Verge, April 23, 2019.

Wilson, Kyle. 2020. The world's second largest Wikipedia is written almost entirely by one bot. Motherboard. Feb. 11, 2020.

Wykstra, Stephanie, and Undark. 2020. It was supposed to detect fraud. It wrongfully accused thousands instead. How Michigan's attempt to automate its unemployment system went horribly wrong. Atlantic Magazine June 7, 2020.

Zittrain, Jonathan. 2019. Intellectual Debt: With Great Power Comes Great Ignorance. Medium, July 24, 2019. Shorter version published in The New Yorker with the title The Hidden Costs of Automated Thinking, July 23, 2019.
