Ernest Davis: Reports on Informal Experiments

Don’t Ride This Bike! Generative AI’s persistent trouble with compositionality and parts, Gary Marcus and Ernest Davis, December 8, 2024.
Complete results.

Testing GPT-4-o1-preview on math and science problems: A follow-up study. October 2024.

ChatGPT: Experiments in analyzing and generating meter and rhyme. April 2024.

Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems. Ernest Davis and Scott Aaronson. arXiv 2308.05713. August 2023. Additional Material

DALL-E is really lousy at object parts and body parts. October 2022.

Some more GPT-3 Experiments. June 2022.

Experiments in Commonsense Reasoning in GPT-3: Status Report from June 2022. Ernest Davis and Gary Marcus, June 2022.

A very preliminary analysis of DALL-E 2, by Gary Marcus, Ernest Davis, and Scott Aaronson. arXiv 2205.13807. April 2022.

Experiments testing GPT-3's ability at commonsense reasoning: results. Gary Marcus and Ernest Davis, August 2020.

Winograd Schemas and Machine Translation: Some Examples. January 2020.

Google Translate fails on simple sentences. October 2016, with subsequent updates.