Ernest Davis: Informal Experiments

Ernest Davis: Reports on Informal Experiments

Rerunning experiments on text to image generation with unusual object/body part specifications: July 2025

A set of 20 handcrafted problems from high school and introductory undergraduate math, and the performance of three recent variants of ChatGPT, July 4, 2025.

The Bot and the Psalter, April 26, 2025

Don’t Ride This Bike! Generative AI’s persistent trouble with compositionality and parts, Gary Marcus and Ernest Davis, December 8, 2024.
Complete results.

Testing GPT-4-o1-preview on math and science problems: A follow-up study. October 2024.

ChatGPT: Experiments in analyzing and generating meter and rhyme. April 2024.

Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems. Ernest Davis and Scott Aaronson. arXiv 2308.05713. August 2023. Additional Material

DALL-E is really lousy at object parts and body parts. October 2022.

Some more GPT-3 Experiments. June 2022.

Experiments in Commonsense Reasoning in GPT-3: Status Report from June 2022. Ernest Davis and Gary Marcus, June 2022.

A very preliminary analysis of DALL-E 2, by Gary Marcus, Ernest Davis, and Scott Aaronson. arXiv 2205.13807. April 2022.

Experiments testing GPT-3's ability at commonsense reasoning: results, by Gary Marcus and Ernest Davis, August 2020.

Winograd Schemas and Machine Translation: Some Examples. January 2020.

Google Translate fails on simple sentences. October 2016, with subsequent updates.