Question templates in the CycIC training set

Ernest Davis
Originally posted May 2020; revised July 2020.

CycIC is a data set created to evaluate AI systems' ability to do commonsense reasoning; it was generated synthetically from the CYC system.

This document is an analysis of the file "cycic_training_questions.jsonl", downloaded from the link above on July 22, 2020.

A few general comments:

My primary interest in these questions is what they show about CYC. What they mostly illustrate is that CYC has a huge taxonomy of categories of things with properties, and a random collection of inference techniques.

My comments on individual templates are in italics.

Template 1: "would likely be damaged when it hit the concrete" or "When it hit the concrete, it would likely be damaged": 247 questions:

Comment: Current language modeling systems (BERT etc.) are good enough that they are very unlikely to have any problem ignoring the distractor object (glass jar etc.). So the distractor is quite pointless (doubly so, of course, in the third example below).

Misty had a glass jar and a monitor. She dropped the monitor off the roof.
True or false: the thing she dropped would likely be damaged when it hits the concrete.

Misty had a dishcloth and a blanket. She dropped the blanket off the roof.
True or false: the thing she dropped would likely be damaged when it hits the concrete.

Misty had a figurine made of porcelain and a figurine made of porcelain [sic]. She threw the figurine made of porcelain off the roof.
True or false: the thing she dropped would likely be damaged when it hits the concrete.

Template 2: "Jerome got a surprise delivery." 153 questions

Jerome got a surprise delivery. The thing delivered was a moon snail or a hat. The thing delivered to Jerome definitely isn't a man made thing.
True or False: a hat was delivered to Jerome.

Jerome got a surprise delivery. The thing delivered was a Puerto Rican crested toad or a cupcake. The thing delivered to Jerome definitely isn't a creature.
True or False: a cupcake was delivered to Jerome.

Jerome got a surprise delivery. The thing delivered was a candy thermometer or a blue book. The thing delivered to Jerome definitely isn't a kitchen appliance.
True or False: a blue book was delivered to Jerome.

Template 3: "April missed a call": 639 questions

Note that there are only 288 combinations of an hour of a day with an integer number of hours between 1 and 12. In particular, the question "April missed a call at noon. She returned it 1 hour later." occurs, word for word, 8 times in the training set.

April missed a call at 10 a.m.. She returned it 11 hours later. When did she return the call?
A. 2 a.m. B. 1 p.m. C. 8 p.m. D. 9 p.m. E. 2 p.m.

April missed a call at 5 p.m.. She returned it 9 hours later. When did she return the call?
A. 9 p.m. B. 12 a.m. C. 11 a.m. D. 11 p.m. E. 2 a.m.

April missed a call at 2 a.m.. She returned it 1 hour later. When did she return the call?
A. 12 a.m. B. 11 p.m. C. 4 a.m. D. 1 a.m. E. 3 a.m.
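
The count of 288 in the note above, and the clock arithmetic the template asks for, can be checked with a short Python sketch. It assumes that every question uses a whole hour of the day and a delay of 1 to 12 hours; the function names are illustrative and are not taken from CycIC or CYC.

```python
def parse_hour(label: str) -> int:
    """Convert a label such as '10 a.m.', '12 p.m.' or 'noon' to an hour in 0..23."""
    if label == "noon":
        return 12
    if label == "midnight":
        return 0
    number, meridiem = label.split()
    hour = int(number) % 12            # '12' maps to 0 before the a.m./p.m. shift
    if meridiem.startswith("p"):
        hour += 12
    return hour


def format_hour(hour: int) -> str:
    """Convert an hour count (possibly past 23) back to a 12-hour clock label."""
    hour %= 24
    suffix = "a.m." if hour < 12 else "p.m."
    return f"{hour % 12 or 12} {suffix}"


def return_time(missed: str, hours_later: int) -> str:
    """Time at which the call is returned, wrapping around midnight."""
    return format_hour(parse_hour(missed) + hours_later)


# Every (hour of day, delay) pair the template can express under the assumption above.
combinations = [(format_hour(h), d) for h in range(24) for d in range(1, 13)]

# With 639 questions and this many combinations, at least this many questions
# must repeat a (time, delay) pair already used by another question.
forced_repeats = 639 - len(combinations)
```

Here `return_time("10 a.m.", 11)` reproduces the answer to the first example, and `len(combinations)` gives the 288 combinations mentioned in the note.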

Template 4: "In Dill Town, you can only buy" 285 questions

In Dill town, you can only buy fruit at Pepper's Store, and you can only buy flowers at Cherry's Store. In a strange coincidence, any time Basil is at Pepper's Store, Lily is at Carrot's Store, and any time he is at Cherry's Store, she is at Rose's Store. Basil is buying purple mangosteens. Where is Lily?
A. Pepper's. B. Rosemary's C. Carrot's. D. Rose's Store.

In Dill town, you can only buy herbs at Carrot's Store, and you can only buy flesh at Pepper's Store. In a strange coincidence, any time Basil is at Carrot's Store, Lily is at Cherry's Store, and any time he is at Pepper's Store, she is at Rose's Store. Basil is buying Cilantro. Where is Lily?
A. Pepper's Store B. Rose's Store C. Rosemary's Store D. Carrot's Store E. Cherry's Store

In Dill town, you can only buy office products at Carrot's Store, and you can only buy cereal grains at Pepper's Store. In a strange coincidence, any time Basil is at Carrot's Store, Lily is at Cherry's Store, and any time he is at Pepper's Store, she is at Pepper's Store. Basil is buying grain of rice. Where is Lily?
A. Carrot's Store B. Cherry's Store C. Pepper's Store D. Rose's Store E. Rosemary's Store

Template 5: "If a <car brand> has passed its inspection ..." 13 questions.

These aren't even right. A broken window, a cracked headlight, a broken muffler or a missing front bumper will all cause a car to fail inspection.

If a Dodge just passed its inspection, then the Dodge has no issues with its?
A. CD player. B. tinted car window. C. DVD player. D. radar system E. bumper.

If a eagle [sic] just passed its inspection, then Chrysler Eagle has no requirements involving its?
A. radar detector. B. tinted car window. C. headlight. D. ashtray. E. CD player.

If a Suzuki just passed its inspection, then the Suzuki has no issues with its?
A. radar detector. B. muffler. C. GPS receiver. D. tinted car window. E. DVD player.

Template 6: "Who did something affecting X" 136 questions

Comment: Distractors consisting of names not mentioned in the text are completely useless.

Melody was gobbled by Rob. Melody commended Charity. Who did something affecting Melody?
A. Joy. B. Charity. C. Daisy. D. Cliff. E. Rob.

Charity intimidated Rob. Rob derided Charity. Who did something affecting Rob?
A. Rob. B. Cliff. C. Daisy. D. Joy. E. Charity.

Melody was extradited by Cliff. Melody shamed Cliff. Who did something affecting Cliff?
A. Rob. B. Cliff. C. Joy. D. Melody. E. Daisy.
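
To illustrate the comment on this template, here is a minimal Python sketch of a filter that discards any answer choice whose name does not appear in the story. The function name is hypothetical, and the sketch assumes that each answer choice is a single first name.

```python
import re


def mentioned_options(story: str, options: list[str]) -> list[str]:
    """Keep only the answer choices whose name occurs as a whole word in the story."""
    return [
        name for name in options
        if re.search(rf"\b{re.escape(name)}\b", story)
    ]


story = "Melody was gobbled by Rob. Melody commended Charity."
options = ["Joy", "Charity", "Daisy", "Cliff", "Rob"]
remaining = mentioned_options(story, options)
```

In each of the three examples above, this filter alone cuts the five choices down to two, without any reference to the meaning of the verbs.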

Template 9: "In Strange Town, you're only allowed ..." 327 questions.

Note the odd fact that, in the second and third examples, apparently the cinema is in the graveyard.

In Strange Town, you're only allowed to play a team sport in the theater district, you're only allowed to attend entertainment events in the graveyard, and you're only allowed to engage in politics in the shipyard. Joy is baseball play. [sic] Where is Joy?
A. the shipyard. B. the graveyard. C. the financial district. D. the botanical garden. E. the physical universe.

In Strange Town, you're only allowed to play board games in the town square, you're only allowed to attend entertainment events in the graveyard, and you're only allowed to play a card game in the campground. Daisy is watching a movie at the cinema. Where is Daisy?
A. the camptown. B. the town square. C. the graveyard. D. the shipyard E. the park

In Strange Town, you're only allowed to engage in politics in the forest, you're only allowed to attend entertainment events in the graveyard, and you're only allowed to play a team sport in the theater district. Cliff is watching PG-rated films at a theater. Where is Cliff?
A. the financial district B. the graveyard C. the forest D. the shipyard E. the theater district

Template 10: "Cliff is friends with everyone ..." 116 questions.

Cliff is friends with everyone from Autumn Town. He's enemies with everyone from Summer Town. He doesn't know about anyone who lives outside those two towns. Joy lives in Winter Town.
True or False: Cliff and Joy are on friendly terms.

Cliff is friends with everyone from Winter Town. He's enemies with everyone from Summer Town. He doesn't know about anyone who lives outside those two towns. Joy lives in Summer Town.
True or False: Cliff opposes Joy.

Cliff is friends with everyone from Winter Town. He's enemies with everyone from Spring Town. He doesn't know about anyone who lives outside those two towns. Joy lives in Winter Town.
True or False: Cliff is an opponent of Joy.

You may be wondering: How can there be 116 different variants of this rather limited form? The tally of the relations queried in the true/false statement is as follows:

 6 "are mutual enemies"
 8 "are pals"
12 "are on friendly terms"
 8 "are on unfriendly terms" 
 9 "considers Joy an enemy" 
11 "is an opponent of" 
10 "is aware of" 
 1 "is buddies with" 
 2 "is friends with" 
 2 "has a negative vested interest in" 
 7 "has a positive vested interest in" 
 6 "has a vested interest in" 
 5 "Joy is liked by Cliff" 
 7 "knows something about" 
 6 "likes"  
 2 "likes Joy as a friend" 
14 "opposes" 
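
As a quick arithmetic check, the following Python sketch transcribes the tally above and confirms that the counts add up to the 116 questions stated for this template. The dictionary is copied from the list; nothing here is read from the data file itself.

```python
# Counts transcribed from the tally above (phrase -> number of questions).
tally = {
    "are mutual enemies": 6,
    "are pals": 8,
    "are on friendly terms": 12,
    "are on unfriendly terms": 8,
    "considers Joy an enemy": 9,
    "is an opponent of": 11,
    "is aware of": 10,
    "is buddies with": 1,
    "is friends with": 2,
    "has a negative vested interest in": 2,
    "has a positive vested interest in": 7,
    "has a vested interest in": 6,
    "Joy is liked by Cliff": 5,
    "knows something about": 7,
    "likes": 6,
    "likes Joy as a friend": 2,
    "opposes": 14,
}

total_questions = sum(tally.values())   # should match the 116 in the heading
distinct_phrasings = len(tally)         # number of different relation phrases
```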

Template 11: "... typically last ..." or "typically lasts" 136 instances.

Note the odd plurals in some of these answers: "fallings asleep", "watchings a movie at the cinema" etc. I am surprised to learn that there is a typical length to "changes of device state"; I can't guess how long it would be.

______ typically last 60 seconds:
A. changes of device state B. July fifths C. Minuteman III ICBM stage 1 flights D. groanings E. national hero celebrations

______ typically last a few hours.
A. fallings asleep B. watchings a movie at the cinema C. Novembers D. September 26ths E. assemblings wooden furniture parts

What typically lasts a few seconds?
A. July 23rd B. January tenth C. opening a container D. August second E. changing a motor vehicle's oil

Template 12: "Ginger bet Duke that no ..." or "Duke bet Ginger that no ..." 673 questions

Note: In all, or almost all, of these, the two categories involved in the bet are disjoint, so the bet is won. Once the AI realizes that, it no longer has to worry about the content of the bet; all it has to do is to match the profession of the subject of "bet" with the synonymous or identical profession in the multiple choice.

Duke bet Ginger that no Strawberry crab is a lake. Ginger is a brewer. Duke is a psychiatrist. Which of them won the bet?
A. lead guitarist B. brewer C. real estate agent D. pitcher E. psychiatrist

Duke bet Ginger that no machine tool is pasta. Ginger is an architect. Duke is a cowgirl. Which of them won the bet?
A. cowgirl B. pediatrician C. architect D. chemist E. short-order cook

Duke bet Ginger that no sea is a picnic ham. Ginger is a secretary. Duke is a controller. Which of them won the bet?
A. librarian B. secretary C. accountant D. mathematician E. travel agent