Updated Winograd Schema Challenge Competition Rules
1.Deadline: The deadline for registration is July 1, 2016.
The competition itself will be held at
IJCAI 2016 in New York City, July 12, 2016.
2.Input format: Contestant programs will receive their input
in the form of an .xml file. An example file may be found at
http://www.cs.nyu.edu/faculty/davise/papers/WSCExample.xml.
The structure of the .xml should be self-explanatory on inspection
of this file.
3.Problems: All problems have the following form. There is a text of a
single sentence or a few sentences that contains one or more pronouns
with ambiguous referents. Each problem asks about one such pronoun. The
pronoun that is the subject of the problem is demarcated in the XML
input with the tag <pron> </pron>. (In viewing the XML file in a web
browser, the pronoun appears in boldface.) After the text, there is a short
excerpt from the text containing the pronoun that is the subject of the problem
and a few words that occur on one side or the other; this is for the benefit
of human viewers. Finally, a list of possible referents is given, labelled
"A", "B", "C" ... For example:
Babar wonders how he can get new clothing. Luckily, a very rich old man
who has always been fond of little elephants understands right away
that he is longing for a fine suit.
As he likes to make people happy, he gives him his wallet.
he is longing
All questions will appear to be Winograd Schema halves. There will be a sentence or short sequence of sentences with at least one pronoun that has two or more possible referents; the system's task is to correctly identify the referent of the pronoun. Winograd Schema halves have an alternate, hidden form in which a special word or phrase can be substituted in the sentence, resulting in another referent for the pronoun. Sentences that do not have this alternate, hidden form are known as Pronoun Disambiguation Problems.
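As a rough illustration of how a contestant program might locate the marked pronoun, here is a minimal Python sketch. Only the pron tag comes from the rules above. The text wrapper element, the sample string, and the function name find_pronoun are assumptions made for this example; the real element names should be taken from WSCExample.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only <pron> is named in the rules; the <text>
# wrapper element is an assumption made for this sketch.
SAMPLE = (
    "<text>Luckily, a very rich old man understands right away that "
    "<pron>he</pron> is longing for a fine suit.</text>"
)


def find_pronoun(xml_fragment):
    """Return (plain_text, pronoun, offset) for the first <pron> element."""
    root = ET.fromstring(xml_fragment)
    pieces = []      # text chunks collected in document order
    pronoun = None   # text of the marked pronoun
    offset = None    # character offset of the pronoun in the plain text

    def walk(elem):
        nonlocal pronoun, offset
        if elem.tag == "pron" and pronoun is None:
            offset = sum(len(p) for p in pieces)
            pronoun = elem.text or ""
        if elem.text:
            pieces.append(elem.text)
        for child in elem:
            walk(child)
            # Text that follows a child element is stored in its tail.
            if child.tail:
                pieces.append(child.tail)

    walk(root)
    if pronoun is None:
        raise ValueError("no <pron> element found")
    return "".join(pieces), pronoun, offset


if __name__ == "__main__":
    text, pronoun, offset = find_pronoun(SAMPLE)
    print(text)
    print(pronoun, offset)
```

Keeping the character offset alongside the flattened text lets a resolver tell the marked pronoun apart from other occurrences of the same word in the passage.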
4.Number of rounds and questions: There will be two rounds in the competition. The first round will consist of at least 60 Pronoun Disambiguation Problems. The second round will consist of at least 60 Winograd Schemas. Only those excelling in the first round will advance to the second round.
5.Format for submissions:
1.Submissions must be made as executable code that will run on a personal computer or laptop, supplied on a disc or other memory device.
2.Submissions should be mailed to (and arrive by July 1, 2016):
Charlie Ortiz
Nuance Communications
1198 E. Arques Ave.
Sunnyvale, CA 94085
3.If your program requires access to an Internet search engine during processing, please let us know ahead of time so that this can be accommodated. Internet access will be made available through a fiber optic or cable modem line. Cellular and Wi-Fi access will be blocked. A restricted set of Internet sites will be available, including Google. All Internet access will be monitored and recorded.
4.Each submission must accept WSs and PDPs in the standard form indicated above.
5.Your software should create a file named TeamName-output.txt, where TeamName is your team's name, with output formatted so that each problem consists of two separate lines followed by a blank line (a carriage return), as follows:
- Schema Number:
- Answers
- Carriage Return
An example is shown in http://www.cs.nyu.edu/faculty/davise/WSSampleOutput.txt
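The output layout described in item 5 could be produced along the following lines. This is a sketch only: the function names and the exact content of each line are assumptions, and the sample output file linked above remains the authoritative reference for the format.

```python
from pathlib import Path


def format_results(results):
    """Format (schema_number, answer) pairs as two lines plus a blank line.

    The precise text expected on each line is an assumption here; check it
    against the sample output file before relying on this layout.
    """
    blocks = []
    for number, answer in results:
        # Line 1: schema number, line 2: answer, then an empty separator line.
        blocks.append(f"{number}\n{answer}\n\n")
    return "".join(blocks)


def write_results(team_name, results, directory="."):
    """Write the results to <team_name>-output.txt and return the path."""
    path = Path(directory) / f"{team_name}-output.txt"
    path.write_text(format_results(results), encoding="utf-8")
    return path


if __name__ == "__main__":
    # "ExampleTeam" is a placeholder team name.
    print(write_results("ExampleTeam", [(1, "A"), (2, "B")]))
```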
6. Time limit for processing a WS: 5 minutes of CPU time for each WS.
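One way an entry might keep track of its own CPU usage per problem is sketched below in Python. The rules do not say how CPU time will be measured, so this is an assumption: time.process_time counts user and system CPU time of the current process only (not child processes), and the class name CpuBudget is hypothetical.

```python
import time

CPU_LIMIT_SECONDS = 5 * 60  # 5 minutes of CPU time per WS, per item 6


class CpuBudget:
    """Track CPU time consumed by this process since the budget started."""

    def __init__(self, limit_seconds=CPU_LIMIT_SECONDS):
        self.limit = limit_seconds
        self.start = time.process_time()

    def used(self):
        # CPU seconds spent by the current process since construction.
        return time.process_time() - self.start

    def remaining(self):
        return self.limit - self.used()

    def exhausted(self):
        return self.remaining() <= 0


if __name__ == "__main__":
    budget = CpuBudget()
    # A solver loop could poll budget.exhausted() and fall back to its
    # best current guess before the limit is reached.
    print(budget.remaining() > 0)
```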
7.Evaluation criteria: The grand prize of $25,000 will be given to the best entry that scores at least 90% or within 3 percentage points of human performance, whichever is higher.
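Under one reading of this criterion, the qualifying score is the larger of 90% and human performance minus 3 percentage points. The Python sketch below encodes that reading; the function names are hypothetical, and the interpretation is an assumption rather than an official scoring procedure.

```python
def grand_prize_threshold(human_accuracy_pct):
    """Return the minimum qualifying score, in percent.

    Assumed reading: the higher of 90% and (human performance - 3 points).
    """
    return max(90.0, human_accuracy_pct - 3.0)


def meets_grand_prize_threshold(score_pct, human_accuracy_pct):
    return score_pct >= grand_prize_threshold(human_accuracy_pct)


if __name__ == "__main__":
    # Illustrative numbers only, not reported results.
    print(grand_prize_threshold(95.0))  # 92.0
    print(grand_prize_threshold(91.0))  # 90.0
```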
8.Registration form: All entrants should fill out the registration form, which will be available at www.CommonsenseReasoning.org/winograd.html
9.Reproducibility: This competition is meant to advance science. A prerequisite for receiving a prize is demonstrating reproducibility, for example by sharing code or publishing reproducible algorithms. Further details will be made available.
10.Supporting Team Formation: A web page will be linked from this page containing the names of teams that are looking for collaborators as well as software that they might want to share with other teams. Academic, industry, and mixed teams may all enter the competition.
11.Frequency of competition: TBD
12.Announcement of results: The results of each competition will be made public on the www.commonsensereasoning.org and www.nuance.com sites.
13.Evaluation committee (To be announced):
14.Best of competition prizes: For entries that do not meet the human threshold specified in (7), the following prizes will be given:
- $3,000 for the best entry that scores over [TBD]
- $1,500 for the second best entry that scores over [TBD]
15.Future tests: The committee is considering possible future theme-based tests (e.g., for particular areas of commonsense reasoning) or separate tracks.
16.Questions: If you have any questions about the competition, please contact leora.morgenstern@leidos.com or charles.ortiz@nuance.com