nfitzgerald committed 7f248f2

updated README.txt

  • Parent commits 6adc00b

Files changed (2)

+1. Obtaining the Corpus
+There are two ways to obtain the corpus. The first is to use git, with the
+following command:
+$ git clone
+The corpus can also be downloaded as a .zip file under the "Downloads" link:
+2. Corpus directory and description
 Here is a directory of the files in this repository:
-    Contains the .png images shown to the subjects on Mechanical Turk.
+    Contains the .png images shown to the subjects on Mechanical Turk. Each
+    file is named by the scene-ID of the particular scene.
                 - The expression was ungrammatical in a way that could not be
                   easily resolved (e.g. it was just a list of attributes, not a
                   proper noun phrase).
+    <state> - contains the world-state information for each scene. There are
+        two files:
+            SceneIndex.txt - lists, for each scene, which objects are in the
+                target-set (G) - i.e. which objects are circled in the
+                corresponding image. Each line of the file is formatted as:
+                    <sceneID>::<selected objects>
+                where <selected objects> is a space-separated list of the
+                objectIDs of the corresponding objects.
+            Attributes.tsv - lists, for each object, what attributes that
+                object has (e.g. colour, shape, etc.). Each line of the file is
+                formatted as:
+                    <objectID>\t<attributes>
+                where <attributes> is a comma-separated list of the object's
+                attributes. Each attribute is written as "value:type", for
+                example "red:color". Every object has the attribute "misc:misc",
+                which corresponds to words like "toy" or "object" that can
+                refer to any object in the scene.
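
The two line formats described above can be read with a few lines of Python. This is only a minimal sketch based on the format description in this README; the function names are made up for illustration and are not part of the corpus, and the IDs in the usage example are invented.

```python
from typing import List, Tuple


def parse_scene_index_line(line: str) -> Tuple[str, List[str]]:
    """Split '<sceneID>::<selected objects>' into (sceneID, [objectID, ...])."""
    scene_id, _, selected = line.rstrip("\n").partition("::")
    # The selected objects are separated by whitespace.
    return scene_id, selected.split()


def parse_attributes_line(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split '<objectID>\\t<attributes>' into (objectID, [(value, type), ...])."""
    object_id, _, attributes = line.rstrip("\n").partition("\t")
    pairs = []
    for item in attributes.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries, e.g. from a trailing comma
        # Each attribute is written as "value:type", e.g. "red:color".
        value, _, attr_type = item.partition(":")
        pairs.append((value, attr_type))
    return object_id, pairs


if __name__ == "__main__":
    # Hypothetical IDs, used only to show the shape of the result.
    print(parse_scene_index_line("scene1::obj1 obj2"))
    print(parse_attributes_line("obj1\tred:color,misc:misc"))
```

To read a whole file, apply the matching function to each non-empty line of SceneIndex.txt or Attributes.tsv.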

File state/

Binary file removed.