Commits

nfitzgerald committed 7f248f2

updated README.txt

  • Parent commits 6adc00b

Files changed (2)

 
 ----------------------------------------------------------------------
 
+1. Obtaining the Corpus
+
+There are two ways to obtain the corpus. The first is to use git, with the
+following command:
+
+$ git clone https://nfitzgerald@bitbucket.org/nfitzgerald/genx-referring-expression-corpus.git
+
+The corpus can also be downloaded as a .zip file under the "Downloads" link:
+
+https://bitbucket.org/nfitzgerald/genx-referring-expression-corpus/downloads
+
+
+----------------------------------------------------------------------
+
+2. Corpus directory and description
+
 Here is a directory of the files in this repository:
 
 <images>
-    Contains the .png images shown to the subjects on Mechanical Turk.
+    Contains the .png images shown to the subjects on Mechanical Turk. Each
+    file is named by the scene-ID of the particular scene.
 
 <state>
 
                 - The expression was ungrammatical in a way that could not be
                   easily resolved (i.e. it was just a list of attributes, not
                   a proper noun phrase).
+
+    <state> - contains the world-state information for each scene. There are
+        two files:
+
+            SceneIndex.txt - lists, for each scene, which objects are in the
+                target-set (G) - i.e. which objects are circled in the
+                corresponding image. Each line of the file is formatted as:
+                    <sceneID>::<selected objects>
+                where <selected objects> is a space-separated list of the
+                objectIDs of the objects in the target-set.
+
+            Attributes.tsv - lists, for each object, what attributes that
+                object has (e.g. colour, shape, etc.). Each line of the file is
+                formatted as:
+                    <objectID>\t<attributes>
+                where <attributes> is a comma-separated list of the object's
+                attributes. Each attribute is written as "value:type", for
+                example "red:color". Every object has the attribute "misc:misc",
+                which corresponds to words like "toy" or "object" that can
+                refer to any object in the scene.
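
The two line formats described above can be read with a few lines of Python. This is a minimal sketch and not part of the commit: the function names and the sample IDs ("scene12", "obj3", "obj7") and attribute values are hypothetical, chosen only to illustrate the "<sceneID>::<selected objects>" and "<objectID>\t<attributes>" layouts as the README describes them.

```python
from typing import Dict, List, Tuple


def parse_scene_index_line(line: str) -> Tuple[str, List[str]]:
    """Split '<sceneID>::<selected objects>' into (sceneID, [objectID, ...])."""
    scene_id, _, selected = line.rstrip("\n").partition("::")
    # The selected objects are a space-separated list of objectIDs.
    return scene_id.strip(), selected.split()


def parse_attributes_line(line: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split '<objectID>\\t<attributes>' into (objectID, {type: [value, ...]})."""
    object_id, _, attr_field = line.rstrip("\n").partition("\t")
    by_type: Dict[str, List[str]] = {}
    for attr in attr_field.split(","):
        attr = attr.strip()
        if not attr:
            continue
        # Each attribute is written "value:type", for example "red:color".
        value, _, attr_type = attr.rpartition(":")
        by_type.setdefault(attr_type, []).append(value)
    return object_id.strip(), by_type


if __name__ == "__main__":
    # Hypothetical sample lines; the real IDs in the corpus may look different.
    print(parse_scene_index_line("scene12::obj3 obj7"))
    print(parse_attributes_line("obj3\tred:color,cube:shape,misc:misc"))
```

Grouping attribute values by type makes it easy to ask, for instance, for every colour value of an object, while the "misc:misc" entry is kept like any other attribute.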

File state/.convert_names.py.swp

Binary file removed.