Search for relationships between image regions in a database of annotated images using a sketch or an example of the desired relationships. This is a small proof-of-concept tool based on the RAID descriptor (RAID: A Relation-Augmented Image Descriptor).
This tool requires Matlab R2015b or newer. It was tested on Windows 64 bit. It should also work on other platforms (including Unix and Mac), but was not tested there. Some recompilation may be required on these platforms (you will be asked when starting the tool); a Matlab-compatible compilation environment for your platform may then be necessary, for details go here.
- Clone or download code into a folder.
- Download the dataset available here and place it in a folder called 'data' at the same level as the code folder (the tool expects to find a '../data' folder relative to the code folder).
- Open Matlab, navigate to the code folder and type 'restartEditor'. (At this point you may be asked to recompile if you are NOT running Windows 64 bit).
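After these steps, the folder layout should look roughly like this ('code' stands for whatever name the cloned folder was given):

```
parent_folder/
    code/   (the cloned repository)
    data/   (the downloaded dataset)
```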
Use the navigate button on the top left (or press 1) to toggle navigation mode. In navigation mode, you can pan with the left mouse button and zoom with the right mouse button (or ctrl + left mouse button).
Image Dataset Navigation
Drag the grey image preview bar with the right mouse button (or ctrl + left mouse button). Select an image by double-clicking on it.
Select the example image from the dataset. Then select an example source region by clicking on the 'Example Query Source' button and selecting an image region. Select an example target region in the same way. Then press one of the four query buttons ('X' stands for an arbitrary label). Result images containing similar relationships, sorted by similarity to the query, are then shown in the gray image preview bar. To reset, press 'Clear Query'.
Sketch the query source region by pressing 'Sketch Query Source' and drawing on the gridded canvas. Similarly, sketch the query target region by pressing 'Sketch Query Target'. To change the size of the brush, hold the middle mouse button (or shift + left mouse button) and drag the mouse up or down. Erase with the right mouse button (or ctrl + left mouse button) and reset the sketch by pressing the Delete key. Press the 'l' key while the 'Sketch Query Source' or 'Sketch Query Target' button is active to add a label to the source or target region, respectively. Then press one of the four query buttons ('X' stands for an arbitrary label). Result images containing similar relationships, sorted by similarity to the query, are then shown in the gray image preview bar. To reset, press 'Clear Query'.
The default dataset consists of 10,000 manually annotated images from the COCO dataset. A different dataset can be loaded from the File menu. Pre-computed descriptors are loaded along with each dataset. This can take quite some time, since the data structure used to store the pre-computed descriptors is not optimized for fast saving/loading. A faster data structure would certainly be possible, but would require re-writing some parts of the code.
Custom Datasets (experimental)
To create a custom dataset, annotations of image regions and dataset meta-information have to be created. Details on the formats are given below. Then, the dataset needs to be loaded in the editor (File > Load Database) and RAID descriptors need to be pre-computed for all image region relationships (Actions > Compute & Save Descriptors). The pre-computed descriptors can then be loaded manually (File > Load Descriptors), or automatically when the dataset is loaded if they are referenced in the dataset meta-information, as described below. The dataset is then ready for querying.
Annotations of image regions are stored in separate files, one per image. A simple xml format describes the annotated image regions; a simple example is shown below:
```xml
<?xml version="1.0" encoding="utf-8"?>
<xml>
  <annotation>
    <object id="1" iscrowd="0" label="2">
      <polygon verts="0.46107189542483656,0.60164141414141414,0.54691339869281053,0.60164141414141414,0.54691339869281053,0.40719696969696972,0.46107189542483656,0.40719696969696972,"/>
    </object>
    <object id="2" iscrowd="0" label="1">
      <polygon verts="0.38471241830065356,0.53577020202020198,0.6152843137254902,0.53577020202020198,0.6152843137254902,0.47020959595959599,0.38471241830065356,0.47020959595959599,"/>
    </object>
  </annotation>
</xml>
```
Each region is stored as an object with one or more polygons. The 'id' attribute contains a unique id for each image region and the 'label' attribute contains the index of the label of the region (the list of possible labels is stored in a separate file that will be described below). Polygon vertices are stored in the format x,y,x,y,... Coordinates are normalized to lie in [0,1], where (0,0) is the lower left image corner and (1,1) the upper right corner. Annotations are expected to have the same name as the image files (including any sub-paths from the dataset root folder), but ending in '.xml'.
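To make the format concrete, here is an illustrative Python parser for such an annotation file (the tool itself reads these files in Matlab; all names below are our own, not part of the tool):

```python
import xml.etree.ElementTree as ET

def parse_annotation(xml_text):
    """Return a list of regions: (id, label, iscrowd, list of polygons)."""
    if xml_text.lstrip().startswith('<?'):
        # ET.fromstring rejects str input that carries an encoding declaration
        xml_text = xml_text.split('?>', 1)[1]
    regions = []
    for obj in ET.fromstring(xml_text).iter('object'):
        polygons = []
        for poly in obj.iter('polygon'):
            # vertices are stored as 'x,y,x,y,...' with a trailing comma
            coords = [float(c) for c in poly.get('verts').rstrip(',').split(',')]
            polygons.append(list(zip(coords[0::2], coords[1::2])))
        regions.append((int(obj.get('id')), int(obj.get('label')),
                        obj.get('iscrowd') == '1', polygons))
    return regions
```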
In memory, annotations are stored in the ImageAnnotation class, which contains a list of ImageObject instances, corresponding to the object xml nodes:
```matlab
classdef ImageAnnotation < handle
    properties(SetAccess=protected)
        imgobjects = ImageObject.empty(1,0);              % the list of image objects
        relationships = ImageObjectRelationshipSet.empty; % optional: a list of labeled image object relationships
    end
```
```matlab
classdef ImageObject < handle
    properties(SetAccess=protected)
        id = 0;              % id of the image object, needs to be unique within the image (0 means no id)
        label = 0;           % label index of the image object (0 means no label)
        polygon = cell(1,0); % polygons that form the image region for this object;
                             % 2xn array, first row is x-, second row y-coordinates
                             % (polygons are always clockwise, since every polygon is supposed
                             % to represent an outer contour; holes are not allowed for now)
        iscrowd = false;     % true if the polygon represents multiple objects,
                             % too many to annotate all individually (for example a pile of apples)
        saliency = nan;      % saliency of the image object (nan means unknown)
    end
```
Annotations can be read with
```matlab
annotation = AnnotationImporter.importAnnotation(filename)
```
and written with
Database meta-information is stored in several files that are referenced in a simple text file. An example is shown below:
```
dbinfo         : ../data/annotations/artificial_manual/dbinfo.txt
categories     : ../data/annotations/artificial_manual/categories.xml
relcategories  : ../data/annotations/artificial_manual/relcategories.xml
categorycolors : ../data/annotations/artificial_manual/categorycolors.txt
imagepath      : ../data/images/artificial
annotationpath : ../data/annotations/artificial_manual/artificial
workingset     : ../data/annotations/artificial_manual/artificial_AllButDwarfingAndCapping.txt
trainingset    : ../data/annotations/artificial_manual/artificial_AllButDwarfingAndCapping.txt
descriptors    : ../data/descriptors/Descriptors_artificial_manual_AllButDwarfingAndCapping__iradf-1_oradf-0p5.zip
```
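This file is a plain list of 'key : value' pairs. As an illustration (a Python sketch with our own names, not the tool's actual Matlab loader), it could be read like this:

```python
def parse_dbinfo_refs(text):
    """Parse lines of the form 'key : value' into a dict."""
    refs = {}
    for line in text.splitlines():
        if ':' not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(':')
        refs[key.strip()] = value.strip()
    return refs
```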
Each of these files is described below:
dbinfo

A text file listing all database images. Each line contains one database image, followed by a comma and a 0 (this number indicates whether the image has annotated region relationships, but setting it to 1 is optional). For example:
```
bridgingHor_0.png,0
bridgingHor_1a.png,0
bridgingHor_1b.png,0
bridgingHor_1c.png,0
bridgingHor_2a.png,0
bridgingHor_2b.png,0
bridgingHor_3a.png,0
bridgingHor_3b.png,0
...
```
categories

An xml file listing all label categories for image regions used in the database, for example:
```xml
<?xml version="1.0" encoding="utf-8"?>
<xml>
  <category id="1" name="window"/>
  <category id="2" name="car"/>
  <category id="3" name="ladder"/>
  <category id="4" name="fence"/>
  <category id="5" name="tree"/>
  ...
</xml>
```
Each category gets a unique id that is referenced by the image annotations and a human-readable name.
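For illustration, the id-to-name mapping in this file can be extracted with a short Python sketch (the names below are our own; the tool reads this file in Matlab):

```python
import xml.etree.ElementTree as ET

def parse_categories(xml_text):
    """Return a dict mapping category id (int) to its human-readable name."""
    if xml_text.lstrip().startswith('<?'):
        # ET.fromstring rejects str input that carries an encoding declaration
        xml_text = xml_text.split('?>', 1)[1]
    root = ET.fromstring(xml_text)
    return {int(c.get('id')): c.get('name') for c in root.iter('category')}
```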
relcategories

An xml file listing all label categories for image region relationships used in the database, for example:
```xml
<?xml version="1.0" encoding="utf-8"?>
<xml>
  <relcategory id="1" name="bridgingHor"/>
  <relcategory id="2" name="crossingVert"/>
  <relcategory id="3" name="capping"/>
  <relcategory id="4" name="hanging"/>
  <relcategory id="5" name="leaning"/>
  ...
</xml>
```
Populating this list with relationship categories is optional and only needed if relationship labels are required (e.g. when classifying relationships). The list can be left empty when only issuing sketch- or example-based queries.
categorycolors

A text file listing the color used for each image region label category, for example:
```
#00FF00
#0000FF
#FF0000
#00F6FF
#FF7BDC
#FFDC7B
#008DFF
#006A09
```
Line 1 contains the color for the label with id 1, line 2 for id 2, etc. Colors are stored as RGB hex triplets. The function distinguishable_colors can create a set of perceptually distinct colors, and the function rgbdec2rgbhex converts between a standard RGB color representation with 3 values in [0,1] and these hex triplets.
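distinguishable_colors and rgbdec2rgbhex are Matlab functions shipped with the tool; purely for reference, the hex-to-[0,1] direction of the conversion can be sketched like this (a Python illustration, names our own):

```python
def hex_to_rgb(hex_triplet):
    """Convert an '#RRGGBB' hex triplet to an (r, g, b) tuple in [0, 1]."""
    h = hex_triplet.lstrip('#')
    return tuple(int(h[i:i + 2], 16) / 255.0 for i in (0, 2, 4))
```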
imagepath and annotationpath
These refer to the root path of the image dataset and the annotation files for the dataset. In these root folders, images and annotations need to have the same sub-folder hierarchy and the same file names, except for the image extension, which is always '.xml' for annotations.
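The correspondence between image and annotation paths can be sketched as follows (a Python illustration with hypothetical paths; the tool itself resolves paths in Matlab):

```python
import posixpath  # used instead of os.path for deterministic '/' separators

def annotation_path(annotation_root, relative_image_file):
    """Map an image file, given relative to the image root, to its annotation file.

    The sub-folder hierarchy and file name are kept; only the extension
    becomes '.xml'.
    """
    stem, _ = posixpath.splitext(relative_image_file)
    return posixpath.join(annotation_root, stem + '.xml')
```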
workingset

This line is optional and can be omitted. Instead of working with the full dataset, a subset can be used. This subset is specified by listing all subset image files in a text file, for example:
```
bridgingHor_0.png
bridgingHor_1a.png
bridgingHor_1b.png
bridgingHor_1c.png
bridgingHor_2a.png
bridgingHor_2b.png
bridgingHor_3a.png
...
```
One line for each image. Queries will be issued into this subset of the full dataset. If this line is not present, the workingset is set to the full dataset.
trainingset

This line is optional and can be omitted. The training set is only needed for classification (not for queries). It is specified in the same way as the workingset.
descriptors

If pre-computed RAID descriptors are available for the workingset, they can be referenced here. These descriptors are then used for queries.
For any questions or comments, please contact paul.guerrero (a) ucl.ac.uk