
# triNNity-optimizer

This project is a front-end for the triNNity DNN optimization approach.

## Build and Installation

`stack install`

## Optimization Modes

The currently supported optimization modes are `heuristic` and `pbqp`.

The `heuristic` mode has a number of heuristics which can be used to generate
an instantiation of a network. To see a list of these heuristics, say
`triNNity-optimizer heuristic --help`.

The `pbqp` mode uses the topology of the network and the provided cost models
to generate a large PBQP query, suitable for solving with an external PBQP
solver.

## Network Instantiation

The result of the optimization process is a solution file, which specifies what algorithm to use for each convolution in the network.

For convenience, the optimizer can apply that solution to generate the code for
an *instantiation* of the network. At present, only the triNNity primitive
library is supported, but support for other frameworks can be added fairly
easily.

For more information, say `triNNity-optimizer instantiate --help`.

### User Interface and Input Files

The optimizer uses several simple plaintext file formats to represent the necessary features of networks and of the hardware.

#### Network Topology Description File

This file describes the topology of the network in terms of its layers. At present, we only model convolution layers, because all other layers are inconsequential in terms of performance. This may change in the future.

The first line of the file must contain three whitespace-separated integers: the
number of nodes (N), the number of parameters (P) which define a node, and the
number of edges (E). For example, in
`examples/networks/alexnet/alexnet.topology`, the first line is `5 7 4`,
indicating N=5 nodes, each defined by P=7 parameters, and E=4 edges.

The following N lines must each contain P whitespace-separated integers, defining each node. The subsequent E lines must each contain exactly 2 whitespace-separated integers: the source and sink index for the edge.

While nodes and edges in the DNN graph may be specified in any order, they are
referred to by their index in the **order specified in the topology file**.
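As a concrete illustration, the format above can be parsed with a few lines of code. This is a minimal sketch in Python (not part of the project itself); the toy network and its parameter values are invented for the example:

```python
# Sketch of a parser for the topology format described above.
# The meaning of each of the P per-node integers is defined by the
# cost-model files, so they are treated here as opaque values.

def parse_topology(text):
    """Parse a topology file into (nodes, edges)."""
    tokens = text.split()
    n, p, e = int(tokens[0]), int(tokens[1]), int(tokens[2])
    rest = [int(t) for t in tokens[3:]]
    # N nodes of P parameters each, in file order.
    nodes = [rest[i * p:(i + 1) * p] for i in range(n)]
    # E edges, each a (source, sink) pair of node indices.
    flat = rest[n * p:n * p + e * 2]
    edges = [(flat[2 * i], flat[2 * i + 1]) for i in range(e)]
    return nodes, edges

# A toy linear topology with N=3 nodes, P=2 parameters, E=2 edges:
example = """3 2 2
8 3
16 3
32 1
0 1
1 2
"""
nodes, edges = parse_topology(example)
print(nodes)   # [[8, 3], [16, 3], [32, 1]]
print(edges)   # [(0, 1), (1, 2)]
```

Note that node indices in the edge list refer to file order, matching the indexing rule above.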

#### Network Constraints Description File

The constraints file describes constraints on network nodes that must hold
irrespective of the solving method. For example, if some node **must** accept
CHW input, or some node **must** be implemented by a specific algorithm, et
cetera, this is the place to record that information.

The file should consist of N lines of comma-separated fields (whitespace around the commas is ignored). Each line consists of an integer naming the graph node to which the constraint applies, followed by three constraint fields, which are strings. The first field is the input data layout, the second is the algorithm name to use, and the third is the output data layout. You can use an asterisk to indicate a wildcard field.

For example, if the second convolution in our network should be implemented
using the `frobnicate` algorithm, the constraints file would contain a line
like:

```
1, *, frobnicate, *
```

If the seventh convolution should produce output in HCW format (perhaps because of a special nonlinearity, or another operation requiring this layout), the constraints file would contain a line like:

```
6, *, *, hcw
```
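Wildcard matching against such a constraint line can be sketched as follows (illustrative Python, not the optimizer's actual implementation):

```python
# Sketch of matching a candidate (input layout, algorithm, output layout)
# against one constraints-file line, treating "*" as a wildcard.

def parse_constraint(line):
    node, in_layout, algo, out_layout = [f.strip() for f in line.split(",")]
    return int(node), (in_layout, algo, out_layout)

def satisfies(constraint, in_layout, algo, out_layout):
    """True if the candidate choice is permitted by the constraint."""
    return all(want == "*" or want == have
               for want, have in zip(constraint, (in_layout, algo, out_layout)))

node, c = parse_constraint("1, *, frobnicate, *")
print(node)                                        # 1
print(satisfies(c, "chw", "frobnicate", "hcw"))    # True
print(satisfies(c, "chw", "direct-sum2d", "hcw"))  # False
```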

#### Algorithm and Data Layout Description File

This file defines the set of algorithms available to implement the graph nodes. Each line has three comma-separated fields: algorithm name, input layout identifier, and output layout identifier. This file is the source of all information about what algorithms and data layouts exist; when you name a data layout or algorithm in the network constraints file, it is the algorithm names and layout names in this file that you are matching against.

This file typically only needs to be written once for each backend library that you would like to connect to the optimizer. It is a static descriptor of the algorithms contained in that library.
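As a purely hypothetical excerpt (the algorithm and layout names are drawn from elsewhere in this document, but the pairings are invented for illustration), such a file might look like:

```
direct-sum2d, chw, chw
winograd-2x2-3x3-vec-4, chw, chw
frobnicate, chw, hcw
```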

#### Solution File

Both the heuristic and solver-based optimization modes produce a solution which specifies how the network should be configured. This solution file has a single integer per line, specifying in order, for each graph node listed in the topology, the index of the algorithm to use to implement that node. Each index returned is the algorithm index in the Algorithm and Data Layout Description File.
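For instance, a solution can be resolved back to algorithm names with a few lines of Python. The file contents below are invented stand-ins, not the real `examples/` data:

```python
# Sketch: resolve a solution file back to algorithm names using the
# Algorithm and Data Layout Description File. Each solution line is an
# index into the algorithms file, in order of the topology's nodes.

algorithms_csv = """direct-sum2d, chw, chw
winograd-2x2-3x3-vec-4, chw, chw
"""
solution_txt = "0\n1\n0\n"

# The first comma-separated field of each algorithms line is the name.
names = [line.split(",")[0].strip()
         for line in algorithms_csv.strip().splitlines()]
chosen = [names[int(line)] for line in solution_txt.split()]
print(chosen)  # ['direct-sum2d', 'winograd-2x2-3x3-vec-4', 'direct-sum2d']
```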

### Solver Specific Files

For solver-based optimization, cost model data must be provided so that the optimizer can make better decisions based on real benchmark measurements from the hardware on which you intend to run your network. There are two cost model files that must be produced: one for nodes (convolutions) and one for edges (data layout transformations).

#### Algorithm Cost Model

The convolution algorithm cost file has a straightforward layout. Each column corresponds to one set of convolution parameters which has been benchmarked. Each row corresponds to a convolution algorithm.

The first P rows are special: they must list, in order, the parameters which define the convolution being benchmarked. These are the same parameters from the network topology file.

The first column is also special: it must contain the name of the algorithm being benchmarked, or, in the first P rows, the name of the convolution parameter on that row.

All other fields should contain floating point numbers. The value zero is
**specially treated** to mean that the algorithm in question cannot be used for
a convolution with the specified parameters. For example, Winograd convolution
cannot be used with `k == 1`.
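Reading such a cost file can be sketched in Python, mapping the zero convention to an infinite cost. The parameter and algorithm rows below are invented stand-ins, with P=2 for brevity:

```python
# Sketch of loading the algorithm cost model. The first P rows name the
# convolution parameters of each benchmarked column; remaining rows give
# per-algorithm costs, with 0 meaning "cannot be used here".

import math

cost_csv = """k, 3, 1
c, 64, 64
direct-sum2d, 1.5, 0.9
winograd-2x2-3x3-vec-4, 0.7, 0
"""
P = 2
rows = [[f.strip() for f in line.split(",")]
        for line in cost_csv.strip().splitlines()]
params = {r[0]: [int(v) for v in r[1:]] for r in rows[:P]}
costs = {r[0]: [math.inf if float(v) == 0 else float(v) for v in r[1:]]
         for r in rows[P:]}
print(params["k"])                      # [3, 1]
print(costs["winograd-2x2-3x3-vec-4"])  # [0.7, inf]
```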

#### Data Layout Transformation Cost Model

The cost file for data layout transformations is also straightforward. Each row
corresponds to the edge with the same index in the network topology file. There
are only 3 special rows at the start, for the number of input channels, input
width, and input height of the graph edge. The first column is again special,
containing the name of the transformation, which is something like
`chw-to-hcw`.

#### Inferred Costs

It is often not possible to microbenchmark every possible scenario in a large project. For example, to produce a full fine-grained cost model for ResNet-152 using the triNNity library as the back-end, 152 * 71 = 10792 microbenchmarks would have to be run. If each benchmark took one minute, that's almost 180 hours of benchmarking!

The optimizer supports inferred costs, and can optimize networks with convolutions with unseen parameters. When a convolution does not appear in microbenchmarking data, the optimizer will compute an approximate cost by interpolation (currently using the L2 norm) from similar convolutions that do appear in the cost model.
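A minimal sketch of this idea is nearest-neighbour lookup under the L2 norm in parameter space. The optimizer's actual interpolation scheme may combine several neighbours; the benchmark data here is invented:

```python
# Sketch of cost inference for an unseen convolution: take the cost of
# the benchmarked convolution whose parameter vector is closest under
# the L2 (Euclidean) norm.

import math

def infer_cost(query_params, benchmarked):
    """benchmarked: list of (params, cost) pairs with measured costs."""
    def dist(p):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query_params, p)))
    _, cost = min(benchmarked, key=lambda pc: dist(pc[0]))
    return cost

measured = [([3, 64], 0.7), ([1, 64], 0.9), ([5, 128], 2.1)]
print(infer_cost([3, 96], measured))  # 0.7 (closest measured convolution)
```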

### Example usage

This tool ships with example network descriptions and cost model data to allow you to quickly test integration with your toolchain without having to first produce your entire benchmarking data set.

#### Heuristic Mode

```
triNNity-optimizer heuristic -x LOCAL_OPTIMAL \
  -t examples/networks/alexnet/alexnet.topology \
  -a examples/libraries/triNNity/triNNity-algorithms.csv \
  -n examples/cost-models/Intel\(R\)\ Core\(TM\)\ i5-2500K\ CPU\ @\ 3.30GHz/hcy.csv
```

This command runs the optimizer in heuristic mode, selecting the
`LOCAL_OPTIMAL` heuristic. We pass in the AlexNet topology description, the
algorithm names and data layout descriptions for the `triNNity` library
(`triNNity-algorithms.csv`), and finally the cost model to use, generated with
microbenchmarking. The output (on standard output) should be as follows:

```
0 50 44 44 44
```

The heuristic suggests we use algorithms with index 0, 50, 44, 44, and 44 for
the respective layers of AlexNet. Looking at
`examples/libraries/triNNity/triNNity-algorithms.csv`, we can see that these
correspond to the algorithms `direct-sum2d`, `winograd-3x3-5x5-vec-4`, and
`winograd-2x2-3x3-vec-4` in the triNNity library.

This output is in the "solution" file format described earlier, and can be redirected to a file to save for later use.

The command `list-heuristics` can be used to show a list of the available
heuristics.

#### PBQP Mode

```
triNNity-optimizer pbqp -t examples/networks/alexnet/alexnet.topology \
  -a examples/libraries/triNNity/triNNity-algorithms.csv \
  -n examples/cost-models/Intel\(R\)\ Core\(TM\)\ i5-2500K\ CPU\ @\ 3.30GHz/hcy.csv \
  -e examples/cost-models/Intel\(R\)\ Core\(TM\)\ i5-2500K\ CPU\ @\ 3.30GHz/hcy-transforms.csv \
  -s examples/networks/alexnet/alexnet.constraints
```

PBQP mode requires the network constraints file and the edge costs file in addition
to the node costs. The optimizer traverses the graph structure of the network
building a *query* that covers all possible instantiations of the network under
the supplied constraints and with the supplied costs.

The output (on standard output) is a plaintext query prepared for input to a PBQP solver. We don't reproduce the output here because the PBQP query is quite large, but it can be viewed by copying and pasting the command given into your terminal. This query can be redirected to a file for later use, or piped directly into the input of your preferred PBQP solver, if it supports reading from standard input.

The PBQP solver should process the query and give back a solution that looks just like the one in the heuristic example above (though probably with different methods selected, and some solvers append metadata, such as projected total costs, that can be ignored). The exact means of producing the solution file will vary from solver to solver, and may require the user to do a little plumbing work on the command line.

#### Instantiate Mode

```
triNNity-optimizer instantiate -t examples/networks/alexnet/alexnet.topology \
  -a examples/libraries/triNNity/triNNity-algorithms.csv \
  -m examples/libraries/triNNity/triNNity-methods.csv \
  -s /tmp/test.solution \
  -i TRINNITY \
  -l examples/networks/alexnet/alexnet.layers
```

Instantiate mode is a very simple tool that turns a solution into a more useful
format: for example, a header file which contains `#define` statements that
specify the algorithm to use for each layer using enum values from a library.

The example shown uses the backend `-i TRINNITY`, which generates a header file
suitable for use with the example networks which ship with the `triNNity`
library.

The auxiliary file `alexnet.layers` contains the mapping from the layers as they
are specified in the topology file to logical identifiers that can be used when
emitting code (for backend frameworks that identify layers with a string key).