
triNNity-optimizer

This project is a front-end for the triNNity DNN optimization approach.

Build and Installation

stack install


Optimization Modes

The currently supported optimization modes are heuristic and pbqp.

The heuristic mode has a number of heuristics which can be used to generate an instantiation of a network. To see a list of these heuristics, say triNNity-optimizer heuristic --help

The pbqp mode uses the topology of the network and the provided cost models to generate a large PBQP query, suitable for solving with an external PBQP solver.

Network Instantiation

The result of the optimization process is a solution file, which specifies what algorithm to use for each convolution in the network.

For convenience, the optimizer can apply that solution to generate the code for an instantiation of the network. At present, only the triNNity primitive library is supported, but support for other frameworks can be added fairly easily.

For more information, say triNNity-optimizer instantiate --help.


User Interface and Input Files

The optimizer uses several simple plaintext file formats to represent the necessary features of networks and of the hardware.

Network Topology Description File

This file describes the topology of the network, in terms of the layers. At present, we only model convolution layers because all other layers are inconsequential in terms of performance. This may change in future.

The first line of the file must contain three whitespace separated integers: the number of nodes (N), the number of parameters (P) which define a node, and the number of edges (E). For example, in examples/networks/alexnet/alexnet.topology, the first line is 5 7 4, indicating N=5 nodes, each defined by P=7 parameters, and E=4 edges.

The following N lines must contain P whitespace separated integers, defining each node. The subsequent E lines must contain exactly 2 whitespace separated integers, with the source and sink index for the edge.

While nodes and edges in the DNN graph may be specified in any order, they are referred to by their index in the order specified in the topology file.
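As an illustration, a topology file for a hypothetical three-node chain network might look like the following (the parameter values and their meanings are invented for this sketch; see examples/networks/alexnet/alexnet.topology for a real instance):

```text
3 7 2
3 224 224 64 11 4 0
64 55 55 192 5 1 2
192 27 27 384 3 1 1
0 1
1 2
```

The first line declares N=3 nodes with P=7 parameters each and E=2 edges; the next three lines define the nodes, and the final two lines give the source and sink index of each edge.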

Network Constraints Description File

The constraints file describes constraints on network nodes that must hold irrespective of the solving method. For example, if some node must accept CHW input, or some node must be implemented by a specific algorithm, et cetera, this is the place to record that information.

The file should consist of N comma-and-whitespace separated lines. Each line should consist of an integer, naming the graph node to which the constraint applies, and three constraint fields, which are strings. The first field is the input data layout, the second is the algorithm name to use, and the third is the output data layout. You can use an asterisk to indicate a wild-card field.

For example, if the second convolution in our network should be implemented using the frobnicate algorithm, the constraints file would contain a line like:

1, *, frobnicate, *

If the seventh convolution should produce output in HCW format (perhaps because of a special nonlinearity, or another operation requiring this layout), the constraints file would contain a line like:

6, *, *, hcw
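Putting these pieces together, a complete constraints file for a hypothetical three-node network in which only the second node is constrained might read:

```text
0, *, *, *
1, *, frobnicate, *
2, *, *, *
```

Every node gets a line, with wild-card asterisks in the fields that are left unconstrained.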

Algorithm and Data Layout Description File

This file defines the set of algorithms available to implement the graph nodes. Each line has three comma-separated fields: algorithm name, input layout identifier, and output layout identifier. This file is the source of all information about what algorithms and data layouts exist; when you name a data layout or algorithm in the network constraints file, it is the algorithm names and layout names in this file that you are matching against.

This file typically only needs to be written once for each backend library that you would like to connect to the optimizer. It is a static descriptor of the algorithms contained in that library.
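For example, a small algorithm description file might contain lines such as the following (the second and third entries are invented for illustration; direct-sum2d is a real triNNity algorithm name):

```text
direct-sum2d, chw, chw
frobnicate, chw, hcw
im2col-gemm, hcw, chw
```

The names used in the constraints file, and the algorithm indices in a solution file, resolve against the lines of this file.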

Solution File

Both the heuristic and solver-based optimization modes produce a solution which specifies how the network should be configured. This solution file has a single integer per line, specifying in order, for each graph node listed in the topology, the index of the algorithm to use to implement that node. Each index returned is the algorithm index in the Algorithm and Data Layout Description File.


Solver Specific Files

For solver-based optimization, cost model data must be provided so that the optimizer can make better decisions based on real benchmarking of some hardware where you intend to run your network. There are two cost model files that must be produced: one for nodes (convolutions) and one for edges (data layout transformations).

Algorithm Cost Model

The convolution algorithm cost file has a straightforward layout. Each column corresponds to one set of convolution parameters which has been benchmarked. Each row corresponds to a convolution algorithm.

The first P rows are special: they must list, in order, the parameters which define the convolution being benchmarked. These are the same parameters from the network topology file.

The first column is also special: it must contain the string corresponding to the name of the algorithm being benchmarked, or in the first P rows, the name of the convolutional parameter on the row.

All other fields should contain floating point numbers. The value zero is specially treated to mean that the algorithm in question cannot be used for a convolution with the specified parameters. For example, Winograd convolution cannot be used with k == 1.
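As a sketch, a cost file with P=3 parameters (parameter names, values, and costs are all invented here; comma-separated fields are assumed, matching the shipped .csv examples) and two benchmarked convolutions might look like:

```text
channels, 64, 128
width, 56, 28
kernel, 3, 1
direct-sum2d, 1.2, 0.8
winograd-2x2-3x3-vec-4, 0.7, 0
```

Note the zero in the last row: the Winograd algorithm cannot be used for the second convolution, which has k == 1.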

Data Layout Transformation Cost Model

The cost file for data layout transformations is also straightforward. Each row corresponds to the edge with the same index in the network topology file. There are only 3 special rows at the start, for the number of input channels, input width, and input height of the graph edge. The first column is again special, containing the name of the transformation, which is something like chw-to-hcw.
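By analogy with the algorithm cost file, a small transform cost file might look like the sketch below (all values are invented; the exact layout should be checked against a shipped example such as hcy-transforms.csv):

```text
channels, 64, 192
width, 55, 27
height, 55, 27
chw-to-hcw, 0.05, 0.02
hcw-to-chw, 0.05, 0.02
```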

Inferred Costs

It is often not possible to microbenchmark every possible scenario in a large project. For example, to produce a full fine-grained cost model for ResNet-152 using the triNNity library as the back-end, 152 * 71 = 10792 microbenchmarks would have to be run. If each benchmark took one minute, that's almost 180 hours of benchmarking!

The optimizer supports inferred costs, and can optimize networks with convolutions with unseen parameters. When a convolution does not appear in microbenchmarking data, the optimizer will compute an approximate cost by interpolation (currently using the L2 norm) from similar convolutions that do appear in the cost model.
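The idea behind inferred costs can be sketched as follows: treat each benchmarked convolution as a point in parameter space, and for an unseen convolution take the cost of the closest benchmarked point under the L2 norm. This is only an illustrative approximation of the approach, not the optimizer's actual implementation; the function and the parameter tuples are invented for the example.

```python
import math

def infer_cost(params, benchmarked):
    """Approximate the cost of an unseen convolution by taking the cost
    of the nearest benchmarked convolution under the L2 norm.
    params: tuple of convolution parameters for the unseen convolution.
    benchmarked: dict mapping parameter tuples to measured costs."""
    def l2(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = min(benchmarked, key=lambda p: l2(p, params))
    return benchmarked[nearest]

# Hypothetical cost model: (channels, width, height, kernel) -> cost
costs = {(64, 56, 56, 3): 1.2, (128, 28, 28, 3): 0.9, (256, 14, 14, 1): 0.4}

# An unseen convolution is assigned the cost of its nearest benchmarked neighbour.
print(infer_cost((96, 48, 48, 3), costs))
```

A real implementation would likely interpolate between several neighbours rather than copy a single one, but the nearest-neighbour version conveys the principle.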


Example usage

This tool ships with example network descriptions and cost model data to allow you to quickly test integration with your toolchain without having to first produce your entire benchmarking data set.

Heuristic Mode

triNNity-optimizer heuristic -x LOCAL_OPTIMAL  \
                             -t examples/networks/alexnet/alexnet.topology  \
                             -a examples/libraries/triNNity/triNNity-algorithms.csv  \
                             -n examples/cost-models/Intel\(R\)\ Core\(TM\)\ i5-2500K\ CPU\ @\ 3.30GHz/hcy.csv

This command runs the optimizer in heuristic mode, selecting the LOCAL_OPTIMAL heuristic. We pass in the AlexNet topology description, the algorithm names and data layout descriptions for the triNNity library (triNNity-algorithms.csv), and finally the cost model to use, generated with microbenchmarking. The output (on standard output) should be as follows:

0
50
44
44
44

The heuristic suggests we use algorithms with index 0, 50, 44, 44, and 44 for the respective layers of AlexNet. Looking at examples/libraries/triNNity/triNNity-algorithms.csv, we can see that these correspond to the algorithms direct-sum2d, winograd-3x3-5x5-vec-4, and winograd-2x2-3x3-vec-4 in the triNNity library.

This output is in the "solution" file format described earlier, and can be redirected to a file to save for later use.

The command list-heuristics can be used to show a list of the available heuristics.

PBQP Mode

triNNity-optimizer pbqp -t examples/networks/alexnet/alexnet.topology  \
                        -a examples/libraries/triNNity/triNNity-algorithms.csv  \
                        -n examples/cost-models/Intel\(R\)\ Core\(TM\)\ i5-2500K\ CPU\ @\ 3.30GHz/hcy.csv  \
                        -e examples/cost-models/Intel\(R\)\ Core\(TM\)\ i5-2500K\ CPU\ @\ 3.30GHz/hcy-transforms.csv  \
                        -s examples/networks/alexnet/alexnet.constraints

PBQP mode requires the network constraints file and the edge costs file in addition to the node costs. The optimizer traverses the graph structure of the network building a query that covers all possible instantiations of the network under the supplied constraints and with the supplied costs.

The output (on standard output) is a plaintext query prepared for input to a PBQP solver. We don't reproduce the output here because the PBQP query is quite large, but it can be viewed by copying and pasting the command given into your terminal. This query can be redirected to a file for later use, or piped directly into the input of your preferred PBQP solver, if it supports reading from standard input.

The PBQP solver should process the query and give back a solution that looks just like the one in the heuristic example above (though probably with different methods selected, and some solvers append metadata, such as projected total costs, that can be ignored). The exact means of producing the solution file will vary from solver to solver, and may require the user to do a little plumbing work on the command line.

Instantiate Mode

triNNity-optimizer instantiate -t examples/networks/alexnet/alexnet.topology \
                               -a examples/libraries/triNNity/triNNity-algorithms.csv \
                               -m examples/libraries/triNNity/triNNity-methods.csv \
                               -s /tmp/test.solution \
                               -i TRINNITY \
                               -l examples/networks/alexnet/alexnet.layers

Instantiate mode is a very simple tool that turns a solution into a more useful format: for example, a header file which contains #define statements that specify the algorithm to use for each layer, using enum values from a library.

The example shown uses the backend -i TRINNITY, which generates a header file suitable for use with the example networks which ship with the triNNity library.

The auxiliary file alexnet.layers contains the mapping from the layers as they are specified in the topology file to logical identifiers that can be used when emitting code (for backend frameworks that identify layers with a string key).
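For example, a layers file for AlexNet might map topology node indices to layer names along these lines (the layer names and the separator are invented for illustration; consult the shipped examples/networks/alexnet/alexnet.layers for the exact syntax):

```text
0 conv1
1 conv2
2 conv3
3 conv4
4 conv5
```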