Parallel and Distributed Training of Neural Networks (MATLAB/Python)

Description
-------

This code is a general library implementing parallel and distributed algorithms for training neural networks, based on the framework of successive convex approximations (SCA, see [1-3]). It can be used to train a neural network whenever the training data is distributed over a network of interconnected agents, following an iterative two-step process:

1) Optimization: each agent solves a strongly convex approximation of its own (non-convex) training problem. This step can optionally be parallelized, up to one weight per processor.
2) Consensus: information is exchanged over the network via two local consensus steps.

The framework is described in the following paper:

Scardapane, S. & Di Lorenzo, P. (2017). "A framework for parallel and distributed training of neural networks". Neural Networks, in press. A preprint can be found at: https://arxiv.org/abs/1610.07448

Organization (MATLAB)
-------

Most of the code is contained in the "classes" folder. The basic classes are:

* MultilayerPerceptron.m: a standard NN with a single hidden layer.
* LearningAlgorithm.m: an abstract class for defining training procedures.
* DistributedAlgorithms/NextMLP.m: an abstract class for defining distributed (possibly parallel) algorithms based on the SCA framework.

Four implementations of the NextMLP framework are provided:

* L2_NextMLP.m: squared loss and l2 regularization on the weights. The surrogate function is defined by linearizing only the neural network model and keeping the rest of the cost function fixed (see Sec. 4.2 in the paper).
* Lin_L2_NextMLP.m: same cost function as before, but the surrogate is obtained by linearizing the overall error function. This is slower, but the optimum is obtained without the need to compute a matrix inverse (again, see Sec. 4.2 in the paper).
* L1_NextMLP.m: squared loss and l1 regularization to impose sparsity, with the surrogate obtained by partial linearization.
  The resulting l1-minimization problem is solved with an ad-hoc library contained in the "functions/L1General" folder (see Sec. 4.3a in the paper).
* Lin_L1_NextMLP.m: squared loss and l1 regularization, with complete linearization of the error function. The optimum can be expressed in closed form using soft-thresholding (again, see Sec. 4.3a in the paper).

Two additional algorithms are provided for comparison in the centralized case:

* CentralizedAlgorithms/VanillaMLP.m: basic stochastic gradient descent with backpropagation.
* CentralizedAlgorithms/MatlabMLP.m: a wrapper for the training functions in the Neural Networks toolbox of MATLAB.

Additionally, the library provides some baseline training algorithms, and some utility functions to split the dataset and initialize the network of agents.

Usage (MATLAB)
-------

To launch a simulation, simply run the script 'test_script.m'. All the configuration parameters are specified in the 'params_selection.m' file. Two classes of algorithms are compared:

* Centralized algorithms, defined in the 'centralized_algorithms' struct.
* Distributed algorithms, defined in the 'distributed_algorithms' struct.

The usage of all the other parameters is described in the comments. Additionally, we provide a fairly extensive unit-testing suite, which can be executed with the 'run_test_suite.m' script. Tests are found in the "tests" folder.

Python
-------

A Python port, built on top of the popular Theano and Lasagne libraries, is available in the 'python' folder. Unlike the MATLAB version, it can be run with multiple hidden layers in the network and with cross-entropy losses, while the l1 regularizers are not yet implemented. Centralized and distributed algorithms are available in two separate modules, while the script 'run_simulation' can be used to run all the different experiments. Its configuration is similar to the MATLAB equivalent.
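To give a flavor of the two-step optimization/consensus process, below is a minimal, self-contained Python sketch in plain NumPy (not the library's actual API; all names are illustrative). It uses a toy sparse linear model in place of a neural network: each agent performs a closed-form soft-thresholding update on a fully linearized l1 surrogate of its local squared loss, and the agents then average their estimates through a doubly stochastic mixing matrix.

```python
# Schematic sketch of the two-step SCA iteration (optimization + consensus),
# on a toy problem: N agents jointly fit a sparse linear model by minimizing
#   sum_i ||X_i w - y_i||^2 + lam * ||w||_1.
# This simplified scheme omits the gradient-tracking machinery of NEXT.
import numpy as np

def soft_threshold(z, gamma):
    """Elementwise minimizer of 0.5*(x - z)^2 + gamma*|x| (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def distributed_l1_fit(X_parts, y_parts, lam=0.1, tau=None, n_iter=300):
    N = len(X_parts)
    d = X_parts[0].shape[1]
    # Proximal coefficient tau must dominate the local Lipschitz constants.
    if tau is None:
        tau = 2.0 * max(np.linalg.norm(X.T @ X, 2) for X in X_parts)
    # Doubly stochastic mixing matrix for a fully connected network of agents.
    A = np.full((N, N), 1.0 / N)
    W = np.zeros((N, d))  # one local weight estimate per agent
    for _ in range(n_iter):
        # 1) Optimization: each agent minimizes its strongly convex surrogate;
        #    with full linearization the optimum is a soft-thresholding step.
        for i in range(N):
            grad = 2.0 * X_parts[i].T @ (X_parts[i] @ W[i] - y_parts[i])
            W[i] = soft_threshold(W[i] - grad / tau, lam / tau)
        # 2) Consensus: estimates are exchanged and averaged over the network.
        W = A @ W
    return W
```

Because the last operation is a consensus averaging, all agents return the same estimate; on a well-conditioned noiseless problem the common estimate lands close to the true sparse weights.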
The code is still under active development, so it may change over the coming months. Also, the test suite and the documentation are only partially provided.

Licensing
-------

The code is distributed under the BSD 2-Clause license. Please see the file called LICENSE. The MATLAB code includes the L1General library by M. Schmidt; copyright information is given in the respective folder. It also uses several utility functions from MATLAB Central; copyright information and licenses can be found in the 'functions' folder. The MATLAB classes for handling network topologies (folder 'classes/NetworkUtilities') and partitioning of the dataset (folder 'classes/PartitionStrategies') are adapted from the Lynx MATLAB toolbox: https://github.com/ispamm/Lynx-Toolbox.

References
-------

[1] Di Lorenzo, P. & Scutari, G. (2016). "NEXT: In-Network Nonconvex Optimization". IEEE Transactions on Signal and Information Processing over Networks, 2(2), pp. 120-136.
[2] Facchinei, F., Scutari, G., & Sagratella, S. (2015). "Parallel selective algorithms for nonconvex big data optimization". IEEE Transactions on Signal Processing, 63(7), pp. 1874-1889.
[3] Scardapane, S. & Di Lorenzo, P. (2017). "A framework for parallel and distributed training of neural networks". Neural Networks, in press.

Contacts
-------

* If you have any request, bug report, or inquiry, you can contact the author at simone [dot] scardapane [at] uniroma1 [dot] it.
* Additional contact information can be found on the author's website: http://ispac.diet.uniroma1.it/scardapane/