Tutorials / Introduction to algorithm diagnostics
Algorithm convergence and performance diagnostics
This tutorial presents the tools (mostly plots) for answering the question:
- does my optimization algorithm actually work?
To answer it, we are going to look at the behavior of the algorithm in terms of the achieved function value and, where possible, the distribution of its samples.
Tasks to complete before the Particle Swarm Optimization workshop
In order to fully utilize the workshop contents please prepare in the following way:
- Install your favorite IDE on your laptop
- Prepare a Monte Carlo optimization algorithm for Rastrigin and Rosenbrock functions (see .NET code for references)
- Prepare a Grid Search optimization algorithm for Rastrigin and Rosenbrock functions
- Within the optimization code, create a logger generating a CSV file with columns identical to those described below
- Install R 3.6.1 and RStudio 1.2.5
- Test results logged by your random and grid search optimizers within the provided R Script
- Test provided hill climbing results with the provided contour plot R Script
- Remember to set the proper location of your files within the R scripts!
- Bring the laptop to the workshop with you
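If you want to check your understanding of the Monte Carlo task before writing the .NET version, here is a minimal, language-agnostic sketch in Python. The -5.12..5.12 bounds are the conventional Rastrigin domain and are an assumption here; the workshop reference code is in .NET.

```python
import math
import random

def rastrigin(x):
    # Rastrigin function: highly multimodal, global minimum of 0 at the origin
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def monte_carlo(f, dimension, lower, upper, sample_count, seed=1):
    # Draw uniform random samples within the bounds and keep the best one
    rng = random.Random(seed)
    best_x, best_value = None, float('inf')
    for _ in range(sample_count):
        x = [lower + (upper - lower) * rng.random() for _ in range(dimension)]
        value = f(x)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value

x, value = monte_carlo(rastrigin, dimension=2, lower=-5.12, upper=5.12, sample_count=10000)
```

With 10000 samples the best value found is usually close to, but not exactly at, the optimum - which is exactly the behavior the diagnostics below will make visible.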
Optimization environment preparation
In order to present algorithm convergence and sampling distributions, we are going to need some data. For this part I am going to write a naive Monte Carlo approach and a grid search in .NET Core, but feel free to use any language you prefer, as long as your output log format stays compatible.
.NET Core code will be available here
Crucial code of Monte Carlo optimizer
If we enclose the following code within proper classes:
public override ValuedSample Optimize()
{
    ValuedSample bestSample = null;
    for (int sampleIdx = 0; sampleIdx < SampleCount; sampleIdx++)
    {
        double[] x = RandomSampleWithinFunctionBounds();
        ValuedSample testSample = new ValuedSample(x, FunctionToOptimize);
        if (bestSample == null || testSample.Value < bestSample.Value)
        {
            bestSample = testSample;
        }
    }
    return bestSample;
}

private double[] RandomSampleWithinFunctionBounds()
{
    double[] x = new double[FunctionToOptimize.Dimension];
    for (int dim = 0; dim < FunctionToOptimize.Dimension; dim++)
    {
        x[dim] = FunctionToOptimize.LowerBoundary[dim]
            + (FunctionToOptimize.UpperBoundary[dim] - FunctionToOptimize.LowerBoundary[dim])
            * Utilities.RandomGenerator.NextDouble();
    }
    return x;
}
We will be able to run the basic random optimization approach in the following way:
class Program
{
    static void Main(string[] args)
    {
        Utilities.SetRandomSeed(seed: 1);
        IQualityFunction qualityFunction = new RastriginFunction(dimension: 2);
        OptimizationAlgorithm algorithm = new MonteCarloAlgorithm(qualityFunction, sampleCount: 10000);
        ValuedSample result = algorithm.Optimize();
        Console.WriteLine(result.Value);
        Console.WriteLine(String.Join(';', result.X));
    }
}
Logging the algorithm run
Obviously, the approach presented above gives us only the final result - which is of course the most important thing - but here we want to observe the whole algorithm run, in order to research and debug it.
For the purpose of the optimization process we have used a ValuedSample
class - a simple structure storing the argument vector and the resulting IQualityFunction
(objective function, feasibility function) value.
Let's start with logging the series of evaluations and then improve it by logging additional data:
public class OptimizationLogger
{
    private const int SAMPLES_BUFFER = 1000;

    private List<ValuedSample> samplesToStore;
    private List<ValuedSample> samples;
    private object addingLock = new object();
    private object storingLock = new object();
    private bool isFirstSample;

    public string LogName { get; }
    public string LoggedAlgorithmName { get; }
    public DateTime LoggingStartTime { get; private set; }

    public OptimizationLogger(string logName, string loggedAlgorithmName)
    {
        LogName = logName;
        LoggedAlgorithmName = loggedAlgorithmName;
        ResetLogger();
    }

    void ResetLogger()
    {
        LoggingStartTime = DateTime.Now;
        lock (addingLock)
        {
            samples = new List<ValuedSample>();
        }
        lock (storingLock)
        {
            isFirstSample = true;
            File.Delete(LogName);
            samplesToStore = new List<ValuedSample>();
        }
    }

    public void LogSample(ValuedSample sample)
    {
        lock (addingLock)
        {
            samples.Add(sample);
        }
        if (samples.Count >= SAMPLES_BUFFER)
        {
            FlushSamples();
        }
    }

    public void FlushSamples()
    {
        lock (addingLock)
        {
            lock (storingLock)
            {
                samplesToStore.AddRange(samples);
                Thread storingThread = new Thread(StoreSamples);
                storingThread.Start();
            }
            samples.Clear();
        }
    }

    protected void StoreSamples()
    {
        lock (storingLock)
        {
            if (isFirstSample && samplesToStore.Any())
            {
                File.AppendAllLines(LogName, new string[]
                {
                    string.Format(
                        "Name\t" + "Value\t{0}",
                        string.Join("\t", samplesToStore[0].X.Select((s, idx) => string.Format("X{0}", idx + 1)))
                    )
                });
                isFirstSample = false;
            }
            IEnumerable<string> lines = samplesToStore.Select(sample =>
                string.Format(CultureInfo.InvariantCulture,
                    "{0}\t{1}\t{2}",
                    LoggedAlgorithmName,
                    sample.Value,
                    string.Join("\t", sample.X.Select(x => string.Format(CultureInfo.InvariantCulture, "{0}", x)))
                )
            );
            File.AppendAllLines(LogName, lines);
            samplesToStore.Clear();
        }
    }
}
Having such an OptimizationLogger
object, it may now be utilized within algorithms or quality functions.
For example, it can be added to the MonteCarloAlgorithm.Optimize()
method
public override ValuedSample Optimize()
{
    ValuedSample bestSample = null;
    for (int sampleIdx = 0; sampleIdx < SampleCount; sampleIdx++)
    {
        double[] x = RandomSampleWithinFunctionBounds();
        ValuedSample testSample = new ValuedSample(x, FunctionToOptimize);
        Logger.LogSample(testSample);
        if (bestSample == null || testSample.Value < bestSample.Value)
        {
            bestSample = testSample;
        }
    }
    Logger.FlushSamples();
    return bestSample;
}
...after being injected into the MonteCarloAlgorithm
class itself:
static void Main(string[] args)
{
    OptimizationLogger logger = new OptimizationLogger("mc-rastrigin-log.csv", "MC");
    Utilities.SetRandomSeed(seed: 1);
    IQualityFunction qualityFunction = new RastriginFunction(dimension: 2);
    OptimizationAlgorithm algorithm = new MonteCarloAlgorithm(qualityFunction, logger, sampleCount: 10000);
    ValuedSample result = algorithm.Optimize();
    Console.WriteLine(result.Value);
    Console.WriteLine(String.Join(';', result.X));
}
CSV file format
The above code (see full source here) will generate a tab-separated CSV file of the following form:
Name	Value	X1	X2
MC	20.612892773532423	-2.57363369823137	-3.9859816736662674
MC	5.520117218854509	-0.337810638108202	2.7812262095051006
MC	-0.06087417797589767	1.6129934723735753	-0.6883061626778479
MC	18.93160368211865	-1.4941822607322512	4.545149707526504
MC	16.69946316855684	-4.083031515592259	1.4587448866007593
MC	27.542993367402246	-4.826759037834946	-2.580181549796919
MC	28.087825693297862	-1.8420722517008299	5.015215592335545
MC	3.797238642274225	1.8649407371622235	1.5860519631142038
MC	4.9216728406385215	-2.224852746569017	1.1811472206847498
...
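Although the diagnostics below use R, the log format is easy to consume anywhere. As a hedged illustration, here is a small Python sketch (the function name read_log and its return shape are my own invention, not part of the workshop code) that parses such a tab-separated log:

```python
import csv

def read_log(path):
    # Parse the tab-separated log produced by the optimization logger:
    # a header row (Name, Value, X1..Xn) followed by one row per evaluation.
    with open(path, newline='') as f:
        reader = csv.DictReader(f, delimiter='\t')
        return [(row['Name'], float(row['Value']),
                 [float(v) for k, v in row.items() if k.startswith('X')])
                for row in reader]
```

Each entry pairs the algorithm name with the function value and the argument vector, which is all the later plots need.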
The examples of CSV files are provided here.
After those initial steps we are ready to create our first diagnostic tool in R
(of course, any environment with sensible plotting abilities - Python, Matlab, Octave - will be sufficient).
Convergence plots
The most basic way of verifying the performance of an algorithm is to observe the speed of its convergence to the optimal value (a minimum, in this case).
Usually, convergence speed should be considered in two domains:
- the number of function samples necessary to obtain a certain function value
- the computation time in which a certain function value has been obtained
In real-world applications the second measure is definitely more important, while the first one enables researchers to compare algorithms regardless of the available computational power and the code optimization prowess of the designer, focusing on the algorithm's properties.
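In both domains the quantity actually plotted is the same: the cumulative minimum of the logged values, i.e. the best function value seen after each evaluation. A minimal sketch of that computation (this is what R's cummin does for us below):

```python
def running_best(values):
    # Cumulative minimum: best (lowest) function value seen so far,
    # producing the monotonically non-increasing convergence curve.
    best = float('inf')
    out = []
    for v in values:
        best = min(best, v)
        out.append(best)
    return out

# running_best([3.0, 5.0, 1.5, 2.0]) -> [3.0, 3.0, 1.5, 1.5]
```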
A convergence plot of this kind can be easily generated with the following R script:
#start with setting the location of CSV log files
#in R '~' means home directory location
setwd('~/Repozytoria/Optymalizacja/tutorials/r/sample-data/')

#load CSV file for algorithm run
rastrigin.run = read.csv('mc-rastrigin-log.csv', sep='\t', dec='.')

#plot a cumulative minimum of samples function values
plot(cummin(rastrigin.run$Value),
     #y axis will be presented in logarithmic scale
     log='y',
     #a line plot will be created
     type = 'l',
     #line thickness will be set to 2 units
     lwd=2,
     #line color will be set to blue
     col = 'blue',
     #plot title
     main = 'Rastrigin - Monte Carlo',
     #axis labels
     xlab = 'Evaluations',
     ylab = 'log(function.value)'
)
To store the plot in an image file, add a png() call before the plot
call (and close it afterwards with dev.off() to close the output stream):
...
#create png with certain width and height
png('evaluations-convergence.png', width = 600, height = 300)

#plot a cumulative minimum of samples function values
plot(cummin(rastrigin.run$Value),
     #y axis will be presented in logarithmic scale
     log='y',
     #a line plot will be created
     type = 'l',
     #line thickness will be set to 2 units
     lwd=2,
     #line color will be set to blue
     col = 'blue',
     #plot title
     main = 'Rastrigin - Monte Carlo',
     #axis labels
     xlab = 'Evaluations',
     ylab = 'log(function.value)'
)

#close image file output stream
dev.off()
After adding a second optimization algorithm, namely a grid search, we may now plot the results of both algorithms against one another in order to see which performed better:
Because grid search eventually finds the optimum at (0,0), it was necessary to switch off the log scale. Here its final victory over Monte Carlo can be observed:
The comparison plot can be generated with an R script of the following form (remember to set the proper file locations with setwd()
):
#load CSV files for algorithm run
rastrigin.mc.run = read.csv('mc-rastrigin-log.csv', sep='\t', dec='.')
rastrigin.gs.run = read.csv('gs-rastrigin-log.csv', sep='\t', dec='.')

#plotting to file - remember to call dev.off()
#png('evaluations-convergence-with-gs.png', width = 600, height = 300)

#plot a cumulative minimum of samples function values
#to get the column in R we may use DATA$COLUMN syntax
plot(cummin(rastrigin.mc.run$Value),
     #a line plot will be created
     type = 'l',
     #line thickness will be set to 2 units
     lwd=2,
     #line color will be set to blue
     col = 'blue',
     #plot title
     main = 'Rastrigin - Comparison',
     #axis labels
     xlab = 'Evaluations',
     ylab = 'function.value'#,
     #limit the y-axis to [0,0.1]
     #ylim=c(0,1e-1)
)

#add a second line to the same plot
lines(cummin(rastrigin.gs.run$Value+1e-16), lwd=2, col = 'red')

#draw a legend
legend("topright", legend = c('MC','GS'), lwd = c(2,2), col = c("blue","red"))

#if calling png above - use also dev.off
#dev.off()
However, writing a Grid Search algorithm was not motivated by that minuscule advantage over the Monte Carlo approach...
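The Grid Search itself is not shown in this tutorial (it lives in the referenced .NET sources), so here is a hedged Python sketch of the idea: evaluate the function on a regular lattice and keep the best point. The signature and names are my own, for illustration only.

```python
import itertools

def grid_search(f, dimension, lower, upper, points_per_dim):
    # Evaluate f at every node of a regular grid over [lower, upper]^dimension
    # and return the best sample; the full set of evaluated grid points is
    # exactly the kind of data the contour plots below are built from.
    step = (upper - lower) / (points_per_dim - 1)
    axis = [lower + i * step for i in range(points_per_dim)]
    best_x, best_value = None, float('inf')
    for x in itertools.product(axis, repeat=dimension):
        value = f(list(x))
        if value < best_value:
            best_x, best_value = list(x), value
    return best_x, best_value
```

Note the cost grows as points_per_dim ** dimension, which is why grid search is used here mainly as a source of dense 2D function data rather than as a serious optimizer.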
Sampling distribution plots over 2D contours
The real reason for writing a Grid Search was to obtain data useful for plotting contours of 2D functions. Against those contours we can plot the samples gathered by an algorithm, in order to observe the way it samples the optimized function's arguments:
In order to generate those plots you may use the following R script:
#in order to get a better plot experience we are going to use the ggplot2 package
if('ggplot2' %in% rownames(installed.packages()) == FALSE) {
  install.packages('ggplot2')
}
require(ggplot2)
if('svglite' %in% rownames(installed.packages()) == FALSE) {
  install.packages('svglite')
}
require(svglite)

#start with setting the location of CSV log files
#in R '~' means home directory location
setwd('~/Repozytoria/Optymalizacja/tutorials/r/sample-data/')

#load CSV files for algorithm run
algorithm.run = read.csv('mc-rosenbrock-log.csv', sep='\t', dec='.')
#now gs data will be used to draw a contour of the function
function.data = read.csv('gs-rosenbrock-log.csv', sep='\t', dec='.')

v <- ggplot(function.data, aes(X1, X2, z = Value)) + coord_fixed()
v <- v + geom_raster(aes(fill = Value)) +
  scale_fill_gradientn(colours=c("#00FF00FF","#FFFF00FF","#FF0000FF")) +
  stat_contour(binwidth = sd(function.data$Value) / 2, colour = 'black') +
  #adjust transparency on the basis of samples count
  geom_point(data = algorithm.run, alpha = 700 / nrow(algorithm.run), aes(X1, X2))
print(v)

ggsave('rosenbrock-contour.png', plot = v, width=8, height=6, units = 'in')
While the presented sampling of the Monte Carlo method is not very informative, you can also check the results of the Hill Climbing approach stored in the hc-rastrigin-log-10000.csv
and hc2-rastrigin-log-10000
files: