Tutorials / Introduction to algorithm diagnostics
Algorithm convergence and performance diagnostics
This tutorial presents the tools (mostly plots) for answering the question:
- does my optimization algorithm actually work?
To answer it, we are going to look at the behavior of the algorithm in terms of the achieved function value and, where possible, the distribution of its samples.
Tasks to complete before the Particle Swarm Optimization workshop
In order to fully utilize the workshop contents please prepare in the following way:
- Install your favorite IDE on your laptop
- Prepare a Monte Carlo optimization algorithm for Rastrigin and Rosenbrock functions (see .NET code for references)
- Prepare a Grid Search optimization algorithm for Rastrigin and Rosenbrock functions
- Within the optimization code, create a logger generating a CSV file with columns identical to those described below
- Install R 3.6.1 and RStudio 1.2.5
- Test results logged by your random and grid search optimizers within the provided R Script
- Test provided hill climbing results with the provided contour plot R Script
- Remember to set the proper location of your files within the R scripts!
- Bring the laptop to the workshop with you
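If you want to check your understanding of the Monte Carlo task before writing the .NET version, here is a minimal, language-agnostic sketch in Python. The -5.12..5.12 bounds are the conventional Rastrigin domain and are an assumption here; the workshop reference code is in .NET.

```python
import math
import random

def rastrigin(x):
    # Rastrigin function: highly multimodal, global minimum of 0 at the origin
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def monte_carlo(f, dimension, lower, upper, sample_count, seed=1):
    # Draw uniform random samples within the bounds and keep the best one
    rng = random.Random(seed)
    best_x, best_value = None, float('inf')
    for _ in range(sample_count):
        x = [lower + (upper - lower) * rng.random() for _ in range(dimension)]
        value = f(x)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value

x, value = monte_carlo(rastrigin, dimension=2, lower=-5.12, upper=5.12, sample_count=10000)
```

With 10000 samples the best value found is usually close to, but not exactly at, the optimum - which is exactly the behavior the diagnostics below will make visible.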
Optimization environment preparation
In order to present algorithm convergence and sampling distributions, we are going to need some data. For this part I am going to write a naive Monte Carlo approach and a grid search in .NET Core, but feel free to use any language you prefer, as long as your output log format stays compatible.
.NET Core code will be available here
Crucial code of Monte Carlo optimizer
If we enclose the following code within proper classes:
public override ValuedSample Optimize()
{
    ValuedSample bestSample = null;
    for (int sampleIdx = 0; sampleIdx < SampleCount; sampleIdx++)
    {
        double[] x = RandomSampleWithinFunctionBounds();
        ValuedSample testSample = new ValuedSample(x, FunctionToOptimize);
        if (bestSample == null || testSample.Value < bestSample.Value)
        {
            bestSample = testSample;
        }
    }
    return bestSample;
}

private double[] RandomSampleWithinFunctionBounds()
{
    double[] x = new double[FunctionToOptimize.Dimension];
    for (int dim = 0; dim < FunctionToOptimize.Dimension; dim++)
    {
        x[dim] = FunctionToOptimize.LowerBoundary[dim]
            + (FunctionToOptimize.UpperBoundary[dim] - FunctionToOptimize.LowerBoundary[dim])
            * Utilities.RandomGenerator.NextDouble();
    }
    return x;
}
We will be able to run the basic random optimization approach in the following way:
class Program
{
    static void Main(string[] args)
    {
        Utilities.SetRandomSeed(seed: 1);
        IQualityFunction qualityFunction = new RastriginFunction(dimension: 2);
        OptimizationAlgorithm algorithm = new MonteCarloAlgorithm(qualityFunction, sampleCount: 10000);
        ValuedSample result = algorithm.Optimize();
        Console.WriteLine(result.Value);
        Console.WriteLine(String.Join(';', result.X));
    }
}
Logging the algorithm run
Obviously, the approach presented above gives us only the final result - which is of course the most important thing - but here we want to observe the whole algorithm run, in order to research and debug it.
For the purpose of the optimization process we have used a ValuedSample
class - a simple structure storing the argument vector and the resulting IQualityFunction
(objective function, feasibility function) value.
Let's start with logging the series of evaluations and then improve it by logging additional data:
public class OptimizationLogger
{
    private const int SAMPLES_BUFFER = 1000;

    private List<ValuedSample> samplesToStore;
    private List<ValuedSample> samples;
    private object addingLock = new object();
    private object storingLock = new object();
    private bool isFirstSample;

    public string LogName { get; }
    public string LoggedAlgorithmName { get; }
    public DateTime LoggingStartTime { get; private set; }

    public OptimizationLogger(string logName, string loggedAlgorithmName)
    {
        LogName = logName;
        LoggedAlgorithmName = loggedAlgorithmName;
        ResetLogger();
    }

    void ResetLogger()
    {
        LoggingStartTime = DateTime.Now;
        lock (addingLock)
        {
            samples = new List<ValuedSample>();
        }
        lock (storingLock)
        {
            isFirstSample = true;
            File.Delete(LogName);
            samplesToStore = new List<ValuedSample>();
        }
    }

    public void LogSample(ValuedSample sample)
    {
        lock (addingLock)
        {
            samples.Add(sample);
        }
        if (samples.Count >= SAMPLES_BUFFER)
        {
            FlushSamples();
        }
    }

    public void FlushSamples()
    {
        lock (addingLock)
        {
            lock (storingLock)
            {
                samplesToStore.AddRange(samples);
                Thread storingThread = new Thread(StoreSamples);
                storingThread.Start();
            }
            samples.Clear();
        }
    }

    protected void StoreSamples()
    {
        lock (storingLock)
        {
            if (isFirstSample && samplesToStore.Any())
            {
                File.AppendAllLines(LogName, new string[]
                {
                    string.Format(
                        "Name\t" + "Value\t{0}",
                        string.Join("\t", samplesToStore[0].X.Select((s, idx) => string.Format("X{0}", idx + 1)))
                    )
                });
                isFirstSample = false;
            }
            IEnumerable<string> lines = samplesToStore.Select(sample =>
                string.Format(CultureInfo.InvariantCulture,
                    "{0}\t{1}\t{2}",
                    LoggedAlgorithmName,
                    sample.Value,
                    string.Join("\t", sample.X.Select(x => string.Format(CultureInfo.InvariantCulture, "{0}", x)))
                )
            );
            File.AppendAllLines(LogName, lines);
            samplesToStore.Clear();
        }
    }
}
Having such an OptimizationLogger
object, it may now be utilized within algorithms or quality functions.
For example, it can be added to the MonteCarloAlgorithm.Optimize()
method
public override ValuedSample Optimize()
{
    ValuedSample bestSample = null;
    for (int sampleIdx = 0; sampleIdx < SampleCount; sampleIdx++)
    {
        double[] x = RandomSampleWithinFunctionBounds();
        ValuedSample testSample = new ValuedSample(x, FunctionToOptimize);
        Logger.LogSample(testSample);
        if (bestSample == null || testSample.Value < bestSample.Value)
        {
            bestSample = testSample;
        }
    }
    Logger.FlushSamples();
    return bestSample;
}
...after being injected into the MonteCarloAlgorithm
class itself:
static void Main(string[] args)
{
    OptimizationLogger logger = new OptimizationLogger("mc-rastrigin-log.csv", "MC");
    Utilities.SetRandomSeed(seed: 1);
    IQualityFunction qualityFunction = new RastriginFunction(dimension: 2);
    OptimizationAlgorithm algorithm = new MonteCarloAlgorithm(qualityFunction, logger, sampleCount: 10000);
    ValuedSample result = algorithm.Optimize();
    Console.WriteLine(result.Value);
    Console.WriteLine(String.Join(';', result.X));
}
CSV file format
The above code (see full source here) will generate a tab-separated CSV file of the following form:
Name	Value	X1	X2
MC	20.612892773532423	-2.57363369823137	-3.9859816736662674
MC	5.520117218854509	-0.337810638108202	2.7812262095051006
MC	-0.06087417797589767	1.6129934723735753	-0.6883061626778479
MC	18.93160368211865	-1.4941822607322512	4.545149707526504
MC	16.69946316855684	-4.083031515592259	1.4587448866007593
MC	27.542993367402246	-4.826759037834946	-2.580181549796919
MC	28.087825693297862	-1.8420722517008299	5.015215592335545
MC	3.797238642274225	1.8649407371622235	1.5860519631142038
MC	4.9216728406385215	-2.224852746569017	1.1811472206847498
...
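Although the diagnostics below use R, the log format is easy to consume anywhere. As a hedged illustration, here is a small Python sketch (the function name read_log and its return shape are my own invention, not part of the workshop code) that parses such a tab-separated log:

```python
import csv

def read_log(path):
    # Parse the tab-separated log produced by the optimization logger:
    # a header row (Name, Value, X1..Xn) followed by one row per evaluation.
    with open(path, newline='') as f:
        reader = csv.DictReader(f, delimiter='\t')
        return [(row['Name'], float(row['Value']),
                 [float(v) for k, v in row.items() if k.startswith('X')])
                for row in reader]
```

Each entry pairs the algorithm name with the function value and the argument vector, which is all the later plots need.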
The examples of CSV files are provided here.
After those initial steps we are ready to create our first diagnostic tool in R
(of course, any environment with sensible plotting abilities - Python, Matlab, Octave - will be sufficient).
Convergence plots
The most basic way of verifying the performance of an algorithm is to observe the speed of its convergence to the optimal value (a minimum, in this case).
Usually, convergence speed should be considered in two domains:
- the number of function samples necessary to obtain a certain function value
- the computation time in which a certain function value has been obtained
In real-world applications the second measure is definitely more important, while the first one enables researchers to compare algorithms regardless of the available computational power and the code optimization prowess of the designer, focusing on the algorithm's properties.
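In both domains the quantity actually plotted is the same: the cumulative minimum of the logged values, i.e. the best function value seen after each evaluation. A minimal sketch of that computation (this is what R's cummin does for us below):

```python
def running_best(values):
    # Cumulative minimum: best (lowest) function value seen so far,
    # producing the monotonically non-increasing convergence curve.
    best = float('inf')
    out = []
    for v in values:
        best = min(best, v)
        out.append(best)
    return out

# running_best([3.0, 5.0, 1.5, 2.0]) -> [3.0, 3.0, 1.5, 1.5]
```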
A convergence plot of this kind can be easily generated with the following R script:
#start with setting the location of CSV log files
#in R '~' means home directory location
setwd('~/Repozytoria/Optymalizacja/tutorials/r/sample-data/')

#load CSV file for algorithm run
rastrigin.run = read.csv('mc-rastrigin-log.csv', sep='\t', dec='.')

#plot a cumulative minimum of samples function values
plot(cummin(rastrigin.run$Value),
     #y axis will be presented in logarithmic scale
     log='y',
     #a line plot will be created
     type = 'l',
     #line thickness will be set to 2 units
     lwd=2,
     #line color will be set to blue
     col = 'blue',
     #plot title
     main = 'Rastrigin - Monte Carlo',
     #axis labels
     xlab = 'Evaluations',
     ylab = 'log(function.value)'
)
To store the plot in an image file, add a png() call before the plot
call (and close it afterwards with dev.off() to close the output stream):
...
#create png with certain width and height
png('evaluations-convergence.png', width = 600, height = 300)

#plot a cumulative minimum of samples function values
plot(cummin(rastrigin.run$Value),
     #y axis will be presented in logarithmic scale
     log='y',
     #a line plot will be created
     type = 'l',
     #line thickness will be set to 2 units
     lwd=2,
     #line color will be set to blue
     col = 'blue',
     #plot title
     main = 'Rastrigin - Monte Carlo',
     #axis labels
     xlab = 'Evaluations',
     ylab = 'log(function.value)'
)

#close image file output stream
dev.off()
After adding a second optimization algorithm, namely a grid search, we may now plot the results of both algorithms against one another in order to see which performed better:
Because grid search eventually finds the optimum at (0,0), it was necessary to switch off the log scale. Here its final victory over Monte Carlo can be observed:
The comparison plot can be generated with an R script of the following form (remember to set the proper file locations with setwd()
):
#load CSV files for algorithm run
rastrigin.mc.run = read.csv('mc-rastrigin-log.csv', sep='\t', dec='.')
rastrigin.gs.run = read.csv('gs-rastrigin-log.csv', sep='\t', dec='.')

#plotting to file - remember to call dev.off()
#png('evaluations-convergence-with-gs.png', width = 600, height = 300)

#plot a cumulative minimum of samples function values
#to get the column in R we may use DATA$COLUMN syntax
plot(cummin(rastrigin.mc.run$Value),
     #a line plot will be created
     type = 'l',
     #line thickness will be set to 2 units
     lwd=2,
     #line color will be set to blue
     col = 'blue',
     #plot title
     main = 'Rastrigin - Comparison',
     #axis labels
     xlab = 'Evaluations',
     ylab = 'function.value'#,
     #limit the y-axis to [0,0.1]
     #ylim=c(0,1e-1)
)

#add a second line to the same plot
lines(cummin(rastrigin.gs.run$Value+1e-16), lwd=2, col = 'red')

#draw a legend
legend("topright", legend = c('MC','GS'), lwd = c(2,2), col = c("blue","red"))

#if calling png above - use also dev.off
#dev.off()
However, writing a Grid Search algorithm was not motivated by that minuscule advantage over the Monte Carlo approach...
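The Grid Search itself is not shown in this tutorial (it lives in the referenced .NET sources), so here is a hedged Python sketch of the idea: evaluate the function on a regular lattice and keep the best point. The signature and names are my own, for illustration only.

```python
import itertools

def grid_search(f, dimension, lower, upper, points_per_dim):
    # Evaluate f at every node of a regular grid over [lower, upper]^dimension
    # and return the best sample; the full set of evaluated grid points is
    # exactly the kind of data the contour plots below are built from.
    step = (upper - lower) / (points_per_dim - 1)
    axis = [lower + i * step for i in range(points_per_dim)]
    best_x, best_value = None, float('inf')
    for x in itertools.product(axis, repeat=dimension):
        value = f(list(x))
        if value < best_value:
            best_x, best_value = list(x), value
    return best_x, best_value
```

Note the cost grows as points_per_dim ** dimension, which is why grid search is used here mainly as a source of dense 2D function data rather than as a serious optimizer.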
Sampling distribution plots over 2D contours
The real reason for writing a Grid Search was to obtain data useful for plotting contours of 2D functions. Against those contours we can plot the samples gathered by an algorithm, in order to observe the way it samples the optimized function's arguments:
In order to generate those plots you may use the following R script:
#in order to get a better plot experience we are going to use the ggplot2 package
if('ggplot2' %in% rownames(installed.packages()) == FALSE) {
  install.packages('ggplot2')
}
require(ggplot2)
if('svglite' %in% rownames(installed.packages()) == FALSE) {
  install.packages('svglite')
}
require(svglite)

#start with setting the location of CSV log files
#in R '~' means home directory location
setwd('~/Repozytoria/Optymalizacja/tutorials/r/sample-data/')

#load CSV files for algorithm run
algorithm.run = read.csv('mc-rosenbrock-log.csv', sep='\t', dec='.')
#now gs data will be used to draw a contour of the function
function.data = read.csv('gs-rosenbrock-log.csv', sep='\t', dec='.')

v <- ggplot(function.data, aes(X1, X2, z = Value)) + coord_fixed()
v <- v + geom_raster(aes(fill = Value)) +
  scale_fill_gradientn(colours=c("#00FF00FF","#FFFF00FF","#FF0000FF")) +
  stat_contour(binwidth = sd(function.data$Value) / 2, colour = 'black') +
  #adjust transparency on the basis of samples count
  geom_point(data = algorithm.run, alpha = 700 / nrow(algorithm.run), aes(X1, X2))
print(v)

ggsave('rosenbrock-contour.png', plot = v, width=8, height=6, units = 'in')
While the presented sampling of the Monte Carlo method is not very informative, you can also check the results of the Hill Climbing approach stored in the hc-rastrigin-log-10000.csv
and hc2-rastrigin-log-10000
files: