bnpy / QuickStart / GMM-soVB

Goal

Here we describe how to use bnpy to do inference via an algorithm called stochastic online Variational Bayes (soVB). We'll consider a simple Gaussian mixture model applied to the Asterisk toy data as the target application.

For more conceptual information about soVB, check out this guide.

Run the code

As a first step, we'll run a quick single-task experiment using all the default parameters. To execute soVB from the command line, we call Run.py with the same standard syntax we've used for EM and VB. We just pass soVB as the fourth positional argument, which specifies the learning algorithm.

To fit an 8-component model, taking at most 50 laps through the data, execute:

python -m bnpy.Run AsteriskK8 MixModel Gauss soVB --nLap 50 --K 8
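
If you launch runs from a script rather than typing them by hand, the same command can be assembled programmatically. The sketch below is only an illustration: the helper `build_run_command` and its argument names (`data_name`, `alloc_model`, `obs_model`, `alg_name`) are hypothetical labels chosen here, not part of bnpy. It builds the argument list but does not execute it.

```python
import shlex
import sys


def build_run_command(data_name, alloc_model, obs_model, alg_name, **options):
    """Assemble the argument list for `python -m bnpy.Run`.

    The four positional names are labels invented for this sketch;
    the fourth position is the learning algorithm (here, soVB).
    """
    cmd = [sys.executable, "-m", "bnpy.Run",
           data_name, alloc_model, obs_model, alg_name]
    # Keyword options become `--name value` pairs, in the order given.
    for name, value in options.items():
        cmd += ["--" + name, str(value)]
    return cmd


cmd = build_run_command("AsteriskK8", "MixModel", "Gauss", "soVB", nLap=50, K=8)
# Show everything after the interpreter path.
print(" ".join(shlex.quote(part) for part in cmd[1:]))
```

The resulting list could be handed to `subprocess.run(cmd)` on a machine where bnpy is installed.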

Expected Output

Minibatch Iterator: 10 batches
  num batch 10, num obs per batch 2500
  num obs (total across all batches): 25000
Allocation Model:  Finite mixture with K=8. Dir prior param 1.00
Obs. Data  Model:  Gaussian distribution
Obs. Data  Prior:  Gauss-Wishart jointly on mu[k], Lam[k] 
  E[ mu[k] ]     = [ 0.  0.]
  E[ CovMat[k] ] = 
  [[ 1.  0.]
   [ 0.  1.]]
Learn Alg: soVB
Trial  1/1 | alg. seed: 4226944 | data order seed: 8541952
savepath: /results/AsteriskK8/MixModel/Gauss/soVB/defaultjob/1
    0.100/50 after      0 sec. | K    8 | ev -9.336609061e+04 
    0.200/50 after      0 sec. | K    8 | ev -4.635748733e+04 
    0.300/50 after      0 sec. | K    8 | ev -4.577378086e+04 
    5.000/50 after      1 sec. | K    8 | ev -2.177887241e+04 
   10.000/50 after      1 sec. | K    8 | ev -7.414749454e+03 
   15.000/50 after      2 sec. | K    8 | ev -8.135766318e+02 
   20.000/50 after      2 sec. | K    8 | ev  6.876642455e+02 
   25.000/50 after      3 sec. | K    8 | ev  1.011611776e+03 
   30.000/50 after      3 sec. | K    8 | ev  1.014090661e+03 
   35.000/50 after      4 sec. | K    8 | ev  1.014686244e+03 
   40.000/50 after      4 sec. | K    8 | ev -7.781143250e+01 
   45.000/50 after      5 sec. | K    8 | ev -1.417308333e+02 
   50.000/50 after      5 sec. | K    8 | ev  1.017075716e+03 
... done. all data processed.
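
The fractional lap counts in the first column follow from the minibatch setup reported at the top of the log. A small sketch of that arithmetic, assuming one parameter update per batch (the variable names are chosen here for illustration):

```python
# Numbers taken from the "Minibatch Iterator" lines of the log.
n_batch = 10            # batches per lap
n_obs_per_batch = 2500  # observations in each batch
n_lap = 50              # value passed via --nLap

n_obs_total = n_batch * n_obs_per_batch  # observations seen in one full lap
n_updates = n_lap * n_batch              # batch visits over the whole run


def lap_fraction(batches_seen, n_batch=n_batch):
    """Fraction of a pass through the data completed so far."""
    return batches_seen / n_batch


# One batch is a tenth of a lap, so progress advances in steps of 0.100.
for batches_seen in (1, 2, 3, 50, 500):
    print(f"{lap_fraction(batches_seen):8.3f}/{n_lap}")
```

Under that assumption, a 50-lap run visits 500 batches in total, and the first three progress lines (0.100, 0.200, 0.300) correspond to the first three batches.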

Visualization of learned parameters

As usual, PlotComps.py can be called for visualization.

python -m bnpy.viz.PlotComps AsteriskK8 MixModel Gauss soVB --doPlotData

ELBO trace plot

We can plot the ELBO trace over time with

python -m bnpy.viz.PlotELBO AsteriskK8 MixModel Gauss soVB

soVB processes data in batches and updates parameters using noisy estimates from each batch. Thus, its ELBO estimates will not look as clean as those of algorithms like EM or VB.
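
To see why batch-wise updates are noisy, consider a toy stochastic update that blends the current estimate with a per-batch estimate using a decaying step size. This is a generic stochastic-approximation sketch on a one-dimensional mean, not bnpy's implementation; the names `learning_rate`, `delay`, and `kappa` and their default values are assumptions made for illustration.

```python
import random


def learning_rate(step, delay=1.0, kappa=0.6):
    """Decaying step size rho = (step + delay) ** (-kappa)."""
    return (step + delay) ** (-kappa)


def fit_mean_by_batches(data, n_batch, n_lap, seed=0):
    """Estimate the mean of `data` one batch at a time."""
    rng = random.Random(seed)
    # Split the data into n_batch interleaved batches.
    batches = [data[i::n_batch] for i in range(n_batch)]
    estimate, step, trace = 0.0, 0, []
    for _ in range(n_lap):
        order = list(range(n_batch))
        rng.shuffle(order)  # visit batches in a random order each lap
        for b in order:
            step += 1
            batch = batches[b]
            noisy_estimate = sum(batch) / len(batch)  # uses this batch only
            rho = learning_rate(step)
            # Blend the old value with the noisy per-batch value.
            estimate = (1.0 - rho) * estimate + rho * noisy_estimate
            trace.append(estimate)
    return estimate, trace


data_rng = random.Random(42)
data = [data_rng.gauss(3.0, 1.0) for _ in range(2500)]
estimate, trace = fit_mean_by_batches(data, n_batch=10, n_lap=50)
print(estimate, sum(data) / len(data))
```

Because each step sees only one batch, the running estimate jitters around the full-data answer rather than approaching it smoothly; the shrinking step size damps that jitter over time.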

As shown above, soVB produces a trace in which the ELBO generally increases, but the monotonic-increase guarantee that holds for full-dataset VB does not apply here. In the log from our run, for example, the ELBO dips at laps 40 and 45 before recovering by lap 50.
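
When inspecting a run, it can help to flag the laps at which the reported ELBO fell. The sketch below hard-codes the (lap, ev) pairs copied from the log printed earlier on this page; the helper `find_decreases` is a hypothetical name, not a bnpy function.

```python
# (lap, ev) pairs copied from the expected output of the run.
trace = [
    (0.1, -9.336609061e+04),
    (0.2, -4.635748733e+04),
    (0.3, -4.577378086e+04),
    (5.0, -2.177887241e+04),
    (10.0, -7.414749454e+03),
    (15.0, -8.135766318e+02),
    (20.0, 6.876642455e+02),
    (25.0, 1.011611776e+03),
    (30.0, 1.014090661e+03),
    (35.0, 1.014686244e+03),
    (40.0, -7.781143250e+01),
    (45.0, -1.417308333e+02),
    (50.0, 1.017075716e+03),
]


def find_decreases(trace):
    """Return (lap, change) for each report where ev fell below the previous one."""
    drops = []
    for (_, ev_prev), (lap_cur, ev_cur) in zip(trace, trace[1:]):
        if ev_cur < ev_prev:
            drops.append((lap_cur, ev_cur - ev_prev))
    return drops


drops = find_decreases(trace)
for lap, change in drops:
    print(f"lap {lap:g}: ev changed by {change:.3e}")
```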
