Default optimizer: "L-BFGS-B" or "BFGS"?

Issue #147 resolved
Alireza Khorshidi created an issue

I do not quite remember why we switched the default optimizer from 'BFGS' in v0.4 to 'L-BFGS-B' in the development version.

L-BFGS-B apparently uses less memory, but we should not have a memory issue, since we have a limited number of parameters. On the other hand, I would expect 'BFGS' to be more exact (and probably better).
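
To put rough numbers on the memory argument, here is a minimal sketch that compares only the dominant storage of the two methods. It assumes BFGS keeps a dense n x n approximation of the inverse Hessian, and that the limited-memory variant keeps a fixed number of correction pairs (10, which is the default `maxcor` in SciPy's L-BFGS-B). The helper names are hypothetical, and working arrays and line-search buffers are ignored.

```python
def bfgs_memory_bytes(n_params, bytes_per_float=8):
    """Dominant storage of BFGS: a dense n x n inverse-Hessian approximation."""
    return n_params * n_params * bytes_per_float


def lbfgs_memory_bytes(n_params, n_corrections=10, bytes_per_float=8):
    """Dominant storage of L-BFGS: two length-n vectors per stored correction pair."""
    return 2 * n_corrections * n_params * bytes_per_float


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        dense_mb = bfgs_memory_bytes(n) / 1e6
        limited_mb = lbfgs_memory_bytes(n) / 1e6
        print(f"n={n:>6}: BFGS ~{dense_mb:.2f} MB, L-BFGS ~{limited_mb:.2f} MB")
```

Under these assumptions the dense matrix only becomes expensive when the number of parameters reaches the tens of thousands, which is consistent with the point that memory is unlikely to be the limiting factor here.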

When I try v0.4 with BFGS and the development version with L-BFGS-B, I see that v0.4 reduces the loss function more smoothly than the development version. That might be due to the optimizer, though I am not sure at this point.
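
One way to isolate the effect of the optimizer is to run both methods from the same starting point on the same objective. The sketch below does this with `scipy.optimize.minimize`, using SciPy's built-in Rosenbrock function as a stand-in objective. It is not the loss function discussed in this issue; the helper name and the starting point are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der


def compare_optimizers(x0, methods=("BFGS", "L-BFGS-B")):
    """Minimize the same objective from the same start with each method."""
    results = {}
    for method in methods:
        res = minimize(rosen, x0, jac=rosen_der, method=method)
        results[method] = {
            "loss": float(res.fun),   # final objective value
            "nit": int(res.nit),      # number of iterations
            "nfev": int(res.nfev),    # number of objective evaluations
        }
    return results


# Stand-in starting point: 10 parameters, identical for both methods.
x0 = np.tile([-1.2, 1.0], 5)
results = compare_optimizers(x0)

print(f"start loss: {rosen(x0):.4g}")
for method, summary in results.items():
    print(f"{method:>9}: loss={summary['loss']:.4g}, "
          f"iterations={summary['nit']}, evaluations={summary['nfev']}")
```

A test problem like this says little about behaviour on the real loss function, so the same comparison would need to be repeated on actual training data before drawing conclusions.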

Comments (6)

  1. Alireza Khorshidi reporter

    :)) Yes, you are right, I should try to write more descriptive messages. Maybe since L-BFGS-B was newer than BFGS, I thought it would be an improvement!

    I gave it a shot on a large dataset of about 3500 images and two chemical elements. Starting from the same set of initial parameters, L-BFGS-B reduces the loss function to 70.98, while BFGS reduces it to 1.37, which is much smaller. The log files are attached.

    The Wikipedia page says that L-BFGS-B "is particularly suited to problems with very large numbers of variables (e.g., >1000)". That is roughly the size reached with 4 chemical elements and 10-10-10-10 hidden layers, so we will not encounter it that often.

    In addition, if BFGS hits a memory issue, a message will appear, and then we can think about how to reduce the memory use (e.g. by using L-BFGS-B instead of BFGS). I would not worry about memory unless it becomes an issue. Having said all that, I vote for switching back to BFGS.
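
The ">1000 variables" estimate quoted in the comment above can be checked with a small counting sketch. It assumes one fully connected network per chemical element, one bias per node, and a single output node. The function name and the number of inputs per network (20) are hypothetical, since the thread does not state them, and the real model may carry extra parameters that are not counted here.

```python
def count_network_parameters(n_inputs, hidden_layers, n_outputs=1):
    """Count weights and biases of one fully connected feed-forward network."""
    layer_sizes = [n_inputs, *hidden_layers, n_outputs]
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += (fan_in + 1) * fan_out  # +1 accounts for the bias term
    return total


n_inputs = 20                   # assumed number of inputs per network
hidden_layers = (10, 10, 10, 10)
n_elements = 4

per_element = count_network_parameters(n_inputs, hidden_layers)
total_parameters = n_elements * per_element
print(f"{per_element} parameters per element, {total_parameters} in total")
```

With these assumptions the hidden layers and output alone contribute 351 parameters per element, so four elements exceed 1000 parameters regardless of the number of inputs.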
