new_version
Author Commit Message Date Builds
2 commits behind master.
Mike Hughes
ENH Improvements to examples and docs
Mike Hughes
ENH Updates to examples/, in prep for better web documentation
Mike Hughes
DOC source/ now has good structure for api/, examples/, and obsmodel/ and allocmodel/
Mike Hughes
DOC Gaussian obsmodel first demo
Mike Hughes
ENH Updated docs and examples
Mike Hughes
REORG deleting demos/ directory, to make way for newer, easy-to-maintain examples/
Mike Hughes
INPROG All supported LearnAlgs have converted to minimizing loss, and using trace_*.txt, snapshot_*.txt format
Mike Hughes
INPROG moving towards universal use of loss instead of ev, simpler to understand
Mike Hughes
FIX BagOfWordsData, so more clearly named
Mike Hughes
REORG remove munkres from third-party/, it's now an official dependency
Mike Hughes
FIX setup.py now successfully compiles all relevant extensions (.cpp or .pyx files)
Mike Hughes
REORG datasets/ now has lots of old content moved to zzz_unsupported
Mike Hughes
REORG run_notebook_docs.py draft script is in place
Mike Hughes
ENH updated run_nosetests.py and run_doctests.py
Mike Hughes
REORG Updated tests/
Mike Hughes
FIX Updated some basic tests to pass
Mike Hughes
FIX AllocModel.py had bad typo, now fixed
Mike Hughes
MERGE Huge merge of master and update branches
Mike Hughes
ENH Improved BestJobSearcher.py to take --scoreTxtfile, which specifies which result is used to rank
Mike Hughes
FIX GroupXData assert verifies that all sequences are of length >= 0
Mike Hughes
FIX FiniteHMM uses startAlpha instead of initAlpha
Mike Hughes
ENH AutoRegGauss can now use Xprev arrays that have different dim compared to X
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX creating GaussRegressYFromDiagGaussX now specifically includes 'D' and 'E' as PriorArgs
Mike Hughes
FIX HDP_TestFromExisting had a bug in post-merge-proposal elbo computation due to forgetting to call setMergeUIDPairs(). Now fixed.
Mike Hughes
DOC Improved docs
Mike Hughes
DOC Updated obsmodel and allocmodel docs
Mike Hughes
DOC Updated docs for DiagGauss observation model
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
DOC Updated documentation with good examples for obsmodels
Geng Ji
ENH HDP_FromExisting could merge between new and generic topics
Mike Hughes
MERGE HDP_TestFromExisting does warm_start init from previous DocTopicCounts and faster merges
Mike Hughes
ENH Improved HDP_TestFromExisting to show how proposed stats can be computed just for needed cluster pairs, not for any other clusters
Geng Ji
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Added tests/allocmodel/TestDPELBOVsGamma.py, a prelim exploration of how gamma impacts overall Lalloc values
Geng Ji
Solve the minor conflict
Geng Ji
ENH HDP_TestFromExisting
Mike Hughes
ENH HDP_TestFromExisting now has functional merge moves during refinement phase
Geng Ji
ADD HDP_testFromExisting.py
Geng Ji
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Geng Ji
ADD noiseSD to FromExistingBregman
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH SuffStatBagIO passes all doctests. Looks ready for primetime
Mike Hughes
ENH Added file io functionality for SuffStatBag. bnpy.ioutil.SuffStatBagIO.py has methods saveSuffStatBag(file_path, SS) and loadSuffStatBag(file_path)
Mike Hughes
FIX FromExistingBregman.py and its tests. weird spliced text error and matplotlib incompatibility error.
Mike Hughes
MERGE diverse edits to FromExistingBregman
Mike Hughes
FIX RunBregKMeans.py removed erroneous print statement
Mike Hughes
FIX ZeroMeanGaussObsModel now standard to use B/(nu-D-1) as mean parameter. TODO what to do here to be formally correct... the kmeans objective is not monotonic under this choice.
Mike Hughes
ENH FromExistingBregman improved with better debugging output, clearer notation (Kfresh instead of K), etc
Mike Hughes
ENH TestFromExistingBregmanKMeans is ready
Mike Hughes
ENH Added TestFromExistingBregmanKMeans to verify correctness of from existing initialization
Mike Hughes
FIX RunBregKMeans test succeeds for all obsmodels now
Mike Hughes
DOC improved docs for breg inits
Geng Ji
FIX bugs
Geng Ji
FIX bugs related with Bregman kmeans
Mike Hughes
ADD motorcycle_crash dataset
Mike Hughes
ENH GaussRegressYFromDiagGaussX now supports merge moves. hooray
Mike Hughes
ENH Added ability to do local step on test dataset, when Y is missing
Mike Hughes
FIX bad printing of 1D array to strings in get_prior_info_string()
Mike Hughes
FIX default settings for regression model use pnu/ptau instead of nu/tau
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ADD FromExistingBregman, which is still a work in progress
Mike Hughes
ENH GaussRegressYFromDiagGaussX is now an obsmodel. Combines both DiagGauss for X and GaussRegressY functionality
Mike Hughes
ENH New obsmodel GaussRegressYFromFixedX
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Improved GraphXData object to read from .graph files
Mike Hughes
ENH plotTrace will no longer skip runs that are very short (<3 entries)
Mike Hughes
ENH Added support for soVB to use warm starts. Requires --doMemoizeLocalParams 1 (RAM) or 2 (disk), and --initDocTopicCountLP memo
Mike Hughes
FIX bug in restarts for local step many docs where topic count was overwritten after exiting early...
Mike Hughes
MAINT Housekeeping, changed embed statement to avoid red flags
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH TaskRanker,JobSearcher now avoid tasks with errors when computing the best task. TODO what happens when all tasks have errors
Mike Hughes
FIX FiniteTopicModel no longer computes slack term explicitly. Not necessary, saves memory and (small) time.
Mike Hughes
ENH Revised C++ code for many-doc-local-step to do faster warm start
Mike Hughes
ENH Memoized tracking of local params can now cache to disk
Mike Hughes
ENH memoVB supports warm starting
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Improved visualizations, ability to rank runs on disk, and ability to read topic models from disk
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Updated log-elapsedtime tracking to include both Memo and Stoch algs
Mike Hughes
FIX Running 'make all' now builds the correct c++ code for sparse topics
Mike Hughes
FIX warnings.warn instead of warnings.warning, and force topics loaded from disk to be a 2D array even for K=1
Mike Hughes
ENH Merge laptop work and desktop work
Mike Hughes
FIX Maybe better handling of K=1 case for bregmanmixtures
Mike Hughes
ENH Improved init of randexamples for Mult, can do randexamples+lam1 to get bigger smoothing
Mike Hughes
ENH Standardized OptimizerForPi and removed lots of junk. Now has same interface for both frankwolfe and natural-gradient-descent.
Mike Hughes
ENH Improved merge/delete planning and pi_d optimization.
Mike Hughes
ENH streamlining merge/delete/birth selection and associated log messages
Mike Hughes
ENH Updated merge and delete moves for DP and HDP. Delete selection now purely size-based, not driven by weird score. Among all clusters, we try to delete largest one under threshold.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Towards functional delete and birth for HDP
Mike Hughes
FIX BarsViz can plot more than 5 cols now
Mike Hughes
ENH Towards better merge pair selection for HDP
Mike Hughes
ENH Improved restricted local step with optional ELBO computation. TryDelete and TryMerge scripts helpful for debugging.
Mike Hughes
ENH HDP delete move now corresponds to recent changes for delete. Rewrites all resp mass for absorbing set, rather than just reassigning mass from target only.
Mike Hughes
ENH Added TryDelete and TryMerge functions, so it's easy to run merge/delete for DPMixtureModels forward and see if they improve
Mike Hughes
ENH Merge moves now refresh clusters after every m_nLapToReactivate laps, so that we keep trying things.
Mike Hughes
ENH revised birthmove in light of recent delete improvements. birthmove now has nUpdateSteps=1
Mike Hughes
ENH DeleteMove for DPMixtureModel substantially improved. Rewrites all resp for absorbing set, rather than just reassigning mass from target
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX issues with 1D gaussian init of prior params
Mike Hughes
ENH Trying out mixture estimation strategy for init pi_d
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX faster, less-memory-hungry cython code for entropy calculation needed some tweaks to play well with proposal moves
Mike Hughes
ENH Updated figs for producing surrogate bound plots
Mike Hughes
ENH Added test to try out the natural param of multinomial likelihood.
Mike Hughes
ENH Fix heldout estimation score for gaussian models so it normalizes by (#atoms * #dims), not just #atoms
Mike Hughes
ENH EM algorithm can do local step with nnz s level
Mike Hughes
FIX WordsData's LoadFromFile_ldac had a bad memory leak, probably due to slicing arrays and returning fragments of those arrays. This commit resets to a pure python text file reader, which should not have the leak issue.
Mike Hughes
ENH memory profiling can access any named field of psutil now, like rss or data, etc
Mike Hughes
FIX BestJobSearcher needed to strip trailing '/' to do symlink creation, and needed to search for '-varname=' instead of just 'varname'
Mike Hughes
ENH sparsifyLogResp now handles duplicates
Mike Hughes
ENH Callback for mixture models on GroupXData now does plain old point estimation of logsumexp(logpi + logphi). Should be easier to justify.
Mike Hughes
ENH DPMixture and FiniteMixture now both fully support nnzPerRowLP flag
Mike Hughes
FIX FiniteMixtureModel uses Dir(gamma/K, gamma/K, ... gamma/K) now, so that in limit K->inf approaches DP
Mike Hughes
FIX adjusted many-doc algorithm so that can handle duplicates. Uses more-elegant way that sorts indices by values
Mike Hughes
ENH New script BestJobsearcher will do grid search over nuisance parameters to find best performing jobs
Mike Hughes
FIX FiniteTopicModel handles nnz=1 case with Hresp=0
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX InferHeldout now saves test-timeTrain and validation-timeTrain so they have (approx) same value, always trying to subtract away time spent on evaluation
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Fixed FiniteTopicModel so its alpha hyperparameter has same interpretation as alpha in HDPTopicModel
Mike Hughes
ENH Standardized saving procedure. Memoized will save (sparsified) WordCount stats, but only after the alg starts updating global params. Stochastic saves (sparsified) lam values, never suff stats directly since these are for one-batch only.
Mike Hughes
ENH Slightly faster merge gap computations in MultObsModel, by avoiding lots of memory allocation and function calls.
Mike Hughes
ENH Better elapsed time logging of io for saving models to disk. Also dramatically sped up multobsmodel by precomputing prior cFunc for evidence computation (2x speed increase on this bit)
Mike Hughes
ENH Callbacks now track the cumulative time in train+eval, and also evalonly and trainonly in predlik-timeEvalOnly.txt and predlik-timeTrain.txt
Mike Hughes
ENH PlotHeldoutLik now accommodates showing the Kactive per document
Mike Hughes
ENH Updated callbacks to always do validation and test set metrics
Mike Hughes
FIX learnalgs like soVB now set_start_time() before first save, to avoid crash. Obsmodels handle leftover kwargs to calc_evidence
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
Updated PlotTrace to handle new predlik-timeTrain.txt output
Mike Hughes
ENH Added ability to control verbosity of local step with --verboseLP. Also fixed bug so that topic models do not do sparse inference when L=0
Mike Hughes
ENH Added capability to use validation set to InferHeldoutTopics
Mike Hughes
FIX Small bug in JobFilter where kwarg is interpreted as int, then compared to str
Mike Hughes
ENH using faster exp function to speedup inference
Mike Hughes
ENH Standardized loading from csr and csc txt files. Also, made possible to track heldout data vs training time
Mike Hughes
ENH added automatic times_saved_params.txt that's saved by every infer alg
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Several improvements to memory and speed. Data object no longer holds on to dense DocTypeCountMat after anchorwords init, and we save 100MiB on nips/wiki/etc because of this. Also, FiniteTopicModel now also supports sparse assignments, just like HDPTopicModel.
Mike Hughes
ENH Speed up anchor init by avoiding unnecessary copying
Mike Hughes
ENH added util functions to monitor memory consumption
Mike Hughes
ENH Added CleanBarsK10 to contrast with noisier BarsK10V900. WordsData generation uses slightly different seeding, so that we can get diff values from get_data(seed=1, nDocTotal=1) and get_data(seed=3, nDocTotal=1). InferHeldoutTopics uses new strategy to divide vocab into training and testing sets, so that we have good balance of (by default) 10% seen words and 90% unseen words in the retrieval experiments. Note: likelihood computation always uses entire test set of seen words, but retrieval experiment may drop some seen word types at random.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Removed lots of old cruft
Mike Hughes
ENH Updated ModelReader and ModelWriter to use new plain-text dump format for topicmodels
Mike Hughes
FIX small bugs in SOVB algorithm related to buildRunInfo, and in DataIteratorFromDisk
Mike Hughes
ENH Updated interface to C++ so can use initDocTopicCountLP with fastfirstiter to trigger precomputation of vocab-specific sparse resp
Mike Hughes
ENH Added plaintext output for Kactive at each doc into localstep-transcript.txt
Mike Hughes
ENH Improved TopRPrecision metric, so that we have fixed baseline of Rprec=0.1 for random guessing (at least 10% of heldout vocab types will be present, rest will be absent), and Rprec=1.0 for perfect performance
Mike Hughes
ENH TopLCPPX explores different ways to call nth_element to find indices of top L values in a vector. Turns out there doesn't seem to be much practical difference between using a struct with a comparator, and sorting a copy of the data values.
Mike Hughes
ENH Added precomputation of type-topic sparse assignments. Started on TopLCPPX.cpp to make top-L computation faster (maybe)
Mike Hughes
ENH LocalStepManyDocs now has C++ equivalent that can do amortized active-set revision *and* restarts. TODO: Integrate into calc_local_params.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Adding LocalStepManyDocs functionality, via C++ code
Mike Hughes
FIX WordsData .dim always set to .vocab_size. Also, InferHeldoutTopics prints useful info about local step params (which are always set to avoid restarts)
Mike Hughes
FIX Heldout metrics for topic models are working again. Fixed way-too-custom dependency on path name. Spruced up the logging functionality too, so it is saved to heldout-transcript-summary.txt and heldout-transcript-verbose.txt.
Mike Hughes
FIX Better error msg for LPkwargs access
Mike Hughes
FIX C++ code for local step doesn't do fixed active set inference when nnzPerRow==1
Mike Hughes
FIX Stupid bug where assertion was raised because we forgot to disregard -1 state in count
Mike Hughes
ENH Improvements to speed-up sparse local step for HDP topics, esp with Mult likelihoods. Code now amortizes the cost of sifting out the top L topics per token, and does some precomputation to avoid first dense iteration at each doc.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX updated births to work with sparse resp
Mike Hughes
ENH Makefile consistent use of DNDEBUG now
Mike Hughes
ENH sparse activeonly now successfully can do restarts, and record them to file
Mike Hughes
ENH Updated Topic Model local step with assignments only to active topics
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Big speedup of reading .ldac files (plain text format for bag-of-words)
Mike Hughes
ENH Improved speed of sparse restarts
Mike Hughes
FIX added small assert and comment, to help explain weird edge case in Breg div
Mike Hughes
ENH initname bregmankmeansWithPriorMean will now insert a value at the prior's expected suff stats
Mike Hughes
FIX entropy calculation with cython code now works for restricted inference, when Resp.ndim==1 (not 2)
Mike Hughes
FIX anchorwords init now can do K=1 initialization
Mike Hughes
ENH Added switch --doSparseOnlyAtFinalLP 0, which is 0 by default, and 1 if we desire sparsity only after the final step of local inference.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX HDP local step will not do sparse if nnzPerRow is equal to K (formerly was useful for debugging)
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX makefile has eigenpath dependency now
Mike Hughes
ENH Can now do merges with HDP and DP models with sparse local assignments.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Updated HDPTopicModel inference tools so can do sparse assignments, esp with WordCounts
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX DPMixture use np.sum(Hresp) not Hresp.sum(), because Hresp might be a float in 1-sparse case
Mike Hughes
FIX Bregman init can now specify num iters with --initname bregmankmeans+0, and obsmodels correctly handle HDPTopicModel's lack of an N field
Mike Hughes
ENH Added c++ library for doing fast sparse-assignments in topic model case
Mike Hughes
ENH Updated Bern lik so that it (1) does everything from class-level function calcLocalParams and calcSuffStats, and (2) can use sparse or dense assignments in calcSuffStats. TODO: deal with sparse assignments when dataatoms are words
Mike Hughes
ENH DiagGauss lik now supports sparse vs dense summary calculation
Mike Hughes
FIX EMAlg works again
Mike Hughes
ENH Mult likelihood now supports (1) parallelism for both doc and word atom types, and (2) sparse or dense assignments
Mike Hughes
FIX improved performance for moderate nnz values (2, 3, ...) by avoiding subtracting max of each row from all K entries, only using the top nnz entries
Mike Hughes
FIX sparsifyLogResp needs to be able to safely take exp of any value. So, need to subtract the max in each row. TODO only subtract max from topL entries, not all values.
Mike Hughes
ENH Can now train DPMixtureModel with enforced sparsity (at most L of the K states can be non-zero for any data atom's assignments). Works with Gauss and ZeroMeanGauss currently. TODO: extend to Mult, Bern, and other likelihoods.
Mike Hughes
ENH Added SparseRespStatsUtil, for computing things like sum(r[n,k] * x[n]**2) or sum(r[n,k] * x[n] * x[n].T)
Mike Hughes
ENH added sparsifyResp_ functions, in util/SparseRespUtil.py. These will take a matrix and return a sparseified version of it (with at most nnzPerRow entries non-zero)
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Started work on trying sparser-versions of assignment distributions q(z)
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX raised warning is now a proper warning that doesn't halt execution
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Improved error messaging for issue with NaN in sparse restarts... todo find better,faster long-term solution
Mike Hughes
ENH Updated feature branch to latest stable less-memory-hungry fix recently applied to master
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Improved PlotParamComparison by making connected line through top-ranked (by elbo) of all tasks
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Updated init so can do bregmankmeans+0 to specify 0 iterations (with same after effects as randexamples), bregmankmeans+1 to do 1 iter, bregmankmeans+50 to do 50 iterations.
Mike Hughes
ENH PlotTrace.py Added kwarg drawLineToXMax that will continue all lines to the same maximum x value, so it is easier to compare them
Mike Hughes
ENH Updated PlotParamComparison to study differences between local optima
Mike Hughes
ENH Bernoulli can now take custom mean and scale, instead of lam1 and lam0
Mike Hughes
ENH Updated PrintTopics to improve the display of topics under Bern lik
Mike Hughes
FIX PlotComps for a topicmodel with associated vocab list now shows the top words
Mike Hughes
ENH Cleanup log messages for BPlanner.py
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Updated BPlanner to use batch-specific stats to decide which births to try. We only disqualify a birth if all batches report a uid as ineligible (due to small size or past failures)
Mike Hughes
ENH Trying new BPlanner, with better criteria for choosing which comps to target
Mike Hughes
ADDED util function for sorting with respect to tiers
Mike Hughes
ENH Improved data reader to load from minibatch dataset
Mike Hughes
ENH Added capability to enable/disable birth proposal retention AND birth proposal merge cleanups
Mike Hughes
ENH Added avgPi option to init xPi
Mike Hughes
ENH Updated Eval statement in births to be more readable/searchable as a one-line summary
Mike Hughes
ENH Improved speed of ZeroMeanGauss (got rid of forloop for DivDataVec). Also simplified convergence logic for Birth proposal
Mike Hughes
ENH Upgraded ZeroMeanGauss to use triangular solver, which seems to give noticeable performance boost (see TestZeroMeanGaussLocalStepSpeed.py to try it out on new hardware)
Mike Hughes
ENH Improved logging message formatting and hopefully better flushing to disk. Added option to only retain stats for next lap if there are two comps with nontrivial mass.
Mike Hughes
ENH Added method in TryBirth that will find comp best targeted by a birth move
Mike Hughes
FIX shuffle move respects the different truncation limits aggregated across batches
Mike Hughes
ENH Added Letters dataset. Updated BPlanner to avoid trying the same comp too many times on the first lap. Updated TryBirth to load specific batch file from disk for the interactive trial.
Mike Hughes
FIX hdp restricted step uses HrespEmptyComp now, as required to do multi batch calculations correctly
Mike Hughes
FIX elbo tracking for Hresp in dp models across multiple batches. todo: same for hdp
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
FIX small bookkeeping errors related to multiple batches. Also smaller args-ALG.txt error that printed whole dictionaries when should have printed nothing.
Mike Hughes
Merge branch 'ENH-better-moves-with-relational' of https://bitbucket.org/michaelchughes/bnpy-dev into ENH-better-moves-with-relational
Mike Hughes
ENH Updated InferHeldoutTopics callback to handle general hmodel-based inference, not specific to mult topic models
Mike Hughes
ENH Added ability to init obsmodel from an hmodel object. Also fixed saveEvery behavior to work even with saveEvery < 1
Mike Hughes
FIX hdp restricted step for sparse bernoulli now properly deals with 'on' words and 'off' words in lumped fashion
Mike Hughes
ENH Updated bregman init and HDP restricted steps to support Bern HDP on sparse wordct data
Mike Hughes
FIX DPMixtureRestrictedLocalStep now supports normalized_counts
Mike Hughes
FIX Small bug where used allclose(sum_minDiv, 0.0) when should have just tested if equal to zero, since sum_minDiv may be quite small but still positive (eg 1e-10)
Mike Hughes
ENH DiagGaussObsModel converted to use DivDataVec format. todo: why not use smoothFracInit??
Mike Hughes
ENH GaussObsModel converted to use DivDataVec format. todo: why not use smoothFracInit??
Mike Hughes
INPROGRESS Converting GaussObsModel to std format
Mike Hughes
ENH Improved ioutil to be robust for loading with K=1, and let InferHeldoutTopics know how to find heldout set from batches/Info.conf
Mike Hughes
MAINT cleanup errant print statements
Mike Hughes
ENH births and deletes seem to proceed without major bugs after reorg that does all restricted-local step work in allocmodel specific file.
Mike Hughes
INPROGRESS Defined BRestrictedLocalStep to unify functions that become allocmodel specific
Mike Hughes
ENH Birth moves now use random seed specific to current learn alg and the current lap. Before, just used the lap, which made tasks that used the same predefined batches yield the same output, which was lame.
Mike Hughes
ENH Updated TryBirth.py script to auto-load the exact kwargs for births specified by the saved job, and to update any kwarg options specified by command line, like --b_Kfresh 10
Mike Hughes
FIX BernObsModel had bad comparison of CompDims tuple to a string 'K' instead of tuple 'K,'... resulted in some bad ELBO computations. Now fixed.
Mike Hughes
ENH Updated RunBregKMeans testing script, so it can run desired test via stdin specification of N and K and D
Mike Hughes
ENH FromScratchBregman back to computing objective up to additive constant. Would need to recompute DataDivVec using smoothFrac=0... smoothFracInit value does not work.
Mike Hughes
ENH Updated to use DivDataVec in computation of Ldata objective in FromScratchBregman
Mike Hughes
ENH Updated test for bregman to look at zmg
188 commits not shown.
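
Several commits above concern sparse assignments: sparsifyLogResp keeps at most nnzPerRow non-zero entries per row, and subtracts the row max before taking exp, using only the top entries. The following is a minimal NumPy sketch of that idea, not the code in the repository. The function name sparsify_log_resp and its arguments are hypothetical, chosen only for illustration.

```python
import numpy as np


def sparsify_log_resp(log_resp, nnz_per_row):
    """Keep the nnz_per_row largest entries of each row and renormalize them.

    Hypothetical helper for illustration; not the bnpy implementation.
    log_resp is an (N, K) array of unnormalized log responsibilities.
    """
    log_resp = np.asarray(log_resp, dtype=float)
    N, K = log_resp.shape
    L = min(nnz_per_row, K)

    # Column indices of the L largest entries in each row.
    # argpartition returns distinct indices, so tied values are handled.
    top_idx = np.argpartition(log_resp, K - L, axis=1)[:, K - L:]
    top_vals = np.take_along_axis(log_resp, top_idx, axis=1)

    # Subtract the max of the retained entries only, so exp() cannot overflow.
    top_vals = top_vals - top_vals.max(axis=1, keepdims=True)
    weights = np.exp(top_vals)
    weights /= weights.sum(axis=1, keepdims=True)

    # Scatter the renormalized weights into an otherwise all-zero matrix.
    resp = np.zeros((N, K))
    np.put_along_axis(resp, top_idx, weights, axis=1)
    return resp
```

This sketch returns a dense array with zeros for readability. A sparse row format would avoid storing the zeros, which is presumably where the memory savings mentioned in the commits come from.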