Datasets for mixture models
Real-valued data (for Gaussian likelihoods)
-
Galaxy
Velocities (km/second) of 82 galaxies in a survey of the Corona Borealis region.
http://www.stats4stem.org/r-galaxy-data.html
Source: Roeder, K. (1990) Density estimation with confidence sets exemplified by superclusters and voids in galaxies. Journal of the American Statistical Association, 85, 617–624.
Example analysis: http://projecteuclid.org/download/pdf_1/euclid.aos/1016120364
-
Old Faithful
Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA.
http://www.stat.cmu.edu/~larry/all-of-statistics/=data/faithful.dat
Source (unconfirmed): http://www.jstor.org/stable/2347385
Example analysis: see Bishop's textbook, Pattern Recognition and Machine Learning (PRML).
Possible extension with much more data: http://www.geyserstudy.org/geyser.aspx?pGeyserNo=OLDFAITHFUL
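
Files like the one linked above are plain text, so a small parser is enough to turn them into rows of real-valued observations for a Gaussian likelihood. The following is a minimal sketch, not part of bnpy: `parse_two_column_table` is a hypothetical helper, and it assumes the file is whitespace-delimited with the two measurements as the last two fields of each data line (the exact layout of `faithful.dat` is not verified here). The sample text is illustrative only.

```python
def parse_two_column_table(text):
    """Return a list of (float, float) rows from whitespace-delimited text.

    Hypothetical helper. Lines whose last two tokens are not both numeric
    (headers, comments, blank lines) are skipped, so an optional leading
    row-index column is tolerated.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        try:
            rows.append((float(tokens[-2]), float(tokens[-1])))
        except ValueError:
            # Not a data line under the assumed layout; skip it.
            continue
    return rows


# Illustrative sample in the assumed layout: index, eruption duration, waiting time.
sample_text = """
eruptions waiting
1 3.600 79
2 1.800 54
3 3.333 74
"""

X = parse_two_column_table(sample_text)
```

Each element of `X` is one observation, so `X` can be converted to an N x 2 array before fitting a mixture model. Descriptive header lines that happen to end in two numbers would be misread as data, so inspect the downloaded file before relying on this.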
-
Flea Beetles
Measurements on flea beetles of the genus Chaetocnema, covering three species: concinna (Con), heikertingeri (Hei), and heptapotamica (Hep). The width and angle of the aedeagus were measured for each beetle. The goal of the original study was to form a classification rule to distinguish the three species.
http://www.dm.unibo.it/~simoncin/FleaBeetles.html
Source: Lubischew, A.A. (1962) On the use of discriminant functions in taxonomy. Biometrics, 18, 455–477.
Also found in: Hand, D.J., et al. (1994) A Handbook of Small Data Sets, London: Chapman & Hall, 254–255.