dev_kinan
dev_selin
Author Commit Message Date Builds
17 commits behind dev_kinan.
selin.aydin@rwth-aachen.de
update
selin.aydin@rwth-aachen.de
update
selin.aydin@rwth-aachen.de
delete
selin.aydin@rwth-aachen.de
ignore
selin.aydin@rwth-aachen.de
delete
selin.aydin@rwth-aachen.de
test: excerpt from cooc-matrix with the playlist in the test set; training: pd.Series, index = pid, entry = name
selin.aydin@rwth-aachen.de
more comments
selin.aydin@rwth-aachen.de
binary search for a weight is stopped if no improvement occurs after a step
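The early-stopping idea in this commit can be sketched roughly as follows. This is only an illustration, not the code from the branch: the function name `search_weight`, the `score` callback, the search interval and the halving step size are all assumptions.

```python
from typing import Callable


def search_weight(
    score: Callable[[float], float],
    lo: float = 0.0,
    hi: float = 1.0,
    max_steps: int = 20,
) -> float:
    """Bisection-style search for a weight that maximises score(weight).

    Hypothetical helper: starts in the middle of [lo, hi], probes one
    candidate on each side, and halves the step size after every move.
    """
    best_w = (lo + hi) / 2.0
    best_score = score(best_w)
    step = (hi - lo) / 4.0
    for _ in range(max_steps):
        # Probe one candidate on each side of the current best weight.
        candidates = [w for w in (best_w - step, best_w + step) if lo <= w <= hi]
        if not candidates:
            break
        cand_score, cand = max((score(w), w) for w in candidates)
        if cand_score <= best_score:
            # Early stop: this step brought no improvement.
            break
        best_w, best_score = cand, cand_score
        step /= 2.0
    return best_w
```

Stopping at the first step without improvement saves evaluations, at the cost of possibly returning a coarser weight than a full search would.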
selin.aydin@rwth-aachen.de
contains column number of playlists and playlist name for training set
selin.aydin@rwth-aachen.de
added initial variance threshold
selin.aydin@rwth-aachen.de
variance of the evaluation results of the 10 subsets is considered
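A minimal sketch of what considering the variance across subsets could look like. The function name `summarise_folds`, the use of the population variance and the threshold value are assumptions, not taken from the branch.

```python
from statistics import mean, pvariance
from typing import Sequence, Tuple


def summarise_folds(
    scores: Sequence[float], variance_threshold: float
) -> Tuple[float, float, bool]:
    """Return (mean, variance, is_stable) for per-subset evaluation scores.

    Hypothetical helper: a result counts as stable when the population
    variance of the subset scores does not exceed the threshold.
    """
    avg = mean(scores)
    var = pvariance(scores)
    return avg, var, var <= variance_threshold
```

With 10 subsets, a low variance suggests the mean score is not driven by one or two lucky splits.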
selin.aydin@rwth-aachen.de
Merge branch 'dev_selin2' of https://bitbucket.org/spotifyteam/praktrepo into dev_selin
selin.aydin@rwth-aachen.de
delete
selin.aydin@rwth-aachen.de
incomplete CV-procedure
selin.aydin@rwth-aachen.de
ignore - trying to repair this branch
selin.aydin@rwth-aachen.de
incomplete cross-validation procedure
selin.aydin@rwth-aachen.de
added some comments
selin.aydin@rwth-aachen.de
delete
selin.aydin@rwth-aachen.de
training set can now be generated (outputs a csv file with format: pid, name, [tracks])
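The output format named in this commit (pid, name, [tracks]) could be written along these lines. This is a sketch only: the function name `write_training_set`, the dictionary layout of a playlist and the choice to store the track list as a JSON string are assumptions.

```python
import csv
import io
import json
from typing import Dict, Iterable, TextIO


def write_training_set(playlists: Iterable[Dict], out: TextIO) -> None:
    """Write one row per playlist: pid, name, and the track list.

    Assumption: the track list is serialised as a JSON array inside a
    single CSV field, so the file keeps exactly three columns.
    """
    writer = csv.writer(out)
    writer.writerow(["pid", "name", "tracks"])
    for playlist in playlists:
        writer.writerow(
            [playlist["pid"], playlist["name"], json.dumps(playlist["tracks"])]
        )
```

Reading the file back with `csv.reader` and `json.loads` on the third column recovers the original track list.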
selin.aydin@rwth-aachen.de
creating a training_set
Kinan Halloum
get_score is now much more efficient. In addition, similar playlist names are cached: generating the most similar playlists for a given playlist name takes ~100 ms, and computing the similarity of a track to a cached playlist name takes ~1 ms.
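The caching pattern described here can be illustrated with `functools.lru_cache`. Everything below is a toy stand-in rather than the real get_score: the data in `PLAYLIST_NAMES`, the word-overlap similarity and the names `most_similar_playlists` and `get_score_sketch` are hypothetical.

```python
from functools import lru_cache
from typing import FrozenSet, Tuple

# Hypothetical toy data: playlist id -> playlist name.
PLAYLIST_NAMES = {1: "rock classics", 2: "classic rock", 3: "chill vibes"}


def _name_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the word sets (stand-in for the real measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


@lru_cache(maxsize=None)
def most_similar_playlists(name: str, top_k: int = 2) -> Tuple[int, ...]:
    """Expensive step: rank all known playlists by name similarity."""
    ranked = sorted(
        PLAYLIST_NAMES,
        key=lambda pid: (-_name_similarity(name, PLAYLIST_NAMES[pid]), pid),
    )
    return tuple(ranked[:top_k])


def get_score_sketch(track_playlists: FrozenSet[int], name: str) -> float:
    """Cheap step: share of similar playlists that contain the track."""
    similar = most_similar_playlists(name)  # served from cache after first call
    if not similar:
        return 0.0
    return sum(pid in track_playlists for pid in similar) / len(similar)
```

The ranking runs once per playlist name; scoring further tracks against the same name only pays for the cheap lookup.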
selin.aydin@rwth-aachen.de
name_sim*.py files determine how well a track fits a playlist name
selin.aydin@rwth-aachen.de
removed some dead code parts
selin.aydin@rwth-aachen.de
Code I used for generating the 7k recommendations
selin.aydin@rwth-aachen.de
no message
selin.aydin@rwth-aachen.de
Cleaned the code a bit
selin.aydin@rwth-aachen.de
Code is not cleaned yet, but the tracks for the playlists with zero tracks are generated (first 1000 playlists out of 1000000)
selin.aydin@rwth-aachen.de
Changed the analyzer of the TfidfVectorizer to character n-grams, so that names are compared by character n-grams and not by words. Currently trying to find out how to handle emojis.
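Why character n-grams help with playlist names can be seen in a small standard-library sketch. It does not use TfidfVectorizer (in scikit-learn the corresponding switch is `analyzer="char"` together with `ngram_range`); it uses plain n-gram counts with cosine similarity as a simplified stand-in, and all function names are assumptions.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of the lower-cased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of whole words, for comparison."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0
```

For "workout" and "work out", `word_overlap` finds no shared word, while the trigram vectors still share "wor", "ork" and "out", so the character-level similarity stays clearly above zero.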
selin.aydin@rwth-aachen.de
tidying up my branch
selin.aydin@rwth-aachen.de
In the previous commit it was only possible to use existing playlist names, not newly invented ones.
selin.aydin@rwth-aachen.de
- Creating a list of playlist names reusing Kinan's code (approx. 30 min)
- Computing the similarity of a playlist name to all other playlist names (in 10s :D)
- Returns dataframe with pids of most similar playlist names