TopicResponse

TopicResponse discovers topics with difficulty levels inferred by combining NMF-based topic modelling and Rasch modelling.

TopicResponse is distributed as open source under the GNU General Public License (version 3 or later).

Relevant Publications

TopicResponse implements the TopicResponse algorithm introduced in the following paper:

TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs
Jiazhen He, Benjamin I. P. Rubinstein, James Bailey, Rui Zhang, Sandra Milligan
arXiv:1607.08720, 2016

Requirements

Python 2.7

The implementation is based on the Nimfa Python module for nonnegative matrix factorisation. Our implementation of the TopicResponse algorithm is in

nimfa/methods/factorization/topicresponse.py

which is a new file added to the original nimfa module.

You can either install Nimfa 1.0 and copy the above file to the corresponding directory, or use our local repository of nimfa.

Preparing input data

The input 'data' dictionary includes:

  • data['V']: word-student matrix of normalised tf-idf weights

  • data['Hideal']: the ideal number of topics each student participated in, computed using the formula for H_ideal given in our paper

  • data['words']: word list, used for printing topics
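
As a rough illustration of the layout described above, the following sketch builds such a dictionary from a few toy token lists. It is not the preprocessing used in the paper: the function name build_data, the tf-idf variant (tf * log(N / df)), and the per-student unit-length normalisation are all assumptions made for this example. data['Hideal'] is left as a placeholder because its formula is given in the paper, not here. The matrix is a plain list of lists and would need converting to whatever array type the loader expects.

```python
import math

def build_data(posts_by_student):
    """Build a toy 'data' dictionary from {student_id: [token, ...]}.

    Assumptions (not taken from the repository): tf-idf is tf * log(N / df),
    and each student column is scaled to unit Euclidean length.
    """
    students = sorted(posts_by_student)
    words = sorted({w for tokens in posts_by_student.values() for w in tokens})
    n_students = len(students)

    # Document frequency: number of students who used each word.
    df = {w: sum(1 for s in students if w in posts_by_student[s]) for w in words}

    # V has one row per word and one column per student.
    V = [[0.0] * n_students for _ in words]
    for j, s in enumerate(students):
        tokens = posts_by_student[s]
        for i, w in enumerate(words):
            tf = tokens.count(w)
            V[i][j] = tf * math.log(float(n_students) / df[w])
        # Normalise this student's column to unit length (skip all-zero columns).
        norm = math.sqrt(sum(V[i][j] ** 2 for i in range(len(words))))
        if norm > 0:
            for i in range(len(words)):
                V[i][j] /= norm

    return {
        'V': V,          # word-student matrix
        'Hideal': None,  # placeholder: compute with the H_ideal formula from the paper
        'words': words,  # row labels of V, used when printing topics
    }

toy = build_data({
    's1': ['rasch', 'model', 'rasch'],
    's2': ['topic', 'model'],
    's3': ['topic', 'nmf', 'nmf'],
})
```

The rows of toy['V'] follow the order of toy['words'], and the columns follow the sorted student ids.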

Note: The data directory contains only toy data, as we cannot make our data public for confidentiality and ethical reasons.

Running TopicResponse

Run a single test with:

python run_single.py 

Options:

 -d, --dataset, default='do001'                 : course name
 -t, --num_topic, default=10                    : the number of topics
 -wi, --lambda_w_init, default=0.1              : regularisation parameter for ||W||
 -hi, --lambda_h_init, default=1.0              : regularisation parameter for H binary constraint
 -hideali, --lambda_h_ideal_init, default=1.0   : regularisation parameter for H_ideal constraint
 -rasch, --lambda_rasch, default=0.1            : regularisation parameter for rasch
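
To make the option list concrete, here is a minimal argparse sketch that mirrors the flags and defaults listed above. It is a reconstruction for illustration, not the contents of run_single.py: the function name build_parser and the value types (int for the number of topics, float for the regularisation parameters) are assumptions inferred from the defaults.

```python
import argparse

def build_parser():
    """Parser mirroring the documented options (illustrative sketch only)."""
    parser = argparse.ArgumentParser(
        description='Run a single TopicResponse test (illustrative sketch).')
    parser.add_argument('-d', '--dataset', default='do001',
                        help='course name')
    parser.add_argument('-t', '--num_topic', type=int, default=10,
                        help='the number of topics')
    parser.add_argument('-wi', '--lambda_w_init', type=float, default=0.1,
                        help='regularisation parameter for ||W||')
    parser.add_argument('-hi', '--lambda_h_init', type=float, default=1.0,
                        help='regularisation parameter for H binary constraint')
    parser.add_argument('-hideali', '--lambda_h_ideal_init', type=float, default=1.0,
                        help='regularisation parameter for H_ideal constraint')
    parser.add_argument('-rasch', '--lambda_rasch', type=float, default=0.1,
                        help='regularisation parameter for rasch')
    return parser

# Equivalent to: python run_single.py -d do001 -t 20 -rasch 0.5
args = build_parser().parse_args(['-d', 'do001', '-t', '20', '-rasch', '0.5'])
```

Options that are not given on the command line keep the defaults from the list above, so here args.lambda_w_init stays at 0.1.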

Results

  • *result.txt: contains the values of each part of the objective function, item difficulty (beta), and item infit.

  • *topic.txt: contains the generated topics.
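
For readers unfamiliar with the quantities reported in *result.txt, the sketch below shows the standard dichotomous Rasch definitions of item difficulty (beta) and item infit, with each topic playing the role of an item. The function names rasch_prob and item_infit are made up for this example, and the code in topicresponse.py may compute these values differently; treat this as background under those assumptions.

```python
import math

def rasch_prob(theta, beta):
    """Rasch probability that a student with ability theta responds
    positively to an item (here, a topic) with difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def item_infit(responses, thetas, beta):
    """Information-weighted mean-square fit (infit) for one item.

    responses: observed 0/1 responses, one per student
    thetas:    student abilities, in the same order
    beta:      difficulty of the item
    """
    squared_residuals = 0.0
    variances = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, beta)
        squared_residuals += (x - p) ** 2  # squared difference from expectation
        variances += p * (1.0 - p)         # model variance of the response
    return squared_residuals / variances

# A student whose ability equals the item difficulty responds
# positively with probability 0.5.
p_equal = rasch_prob(0.0, 0.0)
```

Under these definitions, a higher beta means fewer students are expected to respond positively, and infit values close to 1 indicate that the observed responses vary about as much as the Rasch model predicts.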

License

TopicResponse is based on the following package. The modifications are Copyright (C) 2016-17 Jiazhen He

nimfa - A Python Library for Nonnegative Matrix Factorization Techniques Copyright (C) 2011-2012 Marinka Zitnik and Blaz Zupan

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.