Commits

Anonymous committed cf5d606

Files changed (1)

 
 === Healthcare Reform (HCR) ===
 
-Run the following command to 
+Run the following command to preprocess the training portion of our Healthcare Reform dataset (used only to train a supervised model for comparison with the semi-supervised models of interest):
+
+{{{
+$ updown preproc-hcr data/hcr/train/orig/hcr-train.csv src/main/resources/eng/dictionary/stoplist.txt > data/hcr/train/hcr-train-features.txt
+}}}
+
+You should see the following output:
+{{{
+Preprocessed 488 tweets. Fraction positive: 0.43237704
+}}}
+
+Run the following command to preprocess the development portion of our Healthcare Reform dataset:
+
+{{{
+$ updown preproc-hcr data/hcr/dev/orig/hcr-dev.csv src/main/resources/eng/dictionary/stoplist.txt > data/hcr/dev/hcr-dev-features.txt
+}}}
+
+You should see the following output:
+{{{
+Preprocessed 534 tweets. Fraction positive: 0.3220974
+}}}
+
+Run the following command to preprocess the test portion of our Healthcare Reform dataset:
+
+{{{
+$ updown preproc-hcr data/hcr/test/orig/hcr-test.csv src/main/resources/eng/dictionary/stoplist.txt > data/hcr/test/hcr-test-features.txt
+}}}
+
+You should see the following output:
+{{{
+Preprocessed 396 tweets. Fraction positive: 0.38636363
+}}}
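
As a quick sanity check on the three expected outputs above, the reported fractions can be turned back into whole tweet counts. This is only a sketch and not part of updown: it assumes "Fraction positive" means positive tweets divided by total tweets, and the names `REPORTED`, `implied_positive_count`, and `summarize` are hypothetical helpers introduced for illustration. The positive counts it derives are inferred from the printed figures, not stated by the tool.

```python
# Figures copied from the expected outputs above: (split, tweets, fraction positive).
REPORTED = [
    ("train", 488, 0.43237704),
    ("dev", 534, 0.3220974),
    ("test", 396, 0.38636363),
]


def implied_positive_count(total, fraction):
    """Return the whole number of positive tweets implied by a reported fraction.

    Assumption: fraction == positives / total, rounded for printing.
    """
    return round(total * fraction)


def summarize(reported):
    """Return (split, total, implied positives, recomputed fraction) per split."""
    rows = []
    for split, total, fraction in reported:
        positives = implied_positive_count(total, fraction)
        rows.append((split, total, positives, positives / total))
    return rows


if __name__ == "__main__":
    for split, total, positives, fraction in summarize(REPORTED):
        print(f"{split}: {positives} of {total} positive ({fraction:.4f})")
```

If the numbers you see differ from the expected outputs, recomputing the fraction this way can help tell a rounding difference apart from a genuinely different input file.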
 
 === Syntax highlighting ===