This assignment counts for 20% of the course grade.
Assignments turned in after the deadline but before April 29 are subject to a 30% grade penalty.
In this assignment you will write perceptron classifiers (vanilla and averaged) to identify hotel reviews as either true or fake, and either positive or negative. You may use the word tokens as features, or any other features you can devise from the text. The assignment will be graded based on the performance of your classifiers, that is, how well they perform on unseen test data compared to the performance of a reference classifier.
The training and development data are the same as for Coding Exercise 2, and are available as a compressed ZIP archive on Blackboard. The uncompressed archive contains the following files:
train-labeled.txt containing labeled training data with a single training instance (hotel review) per line (total 960 lines). The first 3 tokens in each line are: a unique identifier; a label True or Fake; a label Pos or Neg. These are followed by the text of the review.
dev-text.txt with unlabeled development data, containing just the unique identifier followed by the text of the review (total 320 lines).
dev-key.txt with the corresponding labels for the development data, to serve as an answer key.
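As a minimal sketch of reading these files, the following assumes that tokens are separated by whitespace and that the identifier is the first token on each line. The function names parse_labeled_line and parse_unlabeled_line are hypothetical, and the sample line in the usage note is made up rather than taken from the data.

```python
def parse_labeled_line(line):
    """Split one line of train-labeled.txt into its four parts.

    Assumes whitespace-separated tokens in this order:
    identifier, True/Fake, Pos/Neg, then the review text.
    """
    identifier, truth, sentiment, text = line.strip().split(maxsplit=3)
    return identifier, truth, sentiment, text


def parse_unlabeled_line(line):
    """Split one line of dev-text.txt into (identifier, review text)."""
    identifier, text = line.strip().split(maxsplit=1)
    return identifier, text
```

For example, parse_labeled_line("abc123 True Pos The room was clean.") yields the identifier, the two labels, and the review text as four separate strings.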
The perceptron algorithms appear in Hal Daumé III, A Course in Machine Learning (v. 0.99 draft), Chapter 4: The Perceptron.
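The two training procedures can be sketched together, since the averaged perceptron performs the same updates as the vanilla one and only keeps extra running totals. This is a sketch under stated assumptions, not the reference implementation: features are word counts from lowercased whitespace tokens, labels are encoded as +1 and -1, the names featurize and train_perceptrons are hypothetical, and the default number of epochs is arbitrary. Shuffling the examples between epochs is commonly recommended and is omitted here for brevity. Because the assignment has two binary decisions (True/Fake and Pos/Neg), one option is to call this function once per decision.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words features: lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def train_perceptrons(examples, epochs=10):
    """Train vanilla and averaged perceptrons from the same sequence of updates.

    examples: list of (features, y) pairs, where features maps feature -> count
    and y is +1 or -1.
    Returns ((w, b), (w_avg, b_avg)).
    """
    w, b = {}, 0.0       # current weights and bias (vanilla model)
    u, beta = {}, 0.0    # counter-weighted sums used for averaging
    c = 1                # counts every example seen, updated or not
    for _ in range(epochs):
        for features, y in examples:
            activation = b + sum(w.get(f, 0.0) * v for f, v in features.items())
            if y * activation <= 0:  # mistake (or zero margin): update
                for f, v in features.items():
                    w[f] = w.get(f, 0.0) + y * v
                    u[f] = u.get(f, 0.0) + y * c * v
                b += y
                beta += y * c
            c += 1
    # Averaged parameters are recovered from the cached sums.
    w_avg = {f: w[f] - u[f] / c for f in w}
    b_avg = b - beta / c
    return (dict(w), b), (w_avg, b_avg)
```

Keeping the cached sums u and beta avoids storing or summing a full weight vector after every example, which is the main practical reason for computing the average this way.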
You will write two programs: perceplearn.py will learn perceptron models (vanilla and averaged) from the training data, and percepclassify.py will use the models to classify new data. If using Python 3, you will name your programs perceplearn3.py and percepclassify3.py. The learning program will be invoked in the following way:
> python perceplearn.py /path/to/input
The argument is a single file containing the training data; the program will learn perceptron models, and write the model parameters to two files: vanillamodel.txt for the vanilla perceptron, and averagedmodel.txt for the averaged perceptron.
The format of the model files is up to you, but they should contain sufficient information for percepclassify.py to successfully label new data.
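Since the model format is left open, one possible choice is indented JSON, which stores the weights and bias of each classifier in a form the classification program can read back directly. The sketch below is written for Python 3 and is only an illustration: the function names save_model and load_model, the task names "truthfulness" and "sentiment", and the "bias"/"weights" keys are all hypothetical.

```python
import json


def save_model(path, classifiers):
    """Write model parameters to a text file as indented JSON.

    classifiers: dict mapping a task name (for example "truthfulness")
    to {"bias": float, "weights": {feature: weight}}.
    """
    with open(path, "w", encoding="utf-8") as f:
        json.dump(classifiers, f, indent=2, sort_keys=True, ensure_ascii=False)


def load_model(path):
    """Read back a model written by save_model."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

The learning program would call save_model twice, once with the vanilla parameters for vanillamodel.txt and once with the averaged parameters for averagedmodel.txt.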
The classification program will be invoked in the following way:
> python percepclassify.py /path/to/model /path/to/input
The first argument is the path to the model file (vanillamodel.txt or averagedmodel.txt), and the second argument is the path to the test data file; the program will read the parameters of a perceptron model from the model file, classify each entry in the test data, and write the results to a text file called percepoutput.txt in the same format as the answer key.
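The classification step can be sketched as follows. This sketch makes several assumptions: the model is a dict with hypothetical "truthfulness" and "sentiment" entries, each holding a "bias" and a "weights" mapping; features are lowercased whitespace tokens; and each output line consists of the identifier followed by the two labels, which should be checked against dev-key.txt before relying on it. The names predict and classify_file are hypothetical.

```python
from collections import Counter


def predict(weights, bias, features, positive_label, negative_label):
    """Return positive_label if the perceptron activation is above zero."""
    activation = bias + sum(weights.get(f, 0.0) * v for f, v in features.items())
    return positive_label if activation > 0 else negative_label


def classify_file(model, input_path, output_path="percepoutput.txt"):
    """Label every review in input_path and write one result line per review."""
    with open(input_path, encoding="utf-8") as fin, \
            open(output_path, "w", encoding="utf-8") as fout:
        for line in fin:
            parts = line.strip().split(maxsplit=1)
            if not parts:
                continue  # skip blank lines
            identifier = parts[0]
            text = parts[1] if len(parts) > 1 else ""
            features = Counter(text.lower().split())
            t = model["truthfulness"]
            s = model["sentiment"]
            truth = predict(t["weights"], t["bias"], features, "True", "Fake")
            sentiment = predict(s["weights"], s["bias"], features, "Pos", "Neg")
            fout.write(f"{identifier} {truth} {sentiment}\n")
```

Features that never appeared in training have no entry in the weights and contribute nothing to the activation, so unseen words in the test data are simply ignored.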
All submissions will be completed through Vocareum; please consult the instructions for how to use Vocareum.
Multiple submissions are allowed; only the final submission will be graded. Each time you submit, a submission script trains your model on the training data, runs your classifier on the development data, and reports the results. Do not include the data in your submission: the submission script reads the data from a central directory, not from your personal directory. You are encouraged to submit early and often in order to iron out any problems, especially issues with the format of the final output.
The performance of your classifier will be measured automatically; failure to format your output correctly may result in very low scores, which will not be changed.
For full credit, make sure to submit your assignment well before the deadline. The time of submission recorded by the system is the time used for determining late penalties. If your submission is received late, whatever the reason (including equipment failure and network latencies or outages), it will incur a late penalty.
If you have any issues with Vocareum with regard to logging in, submission, code not executing properly, etc., please contact Siddharth.
After the due date, we will train your model on a combination of the training and development data, run your classifier on unseen test data twice (once with the vanilla model, and once with the averaged model), and compute the F1 score of your output for each of the four classes (true, fake, positive, and negative). Your grade will be based on the performance of your classifier. We will calculate the mean of the four F1 scores and scale it to the performance of a perceptron classifier developed by the instructional staff. For example, if that classifier has F1=0.8, then a score of 0.8 will receive full credit, and a score of 0.72 will receive 90% credit. Your vanilla perceptron will be compared to a reference vanilla perceptron, and your averaged perceptron will be compared to a reference averaged perceptron.
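To check your own results on the development data, the scoring arithmetic can be sketched as below. This is not the official grading script: the function names are hypothetical, F1 is taken to be 0 when a class has no true positives, and the sketch does not cap the scaled value because the text above does not say how scores higher than the reference are treated.

```python
def f1_score(gold, predicted, label):
    """F1 for one class, from parallel lists of gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # assumption: no true positives means F1 of zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(gold_truth, pred_truth, gold_sentiment, pred_sentiment):
    """Mean of the four per-class F1 scores (True, Fake, Pos, Neg)."""
    scores = [
        f1_score(gold_truth, pred_truth, "True"),
        f1_score(gold_truth, pred_truth, "Fake"),
        f1_score(gold_sentiment, pred_sentiment, "Pos"),
        f1_score(gold_sentiment, pred_sentiment, "Neg"),
    ]
    return sum(scores) / len(scores)


def scaled_credit(score, reference_score):
    """Fraction of credit: your mean F1 relative to the reference mean F1."""
    return score / reference_score
```

With the numbers from the example above, scaled_credit(0.72, 0.8) evaluates to approximately 0.9, matching the 90% credit figure.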
perceplearn.py on the training data, running on a MacBook Pro from 2016.