Extra Credit
For this MP, the algorithms are fixed. Do not change them.
Your main opportunity for extra credit is to optimize the smoothing
methods to improve performance. Note that different groups of
probabilities (such as transitions vs. emissions) may call for
different smoothing methods. (See e.g. the lecture 21 summary for
some ideas.)
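As one concrete illustration, here is a minimal sketch of add-alpha
(Laplace) smoothing with a separate constant per probability group;
the helper name, data layout, and toy counts are assumptions for this
example, not part of the MP's API:

```python
def laplace_prob(counts, context, outcome, n_outcomes, alpha):
    """P(outcome | context) with add-alpha smoothing.

    counts maps context -> {outcome: count}; n_outcomes is the size of
    the outcome space (number of tags for transitions, vocabulary size
    for emissions).
    """
    row = counts.get(context, {})
    total = sum(row.values())
    return (row.get(outcome, 0) + alpha) / (total + alpha * n_outcomes)

# Toy counts, just to make the sketch runnable.
trans_counts = {"DET": {"NOUN": 8, "ADJ": 2}}
emit_counts = {"NOUN": {"dog": 3, "cat": 1}}

# Transition tables are dense (few tags), so a tiny alpha often suffices;
# emission tables are sparse (many word types) and usually need a
# different setting.
print(laplace_prob(trans_counts, "DET", "VERB", n_outcomes=12, alpha=0.01))
print(laplace_prob(emit_counts, "NOUN", "platypus", n_outcomes=10000, alpha=0.5))
```

The point is that the alpha that works best for one group is unlikely
to be best for the other, so treat them as separate knobs.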
You may also wish to do more extensive experimentation. For example:
- Run the tagger on
(slightly) mismatched training and test data. E.g. how well can
you get it to perform if trained on Brown but tested on
MASC?
- Separately report performance on test sentences that do and do not
  contain words unseen in the training data (see the sketch after
  this list).
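For that second experiment, here is a minimal sketch of splitting the
test set; it assumes sentences are lists of (word, tag) pairs, which
may differ from the MP's actual data layout:

```python
def split_by_unseen(test_sentences, train_vocab):
    """Partition test sentences by whether they contain an unseen word."""
    with_unseen, without_unseen = [], []
    for sentence in test_sentences:
        if any(word not in train_vocab for word, _tag in sentence):
            with_unseen.append(sentence)
        else:
            without_unseen.append(sentence)
    return with_unseen, without_unseen

# Hypothetical usage:
# train_vocab = {word for sent in train_data for word, _tag in sent}
# hard, easy = split_by_unseen(test_data, train_vocab)
# ...then tag hard and easy separately and report both accuracies.
```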
You could also try automatically tuning key smoothing parameters,
using a loop that systematically tries a variety of settings.
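A minimal sketch of such a tuning loop follows; train_tagger and
accuracy are hypothetical stand-ins for whatever training and scoring
functions your mp5.py actually provides:

```python
import itertools

def tune_alphas(train_data, dev_data, train_tagger, accuracy,
                trans_grid=(1e-4, 1e-3, 1e-2, 1e-1),
                emit_grid=(1e-5, 1e-4, 1e-3, 1e-2)):
    """Return the smoothing constants with the best accuracy on dev_data."""
    best_params, best_acc = None, -1.0
    for a_trans, a_emit in itertools.product(trans_grid, emit_grid):
        tagger = train_tagger(train_data, alpha_trans=a_trans, alpha_emit=a_emit)
        acc = accuracy(tagger, dev_data)
        if acc > best_acc:
            best_params, best_acc = (a_trans, a_emit), acc
    return best_params, best_acc
```

Tune against held-out development data rather than the test set, or
the accuracy you report will be optimistic.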
You could try using the form of an unseen word (e.g. its length, its
last few characters) to make a better guess at its part of speech.
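One common way to do this (in the spirit of TnT-style suffix models)
is to estimate tag distributions for word suffixes from rare training
words, which resemble the unseen words you will meet at test time. A
minimal sketch, again assuming sentences are lists of (word, tag)
pairs:

```python
from collections import Counter, defaultdict

def build_suffix_model(train_sentences, max_suffix=3, rare_cutoff=10):
    """Count tag frequencies per word suffix, using only rare words."""
    word_freq = Counter(w for sent in train_sentences for w, _t in sent)
    suffix_tags = defaultdict(Counter)
    for sent in train_sentences:
        for word, tag in sent:
            if word_freq[word] <= rare_cutoff:
                for k in range(1, max_suffix + 1):
                    suffix_tags[word[-k:]][tag] += 1
    return suffix_tags

def guess_tag_dist(word, suffix_tags, max_suffix=3):
    """Back off from the longest matching suffix to shorter ones."""
    for k in range(max_suffix, 0, -1):
        row = suffix_tags.get(word[-k:])
        if row:
            total = sum(row.values())
            return {tag: count / total for tag, count in row.items()}
    return {}  # no matching suffix; fall back to your default unseen-word guess
```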
If you experiment with several configurations, you may wish to add
extra flag options to the main mp5 function (a sketch follows the
list below).
If you do this:
- Submit your revised mp5.py
- Make sure that the version of mp5.py you submit runs your best
  configuration by default, i.e. the one you'd like the autograder
  to use.
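Here is a minimal sketch of how such flags might look, using argparse
(BooleanOptionalAction needs Python 3.9+); the flag names and defaults
are made up for illustration, and how the values reach your tagger
depends on how your mp5.py is organized:

```python
import argparse

def parse_flags():
    # Defaults should encode your best configuration, so that running
    # mp5.py with no extra flags (as the autograder will) uses it.
    parser = argparse.ArgumentParser()
    parser.add_argument("--alpha_trans", type=float, default=1e-2,
                        help="smoothing constant for transition probabilities")
    parser.add_argument("--alpha_emit", type=float, default=1e-3,
                        help="smoothing constant for emission probabilities")
    parser.add_argument("--suffix_model", action=argparse.BooleanOptionalAction,
                        default=True,
                        help="guess tags for unseen words from their suffixes")
    return parser.parse_args()
```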