Report Checklist
Your report should briefly describe your implemented solution and fully answer the questions for every part of the assignment.
Your description should focus on the most "interesting" aspects of your solution, i.e., any non-obvious implementation choices and parameter settings, and what you have found to be especially important for getting good performance.
Feel free to include pseudocode or figures if they are needed to clarify your approach.
Your report should be self-contained, and ideally it should be possible for us to understand your solution without having to run your source code.
WARNING: You will not get credit for any solutions that you have obtained but not included in your report!
For example, if you implement the bigram model but do not mention it in your report, you will not receive extra credit for it.
Your report must be a formatted PDF document.
Pictures and/or example outputs should be incorporated into the document.
Exception: items that are very large or unsuitable for inclusion in a PDF document (e.g., videos or animated GIFs) may be put on the web, with a URL included in your report.
For full credit, in addition to the algorithm descriptions, your report should include the following.
- State the accuracy, recall, and F1 scores on both the training and development sets.
- State your optimal Laplace smoothing parameter and how you chose it. Mention the other parameter values you tried and the impact they had on your classification accuracy.
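As a rough illustration of how the numbers for the two items above might be produced, here is a minimal Python sketch. It is not part of the provided assignment code: the function names `binary_metrics` and `sweep_smoothing`, the label convention (1 = spam, treated as the positive class), and the `dev_accuracy` callback are all assumptions made for this example. Precision is computed as well, since F1 is defined from precision and recall.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


def binary_metrics(gold: Sequence[int], predicted: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for one set of predictions.

    Assumption: labels are integers and `positive` marks the spam class.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    # Guard every ratio against an empty denominator.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def sweep_smoothing(candidates: Iterable[float],
                    dev_accuracy: Callable[[float], float]
                    ) -> Tuple[float, List[Tuple[float, float]]]:
    """Evaluate each candidate smoothing value and return the best one.

    `dev_accuracy` is a hypothetical callback: given a Laplace smoothing
    value, it should train on the training set and return accuracy on
    the development set.
    """
    results = [(alpha, dev_accuracy(alpha)) for alpha in candidates]
    best_alpha, _ = max(results, key=lambda pair: pair[1])
    return best_alpha, results
```

Calling `binary_metrics` once on the training set and once on the development set gives the scores to report, and the `results` list returned by `sweep_smoothing` can be included as a small table showing how each smoothing value you tried affected development accuracy.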
Additionally, provide answers to the following questions:
- We have said that this algorithm is called "naive" Bayes. What exactly is so naive about it?
- Naive Bayes can classify spam quite nicely, but can you imagine classification problems for which naive Bayes performs poorly? Give an example, and explain why naive Bayes may perform poorly.
Extra credit
We reserve the right to give bonus points for any challenging or creative solutions that you implement.
This includes, but is not restricted to, the extra credit suggestion given above.