Assignment Description

The algorithms and data structures presented in CS 225 are chosen to be a strong but broad foundation of CS knowledge. As a team of two, go beyond the scope of the class to propose, implement, and benchmark a data structure or algorithm which is not covered in the class.

You will only receive credit if you successfully complete all of the following components:

  1. A project proposal
  2. Weekly development logs and mid-project checkin meeting
  3. A functional code base with tests for correctness
  4. A final report (and presentation) that proves correctness, benchmarks performance, and estimates the runtime efficiency of your method.

You should thoroughly read through this page before deciding if you will participate in an extra credit project!

All extra credit projects will be teams of 2! There are no exceptions for individuals or larger groups of any kind.

Team Formation

To participate in the CS 225 Project, you must create your own team. This semester team formation will be done using Prairielearn. One member of the team should create a new group using an UIUC-appropriate group name and the rest can join the project with the generated join code.

You will not be able to manually leave a final project team once created on Prairielearn. Be sure you and your partner are on the same page about your project (and who is creating vs who is joining) before starting!

Project Proposal (Approval by November 1st)

Your proposed final project must be the reproduction of a known (named) algorithm or data structure not covered in this or previous CS/ECE courses. Some suggested algorithms and data structures can be found under the ‘Project Goals’ link but you can also propose your own provided there is a peer-reviewed paper or textbook describing the method in detail. Due to the difficulty in manually grading proposals, each team is only guaranteed one chance to propose a project. Please make sure you have all required components before submitting.

The proposal itself has four parts:

  1. A GitHub repository. Your repository should be private for the duration of the course but must include both course instructors (and eventually your project mentor) as contributors. (The Prairielearn submission has the required github account information for course instructors). It is up to you to maintain your repository but it should be based on the standard structure presented in the example project.

  2. A processed dataset. Your algorithm must be benchmarked on a collection of datasets of different sizes. Accordingly, inside your GitHub repo’s data directory, you must have a collection of either .txt or .csv-formatted files containing the processed dataset you plan to use. While you can add additional datasets during the course of the project, the submitted proposal must contain enough already processed and formatted data to complete the project. If your dataset is too large to store in your GitHub repo, a link should be provided to your processed dataset.

  3. A set of tests to prove algorithm correctness. Writing tests before writing your code can be very beneficial for formalizing exactly what your code should be doing. Accordingly, inside your GitHub repo’s code/tests directory, you must have a single file containing at least three written tests that would (once passed) prove your algorithm works on at least one non-trivial (non-empty) input dataset. You are not expected to write all tests ahead of time but the format should match that of the provided tests in every CS 225 assignment. DO NOT WRITE TESTS BASED ON THE OUTPUT OF YOUR CODE AS A CIRCULAR WAY OF ‘PROVING’ CORRECTNESS. THE POINT OF TESTS IS TO PROVE YOUR OUTPUT IS CORRECT, NOT THAT IT IS DETERMINISTIC

  4. A written proposal. In no more than a few paragraphs and in your own words, describe what algorithm or data structure you will be implementing. Your proposal must have all of the following details:

    As you write these, consider the following:

     * Do you have to do anything to convert your csv input for the algorithm described? (Ex: A graph algorithm would require making the input into a graph.) 
     * Does your algorithm need more than just a dataset as input? (Ex: A* search requires a heuristic algorithm. If you choose to do A*, what are some possible heuristics you might use?)
     * If your algorithm works in stages, consider writing a separate function for each stage. (Ex: Many string algorithms have a preprocessing step. If you do one, how does the output of one function lead into another?) 
    

Project proposals will only be evaluated if they precisely match the expected format and contain all required information.

Development Log (Done weekly after project approval)

A successful final project is built slowly over many weeks not thrown together at the last minute. To incentivize good project pacing and to let your project mentor stay informed about the status of your work, each week you are required to submit a development log detailing:

  1. What goals you had set for the week and whether they were accomplished or not
  2. What specific tasks each member of your team accomplished in the week
  3. What problems you encountered (if any) that prevented you from meeting your goals
  4. What you plan to accomplish next week

The development log will be graded for completion, detail, and honesty – not progress. It is much better to truthfully evaluate the work you completed in a week then lie to make the project sound further along then it really is. It is totally acceptable to have an entry that says you tried nothing and accomplished nothing. However if every week starts to say that, both you and your project mentor will be able to identify the issue before it becomes impossible to fix.

Mid-Project Checkin (Meet before November 20th)

Both members of your team are required to meet with your project mentor for a check-in meeting at least once before the end of the project. You do not need to prepare a presentation but should come prepared to summarize your progress as well as have a frank discussion about any issues or concerns you have encountered as a team or as an individual team member. The goal here is to ensure that forward progress is being made and to address any issues that are impeding progress while there is still time to correct and recover. To that end, you should be up front and honest about your current progress.

That said, you should try to schedule your meeting once you have what you believe is a functioning version of your proposed algorithm (but before you have done your full benchmarking). You will be expected to explain and demonstrate the tests you have written proving that the algorithm works.

Final Project Deliverables

There are four main deliverables for this final project:

  1. A functional code-base. Your code must either work on the current semester’s docker container or with special arrangements with your mentor in a system that you agreed on. It must run with no errors using simple command line arguments and have loosely reproducable runtimes.

  2. A descriptive README. In addition to the code itself, you must include a human-readable README.md which describes:

  3. A written report. In addition to your code, your Github repository must contain a file final_report.pdf file which describes:

  4. A final presentation. Alongside your final report, you should submit a short video (10 minutes or less) describing your project. Your presentation should include slides or other visual aids and include the following content:

    In addition to quantitative results, your conclusions should also end with some individual thoughts you had about the project. What did you learn, what did you like or didn’t like, and what would you explore or implement next if given more time?

    To submit your final project video, you may either include it on Github or include a direct link to the video on your team Github. Videos can be hosted through Zoom cloud recordings, Youtube, Google drive, etc…