Final Project

Ferocious Final Projects

Due: Dec 12, 11:59 PM

Assignment Description

In a team of three or four students, you will propose and produce a final project involving some manner of data analysis using graph algorithms. As a team, you have considerable freedom to choose a project of interest to you – while we have some suggested projects through the Project Goals links, you are strongly encouraged to propose your own.

NOTE: Due to the structure of Prairielearn, regrade submissions will be done using a second assignment with a single deadline. The regrade penalty is the larger of the number of weeks since the submission was originally due and the number of times your mentor has officially graded it. For example, if you submit a proposal regrade once, two weeks after the due date, it receives a double late penalty. If you submit three drafts of your proposal in one week, the submissions receive a single, double, and triple late penalty, respectively. You must notify your project mentor when you have submitted something for a regrade.

Team Formation (Due October 28)

To participate in the CS 225 Final Project, you must create your own team of three or four students. This semester, team formation (and most deliverable submissions) will be done using Prairielearn. Final project teams will be fixed on October 28th, and all students who are not in a full team will instead take the final exam. If you would rather take the final, nothing on this page is relevant to you!

NOTE: Due to the nature of Prairielearn group assignments, you cannot access or submit any of the subsequent deliverables until you have a full team. Please use the following preview to see the requirements for your first deliverables. (You will not be able to submit using the preview link.)

Team Contract (Due November 4)

As a team, you must submit a 1-2 page document as a Markdown (.md) file that formalizes your team’s views on both core logistical issues and common pitfalls you may encounter over the course of your project. Once signed by each member of your team, it should be considered a binding agreement for all parties. Breaches of this contract can and should be brought up internally and – if not resolved – brought to the attention of course staff. You can access the team contract submission question on Prairielearn once you have a full team.

Final Project Proposal (Due Nov 4)

Even if you choose to use one or more of the suggested example project goals, as a team you are responsible for submitting a detailed project proposal according to the guidelines found on Prairielearn. Groups which do not submit an adequate proposal will be required to resubmit weekly until they have a mentor- or faculty-approved project.

Github Repo Creation (Due Nov 4)

As a team, you are responsible for creating and maintaining your own Github code repository. While your actual repo won’t be graded until the end of the project (and will be required to be in the form of a Github repo), to encourage you to actually use Github over the course of the project, you must also submit a link to your team’s Github on Prairielearn.

A collection of Github repo examples can be found HERE; it contains a suggested organization scheme and examples of several of the key deliverables. You should not copy these deliverables directly, but you may use them as examples of the format and content you should be discussing as a team.

Development Log (Due weekly starting on Nov 4th)

A successful final project is built slowly over many weeks, not thrown together at the last minute. To incentivize good project pacing and to let your project mentor stay informed about the status of your work, each week you are required to submit a development log detailing:

  1. What goals you had set for the week and whether they were accomplished or not
  2. What specific tasks each member of your team accomplished in the week
  3. What problems you encountered (if any) that prevented you from meeting your goals
  4. What you plan to accomplish next week

The development log will be graded for completion, detail, and honesty – not progress. It is much better to truthfully evaluate the work you completed in a week than to lie to make the project sound further along than it really is. It is totally acceptable to have an entry that says you tried nothing and accomplished nothing. However, if every week starts to say that, both you and your project mentor will be able to identify the issue before it becomes impossible to fix.

Mid-Project Check-in (November 16 – 18, November 28 – 30)

A few weeks into the final project, you are required to meet with your project mentor for a check-in meeting. You do not need to prepare a presentation but should come prepared to summarize your progress as well as have a frank discussion about any issues or concerns you have encountered as a team or as an individual team member. The goal here is to ensure that forward progress is being made and to address any issues that are impeding progress while there is still time to correct and recover. To that end, you should be up front and honest about your current progress.

While a significant number of points for the check-in meeting are awarded for attending as a team, for full credit in the mid-project meeting you must have also completed at least one of your chosen algorithms or have a thorough data parsing pipeline (with all corrections / cleaning steps functional). You will be expected to demonstrate in the meeting the tests you have written proving that the algorithm works. This is to encourage you to start working on the final project long before the final weeks and to ensure that you are writing real tests for your code as you develop it.

Final Project Deliverables (December 12)

There are four main deliverables for this final project. As a team, you are expected to distribute work on each deliverable fairly. This means that each student should be responsible for some part of each of the following:

  1. A functional code-base. Your code must be written in C++ and should be compilable and runnable on the standard CS 225 Dockerfile. It will be tested for reproducibility of your original results and its capacity to run on datasets of our choosing that exactly match your proposed formatting. Your code will be graded based on the following metrics:

    • Code Execution – How easy is it to run your code? For full credit, your code should be runnable using simple command line arguments, which include the ability to alter or adjust the input data or output location.

    • Code Efficiency – Does your code match your target Big O efficiencies? For full credit, your code should have no obvious inefficiency in implementation and be capable of running to completion on your proposed dataset using reasonable hardware resources.

    • Code Organization – Is your code human-readable? For full credit, all your variables, functions, and classes should be named appropriately and organized comments should detail the input, output, and intended behavior of major code blocks. Additionally, your final submission should be devoid of unnecessary or obsolete code.

    • Code Completion – Have you completed all your algorithms? For full credit, your code must be able to run all the proposed algorithms on the full dataset and have tests proving that the algorithms worked.

  2. A descriptive README. In addition to the code itself, you must include a human-readable README.md which describes:

    • Github Organization – You should describe the physical location of all major files and deliverables (code, tests, data, the written report, the presentation video, etc…)

    • Running Instructions – You should provide full instructions on how to build and run your executable, including how to define the input data and output location for each method. You should also have instructions on how to build and run your test suite, including a general description of what tests you have created. It is in your best interest to make the instructions (and the running of your executables and tests) as simple and straightforward as possible.

  3. A written report. In addition to your code, your Github repository must contain a results.md file which describes:

    • The output and correctness of each algorithm – You should summarize, visualize, or highlight some part of the full-scale run of each algorithm. Additionally, the report should briefly describe what tests you performed to confirm that each algorithm was working as intended.

    • The answer to your leading question – You should directly address your proposed leading question. How did you answer this question? What did you discover? If your project was ultimately unsuccessful, give a brief reflection about what worked and what you would do differently as a team.

  4. A final presentation. In addition to your project write-up, you should submit a short video (10 minutes or less) describing your project. Your presentation should include slides or other visual aids and include the following content:

    • Your Goals (Suggested time: 1-2 minutes) The presentation should begin with a summary of your proposed goals and a short statement about what you successfully accomplished and, if necessary, what you were ultimately unable to complete.

      Tip: Think of this as ‘setting the stage’ for your presentation, letting the viewer know what you will be discussing for the rest of the talk.

    • Your Development (Suggested time: 2-3 minutes) The presentation should include a high-level overview of the work you put into the project. This is not meant to be a line-by-line recounting of your code but a highlight reel of the various design decisions you made and the challenges you encountered – and hopefully overcame – while working on the project.

      If you were unable to complete one of your goals, this is the best opportunity to explain what you did that didn’t work out, how you tried to address the problem, and what you might do in the future if you were tasked to do this or a similar project again.

      Tip: If you are struggling to identify content here, ask yourself questions like: “How did we get the data we wanted?”, “How did we choose our implementation strategy for an algorithm?”, “How did we ultimately test our code to ensure that it is working?”

    • Your Conclusions (Suggested time: 3-5 minutes) The presentation should end by answering the ‘leading question’ you were hoping to solve. This may include details such as the final or full-scale input dataset you used and the output of each of your algorithms but ambitious teams should focus on how these results led you to discover something interesting involving your real-world dataset. For example, a traversal algorithm on OpenFlights data may be used to identify the shortest path between two airports that your team would like to visit.

    In addition to quantitative results, your conclusions should also end with some individual thoughts you had about the project. What did you learn, what did you like or dislike, and what would you explore or implement next if given more time?

    To submit your final project video, you may either include it on Github or include a direct link to the video on your team Github. Videos can be hosted through Zoom cloud recordings, Youtube, Google drive, etc…