The final project is an excellent opportunity for you to explore an interesting computational perception problem. Projects should be done either individually or in a team of 2-3 students. The instructor and TAs will consult with you on your ideas and their execution throughout the semester. Your project will be worth 25% of your final class grade and will have the following deliverables:

  1. Project Proposal: 2 pages excluding references (30%), due March 7, 2023.
  2. Final Report: 8 pages excluding references and attached code (70%), due May 6, 2023.
  3. Teaser Video (optional): 3-min teaser video submitted along with the final report (15% bonus points), due May 6, 2023.
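
As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes each deliverable is scored on a 0-100 scale, which this page does not state, and it leaves out the optional video bonus. The function and constant names are hypothetical and used only for illustration.

```python
# Weights taken from the deliverables list above.
PROPOSAL_WEIGHT = 0.30   # project proposal
REPORT_WEIGHT = 0.70     # final report
PROJECT_SHARE = 0.25     # project's share of the final class grade


def project_score(proposal: float, report: float) -> float:
    """Weighted project score; inputs assumed to be on a 0-100 scale."""
    return PROPOSAL_WEIGHT * proposal + REPORT_WEIGHT * report


def class_grade_points(proposal: float, report: float) -> float:
    """Points the project contributes to the final class grade (out of 25)."""
    return PROJECT_SHARE * project_score(proposal, report)


if __name__ == "__main__":
    # Example: 80 on the proposal and 90 on the final report.
    print(round(project_score(80, 90), 2))       # 87.0
    print(round(class_grade_points(80, 90), 2))  # 21.75
```

Because the final report carries 70% of the project score, a weak proposal can be largely recovered by a strong report, but not the other way around.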

All write-ups should use the CVPR LaTeX template.



Project Topics

You may choose your own final project topic. Feel free to post on Campuswire to form teams. There is also a "Project" tag/category for discussing project ideas. You're encouraged to discuss possible topics with us during office hours. Here are some general ideas for finding a good topic:

  • Select a classic paper from the computer vision and robotics literature (ICCV, CVPR, ECCV, ICRA, IROS, RSS, etc.), then reimplement and test the approach described in that paper.

  • Make a substantial extension of one of our programming projects. For instance, you could incorporate more sensor modalities, make the algorithms significantly faster, or significantly enhance the model's capacity and robustness by leveraging deep learning.

  • Find an interesting public dataset and benchmark, and explore various approaches to compete on the benchmark against other methods. You may want to build upon publicly available code and modify it. However, merely running existing code on the dataset the authors have already tested is not sufficient.

Project Proposal

You must turn in a brief project proposal that provides an overview of your idea and also contains a brief survey of related work on the topic.

The instructor will provide a list of suggested project ideas to choose from, though you are more than welcome to discuss other project ideas with us. Note that you cannot use research work that you started before this class as your project.

Proposals should be approximately two pages long, and should include the following information:

  • Project title and group members.
  • Overview. Describe the research problem and give an outline of what you propose to implement. Note that you can change this later as your research progresses. Describe the desired outcome and the minimum goals (make sure the goals are realistic).
  • Resources. Describe the datasets, robotic platforms, or simulation environments you may use. Specify any outside code you plan to use and how you plan to run it. Be specific.
  • Planning. A rough timeline of action items. Discuss what you plan to deliver throughout the term and how you plan to divide up the work if you are in a team.
  • Background. Describe how your course project relates to your research background, knowledge, and skills. Which tools and techniques are you already familiar with, and which are new to you?

The grading breakdown for the proposal is as follows:

  • 40% for the significance of the problem and the technical soundness and novelty of the proposed method
  • 40% for the plan of activities
  • 20% for quality of writing

Final Report

Your final report should be at most 8 pages, excluding references. It should roughly follow this format:

  • Introduction:
    • Problem definition and motivation
    • Approach outline
  • Background & Related Work:
    • Research background
    • Literature review
  • Methods:
    • Overview of your proposed method
    • Technical details of the approach and algorithms that you developed
  • Experiments:
    • Benchmarks, baselines, metrics, and implementation details
    • Details of the experiments and results
    • Analysis of the results
  • Conclusion: discussion and future work

You also need to upload your code together with the final report.

The grading breakdown for the final report is as follows:

  • 40% for quality of writing (clarity, organization, flow, figure presentation, etc.)
  • 30% for technical approach (soundness, originality, etc.)
  • 30% for experiments and analysis (completeness, difficulties, correctness, code, etc.)

Teaser Video

Submitting a narrated spotlight video alongside a paper has become very popular at robotics and vision conferences. Such a video introduces the essence of the paper and allows authors to highlight the key results visually. Students are highly encouraged to make a teaser video for their final project and share it on Campuswire. Top teaser videos will be highlighted on a "hall-of-fame" page and receive a 10% bonus of the final project grade (i.e., 5% of the total grade).


Perception Datasets

  • MS-COCO: Microsoft COCO dataset (detection, segmentation, captioning)
  • ADE20k: Scene parsing benchmark (segmentation)
  • Matterport3D: Indoor 3D scenes (reconstruction, scene understanding, synthesis)
  • Oxford RobotCar: Multi-sensor localization and mapping (lidar, radar, camera)
  • Cityscapes: Outdoor driving dataset (segmentation)
  • KITTI: Autonomous driving dataset (SLAM, detection, tracking, flow, etc.)
  • Waymo: Autonomous driving dataset (detection, prediction)
  • NuScenes: Autonomous driving dataset (detection, prediction, planning)
  • ArgoVerse: Autonomous driving dataset (detection, prediction, planning)
  • ScanNet: Large-scale indoor 3D scenes (3D detection, synthesis, scene understanding)
  • TUM: Visual RGB(D) SLAM benchmarks (SLAM, VO, VIO)
  • TartanAir: Large-scale SLAM benchmarks (SLAM, optical flow)
  • HabitatAI: Indoor navigation simulation environment (navigation, manipulation, etc.)
  • CARLA: Self-driving simulation environment (navigation, perception, planning, etc.)
  • PASCAL VOC: Object recognition dataset (segmentation)

Resources