SURF 2024: Task-Relevant Metrics for Perception

From Murray Wiki


  • Mentor: Richard Murray
  • Co-mentor: Apurva Badithela

Project Description

In autonomous cyber-physical systems, perception and control modules are often designed under different paradigms: perception relies heavily on deep learning, while planning and control still largely use traditional model-based methods. Moreover, perception is rarely done for its own sake; it exists to support correct decision-making. This work identifies evaluation metrics for perception tasks that are useful in providing probabilistic guarantees on system-level behavior. For example, confusion matrices are widely used in computer vision to compare and evaluate models on detection tasks, and a wide variety of metrics, such as accuracy, precision, and recall, can be derived from them. In prior work [2], we showed how confusion matrices can be used as a model of sensor error to provide probabilistic guarantees on system-level safety. However, not all perception errors are equally safety-critical. In [3], we leveraged knowledge of the controller as well as the system-level requirement to introduce task-relevant metrics for object detection and classification. In this project, we seek to study other perception functionalities, such as tracking objects across multiple frames, and to find corresponding metrics that evaluate tracking in learned perception models in a manner aligned with system-level safety requirements.
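As a toy illustration of the kind of model used in [2], a confusion matrix from a detection task can be row-normalized into conditional probabilities P(predicted class | true class), which serve as a probabilistic sensor-error model; standard metrics like precision and recall fall out of the same matrix. A minimal sketch, with entirely hypothetical classes and counts:

```python
# Toy example: deriving per-class metrics and a probabilistic sensor-error
# model from a confusion matrix. All classes and counts are hypothetical.

# Rows: true class, columns: predicted class.
classes = ["pedestrian", "car", "empty"]
confusion = [
    [90, 5, 5],    # true pedestrian
    [4, 92, 4],    # true car
    [2, 3, 95],    # true empty
]

def row_normalize(matrix):
    """Convert counts to P(predicted class | true class)."""
    return [[c / sum(row) for c in row] for row in matrix]

def recall(matrix, i):
    """Fraction of true class i that is predicted as class i."""
    return matrix[i][i] / sum(matrix[i])

def precision(matrix, i):
    """Fraction of class-i predictions whose true class is i."""
    col_sum = sum(matrix[j][i] for j in range(len(matrix)))
    return matrix[i][i] / col_sum

sensor_model = row_normalize(confusion)
print("P(detect pedestrian | pedestrian):", sensor_model[0][0])   # 0.9
print("pedestrian recall:", recall(confusion, 0))                 # 0.9
print("pedestrian precision:", round(precision(confusion, 0), 4)) # 0.9375
```

The row-normalized matrix is what a system-level analysis would consume: each row is a probability distribution over what the perception module reports, conditioned on the true state of the world.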

Problem

In this SURF, we will explore the interface between perception and planning more carefully. A misclassification or misdetection in a single frame is unlikely to trigger a different decision from the planner. Therefore, we need to incorporate a notion of tracking objects across multiple frames to make system-level evaluations less conservative. This will require identifying new metrics, beyond confusion matrices, that capture detection performance across multiple frames. While metrics for evaluating tracking do exist, they are not informed by the system-level task [1].
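The intuition that multi-frame reasoning is less conservative can be sketched with a simple (and admittedly idealized) calculation: if per-frame detection errors were independent, the probability of missing an object in every one of k consecutive frames decays geometrically in k, so a bound that only charges the system for persistent misses is far tighter than one that charges it for any single-frame miss. The independence assumption and the numbers below are illustrative only:

```python
# Sketch: why reasoning over multiple frames can tighten per-frame error
# bounds. Assumes (unrealistically) independent errors across frames;
# the miss rate below is a hypothetical number, not from any real detector.

def prob_missed_k_frames(p_miss_per_frame: float, k: int) -> float:
    """P(object undetected in k consecutive frames), under independence."""
    return p_miss_per_frame ** k

p_miss = 0.1  # hypothetical per-frame miss probability
for k in (1, 3, 5):
    print(f"missed in {k} consecutive frames: {prob_missed_k_frames(p_miss, k):.1e}")
```

In practice, detection errors are correlated across frames (e.g., under occlusion), which is precisely why task-relevant tracking metrics, rather than this independence shortcut, are needed.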

Goals for this SURF include:

  • Proposing new metrics for tracking or other perception tasks, and rigorously connecting these metrics to system-level evaluations of safety.
  • Evaluating state-of-the-art perception models on the nuScenes dataset with respect to tracking metrics derived from system-level specifications.
  • Time permitting, validating the theoretical results on a hardware platform such as Duckietown.

Desired:

  • Experience programming in Python, ROS, and OpenCV.
  • Coursework in control, robotics, or computer vision.
  • Interest in theoretical research, robotics, working with hardware, and industry datasets such as nuScenes.

References:

  • [1] Luiten, Jonathon, et al. "HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking." International Journal of Computer Vision 129 (2021): 548-578.
  • [2] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Leveraging classification metrics for quantitative system-level analysis with temporal logic specifications." 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021.
  • [3] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Evaluation Metrics for Object Detection for Autonomous Systems." arXiv preprint arXiv:2210.10298 (2022).