Murray Wiki - User contributions [en]

Nok Wongpiromsarn, May 2024

2024-05-13T17:45:28Z

Abadithe:

Nok Wongpiromsarn, an Assistant Professor at Iowa State, will visit Caltech on 20-21 May 2024.

20 May 2024
* 9:15a: Richard, 109 Steele Lab
* 10:00a: Apurva
* 10:45a: Open
* 11:30a: Open
* 12p-1:15p: Lunch
* 1:15p: Open
* 2:00p: Open
* 2:45p: Open
* 3:30p: Open
* 4:15p: Open
* ~6 pm: dinner with Richard

21 May 2024
* 10:15a: Open
* 10:30a: Open
* 11:15a: Open
* 12-1p: Lunch
* 1-3 pm: Apurva Badithela thesis defense
* 3:00 pm: Open
* 3:45 pm: Open
* 4:30 pm: Open
* 5:00 pm: Richard, 109 Steele Lab

SURF 2024

2023-12-18T00:24:58Z

Abadithe:

{{righttoc}}
This page is intended for students interested in working on SURF projects in the Summer of 2024. It contains information about how to apply for a SURF project in my group along with a list of project areas.

'''Note:''' Projects will be posted here starting after finals week and up to the start of classes. Please check back after that time for more information.

=== Applying for a SURF project in my group ===

Because I get many students interested in doing SURFs in my group and because we have several projects available, we use the first few weeks in January to sort out who we will work with in writing proposals. We only submit one proposal per project area and so we often can't accommodate everyone who wants to work in my group over the summer.

# A list of SURF project descriptions is given in the table below. Due to the number of SURF projects that we support, we are only able to support students who select from among these projects. Please make sure to read the project descriptions, required skills (if any) and skim a few of the listed references before contacting me about doing a SURF project.
# Students interested in writing proposals for SURF projects should contact me via e-mail by 10 Jan (Wed) and provide the following information:
#* A list of up to three SURF projects from the list below that you are interested in working on
#* A one page resume listing relevant experience and coursework
#* If you are not a Caltech student, I will also need the following additional information:
#** An unofficial copy of your academic transcript
#** Names of two faculty members at your current institution that I can contact for a reference
# Starting on 11 January, I will go through all applications and work with my group to identify who is a possible fit for each project. We will then contact you and ask for you to meet (or talk with) possible co-mentors so that we can eventually work out who we will work with in writing up a proposal.
# We hope to make final decisions on projects by about 1 Jan, at which point we will start working with students on writing up proposals.
# All applications should go through the normal SURF application process, described at www.surf.caltech.edu. SURF applications are due on ~22 Feb.
# If you are selected for a SURF, please be aware of the following information:
#* All SURF projects in my group will start on 18 Jun (Tue). If you can't start on that date, please make sure that you indicate this when you contact me.
#* All SURF projects are for a minimum of 10 weeks, although I usually recommend that you try to stay for 12 weeks if possible. It's hard to complete a project in just 10 weeks and spending a few extra weeks can greatly improve the project.
#* All SURF students in my group will be expected to devote full-time effort to their SURF project, so you cannot have a second job in addition to your SURF.
#* Additional information on SURF is available here: https://sfp.caltech.edu/undergraduate-research/programs/surf

=== List of available projects ===

Projects will be posted as they become available. I recommend waiting until near the submission deadline before submitting your project preferences.

{| border=1 width=100%
|-
| '''Title''' || '''Grant/Project''' || '''Co-Mentors''' || '''Comments'''
|-
| {{SURF|2024|Task-Relevant Metrics for Perception}}
| TBD
| Apurva Badithela
|
|-
| {{SURF|2024|Establish synthetic biology toolkits for Steinernema nematode transgene expression}}
| Carnegie Institution for Science
| TBD
| Mentor: Mengyi Cao (PI)
|-
| {{SURF|2024|Bioengineering toolkit development for genetic alterations in the entomopathogenic nematode symbiont Xenorhabdus griffiniae}}
| TBD
| Elin Larsson
|
|-
| {{SURF|2023|Genetically-Programmed Synthetic Cells and Multi-Cellular Machines}}
| [[NSF Cell Free]]
| TBD
| Multiple projects may be available; competitive selection
|}

SURF 2024: Task-Relevant Metrics for Perception

2023-12-16T00:11:33Z

Abadithe: /* Project Description */

'''[[SURF 2024|2024 SURF]] Task-Relevant Metrics for Perception'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==
In autonomous cyber-physical systems, perception and control modules are often designed under different paradigms. Perception modules rely heavily on deep learning, while planning and control still incorporate traditional methods to a large extent. Yet perception is rarely done for its own sake; its purpose is to aid correct decision-making. This work identifies evaluation metrics of perception tasks that are useful in providing probabilistic guarantees on system-level behavior. For example, confusion matrices are widely used in computer vision to compare and evaluate models for detection tasks, and a wide variety of metrics, such as accuracy, precision, and recall, can be derived from the confusion matrix. In prior work [2], we showed how confusion matrices can be used as a model of sensor error to provide probabilistic guarantees on system-level safety. However, not all perception errors are equally safety-critical. In [3], we leveraged knowledge of the controller as well as the system-level requirement to introduce task-relevant metrics for object detection and classification tasks. In this work, we seek to study other perception functionalities, such as tracking objects across multiple frames, and to find corresponding metrics for evaluating tracking in learned perception models in a manner that is aligned with system-level safety requirements.
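
As a minimal sketch of how such metrics follow from a confusion matrix, the Python snippet below computes accuracy, precision, and recall for one class. It also row-normalizes the matrix into estimated probabilities of each prediction given the true class, which is one way to read the matrix as a model of sensor error. The function names, class labels, and counts are hypothetical and are not taken from [2] or [3].

```python
def derived_metrics(cm, positive=0):
    """cm[i][j] counts samples with true class i that were predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    tp = cm[positive][positive]
    fn = sum(cm[positive]) - tp                        # positives predicted as another class
    fp = sum(cm[i][positive] for i in range(n)) - tp   # other classes predicted as positive
    return {
        "accuracy": correct / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


def sensor_error_model(cm):
    """Row-normalize counts: entry [i][j] estimates P(predicted j | true i)."""
    return [[count / sum(row) for count in row] for row in cm]


# Hypothetical counts; rows are true classes, columns are predicted classes.
labels = ["pedestrian", "empty"]
cm = [[40, 10],
      [5, 45]]

metrics = derived_metrics(cm, positive=0)
model = sensor_error_model(cm)
print(metrics)
print(dict(zip(labels, model[0])))
```

The sketch assumes every class appears at least once as a true label and that the positive class is predicted at least once; otherwise the divisions are undefined.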

==Problem==
In this SURF, we will explore the interface between perception and planning more carefully. Misclassification or misdetection in a single frame is unlikely to trigger a different decision from the planner. Therefore, we need to incorporate a notion of tracking objects across multiple frames to make system-level evaluations less conservative. This will require identifying new metrics beyond confusion matrices to capture detection performance across multiple frames. While there exist metrics to evaluate tracking, these metrics are not informed by the system-level task [1].
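
To illustrate why single-frame statistics can be conservative, the sketch below compares two hypothetical per-frame detection sequences for one object. Both have the same per-frame miss rate, but only one contains a run of consecutive misses long enough to matter to a planner that, by assumption, tolerates up to <code>tolerance</code> consecutive missed frames. The function names and the tolerance model are illustrative assumptions, not a metric proposed in this project.

```python
def longest_miss_streak(detected):
    """Length of the longest run of consecutive frames in which the object was missed."""
    longest = current = 0
    for hit in detected:
        current = 0 if hit else current + 1
        longest = max(longest, current)
    return longest


def planner_affected(detected, tolerance):
    """Assumed planner model: it reacts only to more than `tolerance` consecutive misses."""
    return longest_miss_streak(detected) > tolerance


# Hypothetical detection outcomes for one object over six frames (True = detected).
isolated = [True, False, True, False, True, False]    # three isolated misses
clustered = [True, False, False, False, True, True]   # three misses in a row

miss_rate_isolated = isolated.count(False) / len(isolated)
miss_rate_clustered = clustered.count(False) / len(clustered)
print(miss_rate_isolated, miss_rate_clustered)
print(planner_affected(isolated, tolerance=2), planner_affected(clustered, tolerance=2))
```

A single-frame confusion matrix cannot distinguish the two sequences, while a multi-frame statistic such as the streak length can.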

Goals for this SURF include:
* Proposing new metrics for tracking or other perception tasks, and rigorously connecting these metrics to system-level evaluations of safety.
* Evaluating state-of-the-art perception models on the nuScenes dataset with respect to tracking metrics derived from system-level specifications.
* Time permitting, validating theoretical results on a hardware platform such as Duckietown.

==Desired:==
* Experience programming in Python, ROS, and OpenCV.
* Coursework in control, robotics, and computer vision.
* Interest in theoretical research, robotics, working with hardware, and industry datasets such as nuScenes.

==References:==
* [1] Luiten, Jonathon, et al. "HOTA: A higher order metric for evaluating multi-object tracking." International Journal of Computer Vision 129 (2021): 548-578.
* [2] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Leveraging classification metrics for quantitative system-level analysis with temporal logic specifications." 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021.
* [3] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Evaluation Metrics for Object Detection for Autonomous Systems." arXiv preprint arXiv:2210.10298 (2022).

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T23:37:35Z

Abadithe: /* Project Description */

'''[[SURF 2024|2024 SURF]] Task-Relevant Metrics for Perception'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

In autonomous cyber-physical systems, perception and control modules are often designed under different paradigms. Perception modules rely heavily on deep learning, while planning and control still incorporate traditional methods to a large extent. Yet perception is rarely done for its own sake; its purpose is to aid correct decision-making. This work identifies evaluation metrics of perception tasks that are useful in providing probabilistic guarantees on system-level behavior. For example, confusion matrices are widely used in computer vision to compare and evaluate models for detection tasks, and a wide variety of metrics, such as accuracy, precision, and recall, can be derived from the confusion matrix. In prior work [2], we showed how confusion matrices can be used as a model of sensor error to provide probabilistic guarantees on system-level safety. However, not all perception errors are equally safety-critical. In [3], we leveraged knowledge of the controller as well as the system-level requirement to introduce task-relevant metrics for object detection and classification tasks. In this work, we seek to study other perception functionalities, such as tracking objects across multiple frames, and to find corresponding metrics for evaluating tracking in learned perception models in a manner that is aligned with system-level safety requirements.

==Problem==
In this SURF, we will explore the interface between perception and planning more carefully. Misclassification or misdetection in a single frame is unlikely to trigger a different decision from the planner. Therefore, we need to incorporate a notion of tracking objects across multiple frames to make system-level evaluations less conservative. This will require identifying new metrics beyond confusion matrices to capture detection performance across multiple frames. While there exist metrics to evaluate tracking, these metrics are not informed by the system-level task [1].

Goals for this SURF include:
* Proposing new metrics for tracking or other perception tasks, and rigorously connecting these metrics to system-level evaluations of safety.
* Evaluating state-of-the-art perception models on the nuScenes dataset with respect to tracking metrics derived from system-level specifications.
* Time permitting, validating theoretical results on a hardware platform such as Duckietown.

==Desired:==
* Experience programming in Python, ROS, and OpenCV.
* Coursework in control, robotics, and computer vision.
* Interest in theoretical research, robotics, working with hardware, and industry datasets such as nuScenes.

==References:==
* [1] Luiten, Jonathon, et al. "HOTA: A higher order metric for evaluating multi-object tracking." International Journal of Computer Vision 129 (2021): 548-578.
* [2] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Leveraging classification metrics for quantitative system-level analysis with temporal logic specifications." 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021.
* [3] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Evaluation Metrics for Object Detection for Autonomous Systems." arXiv preprint arXiv:2210.10298 (2022).

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T23:35:48Z

Abadithe: /* Project Description */

'''[[SURF 2024|2024 SURF]] Task-Relevant Metrics for Perception'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

In autonomous cyber-physical systems, perception and control modules are often designed under different paradigms. Perception modules rely heavily on deep learning, while planning and control still incorporate traditional methods to a large extent. Yet perception is rarely done for its own sake; its purpose is to aid correct decision-making. This work identifies evaluation metrics of perception tasks that are useful in providing probabilistic guarantees on system-level behavior. For example, confusion matrices are widely used in computer vision to compare and evaluate models for detection tasks, and a wide variety of metrics, such as accuracy, precision, and recall, can be derived from the confusion matrix. In prior work [2], we showed how confusion matrices can be used as a model of sensor error to provide probabilistic guarantees on system-level safety. However, not all perception errors are equally safety-critical. In [3], we leveraged knowledge of the controller as well as the system-level requirement to introduce task-relevant metrics for object detection and classification tasks. In this work, we seek to study other perception functionalities, such as tracking objects across multiple frames, and to find corresponding metrics for evaluating tracking in learned perception models in a manner that is aligned with system-level safety requirements.

==Problem==
In this SURF, we will explore the interface between perception and planning more carefully. Misclassification or misdetection in a single frame is unlikely to trigger a different decision from the planner. Therefore, we need to incorporate a notion of tracking objects across multiple frames to make system-level evaluations less conservative. This will require identifying new metrics beyond confusion matrices to capture detection performance across multiple frames. While there exist metrics to evaluate tracking, these metrics are not informed by the system-level task [1].

Goals for this SURF include:
* Proposing new metrics for tracking or other perception tasks, and rigorously connecting these metrics to system-level evaluations of safety.
* Evaluating state-of-the-art perception models on the nuScenes dataset with respect to tracking metrics derived from system-level specifications.
* Time permitting, validating theoretical results on a hardware platform such as Duckietown.

==Desired:==
* Experience programming in Python, ROS, and OpenCV.
* Coursework in control, robotics, and computer vision.
* Interest in theoretical research, robotics, working with hardware, and industry datasets such as nuScenes.

==References:==
* [1] Luiten, Jonathon, et al. "HOTA: A higher order metric for evaluating multi-object tracking." International Journal of Computer Vision 129 (2021): 548-578.
* [2] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Leveraging classification metrics for quantitative system-level analysis with temporal logic specifications." 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021.
* [3] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Evaluation Metrics for Object Detection for Autonomous Systems." arXiv preprint arXiv:2210.10298 (2022).

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T23:35:36Z

Abadithe: /* Project Description */

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T23:25:26Z

Abadithe: /* References: */

'''[[SURF 2024|2024 SURF]] Task-Relevant Metrics for Perception'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

[[Image:classical_autonomy_stack.png|right|800px|Caption: System-level requirements are easier to formalize than requirements on perception tasks.]]

==Problem==
In this SURF, we will explore the interface between perception and planning more carefully. Misclassification or misdetection in a single frame is unlikely to trigger a different decision from the planner. Therefore, we need to incorporate a notion of tracking objects across multiple frames to make system-level evaluations less conservative. This will require identifying new metrics beyond confusion matrices to capture detection performance across multiple frames. While there exist metrics to evaluate tracking, these metrics are not informed by the system-level task [1].

Goals for this SURF include:
* Proposing new metrics for tracking or other perception tasks, and rigorously connecting these metrics to system-level evaluations of safety.
* Evaluating state-of-the-art perception models on the nuScenes dataset with respect to tracking metrics derived from system-level specifications.
* Time permitting, validating theoretical results on a hardware platform such as Duckietown.

==Desired:==
* Experience programming in Python, ROS, and OpenCV.
* Coursework in control, robotics, and computer vision.
* Interest in theoretical research, robotics, working with hardware, and industry datasets such as nuScenes.

==References:==
* [1] Luiten, Jonathon, et al. "HOTA: A higher order metric for evaluating multi-object tracking." International Journal of Computer Vision 129 (2021): 548-578.
* [2] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Leveraging classification metrics for quantitative system-level analysis with temporal logic specifications." 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021.
* [3] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Evaluation Metrics for Object Detection for Autonomous Systems." arXiv preprint arXiv:2210.10298 (2022).

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T23:25:06Z

Abadithe: /* References: */

'''[[SURF 2024|2024 SURF]] Task-Relevant Metrics for Perception'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

[[Image:classical_autonomy_stack.png|right|800px|Caption: System-level requirements are easier to formalize than requirements on perception tasks.]]

==Problem==
In this SURF, we will explore the interface between perception and planning more carefully. Misclassification or misdetection in a single frame is unlikely to trigger a different decision from the planner. Therefore, we need to incorporate a notion of tracking objects across multiple frames to make system-level evaluations less conservative. This will require identifying new metrics beyond confusion matrices to capture detection performance across multiple frames. While there exist metrics to evaluate tracking, these metrics are not informed by the system-level task [1].

Goals for this SURF include:
* Proposing new metrics for tracking or other perception tasks, and rigorously connecting these metrics to system-level evaluations of safety.
* Evaluating state-of-the-art perception models on the nuScenes dataset with respect to tracking metrics derived from system-level specifications.
* Time permitting, validating theoretical results on a hardware platform such as Duckietown.

==Desired:==
* Experience programming in Python, ROS, and OpenCV.
* Coursework in control, robotics, and computer vision.
* Interest in theoretical research, robotics, working with hardware, and industry datasets such as nuScenes.

==References:==
* [1] Luiten, Jonathon, et al. "HOTA: A higher order metric for evaluating multi-object tracking." International Journal of Computer Vision 129 (2021): 548-578.
* [2] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Leveraging classification metrics for quantitative system-level analysis with temporal logic specifications." 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021.
* [3] Badithela, Apurva, Tichakorn Wongpiromsarn, and Richard M. Murray. "Evaluation Metrics for Object Detection for Autonomous Systems." arXiv preprint arXiv:2210.10298 (2022).

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T23:00:39Z

Abadithe: /* Problem */

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T22:51:33Z

Abadithe: /* Problem */

SURF 2024: Task-Relevant Metrics for Perception

2023-12-15T22:51:13Z

Abadithe: Created page with "'''2024 SURF Task-Relevant Metrics for Perception''' * Mentor: Richard Murray * Co-mentor: Apurva Badithela ==Project Description== Caption: System-level requirements are easier to formalize than requirements on perception tasks. ==Problem== In this SURF, we will explore the interface between perception and planning more carefully. Misclassification or misdetection in a single frame is unlikely to tr..."

SURF 2024

2023-12-15T22:15:48Z

Abadithe: /* List of available projects */

{{righttoc}}
This page is intended for students interested in working on SURF projects in the Summer of 2024. It contains information about how to apply for a SURF project in my group along with a list of project areas.

'''Note:''' Projects will be posted here starting after finals week and up to the start of classes. Please check back after that time for more information.

=== Applying for a SURF project in my group ===

Because I get many students interested in doing SURFs in my group and because we have several projects available, we use the first few weeks in January to sort out who we will work with in writing proposals. We only submit one proposal per project area and so we often can't accommodate everyone who wants to work in my group over the summer.

# A list of SURF project descriptions is given in the table below. Due to the number of SURF projects that we support, we are only able to support students who select from among these projects. Please make sure to read the project descriptions, required skills (if any) and skim a few of the listed references before contacting me about doing a SURF project.
# Students interested in writing proposals for SURF projects should contact me via e-mail by 10 Jan (Wed) and provide the following information:
#* A list of up to three SURF projects from the list below that you are interested in working on
#* A one page resume listing relevant experience and coursework
#* If you are not a Caltech student, I will also need the following additional information:
#** An unofficial copy of your academic transcript
#** Names of two faculty members at your current institution that I can contact for a reference
# Starting on 11 January, I will go through all applications and work with my group to identify who is a possible fit for each project. We will then contact you and ask for you to meet (or talk with) possible co-mentors so that we can eventually work out who we will work with in writing up a proposal.
# We hope to make final decisions on projects by about 1 Jan, at which point we will start working with students on writing up proposals.
# All applications should go through the normal SURF application process, described at www.surf.caltech.edu. SURF applications are due on ~22 Feb.
# If you are selected for a SURF, please be aware of the following information:
#* All SURF projects in my group will start on 18 Jun (Tue). If you can't start on that date, please make sure that you indicate this when you contact me.
#* All SURF projects are for a minimum of 10 weeks, although I usually recommend that you try to stay for 12 weeks if possible. It's hard to complete a project in just 10 weeks and spending a few extra weeks can greatly improve the project.
#* All SURF students in my group will be expected to devote full-time effort to their SURF project, so you cannot have a second job in addition to your SURF.
#* Additional information on SURF is available here: https://sfp.caltech.edu/undergraduate-research/programs/surf

=== List of available projects ===

Projects will be posted as they become available. I recommend waiting until near the submission deadline before submitting your project preferences.

{| border=1 width=100%
|-
| '''Title''' || '''Grant/Project''' || '''Co-Mentors''' || '''Comments'''
|-
| {{SURF|2024|Task-Relevant Metrics for Perception}}
| TBD
| Apurva Badithela
|
|-
| {{SURF|2024|Hierarchical Testing for Safety-Critical Autonomous Systems}}
| TBD
| Apurva Badithela
|
|-
| {{SURF|2024|Bioengineering toolkit development for genetic alterations in the entomopathogenic nematode symbiont Xenorhabdus griffiniae}}
| TBD
| Elin Larsson
|
|-
| {{SURF|2023|Genetically-Programmed Synthetic Cells and Multi-Cellular Machines}}
| [[NSF Cell Free]]
| TBD
| Multiple projects may be available; competitive selection
|}

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T22:12:01Z

Abadithe:

'''[[SURF 2024|2024 SURF]] Hierarchical Testing for Safety-Critical Autonomous Systems'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

Automatically identifying failure cases of safety-critical autonomous systems is important for mainstream deployment of these systems. A few examples of such safety-critical robotic systems are illustrated on the right. Since autonomous robotic systems are complex and their domain of operation is very large, it is not possible to exhaustively verify the correctness of an autonomous system with respect to its safety specifications. Moreover, these systems often need to reason over both discrete and continuous inputs and parameters.

[[Image:autonomous_robotic_systems.png|right|800px|Caption:Examples of autonomous robotic systems and their complexity.]]

State-of-the-art methods include simulation-based falsification, in which a simulator of the system (whose model is a black box) is queried with inputs until a failing trace is found. Current research in this area focuses on developing novel black-box optimization algorithms that choose which inputs to query in order to find these failing traces. However, most of these algorithms require the input vector to be continuous-valued. Furthermore, these test inputs are often parameters that remain constant throughout the test and are not reactive to system behavior. We wish to research the applicability of these methods to discrete-valued as well as mixed discrete-continuous inputs, and to reactive settings.
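
The falsification loop described above can be sketched as follows. This is a toy illustration under stated assumptions, not the algorithm of any particular tool: the simulator is a hypothetical 1D double integrator under PD control, the single test input is a constant continuous parameter (the initial velocity), the unsafe region and gains are invented, and plain random search stands in for a more capable black-box optimizer.

```python
import random

UNSAFE_X = 1.2  # hypothetical unsafe region: x > 1.2


def simulate(v0, dt=0.01, steps=500):
    """Toy closed loop, treated as a black box: double integrator driven to x = 1 by PD control."""
    x, v = 0.0, v0
    trace = [x]
    for _ in range(steps):
        u = -4.0 * (x - 1.0) - 4.0 * v  # hypothetical PD gains
        v += u * dt
        x += v * dt
        trace.append(x)
    return trace


def robustness(v0):
    """Smallest margin to the unsafe region along the trace; negative means failure."""
    return min(UNSAFE_X - x for x in simulate(v0))


def falsify(budget=200, seed=0):
    """Query the simulator with random inputs until a failing trace is found."""
    rng = random.Random(seed)
    best_input, best_rob = None, float("inf")
    for _ in range(budget):
        v0 = rng.uniform(0.0, 6.0)
        rob = robustness(v0)
        if rob < best_rob:
            best_input, best_rob = v0, rob
        if rob < 0:  # failing trace found, stop querying
            break
    return best_input, best_rob


worst_v0, worst_rob = falsify()
print(worst_v0, worst_rob)
```

Note that the searched input here is a single continuous value that stays fixed for the whole run, which is exactly the restriction the project aims to move beyond.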

==Problem Motivated via a Simple Example:==
Consider a simple 2D double integrator system, illustrated as a blue point mass in the videos below. This system has two operating modes: a north-south oscillating mode and an east-west oscillating mode. The N-S and E-W poles are illustrated in red. When an external "switch" command is given to the system, it needs to safely switch to the other operating mode without entering the unsafe regions (shaded in blue).

The following video illustrates the system responding to a switch command. The system safely transitions from oscillating in the N-S mode to the E-W mode without entering the blue regions. Using black-box optimization, the time at which the switch is commanded is optimized to produce the worst-case trajectory (which is still far from the unsafe region). Observe that the switch is commanded when the system has gained a lot of momentum in transitioning to the other pole.

[[File:single_switch.mp4|right|800px|Caption: A single switch command results in a safe trajectory.]]

In the video below, two switches are commanded in quick succession, and once again the switch times are optimized to produce the worst-case trajectory. In this run, however, the system enters the unsafe region, thus demonstrating a failure in the control design. Fundamentally, the decision to switch twice (and similarly three times, four times, etc.) is a discrete variable. Currently, identifying these discrete inputs in combination with continuous inputs is not well studied in the literature.

[[File:double_switch.mp4|right|800px|Caption: Two switch commands in quick succession result in an unsafe trajectory.]]

Therefore, we seek to identify a sequence of discrete inputs that, together with worst-case low-level inputs, leads to a violating system trajectory.
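
A minimal sketch of such a mixed search is given below, assuming a much simpler system than the one in the videos: a hypothetical 1D underdamped double integrator whose setpoint flips between -1 and +1 on each switch command, with an invented unsafe region |x| > 2.1. The outer loop enumerates the discrete choice (the number of switches) and the inner loop sweeps the continuous choice (the time between switches) on a grid. All names, gains, and thresholds are illustrative assumptions.

```python
import itertools

LIMIT = 2.1  # hypothetical unsafe region: |x| > 2.1


def min_margin(switch_times, dt=0.01, steps=1500):
    """Simulate the toy system and return the smallest margin to the unsafe region."""
    x, v, target = -1.0, 0.0, -1.0
    pending = sorted(switch_times)
    margin = LIMIT - abs(x)
    for k in range(steps):
        t = k * dt
        while pending and pending[0] <= t:
            target = -target  # each switch command flips the setpoint
            pending.pop(0)
        u = -4.0 * (x - target) - 1.0 * v  # underdamped PD control, hypothetical gains
        v += u * dt
        x += v * dt
        margin = min(margin, LIMIT - abs(x))
    return margin


def search(max_switches=2, first_switch=1.0):
    """Outer loop: number of switches (discrete). Inner loop: gaps between switches (continuous)."""
    gaps = [0.1 * i for i in range(1, 41)]
    results = {}
    for n in range(1, max_switches + 1):
        best_margin, best_times = float("inf"), None
        for combo in itertools.product(gaps, repeat=n - 1):
            times = [first_switch]
            for gap in combo:
                times.append(times[-1] + gap)
            margin = min_margin(times)
            if margin < best_margin:
                best_margin, best_times = margin, times
        results[n] = (best_margin, best_times)
    return results


results = search()
for n, (margin, times) in results.items():
    print(n, round(margin, 3), times)
```

In this toy, no timing of a single switch violates the limit, while a suitably timed second switch does, which mirrors the behavior described for the two videos. Exhaustive enumeration scales poorly as the number of switches grows, which is part of what makes the mixed discrete-continuous setting hard.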

==Desired:==
* Experience programming in Python
* Coursework: CDS 110
* Interest in theoretical and computational research in topics such as: safety-critical systems, autonomous robotic systems, control theory, and optimization.

==References:==
[1] Annpureddy, Yashwanth, et al. "S-TaLiRo: A tool for temporal logic falsification for hybrid systems." International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011.

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:31:56Z

Abadithe: /* Problem Motivated via a Simple Example: */

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:31:44Z

Abadithe: /* Problem Motivated via a Simple Example: */

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:22:21Z

Abadithe:

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:22:08Z

Abadithe: /* Project Description */

'''[[SURF 2024|2024 SURF]] Hierarchical Testing for Safety-Critical Autonomous Systems'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

Automatically identifying failure cases of safety-critical autonomous systems is important for mainstream deployment of these systems. A few examples of such safety-critical robotic systems are illustrated on the right. Since autonomous robotic systems are complex and their domain of operation is very large, it is not possible to exhaustively verify the correctness of an autonomous system with respect to its safety specifications. Moreover, these systems often need to reason over both discrete and continuous inputs and parameters.

[[Image:autonomous_robotic_systems.png|right|800px|Caption:Examples of autonomous robotic systems and their complexity.]]

State-of-the-art methods include simulation-based falsification, in which a simulator of the system (whose model is a black box) is queried with inputs until a failing trace is found. Current research in this area focuses on developing novel black-box optimization algorithms that choose which inputs to query in order to find these failing traces. However, most of these algorithms require the input vector to be continuous-valued. Furthermore, these test inputs are often parameters that remain constant throughout the test and are not reactive to system behavior. We wish to research the applicability of these methods to discrete-valued as well as mixed discrete-continuous inputs, and to reactive settings.

==Problem Motivated via a Simple Example:
For example, consider a simple 2D double integrator system illustrated as a blue point mass as seen in the following .mp4 files. This system has two operating modes: north-south oscillating mode and east-west oscillating mode. The N-S and E-W poles are illustrated in red. When an external "switch" command is given to the system, it needs to safely switch to the other operating mode without entering the unsafe regions (shaded in blue).

The following video illustrates the system responding to a switch command. The system safely transitions from oscillating in the N-S mode to the E-W mode without entering the blue regions. Using black-box optimization, the time of the switch commanded is optimized to result in the worst-case possible trajectory (which is still far from the unsafe region). Observe that the switch is commanded when the system has gained a lot of momentum in transitioning to the other pole.

[[File:single_switch.mp4|right|800px|Caption: Single switch commanded results in safe trajectory.]]

In the video below, two switches are commanded in quick succession, and once again, the time of the switches is optimized to result in the worst-case possible trajectory. In this run, however, the system enters the unsafe region, thus demonstrating a failure in the control design. Fundamentally, the decision to switch twice (and similarly three times, four times etc.) is a discrete variable. Currently, identifying these discrete inputs in combination with continuous inputs is not well-studied in the literature.

[[File:double_switch.mp4 |right|800px|Caption: Two switch commands in quick succession show an unsafe trajectory.]]

In this project, we aim to

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:13:14Z

Abadithe:

'''[[SURF 2024|2024 SURF]] Hierarchical Testing for Safety-Critical Autonomous Systems'''
* Mentor: Richard Murray
* Co-mentor: Apurva Badithela

==Project Description==

Automatically identifying failure cases of safety-critical autonomous systems is important for mainstream deployment of these systems. A few examples of such safety-critical robotic systems are illustrated on the right. Since autonomous robotic systems are complex and their domain of operation is very large, it is not possible to exhaustively verify correctness of the autonomous system with respect to safety specifications. Oftentimes, these systems need to reason over both discrete and continuous inputs and parameters.

[[Image:autonomous_robotic_systems.png|right|800px|Caption:Examples of autonomous robotic systems and their complexity.]]

State-of-the-art methods include simulation-based falsification, in which a simulator of the system (whose model is black-box) is queried with inputs until a failing trace is found. Current research in this area focuses on developing novel black-box optimization algorithms that choose which inputs to query in order to find these failing traces. However, most of these algorithms require the input vector to be continuous-valued. Furthermore, these test inputs are often parameters that remain constant throughout the test and are not reactive to system behavior. We wish to research the applicability of these methods to discrete-valued as well as mixed discrete-continuous inputs, and to reactive settings.
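As a rough illustration of simulation-based falsification, the sketch below queries a black-box simulator with randomly sampled continuous inputs until a trace violates a safety specification. Everything in it is an assumption made for illustration: the 1D point-mass simulator, the controller gains, the specification "always |x| < limit", and the use of uniform random sampling in place of a more sophisticated black-box optimizer.

```python
import random


def simulate(x0, v0, dt=0.01, steps=500):
    """Hypothetical black-box simulator: 1D point mass under a fixed PD controller.

    The falsifier only sees the returned position trace, not these dynamics.
    """
    k_p, k_d = 4.0, 1.0  # illustrative gains (assumption)
    x, v = x0, v0
    trace = [x]
    for _ in range(steps):
        a = -k_p * x - k_d * v
        v += a * dt
        x += v * dt
        trace.append(x)
    return trace


def robustness(trace, limit=1.5):
    """Signed margin of the spec 'always |x| < limit'; negative means violated."""
    return min(limit - abs(x) for x in trace)


def falsify(budget=2000, seed=0):
    """Uniform random search over continuous inputs (x0, v0)."""
    rng = random.Random(seed)
    best_inputs, best_rho = None, float("inf")
    for _ in range(budget):
        inputs = (rng.uniform(-1.0, 1.0), rng.uniform(-5.0, 5.0))
        rho = robustness(simulate(*inputs))
        if rho < best_rho:
            best_inputs, best_rho = inputs, rho
        if best_rho < 0:  # failing trace found, stop querying
            break
    return best_inputs, best_rho


if __name__ == "__main__":
    inputs, rho = falsify()
    print("worst inputs found:", inputs, "robustness:", rho)
```

Note that both inputs here are continuous and fixed for the whole run, which is exactly the limitation described above.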

Problem Motivated via a Simple Example:
For example, consider a simple 2D double integrator system illustrated as a blue point mass as seen in the following .mp4 files. This system has two operating modes: north-south oscillating mode and east-west oscillating mode. The N-S and E-W poles are illustrated in red. When an external "switch" command is given to the system, it needs to safely switch to the other operating mode without entering the unsafe regions (shaded in blue).

The following video illustrates the system responding to a switch command. The system safely transitions from oscillating in the N-S mode to the E-W mode without entering the blue regions. Using black-box optimization, the time at which the switch is commanded is optimized to produce the worst-case trajectory (which is still far from the unsafe region). Observe that the switch is commanded when the system has gained a lot of momentum in transitioning to the other pole.

[[File:single_switch.mp4|right|800px|Caption: Single switch commanded results in safe trajectory.]]

In the video below, two switches are commanded in quick succession, and once again the times of the switches are optimized to produce the worst-case trajectory. In this run, however, the system enters the unsafe region, demonstrating a failure in the control design. Fundamentally, the decision to switch twice (or three times, four times, etc.) is a discrete variable. Identifying such discrete inputs in combination with continuous inputs is currently not well studied in the literature.

[[File:double_switch.mp4 |right|800px|Caption: Two switch commands in quick succession show an unsafe trajectory.]]
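To make the mixed discrete-continuous structure of this example concrete, here is a minimal sketch that enumerates the number of switch commands (the discrete input) and randomly samples the switch times (the continuous inputs) for each count. The simulator is a hypothetical stand-in, not the one used to produce the videos: the pole locations, PD gains, unsafe corner regions, and the simple enumerate-then-sample strategy are all assumptions chosen only to show the shape of the problem.

```python
import math
import random

# Hypothetical pole locations for the two operating modes
POLES = {"NS": [(0.0, 1.0), (0.0, -1.0)], "EW": [(1.0, 0.0), (-1.0, 0.0)]}


def simulate(switch_times, horizon=10.0, dt=0.01):
    """Stand-in black-box simulator: 2D point mass oscillating between two poles.

    Returns the worst safety margin over the run; a negative value means the
    mass entered an (assumed) unsafe corner region |x| > 0.6 and |y| > 0.6.
    """
    mode, target = "NS", POLES["NS"][0]
    x = y = vx = vy = 0.0
    pending = sorted(switch_times)
    margin = float("inf")
    for k in range(int(horizon / dt)):
        t = k * dt
        if pending and t >= pending[0]:
            # External switch command: change mode, head for the nearest new pole
            pending.pop(0)
            mode = "EW" if mode == "NS" else "NS"
            target = min(POLES[mode], key=lambda p: math.hypot(p[0] - x, p[1] - y))
        elif math.hypot(target[0] - x, target[1] - y) < 0.1:
            # Reached the current pole: oscillate to the opposite pole
            a, b = POLES[mode]
            target = b if target == a else a
        ax = 4.0 * (target[0] - x) - 1.0 * vx  # illustrative PD gains
        ay = 4.0 * (target[1] - y) - 1.0 * vy
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
        margin = min(margin, max(0.6 - abs(x), 0.6 - abs(y)))
    return margin


def search(max_switches=3, samples_per_count=100, horizon=10.0, seed=0):
    """Enumerate the switch count; randomly sample switch times for each count."""
    rng = random.Random(seed)
    best = (float("inf"), 0, [])
    for n in range(1, max_switches + 1):        # discrete input
        for _ in range(samples_per_count):      # continuous inputs
            times = sorted(rng.uniform(0.0, horizon) for _ in range(n))
            rho = simulate(times, horizon)
            if rho < best[0]:
                best = (rho, n, times)
    return best


if __name__ == "__main__":
    rho, n, times = search()
    print(f"worst margin {rho:.3f} with {n} switch(es) at {times}")
```

Enumerating the discrete variable scales poorly as the number of discrete choices grows, which is one reason a more principled treatment of mixed inputs is of interest.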

In this project, we aim to

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:06:17Z

Abadithe: /* Project Description */

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T21:01:05Z

Abadithe:

File:Double switch.mp4

2023-12-15T20:54:00Z

Abadithe:

File:Single switch.mp4

2023-12-15T20:26:11Z

Abadithe:

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T20:24:32Z

Abadithe:

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-15T18:51:24Z

Abadithe:

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-13T19:09:31Z

Abadithe:

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-13T19:04:00Z

Abadithe: /* Project Description */

SURF 2024: Hierarchical Testing for Safety-Critical Autonomous Systems

2023-12-13T19:03:20Z

Abadithe: Created page with "'''2024 SURF Hierarchical Testing for Safety-Critical Autonomous Systems''' * Mentor: Richard Murray * Co-mentor: Apurva Badithela ==Project Description== Automatically identifying failure cases of safety-critical autonomous systems is important for mainstream deployment of these systems. A few examples of such safety-critical robotic systems is illustrated on the right. Since autonomous robotic systems are complex and their domain of operation is very l..."

File:Autonomous robotic systems.png

2023-12-13T19:02:04Z

Abadithe:

SURF 2024

2023-12-13T18:55:14Z

Abadithe: /* List of available projects */

{{righttoc}}
This page is intended for students interested in working on SURF projects in the Summer of 2024. It contains information about how to apply for a SURF project in my group along with a list of project areas.

'''Note:''' Projects will be posted here starting after finals week and up to the start of classes. Please check back after that time for more information.

=== Applying for a SURF project in my group ===

Because I get many students interested in doing SURFs in my group and because we have several projects available, we use the first few weeks in January to sort out who we will work with in writing proposals. We only submit one proposal per project area and so we often can't accommodate everyone who wants to work in my group over the summer.

# A list of SURF project descriptions is given in the table below. Due to the number of SURF projects that we support, we are only able to support students who select from among these projects. Please make sure to read the project descriptions, required skills (if any) and skim a few of the listed references before contacting me about doing a SURF project.
# Students interested in writing proposals for SURF projects should contact me via e-mail by 10 Jan (Wed) and provide the following information:
#* A list of up to three SURF projects from the list below that you are interested in working on
#* A one page resume listing relevant experience and coursework
#* If you are not a Caltech student, I will also need the following additional information:
#** An unofficial copy of your academic transcript
#** Names of two faculty members at your current institution that I can contact for a reference
# Starting on 11 January, I will go through all applications and work with my group to identify who is a possible fit for each project. We will then contact you and ask for you to meet (or talk with) possible co-mentors so that we can eventually work out who we will work with in writing up a proposal.
# We hope to make final decisions on projects by about 1 Jan, at which point we will start working with students on writing up proposals.
# All applications should go through the normal SURF application process, described at www.surf.caltech.edu. SURF applications are due on ~22 Feb.
# If you are selected for a SURF, please be aware of the following information
#* All SURF projects in my group will start on 18 Jun (Tue). If you can't start on that date, please make sure that you indicate this when you contact me
#* All SURF projects are for a minimum of 10 weeks, although I usually recommend that you try to stay for 12 weeks if possible. It's hard to complete a project in just 10 weeks and spending a few extra weeks can greatly improve the project.
#* All SURF students in my group will be expected to devote full-time effort to their SURF project, so you cannot have a second job in addition to your SURF.
#* Additional information on SURF available here: https://sfp.caltech.edu/undergraduate-research/programs/surf

=== List of available projects ===

Projects will be posted as they become available. I recommend waiting until near the submission deadline before submitting your project preferences.

{| border=1 width=100%
|-
| '''Title''' || '''Grant/Project''' || '''Co-Mentors''' || '''Comments'''
|-
| {{SURF|2024|Hierarchical Testing for Safety-Critical Autonomous Systems}}
| TBD
| Apurva Badithela
|
|-
| {{SURF|2024|Bioengineering toolkit development for genetic alterations in the entomopathogenic nematode symbiont Xenorhabdus griffiniae}}
| TBD
| Elin Larsson
|
|-
| {{SURF|2023|Genetically-Programmed Synthetic Cells and Multi-Cellular Machines}}
| [[NSF Cell Free]]
| TBD
| Multiple projects may be available; competitive selection
|}

Marta Kwiatkowska, Jul 2023

2023-07-17T19:05:58Z

Abadithe: /* Schedule */

Marta Kwiatkowska from the University of Oxford will be visiting on 21 Jul (Fri).

=== Schedule ===

* 8:30 am: Apurva (via Zoom)
* 9:15 am: Open
* 10:00 am: Richard Murray (109 Steele Lab)
* 11:00 am: Seminar (121 Annenberg)
* 12:00 pm: Lunch with Richard, Erik, Lulu
* 1:30 pm: Lulu Qian
* 2:15 pm: Erik Winfree
* 3:00 pm: Open (T&E?)
* 4:00 pm: Open (Pacti?)
* 5:00 pm: done for the day

=== Abstract ===

TBD

=== Bio ===

Marta Kwiatkowska is Professor of Computing Systems and Fellow of
Trinity College, University of Oxford. She was elected to Academia
Europea and received a prestigious ERC Advanced Grant VERIWARE "From
software verification to everyware verification", 2010-15.

Kwiatkowska's research is concerned with modelling and verification
techniques for probabilistic systems, with application to engineered and
biological systems. She spearheaded the development of probabilistic and
quantitative methods in verification on the international scene. Her
work on the theory to practice transfer of probabilistic model checking
was recognised by invitations to speak at the LICS 2003 and ESEC/FSE
2007 conferences. She led the development of the PRISM model checker
(www.prismmodelchecker.org), the leading software tool in the area and
widely used for research and teaching. Applications of probabilistic
model checking have spanned communication and security protocols,
nanotechnology designs, power management and systems biology. Her
research is currently supported by £3.7m of grant funding from EPSRC,
EU, DARPA, Oxford Martin School and Microsoft Research.

Kwiatkowska serves on the editorial boards of IEEE Transactions on Software
Engineering, Philosophical Transactions of the Royal Society A and
Science of Computer Programming, and has lectured at several summer
schools, including ESSLLI and the Marktoberdorf Summer School.

Roee Francos, 7 Jun 2023

2023-06-06T02:21:15Z

Abadithe:

Roee Francos, a PhD student under the supervision of Prof. Freddy Bruckstein in the Computer Science Department at the Technion - Israel Institute of Technology, will be visiting Caltech on Wed (7 Jun). Roee's research focuses on multi-agent systems, computer vision and trajectory planning problems in intelligent transportation systems.

If you have some time to meet with Roee, you can sign up for a time here (please edit to add your name and indicate where you will meet/pick up Roee):

* 9:15 am: Richard, 109 Steele Lab (not house)
* 10:00 am: open
* 10:45 am: open
* 11:30 am: open
* 12:15 pm: Lunch
* 1:30 pm: open
* 2:15 pm: Apurva
* 3:00 pm: CDS tea

Roee Francos, 7 Jun 2023

2023-06-05T14:49:37Z

Abadithe:

Roee Francos, a PhD student under the supervision of Prof. Freddy Bruckstein in the Computer Science Department at the Technion - Israel Institute of Technology, will be visiting Caltech on Wed (7 Jun). Roee's research focuses on multi-agent systems, computer vision and trajectory planning problems in intelligent transportation systems.

If you have some time to meet with Roee, you can sign up for a time here (please edit to add your name and indicate where you will meet/pick up Roee):

* 9:15 am: Richard, 109 Steele Lab (not house)
* 10:00 am: Apurva
* 10:45 am: open
* 11:30 am: open
* 12:15 pm: Lunch
* 1:30 pm: open
* 2:15 pm: open
* 3:00 pm: CDS tea

Lars Nielsen, March 2023

2023-03-10T19:13:46Z

Abadithe: /* 13 Mar (Mon) */

Lars Nielsen from Linköping University in Sweden will visit Caltech on 13-17 March 2023.
__NOTOC__

{| width=100% border=1
|- valign=top
| width=20% |
=== 13 Mar (Mon) ===
* 8:30 am: Richard, 109 Steele Lab
* 9:30 am: Michael Dickinson
* 10:30 am: Apurva Badithela (Red Door)
* 11:15 am: open
* 12:00 pm: Lunch: Richard + Faculty
* 1:15 pm: John Doyle
* 2:00 pm: Seminar: 121 Annenberg
* 3:15 pm: open
* 4:00 pm: open
| width=20% |

=== 14 Mar (Tue) ===
* 9:00 am: open
* 10:00 am: Biocircuits meeting (optional)
* 12:00 pm: Lunch: students
* 1:15 pm: Soon-Jo Chung (235 Guggenheim)
* 2:00 pm: open
* 2:45 pm: open
* 3:30 pm: Joel Burdick (245 Gates-Thomas)
* 4:15 pm: open
* 6:00 pm: Dinner with Richard and RuthAnne
| width=20% |

=== 15 Mar (Wed) ===
* 1:00 pm: Markus Meister
* 1:45 pm: open
* 3:00 pm: CDS Tea
* 3:45 pm: open
| width=20% |

=== 16 Mar (Thu) ===
* Dinner with Richard and graduate students
| width=20% |

=== 17 Mar (Fri) ===
* 4:00 pm: wrap up with Richard
|}

=== Seminar information ===

Force-Centric Perspectives on Autonomous Vehicle Safety-Maneuvers

Lars Nielsen, Division of Vehicular Systems, Linköping University 
(postdoc at Caltech 85-86)

13 March, 2-3 pm OR 15 Mar, 2-3 pm 
213 Annenberg

Abstract:
Real-time avoidance maneuvers for vehicles have been developed using a force-centric perspective,
where the founding principles are obtained from studies of optimal maneuvers. The
developed optimization framework, the different criteria used, and the obtained solutions give
insight into how to control the forces on the vehicle. A highlight in this presentation is the first
algorithm not needing a tire-road friction estimate.

Lars Nielsen, March 2023

2023-03-09T00:39:14Z

Abadithe: /* 13 Mar (Mon) */

Lars Nielsen from Linköping University in Sweden will visit Caltech on 13-17 March 2023.
__NOTOC__

{| width=100% border=1
|- valign=top
| width=20% |
=== 13 Mar (Mon) ===
* 8:30 am: Richard, 110 Steele Lab
* 9:30 am: Apurva Badithela
* 10:15 am: open
* 11:00 am: Michael Dickinson
* 11:45 am: Lunch: Richard + Faculty
* 1:15 pm: open
* 2:00 pm: Hold: seminar
* 3:15 pm: open
* 4:00 pm: open
| width=20% |

=== 14 Mar (Tue) ===
* 9:00 am: open
* 10:00 am: Biocircuits meeting (optional)
* 12:00 pm: Lunch: students
* 1:15 pm: Soon-Jo Chung (235 Guggenheim)
* 2:00 pm: open
* 2:45 pm: open
* 3:30 pm: open
* 4:15 pm: open
* 6:00 pm: Dinner with Richard and RuthAnne
| width=20% |

=== 15 Mar (Wed) ===
* 1:00 pm: Markus Meister
* 1:45 pm: open
* 3:00 pm: CDS Tea
* 3:45 pm: open
| width=20% |

=== 16 Mar (Thu) ===
* Dinner with Richard and graduate students
| width=20% |

=== 17 Mar (Fri) ===
* 4:00 pm: wrap up with Richard
|}

=== Seminar information ===

Force-Centric Perspectives on Autonomous Vehicle Safety-Maneuvers

Lars Nielsen, Division of Vehicular Systems, Linköping University 
(postdoc at Caltech 85-86)

13 March, 2-3 pm OR 15 Mar, 2-3 pm 
213 Annenberg

Abstract:
Real-time avoidance maneuvers for vehicles have been developed using a force-centric perspective,
where the founding principles are obtained from studies of optimal maneuvers. The
developed optimization framework, the different criteria used, and the obtained solutions give
insight into how to control the forces on the vehicle. A highlight in this presentation is the first
algorithm not needing a tire-road friction estimate.

Wen-Hua Chen, 4-21 Oct 2022

2022-10-03T16:11:50Z

Abadithe: /* 4 Oct (Tue) */

Professor Wen-Hua Chen from Loughborough University will visit Caltech on 4-21 Oct 2022. A schedule for the first few days of his visit is given below. Please feel free to sign up for any open times.

{| border=1
|-
| align=top width=50% |
=== 4 Oct (Tue) ===
* 8:30 am: Richard Murray, 109 Steele Lab
* 9:00 am: Soon-Jo Chung, 235 Guggenheim (including lab tour (CAST & ARCL))
* 9:45 am: Diana Bohler (logistics)
* 10:30 am: Open
* 11:15 am: Open
* 12:00 pm: Lunch with Richard
* 1:15 pm: Houman Owhadi, 201 Steele House
* 2:00 pm: Lijun Chen, 217 Annenberg
* 2:45 pm: Andrew Taylor, 325 Annenberg
* 3:30 pm: Noel Csomay-Shanklin, 325 Annenberg
* 4:15 pm: Apurva (meet at 325 Annenberg and then walk over to 238 Annenberg or Annenberg lounge)
* 5:00 pm: Done for the day

| align=top width=50% |

=== 5 Oct (Wed) ===

* 9:00 am: Ersin Das, Location: Steele House
* 9:45 am: Open
* 10:30 am: Open
* 11:15 am: Prithvi Akella, (Place TBD)
* 12:00 pm: Lunch
* 1:30 pm: Anima Anandkumar, 316 Annenberg
* 2:15 pm: Josefine Graebener, Location TBD
* 3:00 pm: CDS Tea, Annenberg
* 3:45 pm: Seminar - 121 Annenberg
* 5:00 pm: Richard Murray, 109 Steele Lab
* 6:00 pm: Dinner with NCS group
|}

'''Stability of Optimisation-Based Control: Brief Review and New Results'''

Prof Wen-Hua Chen 
Department of Aeronautical and Automotive Engineering 
Loughborough University

As the size and complexity of systems and their performance specifications increase, it becomes more difficult to find analytic solutions that give optimal performance, as traditional approaches to control design require. Model Predictive Control (MPC) provides a promising mechanism for computing numerical optimal solutions online to achieve the best possible performance. However, establishing stability and other formal properties of this type of optimisation-based control poses significant challenges. This talk starts with a brief overview of the 30 years’ journey in developing stability theory for MPC. It points out that, despite all the success, there is still a significant gap between the available theoretical tools and practical applications. For example, a terminal cost that covers the optimal cost-to-go must, in general, be added to the cost function in order to ensure stability of an MPC algorithm, but most MPC schemes used in practical applications do not have a terminal cost (for example, all case studies in the Matlab Nonlinear MPC Toolbox lack a terminal cost but work well). This talk presents a new approach and development in this area. The new stability conditions are entirely complementary to the existing terminal-based MPC stability theory. In contrast to the existing MPC stability conditions, they cover the case where the terminal cost is less than the optimal cost-to-go, including zero and even negative terminal costs. The new conditions are established based on a property of a modified stage cost. Numerical results are presented to illustrate the links and differences between the new approach and the existing stability theory. It is hoped that this work will trigger more research into understanding the interaction between optimisation and feedback loops in both the AI and the control communities, so as to ensure the efficiency and safety of future robotics and autonomous systems.

Dr Wen-Hua Chen is Professor in Autonomous Vehicles in the Department of Aeronautical and Automotive Engineering at Loughborough University, UK. Prof. Chen has considerable experience in control, signal processing and artificial intelligence and their applications in aerospace, automotive and agricultural systems. In the last 15 years, he has been working on the development and application of unmanned aircraft system and intelligent vehicle technologies, spanning autopilots, situational awareness, decision making, verification, and remote sensing for precision agriculture and environment monitoring. He is a Chartered Engineer and a Fellow of IEEE, the Institution of Mechanical Engineers and the Institution of Engineering and Technology, UK. Recently Prof Chen was awarded an EPSRC (Engineering and Physical Sciences Research Council) Established Career Fellowship in developing control theory for the next generation of control systems to enable high levels of automation such as robotics and autonomous systems.

Wen-Hua Chen, 4-21 Oct 2022

2022-10-02T18:23:48Z

Abadithe: /* 4 Oct (Tue) */

Professor Wen-Hua Chen from Loughborough University will visit Caltech on 4-21 Oct 2022. A schedule for the first few days of his visit is given below. Please feel free to sign up for any open times.

{| border=1
|-
| align=top width=50% |
=== 4 Oct (Tue) ===
* 8:30 am: Richard Murray, 109 Steele Lab
* 9:00 am: Soon-Jo Chung, 235 Guggenheim (including lab tour (CAST & ARCL))
* 9:45 am: Diana Bohler (logistics)
* 10:30 am: Open
* 11:15 am: Apurva (meet outside Richard's office and walk over to Red door)
* 12:00 pm: Lunch with Richard
* 1:15 pm: Houman Owhadi, 201 Steele House
* 2:00 pm: Open
* 2:45 pm: Open
* 3:30 pm: Open
* 4:15 pm: Open
* 5:00 pm: Done for the day

| align=top width=50% |

=== 5 Oct (Wed) ===

* 9:00 am: Ersin Das, Location: TBD
* 9:45 am: Open
* 10:30 am: Open
* 11:15 am: Prithvi Akella, (Place TBD)
* 12:00 pm: Lunch
* 1:30 pm: Anima Anandkumar, 316 Annenberg
* 2:15 pm: Josefine Graebener, Location TBD
* 3:00 pm: CDS Tea, Annenberg
* 3:45 pm: Seminar - 121 Annenberg
* 5:00 pm: Richard Murray, 109 Steele Lab
* 6:00 pm: Dinner with NCS group
|}

'''Stability of Optimisation-Based Control: Brief Review and New Results'''

Prof Wen-Hua Chen 
Department of Aeronautical and Automotive Engineering 
Loughborough University

As the size and complexity of systems and their performance specifications increase, it becomes more difficult to find analytic solutions that give optimal performance, as traditional approaches to control design require. Model Predictive Control (MPC) provides a promising mechanism for computing numerical optimal solutions online to achieve the best possible performance. However, establishing stability and other formal properties of this type of optimisation-based control poses significant challenges. This talk starts with a brief overview of the 30 years’ journey in developing stability theory for MPC. It points out that, despite all the success, there is still a significant gap between the available theoretical tools and practical applications. For example, a terminal cost that covers the optimal cost-to-go must, in general, be added to the cost function in order to ensure stability of an MPC algorithm, but most MPC schemes used in practical applications do not have a terminal cost (for example, all case studies in the Matlab Nonlinear MPC Toolbox lack a terminal cost but work well). This talk presents a new approach and development in this area. The new stability conditions are entirely complementary to the existing terminal-based MPC stability theory. In contrast to the existing MPC stability conditions, they cover the case where the terminal cost is less than the optimal cost-to-go, including zero and even negative terminal costs. The new conditions are established based on a property of a modified stage cost. Numerical results are presented to illustrate the links and differences between the new approach and the existing stability theory. It is hoped that this work will trigger more research into understanding the interaction between optimisation and feedback loops in both the AI and the control communities, so as to ensure the efficiency and safety of future robotics and autonomous systems.

Dr Wen-Hua Chen is Professor in Autonomous Vehicles in the Department of Aeronautical and Automotive Engineering at Loughborough University, UK. Prof. Chen has considerable experience in control, signal processing and artificial intelligence and their applications in aerospace, automotive and agricultural systems. In the last 15 years, he has been working on the development and application of unmanned aircraft system and intelligent vehicle technologies, spanning autopilots, situational awareness, decision making, verification, and remote sensing for precision agriculture and environment monitoring. He is a Chartered Engineer and a Fellow of IEEE, the Institution of Mechanical Engineers and the Institution of Engineering and Technology, UK. Recently Prof Chen was awarded an EPSRC (Engineering and Physical Sciences Research Council) Established Career Fellowship in developing control theory for the next generation of control systems to enable high levels of automation such as robotics and autonomous systems.

SURF 2022: Specification Monitor for Testing of Autonomous Systems

2022-01-20T03:32:45Z

Abadithe:

'''[[SURF 2022|2022 SURF]] project description'''
* Mentor: Richard Murray
* Co-mentors: Josefine Graebener, Apurva Badithela

==Project Description==

[[File:Duckiebot_db21.jpg|thumb|500px|right|Duckiebot model DB21. Image from https://www.duckietown.org/mooc]]

Testing of autonomous vehicles (AVs) is a very time- and cost-intensive effort that needs to be repeated after every system modification [1]. Finding a way to improve the efficiency of testing is therefore a valuable step on the path to more autonomy. We propose a framework that `merges' multiple unit tests into fewer tests that are guaranteed to cover what the unit tests cover.

This framework uses a model of the system to find the merged test via simulation and tree search. The model is non-deterministic but is assumed to be perfect. Realistically, however, the system model will not cover the entire system in all possible real-world situations because of the gap between simulation and the real world, so executing the test on the actual hardware might not produce the desired outcome. While executing the testing campaign, we need a way to automatically evaluate each test: whether it satisfied the test specification (for example, testing a left turn) and whether the system behaved as expected (for example, safe and comfortable driving). We then need to learn from the test outcomes to improve the future testing campaign.

The summer project is to implement a `monitor' that visualizes whether the actual test achieved the desired outcome, and to deploy it on the Duckietown hardware [2]. The test monitor needs to show the satisfaction or violation of both the system specification and the test specification. It will enable learning from previously run tests and improving the test suite by modifying subsequent tests when the hardware did not perform as expected. After the monitor is complete, its output can be used to determine whether the testing campaign can be improved and to generate an improved campaign.
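As a very rough sketch of what such a monitor could compute, the Python snippet below checks a logged trace against one system specification ("always stay near the lane center and under a speed limit") and one test specification ("eventually complete a left turn"). The `Sample` fields, thresholds, and specification choices are hypothetical and are not taken from the Duckietown software stack; a real monitor would read these signals from the robot's logs or ROS topics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Sample:
    """One logged time step (all fields are illustrative assumptions)."""
    t: float            # time [s]
    lane_offset: float  # lateral distance from lane center [m]
    speed: float        # forward speed [m/s]
    turned_left: bool   # True once a left turn has been completed


def always(trace: List[Sample], pred: Callable[[Sample], bool]) -> bool:
    """Safety-style check: the predicate holds at every sample."""
    return all(pred(s) for s in trace)


def eventually(trace: List[Sample], pred: Callable[[Sample], bool]) -> bool:
    """Reachability-style check: the predicate holds at some sample."""
    return any(pred(s) for s in trace)


def monitor(trace: List[Sample], max_offset: float = 0.1,
            max_speed: float = 0.5) -> Dict[str, bool]:
    """Report satisfaction of the system and test specifications separately."""
    system_ok = always(
        trace, lambda s: abs(s.lane_offset) <= max_offset and s.speed <= max_speed
    )
    test_ok = eventually(trace, lambda s: s.turned_left)
    return {"system_spec": system_ok, "test_spec": test_ok}


if __name__ == "__main__":
    trace = [
        Sample(0.0, 0.02, 0.3, False),
        Sample(1.0, 0.05, 0.4, False),
        Sample(2.0, 0.03, 0.3, True),
    ]
    print(monitor(trace))
```

Reporting the two verdicts separately distinguishes a test that never exercised the intended behavior from a system that misbehaved during a valid test.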

Familiarity with robotic hardware (we are using Duckiebots DB21), Python 3, ROS, and Docker would be beneficial.

==References==
[1] Kalra, N., & Paddock, S. M. (2016). Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability?. Transportation Research Part A: Policy and Practice, 94, 182-193.

[2] Paull, L., Tani, J., Ahn, H., Alonso-Mora, J., Carlone, L., Cap, M., ... & Censi, A. (2017, May). Duckietown: an open, inexpensive and flexible platform for autonomy education and research. In 2017 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1497-1504). IEEE.

CDS 112/Ae 103a, Winter 2022

2022-01-03T22:53:15Z

Abadithe: last day

{| width=100%
|-
| colspan=2 align=center |
Optimal Control and Estimation__NOTOC__
|- valign=top
| width=50% |
'''Instructors'''
* Richard Murray (CDS/BE), murray@cds.caltech.edu
* Lectures: MWF, 2-3 pm, 213 ANB
| width=50% |
'''Teaching Assistants'''
* Apurva Badithela (CDS), Ayush Pandey (CDS)
* Office hours: Fri, 4-5 and Mon, 3-4. Location TBD.
|}

This is the course homepage for CDS 112 (and Ae 103a), Winter 2022. This course is intended for undergraduates and graduate students interested in optimization-based methods in control. After completion of the course, students will understand the key principles of state-space based controller design, including optimal estimation and control techniques.

=== Catalog Description ===

'''CDS 112. Optimal Control and Estimation.''' 9 units (3-0-6): second term. Prerequisites: CDS 110 (or equivalent) and CDS 131. Optimization-based design of control systems, including optimal control and receding horizon control. Introductory random processes and optimal estimation. Kalman filtering and nonlinear filtering methods for autonomous systems.

'''Ae 103 a. Aerospace Control Systems.''' 9 units (3-0-6): second term. Prerequisites: CDS 110 (or equivalent), CDS 131 or permission of instructor. Optimization-based design of control systems, including optimal control and receding horizon control. Introductory random processes and optimal estimation. Kalman filtering and nonlinear filtering methods for autonomous systems.

=== Lecture Schedule ===

{| class="mw-collapsible wikitable" width=100% border=1 cellpadding=5
|-
| '''Date'''
| '''Topic'''
| '''Reading'''
| '''Homework'''

|- valign=top
| '''Week 1''' 
3 Jan 5 Jan 7 Jan
| Introduction and review
* Course introduction and logistics
* Overview: control architectures
* [[http:python-control.org|Python Control Systems Library]]
|
* [[http:fbswiki.org/OBC|OBC]], {{OBC pdf|obc-intro|29Dec2021|Chapter 1}}
* Lecture notes: {{cds112 wi2022 pdf|L1-1_intro-03Jan2022.pdf|Mon}}, Wed
* Jupyter notebook: {{cds112 wi2022 pdf|W1_intro_to_python-control.ipynb}}
* [https://simons.berkeley.edu/control-theory Feedback Control Theory video tutorial (Simons Institute)]
| {{cds112 wi2022 pdf|hw1-wi2022.pdf|HW #1}} 
Out: 5 Jan 
Due: 12 Jan 


|- valign=top
| '''Week 2''' 
10 Jan 12 Jan 14 Jan
| Two degree of freedom control design
* Trajectory generation
* Differential flatness
* Implementation in Python
* Gain scheduling (if time)
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 2
* Background: FBS2e, Section 8.5
* Theory: LST, Sections 3.1, 3.2
| {{cds112 wi2022 pdf|hw2-wi2022.pdf|HW #2}} 
Out: 12 Jan 
Due: 19 Jan 


|- valign=top
| '''Week 3''' 
<s>17 Jan</s> 19 Jan 21 Jan
| Optimal control
* Maximum principle
* Dynamic programming
* Examples and applications
* Implementation in Python
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 3
| {{cds112 wi2022 pdf|hw3-wi2022.pdf|HW #3}} 
Out: 19 Jan 
Due: 26 Jan 


|- valign=top
| '''Week 4''' 
24 Jan 26 Jan 28 Jan*
| Linear quadratic regulators
* Problem formulation and solution
* Choosing LQR Weights
* Incorporating integral feedback
* Implementation in Python
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 3
| {{cds112 wi2022 pdf|hw4-wi2022.pdf|HW #4}} 
Out: 26 Jan 
Due: 2 Feb 


|- valign=top
| '''Week 5''' 
31 Jan 2 Feb 4 Feb
| Receding horizon control
* Problem formulation and solution
* Receding horizon control using differential flatness
* Example: Caltech ducted fan
* Implementation in Python
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 4
| {{cds112 wi2022 pdf|hw5-wi2022.pdf|HW #5}} 
Out: 2 Feb 
Due: 9 Feb 


|- valign=top
| '''Week 6''' 
7 Feb 9 Feb 11 Feb
| Stochastic systems
* Review of random variables
* Introduction to random processes
* Continuous-time, vector-valued random processes
* Linear stochastic systems
* Random processes in the frequency domain
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 5
| {{cds112 wi2022 pdf|hw6-wi2022.pdf|HW #6}} 
Out: 9 Feb 
Due: 16 Feb 


|- valign=top
| '''Week 7''' 
14 Feb 16 Feb 18 Feb*
| Kalman filtering
* Linear quadratic estimators
* Extensions of the Kalman filter
* LQG control
* Example: vectored thrust aircraft
* Implementation in Python
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 6
| {{cds112 wi2022 pdf|hw7-wi2022.pdf|HW #7}} 
Out: 16 Feb 
Due: 23 Feb 


|- valign=top
| '''Week 8''' 
<s>21 Feb</s> 23 Feb* 25 Feb*
| Sensor fusion
* Discrete-time stochastic systems
* Kalman filters in discrete time
* Predictor-corrector form
* Combining information from multiple sensors
* Information filters
* Implementation in Python
|
* [[http:fbswiki.org/OBC|OBC]], Chapter 7
| {{cds112 wi2022 pdf|hw8-wi2022.pdf|HW #8}} 
Out: 23 Feb 
Due: 2 Mar 


|- valign=top
| '''Week 9''' 
28 Feb 2 Mar 4 Mar
| Autonomous systems
* Multi-layer control stack for autonomous systems
* Introduction to discrete decision-making
* Introduction to safety-critical systems
* Challenges and open problems
|
* TBD
* [[http:www.youtube.com/watch?v=Wi8Y---ce28|Can We Really Use Machine Learning in Safety Critical Systems? (IPAM talk)]]
| {{cds112 wi2022 pdf|hw9-wi2022.pdf|HW #9}} 
Out: 2 Mar 
Due: 9 Mar 


|- valign=top
| '''Week 10''' 
7 Mar 9 Mar
| Review for final
|
| {{cds112 wi2022 pdf|final-wi2022.pdf|Final}} 
Out: 9 Mar 
Due: 16 Mar, 5 pm 


|}

=== Grading ===
The final grade will be based on homework sets and a final exam:

*''Homework (70%):'' Homework sets will be handed out weekly and are due on Wednesdays by 2 pm via Gradescope. Each student is allowed up to two extensions of no more than 2 days each over the course of the term. Homework turned in after Friday at 2 pm or after the two extensions are exhausted will not be accepted without a note from the health center or the Dean. MATLAB/Python code and SIMULINK/Modelica diagrams are considered part of your solution and should be printed and turned in with the problem set (whether the problem asks for it or not).

:The lowest homework set grade will be dropped when computing your final grade.

* ''Final exam (30%):'' The final exam will be handed out on the last day of class (9 Mar) and is due at the end of finals week. It will be an open book exam and computers will be allowed (though not required).

=== Collaboration Policy ===

Collaboration on homework assignments is encouraged. You may consult outside reference materials, other students, the TA, or the instructor, but you cannot consult homework solutions from prior years and you must cite any use of material from outside references. All solutions that are handed in should be written up individually and should reflect your own understanding of the subject matter at the time of writing. Any computer code that is used to solve homework problems is considered part of your writeup and should be done individually (you can share ideas, but not code).

No collaboration is allowed on the final exam.

=== Course Text and References ===

The primary course text is
* [OBC] R. M. Murray, "Optimization-Based Control", 2022. [https://fbswiki.org/wiki/index.php/Supplement:_Optimization-Based_Control Online access]

The following additional references may also be useful:

* [FBS2e] K. J. Åström and Richard M. Murray, [http://fbsbook.org ''Feedback Systems: An Introduction for Scientists and Engineers''], Princeton University Press, Second Edition*, 2020.
* [LST] Richard M. Murray, Feedback Systems: Notes on Linear Systems Theory, 2020. (Updated 30 Oct 2020)


Note: the only sources listed here are those that allow free access to online versions. Additional textbooks that are not freely available can be obtained from the library.

[[Category: Courses]]

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T19:51:21Z

Abadithe: /* What you can expect from this SURF */

2022 SURF: Evaluating Redundancy between Test Executions

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on-board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could potentially be useful in designing our experiments.

Tests can generally be classified into four categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The test strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic on the test environment duckiebots does not change, regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate to could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the test strategies of the agents in the environment are reactive to the state of the system. In the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot under test drive through an empty intersection.
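
The four categories above come from two independent yes/no questions: whether the test reacts to the real-time state of the system under test, and whether the environment contains agents with continuous dynamics. A minimal Python sketch of that classification follows. The names <code>ScenarioSpec</code> and <code>classify</code> are hypothetical and are not part of any Duckietown software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioSpec:
    """Hypothetical description of a test; the field names are illustrative only."""
    reacts_to_system_state: bool  # do test parameters depend on the real-time state?
    has_dynamic_agents: bool      # does the environment have agents with continuous dynamics?


def classify(spec: ScenarioSpec) -> str:
    """Return the name of the category (from the list above) that the test falls into."""
    mode = "Reactive" if spec.reacts_to_system_state else "Open-loop"
    env = "dynamic" if spec.has_dynamic_agents else "static"
    return f"{mode} test in a {env} environment"


# Fixed obstacles and a predefined initial pose: nothing reacts, nothing moves.
print(classify(ScenarioSpec(reacts_to_system_state=False, has_dynamic_agents=False)))
# A test duckiebot that adjusts its speed based on the duckiebot under test.
print(classify(ScenarioSpec(reacts_to_system_state=True, has_dynamic_agents=True)))
```

Treating the two questions as separate fields makes it easy to enumerate the pairs of paradigms that are compared for redundancy.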

For this SURF, we would like to implement a few tests in test environments of varying complexity (as described above) on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. For instance, are there open-loop test executions in a dynamic environment that can be replaced by an equivalent open-loop test in a static environment? Likewise, we would also like to reason about redundancy between the following test paradigms: open-loop test in a dynamic environment vs. reactive test in a dynamic environment, and reactive test in a static environment vs. reactive test in a dynamic environment. We can start by defining and computing a notion of information gain over the test execution and show that a more complex test may or may not offer any new insight regarding system performance. If we can prove, or find a counterexample that disproves, redundancy between test executions from different test environment paradigms, we would like to implement the test executions on Duckietown to demonstrate this. In addition to the Duckietown hardware, we have access to the Duckietown simulator, which will be a useful tool for this study.
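
The notion of information gain is left open above; defining it is part of the project. As one possible starting point only, the sketch below uses the expected reduction in Shannon entropy of a belief over a finite set of hypotheses about the system. The function names, the two hypotheses, and all probabilities are made-up assumptions for illustration, not results or definitions from this project.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution given as a list."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction about the hypothesis after observing one test outcome.

    prior[h]         -- prior probability of hypothesis h
    likelihood[h][o] -- probability of test outcome o if hypothesis h holds
    """
    n_hyp = len(prior)
    n_out = len(likelihood[0])
    gain = entropy(prior)
    for o in range(n_out):
        p_o = sum(prior[h] * likelihood[h][o] for h in range(n_hyp))
        if p_o == 0:
            continue  # outcome never occurs, so it contributes nothing
        posterior = [prior[h] * likelihood[h][o] / p_o for h in range(n_hyp)]
        gain -= p_o * entropy(posterior)
    return gain


# Hypotheses (assumed): 0 = controller yields correctly, 1 = controller never yields.
prior = [0.5, 0.5]
# Outcomes: 0 = pass, 1 = fail. Numbers below are invented for illustration.
empty_intersection = [[1.0, 0.0], [1.0, 0.0]]      # both hypotheses always pass
contested_intersection = [[1.0, 0.0], [0.0, 1.0]]  # only the correct controller passes

print(expected_information_gain(prior, empty_intersection))
print(expected_information_gain(prior, contested_intersection))
```

Under these assumed numbers the empty-intersection test yields 0 bits and the contested-intersection test yields 1 bit. Two tests with identical likelihood tables have the same gain under this measure, which is one candidate way to phrase redundancy.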

==Prerequisites==
* Experience coding in Python
* Willing to learn development on Docker and GitHub
* Distributed Computing and Formal Methods (CS 142), Robotics (ME 133abc, ME 134)

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T19:51:09Z

Abadithe:

2022 SURF: Evaluating Redundancy between Test Executions

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on-board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could potentially be useful in designing our experiments.

Tests can generally be classified into four categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The test strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic on the test environment duckiebots does not change, regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate to could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the test strategies of the agents in the environment are reactive to the state of the system. In the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot under test drive through an empty intersection.

For this SURF, we would like to implement a few tests in test environments of varying complexity (as described above) on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. For instance, are there open-loop test executions in a dynamic environment that can be replaced by an equivalent open-loop test in a static environment? Likewise, we would also like to reason about redundancy between the following test paradigms: open-loop test in a dynamic environment vs. reactive test in a dynamic environment, and reactive test in a static environment vs. reactive test in a dynamic environment. We can start by defining and computing a notion of information gain over the test execution and show that a more complex test may or may not offer any new insight regarding system performance. If we can prove, or find a counterexample that disproves, redundancy between test executions from different test environment paradigms, we would like to implement the test executions on Duckietown to demonstrate this. In addition to the Duckietown hardware, we have access to the Duckietown simulator, which will be a useful tool for this study.

==Prerequisites==
* Experience coding in Python
* Willing to learn development on Docker and GitHub
* Distributed Computing and Formal Methods (CS 142), Robotics (ME 133abc, ME 134)

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T17:29:02Z

Abadithe:

2022 SURF: Evaluating Redundancy between Test Executions

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on-board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could potentially be useful in designing our experiments.

Tests can generally be classified into four categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The test strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic on the test environment duckiebots does not change, regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate to could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the test strategies of the agents in the environment are reactive to the state of the system. In the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot under test drive through an empty intersection.

For this SURF, we would like to implement a few tests in test environments of varying complexity (as described above) on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. For instance, are there open-loop test executions in a dynamic environment that can be replaced by an equivalent open-loop test in a static environment? Likewise, we would also like to reason about redundancy between the following test paradigms: open-loop test in a dynamic environment vs. reactive test in a dynamic environment, and reactive test in a static environment vs. reactive test in a dynamic environment. We can start by defining and computing a notion of information gain over the test execution and show that a more complex test may or may not offer any new insight regarding system performance. If we can prove, or find a counterexample that disproves, redundancy between test executions from different test environment paradigms, we would like to implement the test executions on Duckietown to demonstrate this. In addition to the Duckietown hardware, we have access to the Duckietown simulator, which will be a useful tool for this study.

==Prerequisites==
* Experience coding in Python
* Willing to learn development on Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T17:07:23Z

Abadithe:

2022 SURF: Evaluating Redundancy between Test Executions

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on-board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could potentially be useful in designing our experiments.

Tests can generally be classified into four categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The test strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic on the test environment duckiebots does not change, regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate to could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the test strategies of the agents in the environment are reactive to the state of the system. In the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot under test drive through an empty intersection.

For this SURF, we would like to implement a few tests in test environments of varying complexity (as described above) on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. For instance, are there open-loop test executions in a dynamic environment that can be replaced by an equivalent open-loop test in a static environment? Likewise, we would also like to reason about redundancy between the following test paradigms: open-loop test in a dynamic environment vs. reactive test in a dynamic environment, and reactive test in a static environment vs. reactive test in a dynamic environment. We can start by defining and computing a notion of information gain over the test execution and show that a more complex test may or may not offer any new insight regarding system performance. If we can prove, or find a counterexample that disproves, redundancy between test executions from different test environment paradigms, we would like to implement the test executions on Duckietown to demonstrate this. In addition to the Duckietown hardware, we have access to the Duckietown simulator, which will be a useful tool for this study.

==Requisites==
* Experience coding in Python
* Willing to learn development on Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T17:07:11Z

Abadithe:

2022 SURF: Evaluating Redundancy between Test Paradigms

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on-board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could potentially be useful in designing our experiments.

Tests can generally be classified into four categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The test strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic on the test environment duckiebots does not change, regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate to could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the test strategies of the agents in the environment are reactive to the state of the system. In the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot under test drive through an empty intersection.

For this SURF, we would like to implement a few tests in test environments of varying complexity (as described above) on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. For instance, are there open-loop test executions in a dynamic environment that can be replaced by an equivalent open-loop test in a static environment? Likewise, we would also like to reason about redundancy between the following test paradigms: open-loop test in a dynamic environment vs. reactive test in a dynamic environment, and reactive test in a static environment vs. reactive test in a dynamic environment. We can start by defining and computing a notion of information gain over the test execution and show that a more complex test may or may not offer any new insight regarding system performance. If we can prove, or find a counterexample that disproves, redundancy between test executions from different test environment paradigms, we would like to implement the test executions on Duckietown to demonstrate this. In addition to the Duckietown hardware, we have access to the Duckietown simulator, which will be a useful tool for this study.

==Requisites==
* Experience coding in Python
* Willingness to learn development with Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T17:06:44Z

Abadithe: /* Project Description */

2022 SURF: Evaluating Redundancy between Test Cases

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could be useful in designing our experiments.

For example, tests can generally be classified into four different categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic does not change regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the strategies of the test agents in the environment are reactive to the state of the system. Considering the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot drive through an empty intersection.
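The four categories above vary along two independent axes: whether the test reacts to the system under test, and whether the environment contains agents with continuous dynamics. As a rough sketch, this could be encoded as a small data structure when organizing experiments. The class and field names here are hypothetical and do not come from any existing Duckietown code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """One of the four test categories, described by two boolean axes."""

    reactive: bool  # test parameters depend on the real-time state of the system under test
    dynamic: bool   # environment has agents with continuous dynamics

    def label(self) -> str:
        loop = "Reactive" if self.reactive else "Open-loop"
        env = "dynamic" if self.dynamic else "static"
        return f"{loop} test in a {env} environment"


# Enumerate all four combinations of the two axes.
ALL_PARADIGMS = [Paradigm(r, d) for r in (False, True) for d in (False, True)]

for paradigm in ALL_PARADIGMS:
    print(paradigm.label())
```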

For this SURF, we would like to implement a few tests in test environments of varying complexity (as described above) on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. For instance, are there open-loop test executions in a dynamic environment that can be replaced by an equivalent open-loop test in a static environment? Likewise, we would like to reason about redundancy between the following pairs of test paradigms: open-loop test in a dynamic environment vs. reactive test in a dynamic environment, and reactive test in a static environment vs. reactive test in a dynamic environment. We can start by defining and computing a notion of information gain over the test execution and showing that a more complex test may or may not offer any new insight regarding system performance. If we can prove redundancy between test executions from different test environment paradigms, or find a counterexample that disproves it, we would like to implement the test executions on Duckietown to demonstrate this. In addition to the Duckietown hardware, we have access to the Duckietown simulator, which will be a useful tool for this study.

==Requisites==
* Experience coding in Python
* Willingness to learn development with Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T16:55:19Z

Abadithe: /* Project Description */

2022 SURF: Evaluating Redundancy between Test Cases

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings, operational testing is imperative. However, the principled generation of operational tests is still a young but growing research area. Since autonomous systems are complex and the domain of their operating environments is typically very large, it is not possible to exhaustively check or verify an autonomous system's behavior. Instead, we need an automated paradigm to select a small number of tests that are the most informative about the system. In this work, we want to formally characterize the notion of redundancy between two test executions.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could be useful in designing our experiments.

For example, tests can generally be classified into four different categories:
* Open-loop test in a static environment -- The test parameters are defined prior to the start of the test and do not depend on the real-time state of the autonomous system during the test. The test environment has no continuous dynamics. An example of this type of test would be to set up the track with obstacles that block the autonomous duckiebot's path and have the duckiebot start with some predefined initial pose and velocity.

* Open-loop test in a dynamic environment -- The strategies of the test environment agents do not depend on the real-time state of the system under test. The test environment has agents with continuous dynamics. For example, in this test paradigm, we could have test environment duckiebots with predefined instructions on how to navigate around the track, and this pre-programmed logic does not change regardless of the behavior of the duckiebot under test.

* Reactive test in a static environment -- The test parameters depend on the real-time state of the system under test. The test environment does not have agents with continuous dynamics. For example, obstacles can be placed on the track, but instructions on where the duckiebot under test should navigate could be generated in real time and depend on the trajectory of the duckiebot under test.

* Reactive test in a dynamic environment -- The test parameters depend on the real-time state of the system under test, and the strategies of the test agents in the environment are reactive to the state of the system. Considering the running example of testing a duckiebot on the track, this type of test would comprise test environment duckiebots with strategies that react to the duckiebot under test. A concrete example would be a test duckiebot slowing down so that it reaches an intersection at the same time as the duckiebot under test. This would prompt the duckiebot under test to account for the test duckiebot before proceeding through the intersection, which would be a more useful test than having the duckiebot drive through an empty intersection.

For this SURF, we would like to implement a few tests in test environments of varying complexity on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. We can do this by defining and computing a notion of information gain over the test and showing that a more complex test may or may not offer any new insight regarding system performance.

==Requisites==
* Experience coding in Python
* Willingness to learn development with Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T16:39:31Z

Abadithe: /* Project Description */

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T16:29:11Z

Abadithe: /* Project Description */

2022 SURF: Evaluating Redundancy between Test Cases

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings,

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could be useful in designing our experiments.

For example, tests can generally be classified into four different categories: open-loop test in a static environment, open-loop test in a dynamic environment, reactive test in a static environment, and reactive test in a dynamic environment.

For this SURF, we would like to implement a few tests in test environments of varying complexity on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. We can do this by defining and computing a notion of information gain over the test and showing that a more complex test may or may not offer any new insight regarding system performance.

==Requisites==
* Experience coding in Python
* Willingness to learn development with Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T16:28:38Z

Abadithe: /* What you can possibly expect from this SURF */

2022 SURF: Evaluating Redundancy between Test Cases

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings,

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could be useful in designing our experiments.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

For example, tests can generally be classified into four different categories: open-loop test in a static environment, open-loop test in a dynamic environment, reactive test in a static environment, and reactive test in a dynamic environment.

For this SURF, we would like to implement a few tests in test environments of varying complexity on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. We can do this by defining and computing a notion of information gain over the test and showing that a more complex test may or may not offer any new insight regarding system performance.

==Requisites==
* Experience coding in Python
* Willingness to learn development with Docker and GitHub
* Interest in hands-on robotics experience

==What you can expect from this SURF==
* Work closely with graduate students on test case generation for autonomy
* Hands-on experience with autonomous robots
* Coming up with theoretical insights (e.g., proving results on which classes of systems and test paradigms are equivalent)
* Writing open-source code to implement algorithms that demonstrate these ideas

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T05:49:05Z

Abadithe:

2022 SURF: Evaluating Redundancy between Test Cases

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==
For autonomy to be deployed in safety-critical settings,

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup. The duckiebot under test has an off-the-shelf controller implemented on board for indefinite navigation around the track. In addition to the hardware setup, we have access to a simulator of the hardware setup that could be useful in designing our experiments.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

For example, tests can generally be classified into four different categories: open-loop test in a static environment, open-loop test in a dynamic environment, reactive test in a static environment, and reactive test in a dynamic environment.

For this SURF, we would like to implement a few tests in test environments of varying complexity on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. We can do this by defining and computing a notion of information gain over the test and showing that a more complex test may or may not offer any new insight regarding system performance.

==Requisites==
* Experience coding in Python
* Willingness to learn development with Docker and GitHub
* Interest in hands-on robotics experience

==What you can possibly expect from this SURF==
* Get a sense of the research frontier on test case generation for autonomy
* Hands-on experience with

==References==
1. Duckietown. https://docs.duckietown.org/daffy/duckietown-robotics-development/out/index.html

SURF 2022: Evaluating Redundancy Between Test Executions for Autonomous Vehicles

2021-12-21T05:33:57Z

Abadithe:

2022 SURF: Evaluating Redundancy between Test Cases

- Mentor: Richard Murray

- Co-mentors: Apurva Badithela, Josefine Graebener

==Project Description==

Testing autonomous systems requires defining the test environment, which comprises test agents, obstacles, and test harnesses on the system under test. Test cases of varying complexity (length of the test, number of test agents and their strategies) could offer the same information on the system's ability to satisfy a requirement. Consider the following example of testing a miniature self-driving car on the Duckietown platform. The autonomous car to be tested has a controller that navigates indefinitely around a loop: it needs to follow lanes, avoid colliding with other cars, and take unprotected left turns at intersections after reading the appropriate road signs. The figure to the right shows a duckiebot on a simple layout; other duckiebots and mini road signs can easily be added to this setup.

[[Image:Screen Shot 2021-12-20 at 11.27.21 PM.png|right|400px]]

* For example, tests can generally be classified into four different categories: open-loop test in a static environment, open-loop test in a dynamic environment, reactive test in a static environment, and reactive test in a dynamic environment.

* For this SURF, we would like to implement a few tests in test environments of varying complexity on the Duckietown hardware. We would then like to characterize when two tests are redundant and when they are not. We can do this by defining and computing a notion of information gain over the test and showing that a more complex test may or may not offer any new insight regarding system performance.

==References==