Difference between revisions of "SURF 2020: Test and Evaluation for Autonomy"

From Murray Wiki

Revision as of 00:43, 11 December 2019

Autonomous systems are an emerging technology with potential for growth and impact in safety-critical applications such as self-driving cars, space missions, and distributed power grids. In these applications, a rigorous, proof-based framework for the design, test and evaluation of autonomy is necessary.

The architecture of autonomous systems can be represented as a hierarchy of levels (see Figure 1 below) with a discrete decision-making layer at the top and low-level controllers at the bottom. In this project, we will focus on testing at the top layer, that is, testing for discrete event systems. The requirements of the system under consideration can be represented as specifications in a formal language. In the theoretical literature, specifications are most commonly written in temporal logics such as Linear Temporal Logic (LTL), Signal Temporal Logic (STL), and Computation Tree Logic (CTL) (see References 2 and 3 for an overview of writing specifications in temporal logic). In addition to the system and its specifications, we have an environment for which corresponding specifications can be written. The environment consists of the other agents modeled in the problem statement that are not part of the system. See Figure 2 below for an example scenario of the system and environment.

Figure 1: Architecture of autonomous systems
Figure 2: The blue agent constitutes the system. Its objective is to always eventually go to GOAL when the park signal is ON, and always eventually go HOME when the park signal is OFF. With every step, its FUEL decreases by 1 unit. The blue agent must accomplish these objectives without running out of FUEL and without hitting the red patrolling agent, which acts as an obstacle. The red patrolling car and the park signal constitute the environment. The park signal can only take ON/OFF values, and the red agent can only move along the red transition lines on the grid.

If the system and environment specifications are given as GR(1) LTL specifications, it is possible to synthesize correct-by-construction controllers using TuLiP (Reference 5). More background on the GR(1) formalism can be found in Reference 4. In this work, we use GR(1) specifications to describe system and environment requirements.
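For orientation, a GR(1) specification is usually written in the assume-guarantee form below. The symbols are notation chosen for this sketch rather than taken from the project description: θ denotes an initial condition, ρ a transition (safety) relation, and J a recurrence (progress) condition, with subscripts and superscripts e and s for environment and system.

```latex
\varphi \;=\;
\Bigl( \theta_e \,\wedge\, \square \rho_e \,\wedge\, \bigwedge_{i=1}^{m} \square\lozenge J^{e}_{i} \Bigr)
\;\rightarrow\;
\Bigl( \theta_s \,\wedge\, \square \rho_s \,\wedge\, \bigwedge_{j=1}^{n} \square\lozenge J^{s}_{j} \Bigr)
```

In the scenario of Figure 2, requirements such as not running out of FUEL and not hitting the red agent would enter through the system safety term, while the restriction of the red agent to the red transition lines would enter through the environment safety term.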

In testing, we check whether the system satisfies its specifications for a finite number of possible combinations of initial conditions, model parameters and environment actions. From test data, we can then evaluate whether the system has passed the test. The challenge is to algorithmically generate the "best" (finite) set of tests for an autonomous system, given the specifications of the system and the environment. To this end, we would like the SURF student to develop a Test Plan Monitor for discrete decision-making behaviors.
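As a rough illustration of evaluating a test from recorded data, the sketch below checks one finite trace of the Figure 2 scenario. It is not the Test Plan Monitor itself: the `Step` record, the function names, the reading of "running out of FUEL" as `fuel <= 0`, and the `horizon` parameter are all assumptions made for this example. Because "always eventually" cannot be decided on a finite trace, the progress check uses a bounded window as a stand-in.

```python
from dataclasses import dataclass
from typing import List, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class Step:
    """One time step of a test trace (hypothetical record layout)."""
    sys_pos: Cell   # position of the blue agent (system)
    env_pos: Cell   # position of the red patrolling agent (environment)
    park: bool      # park signal: True = ON, False = OFF
    fuel: int       # remaining FUEL of the blue agent


def check_safety(trace: List[Step]) -> List[int]:
    """Return the indices of steps that violate a safety requirement.

    Assumption: fuel <= 0 counts as "out of FUEL", and sharing a cell
    with the red agent counts as a collision.
    """
    return [k for k, s in enumerate(trace)
            if s.fuel <= 0 or s.sys_pos == s.env_pos]


def check_response(trace: List[Step], goal: Cell, home: Cell,
                   horizon: int) -> List[int]:
    """Bounded stand-in for the "always eventually" objectives.

    For every step k that still has `horizon` steps after it, the target
    selected by the park signal at step k (GOAL if ON, HOME if OFF) must
    be visited somewhere in steps k..k+horizon. Steps too close to the
    end of the trace are skipped as inconclusive.
    """
    failures = []
    for k in range(len(trace) - horizon):
        target = goal if trace[k].park else home
        window = trace[k:k + horizon + 1]
        if all(step.sys_pos != target for step in window):
            failures.append(k)
    return failures


def evaluate(trace: List[Step], goal: Cell, home: Cell, horizon: int) -> str:
    """Return "pass" only if neither check reports a violation."""
    ok = not check_safety(trace) and not check_response(trace, goal, home, horizon)
    return "pass" if ok else "fail"
```

The two checks are kept separate so that a failed test reports which kind of requirement was violated and at which step, which is the information a later round of test generation would need.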

1) How do we leverage test data to design the next set of tests? Specifically, for each trace of system actions, can we generate a set of environment traces to form the next round of tests?

2) Use test data to identify whether a fault was caused by information not captured in the discrete system model.
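One naive way to approach the first question can be sketched as follows. This is only a heuristic under stated assumptions, not a proposed solution: the red agent's allowed moves are given as a hypothetical adjacency map `moves`, candidate environment traces are enumerated exhaustively, and they are ranked by how close the red agent comes to the positions the blue agent visited in the recorded trace. The system may react differently once the environment changes, so the ranking only suggests which environment behaviors are worth trying next.

```python
from typing import Dict, Iterator, List, Tuple

Cell = Tuple[int, int]


def env_traces(moves: Dict[Cell, List[Cell]], start: Cell,
               length: int) -> Iterator[List[Cell]]:
    """Yield every red-agent trace with `length` positions that follows `moves`."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) == length:
            yield path
            continue
        for nxt in moves[path[-1]]:
            stack.append(path + [nxt])


def closeness(sys_trace: List[Cell], env_trace: List[Cell]) -> int:
    """Smallest Manhattan distance between the two agents at any common step."""
    return min(abs(s[0] - e[0]) + abs(s[1] - e[1])
               for s, e in zip(sys_trace, env_trace))


def next_tests(sys_trace: List[Cell], moves: Dict[Cell, List[Cell]],
               start: Cell, n_tests: int) -> List[List[Cell]]:
    """Pick the `n_tests` environment traces that come closest to the system."""
    candidates = list(env_traces(moves, start, len(sys_trace)))
    candidates.sort(key=lambda t: closeness(sys_trace, t))
    return candidates[:n_tests]
```

The exhaustive enumeration grows exponentially with the trace length, so it is only workable for small grids and short tests; anything larger would need the candidate set to be pruned or sampled.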

It would be useful for the SURF student to know MATLAB and Python.

References:

1. Bartocci, Ezio, et al. "Specification-based monitoring of cyber-physical systems: a survey on theory, tools and applications." Lectures on Runtime Verification. Springer, Cham, 2018. 135-175. Link

2. RM Murray, U. Topcu, N. Wongpiromsarn. HYCON-EECI, 2013 Link

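3. On Signal Temporal Logic. A. Donze, 2014. https://people.eecs.berkeley.edu/~sseshia/fmee/lectures/EECS294-98_Spring2014_STL_Lecture.pdf

4. RM Murray, U. Topcu, N. Wongpiromsarn. HYCON-EECI, 2013. http://www.cds.caltech.edu/~murray/courses/eeci-sp13/L7_reactive-20Mar13.pdf

5. Wongpiromsarn, Tichakorn, et al. "TuLiP: a software toolbox for receding horizon temporal logic planning." Proceedings of the 14th international conference on Hybrid systems: computation and control. ACM, 2011. https://user.eng.umd.edu/~mumu/files/wtoxm_HSCC2011.pdf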