2005 MURI White Paper

Return to 2005 MURI Proposal

Cover letter (optional)

Not sure what to put here. Perhaps description of team? Assume this won't be read by anyone but program manager.

Cover Page

The cover page shall be labeled “PROPOSAL WHITE PAPER” and shall include the BAA number 05-017, proposed title, and proposer’s technical point of contact, with telephone number, facsimile number, and email address

Identification of the research and issues (Richard; 1/2 page)

We propose to develop a mathematical language for specification, design and verification of distributed, embedded systems that provides an analyzable framework for robust performance.

Our specific goals for this MURI are to:

Develop common mathematical framework for analyzing distributed, hybrid systems. This framework will be able to handle a combination of continuous dynamics and temporal logic, building on our previous work in analysis of hybrid dynamical systems. This will require a significant extension of current work to include temporal logic and distributed computing. In addition, we believe that it will be important to include stochastic systems, extending work such as SOSTOOLS to include probabilistic analysis.
Extend ideas from distributed/parallel computing to apply to dynamical systems. Concepts such as fairness, safety, progress and concurrency are not strongly linked with concepts such as dynamics, stability and performance in embedded systems. We plan to extend tools in distributed computing to allow for dynamic changes in behavior and provide methods for analyzing robustness to such changes. This includes incremental changes (evolution of the dynamic state), large changes (changes in mode or failure of a component) and probabalistic changes (noise, disturbances and other stochastic uncertainties).
Move the handoff between handcrafted proofs and formal methods further upstream. A major theme of our work will be to show how to take currently handcrafted proofs and formal methods for verification and validation of embedded systems and move them to higher levels of complexity and abstraction. This will enable the design of increasingly complex systems without the need for huge amounts of iteration and Monte Carlo analysis to verify performance.
Demonstrate the utility of these methods on problems in network centric environments. We will make use of two testbeds at Caltech for this purpose - the Caltech Multi-Vehicle Wireless Testbed (MVWT) and "Alice", our 2005 DARPA Grand Challenge entry. Alice provides a sophisticated embedded system environment that includes 5 Gb/s raw data rates from sensors, 12 high speed processing cores linked by 1 Gb/s ethernet and a complex set of tasks and environments for autonomous operations. The MVWT offers a simpler dynamic and computational environment, but allows cooperative control of multiple vehicles operating in a dynamic, uncertain and adversarial environment. In each case we propose to develop a collection of primitive operations which can be used to "program" the systems and provide automatically verified code that satisfies a given performance specification.

Proposed technical approach

Background (Richard; 1/2 page)

DoD systems are becoming increasingly sophisticated as we seek to design, build, test and field systems that are capable of higher levels of decision making and stronger integration into C4ISR infrastructure. The command and control software required to implement such systems is increasingly complex and our ability to verify that a system meets its specifications and validate the proper operation of the system is fast becoming a bottleneck in deploying new capability.

A distributed, embedded control system requires a sophisticated supervisory control structure that not only switches between control modes, but also manages communication and information among UAVs, responds to commands, automatically generates tasks and subtasks, monitors the health of the system, and so on. However, current software engineering practice cannot produce large complex systems for which any substantial formal guarantees can be made. In fact, in software engineering, it is usually assumed that large systems will have bugs. Unfortunately, the enormous complexities of such systems coupled with the fact that they are embedded in a physical, often adversarial, world means that exhaustively testing them is essentially impossible. The cost of and time associated with developing software for these systems makes rapidly reconfiguring them to adapt to new scenarios impossible. This makes the entire infrastructure fragile and susceptible to failure and compromise.

We seek to address this situation through the development of new approaches for specifying, implementing and verifying command and control systems. We will build on two new areas of development over the past five years under AFOSR and other funding: sum-of-squares techniques for constructing proof certificates for hybrid systems and formal methods for analysis and design of distributed computation and control systems. New results in each of these areas are ripe for further development and integration as part of developing a mathematical framework for analyzing distributed, hybrid systems.

Specification and Programming Languages (Eric, with Mani; 1 page)

Two of the PIs (E. Klavins and R. Murray) have recently designed a specification language called CCL (The Computation and Control Language) that is model loosely after UNITY (developed by M. Chandy for modeling parallel systems) and which bears a strong resemblance to Promela (part of the SPIN model-checker developed by G. Holtzmann, a collaborator on this project). The idea behind CCL is to allow the control engineer to specify a simple model of the behavior of the system and to specify and reason about correctness properties in temporal logic. In particular, CCL allows (1) a formal model of the environment to be included as part of the specification and (2) allows the user to specify a model of how synchronized the distributed control elements are to be with each other and the environment (for example, one could say that the frequency of their clocks differs by no more than some small amount). Thus, the goal of CCL is to replace software engineering and testing with model-building, formal specification and proof.

Presently, there are many limitations to CCL (and of other systems that attempt to accomplish the same thing) that allow it to be used for only simple systems. First, the proofs in CCL are done by either by hand or with the help of a difficult to use theorem prover. Second, specifying and reasoning about a sophisticated set of continuous feedback control laws in CCL is difficult due to the fact that one typically uses a different tools to reason about concurrency (e.g. model checking and theorem proving) than one uses to reason about dynamical systems (e.g. barrier functions and sum of squares). Third, incorporating stochastic elements, such as the probability of a component failure or of the behavior of an adversary, in a formal and tractable way is presently impossible.

We propose to improve the approach in several ways:

We propose to simplify the language to disallow arbitrarily complex specifications. The PIs believe that a simple set of building blocks, each accompanied by a formal reasoning tactic will allow the "specification and proof" engineer to easily build and reason about systems without having to do one-of-a-kind proofs.

We propose to develop a common framework for reasoning about continuous controllers, dynamical systems and concurrent, loosely synchronized supervisory control structures. The PIs believe that the tantalizing use of barrier functions in hybrid control structures can be extended to systems with the sophistication of those specified in CCL.

We propose to extend our techniques to include probabilistic specifications. The state of the art in probabilistic model checking, for example, is presently to the point where simple systems can be reasoned about efficiently.

Sum of Squares Techniques (Pablo, with John; 1 page)

Sum of squares (SOS) based techniques constitute one of the most comprehensive methodologies available for the algorithmic analysis of uncertain dynamical systems, encompassing and subsuming many earlier successful techniques. Though conceptually simple, SOS-based methods are based on a combination of techniques from real algebraic geometry and convex optimization, and provide a novel and powerful generalization of duality-based methods.

In recent years, there have been a great number of exciting applications of these techniques in many areas, ranging from optimization and control to geometric theorem proving and quantum information theory. In particular, in the Systems and Control area, the methods have yielded novel analysis and design tools in areas such as nonlinear stability, robustness analysis, and safety verification for hybrid systems. Many known successful alternative approaches (such as $\mu$ or IQCs) can be understood, combined, or reformulated as particular cases of this general methodology.

Despite the many successes, there are several important issues that remain to be improved upon for the successful application of the techniques to large-scale heterogeneous, distributed systems. The most pressing one is the need to incorporate in a solid mathematical footing elements of temporal logic and/or concurrency. Most of the current SOS applications in a dynamic context entail a transformation (e.g., via loop invariants, Lyapunov, or barrier functions) to semialgebraic questions. It is very desirable to have a direct, formal path that connects directly the logic specification and the corresponding certificates.

Furthermore, in the SOS framework, the system properties under consideration (e.g., stability, safety, performance) are proved and certified algebraically by multivariate polynomials identities, which are obtained by numerical convex optimization. Among the difficulties towards the scalability of SOS techniques, is the relative lack of sophistication of the families of possible proof certificates in the optimization, since the current approach utilizes a simple notion of proof length directly related with the degree of the corresponding polynomials. A different (but related) source of problems concerns the exploitation of algebraic structure in the formulation of efficient numerical methods for large-scale convex optimization problems.

To address these issues, we propose:

Fully integrate elements of temporal logic and SOS-based verification of semialgebraic conditions. A possible step in this direction would be the formulation of differential algebra based analogues of the Positivstellansatz, to be used in a decision procedure for solution of differential systems.

We propose to refine SOS methods, to allow for the efficient search over structured families of certificates. We will also explore the incorporation of human-computer interaction strategies from automatic theorem proving for help in guiding development of SOS-based proofs, as well as the use of libraries of already proven theorems (e.g., identical subsystems) and associated heuristics.

Extend SOS-based methods to the analysis and verification of systems in an adversarial setting. A concrete starting point are continuous-state two player zero-sum games.

Incorporate probabilistic notions, both at the specification level (e.g., "the property holds with 99% probability") as well as through randomized algorithms for the efficient solution of the underlying deterministic search and optimization problems.

Testbeds (Richard; 1/2 page)

To test our methods and motivate new insights into their performance (and limitations), we plan to make use of two testbeds that have been developed at Caltech: the multi-vehicle wireless testbed and "Alice", an autonomous vehicle that will compete in the 2005 DARPA Grand Challenge. These systems represent complementary challenges that are very representative of those faced by designers of Air Force systems (and systems of systems).

MVWT

Over the past five years (under AFOSR DURIP and MURI funding), Caltech has built a testbed consisting of up to 18 mobile vehicles with embedded computing and communications capability for use in testing new approaches for command and control across dynamic networks. The system allows testing of a variety of communications-related technologies, including distributed command and control algorithms, dynamically reconfigurable network topologies, source coding for real-time transmission of data in lossy environments, and multi-network communications.

We propose to use the MVWT to demonstrate the ability to specify and verify cooperative control missions for multiple vehicles. We will do so by building an automated specification and proof interface to the MVWT that implements our approach to building systems that exhibit complex behaviors. Sample tasks include cooperative surveillance, area denial (in adversarial environments), and dynamic reconfiguration in the presence of vehicle failures (full or partial).

Alice

As a second testbed, we will make use of the infrastructure we have developed over the past 2.5 years through our participation in the 2004 and 2005 DARPA Grand Challenge. Our current vehicle, which the students have named "Alice", has six cameras, 4 LADAR units, an inertial measurement unit (IMU), a GPS navigation system, and numerous internal temperature and vibration sensors. The raw data rate for Alice is approximately 5 Gb/s, which must be processed and acted upon at rates of up to 100 Hz in order to insure safe operation at high driving speeds.

The control system for Alice makes use of a highly networked control architecture, with distributed data fusion to determine elevation maps (for the height of the terrain in front of the vehicle), multiple optimization-based controllers to plan possible routes forthe vehicle, and online modeling, fault management, and decision making to provide reliable and reconfigurable operation. Eight onboard computers distribute the computational load, sharing information at mixed rates across a 1 Gb/s switched network. System specifications call for reliable operation in the presence of up to 1 computer failure and 2 sensor failures, requiring careful coordination between computational elements.

Alice is representative of the level of complexity of UAVs and other systems. The verification and validation of software for such systems is a major challenge and by testing our techniques on Alice, we will enhance the likelihood of transition to industry and government.

Potential impact on DoD capabities (Richard; 1/2 page)

If successful, the techniques developed under this proposal will provide a fundamental understanding of how to design, implement and test complex, distributed command and control systems that can be used for mixed manned and unmanned environments. Specifically, we aim to develop as mathematical framework and specification/programming languages that allow formal analysis of performance and robustness of distributed systems. While these techniques will never elimininate the need for Monte Carlo simulations and test-based validation, they will narrow the gap between theory and implementation, allow more rapid (and more reliable) development of complex, distributed, embedded systems.

Potential team and management plan (Richard; 1/4 page)

The team will consist of five co-PIs at three universities: Murray (PI), Chandy and Doyle at Caltech; Klavins at U. Washington, and Parrilo at MIT. This group has a history of joint work (Parrilo got his PhD from Caltech CDS working with Doyle and Klavins was a postdoc in Caltech CS working with Murray) and represents an excellent collaboration between researchers with expertise in control and dynamical systems, computer science, and applied mathematics.

In addition to the university researchers, we plan to team with the Laboratory for Reliable Software (LARS) at the Jet Propulsion Laboratory (JPL) and we will leverage ongoing research at Caltech sponsored by Boeing. We are also exploring linkages to SRI (the author of PVS, a theorem-proving environment) and Honeywell. In all cases, these partnerships will provide transition paths for the proposed research as well as insights into additional research challenges as the program develops.

The program will be led by Richard Murray at Caltech. Because of the tight collaboration between the co-PIs, we anticipate frequent joint meetings and workshops, as well as student and faculty visits between institutions.

Summary of estimated costs (1/2 page)

We anticipate a budget of $1M/year, with approximately $600K to be spent at Caltech and $200K each at MIT and U. Washington. Funds will be used to support graduate students, postdoctoral scholars, and some faculty salary support. The experimental testbeds that will be used in the proposal have already been constructed using other funds.

Curriculum vitae of key investigators

Note: this section does not count against page limit
Everyone should send Richard a 2 page CV to be included
- Chandy: missing
- Doyle: missing
- Klavins: received
- Murray: received
- Parillo: received

2005 MURI White Paper

Contents