Robust Control of Uncertain Markov Decision Processes with Temporal Logic Specifications


Eric M. Wolff, Ufuk Topcu, and Richard M. Murray
2012 Conference on Decision and Control (CDC)

We present a method for designing a robust control policy for an uncertain system subject to temporal logic specifications. The system is modeled as a finite Markov Decision Process (MDP) whose transition probabilities are not exactly known but are known to belong to a given uncertainty set. A robust control policy is generated for the MDP that maximizes the worst-case probability of satisfying the specification over all transition probabilities in this uncertainty set. To this end, we use a procedure from probabilistic model checking to combine the system model with an automaton representing the specification. This new MDP is then transformed into an equivalent form that satisfies assumptions for stochastic shortest path dynamic programming. A robust version of dynamic programming solves for an epsilon-suboptimal robust control policy with time complexity O(log(1/epsilon)) times that for the non-robust case.
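As a rough illustration of the max-min idea in the abstract, the sketch below runs a robust value iteration on a tiny MDP. It is not the paper's algorithm: it assumes the uncertainty set is a product of per-transition probability intervals, assumes the specification has already been reduced to reaching a goal set, and uses hypothetical names throughout (`IntervalMDP`, `worst_case_expectation`, `robust_value_iteration`, and the four-state example).

```python
from typing import Dict, List, Set, Tuple

# (next_state, lower_bound, upper_bound) for one transition probability.
Interval = Tuple[int, float, float]
# Hypothetical encoding: mdp[state][action] -> list of interval transitions.
# States with no entry are treated as absorbing.
IntervalMDP = Dict[int, Dict[str, List[Interval]]]


def worst_case_expectation(succ: List[Interval], value: List[float]) -> float:
    """Minimize sum_j p_j * value[j] over lo_j <= p_j <= hi_j with sum_j p_j = 1.

    Assumes the intervals are consistent: sum(lo) <= 1 <= sum(hi).
    """
    remaining = 1.0 - sum(lo for _, lo, _ in succ)
    total = sum(lo * value[j] for j, lo, _ in succ)
    # The adversary pushes the leftover mass onto the lowest-value successors first.
    for j, lo, hi in sorted(succ, key=lambda t: value[t[0]]):
        extra = min(hi - lo, remaining)
        total += extra * value[j]
        remaining -= extra
    return total


def robust_value_iteration(
    mdp: IntervalMDP,
    goal: Set[int],
    n_states: int,
    eps: float = 1e-9,
    max_iter: int = 10_000,
) -> Tuple[List[float], Dict[int, str]]:
    """Approximate the max over actions of the worst-case goal-reaching probability."""
    value = [1.0 if s in goal else 0.0 for s in range(n_states)]
    policy: Dict[int, str] = {}
    for _ in range(max_iter):
        new_value = list(value)
        for s, actions in mdp.items():
            if s in goal:
                continue
            scored = [(worst_case_expectation(succ, value), a)
                      for a, succ in actions.items()]
            new_value[s], policy[s] = max(scored, key=lambda t: t[0])
        delta = max(abs(a - b) for a, b in zip(new_value, value))
        value = new_value
        if delta < eps:  # stop once successive iterates agree to within eps
            break
    return value, policy


# Illustrative example: state 1 is the goal, state 2 is a trap, state 3 is a waypoint.
example: IntervalMDP = {
    0: {
        "direct": [(1, 0.6, 0.8), (2, 0.2, 0.4)],
        "detour": [(3, 0.5, 0.9), (2, 0.1, 0.5)],
    },
    3: {"go": [(1, 0.9, 1.0), (2, 0.0, 0.1)]},
}
values, policy = robust_value_iteration(example, goal={1}, n_states=4)
print(values[0], policy[0])
```

In this example the wide interval on "detour" lets the adversary send half of the probability mass to the trap, so the robust policy prefers "direct" even though both actions look similar at their interval midpoints.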