Publications

Discrete Approximate Information States in Partially Observable Environments

Published in 2022 American Control Conference (ACC), 2022

Access paper here The notion of approximate information states (AIS) was introduced in [1] as a methodology for learning taskrelevant state representations for control in partially observable systems. They proposed particular learning objectives which attempt to reconstruct the cost and next state and provide a bound on the suboptimality of the closed-loop performance, but it is unclear whether these bounds are tight or actually lead to good performance in practice. Here we study this methodology by examining the special case of discrete approximate information states (DAIS). In this setting, we can solve for the globally optimal policy using value iteration, allowing us to disambiguate the performance of the AIS objective from the policy search. Going further, for small problems with finite information states, we reformulate the DAIS learning problem as a novel mixed-integer program (MIP) and solve it to its global optimum; in the infinite information states case, we introduce clustering-based and end-to-end gradient-based optimization methods for minimizing the DAIS construction loss. We study DAIS in three partially observable environments and find that the AIS objective offers relatively loose bounds for guaranteeing monotonic performance improvement and is sufficient but not necessary for implementing optimal controllers. DAIS may even prove useful in practice by itself or as part of mixed discrete- and continuous-state representations, due to its ability to represent logical state, to its potential interpretabilty, and to the availability of these stronger algorithms.

Approximately Supermodular Scheduling Subject to Matroid Constraints

Published in Transactions on Automatic Control, 2020

Access paper here Control scheduling refers to the problem of as- signing agents or actuators to act upon a dynamical system at specific times so as to minimize a quadratic control cost, such as the objective of the Linear-quadratic-Gaussian (LQG) or the Linear Quadratic Regulator (LQR). When budget or operational constraints are imposed on the schedule, this problem is in general NP-hard and its solution can therefore only be approximated even for moderately large systems. The quality of this approximation depends on the structure of both the constraints and the objective. This work shows that greedy scheduling is near-optimal when the constraints can be repre- sented as an intersection of matroids, algebraic structures that encode requirements such as limits on the number of agents deployed per time slot, total number of actuator uses, and duty cycle restrictions. To do so, it proves that the LQG cost function is α-supermodular and provides a new α/(α + P )-optimality certificates for the greedy minimization of such functions over an intersections of P matroids. These certificates are shown to approach the 1/(1 + P ) guarantee of supermodular functions in relevant settings. These results support the use of greedy algorithms in non-supermodular quadratic control problems as opposed to typical heuristics such as convex relaxations and surrogate figures of merit, e.g., the log det of the controllability Gramian.