\chapter*{Acknowledgments}
I am forever grateful to my research advisor Zachary Manchester for the years of mentorship and collaboration. His limitless energy and patience encouraged me at every stage to improve as a researcher and fostered my love for optimization.
My Stanford advisor, Allison Okamura, went beyond the call of duty to make my Ph.D. possible. Her graciousness throughout my time at Stanford provided me a feeling of complete support and enabled me to pursue my dream.
As an undergraduate at the University of Utah, Jake Abbott was an invaluable role model, providing early mentorship and instilling within me a drive for excellence in research.
To my labmates and collaborators, Simon Le Cleac'h, Brian Jackson, and Kevin Tracy, it was an incredible privilege to learn and work alongside you. I cherish the time we spent discussing ideas and making progress on challenging problems, and our shared excitement for advancing robotics research.
The opportunities to perform research at Google Brain and DeepMind under the wonderful mentorship of Vikas Sindhwani and Yuval Tassa opened my mind to new and exciting research ideas and taught me to prioritize excellent software engineering in my work.
To my friends from Stanford, you made these years in California beautiful. And to my oldest friend Brian, thank you for all of our long conversations and the inspiration to take a path that has brought so much fulfillment and excitement into my life.
Finally, I am deeply grateful to my parents for their endless love and support, for encouraging me to pursue my interests, and for fostering a creative environment where it felt possible to achieve anything I set out to accomplish.
\chapter{Introduction}
This dissertation takes an optimization-first approach to the development of tools for simulation, planning, and control for robotic systems. The next chapter provides technical background on numerical optimization that is utilized extensively throughout this work. Each of the following chapters is based on a research paper and includes a discussion of limitations and future work.
Chapter 2 provides background on the numerical optimization \cite{nocedal2006numerical} topics explored in this work, including trajectory optimization \cite{betts1998survey}, simulation of rigid-body dynamics with contact \cite{marsden2001discrete}, predictive control \cite{mayne2014model}, and implicit differentiation \cite{dini1907lezioni}. First, planning is formalized as a trajectory optimization problem and two important algorithms are highlighted: the linear quadratic regulator \cite{kalman1964lqr} and a popular variant for non-convex problems, the iterative linear quadratic regulator \cite{mayne1966second}. Next, simulation of non-smooth mechanical systems experiencing impact and friction is formulated as a complementarity problem \cite{stewart1996implicit}. This is followed by a summary of predictive control. Then, two algorithms for solving constrained optimization problems that are utilized extensively in this work, the augmented Lagrangian \cite{bertsekas2014constrained} and interior-point \cite{nesterov1994interior} methods, are outlined. Finally, implicit differentiation is presented as an approach for efficiently differentiating through a numerical solver \cite{amos2017optnet, agrawal2019differentiating}.
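As a brief preview of the notation developed in Chapter 2 (written here in a generic form rather than the chapter's exact statement), a discrete-time trajectory optimization problem can be summarized as
\begin{align*}
	\underset{x_{1:T},\, u_{1:T-1}}{\text{minimize}} \quad & \ell_T(x_T) + \sum_{t=1}^{T-1} \ell_t(x_t, u_t) \\
	\text{subject to} \quad & x_{t+1} = f_t(x_t, u_t), \quad t = 1, \dots, T-1, \\
	& c_t(x_t, u_t) \leq 0, \quad t = 1, \dots, T-1,
\end{align*}
where $x_t$ and $u_t$ are states and controls, $f_t$ is the (possibly non-smooth) dynamics, $\ell_t$ are stage costs, and $c_t$ collects additional constraints. The algorithms summarized above operate on problems of this general form.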
Chapter 3 is based on the paper \cite{howell2022dojo}. This is joint work with Simon Le Cleac'h, Zico Kolter, Mac Schwager, and Zachary Manchester. In this chapter we develop and implement a differentiable physics engine for rigid-body dynamics with contact: Dojo. This engine builds upon prior work for simulating smooth dynamical systems in maximal coordinates using variational integrators \cite{brudigam2020linear}. In this work, we develop a new friction model that utilizes techniques from cone programming \cite{boyd2004convex} to support nonlinear friction cones in order to improve simulated stick-slip behavior. Next, we formulate a novel complementarity problem \cite{cottle2009linear} that incorporates this friction model, a maximal-coordinates representation, and a classic impact model, and that is amenable to optimization with interior-point methods \cite{nesterov1994interior}. To efficiently and reliably solve this problem to high accuracy, we develop a custom primal-dual interior-point solver. Finally, we employ implicit differentiation \cite{dini1907lezioni} to compute smooth analytical gradients that provide useful information through contact events by exploiting intermediate results from the solver. Experimental results, including simulation, planning, policy optimization, and system identification, demonstrate the capabilities of the engine and the advantages of implicit gradients. An open-source implementation of the engine, \texttt{Dojo.jl}, written in the Julia programming language \cite{Julia-2017}, is provided.
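To sketch the implicit-differentiation idea referenced above (using generic notation, not the engine's exact interface): if the simulator returns a solution $z^*(\theta)$ satisfying residual conditions $r(z^*(\theta), \theta) = 0$, where $\theta$ collects states, controls, and model parameters, then differentiating this identity gives
\begin{equation*}
	\frac{\partial z^*}{\partial \theta} = -\left( \frac{\partial r}{\partial z} \right)^{-1} \frac{\partial r}{\partial \theta},
\end{equation*}
so gradients of simulation outputs can be recovered from quantities the interior-point solver has already computed, without unrolling or differentiating the solver's iterations.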
Chapter 4 is based on the paper \cite{howell2019altro}. This is joint work with Brian Jackson and Zachary Manchester. In this chapter we develop and implement an optimizer that is specialized for trajectory optimization problems with equality and inequality constraints: ALTRO. This work builds on previous work that adds support for constraints to the iterative linear quadratic regulator (iLQR) algorithm via the augmented Lagrangian method \cite{lantoine2012hybrid, plancher2017constrained}. First, we devise a novel square-root backward pass, inspired by the square-root Kalman filter \cite{kaminski1971discrete}, that has improved numerical conditioning properties. Next, we present problem reformulations for free-time and infeasible-initialization problems that endow our iLQR-based algorithm with capabilities previously limited to direct trajectory optimization solvers. Finally, we devise a solution-polishing phase for the algorithm that takes coarse solutions from the primary iLQR phase and utilizes an active-set method \cite{nocedal2006numerical} to rapidly refine the solution to high precision. An open-source implementation of the optimizer, \texttt{TrajectoryOptimization.jl}, written in Julia, is provided.
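For reference, the augmented Lagrangian treatment of constraints underlying this family of methods replaces a constrained objective with an unconstrained one of the schematic form
\begin{equation*}
	\mathcal{L}_{A}(x, \lambda, \rho) = f(x) + \lambda^{T} c(x) + \frac{\rho}{2} \| c(x) \|^{2},
\end{equation*}
where $\lambda$ is an estimate of the Lagrange multipliers and $\rho$ is a penalty parameter, both updated between iLQR solves; Chapter 4 states the precise constrained formulation and update rules.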
Chapter 5 is based on the paper \cite{howell2022trajectory}. This work was developed during an internship at Google Brain and is in collaboration with Simon Le Cleac'h, Sumeet Singh, Pete Florence, Zachary Manchester, and Vikas Sindhwani. In this chapter we develop a bi-level approach for planning with a dynamics model represented as a constrained optimization problem. An upper-level problem, optimized with iLQR \cite{jacobson1970differential}, plans trajectories, while a lower-level interior-point solver \cite{mehrotra1992implementation} evaluates the dynamics model at each time step by minimizing an objective subject to equality and cone constraints. Implicit differentiation \cite{dini1907lezioni} is utilized to compute derivatives through the lower-level solver, which are subsequently utilized in the upper-level planning problem. Manipulation and locomotion examples are provided to demonstrate the approach.
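Schematically (in generic notation rather than the chapter's exact formulation), the lower-level model returns the next state as the solution of a small constrained optimization problem,
\begin{equation*}
	x_{t+1} = \underset{y}{\text{arg\,min}} \;\; g(y; x_t, u_t) \quad \text{subject to} \quad h(y; x_t, u_t) = 0, \quad \phi(y; x_t, u_t) \in \mathcal{K},
\end{equation*}
where $\mathcal{K}$ is a cone, and the upper-level planner treats this mapping as an ordinary dynamics function, using implicit differentiation to obtain the Jacobians $\partial x_{t+1} / \partial x_t$ and $\partial x_{t+1} / \partial u_t$ required by iLQR.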
Chapter 6 is based on the paper \cite{howell2022calipso}. This work is a collaboration with Kevin Tracy, Simon Le Cleac'h, and Zachary Manchester. We develop and implement a general-purpose solver for non-convex optimization problems: CALIPSO. Current state-of-the-art solvers, specifically Ipopt \cite{wachter2006implementation} and SNOPT \cite{gill2005snopt}, have poor support for complementarity and second-order-cone constraints, which are necessary for many robotics applications. In this work we combine primal-dual augmented Lagrangian \cite{gill2012primal} and cone programming \cite{vandenberghe2010cvxopt} ideas, as well as heuristics from Ipopt, to support these constraints in the non-convex problem setting. We analyze the algorithmic properties of the solver on a simple contact-implicit trajectory optimization problem \cite{posa2014direct} by considering the linear independence constraint qualification \cite{izmailov2012global}. Then, we empirically validate the solver's performance on state-triggered constraints \cite{szmuk2020successive}, a collection of contact-implicit trajectory optimization problems, and a policy optimization scenario. An open-source implementation of the solver, \texttt{CALIPSO.jl}, written in Julia, is provided.
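The problem class targeted by the solver can be summarized, in simplified form, as
\begin{align*}
	\underset{x}{\text{minimize}} \quad & f(x) \\
	\text{subject to} \quad & g(x) = 0, \\
	& h(x) \in \mathcal{K},
\end{align*}
where $\mathcal{K}$ is a Cartesian product of the nonnegative orthant and second-order cones, and complementarity constraints are expressed using additional equality constraints between cone-constrained variables; Chapter 6 gives the exact formulation and algorithm.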
Chapter 7 is based on the paper \cite{howell2021direct}, in collaboration with Chunjiang Fu and Zachary Manchester. In this work we develop an algorithm to optimize robust feedback policies by formulating a single optimization problem that jointly optimizes a reference trajectory, a feedback tracking policy, and a set of deterministically chosen sample trajectories. Our work builds on the prior work of DIRTREL \cite{manchester2019robust}, which propagates uncertainty through linear dynamics and explicitly differentiates through the linear quadratic regulator problem in order to optimize a tracking policy. In contrast, our method deterministically propagates uncertainty through nonlinear dynamics using the unscented transform \cite{julier2004unscented} and directly optimizes the parameters of a policy. We demonstrate empirically that this generalized method exactly recovers linear quadratic regulator \cite{kalman1964lqr} policies in the case of linear dynamics, a quadratic objective, and Gaussian disturbances. Additionally, we demonstrate the capabilities of our method to optimize robust tracking policies with a collection of motion planning problems for underactuated, nonlinear dynamical systems, including synthesizing an output feedback policy.
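As a reminder of the sampling scheme referenced above (standard unscented-transform notation, which may differ from the chapter's exact scaling), sample points are chosen deterministically from an $n$-dimensional mean $\bar{x}$ and covariance $P$ as
\begin{equation*}
	x^{(0)} = \bar{x}, \qquad x^{(\pm i)} = \bar{x} \pm \big( \sqrt{(n + \lambda) P} \big)_i, \quad i = 1, \dots, n,
\end{equation*}
where $(\cdot)_i$ denotes the $i$-th column of a matrix square root and $\lambda$ is a scaling parameter; these samples are propagated through the nonlinear dynamics rather than linearizing about the mean.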
Chapter 8 is based on the paper \cite{lecleach2021fast}, and is joint work with Simon Le Cleac'h, Mac Schwager, and Zachary Manchester. Hardware experiments are a collaboration with Shuo Yang, Chi-Yen Lee, John Zhang, and Arun Bishop. We present a predictive control algorithm for systems that make and break contact with their environments. Online, the policy tracks a reference trajectory that is generated offline using contact-implicit trajectory optimization \cite{manchester2020variational}. To make the policy amenable to fast online optimization, the contact dynamics model, represented as a complementarity problem, is strategically approximated along the reference trajectory using a Taylor expansion. The resulting planning model comprises a sequence of time-varying linear complementarity problems. To further reduce the online computational cost of the planning model, we devise a custom linear-system solver that leverages offline partial factorization. We implement a custom primal-dual interior-point solver to optimize the resulting lower-level complementarity problems at each step in the planning horizon and a custom Gauss-Newton method for the upper-level planning problem. Real-time rates for the policy are achieved in simulation for a collection of locomotion examples. Hardware experiments performed on a Unitree Go1 quadruped \cite{unitree_go1} demonstrate the method's real-time performance and robustness to large external disturbances.
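For concreteness, a linear complementarity problem in a variable $z$ is defined by a matrix $M$ and vector $q$ and requires
\begin{equation*}
	z \geq 0, \qquad M z + q \geq 0, \qquad z^{T} (M z + q) = 0,
\end{equation*}
that is, the two nonnegative quantities must be elementwise complementary. In the planning model described above, $M$ and $q$ vary along the reference trajectory, and the custom linear-system solver reuses structure that can be factorized offline.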
Chapter 9 is based on the paper \cite{howell2022predictive} and is co-authored with Yuval Tassa, in collaboration with Nimrod Gileadi, Saran Tunyasuvunakool, Kevin Zakka, and Tom Erez, during an internship at DeepMind. In this chapter we present an open-source tool, MuJoCo MPC, for real-time behavior synthesis using predictive control algorithms, built on top of the MuJoCo physics engine \cite{todorov2012mujoco}. The aim of this work is to democratize predictive control algorithms by providing fast planners written in C/C++ that extensively utilize multi-threading; asynchronous simulation and planning that enable the tool to run locally on limited-compute hardware; and interactive simulation to accelerate behavior design for robotics applications, a process that can often take minutes, hours, or days with available model-based or learning methods. Additionally, we present a simple derivative-free optimizer, Predictive Sampling, that is easy to understand and simple to implement. An open-source implementation, \texttt{MuJoCo MPC}, is provided.
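As a rough sketch of the derivative-free idea (the parameterization and details differ in the chapter), Predictive Sampling maintains a nominal plan $\bar{\theta}$, for example spline parameters for the controls, samples candidate plans around it, evaluates each candidate with a model rollout, and keeps the best:
\begin{equation*}
	\theta^{(i)} \sim \mathcal{N}(\bar{\theta}, \sigma^{2} I), \quad i = 1, \dots, N, \qquad \bar{\theta} \leftarrow \underset{\theta \in \{\bar{\theta},\, \theta^{(1)}, \dots, \theta^{(N)}\}}{\text{arg\,min}} \; J(\theta),
\end{equation*}
where $J(\theta)$ is the total cost of rolling out the model under plan $\theta$ and $\sigma$ is a fixed noise scale.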
Ultimately, this work aims to leverage powerful tools from numerical optimization to advance the capabilities of robotic systems. My hope is that this work, and further research that may build upon it, will enable robots to be more dynamic, perform useful everyday tasks, and eventually, help us explore our universe.
\pagebreak