Commit

typos, exercises Ch 1,2.

santoshv committed Jan 24, 2023
1 parent 09f1e9c commit 62d559c
Showing 5 changed files with 147 additions and 48 deletions.
95 changes: 82 additions & 13 deletions convexity.lyx
@@ -941,7 +941,7 @@ K=\bigcap_{\theta\in\Rn}\left\{ x:\ \left\langle \theta,x\right\rangle \leq\max_

\end_inset

-In other words, any closed convex set is a limit of a sequence of polyhedra.
+In other words, any closed convex set is the limit of a sequence of polyhedra.
\end_layout

\begin_layout Proof
@@ -1027,6 +1027,24 @@ where the set
.
\end_layout

\begin_layout Exercise
Let
\begin_inset Formula $A,B\subset\R^{n}$
\end_inset

be nonempty disjoint closed convex sets, at least one of which is bounded.
Then there exists a vector
\begin_inset Formula $v\in\R^{n}$
\end_inset

such that
\begin_inset Formula $\sup_{x\in A}v^{\top}x<\inf_{x\in B}v^{\top}x$
\end_inset

.

\end_layout
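A numerical illustration of the standard proof idea, in which the separating direction is the difference of a closest pair of points (a hedged sketch, not part of the text; it assumes the cvxpy package and uses the unit ball and a half-space as example sets):

```python
import cvxpy as cp

# Example sets: A = unit ball, B = half-space {x : x_1 >= 2}.
a = cp.Variable(2)  # a point ranging over A
b = cp.Variable(2)  # a point ranging over B
prob = cp.Problem(cp.Minimize(cp.norm(a - b, 2)),
                  [cp.norm(a, 2) <= 1, b[0] >= 2])
prob.solve()

# Since A is bounded, the minimum distance is attained and positive,
# so v = b* - a* separates strictly: sup_A <v,x> < inf_B <v,x>.
v = b.value - a.value
print(v)  # -> approximately [1.0, 0.0]
```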

\begin_layout Standard
We have a separation theorem for convex functions, similar to Theorem

@@ -1037,15 +1055,15 @@ reference "thm:sep"
\end_inset

for convex sets.
-This shows that one can use binary search to find minimum of convex functions.
-(See Chapter
+In Chapter
\begin_inset CommandInset ref
LatexCommand ref
reference "chap:Elimination"

\end_inset

-.)
+, we will see that this allows us to use binary search to minimize convex
+functions.
\end_layout
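A one-dimensional caricature of that idea (a hedged sketch, not the chapter's algorithm: the sign of the derivative plays the role of a separating hyperplane for the sublevel set, so bisection homes in on the minimizer):

```python
def bisect_min(df, lo, hi, tol=1e-8):
    """Minimize a 1D convex f on [lo, hi] by bisection on the sign of f'."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if df(mid) > 0:
            hi = mid    # minimizer lies to the left of mid
        else:
            lo = mid    # minimizer lies to the right of mid
    return (lo + hi) / 2

x_min = bisect_min(lambda x: 2 * x - 3, 0.0, 10.0)  # f(x) = x^2 - 3x
print(x_min)  # -> 1.5
```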

\begin_layout Theorem
@@ -1275,19 +1293,19 @@ Minimum Cost Flow Problem (Computer Science)
\begin_layout Standard
The min-cost flow problem has many applications, such as route planning,
airline scheduling, image segmentation, and recommendation systems.
-In this problem, we are given a graph
+In this problem, there is a graph
\begin_inset Formula $G=(V,E)$
\end_inset

with
-\begin_inset Formula $m\defeq|E|$
+\begin_inset Formula $m=|E|$
\end_inset

-edges and
-\begin_inset Formula $n\defeq|V|$
+and
+\begin_inset Formula $n=|V|$
\end_inset

-vertices.
+.
Each edge
\begin_inset Formula $e\in E$
\end_inset
@@ -1314,7 +1332,56 @@
\end_inset

.
-Formally, the problem can be written as an optimization problem
To make this concrete, imagine we want to match every person
to the best flight for them.
Then we can have a source node
\begin_inset Formula $s$
\end_inset

connected to a node for each person with
\begin_inset Formula $u_{e}=1$
\end_inset

for all such
\begin_inset Formula $e$
\end_inset

.
Further, we take
\begin_inset Formula $t$
\end_inset

to be connected to a node for each flight, with
\begin_inset Formula $u_{e}$
\end_inset

being the number of people that can fit on that flight.
Then, all the remaining edges will be from people nodes to flight nodes.
For any such
\begin_inset Formula $e,u_{e}=1$
\end_inset

and
\begin_inset Formula $c_{e}$
\end_inset

is proportional to how good that flight is for that person (does it get
them where they need to go at the time they need to go?) with
\begin_inset Formula $0$
\end_inset

representing a perfect flight and
\begin_inset Formula $\infty$
\end_inset

representing a flight they would not take even if given the option for
free.
Then we can calculate the min-cost flow to find the best allocation of
people to flights.
\end_layout
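As a hedged illustration (not from the text): the assignment instance above can be solved with an off-the-shelf min-cost flow routine. A minimal sketch assuming the networkx package, with made-up people, flights, and integer costs:

```python
import networkx as nx

G = nx.DiGraph()
# Source s must push 2 units (2 people); sink t absorbs them.
G.add_node("s", demand=-2)
G.add_node("t", demand=2)

# s -> person edges, capacity 1 each.
for p in ["alice", "bob"]:
    G.add_edge("s", p, capacity=1, weight=0)

# flight -> t edges, capacity = seats on that flight.
G.add_edge("f1", "t", capacity=1, weight=0)
G.add_edge("f2", "t", capacity=2, weight=0)

# person -> flight edges, capacity 1, weight = "badness" of the match;
# an unacceptable (infinite-cost) pairing is simply omitted.
G.add_edge("alice", "f1", capacity=1, weight=0)  # a perfect flight
G.add_edge("alice", "f2", capacity=1, weight=5)
G.add_edge("bob", "f2", capacity=1, weight=1)    # bob cannot take f1

flow = nx.min_cost_flow(G)  # dict of dicts: flow[u][v] = units on (u, v)
print(flow["alice"], flow["bob"])
```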

\begin_layout Standard
Formally, the problem can be written as an optimization problem
\begin_inset Formula $\min_{f\in\R^{|E|}}\sum_{e\in E}c_{e}\cdot f_{e}$
\end_inset

@@ -1983,7 +2050,7 @@ Epigraph of
; quasiconvex function
\begin_inset CommandInset label
LatexCommand label
name "fig:rel-1-1"
name "fig:quasiconvex"

\end_inset

@@ -2040,7 +2107,7 @@ quasiconvex

\begin_inset CommandInset ref
LatexCommand ref
reference "fig:rel-1-1"
reference "fig:quasiconvex"
plural "false"
caps "false"
noprefix "false"
@@ -2700,7 +2767,7 @@ The standard definition of a convex function in terms of gradients requires
\end_layout

\begin_layout Section
-Logconcave Functions
+Logconcave functions
\begin_inset CommandInset label
LatexCommand label
name "sec:Logconcave-functions"
@@ -2747,6 +2814,8 @@ The indicator function of a convex set
\end_inset

is logconcave.
The Gaussian density function is logconcave.
The Gaussian density restricted to any convex set is logconcave.
\end_layout
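To verify the Gaussian claims (a short check, not in the text): the log of the standard Gaussian density
\[
\log p(x) = -\frac{n}{2}\log(2\pi) - \frac{\left\Vert x\right\Vert _{2}^{2}}{2}
\]
is a concave quadratic, so $p$ is logconcave; and restricting $p$ to a convex set $K$ multiplies it by the indicator of $K$, and a product of logconcave functions is logconcave (the log of a product is a sum of concave functions).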

\begin_layout Lemma
2 changes: 1 addition & 1 deletion equivalence.lyx
@@ -741,7 +741,7 @@ The relationships among the four oracles for convex sets.

\begin_inset CommandInset label
LatexCommand label
name "fig:rel-1-1"
name "fig:oracles"

\end_inset

96 changes: 63 additions & 33 deletions gradient_descent.lyx
@@ -1199,24 +1199,9 @@ Let
\end_layout

\begin_layout Standard
-The proof idea involves showing the function value
-\begin_inset Formula $f(x)$
-\end_inset
-
-decreases by at least
-\begin_inset Formula $\frac{\epsilon^{2}}{2L}$
-\end_inset
-
-when
-\begin_inset Formula $\|\nabla f(x)\|_{2}\geq\epsilon$
-\end_inset
-
-.
-Since the function value can only decrease by at most
-\begin_inset Formula $f(x^{(0)})-f(x^{*})$
-\end_inset
-
-, this bounds the number of iterations.
+The next lemma shows that the function value must decrease along the GD
+path for a sufficiently small step size, and the magnitude of the decrease
+depends on the norm of the current gradient.
\end_layout
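A hedged sketch of the procedure these statements analyze, assuming step size 1/L and a made-up smooth test function (not from the text):

```python
import numpy as np

def gradient_descent(grad_f, x0, L, eps, max_iters=100000):
    """Run GD with step size 1/L until the gradient norm drops below eps."""
    x = x0
    for k in range(max_iters):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:
            return x, k       # eps-critical point found after k steps
        x = x - g / L         # this step decreases f by >= eps^2 / (2L)
    return x, max_iters

# Example: f(x) = sum(log(cosh(x_i))), convex and L-smooth with L = 1.
grad = lambda x: np.tanh(x)
x_star, iters = gradient_descent(grad, x0=3.0 * np.ones(5), L=1.0, eps=1e-3)
```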

\begin_layout Lemma
@@ -1279,10 +1264,8 @@ for some
\end_layout

\begin_layout Standard
-\begin_inset Separator plain
-\end_inset
+We can now prove the theorem.

\end_layout

\begin_layout Proof
@@ -1302,7 +1285,39 @@

\end_inset

-Since each step of gradient descent decreases
We observe that either
\begin_inset Formula $\|\nabla f(x)\|_{2}\le\epsilon$
\end_inset

, or
\begin_inset Formula $\|\nabla f(x)\|_{2}\geq\epsilon$
\end_inset

and so by Lemma
\begin_inset CommandInset ref
LatexCommand ref
reference "lem:gradient_progress"
plural "false"
caps "false"
noprefix "false"

\end_inset

, the function value
\begin_inset Formula $f(x)$
\end_inset

decreases by at least
\begin_inset Formula $\frac{\epsilon^{2}}{2L}$
\end_inset

.
Since the function value can decrease by at most
\begin_inset Formula $f(x^{(0)})-f(x^{*})$
\end_inset

, this bounds the number of iterations: each step of gradient descent decreases

\begin_inset Formula $f$
\end_inset

@@ -1323,8 +1338,7 @@

\begin_layout Standard
Despite the simplicity of the algorithm and the proof, it is known that
-this is the best one can do via any algorithm for this general setting
+this is the best one can do via any algorithm in this general setting
\begin_inset CommandInset citation
LatexCommand cite
key "carmon2017lower"
@@ -2300,8 +2314,8 @@ Generalizing Gradient Descent*
\end_layout

\begin_layout Standard
-Now, we study what properties gradient descent are bein used for the strongly
-convex case.
+Now, we study what properties of gradient descent are being used for the
+strongly convex case.
There are many ways to generalize it.
One way is to view gradient descent as approximating the function
\begin_inset Formula $f$
@@ -2355,7 +2369,7 @@ We say
\end_inset

and
-\begin_inset Formula $h((1-\alpha)x+\lambda\widehat{x})\leq\lambda^{2}h(\widehat{x})$
+\begin_inset Formula $h((1-\alpha)x+\alpha\widehat{x})\leq\alpha^{2}h(\widehat{x})$
\end_inset

for all
@@ -2555,7 +2569,7 @@ status open
\end_layout

\begin_layout Standard
-Find a
+Find an
\begin_inset Formula $\alpha$
\end_inset

@@ -2664,7 +2678,7 @@ Using the fact that
\begin_inset Formula $g^{(k)}+h^{(k)}$
\end_inset

-is an upper bound of
+is an upper bound on
\begin_inset Formula $f$
\end_inset

@@ -2684,7 +2698,7 @@

\end_inset

-To bound the best possible progress, we consider
+To bound the best possible progress, i.e., the RHS above, we consider
\begin_inset Formula $\widehat{x}=\arg\min_{y}g^{(k)}(y)+\alpha h^{(k)}(y)$
\end_inset

@@ -2723,11 +2737,11 @@ where we used
\end_layout

\begin_layout Proof
-Combining both and using
+Combining both and using the fact that
\begin_inset Formula $g^{(k)}+\alpha h^{(k)}$
\end_inset

-is a lower bound of
+is a lower bound on
\begin_inset Formula $f$
\end_inset

@@ -2754,6 +2768,22 @@ reference "thm:gd_general_apx"
Here we list some of them.
\end_layout

\begin_layout Exercise
Show that the second condition in the definition of
\begin_inset Formula $\alpha$
\end_inset

-approximation can be replaced by
\begin_inset Formula
\[
g(y)+\alpha h(y/\alpha)\leq f(y)\leq g(y)+h(y)
\]

\end_inset

while maintaining the guarantee for the convergence of generalized GD.
\end_layout

\begin_layout Subsubsection*
Projected Gradient Descent / Proximal Gradient Descent
\end_layout
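A hedged sketch of the projected variant (assuming the usual update x ← Proj_K(x − ∇f(x)/L), with a box constraint as the example set K; not the text's derivation):

```python
import numpy as np

def projected_gd(grad_f, project, x0, L, iters=1000):
    """Projected GD: take a gradient step, then project back onto K."""
    x = x0
    for _ in range(iters):
        x = project(x - grad_f(x) / L)
    return x

# Example: minimize f(x) = ||x - c||^2 / 2 over the box K = [0, 1]^n.
c = np.array([2.0, -0.5, 0.3])
grad = lambda x: x - c                      # L-smooth with L = 1
proj_box = lambda x: np.clip(x, 0.0, 1.0)   # projection onto [0, 1]^n
x_min = projected_gd(grad, proj_box, x0=np.zeros(3), L=1.0)
print(x_min)  # -> approximately [1.0, 0.0, 0.3]
```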
Binary file modified main.pdf
Binary file not shown.
2 changes: 1 addition & 1 deletion preliminaries.lyx
@@ -402,7 +402,7 @@ A real symmetric matrix

\end_layout

-\begin_layout Definition*
+\begin_layout Definition
For any matrix
\begin_inset Formula $A$
\end_inset
