
Commit

spell check
glouppe committed Jul 8, 2014
1 parent b8be333 commit 423ce50
Showing 7 changed files with 17 additions and 17 deletions.
8 changes: 4 additions & 4 deletions tex/chapters/chapter01.tex
@@ -4,7 +4,7 @@ \chapter{Introduction}\label{ch:introduction}
meteorology, medicine or finance to cite a few, experts aim at predicting a
phenomenon based on past observations or measurements. For instance,
meteorologists try to forecast the weather for the next days from the climatic
-conditions of the previous days. In medicine, practicians collect measurements
+conditions of the previous days. In medicine, practitioners collect measurements
and information such as blood pressure, age or history for diagnosing the
condition of incoming patients. Similarly, in chemistry, compounds are analyzed
using mass spectrometry measurements in order to determine whether they contain
@@ -15,7 +15,7 @@ \chapter{Introduction}\label{ch:introduction}
For centuries, scientists have addressed such problems by deriving theoretical
frameworks from first principles or have accumulated knowledge in order to
model, analyze and understand the pheno\-menon under study. For example,
-practicians know from past experience that elderly heart attack patients with
+practitioners know from past experience that elderly heart attack patients with
low blood pressure are generally high risk. Similarly, meteorologists know from
elementary climate models that one hot, high pollution day is likely to be
followed by another. For an increasing number of problems however, standard
@@ -82,7 +82,7 @@ \chapter{Introduction}\label{ch:introduction}
the algorithm are still not clearly and entirely understood. Random forests
indeed evolved from empirical successes rather than from a sound
theory. As such, various parts of the algorithm remain heuristic rather than
-theorically motivated. For example, preliminary
+theoretically motivated. For example, preliminary
results have proven the consistency of simplified to very close variants of
random forests, but consistency of the original algorithm remains unproven
in a general setting.
@@ -120,7 +120,7 @@ \section{Thesis outline}
forests. We discuss the learning capabilities of these models and carefully
study all parts of the algorithm and their complementary effects. In particular,
Chapter~\ref{ch:forest} includes original contributions on the bias-variance
-analysis of ensemble methods, highligthing how randomization can help improve
+analysis of ensemble methods, highlighting how randomization can help improve
performance. Chapter~\ref{ch:complexity} concludes this first part with an
original space and time complexity analysis of random forests (and their
variants), along with an in-depth discussion of implementation details,
2 changes: 1 addition & 1 deletion tex/chapters/chapter02.tex
@@ -500,7 +500,7 @@ \subsection{Selecting the (approximately) best model}
structure learned from the training set is actually too specific and does not
generalize. The model is overfitting. The best parameter value $\theta$ is
therefore the one making the appropriate trade-off and producing a model which is
-neither too simple nor to complex, as shown by the grey line on the figure.
+neither too simple nor to complex, as shown by the gray line on the figure.

As we will see later in Chapter~\ref{ch:forest},
overfitting can also be explained by decomposing the generalization error
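The hunk above touches the passage on choosing the parameter value $\theta$ that makes the right trade-off between underfitting and overfitting. A minimal sketch of that kind of selection by validation error, assuming a decision tree's max_depth plays the role of $\theta$ and using a synthetic dataset (neither is the thesis' own setup):

    # Pick theta (here, a tree's max_depth) as the value minimizing the
    # validation error: the model that is neither too simple nor too complex.
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    scores = {}
    for depth in (1, 2, 4, 8, 16, None):  # candidate values of theta
        model = DecisionTreeRegressor(max_depth=depth, random_state=0)
        model.fit(X_train, y_train)
        scores[depth] = mean_squared_error(y_val, model.predict(X_val))

    print("validation MSE per max_depth:", scores)
    print("selected max_depth:", min(scores, key=scores.get))
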
4 changes: 2 additions & 2 deletions tex/chapters/chapter04.tex
@@ -122,12 +122,12 @@ \subsection{Regression}
Section \ref{sec:2:model-selection}). The upper plots in
Figure~\ref{fig:overfitting} illustrate in light red predictions $\varphi_{\cal
L}(\mathbf{x})$ for polynomials of degree $1$, $5$ and $15$ learned over random
-learning sets ${\cal L}$ sampled from a noisy cosinus function. Predictions
+learning sets ${\cal L}$ sampled from a noisy cosine function. Predictions
$\mathbb{E}_{\cal L} \{ \varphi_{\cal L}(\mathbf{x}) \}$ of the average model
are represented by the thick red lines. Predictions for the model learned over
the learning set, represented by the blue dots, are represented in gray.
Predictions of the Bayes model are shown by blue lines and coincide with the unnoised
-cosinus function that defines the regression problem. The lower plots in the
+cosine function that defines the regression problem. The lower plots in the
figure illustrate the bias-variance decomposition of the expected
generalization error of the polynomials.

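The passage above describes polynomials of degree 1, 5 and 15 fit to learning sets drawn from a noisy cosine, together with the bias-variance decomposition of their expected generalization error. A minimal sketch of that experiment, assuming an arbitrary cosine frequency, noise level and learning-set size rather than the figure's actual settings:

    # Fit polynomials of degree 1, 5 and 15 to many random learning sets drawn
    # from a noisy cosine, then estimate bias^2 and variance of the predictions.
    import numpy as np

    def f(x):
        return np.cos(1.5 * np.pi * x)  # assumed "unnoised" target function

    rng = np.random.RandomState(0)
    x_test = np.linspace(0.0, 1.0, 100)
    n_sets, n_samples, noise = 200, 40, 0.3  # assumed experiment sizes

    for degree in (1, 5, 15):
        preds = np.empty((n_sets, x_test.size))
        for i in range(n_sets):
            x = rng.uniform(0.0, 1.0, n_samples)          # learning set L
            y = f(x) + rng.normal(0.0, noise, n_samples)  # noisy cosine samples
            preds[i] = np.polyval(np.polyfit(x, y, degree), x_test)  # phi_L(x)
        avg = preds.mean(axis=0)                 # E_L{phi_L(x)}, the average model
        bias2 = np.mean((avg - f(x_test)) ** 2)
        variance = np.mean(preds.var(axis=0))
        print(f"degree={degree:2d}  bias^2={bias2:.3f}  variance={variance:.3f}")
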
12 changes: 6 additions & 6 deletions tex/chapters/chapter05.tex
@@ -59,7 +59,7 @@ \section{Complexity of the induction procedure}
\begin{equation}
f(N) = O(g(N)) \ \text{if}\ \exists c > 0, N_0 > 0, \forall N > N_0, f(N) \leq c g(N)
\end{equation}
-to express that $f(N)$ is asymptotically upper bounded by $g(N)$, up to some neglectable constant factor $c$.
+to express that $f(N)$ is asymptotically upper bounded by $g(N)$, up to some negligible constant factor $c$.
Similarly, big $\Omega$ notations are used to express an asymptotic lower
bound on the growth rate of the number of steps in the algorithm. Formally,
we write that
@@ -255,7 +255,7 @@ \section{Complexity of the induction procedure}
\sum_{i=1}^N i \log i = \log(H(N)),
\end{equation}
where $H(N)$ is the hyperfactorial function, complexity
-could be reexpressed tightly as $T(N) = \Theta(K \log(H(N)))$.
+could be re-expressed tightly as $T(N) = \Theta(K \log(H(N)))$.
\end{proof}
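
The identity used in the proof above, $\sum_{i=1}^N i \log i = \log(H(N))$ with $H$ the hyperfactorial, is easy to check numerically; a small sanity check (the test values of $N$ are arbitrary):

    # Sanity check of sum_{i=1}^{N} i*log(i) = log(H(N)),
    # where H(N) = 1^1 * 2^2 * ... * N^N is the hyperfactorial.
    import math

    def hyperfactorial(n):
        h = 1
        for i in range(1, n + 1):
            h *= i ** i  # exact, using Python's arbitrary-precision integers
        return h

    for N in (5, 20, 50):
        lhs = sum(i * math.log(i) for i in range(1, N + 1))
        rhs = math.log(hyperfactorial(N))
        print(f"N={N:2d}  sum i*log(i) = {lhs:.6f}  log H(N) = {rhs:.6f}")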

\begin{theorem}\label{thm:6:worst:kn}
@@ -423,9 +423,9 @@ \section{Complexity of the induction procedure}
method of all, which is due to the fact that only a single split variable
is considered at each split. For $K=1$ however, ETs and PERT have identical
time complexity. Note that the analysis presented here is only valid
-asymptotically. In pratice, constant factors might lead to different
+asymptotically. In practice, constant factors might lead to different
observed results, though they should not significantly deviate from our conclusions if
-algorithms are all implemented from a common codebase.
+algorithms are all implemented from a common code-base.

\begin{table}
\centering
@@ -614,11 +614,11 @@ \subsection{Scikit-Learn}
by 3,445 people and forked 1,867 times on GitHub; the mailing list receives more
than 300 mails per month; version control logs
% ddaa494c116e3c16bf032003c5cccbed851733d2
-show more than 200 unique contributors to the codebase and the online documentation
+show more than 200 unique contributors to the code-base and the online documentation
receives 37,000 unique visitors and 295,000 pageviews per month.

Our implementation guidelines emphasize writing efficient but readable code. In
-particular, we focus on making the codebase maintainable and understandable in
+particular, we focus on making the code-base maintainable and understandable in
order to favor external contributions. Whenever practical, algorithms
implemented in Scikit-Learn are written in Python, using NumPy vector
operations for numerical work. This allows the code to remain concise, readable
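The guideline quoted above (Python code with NumPy vector operations for numerical work) can be pictured with a toy comparison; the example is illustrative only and unrelated to any actual Scikit-Learn routine:

    # The same mean squared error computed with a Python loop and with a
    # single NumPy vector expression; the latter is shorter and far faster.
    import numpy as np

    rng = np.random.RandomState(0)
    y_true = rng.normal(size=100_000)
    y_pred = y_true + rng.normal(scale=0.1, size=100_000)

    mse_loop = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    mse_vec = np.mean((y_true - y_pred) ** 2)

    print(mse_loop, mse_vec)  # equal up to floating-point rounding
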
2 changes: 1 addition & 1 deletion tex/chapters/chapter06.tex
@@ -746,7 +746,7 @@ \subsection{Non-totally randomized trees}
previously, $t$ is split into as many subtrees as the cardinality of the chosen
variable. Asymptotically, for binary variables, this variant exactly matches
Random Forests and Extremely Randomized Trees. For variables with a larger
-cardinality, the correspondance no longer exactly holds but the trees still
+cardinality, the correspondence no longer exactly holds but the trees still
closely relate. Notice that, for $K=1$, this procedure amounts to building
ensembles of totally randomized trees as defined before, while, for $K=p$, it
amounts to building classical single trees in a deterministic way.
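
As a rough illustration of the role of $K$ described above, scikit-learn's Extremely Randomized Trees expose it as max_features; this is only a close relative of the variant analyzed in the chapter, not the exact asymptotic model, and the dataset below is synthetic:

    # max_features plays the role of K: K=1 gives (close to) totally randomized
    # trees, larger K moves the induction towards the classical greedy procedure.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    for K in (1, 5, 10):
        model = ExtraTreesClassifier(n_estimators=100, max_features=K, random_state=0)
        print(f"K={K:2d}  accuracy={cross_val_score(model, X, y, cv=5).mean():.3f}")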
4 changes: 2 additions & 2 deletions tex/chapters/chapter07.tex
@@ -362,8 +362,8 @@ \subsection{Bias due to empirical impurity estimations}
The analysis of variable importances carried out so far has considered
asymptotic conditions for which the true node impurity $i(t)$ is assumed to be
known. In practice however, due to the finite size of the learning set,
-impurity measurements suffer from an empiricial misestimation bias. In this
-section, we study this effect in the context of heterogenous variables\footnote{As an example, in
+impurity measurements suffer from an empirical misestimation bias. In this
+section, we study this effect in the context of heterogeneous variables\footnote{As an example, in
the case of meteorological problems, variables often comprise mixed
environmental measurements of different nature and scale, like speed of
wind, temperature, humidity, pressure, rainfall or solar radiation.}, with
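The passage above concerns the misestimation bias that finite learning sets introduce into impurity measurements, especially for heterogeneous variables. A small simulation, independent of the thesis' own analytical treatment, makes the general phenomenon visible with pure-noise variables of different cardinalities:

    # All three input variables are pure noise and the labels are independent
    # of them, so any importance measured here is entirely due to the empirical
    # impurity misestimation bias; it typically grows with cardinality.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.RandomState(0)
    n = 300
    X = np.column_stack([
        rng.randint(0, 2, n),    # binary noise
        rng.randint(0, 20, n),   # noise with 20 distinct values
        rng.uniform(size=n),     # continuous noise, ~n distinct values
    ])
    y = rng.randint(0, 2, n)     # labels independent of X

    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    print(forest.feature_importances_)
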
2 changes: 1 addition & 1 deletion tex/frontback/acknowledgments.tex
@@ -23,7 +23,7 @@ \chapter*{Acknowledgments}
I want take this opportunity to thank the Scikit-Learn team and all its
contributors. This experience within the open source world really contributed
to shape my vision of science and software development towards a model
-of rigour, pragmatism and openness. Thanks go to Ga\"{e}l, Olivier, Lars,
+of rigor, pragmatism and openness. Thanks go to Ga\"{e}l, Olivier, Lars,
Mathieu, Andreas, Alexandre and Peter.

Special thanks go to the rowing team of the RCAE, for their friendship
