
consistency
astonzhang committed Aug 17, 2023
1 parent ec1d5a2 commit 791b775
Showing 5 changed files with 8 additions and 8 deletions.
4 changes: 2 additions & 2 deletions chapter_introduction/index.md
@@ -537,7 +537,7 @@ in depth throughout this book.
Informally, the learning process looks something like the following.
First, grab a big collection of examples for which the features are known
and select from them a random subset,
-acquiring the ground-truth labels for each.
+acquiring the ground truth labels for each.
Sometimes these labels might be available data that have already been collected
(e.g., did a patient die within the following year?)
and other times we might need to employ human annotators to label the data,
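
As a concrete illustration of the step this hunk describes (sample a random subset of examples and acquire a label for each), here is a minimal plain-Python sketch. The `collect_examples` and `annotate` helpers are hypothetical stand-ins for the data source and the labeling process, whether existing records or human annotators.

```python
import random

def collect_examples():
    # Hypothetical stand-in: a big pool of examples with known features.
    return [[random.gauss(0, 1) for _ in range(4)] for _ in range(10_000)]

def annotate(example):
    # Hypothetical stand-in for existing records or a human annotator.
    return int(sum(example) > 0)

pool = collect_examples()
subset = random.sample(pool, k=100)           # select a random subset
labeled = [(x, annotate(x)) for x in subset]  # acquire a label for each
```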
@@ -1583,7 +1583,7 @@ over the past decade.
[Theano](https://github.com/Theano/Theano).
Many seminal papers were written using these tools.
These have now been superseded by
-[TensorFlow](https://github.com/tensorflow/tensorflow) (often used via its high level API [Keras](https://github.com/keras-team/keras)), [CNTK](https://github.com/Microsoft/CNTK), [Caffe 2](https://github.com/caffe2/caffe2), and [Apache MXNet](https://github.com/apache/incubator-mxnet).
+[TensorFlow](https://github.com/tensorflow/tensorflow) (often used via its high-level API [Keras](https://github.com/keras-team/keras)), [CNTK](https://github.com/Microsoft/CNTK), [Caffe 2](https://github.com/caffe2/caffe2), and [Apache MXNet](https://github.com/apache/incubator-mxnet).
The third generation of frameworks consists
of so-called *imperative* tools for deep learning,
a trend that was arguably ignited by [Chainer](https://github.com/chainer/chainer),
2 changes: 1 addition & 1 deletion chapter_linear-classification/classification.md
@@ -139,7 +139,7 @@ Accuracy is computed as follows.
First, if `y_hat` is a matrix,
we assume that the second dimension stores prediction scores for each class.
We use `argmax` to obtain the predicted class as the index of the largest entry in each row.
-Then we [**compare the predicted class with the ground-truth `y` elementwise.**]
+Then we [**compare the predicted class with the ground truth `y` elementwise.**]
Since the equality operator `==` is sensitive to data types,
we convert `y_hat`'s data type to match that of `y`.
The result is a tensor containing entries of 0 (false) and 1 (true).
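
The procedure described in this hunk amounts to only a few lines. A minimal PyTorch sketch, assuming `y_hat` has shape `(n, num_classes)` and `y` has shape `(n,)`; the book's own `accuracy` method is structured slightly differently but follows the same steps:

```python
import torch

def accuracy(y_hat, y):
    """Fraction of predictions that match the ground truth labels."""
    if y_hat.ndim > 1 and y_hat.shape[1] > 1:
        # Each row holds per-class scores; argmax picks the predicted class.
        y_hat = y_hat.argmax(axis=1)
    # `==` is type-sensitive, so cast y_hat to y's dtype before comparing.
    cmp = y_hat.type(y.dtype) == y
    return float(cmp.type(y.dtype).sum()) / len(y)

y_hat = torch.tensor([[0.1, 0.8, 0.1], [0.9, 0.05, 0.05]])
y = torch.tensor([1, 2])
print(accuracy(y_hat, y))  # 0.5
```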
4 changes: 2 additions & 2 deletions chapter_linear-regression/linear-regression-scratch.md
@@ -448,7 +448,7 @@ def fit_epoch(self):
We are almost ready to train the model,
but first we need some training data.
Here we use the `SyntheticRegressionData` class
-and pass in some ground-truth parameters.
+and pass in some ground truth parameters.
Then we train our model with
the learning rate `lr=0.03`
and set `max_epochs=3`.
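
Put together, the training call described here looks roughly as follows. This is a sketch using the d2l classes this chapter builds up (`SyntheticRegressionData`, `LinearRegressionScratch`, `Trainer`); treat the exact constructor arguments as illustrative.

```python
import torch
from d2l import torch as d2l

# Ground truth parameters that the model should (approximately) recover.
data = d2l.SyntheticRegressionData(w=torch.tensor([2, -3.4]), b=4.2)
model = LinearRegressionScratch(2, lr=0.03)  # defined earlier in this section
trainer = d2l.Trainer(max_epochs=3)
trainer.fit(model, data)
```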
@@ -498,7 +498,7 @@ print(f"error in estimating b: {data.b - params['b']}")
```

We should not take the ability to exactly recover
-the ground-truth parameters for granted.
+the ground truth parameters for granted.
In general, for deep models unique solutions
for the parameters do not exist,
and even for linear models,
2 changes: 1 addition & 1 deletion chapter_preliminaries/linear-algebra.md
@@ -1068,7 +1068,7 @@ In deep learning, we are often trying to solve optimization problems:
*maximize* the probability assigned to observed data;
*maximize* the revenue associated with a recommender model;
*minimize* the distance between predictions
-and the ground-truth observations;
+and the ground truth observations;
*minimize* the distance between representations
of photos of the same person
while *maximizing* the distance between representations
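
Each of the objectives in this hunk reduces to a norm of a difference. For instance, a minimal sketch of the distance between predictions and ground truth observations as a squared ℓ2 norm:

```python
import torch

y_hat = torch.tensor([2.5, 0.1, -1.0])  # predictions
y = torch.tensor([3.0, 0.0, -1.2])      # ground truth observations
# Squared L2 distance, the quantity a regression loss would minimize.
print(torch.norm(y_hat - y) ** 2)       # tensor(0.3000), up to rounding
```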
4 changes: 2 additions & 2 deletions chapter_recurrent-modern/seq2seq.md
@@ -37,7 +37,7 @@ given both the input sequence
and the preceding tokens in the output.
During training, the decoder will typically
be conditioned upon the preceding tokens
-in the official "ground-truth" label.
+in the official "ground truth" label.
However, at test time, we will want to condition
each output of the decoder on the tokens already predicted.
Note that if we ignore the encoder,
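
The training-versus-test distinction in this hunk (conditioning on ground truth tokens during training, commonly called teacher forcing, versus conditioning on the model's own predictions at test time) can be sketched as follows. This is an illustrative skeleton around a hypothetical toy decoder, not the book's implementation:

```python
import torch

# Hypothetical toy decoder: maps (token ids, state) -> (logits, new state).
vocab_size, hidden = 10, 8
embed = torch.nn.Embedding(vocab_size, hidden)
rnn = torch.nn.GRUCell(hidden, hidden)
head = torch.nn.Linear(hidden, vocab_size)

def decoder(tokens, state):
    state = rnn(embed(tokens), state)
    return head(state), state

def decode_train(tgt, state):
    """Teacher forcing: feed the ground truth token at every step."""
    logits = []
    for t in range(tgt.shape[1] - 1):
        out, state = decoder(tgt[:, t], state)
        logits.append(out)
    return torch.stack(logits, dim=1)

def decode_predict(bos, state, num_steps):
    """Test time: feed the decoder's own previous prediction."""
    token, preds = bos, []
    for _ in range(num_steps):
        out, state = decoder(token, state)
        token = out.argmax(dim=-1)
        preds.append(token)
    return torch.stack(preds, dim=1)

state = torch.zeros(2, hidden)                     # batch of 2
tgt = torch.randint(vocab_size, (2, 5))            # ground truth sequence
print(decode_train(tgt, state).shape)              # torch.Size([2, 4, 10])
print(decode_predict(tgt[:, 0], state, 4).shape)   # torch.Size([2, 4])
```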
@@ -774,7 +774,7 @@ def predict_step(self, params, batch, num_steps,

We can evaluate a predicted sequence
by comparing it with the
-target sequence (the ground-truth).
+target sequence (the ground truth).
But what precisely is the appropriate measure
for comparing similarity between two sequences?
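
The measure the book settles on in the remainder of this section is BLEU, which scores n-gram overlap between the prediction and the ground truth. A minimal sketch of the core idea, clipped n-gram precision for a single n (full BLEU combines several values of n and adds a brevity penalty):

```python
import collections

def ngram_precision(pred, target, n):
    """Fraction of the prediction's n-grams that appear in the target,
    with clipped counts so repeated n-grams are not over-credited."""
    pred_ngrams = collections.Counter(
        tuple(pred[i:i + n]) for i in range(len(pred) - n + 1))
    tgt_ngrams = collections.Counter(
        tuple(target[i:i + n]) for i in range(len(target) - n + 1))
    matches = sum(min(c, tgt_ngrams[g]) for g, c in pred_ngrams.items())
    return matches / max(sum(pred_ngrams.values()), 1)

pred = "the cat sat on the mat".split()
target = "the cat is on the mat".split()
print(ngram_precision(pred, target, 1))  # 5/6 ≈ 0.833
print(ngram_precision(pred, target, 2))  # 3/5 = 0.6
```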


