diff --git a/chapter_linear-networks/linear-regression-gluon.md b/chapter_linear-networks/linear-regression-gluon.md
index 25892eaf86..9c89f54dff 100644
--- a/chapter_linear-networks/linear-regression-gluon.md
+++ b/chapter_linear-networks/linear-regression-gluon.md
@@ -10,7 +10,7 @@ data iterators,
 loss functions, model architectures, and optimizers, are so common, deep learning libraries will give us library functions for these as well.
-We have used Gluon to load the MNIST dataset in :numref:`chapter_naive_bayes`. In this section, we will how we can implement
+In this section, we will learn how we can implement
 the linear regression model in :numref:`chapter_linear_scratch` much more concisely with Gluon.
 ## Generating Data Sets
@@ -31,11 +31,7 @@ features, labels = d2l.synthetic_data(true_w, true_b, 1000)
 Rather than rolling our own iterator, we can call upon Gluon's `data` module to read data.
-Since `data` is often used as a variable name,
-we will replace it with the pseudonym `gdata`
-(adding the first letter of Gluon),
-too differentiate the imported `data` module
-from a variable we might define.
+
 The first step will be to instantiate an `ArrayDataset`, which takes in one or more ndarrays as arguments. Here, we pass in `features` and `labels` as arguments.
@@ -66,7 +62,7 @@ for X, y in data_iter:
 ## Define the Model
-When we implemented linear regression from scratch in the previous section, we had to define the model parameters and explicitly write out the calculation to produce output using basic linear algebra opertions. You should know how to do this. But once your models get more complex, even qualitatively simple changes to the model might result in many low-level changes.
+When we implemented linear regression from scratch in the previous section, we had to define the model parameters and explicitly write out the calculation to produce output using basic linear algebra operations. You should know how to do this. But once your models get more complex, even qualitatively simple changes to the model might result in many low-level changes.
 For standard operations, we can use Gluon's predefined layers, which allow us to focus especially on the layers used to construct the model rather than having to focus on the implementation.
@@ -105,7 +101,7 @@ the input shape for each layer.
 So here, we don't need to tell Gluon how many inputs go into this linear layer. When we first try to pass data through our model,
-e.g., when we exedcute `net(X)` later,
+e.g., when we execute `net(X)` later,
 Gluon will automatically infer the number of inputs to each layer. We will describe how this works in more detail in the chapter "Deep Learning Computation".
diff --git a/chapter_preliminaries/ndarray.md b/chapter_preliminaries/ndarray.md
index d4f6f9a404..2aeb60ecb0 100644
--- a/chapter_preliminaries/ndarray.md
+++ b/chapter_preliminaries/ndarray.md
@@ -408,9 +408,10 @@ we often begin with preprocessing raw data, rather than those nicely prepared data in the `ndarray` format.
 Among popular data analytic tools in Python, the `pandas` package is commonly used. Like many other extension packages in the vast ecosystem of Python, `pandas` can work together with `ndarray`.
-Before wrapping up this introductory section,
+So, before wrapping up this introductory section,
 we will briefly walk through steps for preprocessing raw data with `pandas` and converting them into the `ndarray` format.
+We will cover more data preprocessing techniques in later chapters.
 ### Loading Data
diff --git a/chapter_preliminaries/probability.md b/chapter_preliminaries/probability.md
index 66f845b2d1..664b3a8ce9 100644
--- a/chapter_preliminaries/probability.md
+++ b/chapter_preliminaries/probability.md
@@ -257,7 +257,7 @@ plt.axhline(y=0.65, color='black', linestyle='dashed');
 As we can see, on average, this sampler will generate 35% zeros and 65% ones. Now what if we have more than two possible outcomes? We can simply generalize
-this idea as follows. Given any probability distribution, e.g. $p = [0.1, 0.2, 0.05, 0.3, 0.25, 0.1]$ we can compute its cumulative distribution (python's ``cumsum`` will do this for you) $F = [0.1, 0.3, 0.35, 0.65, 0.9, 1]$. Once we have this we draw a random variable $x$ from the uniform distribution $U[0,1]$ and then find the interval where $F[i-1] \leq x < F[i]$. We then return $i$ as the sample. By construction, the chances of hitting interval $[F[i-1], F[i])$ has probability $p(i)$.
+this idea as follows. Given any probability distribution, e.g. $p = [0.1, 0.2, 0.05, 0.3, 0.25, 0.1]$ we can compute its cumulative distribution (python's `cumsum` will do this for you) $F = [0.1, 0.3, 0.35, 0.65, 0.9, 1]$. Once we have this we draw a random variable $x$ from the uniform distribution $U[0,1]$ and then find the interval where $F[i-1] \leq x < F[i]$. We then return $i$ as the sample. By construction, the chances of hitting interval $[F[i-1], F[i])$ has probability $p(i)$.
 Note that there are many more efficient algorithms for sampling than the one above. For instance, binary search over $F$ will run in $O(\log n)$ time for $n$ random variables. There are even more clever algorithms, such as the [Alias Method](https://en.wikipedia.org/wiki/Alias_method) to sample in constant time,