Updates to the documentation (linguistic corrections) (#414)
* Fix typo in Features list

* Update distance.md documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation

* Fix grammatical mistakes in documentation
a-bakos authored and akondas committed Nov 2, 2019
1 parent f30e576 commit 7d5c6b1
Showing 27 changed files with 82 additions and 83 deletions.
8 changes: 4 additions & 4 deletions docs/machine-learning/association/apriori.md
@@ -15,7 +15,7 @@ $associator = new Apriori($support = 0.5, $confidence = 0.5);

### Train

To train a associator simply provide train samples and labels (as `array`). Example:
To train an associator, simply provide train samples and labels (as `array`). Example:

```
$samples = [['alpha', 'beta', 'epsilon'], ['alpha', 'beta', 'theta'], ['alpha', 'beta', 'epsilon'], ['alpha', 'beta', 'theta']];
@@ -31,7 +31,7 @@ You can train the associator using multiple data sets, predictions will be based

### Predict

To predict sample label use `predict` method. You can provide one sample or array of samples:
To predict sample label use the `predict` method. You can provide one sample or array of samples:

```
$associator->predict(['alpha','theta']);
@@ -43,7 +43,7 @@ $associator->predict([['alpha','epsilon'],['beta','theta']]);

### Associating

Get generated association rules simply use `rules` method.
To get generated association rules, simply use the `rules` method.

```
$associator->getRules();
@@ -52,7 +52,7 @@ $associator->getRules();

### Frequent item sets

Generating k-length frequent item sets simply use `apriori` method.
To generate k-length frequent item sets, simply use the `apriori` method.

```
$associator->apriori();
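For context, a minimal end-to-end sketch of the Apriori usage these hunks document, assembled from the snippets above (the empty `$labels` array is an assumption, since association rule mining needs no targets):

```php
use Phpml\Association\Apriori;

$samples = [['alpha', 'beta', 'epsilon'], ['alpha', 'beta', 'theta'], ['alpha', 'beta', 'epsilon'], ['alpha', 'beta', 'theta']];
$labels  = [];  // assumed empty; Apriori does not need targets

$associator = new Apriori($support = 0.5, $confidence = 0.5);
$associator->train($samples, $labels);

$associator->predict(['alpha', 'theta']);  // items frequently associated with the given ones
$associator->getRules();                   // generated association rules
$associator->apriori();                    // k-length frequent item sets
```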
4 changes: 2 additions & 2 deletions docs/machine-learning/classification/k-nearest-neighbors.md
@@ -14,7 +14,7 @@ $classifier = new KNearestNeighbors($k=3, new Minkowski($lambda=4));

## Train

To train a classifier simply provide train samples and labels (as `array`). Example:
To train a classifier, simply provide train samples and labels (as `array`). Example:

```
$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
@@ -28,7 +28,7 @@ You can train the classifier using multiple data sets, predictions will be based

## Predict

To predict sample label use `predict` method. You can provide one sample or array of samples:
To predict sample label use the `predict` method. You can provide one sample or array of samples:

```
$classifier->predict([3, 2]);
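A runnable sketch of the k-NN workflow described above; the labels and the predicted outputs are illustrative assumptions, not taken from the repository:

```php
use Phpml\Classification\KNearestNeighbors;

$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
$labels  = ['a', 'a', 'a', 'b', 'b', 'b'];   // one label per sample (assumed)

$classifier = new KNearestNeighbors();       // defaults: k = 3, Euclidean distance
$classifier->train($samples, $labels);

$classifier->predict([3, 2]);                // e.g. 'b'
$classifier->predict([[3, 2], [1, 5]]);      // e.g. ['b', 'a']
```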
4 changes: 2 additions & 2 deletions docs/machine-learning/classification/naive-bayes.md
@@ -4,7 +4,7 @@ Classifier based on applying Bayes' theorem with strong (naive) independence ass

### Train

To train a classifier simply provide train samples and labels (as `array`). Example:
To train a classifier, simply provide train samples and labels (as `array`). Example:

```
$samples = [[5, 1, 1], [1, 5, 1], [1, 1, 5]];
@@ -18,7 +18,7 @@ You can train the classifier using multiple data sets, predictions will be based

### Predict

To predict sample label use `predict` method. You can provide one sample or array of samples:
To predict sample label use the `predict` method. You can provide one sample or array of samples:

```
$classifier->predict([3, 1, 1]);
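A similar sketch for NaiveBayes, again with assumed labels and illustrative predictions:

```php
use Phpml\Classification\NaiveBayes;

$samples = [[5, 1, 1], [1, 5, 1], [1, 1, 5]];
$labels  = ['a', 'b', 'c'];   // one label per sample (assumed)

$classifier = new NaiveBayes();
$classifier->train($samples, $labels);

$classifier->predict([3, 1, 1]);               // e.g. 'a'
$classifier->predict([[3, 1, 1], [1, 4, 1]]);  // e.g. ['a', 'b']
```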
6 changes: 3 additions & 3 deletions docs/machine-learning/classification/svc.md
@@ -21,7 +21,7 @@ $classifier = new SVC(Kernel::RBF, $cost = 1000, $degree = 3, $gamma = 6);

### Train

To train a classifier simply provide train samples and labels (as `array`). Example:
To train a classifier, simply provide train samples and labels (as `array`). Example:

```
use Phpml\Classification\SVC;
@@ -38,7 +38,7 @@ You can train the classifier using multiple data sets, predictions will be based

### Predict

To predict sample label use `predict` method. You can provide one sample or array of samples:
To predict sample label use the `predict` method. You can provide one sample or array of samples:

```
$classifier->predict([3, 2]);
@@ -74,7 +74,7 @@ $classifier = new SVC(
$classifier->train($samples, $labels);
```

Then use `predictProbability` method instead of `predict`:
Then use the `predictProbability` method instead of `predict`:

```
$classifier->predictProbability([3, 2]);
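A sketch of basic SVC training and prediction under the same assumptions (labels and outputs are illustrative); to call `predictProbability`, the classifier additionally has to be constructed with probability estimates enabled, as the collapsed constructor above indicates:

```php
use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;

$samples = [[1, 3], [1, 4], [2, 4], [3, 1], [4, 1], [4, 2]];
$labels  = ['a', 'a', 'a', 'b', 'b', 'b'];   // one label per sample (assumed)

$classifier = new SVC(Kernel::RBF, $cost = 1000, $degree = 3, $gamma = 6);
$classifier->train($samples, $labels);

$classifier->predict([3, 2]);            // e.g. 'b'
$classifier->predict([[3, 2], [1, 5]]);  // e.g. ['b', 'a']
```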
4 changes: 2 additions & 2 deletions docs/machine-learning/clustering/dbscan.md
@@ -16,12 +16,12 @@ $dbscan = new DBSCAN($epsilon = 2, $minSamples = 3, new Minkowski($lambda=4));

### Clustering

To divide the samples into clusters simply use `cluster` method. It's return the `array` of clusters with samples inside.
To divide the samples into clusters, simply use the `cluster` method. It returns the `array` of clusters with samples inside.

```
$samples = [[1, 1], [8, 7], [1, 2], [7, 8], [2, 1], [8, 9]];
$dbscan = new DBSCAN($epsilon = 2, $minSamples = 3);
$dbscan->cluster($samples);
// return [0=>[[1, 1], ...], 1=>[[8, 7], ...]]
// return [0=>[[1, 1], ...], 1=>[[8, 7], ...]]
```
10 changes: 5 additions & 5 deletions docs/machine-learning/clustering/k-means.md
@@ -1,6 +1,6 @@
# K-means clustering

The K-Means algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares.
The K-Means algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares.
This algorithm requires the number of clusters to be specified.

### Constructor Parameters
@@ -15,11 +15,11 @@ $kmeans = new KMeans(4, KMeans::INIT_RANDOM);

### Clustering

To divide the samples into clusters simply use `cluster` method. It's return the `array` of clusters with samples inside.
To divide the samples into clusters, simply use the `cluster` method. It returns the `array` of clusters with samples inside.

```
$samples = [[1, 1], [8, 7], [1, 2], [7, 8], [2, 1], [8, 9]];
Or if you need to keep your indentifiers along with yours samples you can use array keys as labels.
Or if you need to keep your identifiers along with yours samples you can use array keys as labels.
$samples = [ 'Label1' => [1, 1], 'Label2' => [8, 7], 'Label3' => [1, 2]];
$kmeans = new KMeans(2);
@@ -32,8 +32,8 @@ $kmeans->cluster($samples);
#### kmeans++ (default)

K-means++ method selects initial cluster centers for k-mean clustering in a smart way to speed up convergence.
It use the DASV seeding method consists of finding good initial centroids for the clusters.
It uses the DASV seeding method consists of finding good initial centroids for the clusters.

#### random

Random initialization method chooses completely random centroid. It get the space boundaries to avoid placing clusters centroid too far from samples data.
Random initialization method chooses completely random centroid. It gets the space boundaries to avoid placing cluster centroids too far from samples data.
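A sketch of the labelled-samples variant described in the hunk above; the exact shape of the returned clusters is an assumption:

```php
use Phpml\Clustering\KMeans;

// Keep identifiers by using array keys as labels:
$samples = ['Label1' => [1, 1], 'Label2' => [8, 7], 'Label3' => [1, 2]];

$kmeans = new KMeans(2);
$kmeans->cluster($samples);
// e.g. [0 => ['Label1' => [1, 1], 'Label3' => [1, 2]], 1 => ['Label2' => [8, 7]]]
```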
6 changes: 3 additions & 3 deletions docs/machine-learning/cross-validation/random-split.md
@@ -1,20 +1,20 @@
# Random Split

One of the simplest methods from Cross-validation is implemented as `RandomSpilt` class. Samples are split to two groups: train group and test group. You can adjust number of samples in each group.
One of the simplest methods from Cross-validation is implemented as `RandomSpilt` class. Samples are split to two groups: train group and test group. You can adjust the number of samples in each group.

### Constructor Parameters

* $dataset - object that implements `Dataset` interface
* $testSize - a fraction of test split (float, from 0 to 1, default: 0.3)
* $seed - seed for random generator (e.g. for tests)

```
$randomSplit = new RandomSplit($dataset, 0.2);
```

### Samples and labels groups

To get samples or labels from test and train group you can use getters:
To get samples or labels from test and train group, you can use getters:

```
$dataset = new RandomSplit($dataset, 0.3, 1234);
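A sketch of splitting a dataset and reading back the groups; the getter names are assumed from the library's split API, since they sit in the part of the file the diff collapses:

```php
use Phpml\CrossValidation\RandomSplit;
use Phpml\Dataset\ArrayDataset;

$dataset = new ArrayDataset(
    [[1, 1], [2, 1], [3, 2], [4, 1]],   // assumed samples
    ['a', 'a', 'b', 'b']                // assumed labels
);

$split = new RandomSplit($dataset, 0.3, 1234);  // 30% test split, fixed seed

$split->getTrainSamples();
$split->getTrainLabels();
$split->getTestSamples();
$split->getTestLabels();
```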
docs/machine-learning/cross-validation/stratified-random-split.md
@@ -1,22 +1,22 @@
# Stratified Random Split

Analogously to `RandomSpilt` class samples are split to two groups: train group and test group.
Analogously to `RandomSpilt` class, samples are split to two groups: train group and test group.
Distribution of samples takes into account their targets and trying to divide them equally.
You can adjust number of samples in each group.
You can adjust the number of samples in each group.

### Constructor Parameters

* $dataset - object that implements `Dataset` interface
* $testSize - a fraction of test split (float, from 0 to 1, default: 0.3)
* $seed - seed for random generator (e.g. for tests)

```
$split = new StratifiedRandomSplit($dataset, 0.2);
```

### Samples and labels groups

To get samples or labels from test and train group you can use getters:
To get samples or labels from test and train group, you can use getters:

```
$dataset = new StratifiedRandomSplit($dataset, 0.3, 1234);
@@ -41,4 +41,4 @@ $dataset = new ArrayDataset(
$split = new StratifiedRandomSplit($dataset, 0.5);
```

Split will have equals amount of each target. Two of the target `a` and two of `b`.
Split will have equal amounts of each target. Two of the target `a` and two of `b`.
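A sketch of the equal-target behaviour described in this last hunk, with the dataset assumed from the visible `ArrayDataset` fragment and the getter name assumed as above:

```php
use Phpml\CrossValidation\StratifiedRandomSplit;
use Phpml\Dataset\ArrayDataset;

$dataset = new ArrayDataset(
    [[1, 1], [2, 1], [3, 2], [4, 1]],   // assumed samples
    ['a', 'a', 'b', 'b']                // two samples per target
);

$split = new StratifiedRandomSplit($dataset, 0.5);

// With a 0.5 test size each group should hold one 'a' and one 'b',
// because the split keeps the target distribution balanced.
$split->getTestLabels();
```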
6 changes: 3 additions & 3 deletions docs/machine-learning/datasets/array-dataset.md
@@ -2,7 +2,7 @@

Helper class that holds data as PHP `array` type. Implements the `Dataset` interface which is used heavily in other classes.

### Constructors Parameters
### Constructor Parameters

* $samples - (array) of samples
* $labels - (array) of labels
@@ -15,7 +15,7 @@ $dataset = new ArrayDataset([[1, 1], [2, 1], [3, 2], [4, 1]], ['a', 'a', 'b', 'b

### Samples and labels

To get samples or labels you can use getters:
To get samples or labels, you can use getters:

```
$dataset->getSamples();
@@ -24,7 +24,7 @@ $dataset->getTargets();

### Remove columns

You can remove columns by index numbers, for example:
You can remove columns by their index numbers, for example:

```
use Phpml\Dataset\ArrayDataset;
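A sketch combining the getters and the column removal mentioned above; the `removeColumns` call and its effect are assumptions based on the surrounding text:

```php
use Phpml\Dataset\ArrayDataset;

$dataset = new ArrayDataset([[1, 1], [2, 1], [3, 2], [4, 1]], ['a', 'a', 'b', 'b']);

$dataset->getSamples();   // [[1, 1], [2, 1], [3, 2], [4, 1]]
$dataset->getTargets();   // ['a', 'a', 'b', 'b']

$dataset->removeColumns([0]);  // drop the first feature column from every sample
$dataset->getSamples();        // [[1], [1], [2], [1]]
```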
4 changes: 2 additions & 2 deletions docs/machine-learning/datasets/csv-dataset.md
@@ -2,11 +2,11 @@

Helper class that loads data from CSV file. It extends the `ArrayDataset`.

### Constructors Parameters
### Constructor Parameters

* $filepath - (string) path to `.csv` file
* $features - (int) number of columns that are features (starts from first column), last column must be a label
* $headingRow - (bool) define is file have a heading row (if `true` then first row will be ignored)
* $headingRow - (bool) define if the file has a heading row (if `true` then first row will be ignored)

```
$dataset = new CsvDataset('dataset.csv', 2, true);
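A sketch of how the constructor parameters above fit together; `dataset.csv` is a placeholder file:

```php
use Phpml\Dataset\CsvDataset;

// Two feature columns, last column used as the label, first row treated as a heading and skipped.
$dataset = new CsvDataset('dataset.csv', 2, true);

$dataset->getSamples();  // the two feature columns
$dataset->getTargets();  // the last column as labels
```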
4 changes: 2 additions & 2 deletions docs/machine-learning/datasets/files-dataset.md
@@ -2,7 +2,7 @@

Helper class that loads dataset from files. Use folder names as targets. It extends the `ArrayDataset`.

### Constructors Parameters
### Constructor Parameters

* $rootPath - (string) path to root folder that contains files dataset

@@ -42,7 +42,7 @@ data
...
```

Load files data with `FilesDataset`:
Load files data with `FilesDataset`:

```
use Phpml\Dataset\FilesDataset;
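A sketch of loading a folder layout like the one outlined above; the paths and category names are placeholders:

```php
use Phpml\Dataset\FilesDataset;

// Assumes a layout such as data/business/001.txt, data/sports/001.txt, ...
$dataset = new FilesDataset('path/to/data');

$dataset->getSamples();  // file contents
$dataset->getTargets();  // folder names used as targets, e.g. 'business', 'sports'
```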
4 changes: 2 additions & 2 deletions docs/machine-learning/datasets/mnist-dataset.md
@@ -1,6 +1,6 @@
# MnistDataset

Helper class that load data from MNIST dataset: [http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/)
Helper class that loads data from MNIST dataset: [http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/)

> The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.
@@ -18,7 +18,7 @@ $trainDataset = new MnistDataset('train-images-idx3-ubyte', 'train-labels-idx1-u

### Samples and labels

To get samples or labels you can use getters:
To get samples or labels, you can use getters:

```
$dataset->getSamples();
2 changes: 1 addition & 1 deletion docs/machine-learning/datasets/svm-dataset.md
@@ -2,7 +2,7 @@

Helper class that loads data from SVM-Light format file. It extends the `ArrayDataset`.

### Constructors Parameters
### Constructor Parameters

* $filepath - (string) path to the file

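A sketch for completeness; `dataset.svm` is a placeholder SVM-Light file, and the getters are inherited from `ArrayDataset`:

```php
use Phpml\Dataset\SvmDataset;

$dataset = new SvmDataset('dataset.svm');

$dataset->getSamples();
$dataset->getTargets();
```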
docs/machine-learning/feature-extraction/tf-idf-transformer.md
@@ -19,7 +19,7 @@ $transformer = new TfIdfTransformer($samples);

### Transformation

To transform a collection of text samples use `transform` method. Example:
To transform a collection of text samples, use the `transform` method. Example:

```
use Phpml\FeatureExtraction\TfIdfTransformer;
@@ -28,7 +28,7 @@ $samples = [
[0 => 1, 1 => 1, 2 => 2, 3 => 1, 4 => 0, 5 => 0],
[0 => 1, 1 => 1, 2 => 0, 3 => 0, 4 => 2, 5 => 3],
];
$transformer = new TfIdfTransformer($samples);
$transformer->transform($samples);
@@ -38,5 +38,5 @@ $samples = [
[0 => 0, 1 => 0, 2 => 0, 3 => 0, 4 => 0.602, 5 => 0.903],
];
*/
```
docs/machine-learning/feature-extraction/token-count-vectorizer.md
@@ -16,7 +16,7 @@ $vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());

### Transformation

To transform a collection of text samples use `transform` method. Example:
To transform a collection of text samples, use the `transform` method. Example:

```
$samples = [
@@ -42,7 +42,7 @@ $vectorizer->transform($samples);

### Vocabulary

You can extract vocabulary using `getVocabulary()` method. Example:
You can extract vocabulary using the `getVocabulary()` method. Example:

```
$vectorizer->getVocabulary();
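A sketch of the transform-then-inspect-vocabulary flow described above; the text samples and the resulting vocabulary are illustrative:

```php
use Phpml\FeatureExtraction\TokenCountVectorizer;
use Phpml\Tokenization\WhitespaceTokenizer;

$samples = [
    'Lorem ipsum dolor sit amet dolor',
    'Mauris placerat ipsum dolor',
];

$vectorizer = new TokenCountVectorizer(new WhitespaceTokenizer());
$vectorizer->fit($samples);        // build the vocabulary
$vectorizer->transform($samples);  // each sample becomes an array of token counts

$vectorizer->getVocabulary();      // e.g. [0 => 'Lorem', 1 => 'ipsum', 2 => 'dolor', ...]
```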
12 changes: 6 additions & 6 deletions docs/machine-learning/feature-selection/selectkbest.md
@@ -5,7 +5,7 @@
## Constructor Parameters

* $k (int) - number of top features to select, rest will be removed (default: 10)
* $scoringFunction (ScoringFunction) - function that take samples and targets and return array with scores (default: ANOVAFValue)
* $scoringFunction (ScoringFunction) - function that takes samples and targets and returns an array with scores (default: ANOVAFValue)

```php
use Phpml\FeatureSelection\SelectKBest;
@@ -27,13 +27,13 @@ $selector->fit($samples = $dataset->getSamples(), $dataset->getTargets());
$selector->transform($samples);

/*
$samples[0] = [1.4, 0.2];
$samples[0] = [1.4, 0.2];
*/
```

## Scores

You can get a array with the calculated score for each feature.
You can get an array with the calculated score for each feature.
A higher value means that a given feature is better suited for learning.
Of course, the rating depends on the scoring function used.

@@ -56,7 +56,7 @@ $selector->scores();
float(1179.0343277002)
[3]=>
float(959.32440572573)
}
}
*/
```

@@ -70,11 +70,11 @@ For classification:
The test is applied to samples from two or more groups, possibly with differing sizes.

For regression:
- **UnivariateLinearRegression**
- **UnivariateLinearRegression**
Quick linear model for testing the effect of a single regressor, sequentially for many regressors.
This is done in 2 steps:
- 1. The cross correlation between each regressor and the target is computed, that is, ((X[:, i] - mean(X[:, i])) * (y - mean_y)) / (std(X[:, i]) *std(y)).
- 2. It is converted to an F score
- 2. It is converted to an F score

## Pipeline

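A sketch of the fit/transform/scores flow these hunks touch; the sample data and targets are illustrative, and `transform` is assumed to reduce the samples in place to the selected columns:

```php
use Phpml\FeatureSelection\SelectKBest;

$samples = [[1, 10, 3], [2, 11, 4], [8, 20, 3], [9, 21, 4]];  // assumed samples
$targets = ['a', 'a', 'b', 'b'];                              // assumed class targets

$selector = new SelectKBest(2);      // keep the two best-scoring features
$selector->fit($samples, $targets);
$selector->transform($samples);      // $samples should now hold only the two most discriminative columns

$selector->scores();                 // one score per original feature
```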
