Fix Imputer docs and check if train data was set (#314)
* Update docs for Imputer class

* Throw exception when trying to transform imputer without train data

* Update changelog
akondas authored Oct 10, 2018
1 parent 15adf9e commit e255369
Showing 4 changed files with 40 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ This changelog references the relevant changes done in PHP-ML library.
* feature [Dataset] added removeColumns function to ArrayDataset (#249)
* feature [Dataset] added a SvmDataset class for SVM-Light (or LibSVM) format files (#237)
* feature [Optimizer] removed $initialTheta property and renamed setInitialTheta method to setTheta (#252)
* change [Imputer] Throw exception when trying to transform without train data (#314)
* enhancement Add performance test for LeastSquares (#263)
* enhancement Micro optimization for matrix multiplication (#255)
* enhancement Throw proper exception (#259, #251)
19 changes: 19 additions & 0 deletions docs/machine-learning/preprocessing/imputation-missing-values.md
@@ -8,6 +8,7 @@ To solve this problem you can use the `Imputer` class.
* $missingValue (mixed) - this value will be replaced (default null)
* $strategy (Strategy) - imputation strategy (ready to use: MeanStrategy, MedianStrategy, MostFrequentStrategy)
* $axis (int) - axis for strategy, Imputer::AXIS_COLUMN or Imputer::AXIS_ROW
* $samples (array) - array of samples to train

```
$imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN);
@@ -34,6 +35,7 @@ $data = [
];
$imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN);
$imputer->fit($data);
$imputer->transform($data);
/*
@@ -46,3 +48,20 @@ $data = [
*/
```

You can also use the `$samples` constructor parameter instead of the `fit` method:

```
use Phpml\Preprocessing\Imputer;
use Phpml\Preprocessing\Imputer\Strategy\MeanStrategy;
$data = [
    [1, null, 3, 4],
    [4, 3, 2, 1],
    [null, 6, 7, 8],
    [8, 7, null, 5],
];
$imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN, $data);
$imputer->transform($data);
```
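As a rough illustration of what the example above computes, the following plain-PHP sketch fills each `null` cell with the mean of the known values in the same column of the training data. The function `imputeColumnMeans` is hypothetical and is not part of PHP-ML. It only approximates the behaviour of `MeanStrategy` with `Imputer::AXIS_COLUMN`, under the assumption that every column has at least one known value.

```php
<?php

/**
 * Hypothetical helper, not part of PHP-ML: replaces every $missing cell
 * with the mean of the non-missing training values in the same column.
 * Assumes each column holds at least one known training value.
 */
function imputeColumnMeans(array $train, array $samples, $missing = null): array
{
    foreach ($samples as &$row) {
        foreach ($row as $column => $value) {
            if ($value !== $missing) {
                continue;
            }

            // Collect the known values of this column from the training data.
            $known = [];
            foreach ($train as $trainRow) {
                if ($trainRow[$column] !== $missing) {
                    $known[] = $trainRow[$column];
                }
            }

            $row[$column] = array_sum($known) / count($known);
        }
    }
    unset($row); // break the reference left behind by foreach

    return $samples;
}

$data = [
    [1, null, 3, 4],
    [4, 3, 2, 1],
    [null, 6, 7, 8],
    [8, 7, null, 5],
];

// Train and transform on the same data, as in the documentation example.
$filled = imputeColumnMeans($data, $data);
```

With this data the three missing cells are filled from the column means 13/3, 16/3 and 4. Unlike `Imputer::transform`, which modifies its argument by reference, this sketch returns a new array.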
5 changes: 5 additions & 0 deletions src/Preprocessing/Imputer.php
@@ -4,6 +4,7 @@

namespace Phpml\Preprocessing;

use Phpml\Exception\InvalidOperationException;
use Phpml\Preprocessing\Imputer\Strategy;

class Imputer implements Preprocessor
@@ -50,6 +51,10 @@ public function fit(array $samples, ?array $targets = null): void

public function transform(array &$samples): void
{
    if ($this->samples === []) {
        throw new InvalidOperationException('Missing training samples for Imputer.');
    }

    foreach ($samples as &$sample) {
        $this->preprocessSample($sample);
    }
15 changes: 15 additions & 0 deletions tests/Preprocessing/ImputerTest.php
@@ -4,6 +4,7 @@

namespace Phpml\Tests\Preprocessing;

use Phpml\Exception\InvalidOperationException;
use Phpml\Preprocessing\Imputer;
use Phpml\Preprocessing\Imputer\Strategy\MeanStrategy;
use Phpml\Preprocessing\Imputer\Strategy\MedianStrategy;
@@ -173,4 +174,18 @@ public function testImputerWorksOnFitSamples(): void

        $this->assertEquals($imputeData, $data, '', $delta = 0.01);
    }

    public function testThrowExceptionWhenTryingToTransformWithoutTrainSamples(): void
    {
        $this->expectException(InvalidOperationException::class);

        $data = [
            [1, 3, null],
            [6, null, 8],
            [null, 7, 5],
        ];

        $imputer = new Imputer(null, new MeanStrategy(), Imputer::AXIS_COLUMN);
        $imputer->transform($data);
    }
}
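
The guard added to `transform` and the test above can be sketched without the library as follows. `GuardedMeanImputer` is a hypothetical stand-in, not the PHP-ML `Imputer`: it uses the built-in `LogicException` in place of `InvalidOperationException`, treats `null` as the only missing value, and supports column means only. It is meant to show the lifecycle (training data supplied through the constructor or `fit`, otherwise `transform` refuses to run), not the library's actual implementation.

```php
<?php

/**
 * Hypothetical stand-in illustrating the guard pattern; not the PHP-ML Imputer.
 */
final class GuardedMeanImputer
{
    /** @var array training samples; stays empty until provided */
    private $samples;

    public function __construct(array $samples = [])
    {
        $this->samples = $samples;
    }

    public function fit(array $samples): void
    {
        $this->samples = $samples;
    }

    public function transform(array &$samples): void
    {
        // Fail early instead of imputing from an empty training set.
        if ($this->samples === []) {
            throw new LogicException('Missing training samples.');
        }

        foreach ($samples as &$sample) {
            foreach ($sample as $column => $value) {
                if ($value === null) {
                    $sample[$column] = $this->columnMean($column);
                }
            }
        }
        unset($sample); // break the reference left behind by foreach
    }

    // Mean of the known training values in one column
    // (assumes the column has at least one known value).
    private function columnMean(int $column): float
    {
        $known = [];
        foreach ($this->samples as $row) {
            if ($row[$column] !== null) {
                $known[] = $row[$column];
            }
        }

        return array_sum($known) / count($known);
    }
}

$data = [
    [1, 3, null],
    [6, null, 8],
    [null, 7, 5],
];

$imputer = new GuardedMeanImputer();

// Without training data, transform() is rejected.
try {
    $imputer->transform($data);
    $threw = false;
} catch (LogicException $e) {
    $threw = true;
}

// After fit(), the same call fills the missing cells in place.
$imputer->fit($data);
$imputer->transform($data);
```

Throwing here makes a missing `fit` call visible at the call site, rather than leaving it to surface later as an arithmetic error or a silently unchanged dataset.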
