Commit

Add setup.py
sooftware committed Mar 5, 2020
1 parent 3dffc57 commit 36ef0b6
Showing 53 changed files with 186 additions and 495 deletions.
2 changes: 2 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 12 additions & 0 deletions .idea/Korean-Speech-Recognition.iml

6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml

4 changes: 4 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

6 changes: 6 additions & 0 deletions .idea/vcs.xml

Binary file modified docs/build/doctrees/Dataset.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Distance.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Evaluator.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Feature.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Hparams.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Label.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Load.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Loader.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Loss.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Lr.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Models.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/Trainer.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/build/doctrees/index.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/notes/More-details.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/notes/Preparation.doctree
Binary file not shown.
Binary file modified docs/build/doctrees/notes/intro.doctree
Binary file not shown.
91 changes: 2 additions & 89 deletions docs/build/html/Dataset.html
@@ -173,95 +173,8 @@
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">

<div class="section" id="module-utils.dataset">
<span id="dataset"></span><h1>Dataset<a class="headerlink" href="#module-utils.dataset" title="Permalink to this headline"></a></h1>
<dl class="class">
<dt id="utils.dataset.BaseDataset">
<em class="property">class </em><code class="descclassname">utils.dataset.</code><code class="descname">BaseDataset</code><span class="sig-paren">(</span><em>audio_paths</em>, <em>label_paths</em>, <em>sos_id</em>, <em>eos_id</em>, <em>target_dict=None</em>, <em>input_reverse=True</em>, <em>use_augment=True</em>, <em>batch_size=None</em>, <em>augment_ratio=1.0</em>, <em>pack_by_length=True</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/dataset.html#BaseDataset"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.dataset.BaseDataset" title="Permalink to this definition"></a></dt>
<dd><p>Dataset for audio &amp; label matching</p>
<dl class="docutils">
<dt>Args: audio_paths, label_paths, sos_id, eos_id, target_dict</dt>
<dd><ul class="first last simple">
<li><strong>audio_paths</strong> (list): list of audio file paths</li>
<li><strong>label_paths</strong> (list): list of label file paths</li>
<li><strong>sos_id</strong> (int): ID of the &lt;s&gt; token</li>
<li><strong>eos_id</strong> (int): ID of the &lt;/s&gt; token</li>
<li><strong>target_dict</strong> (dict): dictionary mapping filenames to labels</li>
</ul>
</dd>
<dt>Inputs:</dt>
<dd><ul class="first last simple">
<li><strong>index</strong> (int): index into the dataset</li>
</ul>
</dd>
<dt>Outputs:</dt>
<dd><ul class="first last simple">
<li><strong>feat</strong>: feature vector for audio</li>
<li><strong>label</strong>: label for audio</li>
</ul>
</dd>
</dl>
<dl class="method">
<dt id="utils.dataset.BaseDataset.augmentation">
<code class="descname">augmentation</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/dataset.html#BaseDataset.augmentation"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.dataset.BaseDataset.augmentation" title="Permalink to this definition"></a></dt>
<dd><p>Apply Spec-Augmentation</p>
</dd></dl>

<dl class="method">
<dt id="utils.dataset.BaseDataset.batch_shuffle">
<code class="descname">batch_shuffle</code><span class="sig-paren">(</span><em>remain_drop=False</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/dataset.html#BaseDataset.batch_shuffle"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.dataset.BaseDataset.batch_shuffle" title="Permalink to this definition"></a></dt>
<dd><p>Shuffle the dataset in units of batches</p>
</dd></dl>

<dl class="method">
<dt id="utils.dataset.BaseDataset.shuffle">
<code class="descname">shuffle</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/dataset.html#BaseDataset.shuffle"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.dataset.BaseDataset.shuffle" title="Permalink to this definition"></a></dt>
<dd><p>Shuffle Dataset</p>
</dd></dl>

<dl class="method">
<dt id="utils.dataset.BaseDataset.sort_by_length">
<code class="descname">sort_by_length</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/dataset.html#BaseDataset.sort_by_length"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.dataset.BaseDataset.sort_by_length" title="Permalink to this definition"></a></dt>
<dd><p>Sort in descending order of sequence length</p>
</dd></dl>

</dd></dl>
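The two methods above, sort_by_length and batch_shuffle, suggest a common pattern: sort utterances by length so each batch needs little padding, then shuffle whole batches so training order still varies. The following is a minimal sketch of that pattern, not the repository's implementation. The function name batch_shuffle_sketch, the lengths argument, the seed argument, and the reading of remain_drop as "drop the final incomplete batch" are all assumptions made for illustration.

```python
import random


def batch_shuffle_sketch(lengths, batch_size, remain_drop=False, seed=None):
    """Group indices into length-sorted batches, then shuffle the batches.

    `lengths` holds one sequence length per example (hypothetical input).
    Returns a list of batches, each a list of example indices.
    """
    # Sort indices so the longest sequences come first (descending order).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)

    # Cut the sorted order into consecutive batches of similar length.
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

    # Assumption: remain_drop discards a trailing batch that is not full.
    if remain_drop and batches and len(batches[-1]) < batch_size:
        batches.pop()

    # Shuffle whole batches; the order inside each batch stays sorted.
    random.Random(seed).shuffle(batches)
    return batches
```

Because only whole batches are shuffled, every batch keeps examples of similar length together, which is usually the reason for sorting first.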

<dl class="function">
<dt id="utils.dataset.split_dataset">
<code class="descclassname">utils.dataset.</code><code class="descname">split_dataset</code><span class="sig-paren">(</span><em>hparams</em>, <em>audio_paths</em>, <em>label_paths</em>, <em>valid_ratio=0.05</em>, <em>target_dict=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/dataset.html#split_dataset"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.dataset.split_dataset" title="Permalink to this definition"></a></dt>
<dd><p>Split the dataset into training and validation datasets.</p>
<dl class="docutils">
<dt>Inputs: hparams, audio_paths, label_paths, target_dict</dt>
<dd><ul class="first last simple">
<li><strong>valid_ratio</strong> (float): ratio for validation data</li>
<li><strong>hparams</strong> (HyperParams): set of hyper parameters</li>
<li><strong>audio_paths</strong> (list): list of audio file paths</li>
<li><strong>label_paths</strong> (list): list of label file paths</li>
<li><strong>target_dict</strong> (dict): dictionary of filename and labels</li>
</ul>
</dd>
<dt>Local Variables:</dt>
<dd><ul class="first last simple">
<li><strong>train_num</strong> (int): number of training examples</li>
<li><strong>batch_num</strong> (int): total number of batches</li>
<li><strong>valid_batch_num</strong> (int): number of batches for validation</li>
<li><strong>train_num_per_worker</strong> (int): number of training examples per CPU core</li>
<li><strong>data_paths</strong> (list): temporary variable pairing audio_paths and label_paths so they are shuffled in the same order</li>
<li><strong>train_begin_idx</strong> (int): begin index of a worker's training dataset</li>
<li><strong>train_end_idx</strong> (int): end index of a worker's training dataset</li>
</ul>
</dd>
<dt>Outputs: train_batch_num, train_dataset_list, valid_dataset</dt>
<dd><ul class="first last simple">
<li><strong>train_batch_num</strong> (int): number of batches for training</li>
<li><strong>train_dataset_list</strong> (list): list of training datasets</li>
<li><strong>valid_dataset</strong> (BaseDataset): validation dataset</li>
</ul>
</dd>
</dl>
</dd></dl>
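The split described above keeps audio and label paths aligned by shuffling them as pairs before cutting off the validation portion. The following is a minimal sketch of that idea only. The name split_paths and the seed argument are hypothetical, and the per-worker partitioning and BaseDataset construction mentioned in the docstring are left out.

```python
import random


def split_paths(audio_paths, label_paths, valid_ratio=0.05, seed=None):
    """Shuffle (audio, label) pairs together and split off a validation set."""
    if len(audio_paths) != len(label_paths):
        raise ValueError("audio_paths and label_paths must have the same length")

    # Pair the paths so one shuffle keeps each audio file with its label.
    pairs = list(zip(audio_paths, label_paths))
    random.Random(seed).shuffle(pairs)

    # The last `valid_num` pairs become the validation set.
    valid_num = int(len(pairs) * valid_ratio)
    train_num = len(pairs) - valid_num
    return pairs[:train_num], pairs[train_num:]


if __name__ == "__main__":
    audio = [f"audio_{i}.pcm" for i in range(20)]
    label = [f"label_{i}.txt" for i in range(20)]
    train, valid = split_paths(audio, label, valid_ratio=0.25, seed=1)
    print(len(train), len(valid))
```

Shuffling the zipped pairs rather than the two lists separately is what guarantees that each audio path stays matched with its label path.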

<div class="section" id="dataset">
<h1>Dataset<a class="headerlink" href="#dataset" title="Permalink to this headline"></a></h1>
</div>


23 changes: 11 additions & 12 deletions docs/build/html/Distance.html
@@ -175,18 +175,17 @@

<div class="section" id="module-utils.distance">
<span id="distance"></span><h1>Distance<a class="headerlink" href="#module-utils.distance" title="Permalink to this headline"></a></h1>
<dl class="function">
<dt id="utils.distance.char_distance">
<code class="descclassname">utils.distance.</code><code class="descname">char_distance</code><span class="sig-paren">(</span><em>target</em>, <em>y_hat</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/distance.html#char_distance"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.distance.char_distance" title="Permalink to this definition"></a></dt>
<dd><p>get Levenshtein distance</p>
</dd></dl>

<dl class="function">
<dt id="utils.distance.get_distance">
<code class="descclassname">utils.distance.</code><code class="descname">get_distance</code><span class="sig-paren">(</span><em>targets</em>, <em>y_hats</em>, <em>id2char</em>, <em>eos_id</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/distance.html#get_distance"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.distance.get_distance" title="Permalink to this definition"></a></dt>
<dd><p>get character distance</p>
</dd></dl>
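The two functions above refer to the Levenshtein (edit) distance between a target transcript and a hypothesis. The following is a minimal sketch of the standard dynamic-programming algorithm and of a character error rate built on it. It is not the code behind char_distance or get_distance; the names levenshtein and character_error_rate are hypothetical.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (both are sequences of characters)."""
    # prev[j] is the distance between the empty prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(targets, y_hats):
    """Total edit distance divided by total target length over a batch."""
    total_dist = sum(levenshtein(t, y) for t, y in zip(targets, y_hats))
    total_len = sum(len(t) for t in targets)
    return total_dist / total_len if total_len else 0.0
```

Summing distances and lengths over the whole batch before dividing weights long utterances more heavily than averaging per-utterance rates would.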

<p>Copyright 2020- Kai.Lib
Licensed under the Apache License, Version 2.0 (the “License”);
you may not use this file except in compliance with the License.
You may obtain a copy of the License at</p>
<blockquote>
<div><a class="reference external" href="http://www.apache.org/licenses/LICENSE-2.0">http://www.apache.org/licenses/LICENSE-2.0</a></div></blockquote>
<p>Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an “AS IS” BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.</p>
</div>


31 changes: 2 additions & 29 deletions docs/build/html/Evaluator.html
@@ -173,35 +173,8 @@
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">

<div class="section" id="module-train.evaluator">
<span id="evaluator"></span><h1>Evaluator<a class="headerlink" href="#module-train.evaluator" title="Permalink to this headline"></a></h1>
<dl class="function">
<dt id="train.evaluator.evaluate">
<code class="descclassname">train.evaluator.</code><code class="descname">evaluate</code><span class="sig-paren">(</span><em>model</em>, <em>queue</em>, <em>criterion</em>, <em>device</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/train/evaluator.html#evaluate"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#train.evaluator.evaluate" title="Permalink to this definition"></a></dt>
<dd><table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
<li><strong>model</strong> (<em>-</em>) – Model to be evaluated</li>
<li><strong>queue</strong> (<em>-</em>) – queue for threading</li>
<li><strong>criterion</strong> (<em>-</em>) – loss function, e.g. nn.CrossEntropyLoss or LabelSmoothingLoss</li>
<li><strong>device</strong> (<em>-</em>) – device used (‘cuda’ or ‘cpu’)</li>
</ul>
</td>
</tr>
</tbody>
</table>
<dl class="docutils">
<dt>Outputs:</dt>
<dd><ul class="first last simple">
<li><strong>loss</strong>: loss of evaluation</li>
<li><strong>cer</strong>: character error rate of evaluation</li>
</ul>
</dd>
</dl>
</dd></dl>

<div class="section" id="evaluator">
<h1>Evaluator<a class="headerlink" href="#evaluator" title="Permalink to this headline"></a></h1>
</div>


115 changes: 2 additions & 113 deletions docs/build/html/Feature.html
@@ -173,119 +173,8 @@
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">

<div class="section" id="module-utils.feature">
<span id="feature"></span><h1>Feature<a class="headerlink" href="#module-utils.feature" title="Permalink to this headline"></a></h1>
<dl class="function">
<dt id="utils.feature.get_librosa_melspectrogram">
<code class="descclassname">utils.feature.</code><code class="descname">get_librosa_melspectrogram</code><span class="sig-paren">(</span><em>filepath</em>, <em>n_mels=80</em>, <em>del_silence=False</em>, <em>input_reverse=True</em>, <em>mel_type='log_mel'</em>, <em>format='pcm'</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/feature.html#get_librosa_melspectrogram"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.feature.get_librosa_melspectrogram" title="Permalink to this definition"></a></dt>
<dd><p>Provides Mel-Spectrogram (or Log-Mel) for Audio</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
<li><strong>filepath</strong> (<em>-</em>) – path of the audio file</li>
<li><strong>n_mels</strong> (<em>-</em>) – number of mel filters (default: 80)</li>
<li><strong>del_silence</strong> (<em>-</em>) – flag indicating whether to delete silence (default: False)</li>
<li><strong>mel_type</strong> (<em>-</em>) – if ‘log_mel’, return log-mel (default: ‘log_mel’)</li>
<li><strong>input_reverse</strong> (<em>-</em>) – flag indicating whether to reverse the input (default: True)</li>
<li><strong>format</strong> (<em>-</em>) – file format, e.g. pcm or wav (default: pcm)</li>
</ul>
</td>
</tr>
</tbody>
</table>
<dl class="docutils">
<dt>Feature Parameters:</dt>
<dd><ul class="first last simple">
<li><strong>sample rate</strong>: the AI Hub dataset's sample rate is 16,000 Hz</li>
<li><strong>frame length</strong>: 25ms</li>
<li><strong>stride</strong>: 10ms</li>
<li><strong>overlap</strong>: 15ms</li>
<li><strong>window</strong>: Hamming Window</li>
<li><strong>n_fft</strong>: sr * frame_length (16,000 * 30ms)</li>
<li><strong>hop_length</strong>: sr * stride (16,000 * 7.5ms)</li>
</ul>
</dd>
<dt>Outputs:</dt>
<dd>feat: return Mel-Spectrogram (or Log-Mel)</dd>
</dl>
</dd></dl>
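The feature parameters above derive n_fft and hop_length from the sample rate, the frame length, and the stride. The following is a minimal sketch of that arithmetic, assuming the listed 25 ms frame length and 10 ms stride. Note that the parenthetical figures in the list (30 ms and 7.5 ms) differ from those values, and which pair the code actually used is not stated here. The helper names frame_params and num_frames are hypothetical.

```python
def frame_params(sample_rate=16000, frame_ms=25, stride_ms=10):
    """Convert frame length and stride from milliseconds to samples."""
    n_fft = sample_rate * frame_ms // 1000        # samples per analysis window
    hop_length = sample_rate * stride_ms // 1000  # samples between window starts
    overlap_ms = frame_ms - stride_ms             # shared span of adjacent windows
    return n_fft, hop_length, overlap_ms


def num_frames(num_samples, n_fft, hop_length):
    """Number of full windows that fit in the signal when no padding is applied."""
    if num_samples < n_fft:
        return 0
    return 1 + (num_samples - n_fft) // hop_length


if __name__ == "__main__":
    n_fft, hop_length, overlap_ms = frame_params()
    print(n_fft, hop_length, overlap_ms)
    print(num_frames(16000, n_fft, hop_length))
```

With these assumptions the overlap of 15 ms in the list follows directly from 25 ms minus 10 ms.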

<dl class="function">
<dt id="utils.feature.get_librosa_mfcc">
<code class="descclassname">utils.feature.</code><code class="descname">get_librosa_mfcc</code><span class="sig-paren">(</span><em>filepath=None</em>, <em>n_mfcc=33</em>, <em>del_silence=False</em>, <em>input_reverse=True</em>, <em>format='pcm'</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/feature.html#get_librosa_mfcc"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.feature.get_librosa_mfcc" title="Permalink to this definition"></a></dt>
<dd><p>Provides Mel Frequency Cepstral Coefficient (MFCC) for Audio</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
<li><strong>filepath</strong> (<em>-</em>) – path of the audio file</li>
<li><strong>n_mfcc</strong> (<em>-</em>) – number of MFCC coefficients (default: 33)</li>
<li><strong>del_silence</strong> (<em>-</em>) – flag indicating whether to delete silence (default: False)</li>
<li><strong>input_reverse</strong> (<em>-</em>) – flag indicating whether to reverse the input (default: True)</li>
<li><strong>format</strong> (<em>-</em>) – file format, e.g. pcm or wav (default: pcm)</li>
</ul>
</td>
</tr>
</tbody>
</table>
<dl class="docutils">
<dt>Feature Parameters:</dt>
<dd><ul class="first last simple">
<li><strong>sample rate</strong>: the AI Hub dataset's sample rate is 16,000 Hz</li>
<li><strong>frame length</strong>: 25ms</li>
<li><strong>stride</strong>: 10ms</li>
<li><strong>overlap</strong>: 15ms</li>
<li><strong>window</strong>: Hamming Window</li>
<li><strong>n_fft</strong>: sr * frame_length (16,000 * 30ms)</li>
<li><strong>hop_length</strong>: sr * stride (16,000 * 7.5ms)</li>
</ul>
</dd>
<dt>Outputs</dt>
<dd><ul class="first last simple">
<li><strong>feat</strong> (torch.Tensor): MFCC values of signal</li>
</ul>
</dd>
</dl>
</dd></dl>

<dl class="function">
<dt id="utils.feature.spec_augment">
<code class="descclassname">utils.feature.</code><code class="descname">spec_augment</code><span class="sig-paren">(</span><em>feat</em>, <em>T=70</em>, <em>F=20</em>, <em>time_mask_num=2</em>, <em>freq_mask_num=2</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/utils/feature.html#spec_augment"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#utils.feature.spec_augment" title="Permalink to this definition"></a></dt>
<dd><p>Provides Augmentation for audio</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
<li><strong>feat</strong> (<em>-</em>) – input data feature</li>
<li><strong>T</strong> (<em>-</em>) – hyperparameter limiting the maximum length of a time mask</li>
<li><strong>F</strong> (<em>-</em>) – hyperparameter limiting the maximum length of a frequency mask</li>
<li><strong>time_mask_num</strong> (<em>-</em>) – number of time masks to apply</li>
<li><strong>freq_mask_num</strong> (<em>-</em>) – number of frequency masks to apply</li>
</ul>
</td>
</tr>
</tbody>
</table>
<dl class="docutils">
<dt>Outputs:</dt>
<dd><ul class="first last simple">
<li><strong>feat</strong>: Augmented feature</li>
</ul>
</dd>
<dt>Reference :</dt>
<dd><dl class="first last docutils">
<dt>「SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition」Google Brain Team. 2019.</dt>
<dd><a class="reference external" href="https://github.com/DemisEom/SpecAugment/blob/master/SpecAugment/spec_augment_pytorch.py">https://github.com/DemisEom/SpecAugment/blob/master/SpecAugment/spec_augment_pytorch.py</a></dd>
</dl>
</dd>
</dl>
</dd></dl>
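The parameters above describe the masking part of SpecAugment: a few randomly placed time masks of width at most T and frequency masks of width at most F. The following is a minimal pure-Python sketch of that masking, not the repository's spec_augment. The function name spec_augment_sketch, the rng argument, the list-of-lists [time][frequency] layout, and masking with zeros are assumptions, and the time-warping step of the paper is omitted.

```python
import random


def spec_augment_sketch(feat, T=70, F=20, time_mask_num=2, freq_mask_num=2, rng=None):
    """Return a copy of `feat` (list of frames, each a list of bins)
    with random time and frequency bands set to zero."""
    rng = rng or random.Random()
    n_frames = len(feat)
    n_bins = len(feat[0]) if n_frames else 0
    out = [list(row) for row in feat]  # copy so the input is left untouched

    # Time masking: zero out `t` consecutive frames starting at `t0`.
    for _ in range(time_mask_num):
        t = rng.randint(0, min(T, n_frames))
        t0 = rng.randint(0, n_frames - t)
        for i in range(t0, t0 + t):
            out[i] = [0.0] * n_bins

    # Frequency masking: zero out `f` consecutive bins starting at `f0`.
    for _ in range(freq_mask_num):
        f = rng.randint(0, min(F, n_bins))
        f0 = rng.randint(0, n_bins - f)
        for row in out:
            for j in range(f0, f0 + f):
                row[j] = 0.0
    return out
```

Clamping the mask width to the feature size keeps the sketch safe for utterances shorter than T frames.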

<div class="section" id="feature">
<h1>Feature<a class="headerlink" href="#feature" title="Permalink to this headline"></a></h1>
</div>


