spellcheck CNTK-TechReport
jonsafari committed Jan 25, 2016
1 parent 53de59c commit b0c3c13
Showing 6 changed files with 53 additions and 53 deletions.
@@ -743,7 +743,7 @@ To build the TIMIT graph, only three input files are needed: the model state
\end_layout

\begin_layout Standard
-The scripts assume each context-indepenent phone is represented by a three
+The scripts assume each context-independent phone is represented by a three
state, left to right, hidden markov model.
The names of these states should be in a
\begin_inset Quotes eld
@@ -756,7 +756,7 @@ model state map
file that has one line for every model.
The first column is the name of the model, and subsequent columns are the
names of the states, in left to right order.
-The transition probabilites between these states are stored in a separate
+The transition probabilities between these states are stored in a separate

\begin_inset Quotes eld
\end_inset
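For orientation, a model state map file in the format described above might look like the sketch below; the phone and state names are illustrative assumptions, not drawn from the actual TIMIT setup:

    aa  aa_s1  aa_s2  aa_s3
    iy  iy_s1  iy_s2  iy_s3
    sil sil_s1 sil_s2 sil_s3

Each line names one three-state, left-to-right model followed by its state names in order.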
@@ -859,7 +859,7 @@ To decode, the following parameters to Argon should be specified: -graph,
The decoder uses a Viterbi beam search algorithm, in which unlikely hypotheses
are pruned at each frame.
The -beam parameter prevents unlikely hypotheses from being pursued.
-Any hypothesis that differes from the best hypothesis by more than this
+Any hypothesis that differs from the best hypothesis by more than this
amount will be be discarded.
The -max-tokens parameter controls the number of active hypotheses.
If the -beam parameter causes more than max-tokens hypotheses to be generated,
@@ -872,7 +872,7 @@ The decoder uses a Viterbi beam search algorithm, in which unlikely hypotheses
\begin_layout Standard
The -graph parameter tells Argon which compiled decoding graph should be
used.
-The -lm should indicate an ARPA format ngram languag emodel.
+The -lm should indicate an ARPA format ngram language model.
\end_layout

\begin_layout Standard
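Assembling the flags discussed in these hunks, a decoding run might be invoked roughly as follows; the executable name, file paths, and values are illustrative assumptions, and Argon may require additional parameters not shown here:

    argon -graph timit.graph -lm timit.arpa -beam 100 -max-tokens 10000

Hypotheses scoring worse than the best by more than the -beam value are pruned, and -max-tokens caps the number of active hypotheses per frame.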
10 changes: 5 additions & 5 deletions Documentation/CNTK-TechReport/lyx/CNTKBook_CNTK_Adv_Chapter.lyx
@@ -705,7 +705,7 @@ After defining the network, it’s important to let CNTK know what the special
It also needs to know the default output nodes, evaluation nodes and training
criteria nodes.
Note here the specification of the nodes that require special handling
-(NodesReqMultiSeqHandling) when the network is evalauted or trained with
+(NodesReqMultiSeqHandling) when the network is evaluated or trained with
multiple sequences, e.g., when the network itself is an RNN or the model
is trained with the sequence-level criterion.
Since in these cases multiple sequences will be stitched together to improve
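For orientation, the special-node declarations this passage refers to typically appear at the end of an NDL network description along the lines of the sketch below; the node names are illustrative, and the exact tag spellings (e.g. CriteriaNodes vs. CriterionNodes) should be verified against the full chapter:

    FeatureNodes=(features)
    LabelNodes=(labels)
    CriteriaNodes=(ce)
    EvalNodes=(err)
    OutputNodes=(outputs)
    NodesReqMultiSeqHandling=(ce)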
@@ -2233,7 +2233,7 @@ RowStack
\end_layout

\begin_layout Standard
-Concatnate rows of input matrices to form a bigger matrix.
+Concatenate rows of input matrices to form a bigger matrix.
The resulting matrix is a sumof(rows) by m1.cols matrix.
It supports variable-length input.
The syntax is
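The syntax line itself is cut off at this hunk boundary; a RowStack call typically has the shape sketched below, with illustrative names:

    m = RowStack(m1, m2)

Here m would have m1.rows + m2.rows rows and m1.cols columns, matching the sumof(rows) by m1.cols description above.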
@@ -2898,11 +2898,11 @@ labels - the ground truth labels.
The first row is the ground truth output id.
The second row is the ground truth class id.
The third and fourth rows are the start (inclusive) and end (exclusive)
-output ids corresponding to the ground trueth class id.
+output ids corresponding to the ground truth class id.
\end_layout

\begin_layout Itemize
-mainInputInfo - contains the main information to make the classfication
+mainInputInfo - contains the main information to make the classification
decision.
It's an inputDim by T matrix.
In language model, inputDim is often the hidden layer size.
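As a concrete reading of the four-row labels layout described above, with illustrative values:

    row 0: 17   # ground truth output (word) id
    row 1: 3    # ground truth class id
    row 2: 12   # first output id belonging to class 3 (inclusive)
    row 3: 25   # end output id for class 3 (exclusive)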
@@ -4422,7 +4422,7 @@ To integrate this new layer into the model, the inputs and outputs of the
After the copy any node whose connected nodes were not copied will have
those connections set to an invalid value.
These need to be fixed in order to have a valid model.
-Before a model can be saved CNTK first checkes to see if all nodes are
+Before a model can be saved CNTK first checks to see if all nodes are
correctly connected.
\end_layout

36 changes: 18 additions & 18 deletions Documentation/CNTK-TechReport/lyx/CNTKBook_CNTK_Chapter.lyx
@@ -965,7 +965,7 @@ CLASSLSTM

: the class-based long short-term memory neural network.
It uses sparse input, sparse parameter and sparse output.
-This is often uesd for language modeling tasks.
+This is often used for language modeling tasks.
\end_layout

\end_deeper
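In a configuration file this network type is typically selected through the SimpleNetworkBuilder block, roughly as sketched below; other required parameters are omitted, and the exact spelling is an assumption to verify against the full chapter:

    SimpleNetworkBuilder=[
        rnnType=CLASSLSTM
    ]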
@@ -1768,10 +1768,10 @@ numMiniBatch4LRSearch

\end_inset

-: the number of minibatches used to search the minibatch size whenin adaptive
+: the number of minibatches used to search the minibatch size when in adaptive
minibatch size mode.
Default value is 500.
-It's typically set to 10-20% of the total minibatches in an epochthis is
+It's typically set to 10-20% of the total minibatches in an epoch this is
shared with the search for learning rate in SearchBeforeEpoch mode.

\end_layout
@@ -1792,10 +1792,10 @@ autoAdjustMinibatch
\end_inset

: enable or disable whether minibatch size is adaptively adjusted.
-Default value is false.Adapative minibatch sizing will begin on epochs starting
-after user minbatch sizes expcitilyspecified are complete.
+Default value is false.Adaptive minibatch sizing will begin on epochs starting
+after user minibatch sizes expcitilyspecified are complete.
For example if the userspecifed minibatchSize=256:1024, then 256 and 1024are
-used in the first 2 Epochs and adaptive minibatchsizing is used aferwards
+used in the first 2 Epochs and adaptive minibatchsizing is used afterwards

\end_layout

@@ -1814,7 +1814,7 @@ minibatchSizeTuningFrequency

\end_inset

-: The number of epochs to skip, on a periodic basis, beforedynamically adjusting
+: The number of epochs to skip, on a periodic basis, before dynamically adjusting
the minibatch size.
Default value is 1.

@@ -1835,7 +1835,7 @@ minibatchSizeTuningMax

\end_inset

-: The maximum size allowed for anadaptively adjusted minibatch size.
+: The maximum size allowed for an adaptively adjusted minibatch size.
Default value is 1048576.

\end_layout
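Taken together, the adaptive-minibatch parameters covered in the preceding hunks might appear in an SGD block roughly as follows; the values echo the examples and defaults in the text, and the exact nesting of these parameters within the SGD section should be checked against the full chapter:

    SGD=[
        minibatchSize=256:1024          # explicit sizes for the first two epochs
        autoAdjustMinibatch=true        # adaptive sizing starts after epoch 2
        minibatchSizeTuningFrequency=1  # epochs to skip between adjustments
        minibatchSizeTuningMax=1048576  # upper bound on the adapted size
        numMiniBatch4LRSearch=500       # minibatches used during the size search
    ]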
@@ -2669,10 +2669,10 @@ rollingWindow

option reads in all feature files and stores them on disk in one large
temporary binary file.
-The data is randomized by running a large rollowing window over the data
+The data is randomized by running a large rolling window over the data
in this file and randomizing the data within the window.
This method produces more thorough randomization of the data but requires
-a large temprorary file written to disk.
+a large temporary file written to disk.
The other option is
\begin_inset Quotes eld
\end_inset
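In an HTKMLFReader configuration this choice is typically expressed through the readMethod parameter, as in the sketch below; treat the exact parameter name as an assumption to verify against the reader documentation:

    reader=[
        readerType=HTKMLFReader
        readMethod=rollingWindow   # thorough randomization, large temporary file on disk
    ]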
@@ -2798,14 +2798,14 @@ labels
\end_inset

are the default names used by the SimpleNetworkBuilder but if the network
-is designed using the Network Descrition Language (NDL), then any names
+is designed using the Network Description Language (NDL), then any names
can be used, as long as they each have a corresponding node in the network.
\end_layout

\begin_layout Standard
To specify continuous-valued features, e.g.
MFCC's or log mel filterbank coefficients, the following parameters should
-be included in the a confguration block:
+be included in the a configuration block:
\end_layout

\begin_layout Itemize
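The itemized parameters are cut off at this hunk boundary; for orientation only, a features block for continuous-valued input commonly looks roughly like the following, with an illustrative dimension and file name:

    features=[
        dim=792              # feature vector dimension
        scpFile=train.scp    # script file listing the feature files
    ]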
@@ -3378,7 +3378,7 @@ nbruttsineachrecurrentiter
The reader arranges same-length input sentences, up to the specified limit,
into each minibatch.
For recurrent networks, trainer resets hidden layer activities only at
-the begining of sentences.
+the beginning of sentences.
Activities of hidden layers are carried over to the next minibatch if an
end of sentence is not reached.
Using multiple sentences in a minibatch can speed up training processes.
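A hedged sketch of how this limit is typically set inside the reader block (the value 10 is illustrative):

    reader=[
        nbruttsineachrecurrentiter=10   # pack up to 10 same-length sentences per minibatch
    ]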
@@ -3425,7 +3425,7 @@ wordclass
This is used for class-based language modeling.
An example of the class information is below.
The first column is the word index.
-The second column is the number of occurances, the third column is the
+The second column is the number of occurrences, the third column is the
word, and the last column is the class id of the word.

\begin_inset listings
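The example listing itself is cut off at this hunk boundary; a hypothetical fragment in the four-column format just described (word index, occurrence count, word, class id) would look like:

    0 863 </s> 0
    1 421 the 0
    2 37 cat 1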
@@ -3795,7 +3795,7 @@ nbrUttsInEachRecurrentIter
The reader arranges same-length input sentences, up to the specified limit,
into each minibatch.
For recurrent networks, trainer resets hidden layer activities only at
-the begining of sentences.
+the beginning of sentences.
Activities of hidden layers are carried over to the next minibatch if an
end of sentence is not reached.
Using multiple sentences in a minibatch can speed up training processes.
@@ -4999,7 +4999,7 @@ section
\end_inset

– the encoderReader and decoderReader are the readers for encoder and decoder.
-Similary for encoderCVReader and decoderCVReader for validation set.
+Similarly for encoderCVReader and decoderCVReader for validation set.

\end_layout
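For orientation, the four reader blocks named above sit side by side in the configuration, roughly as sketched here; the block names follow the text, while the elided contents are ordinary reader configurations:

    encoderReader=[ ... ]    # training-set reader for the encoder
    decoderReader=[ ... ]    # training-set reader for the decoder
    encoderCVReader=[ ... ]  # validation-set reader for the encoder
    decoderCVReader=[ ... ]  # validation-set reader for the decoder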

@@ -5365,7 +5365,7 @@ deviceId

\begin_layout Standard
CNTK supports CPU and GPU computation.
-Users can determine what device to use by setting the deviceId papameter.
+Users can determine what device to use by setting the deviceId parameter.
The possible values are
\end_layout
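The list of possible values is cut off at this hunk boundary; the commonly cited settings, offered as an assumption to verify against the full section, are:

    deviceId=-1      # compute on the CPU
    deviceId=0       # compute on GPU 0; any non-negative id selects that GPU
    deviceId=auto    # let CNTK pick the device automatically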

@@ -5509,7 +5509,7 @@ traceLevel=0 # larger values mean more output

The default value is 0 and specifies minimal output.
The higher the number the more output can be expected.
-Currently 0 (limited output), 1 (medium ouput) and 2 (verbose output) are
+Currently 0 (limited output), 1 (medium output) and 2 (verbose output) are
the only values supported.
\end_layout

@@ -3186,7 +3186,7 @@ s().GetNumCols() != 1)

\begin_layout Plain Layout

-throw std::logic_error("The left value of ScaleNode must be a scarlar
+throw std::logic_error("The left value of ScaleNode must be a scalar
value.");
\end_layout

16 changes: 8 additions & 8 deletions Documentation/CNTK-TechReport/lyx/CNTKBook_CN_Chapter.lyx
@@ -1816,11 +1816,11 @@ sly.
In this algorithm, all the nodes whose children have not been computed
are in the waiting set and those whose children are computed are in the
ready set.
-At the beginning, all non-leaf descendents of
+At the beginning, all non-leaf descendants of
\begin_inset Formula $root$
\end_inset

-are in the waiting set and all leaf descendents are in the ready set.
+are in the waiting set and all leaf descendants are in the ready set.
The scheduler picks a node from the ready set based on some policy, removes
it from the ready set, and dispatches it for computation.
Popular policies include first-come/first-serve, shortest task first, and
@@ -2015,7 +2015,7 @@ status open
\begin_inset Formula $waiting$
\end_inset

-is initialized to include all non-leaf descendents of
+is initialized to include all non-leaf descendants of
\begin_inset Formula $root$
\end_inset

@@ -2061,7 +2061,7 @@ status open
\begin_inset Formula $ready$
\end_inset

-is initialized to include all leaf descendents of
+is initialized to include all leaf descendants of
\begin_inset Formula $root$
\end_inset

@@ -3412,7 +3412,7 @@ status open

\end_inset

-Decide the order to compute the gradient at all descendents of
+Decide the order to compute the gradient at all descendants of
\begin_inset Formula $node$
\end_inset

@@ -8892,7 +8892,7 @@ CRF
\color none
CRF stands for conditional random fields.
This node does sequence-level training, using CRF criterion.
-This node has three nputs.
+This node has three inputs.
The first is the label
\family default
\series bold
@@ -10198,7 +10198,7 @@ reference "fig:CN-WithDelayNode"
A simple way to do forward computation and backpropagation in a recurrent
network is to unroll all samples in the sequence over time.
Once unrolled, the graph is expanded into a DAG and the forward computation
-and gradient calcalclation algorithms we just discussed can be directly
+and gradient calculation algorithms we just discussed can be directly
used.
This means, however, all computation nodes in the CN need to be computed
sample by sample and this significantly reduces the potential of parallelizatio
@@ -10318,7 +10318,7 @@ key "StronglyConnectedComponents-Hopcroft+1983"
in the CN and the CN is reduced to a DAG.
All the nodes inside each loop (or composite node) can be unrolled over
time and also reduced to a DAG.
-For all these DAGs the forward computation and backprogation algorithms
+For all these DAGs the forward computation and backpropagation algorithms
we discussed in the previous sections can be applied.
The detailed procedure in determining the forward computation order in
the CN with arbitrary recurrent connections is described in Algorithm