Adjustment doc and code for CostLayer, GradientMachine and DataProvider.
Also add some comments for cost layers.
ISSUE=4580653

git-svn-id: https://svn.baidu.com/idl/trunk/paddle@1410 1ad973e4-5ce8-4261-8a94-b56d1f490c56
qingqing01 committed Aug 30, 2016
1 parent 4268885 commit f063752
Showing 15 changed files with 426 additions and 134 deletions.
4 changes: 2 additions & 2 deletions doc/source/gserver/activations/index.rst
@@ -1,5 +1,5 @@
Activations
=============

.. doxygenfile:: paddle/gserver/activations/ActivationFunction.h
.. doxygenfile:: paddle/gserver/activations/ActivationFunction.cpp
.. doxygenfile:: paddle/gserver/activations/ActivationFunction.h
.. doxygenfile:: paddle/gserver/activations/ActivationFunction.cpp
85 changes: 77 additions & 8 deletions doc/source/gserver/dataprovider/dataproviders.rst
@@ -1,14 +1,83 @@
Data Providers
================

Data Provider
Base DataProvider
------------------
.. doxygenclass:: paddle::DataProvider
:members:

DataProviderGroup
-------------------
.. doxygenclass:: paddle::DataProviderGroup
:members:

MultiDataProvider
-------------------
.. doxygenclass:: paddle::MultiDataProvider
:members:

PyDataProvider
===================

IFieldScanner
-------------
.. doxygenclass:: paddle::IFieldScanner
:members:

DenseScanner
-------------
.. doxygenclass:: paddle::DenseScanner
:members:

IndexScanner
-------------
.. doxygenclass:: paddle::IndexScanner
:members:

SparseNonValueScanner
---------------------
.. doxygenclass:: paddle::SparseNonValueScanner
:members:

SparseValueScanner
------------------
.. doxygenclass:: paddle::SparseValueScanner
:members:

SequenceScanner
------------------
.. doxygenclass:: paddle::SequenceScanner
:members:

IPyDataProviderCache
--------------------
.. doxygenclass:: paddle::IPyDataProviderCache
:members:

NoCacheStrategy
---------------
.. doxygenfile:: paddle/gserver/dataproviders/DataProvider.h
.. doxygenfile:: paddle/gserver/dataproviders/PyDataProvider2.cpp
.. doxygenfile:: paddle/gserver/dataproviders/DataProviderGroup.h
.. doxygenfile:: paddle/gserver/dataproviders/MultiDataProvider.h
.. doxygenclass:: paddle::NoCacheStrategy
:members:

Proto Data Provider
CacheOnePassInMemory
--------------------
.. doxygenfile:: paddle/gserver/dataproviders/ProtoDataProvider.h
.. doxygenfile:: paddle/gserver/dataproviders/ProtoReader.h
.. doxygenclass:: paddle::CacheOnePassInMemory
:members:

PyDataProvider2
---------------
.. doxygenclass:: paddle::PyDataProvider2
:members:

Proto Data Provider
===================

ProtoDataProvider
-----------------
.. doxygenclass:: paddle::ProtoDataProvider
:members:

ProtoSequenceDataProvider
-------------------------
.. doxygenclass:: paddle::ProtoSequenceDataProvider
:members:
102 changes: 102 additions & 0 deletions doc/source/gserver/evaluators/evaluators.rst
@@ -0,0 +1,102 @@
Base Evaluator
==============

Evaluator
---------
.. doxygenclass:: paddle::Evaluator
:members:


Utils
=====

SumEvaluator
------------
.. doxygenclass:: paddle::SumEvaluator
:members:

ColumnSumEvaluator
------------------
.. doxygenclass:: paddle::ColumnSumEvaluator
:members:

Classification
==============

ClassificationErrorEvaluator
----------------------------
.. doxygenclass:: paddle::ClassificationErrorEvaluator
:members:

SequenceClassificationErrorEvaluator
------------------------------------
.. doxygenclass:: paddle::SequenceClassificationErrorEvaluator
:members:

AucEvaluator
-------------
.. doxygenclass:: paddle::AucEvaluator
:members:

PrecisionRecallEvaluator
------------------------
.. doxygenclass:: paddle::PrecisionRecallEvaluator
:members:

ChunkEvaluator
--------------
.. doxygenclass:: paddle::ChunkEvaluator
:members:

CTCEvaluator
------------
.. doxygenclass:: paddle::CTCErrorEvaluator
:members:


Rank
====

PnpairEvaluator
---------------
.. doxygenclass:: paddle::PnpairEvaluator
:members:

RankAucEvaluator
----------------
.. doxygenclass:: paddle::RankAucEvaluator
:members:


Printer
=======

ValuePrinter
-------------
.. doxygenclass:: paddle::ValuePrinter
:members:

GradientPrinter
---------------
.. doxygenclass:: paddle::GradientPrinter
:members:

MaxIdPrinter
------------
.. doxygenclass:: paddle::MaxIdPrinter
:members:

MaxFramePrinter
---------------
.. doxygenclass:: paddle::MaxFramePrinter
:members:

SequenceTextPrinter
-------------------
.. doxygenclass:: paddle::SequenceTextPrinter
:members:

ClassificationErrorPrinter
--------------------------
.. doxygenclass:: paddle::ClassificationErrorPrinter
:members:
9 changes: 4 additions & 5 deletions doc/source/gserver/evaluators/index.rst
@@ -1,8 +1,7 @@
Evaluators
============

.. doxygenfile:: paddle/gserver/evaluators/Evaluator.h
.. doxygenfile:: paddle/gserver/evaluators/ChunkEvaluator.cpp
.. doxygenfile:: paddle/gserver/evaluators/CTCErrorEvaluator.cpp
==========

.. toctree::
:maxdepth: 3

evaluators.rst
44 changes: 32 additions & 12 deletions doc/source/gserver/gradientmachines/gradientmachines.rst
@@ -1,20 +1,40 @@
Gradient machines
===================
Gradient Machines
=================

Networks
------------
.. doxygenfile:: paddle/gserver/gradientmachines/MultiNetwork.h
.. doxygenfile:: paddle/gserver/gradientmachines/ParallelNeuralNetwork.h
GradientMachine
---------------------
.. doxygenclass:: paddle::GradientMachine
:members:

Gradient Machines
GradientMachineMode
--------------------
.. doxygenfile:: paddle/gserver/gradientmachines/GradientMachine.h
.. doxygenfile:: paddle/gserver/gradientmachines/MultiGradientMachine.h
.. doxygenclass:: paddle::IGradientMachineMode
:members:

MultiGradientMachine
---------------------
.. doxygenclass:: paddle::MultiGradientMachine
:members:

TrainerThread
`````````````
.. doxygenclass:: paddle::TrainerThread
:members:

Recurrent Gradient Machines
-----------------------------
.. doxygenfile:: paddle/gserver/gradientmachines/RecurrentGradientMachine.h
.. doxygenfile:: paddle/gserver/gradientmachines/RecurrentGradientMachine.cpp
---------------------------
.. doxygenclass:: paddle::RecurrentGradientMachine
:members:

Networks
========

NeuralNetwork
-------------
.. doxygenclass:: paddle::NeuralNetwork
:members:

ParallelNeuralNetwork
---------------------
.. doxygenclass:: paddle::ParallelNeuralNetwork
:members:
32 changes: 19 additions & 13 deletions paddle/gserver/dataproviders/DataProvider.h
@@ -118,10 +118,10 @@ class DataBatch {
data_.push_back(argu);
}

/*
* argus: DataBatch.getStreams()
* size: DataBatch.getSize()
* dataId: sub dataprovider id (in MultiDataProvider)
/**
* @param argus DataBatch.getStreams()
* @param size DataBatch.getSize()
* @param dataId sub dataprovider id (in MultiDataProvider)
*/
void appendArguments(const std::vector<Argument>& argus, int size,
int dataId) {
@@ -312,22 +312,28 @@ class DummyDataProvider : public DataProvider {
}
};

// Data provider for one input and one integer label
/**
* Data provider for one input and one integer label.
*/
class SimpleDataProviderBase : public DataProvider {
protected:
int64_t sampleDim_; // sample feature dimension
int64_t bufferCapacity_; // the number of samples
/// sample feature dimension
int64_t sampleDim_;
/// the number of samples
int64_t bufferCapacity_;
int64_t sampleNumInBuf_;
int64_t nextItemIndex_; // next item to read in buffer
bool withInfo_; // some user defined info for validation
/// next item to read in buffer
int64_t nextItemIndex_;
/// some user defined info for validation
bool withInfo_;

// data buffer: bufferCapacity_ * nDataDim_
/// data buffer: bufferCapacity_ * nDataDim_
CpuMatrixPtr hInputDataBuf_;

// label buffer:bufferCapacity_ * 1
/// label buffer:bufferCapacity_ * 1
CpuIVectorPtr hInputLabelBuf_;

// info buffer:bufferCapacity_ * 1
/// info buffer:bufferCapacity_ * 1
CpuIVectorPtr hInputInfoBuf_;

ThreadLocal<MatrixPtr> dataBatch_;
@@ -348,7 +354,7 @@ class SimpleDataProviderBase : public DataProvider {

virtual int64_t getNextBatchInternal(int64_t size, DataBatch* batch);

// return the number of samples in the buffer
/// return the number of samples in the buffer
int64_t fillBuffer();

protected:
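
The member comments in this hunk describe a fixed-capacity sample buffer that is refilled by `fillBuffer()` and drained through `nextItemIndex_`. As a rough illustration of that pattern only, here is a minimal sketch. `ToyBufferedProvider`, its `int` samples, and its in-memory `source_` are hypothetical stand-ins and not Paddle's implementation; only the member names `bufferCapacity_`, `sampleNumInBuf_`, `nextItemIndex_` and `fillBuffer` are borrowed from the header.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a buffered data provider.
// It is NOT Paddle code; it only mirrors the buffer bookkeeping.
class ToyBufferedProvider {
public:
  ToyBufferedProvider(std::vector<int> source, int64_t capacity)
      : source_(std::move(source)), bufferCapacity_(capacity) {}

  // Copies up to `size` samples into `out`, refilling the buffer whenever it
  // has been drained. Returns the number of samples copied; 0 means the
  // source is exhausted.
  int64_t getNextBatch(int64_t size, std::vector<int>* out) {
    out->clear();
    while (static_cast<int64_t>(out->size()) < size) {
      if (nextItemIndex_ >= sampleNumInBuf_ && fillBuffer() == 0) break;
      out->push_back(buffer_[static_cast<size_t>(nextItemIndex_++)]);
    }
    return static_cast<int64_t>(out->size());
  }

private:
  // Loads the next chunk from the source and returns the number of samples
  // now held in the buffer.
  int64_t fillBuffer() {
    const int64_t remaining = static_cast<int64_t>(source_.size()) - sourcePos_;
    sampleNumInBuf_ = std::min(bufferCapacity_, remaining);
    buffer_.assign(source_.begin() + sourcePos_,
                   source_.begin() + sourcePos_ + sampleNumInBuf_);
    sourcePos_ += sampleNumInBuf_;
    nextItemIndex_ = 0;
    return sampleNumInBuf_;
  }

  std::vector<int> source_;     // all samples (stands in for a data file)
  std::vector<int> buffer_;     // at most bufferCapacity_ samples
  int64_t bufferCapacity_;      // the number of samples the buffer can hold
  int64_t sampleNumInBuf_ = 0;  // samples currently in the buffer
  int64_t nextItemIndex_ = 0;   // next item to read in the buffer
  int64_t sourcePos_ = 0;       // read position in source_
};
```

In this sketch a requested batch may span several buffer refills, and a batch shorter than the requested size signals that the source ran out.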
7 changes: 4 additions & 3 deletions paddle/gserver/dataproviders/ProtoDataProvider.h
@@ -80,7 +80,7 @@ class ProtoDataProvider : public DataProvider {
*/
inline bool iidData() const { return sequenceStartPositions_.empty(); }

// check that sample is consistent with header_
/// check that sample is consistent with header_
void checkSample(const DataSample& sample);

template <class Op>
@@ -129,14 +129,15 @@ class ProtoDataProvider : public DataProvider {

int64_t currentSequenceIndex_;

// The size should be the number of sequences.
/// The size should be the number of sequences.
std::vector<size_t> shuffledSequenceIds_;

ThreadLocalD<DataBatch> cpuBatch_;
ThreadLocalD<DataBatch> gpuBatch_;

RWLock lock_;
std::vector<StatPtr> nnzStats_; // stats for number of none-zeros entries
// stats for the number of non-zero entries
std::vector<StatPtr> nnzStats_;
};

/**
14 changes: 14 additions & 0 deletions paddle/gserver/evaluators/Evaluator.cpp
@@ -1000,20 +1000,34 @@ REGISTER_EVALUATOR(max_frame_printer, MaxFramePrinter);
/**
* Sequence text printer will print text according to an index matrix and a
* dictionary. There can be multiple inputs to this layer:
*
* 1) If there is only one input, the input must be a matrix containing
* the sequence of indices;
*
* 2) If there is more than one input, the first input should be ids,
* which are interpreted as sample ids.
*
* The output format will be:
*
* 1) sequence without sub-sequence, with a probability.
*
* @code
* id \t prob space_separated_tokens_from_dictionary_according_to_seq
* @endcode
*
* 2) sequence without sub-sequence, without a probability.
*
* @code
* id \t space_separated_tokens_from_dictionary_according_to_seq
* @endcode
*
* 3) sequence with sub-sequence, without a probability.
*
* @code
* id \t space_separated_tokens_from_dictionary_according_to_sub_seq
* \t \t space_separated_tokens_from_dictionary_according_to_sub_seq
* ...
* @endcode
*
* Typically the SequenceTextPrinter layer takes the output of maxid or
* RecurrentGroup with maxid (when generating) as an input.
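
The three output formats listed in this comment can be illustrated with a small formatter. This is a sketch under stated assumptions, not the SequenceTextPrinter code: `joinTokens` and `formatSample` are hypothetical helpers, the dictionary is a plain vector of strings, and the exact whitespace (a single tab after the id, a space after the probability, two tabs before each continuation line) is an assumed reading of the templates above.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Paddle): looks up each index in the
// dictionary and joins the resulting tokens with single spaces.
std::string joinTokens(const std::vector<std::string>& dict,
                       const std::vector<int>& ids) {
  std::ostringstream os;
  for (size_t i = 0; i < ids.size(); ++i) {
    if (i > 0) os << ' ';
    os << dict.at(static_cast<size_t>(ids[i]));
  }
  return os.str();
}

// Hypothetical formatter for one sample. Each element of `subSeqs` is one
// (sub-)sequence of dictionary indices. The probability is printed only when
// `hasProb` is true and the sample has a single sequence (format 1).
std::string formatSample(int64_t sampleId,
                         const std::vector<std::string>& dict,
                         const std::vector<std::vector<int>>& subSeqs,
                         bool hasProb = false, double prob = 0.0) {
  std::ostringstream os;
  os << sampleId << '\t';
  if (hasProb && subSeqs.size() == 1) os << prob << ' ';
  for (size_t i = 0; i < subSeqs.size(); ++i) {
    // Continuation lines for further sub-sequences start with two tabs
    // (format 3); whitespace details are an assumption of this sketch.
    if (i > 0) os << "\n\t\t";
    os << joinTokens(dict, subSeqs[i]);
  }
  return os.str();
}
```

A sample with one sequence yields a single line (formats 1 and 2), while a sample with several sub-sequences yields one line per sub-sequence, with only the first line carrying the id (format 3).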