Note: Functions taking Block arguments can also take anything accepted by td.convert_to_block. Functions taking ResultType arguments can also take anything accepted by td.convert_to_type.
module tensorflow_fold (td)
- Compiler
- Blocks for input
- Blocks for composition
- Blocks for tensors
- Blocks for sequences
class td.Map
class td.Fold
td.RNN(cell, initial_state=None, initial_state_from_input=False, name=None)
class td.Reduce
td.Sum(name=None)
td.Min(name=None)
td.Max(name=None)
td.Mean(name=None)
class td.Broadcast
class td.Zip
td.ZipWith(elem_block, name=None)
class td.NGrams
class td.Nth
class td.GetItem
class td.Length
td.Slice(*args, **kwargs)
- Other blocks
- Layers
- Types
- Plans
- Conversion functions
- Utilities
- Abstract classes
This is the high-level API for TensorFlow Fold.
A compiler for TensorFlow Fold blocks.
Creates a Compiler.
Most users will want to use the Compiler.create factory, like so:
compiler = td.Compiler.create(root_block_like)
Which is simply a short-hand for:
compiler = td.Compiler()
compiler.compile(root_block_like)
compiler.init_loom()
Turns a batch of examples into a dictionary for feed_dict.
If an input_tensor was supplied when the Compiler was constructed, the user can just evaluate the compiler's output tensors without needing to create a feed_dict via 'build_feed_dict'.
This is a convenience method equivalent to
{compiler.loom_input_tensor: compiler.build_loom_input_batched(examples, batch_size, ordered)}
when metric_labels=False.
The result is computed lazily (e.g. when passed as a feed_dict to Session.run()), and thus does not block when using multiprocessing. The exception is when metric_labels=True, in which case we need to block in order to aggregate the labels across chunks of work.
- examples: A non-empty iterable of examples to be built into tensors.
- batch_size: The maximum number of examples to compile into each loom input. Defaults to 100. If multiprocessing then this will also be the chunk size for each unit of work.
- metric_labels: Whether or not to return metric labels.
- ordered: Whether or not to preserve ordering when multiprocessing; otherwise has no effect (and order is always preserved).
A feed dictionary which can be passed to TensorFlow run()/eval(). If metric_labels is True, a (feed_dict, metric_labels) tuple.
- TypeError: If examples is not an iterable.
- RuntimeError: If init_loom() has not been called.
Turns examples into a feed value for self.loom_input_tensor.
The result is an iterator; work doesn't happen until you call e.g. next() or list() on it.
- examples: A non-empty iterable of examples to be built into tensors.
- batch_size: The maximum number of examples to compile into each loom input. Defaults to 100. If multiprocessing then this will also be the chunk size for each unit of work.
- metric_labels: Whether or not to return metric labels.
- ordered: Whether or not to preserve ordering when multiprocessing; otherwise has no effect (and order is always preserved).
Feed value(s) corresponding to examples grouped into batches. The result itself can be fed directly to self.loom_input_tensor, or be iterated over to feed values batch-by-batch. If metric_labels is True, an iterable of (batch_feed_value, metric_labels) tuples.
- TypeError: If examples is not an iterable.
- RuntimeError: If init_loom() has not been called.
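The batching and laziness described above can be pictured with a plain-Python sketch. This is not Fold's implementation: chunk_examples is a hypothetical helper that only mimics the documented behavior (at most batch_size examples per unit, nothing computed until the result is iterated).

```python
import itertools


def chunk_examples(examples, batch_size=100):
    """Lazily yields lists of at most batch_size examples (hypothetical helper)."""
    iterator = iter(examples)
    while True:
        chunk = list(itertools.islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


# Creating the generator consumes nothing; iterating over it does the work.
batches = chunk_examples(range(250), batch_size=100)
sizes = [len(chunk) for chunk in batches]
```

With 250 examples and batch_size=100, the last chunk is smaller than the others, which is why batch_size is described as a maximum.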
Turns examples into feed values for self.loom_input_tensor.
The result is an iterator; work doesn't happen until you call e.g. next() or list() on it.
- examples: An iterable of examples to be built into tensors.
- metric_labels: Whether or not to return metric labels.
- chunk_size: If multiprocessing then the size of each unit of work. Defaults to 100. If not multiprocessing then this has no effect.
- ordered: Whether or not to preserve ordering when multiprocessing. If not multiprocessing then this has no effect (order is always preserved).
An iterable of strings (morally bytes) that can be fed to self.loom_input_tensor. If metric_labels is True, an iterable of (string, metric_labels) tuples.
- TypeError: If examples is not an iterable.
- RuntimeError: If init_loom() has not been called.
Compiles a block, and sets it to the root.
- root_block_like: A block or an object that can be converted to a block by td.convert_to_block. Must have at least one output or metric tensor. The output type may not contain any Sequence or PyObject types.
self
- RuntimeError: If init_loom() has already been called.
- TypeError: If root_block_like cannot be converted to a block.
- TypeError: If root_block_like fails to compile.
- TypeError: If root_block_like has no output or metric tensors.
- TypeError: If root_block_like has an invalid output type.
td.Compiler.create(cls, root_block_like, max_depth=None, loom_input_tensor=None, input_tensor=None, parallel_iterations=None, back_prop=None, swap_memory=None)
Creates a Compiler, compiles a block, and initializes loom.
- root_block_like: A block or an object that can be converted to a block by td.convert_to_block. Must have at least one output or metric tensor. The output type may not contain any Sequence or PyObject types.
- max_depth: Optional. The maximum nesting depth that the encapsulated loom ought to support. This is dependent on the topology of the block graph and on the shape of the data being passed in. May be calculated by calling Compiler.max_depth. If unspecified, a tf.while_loop will be used to dynamically calculate max_depth on a per-batch basis.
- loom_input_tensor: An optional string tensor of loom inputs for the compiler to read from. Mutually exclusive with input_tensor.
- input_tensor: An optional string tensor for the block to read inputs from. If an input_tensor is supplied the user can just evaluate the compiler's output tensors without needing to create a feed dict via build_feed_dict. Mutually exclusive with loom_input_tensor.
- parallel_iterations: tf.while_loop's parallel_iterations option, which caps the number of different depths at which ops could run in parallel. Only applies when max_depth=None. Default: 10.
- back_prop: tf.while_loop's back_prop option, which enables gradients. Only applies when max_depth=None. Default: True.
- swap_memory: Whether to use tf.while_loop's swap_memory option, which enables swapping memory between GPU and CPU at the possible expense of some performance. Only applies when max_depth=None. Default: False.
A fully initialized Compiler.
- TypeError: If root_block_like cannot be converted to a block.
- TypeError: If root_block_like fails to compile.
- TypeError: If root_block_like has no output or metric tensors.
- TypeError: If root_block_like has an invalid output type.
- ValueError: If both loom_input_tensor and input_tensor are provided.
td.Compiler.init_loom(max_depth=None, loom_input_tensor=None, input_tensor=None, parallel_iterations=None, back_prop=None, swap_memory=None)
Initializes the loom object, which is used to run on TensorFlow.
- max_depth: Optional. The maximum nesting depth that the encapsulated loom ought to support. This is dependent on the topology of the block graph and on the shape of the data being passed in. May be calculated by calling Compiler.max_depth. If unspecified, a tf.while_loop will be used to dynamically calculate max_depth on a per-batch basis.
- loom_input_tensor: An optional string tensor of loom inputs for the compiler to read from. Mutually exclusive with input_tensor.
- input_tensor: An optional string tensor for the block to read inputs from. If an input_tensor is supplied the user can just evaluate the compiler's output tensors without needing to create a feed dict via build_feed_dict. Mutually exclusive with loom_input_tensor.
- parallel_iterations: tf.while_loop's parallel_iterations option, which caps the number of different depths at which ops could run in parallel. Only applies when max_depth=None. Default: 10.
- back_prop: tf.while_loop's back_prop option, which enables gradients. Only applies when max_depth=None. Default: True.
- swap_memory: Whether to use tf.while_loop's swap_memory option, which enables swapping memory between GPU and CPU at the possible expense of some performance. Only applies when max_depth=None. Default: False.
- RuntimeError: If compile() has not been called.
- RuntimeError: If the loom has already been initialized.
- ValueError: If both loom_input_tensor and input_tensor are provided.
Returns the input tensor that can feed data to this compiler.
Returns the loom input tensor, used for building feed dictionaries.
May be fed a single result or a sequence of results from Compiler.build_loom_inputs() or Compiler.build_loom_input_batched().
A string tensor.
- RuntimeError: If Compiler.init_loom() has not been called.
Returns the loom max_depth needed to evaluate inp.
Returns an ordered dictionary of tensors for output metrics.
Creates a context for use with the Python with statement.
Entering this context creates a pool of subprocesses for building loom inputs in parallel with this compiler. When the context exits the pool is closed, blocking until all work is completed.
- processes: The number of worker processes to use. Defaults to the CPU count (multiprocessing.cpu_count()).
Nothing.
- RuntimeError: If init_loom() has not been called.
Returns a flattened list of all output tensors.
Returns the current multiprocessing pool if it exists, else None.
Returns the root block, or None if compile() has not been called.
A block that converts its input from a python object to a tensor.
A block that converts its input to a scalar.
A block that converts its input to a vector.
A Python function, lifted to a block.
A block that turns serialized protobufs into nested Python dicts and lists.
The block's input and output types are both PyObjectType.
- message_type_name: A string; the full name of the expected message type.
A dictionary of the message's values by fieldname, where the function renders repeated fields as lists, submessages via recursion, and enums as dictionaries whose keys are name, index, and number. Missing optional fields are rendered as None. Scalar field values are rendered as themselves.
- TypeError: If message_type_name is not a string.
A block that converts PyObject input to a one-hot encoding.
Will raise a KeyError if the block is applied to an out-of-range input.
Initializes the block.
- start: The start of the input range.
- stop: Upper limit (exclusive) on the input range. If stop is None, the range is [0, start), like the Python range function.
- dtype: The dtype for the output array.
- name: An optional string name for the block.
- IndexError: If the range is empty.
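The range handling described above can be sketched in plain Python. This is an illustration of the documented semantics only, not Fold's code; make_one_hot is a hypothetical name and the result is a list rather than a tensor.

```python
def make_one_hot(start, stop=None):
    """Returns an encoder over [start, stop), or [0, start) if stop is None."""
    if stop is None:
        start, stop = 0, start
    if stop <= start:
        raise IndexError('range is empty')

    def encode(x):
        if not start <= x < stop:
            raise KeyError(x)  # out-of-range input
        vec = [0.0] * (stop - start)
        vec[x - start] = 1.0
        return vec

    return encode


encode3 = make_one_hot(3)         # covers 0, 1, 2
encode_2_to_5 = make_one_hot(2, 5)  # covers 2, 3, 4
```

Note that the empty-range check happens when the encoder is built, while the out-of-range check happens when it is applied, mirroring the two error cases in the text.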
A block that converts PyObject input to a one-hot encoding.
Differs from OneHot in that the user specifies the elements covered by the one-hot encoding rather than assuming they are consecutive integers.
- elements: The list of elements to be given one-hot encodings.
- dtype: The type of the block's return value.
- strict: Whether the block should throw a KeyError if it encounters an input which wasn't in elements. Default: True.
- name: An optional string name for the block.
- AssertionError: If any of the elements given are equal.
A Block that takes a PyObject and returns a tensor of type dtype and shape [len(elements)]. If passed any member of elements the block will return a basis vector corresponding to the position of the element in the list. If passed anything else the block will throw a KeyError if strict was set to True, and return the zero vector if strict was set to False.
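The effect of strict can be seen in a small plain-Python sketch. It assumes hashable elements and returns lists instead of tensors; make_one_hot_from_list is a hypothetical helper, not part of the td API.

```python
def make_one_hot_from_list(elements, strict=True):
    """Returns an encoder mapping each element to a basis vector."""
    elements = list(elements)
    # Duplicate elements would make the encoding ambiguous.
    assert len(set(elements)) == len(elements), 'elements must be distinct'
    index = {element: i for i, element in enumerate(elements)}

    def encode(x):
        vec = [0.0] * len(elements)
        if x in index:
            vec[index[x]] = 1.0
        elif strict:
            raise KeyError(x)
        return vec  # zero vector for unknown input when strict is False

    return encode


strict_encode = make_one_hot_from_list(['cat', 'dog', 'bird'])
lenient_encode = make_one_hot_from_list(['cat', 'dog', 'bird'], strict=False)
```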
Dispatches its input based on whether the input exists, or is None.
Similar to OneOf(lambda x: x is None, {True: none_block, False: some_block}) except that none_block has input_type VoidType.
Creates an Optional block.
- some_case: The block to evaluate on x if x exists.
- none_case: The block to evaluate if x is None -- defaults to zeros for tensor types, and an empty sequence for sequence types.
- name: An optional string name for the block.
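As a rough plain-Python analogy (not Fold code), the dispatch can be pictured as below. make_optional is a hypothetical name; the default none_case here returns 0.0 as a stand-in for the "zeros" default, and it takes no argument, echoing the VoidType input of none_block.

```python
def make_optional(some_case, none_case=lambda: 0.0):
    """Applies some_case to x if x exists, otherwise evaluates none_case()."""
    def run(x):
        return none_case() if x is None else some_case(x)
    return run


double_or_zero = make_optional(lambda x: x * 2)
present = double_or_zero(3)
missing = double_or_zero(None)
```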
A composition of blocks, which are connected in a DAG.
Connect a to the input of b.
The argument a can be either:
- A block, in which case the output of a is fed into the input of b.
- The i^th output of a block, obtained from a[i].
- A tuple or list of blocks or block outputs.

- a: Inputs to the block (see above).
- b: The block to connect the inputs to.
- ValueError: If a includes the output of the composition.
- ValueError: If b is the input of the composition.
- ValueError: If the input of b is already connected.
Return a placeholder whose output is the input to the composition.
Return a placeholder whose input is the output of the composition.
Creates a context for use with the Python with statement.
Entering this context enables the use of a block's reads method. Once inside a context, calling some_block.reads(...) sets some_block's inputs within the composition.
For example, you could make a composition which computes x^2 + 10*x element-wise for vectors of length 3:
c = td.Composition()
with c.scope():
  x = td.Vector(3).reads(c.input)
  x_squared = td.Function(tf.mul).reads(x, x)
  ten = td.FromTensor(10 * np.ones(3, dtype='float32'))
  ten_x = td.Function(tf.mul).reads(ten, x)
  c.output.reads(td.Function(tf.add).reads(x_squared, ten_x))
Creates a composition which pipes each block into the next one.
Pipe(a, b, c) is equivalent to a >> b >> c.
Pipe(a, b, c).eval(x) => c(b(a(x)))
- *blocks: A tuple of blocks.
- **kwargs: {'name': name_string} or {}.
A block.
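The identity Pipe(a, b, c).eval(x) => c(b(a(x))) is ordinary left-to-right function composition. A minimal sketch with plain callables standing in for blocks (pipe is a hypothetical helper, not the td.Pipe implementation):

```python
import functools


def pipe(*fns):
    """Composes callables left to right: pipe(a, b, c)(x) == c(b(a(x)))."""
    def run(x):
        return functools.reduce(lambda acc, fn: fn(acc), fns, x)
    return run


add_one_then_double_then_str = pipe(lambda x: x + 1, lambda x: x * 2, str)
result = add_one_then_double_then_str(3)
```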
Dispatch each element of a dict, list, or tuple to child blocks.
A Record block takes a python dict or list of key-block pairs, or a tuple of blocks, processes each element, and returns a tuple of results as the output.
Record({'a': a_block, 'b': b_block}).eval(inp) =>
(a_block.eval(inp['a']), b_block.eval(inp['b']))
Record([('a', a_block), ('b', b_block)]).eval(inp) =>
(a_block.eval(inp['a']), b_block.eval(inp['b']))
Record((a_block, b_block)).eval(inp) =>
(a_block.eval(inp[0]), b_block.eval(inp[1]))
Create a Record Block.
If named_children is a list, tuple, or ordered dict, then the output tuple of the Record will preserve child order; otherwise the output tuple will be ordered by key.
- named_children: A dictionary, list of (key, block) pairs, or a tuple of blocks (in which case the keys are 0, 1, 2, ...).
- name: An optional string name for the block.
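The ordering rule can be illustrated with plain callables in place of blocks. This sketch only mirrors the behavior stated above (plain dicts ordered by key, everything else in the given order); eval_record is a hypothetical helper.

```python
import collections


def eval_record(named_children, inp):
    """Applies each child to inp[key] and returns a tuple of results."""
    if isinstance(named_children, collections.OrderedDict):
        items = list(named_children.items())      # preserve child order
    elif isinstance(named_children, dict):
        items = sorted(named_children.items(), key=lambda kv: kv[0])
    elif isinstance(named_children, list):
        items = list(named_children)              # (key, child) pairs
    else:
        items = list(enumerate(named_children))   # tuple: keys 0, 1, 2, ...
    return tuple(child(inp[key]) for key, child in items)


by_key = eval_record({'b': str.upper, 'a': len}, {'a': 'xy', 'b': 'hi'})
by_position = eval_record((abs, str), [-3, 4])
in_list_order = eval_record([('b', str.upper), ('a', len)],
                            {'a': 'xy', 'b': 'hi'})
```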
A block that runs all of its children (conceptually) in parallel.
AllOf().eval(inp) => None
AllOf(a).eval(inp) => (a.eval(inp),)
AllOf(a, b, c).eval(inp) => (a.eval(inp), b.eval(inp), c.eval(inp))
- *blocks: Blocks.
- **kwargs: {name: name_string} or {}.
See above.
A block that returns a particular TF tensor or NumPy array.
Creates the block.
- tensor: A TF tensor or variable with a complete shape, or a NumPy array.
- name: A string. Defaults to the name of tensor if it has one.
- TypeError: If tensor is not a TF tensor or variable or NumPy array.
- TypeError: If tensor does not have a complete shape.
A TensorFlow function, wrapped in a block.
The TensorFlow function that is passed into a Function block must be a batch version of the operation you want. This doesn't matter for things like element-wise addition td.Function(tf.add), but if you, for example, want a Function block that multiplies matrices, you need to call td.Function(tf.batch_matmul). This is done for efficiency reasons, so that calls to the same function can be batched together naturally and take advantage of TensorFlow's parallelism.
Creates a Function block.
- tf_fn: The batch version of the TensorFlow function to be evaluated.
- name: An optional string name for the block. If present, must be a valid name for a TensorFlow scope.
- infer_output_type: A bool; whether or not to infer the output type of the block by invoking tf_fn once on a dummy placeholder. If False, you will probably need to call set_output_type() explicitly.
Concatenates a non-empty tuple of tensors into a single tensor.
Create a Concat block.
- concat_dim: The dimension to concatenate along (not counting the batch dimension).
- flatten: Whether or not to recursively concatenate nested tuples of tensors. Default is False, in which case we throw on nested tuples.
- name: An optional string name for the block. If present, must be a valid name for a TensorFlow scope.
A block of zeros, voids, and empty sequences of output_type.
If output_type is a tensor type, the output is tf.zeros of this type. If it is a tuple type, the output is a tuple of Zeros of the corresponding item types. If it is void, the output is void. If it is a sequence type, the output is an empty sequence of this type.
- output_type: A type. May not contain pyobject types.
- name: An optional string name for the block.
A block.
- TypeError: If output_type contains pyobject types.
Map a block over a sequence or tuple.
Left-fold a two-argument block over a sequence or tuple.
Create an RNN block.
An RNN takes a tuple of (input sequence, initial state) as input, and returns a tuple of (output sequence, final state) as output. It can be used to implement sequence-to-sequence RNN models, such as LSTMs.
If initial_state_from_input is False (the default), then the output of initial_state will be used for the initial state instead, and the input to the RNN block is just the input sequence, rather than a (sequence, state) tuple. If initial_state is None (the default), then a block of the form td.Zeros(cell.output_type[1]) will be created. This requires that cell has an output type set (which it will if it is e.g. a td.ScopedLayer wrapping a tf rnn cell). For example:
cell = td.ScopedLayer(tf.contrib.rnn.GRUCell(num_units=16), 'mygru')
model = td.Map(td.Vector(8)) >> td.RNN(cell)
- cell: A block or layer that takes (input_elem, state) as input and produces (output_elem, state) as output.
- initial_state: An (optional) tensor or block to use for the initial state.
- initial_state_from_input: If True, pass the initial state as an input to the RNN block, otherwise use initial_state.
- name: An optional string name.
- ValueError: If initial_state_from_input == True and initial_state != None.
A block.
Reduce a two-argument block over a sequence or tuple.
Sums its inputs.
Takes the minimum of its inputs. Zero on no inputs.
Takes the maximum of its inputs. Zero on no inputs.
Takes the average of its inputs. Zero on no inputs.
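The difference between a left fold (which takes an initial state) and the reductions above (which fall back to zero on empty input) can be sketched with functools.reduce. These helpers are hypothetical plain-Python stand-ins operating on numbers, not the td.Fold or td.Reduce blocks themselves.

```python
import functools


def fold(fn, seq, initial):
    """Left fold: fn(...fn(fn(initial, x0), x1)..., xn)."""
    return functools.reduce(fn, seq, initial)


def reduce_or_zero(fn, seq, zero=0.0):
    """Reduces seq with a two-argument fn; returns zero when seq is empty."""
    items = list(seq)
    return functools.reduce(fn, items) if items else zero


def mean_or_zero(seq):
    """Average of seq; zero when seq is empty."""
    items = list(seq)
    return sum(items) / len(items) if items else 0.0


digits_to_int = fold(lambda acc, x: acc * 10 + x, [1, 2, 3], 0)
smallest = reduce_or_zero(min, [3, 1, 2])
largest_of_nothing = reduce_or_zero(max, [])
average = mean_or_zero([1, 2, 3])
```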
Block that creates an infinite sequence of the same element.
This is useful in conjunction with Zip and Map, for example:
def center_seq(seq_block):
  return (seq_block >> AllOf(Identity(), Mean() >> Broadcast()) >> Zip() >>
          Map(Function(tf.sub)))
Converts a tuple of sequences to a sequence of tuples.
The output sequence is truncated in length to the length of the shortest input sequence.
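Truncation to the shortest input is what makes it safe to zip a finite sequence with an infinite broadcast. A plain-Python analogy of the center_seq example, using itertools.repeat for the infinite sequence and the built-in zip (which also stops at the shortest input); center_values is a hypothetical name:

```python
import itertools


def center_values(values):
    """Subtracts the mean from every element (zero mean for empty input)."""
    values = list(values)
    mean = sum(values) / len(values) if values else 0.0
    broadcast = itertools.repeat(mean)  # infinite sequence of the same element
    # zip stops when the finite sequence is exhausted.
    return [x - m for x, m in zip(values, broadcast)]


centered = center_values([1.0, 2.0, 3.0])
```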
A Zip followed by a Map.
ZipWith(elem_block) => Zip() >> Map(elem_block)
- elem_block: A block with a tuple input type.
- name: An optional string name for the block.
A block that zips its input and then maps over it with elem_block.
Computes tuples of n-grams over a sequence.
(Map(Scalar()) >> NGrams(2)).eval([1, 2, 3]) => [(1, 2), (2, 3)]
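For comparison, the same windowing over a plain Python list looks like this (ngrams is a hypothetical helper that only illustrates the output shape; a sequence shorter than n is assumed to yield no n-grams):

```python
def ngrams(seq, n):
    """Returns the list of length-n tuples of consecutive elements."""
    items = list(seq)
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


bigrams = ngrams([1, 2, 3], 2)
trigrams = ngrams('abcd', 3)
```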
Extracts the Nth element of a sequence, where N is a PyObject.
block = (Map(Scalar()), Identity()) >> Nth()
block.eval((list, n)) => list[n]
A block that calls Python's __getitem__ operator (i.e. [] syntax) on its input.
The input type may be a PyObject, a Tuple, or a finite Sequence.
(GetItem(key) >> block).eval(inp) => block.eval(inp[key])
Will raise a KeyError if applied to an input where the key cannot be found.
A block that returns the length of its input.
A block which applies Python slicing to a PyObject, Tuple, or Sequence.
For example, to reverse a sequence:
(Map(Scalar()) >> Slice(step=-1)).eval(range(5)) => [4, 3, 2, 1, 0]
Positional arguments are not accepted in order to avoid the ambiguity of slice(start=N) vs. slice(stop=N).
- *args: Positional arguments; must be empty (see above).
- **kwargs: Keyword arguments; start=None, stop=None, step=None, name=None.
The block.
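The keyword-only convention can be mimicked in plain Python with the built-in slice object. In this sketch make_slice is a hypothetical helper, and raising TypeError for positional arguments is an assumption; the text above does not say which exception Fold raises.

```python
def make_slice(*args, **kwargs):
    """Returns a function applying seq[start:stop:step] to its input."""
    if args:
        # Assumed error type; avoids the slice(N) start-vs-stop ambiguity.
        raise TypeError('positional arguments are not accepted')
    s = slice(kwargs.get('start'), kwargs.get('stop'), kwargs.get('step'))
    return lambda seq: list(seq)[s]


reverse = make_slice(step=-1)
reversed_range = reverse(range(5))
middle = make_slice(start=1, stop=4)(range(5))
```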
A ForwardDeclaration is used to define Blocks recursively.
Usage:
fwd = ForwardDeclaration(in_type, out_type) # declare type of block
block = ... fwd() ... fwd() ... # define block recursively
fwd.resolve_to(block) # resolve forward declaration
Resolve the forward declaration by setting it to the given block.
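The declare, use, resolve pattern is the same one needed for any recursive definition. A plain-Python analogy with callables instead of blocks (the Forward class is hypothetical and ignores the input and output types that ForwardDeclaration takes):

```python
class Forward(object):
    """A placeholder callable that is bound to its target later."""

    def __init__(self):
        self._target = None

    def resolve_to(self, fn):
        self._target = fn

    def __call__(self, x):
        if self._target is None:
            raise RuntimeError('forward declaration has not been resolved')
        return self._target(x)


fwd = Forward()                       # declare


def tree_sum(tree):                   # define recursively via fwd
    if isinstance(tree, (int, float)):
        return tree
    return sum(fwd(child) for child in tree)


fwd.resolve_to(tree_sum)              # resolve
total = tree_sum([1, [2, 3], [[4]]])
```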
A block that dispatches its input to one of its children.
Can be used to dynamically dispatch on the type of its input, or emulate an 'if' or 'switch' statement.
case_blocks = {'a': a_block, 'b': b_block}
block = OneOf(GetItem('key'), case_blocks)
inp1 = {'key': 'a', ...}
inp2 = {'key': 'b', ...}
block.eval(inp1) => a_block.eval(inp1)
block.eval(inp2) => b_block.eval(inp2)
case_blocks = (block0, block1, block2)
block = OneOf(GetItem('index'), case_blocks)
inp1 = {'index': 0, ...}
inp2 = {'index': -1, ...}
block.eval(inp1) => block0.eval(inp1)
block.eval(inp2) => block2.eval(inp2)
Creates the OneOf block.
- key_fn: A Python function or a block with PyObject output type, which returns a key when given an input. The key will be used to look up a child in case_blocks for dispatch.
- case_blocks: A non-empty Python dict, list of (key, block) pairs, or tuple of blocks (in which case the keys are 0, 1, 2, ...), where each block has the same input type T and the same output type.
- pre_block: An optional block with output type T. If specified, pre_block will be used to pre-process the input before the input is handed to one of case_blocks.
- name: An optional string name for the block.
- ValueError: If case_blocks is empty.
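The dispatch logic, including the negative tuple index from the example above, can be sketched with plain callables. one_of is a hypothetical stand-in for illustration; it assumes key_fn sees the original input while the chosen case sees the pre-processed one.

```python
def one_of(key_fn, case_blocks, pre_block=None):
    """Dispatches each input to the case selected by key_fn(inp)."""
    if not case_blocks:
        raise ValueError('case_blocks must be non-empty')
    if isinstance(case_blocks, (dict, list)):
        cases = dict(case_blocks)   # dict or list of (key, case) pairs
    else:
        cases = case_blocks         # tuple: indexed by position

    def run(inp):
        key = key_fn(inp)
        value = pre_block(inp) if pre_block is not None else inp
        return cases[key](value)

    return run


by_name = one_of(lambda inp: inp['key'],
                 {'a': lambda inp: 'case a', 'b': lambda inp: 'case b'})
by_index = one_of(lambda inp: inp['index'],
                  (lambda inp: 'first', lambda inp: 'second',
                   lambda inp: 'last'))
```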
A block that computes a metric.
Metrics are used in Fold when the size of a model's output is not fixed, but varies as a function of the input data. They are also handy for accumulating results across sequential and recursive computations without having to thread them through explicitly as return values.
For example, to create a block y that takes a (label, prediction) as input, adds an L2 'loss' metric, and returns the prediction as its output, you could say:
y = Composition()
with y.scope():
  label = y.input[0]
  prediction = y.input[1]
  l2 = (Function(tf.sub) >> Function(tf.nn.l2_loss)).reads(label, prediction)
  Metric('loss').reads(l2)
  y.output.reads(prediction)
The input type of the block must be a TensorType, or a (TensorType, PyObjectType) tuple.
The output type is always VoidType. In the tuple input case, the second item of the tuple becomes a label for the tensor value, which can be used to identify where the value came from in a nested data structure and/or batch of inputs.
For example:
sess = tf.InteractiveSession()
# We pipe Map() to Void() because blocks with sequence output types
# cannot be compiled.
block = td.Map(td.Scalar() >> td.Metric('foo')) >> td.Void()
compiler = td.Compiler.create(block)
sess.run(compiler.metric_tensors['foo'],
         compiler.build_feed_dict([range(3), range(4)])) =>
  array([ 0., 1., 2., 0., 1., 2., 3.], dtype=float32)
Or with labels:
sess = tf.InteractiveSession()
block = td.Map((td.Scalar(), td.Identity()) >> td.Metric('bar')) >> td.Void()
compiler = td.Compiler.create(block)
feed_dict, metric_labels = compiler.build_feed_dict(
    [[(0, 'zero'), (1, 'one')], [(2, 'two')]],
    metric_labels=True)
metric_labels => {'bar': ['zero', 'one', 'two']}
sess.run(compiler.metric_tensors['bar'], feed_dict) =>
  array([ 0., 1., 2.], dtype=float32)
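Conceptually, the metric values and labels in the example above are what you get by flattening every (value, label) pair across the whole batch, regardless of which example it came from. A plain-Python sketch of that flattening (collect_metric is a hypothetical helper, unrelated to how Fold computes metrics internally):

```python
def collect_metric(batch):
    """Flattens (value, label) pairs from all examples in a batch."""
    values, labels = [], []
    for example in batch:
        for value, label in example:
            values.append(float(value))
            labels.append(label)
    return values, labels


values, labels = collect_metric([[(0, 'zero'), (1, 'one')], [(2, 'two')]])
```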
A block that merely returns its input.
A block with void output type that accepts any input type.
A fully connected network layer.
Fully connected layers require a float32 vector (i.e. 1D tensor) as input, and build float32 vector outputs. Layers can be applied to multiple inputs, provided they all have the same shape.
For example, to apply the same hidden layer to two different input fields:
layer = FC(100)
inputs = {'a': Vector(10), 'b': Vector(10)}  # 'in' is a reserved word in Python
hidden = [inputs['a'] >> Call(layer), inputs['b'] >> Call(layer)] >> Concat()
out = hidden >> Call(FC(10, activation=None))
td.FC.__init__(num_units_out, activation=relu, initializer=None, input_keep_prob=None, output_keep_prob=None, name=None)
Initializes the layer.
- num_units_out: The number of output units in the layer.
- activation: The activation function. Default is ReLU. Use None to get a linear layer.
- initializer: The initializer for the weights. Defaults to uniform unit scaling with factor derived in http://arxiv.org/pdf/1412.6558v3.pdf if activation is ReLU, ReLU6, tanh, or linear. Otherwise defaults to truncated normal initialization with a standard deviation of 0.01.
- input_keep_prob: Optional scalar float32 tensor for dropout on input. Feed 1.0 at serving to disable dropout.
- output_keep_prob: Optional scalar float32 tensor for dropout on output. Feed 1.0 at serving to disable dropout.
- name: An optional string name. Defaults to FC_%d % num_units_out. Used to name the variable scope where the variables for the layer live.
An embedding for integers.
Embeddings require integer scalars as input, and build float32 vector outputs. Embeddings can be applied to multiple inputs. Embedding doesn't do any hashing on its own; it just takes its inputs mod num_buckets to determine which embedding(s) to return.
Implementation detail: tf.gather currently only supports int32 and int64. If the input type is smaller than 32 bits it will be cast to tf.int32. Since all currently defined TF integer dtypes other than int32 and int64 have fewer than 32 bits, this means that we support all current integer dtypes.
td.Embedding.__init__(num_buckets, num_units_out, initializer=None, name=None, trainable=True, mod_inputs=True)
Initializes the layer.
- num_buckets: How many buckets the embedding has.
- num_units_out: The number of output units in the layer.
- initializer: The initializer for the weights. Defaults to uniform unit scaling. The initializer can also be a Tensor or numpy array, in which case the weights are initialized to this value and shape. Note that in this case the weights will still be trainable unless you also pass trainable=False.
- name: An optional string name. Defaults to Embedding_%d_%d % (num_buckets, num_units_out). Used to name the variable scope where the variables for the layer live.
- trainable: Whether or not to make the weights trainable.
- mod_inputs: Whether or not to mod the input by the number of buckets.
- ValueError: If the shape of weights is not (num_buckets, num_units_out).
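The "inputs mod num_buckets" lookup can be pictured with a list of rows in place of the weight matrix. This is a plain-Python sketch under assumptions: make_embedding is a hypothetical helper, and raising IndexError for out-of-range inputs when mod_inputs is False is a choice made here, not documented Fold behavior.

```python
def make_embedding(table, mod_inputs=True):
    """Returns a lookup into table, a list of num_buckets rows."""
    num_buckets = len(table)

    def lookup(index):
        if mod_inputs:
            index %= num_buckets  # wrap any integer into [0, num_buckets)
        elif not 0 <= index < num_buckets:
            raise IndexError(index)  # assumed behavior, see lead-in
        return table[index]

    return lookup


table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 buckets, 2 output units
lookup = make_embedding(table)
wrapped = lookup(5)  # 5 mod 3 == 2
```

Because no hashing is done, inputs that are congruent mod num_buckets share an embedding row.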
An implementation of FractalNet.
See https://arxiv.org/abs/1605.07648 for details.
td.FractalNet.__init__(num_fractal_blocks, fractal_block_depth, base_layer_builder, mixer=None, drop_path=False, p_local_drop_path=0.5, p_drop_base_case=0.25, p_drop_recursive_case=0.25, name=None)
Initializes the FractalNet.
- num_fractal_blocks: The number of fractal blocks the net is made from. This variable is named B in the FractalNet paper. This argument uses the word block in the sense that the FractalNet paper uses it.
- fractal_block_depth: How deeply nested the blocks are. This variable is C-1 in the paper.
- base_layer_builder: A callable that takes a name and returns a Layer object. We would pass in a convolutional layer to reproduce the results in the paper.
- mixer: The join operation in the paper. Assumed to have two arguments. Defaults to element-wise averaging. Mixing doesn't occur if either path gets dropped.
- drop_path: A boolean, whether or not to do drop-path. Defaults to False. If selected, we do drop path as described in the paper (unless drop-path choices is provided, in which case how drop path is done can be further customized by the user).
- p_local_drop_path: A probability between 0.0 and 1.0. 0.0 means always do global drop path. 1.0 means always do local drop path. Default: 0.5, as in the paper.
- p_drop_base_case: The probability, when doing local drop path, to drop the base case.
- p_drop_recursive_case: The probability, when doing local drop path, to drop the recursive case. (Requires: p_drop_base_case + p_drop_recursive_case < 1)
- name: An optional string name.
Create a Fold Layer that wraps a TensorFlow layer or RNN cell.
The default TensorFlow mechanism for weight sharing is to use tf.variable_scope, but this requires that a scope parameter be passed whenever the layer is invoked. ScopedLayer stores a TensorFlow layer, along with its variable scope, and passes the scope appropriately. For example:
gru_cell1 = td.ScopedLayer(tf.contrib.rnn.GRUCell(num_units=16), 'gru1')
... td.RNN(gru_cell1) ...
Wrap a TensorFlow layer.
- layer_fn: A callable that accepts and returns nests of batched tensors. A nest of tensors is either a tensor or a sequence of nests of tensors. Must also accept a scope keyword argument. For example, may be an instance of tf.contrib.rnn.RNNCell.
- name_or_scope: A variable scope or a string to use as the scope name.
Tensors (which may be numpy or tensorflow) of a particular shape.
Tensor types implement the numpy array protocol, which means that e.g. np.ones_like(tensor_type) will do what you expect it to. Calling np.array(tensor_type) returns a zeroed array.
Returns a zeroed numpy array of this type.
Creates a tensor type.
- shape: A tuple or list of non-negative integers.
- dtype: A tf.DType, or stringified version thereof (e.g. 'int64').
- TypeError: If shape is not a tuple or list of non-negative integers.
- TypeError: If dtype cannot be converted to a TF dtype.
A type used for blocks that don't return inputs or outputs.
The type of an arbitrary python object (usually used as an input type).
Type for fixed-length tuples of items, each of a particular type.
TupleType implements the sequence protocol, so e.g. foo[0] is the type of the first item, foo[2:4] is a TupleType with the expected item types, and len(foo) is the number of item types in the tuple.
Creates a tuple type.
- *item_types: A tuple of types or a single iterable of types.
- TypeError: If the items of item_types are not all types.
Type for variable-length sequences of elements all having the same type.
Creates a sequence type.
- elem_type: A type.
- TypeError: If elem_type is not a type.
Type for infinite sequences of the same element repeated.
Creates a sequence type.
- elem_type: A type.
- TypeError: If elem_type is not a type.
Base class for training, evaluation, and inference plans.
- mode: One of 'train', 'eval', or 'infer'.
- compiler: A td.Compiler, or None.
- examples: An iterable of examples, or None.
- metrics: An ordered dict from strings to real numeric tensors. These are used to make scalar summaries if they are scalars and histogram summaries otherwise.
- losses: An ordered dict from strings to tensors.
- num_multiprocess_processes: Number of worker processes to use for multiprocessing loom inputs. Default (None) is the CPU count. Set to zero to disable multiprocessing.
- is_chief_trainer: A boolean indicating whether this is the chief training worker.
- name: A string; defaults to 'plan'.
- logdir: A string; used for saving/restoring checkpoints and summaries.
- rundir: A string; the parent directory of logdir, shared between training and eval jobs for the same model.
- plandir: A string; the parent directory of rundir, shared between many runs of different models on the same task.
- master: A string; TensorFlow master to use.
- save_summaries_secs: An integer; set to zero to disable summaries. In distributed training only the chief should set this to a non-zero value.
- print_file: A file to print logging messages to; defaults to stdout.
- should_stop: A callback to check for whether the training or eval jobs should be stopped.
- report_loss: A callback for training and eval jobs to report losses.
- report_done: A callback called by the eval jobs when they finish.
Raises an exception if the plan cannot be run.
A placeholder for normalizing loss summaries.
A scalar placeholder if there are losses and finalize_stats() has been called, else None.
A bool; whether or not summaries are being computed.
Creates a plan.
- mode: A string; 'train', 'eval', or 'infer'.
- ValueError: If mode is invalid.
A Plan.
Creates a plan from flags.
- setup_plan_fn: A unary function accepting a plan as its argument. The function must assign the following attributes:
  - compiler
  - examples (excepting when batches are being read from the loom input tensor in train mode e.g. by a dequeuing worker)
  - losses (in train/eval mode)
  - outputs (in infer mode)
A runnable plan with finalized stats.
- ValueError: If flags are invalid.
Creates a plan from a dictionary.
- setup_plan_fn: A unary function accepting a plan as its argument. The function must assign the following attributes:
  - compiler
  - examples (excepting when batches are being read from the loom input tensor in train mode e.g. by a dequeuing worker)
  - losses (in train/eval mode)
  - outputs (in infer mode)
- params: A dictionary to pull options from.
A runnable plan with finalized stats.
- ValueError: If params are invalid.
Creates a TF supervisor for running the plan.
Finalizes metrics and losses. Gets/creates global_step if unset.
The global step tensor.
Initializes the compiler's loom.
The plan must have a compiler with a compiled root block and an uninitialized loom.
In training mode this sets up enqueuing/dequeuing if num_dequeuers is
non-zero. When enqueuing, no actual training is performed; the
train op is to enqueue batches of loom inputs from train_set,
typically for some other training worker(s) to dequeue from. When
dequeuing, batches are read using a dequeue op, typically from a
queue that some other training worker(s) are enqueuing to.
**loom_kwargs
: Arguments to compiler.init_loom. In enqueuing or dequeuing training, loom_input_tensor may not be specified.
A pair of bools (needs_examples, needs_stats), indicating
which of these requirements must be met in order for the plan to
be runnable. In enqueuing training and in inference we need examples
but not stats, whereas in dequeuing the reverse holds. In all
other cases we need both examples and stats.
ValueError
: If compiler is missing.
RuntimeError
: If compile() has not been called on the compiler.
RuntimeError
: If the compiler's loom has already been initialized.
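The requirement rules described above can be sketched as a small truth table. This is an illustration only, not Fold's code: the function name `plan_requirements` and the `queue_role` argument are hypothetical stand-ins for however the plan tracks enqueuing versus dequeuing.

```python
def plan_requirements(mode, queue_role=None):
    """Return (needs_examples, needs_stats) for a plan-like object.

    mode: 'train', 'eval', or 'infer'.
    queue_role: None, 'enqueue', or 'dequeue' (hypothetical; only
        meaningful in 'train' mode).
    """
    if mode not in ('train', 'eval', 'infer'):
        raise ValueError('invalid mode: %r' % (mode,))
    if mode == 'infer' or (mode == 'train' and queue_role == 'enqueue'):
        return True, False  # examples are needed, stats are not
    if mode == 'train' and queue_role == 'dequeue':
        return False, True  # stats are needed, examples are not
    return True, True  # ordinary training and eval need both


for args in [('train',), ('train', 'enqueue'), ('train', 'dequeue'), ('infer',)]:
    print(args, plan_requirements(*args))
```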
A scalar tensor, or None.
The total loss if there are losses and finalize_stats() has been called, else None.
Runs the plan with supervisor and session.
supervisor
: A TF supervisor, or None. If None, a supervisor is created by calling self.create_supervisor().
session
: A TF session, or None. If None, a session is created by calling supervisor.managed_session(self.master). Will be installed as the default session while running the plan.
ValueError
: If the plan's attributes are invalid.
RuntimeError
: If the plan has metrics or losses, and finalize_stats() has not been called.
A scalar string tensor, or None.
Merged summaries if compute_summaries is true and finalize_stats has been called, else None.
Plan class for training.
There are two primary training modes. When examples is present,
batches are created in Python and passed to TF as feeds. In this
case, examples must be non-empty. When examples is absent,
batches are read directly from the compiler's loom_input_tensor. In
the latter case, each batch must have exactly batch_size elements.
batches_per_epoch
: An integer, or None; how many batches to consider an epoch when examples is absent. Has no effect on training when examples is present (because an epoch is defined as a full pass through the training set).
dev_examples
: An iterable of development (i.e. validation) examples, or None.
train_op
: A TF op, e.g. Optimizer.minimize(loss), or None.
train_feeds
: A dict of training feeds, e.g. keep probability for dropout.
epochs
: An integer, or None.
batch_size
: An integer, or None.
save_model_secs
: An integer. Note that if dev_examples is provided then the best-performing models are saved, and this is ignored.
task
: An integer; the task ID. Unlike ps_tasks, num_dequeuers, and queue_capacity, which all indicate some form of capacity, this is an identifier.
worker_replicas
: An integer.
ps_tasks
: An integer.
num_dequeuers
: An integer.
queue_capacity
: An integer.
optimizer_params
: A dictionary mapping strings to optimizer arguments. Used only if train_op is not provided.
exact_batch_sizes
: A bool; if true, len(examples) % batch_size items from the training set will be dropped each epoch to ensure that all batches have exactly batch_size items. Default is false. Has no effect if batches are being read from the compiler's loom input tensor. Otherwise, if true, examples must have at least batch_size items (to ensure that the training set is non-empty).
Plan class for evaluation.
eval_interval_secs
: Time interval between eval runs (when running in a loop). Set to zero or None to run a single eval and then exit; in this case data will be streamed. Otherwise, data must fit in memory.
save_best
: A boolean determining whether to save a checkpoint if this model has the best loss so far.
logdir_restore
: A string or None; log directory for restoring checkpoints from.
batch_size
: An integer (defaults to 10,000); maximal number of examples to pass to a single call to Session.run(). When streaming, this is also the maximal number of examples that will be materialized in-memory.
Plan class for inference.
key_fn
: A function from examples to keys, or None.
outputs
: A list or tuple of tensors to be run to produce results, or None.
results_fn
: A function that takes an iterable of (key, result) pairs if key_fn is present or results otherwise; by default prints to stdout.
context_manager
: A context manager for wrapping calls to `result_
batch_size
: An integer (defaults to 10,000); maximal number of examples to materialize in-memory.
chunk_size
: An integer (defaults to 100); chunk size for each unit of work, if multiprocessing.
Defines all of the flags used by td.Plan.create_from_flags().
default_plan_name
: A default value for the --plan_name flag.
blacklist
: A set of string flag names to not define.
Returns a dict from plan option parameter names to their defaults.
Converts block_like to a block.
The conversion rules are as follows:
type of block_like | result
---|---
Block | block_like
Layer | Function(block_like)
(tf.Tensor, tf.Variable, np.ndarray) | FromTensor(block_like)
(dict, list, tuple) | Record(block_like)
block_like
: Described above.
A block.
TypeError
: If block_like cannot be converted to a block.
Converts type_like to a Type.
If type_like is already a Type, it is returned. The following
conversions are performed:
- Python tuples become Tuples; items are recursively converted.
- A tf.TensorShape becomes a corresponding TensorType with dtype=float32. Must be fully defined.
- Lists of shape + [dtype] (e.g. [3, 4, 'int32']) become TensorTypes, with the default dtype=float32 if omitted.
- A tf.DType or stringified version thereof (e.g. 'int64') becomes a corresponding scalar TensorType((), dtype).
- An integer vector_len becomes a corresponding vector TensorType((vector_len,), dtype=float32).
type_like
: Described above.
A Type.
TypeError
: If type_like cannot be converted to a Type.
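As a rough illustration of the conversion rules listed above, the following sketch mirrors them on plain Python values. It does not use Fold or TensorFlow: `TensorType` and `TupleType` here are hypothetical stand-in dataclasses, dtypes are plain strings, and the `tf.TensorShape` rule is omitted because it would require TensorFlow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorType:  # hypothetical stand-in, not Fold's TensorType
    shape: tuple
    dtype: str = 'float32'


@dataclass(frozen=True)
class TupleType:  # hypothetical stand-in for a tuple of types
    items: tuple


def sketch_convert_to_type(type_like):
    """Applies the listed conversion rules to plain Python values."""
    if isinstance(type_like, (TensorType, TupleType)):
        return type_like  # already a type: returned unchanged
    if isinstance(type_like, tuple):
        # Tuples are converted item by item, recursively.
        return TupleType(tuple(sketch_convert_to_type(t) for t in type_like))
    if isinstance(type_like, list):
        # shape + [dtype]; the dtype defaults to float32 if omitted.
        if type_like and isinstance(type_like[-1], str):
            return TensorType(tuple(type_like[:-1]), type_like[-1])
        return TensorType(tuple(type_like))
    if isinstance(type_like, str):
        return TensorType((), type_like)  # scalar of the named dtype
    if isinstance(type_like, int) and not isinstance(type_like, bool):
        return TensorType((type_like,))  # float32 vector of that length
    raise TypeError('cannot convert %r to a type' % (type_like,))


print(sketch_convert_to_type([3, 4, 'int32']))
print(sketch_convert_to_type((5, 'int64')))
```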
Returns a canonical representation of a type.
Recursively applies a reduction rule that converts tuples/sequences
of PyObjectType to a single terminal PyObjectType.
canonicalize_type((PyObjectType(), PyObjectType())) => PyObjectType()
canonicalize_type(SequenceType(PyObjectType())) => PyObjectType()
type_like
: A type or an object convertible to one by convert_to_type.
A canonical representation of type_like.
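The reduction rule can be sketched with toy type classes. The classes below are hypothetical stand-ins rather than Fold's own types, and the treatment of an empty tuple (left as-is) is an assumption that the text above does not settle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PyObjectType:  # hypothetical stand-in
    pass


@dataclass(frozen=True)
class SequenceType:  # hypothetical stand-in
    element: object


@dataclass(frozen=True)
class TupleType:  # hypothetical stand-in
    items: tuple


def sketch_canonicalize(t):
    """Collapses tuples/sequences made only of PyObjectType, bottom-up."""
    if isinstance(t, SequenceType):
        element = sketch_canonicalize(t.element)
        if isinstance(element, PyObjectType):
            return PyObjectType()
        return SequenceType(element)
    if isinstance(t, TupleType):
        items = tuple(sketch_canonicalize(i) for i in t.items)
        # Assumption: an empty tuple is not collapsed.
        if items and all(isinstance(i, PyObjectType) for i in items):
            return PyObjectType()
        return TupleType(items)
    return t  # any other type is already canonical in this sketch


nested = SequenceType(TupleType((PyObjectType(), PyObjectType())))
print(sketch_canonicalize(nested))
```

Because the rule is applied to children first, a sequence of tuples of Python objects collapses all the way down to a single `PyObjectType()`.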
A wrapper around an iterator that lets it be used as a TF feed value.
Edible iterators are useful when you have an expensive computation running asynchronously that you want to feed repeatedly. For example:
items = my_expensive_function() # returns an iterator, doesn't block
fd = {x: list(items)} # blocks here
while training:
    do_stuff() # doesn't run until my_expensive_function() completes
    sess.run(fetches, fd)
With an edible iterator you can instead say:
items = my_expensive_function() # returns an iterator, doesn't block
fd = {x: EdibleIterator(items)} # doesn't block
while training:
    do_stuff() # runs right away
    sess.run(fetches, fd) # blocks here
Python iterators are only good for a single traversal. This means
that if you call next() before a feed dict is created, then that
value will be lost. When an edible iterator gets fed to TF the base
iterator is exhausted, but the results are cached, so the same
edible may be consumed repeatedly.
Implementation details: TF consumes feed values by converting them
to NumPy arrays. NumPy doesn't like iterators (otherwise we could
e.g. use tee(), which would be simpler), so we use __array__ to
tell NumPy how to do the conversion.
NumPy array protocol; returns iterator values as an ndarray.
Returns iterator values as an ndarray if it exists, else None.
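The caching behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Fold's implementation: the class name `CachingFeed` and its `materialize` method are hypothetical, the cache is a plain list rather than an ndarray, and NumPy is imported lazily so that the caching logic can be exercised without it.

```python
class CachingFeed:
    """Hypothetical stand-in for an edible iterator."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._cache = None  # filled in on the first conversion

    def materialize(self):
        # Exhausts the base iterator once; later calls reuse the cache.
        if self._cache is None:
            self._cache = list(self._iterator)
        return self._cache

    def __array__(self, dtype=None, copy=None):
        # NumPy array protocol hook; called when NumPy converts this object.
        import numpy as np  # lazy import, only needed for the conversion
        return np.array(self.materialize(), dtype=dtype)

    @property
    def value(self):
        # The cached values if the iterator has been consumed, else None.
        return self._cache


feed = CachingFeed(x * x for x in range(4))
print(feed.value)          # nothing has been consumed yet
print(feed.materialize())  # consumes the generator and caches the result
print(feed.materialize())  # served from the cache
```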
Yields successive batches from an iterable, as lists.
iterable
: An iterable.
batch_size
: A positive integer.
truncate
: A bool (default false). If true, then the last len_iterable % batch_size items are not yielded, ensuring that all batches have exactly batch_size items.
Successive batches from iterable, as lists of at most
batch_size items.
ValueError
: If batch_size is non-positive.
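A minimal sketch of this batching behavior, written from the description above rather than taken from Fold's source. The name `sketch_batches` is hypothetical; because it is a generator, the `batch_size` check only runs once iteration starts.

```python
import itertools


def sketch_batches(iterable, batch_size, truncate=False):
    """Yields lists of up to batch_size items from iterable."""
    if batch_size <= 0:
        raise ValueError('batch_size must be positive: %r' % (batch_size,))
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return  # input exhausted
        if truncate and len(batch) < batch_size:
            return  # drop the short final batch
        yield batch


print(list(sketch_batches(range(7), 3)))
print(list(sketch_batches(range(7), 3, truncate=True)))
```

With `truncate=True` the trailing `7 % 3 == 1` item is dropped, which is the same arithmetic that `exact_batch_sizes` describes for training plans.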
Yields the items of an iterable repeatedly.
This function is particularly useful when items is expensive to compute
and you want to memoize it without blocking. For example:
for items in epochs((my_expensive_function(x) for x in inputs), n):
    for item in items:
        f(item)
This lets f(item) run as soon as the first item is ready.
As an optimization, when n == 1, items itself is yielded without memoization.
items
: An iterable.
n
: How many times to yield; zero or None (the default) means loop forever.
shuffle
: Whether or not to shuffle the items after each yield. Shuffling is performed in-place. We don't shuffle before the first yield because this would require us to block until all of the items were ready.
prng
: Nullary function returning a random float in [0.0, 1.0); defaults to random.random.
An iterable of items, n times.
TypeError
: If items is not an iterable.
ValueError
: If n is negative.
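The memoize-without-blocking idea can be sketched like this. It is an illustration only: the name `sketch_epochs` is hypothetical, and it takes an optional `random.Random` instance (`rng`) instead of the nullary `prng` function described above.

```python
import random


def sketch_epochs(items, n=None, shuffle=False, rng=None):
    """Yields items repeatedly, caching them during the first pass."""
    if n is not None and n < 0:
        raise ValueError('n must be non-negative: %r' % (n,))
    if n == 1:
        yield items  # single pass: no need to memoize
        return
    iterator = iter(items)  # raises TypeError if items is not iterable
    cache = []

    def first_pass():
        # Hands items to the caller as they arrive, remembering each one.
        for item in iterator:
            cache.append(item)
            yield item

    yield first_pass()
    cache.extend(iterator)  # pick up anything the caller left unconsumed
    rng = rng or random
    count = 1
    while not n or count < n:  # zero or None means loop forever
        if shuffle:
            rng.shuffle(cache)  # in place, only after the first yield
        yield cache
        count += 1


for epoch in sketch_epochs((x * 10 for x in range(3)), 2):
    print(list(epoch))
```

The first epoch streams straight from the generator, so downstream work can start on the first item; later epochs are served from the cached list.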
Parses a list of key-value pairs.
spec
: A comma-separated list of strings of the form <key>=<value>.
ValueError
: If spec is malformed or contains duplicate keys.
A dict.
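A minimal sketch of parsing such a spec, based only on the description above. The name `sketch_parse_spec` is hypothetical, and two details are assumptions: values are kept as strings, and an empty spec yields an empty dict.

```python
def sketch_parse_spec(spec):
    """Parses 'k1=v1,k2=v2' into a dict of strings."""
    result = {}
    if not spec:
        return result  # assumption: an empty spec means no options
    for item in spec.split(','):
        key, sep, value = item.partition('=')
        if not sep or not key:
            raise ValueError('malformed item: %r' % (item,))
        if key in result:
            raise ValueError('duplicate key: %r' % (key,))
        result[key] = value
    return result


print(sketch_parse_spec('learning_rate=0.01,momentum=0.9'))
```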
Constructs an optimizer from key-value pairs.
For example
build_optimizer_from_params('momentum', momentum=0.9, learning_rate=1e-3)
creates a MomentumOptimizer with momentum 0.9 and learning rate 1e-3.
optimizer
: The name of the optimizer to construct.
**kwargs
: Arguments for the optimizer's constructor.
ValueError
: If optimizer is unrecognized.
ValueError
: If kwargs sets arguments that optimizer doesn't have, or fails to set arguments the optimizer requires.
A tf.train.Optimizer of the appropriate type.
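The validation described above (unknown name, unexpected arguments, missing required arguments) can be sketched without TensorFlow. Everything here is hypothetical: the two toy optimizer classes, the registry, and the function name stand in for real `tf.train` optimizers, and checking arguments via `inspect.signature` is one possible approach rather than necessarily Fold's.

```python
import inspect


class ToyMomentum:  # hypothetical stand-in for a momentum optimizer
    def __init__(self, learning_rate, momentum):
        self.learning_rate = learning_rate
        self.momentum = momentum


class ToySGD:  # hypothetical stand-in for plain gradient descent
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate


_TOY_OPTIMIZERS = {'momentum': ToyMomentum, 'sgd': ToySGD}


def sketch_build_optimizer(name, **kwargs):
    """Looks up a constructor by name and checks kwargs against it."""
    try:
        cls = _TOY_OPTIMIZERS[name]
    except KeyError:
        raise ValueError('unrecognized optimizer: %r' % (name,))
    try:
        # bind() fails on unexpected or missing constructor arguments.
        inspect.signature(cls).bind(**kwargs)
    except TypeError as e:
        raise ValueError('bad arguments for %r: %s' % (name, e))
    return cls(**kwargs)


opt = sketch_build_optimizer('momentum', momentum=0.9, learning_rate=1e-3)
print(type(opt).__name__, opt.momentum, opt.learning_rate)
```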
Creates a new variable scope based on name, nested in the current scope.
If name ends with a / then the new scope will be created exactly as if
you called tf.variable_scope(name). Otherwise, name will be
made globally unique, in the context of the current graph (e.g.
foo will become foo_1 if a foo variable scope already exists).
name
: A non-empty string.
A variable scope.
TypeError
: If name is not a string.
ValueError
: If name is empty.
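The naming rule can be illustrated without TensorFlow. This sketch only models the name bookkeeping: the function `sketch_unique_scope_name` and the `existing` set are hypothetical, and the suffix sequence after `foo_1` (`foo_2`, and so on) is an assumption.

```python
def sketch_unique_scope_name(name, existing):
    """Returns the scope name to use, recording it in the existing set."""
    if not isinstance(name, str):
        raise TypeError('name must be a string: %r' % (name,))
    if not name:
        raise ValueError('name must be non-empty')
    if name.endswith('/'):
        return name  # used verbatim, with no uniquification
    candidate, suffix = name, 0
    while candidate in existing:
        suffix += 1
        candidate = '%s_%d' % (name, suffix)
    existing.add(candidate)
    return candidate


scopes = set()
print(sketch_unique_scope_name('foo', scopes))
print(sketch_unique_scope_name('foo', scopes))
print(sketch_unique_scope_name('foo/', scopes))
```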
Base class for objects with associated input/output types and names.
Returns the input type if known, else None.
Returns the output type if known, else None.
Updates the input type.
input_type
: A type, or None.
self
TypeError
: If input_type is not compatible with the current input type or its expected type classes.
Updates the type classes of the input type.
*input_type_classes
: A tuple of type classes.
self
TypeError
: If input_type_classes are not compatible with the current input type or its expected type classes.
Updates input and output types of two IOBase objects to match.
other
: An instance of IOBase.
self
TypeError
: If the input/output types of self and other are incompatible.
Updates the output type.
output_type
: A type, or None.
self
TypeError
: If output_type is not compatible with the current output type.
Updates the type class of the output type.
*output_type_classes
: A tuple of type classes.
self
TypeError
: If output_type_classes are not compatible with the current output type or its expected type classes.
Base class for all blocks.
A Block is an object which maps a data-structure (or queued TensorFlow
operations, depending on the block's input type) into queued TensorFlow
operations. (Except for InputTransform, which maps from
data-structure to data-structure.)
When interacting with Fold you can debug your blocks
by calling eval inside of a TF session. (This has high
per-call overhead and is not recommended for long-running jobs.)
The efficient way to evaluate a block repeatedly is to pass the root of a tree
of blocks to a persistent td.Compiler object (note that
eval creates a compiler object behind the scenes).
Returns a reference to the i-th output from this block.
Function composition; (a >> b).eval(x) => b(a(x)).
Function composition; (a >> b).eval(x) => b(a(x)).
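The composition law `(a >> b).eval(x) => b(a(x))` can be demonstrated with a toy class. This is not Fold's `Block`: the class `ToyBlock` is hypothetical, and the sketch assumes the two identical descriptions above correspond to the left-hand and reflected forms of the `>>` operator.

```python
class ToyBlock:
    """Hypothetical stand-in that wraps a plain Python function."""

    def __init__(self, fn):
        self._fn = fn

    def eval(self, x):
        return self._fn(x)

    def __rshift__(self, other):
        # self >> other: run self first, then feed its result to other.
        other = other if isinstance(other, ToyBlock) else ToyBlock(other)
        return ToyBlock(lambda x: other.eval(self.eval(x)))

    def __rrshift__(self, other):
        # other >> self, where other is not a ToyBlock (e.g. a function).
        return ToyBlock(other) >> self


add_one = ToyBlock(lambda x: x + 1)
double = ToyBlock(lambda x: x * 2)
print((add_one >> double).eval(3))          # double(add_one(3))
print(((lambda x: x - 1) >> double).eval(3))  # uses the reflected operator
```

Defining the reflected operator lets a block-like value (here a bare function) appear on the left of `>>`, which matches the note at the top of this reference that functions taking blocks also accept anything convertible to a block.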
Evaluates this block on inp in a TF session.
Intended for testing and interactive development. If there are any uninitialized variables, they will be initialized prior to evaluation.
inp
: An input to the block.
feed_dict
: A dictionary that maps Tensor objects to feed values.
session
: The TF session to be used. Defaults to the default session.
tolist
: A bool; whether to return (possibly nested) Python lists in place of NumPy arrays.
use_while_loop
: A bool; whether to use a tf.while_loop in evaluation (default) or to unroll the loop. Provided for testing and debugging, should not affect the result.
The result of running the block. If output_type is tensor, then a
NumPy array (or Python list, if tolist is true). If a tuple, then a
tuple. If a sequence, then a list, or an instance of itertools.repeat
in the case of an infinite sequence. If metrics are defined then eval
returns a (result, metrics) tuple, where metrics is a dict mapping
metric names to NumPy arrays.
ValueError
: If session is None and no default session is registered. If the block contains no TF tensors or ops then a session is not required.
Returns the loom max_depth needed to evaluate inp.
Like eval, this is a convenience method for testing and
interactive development. It cannot be called after the TF graph
has been finalized (n.b. Compiler.max_depth does not have this limitation).
inp
: A well-formed input to this block.
An int (see above).
Sets self to read its inputs from other.
*other
: Which blocks to make the current block read from.
self
AssertionError
: If no composition scope has been entered.
A callable that accepts and returns nests of batched tensors.
Creates the layer.
input_type
: A type.
output_type
: A type.
name_or_scope
: A string or variable scope. If a string, a new variable scope will be created by calling create_variable_scope, with defaults inherited from the current variable scope. If no caching device is set, it will be set to lambda op: op.device. This is because tf.while_loop can be very inefficient if the variables it uses are not cached locally.
Base class for types that can be used as inputs/outputs to blocks.
Converts an instance of this type to a flat list of terminal values.
Calls fn(terminal_type, value) for all terminal values in instance.
Returns the total number of scalar elements in the type.
Returns None if the size is not fixed -- e.g. for a variable-length sequence.
Returns an iterable of all terminal types in this type, in pre-order.
Void is not considered to be a terminal type since no terminal values are needed to construct it. Instead, it has no terminal types.
An iterable with the terminal types.
Converts an iterator over terminal values to an instance of this type.
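The flatten/unflatten round trip described by these methods can be sketched on nested tuples. The encoding is hypothetical and chosen only for illustration: a type is either the string `'terminal'` or a tuple of nested types, and the empty tuple plays the role of a void type with no terminal values.

```python
def sketch_flatten(type_, instance):
    """Returns the terminal values of instance as a flat list, in pre-order."""
    if type_ == 'terminal':
        return [instance]
    values = []
    for sub_type, sub_instance in zip(type_, instance):
        values.extend(sketch_flatten(sub_type, sub_instance))
    return values


def sketch_unflatten(type_, values):
    """Rebuilds an instance of type_ by consuming an iterator of terminals."""
    if type_ == 'terminal':
        return next(values)
    return tuple(sketch_unflatten(sub_type, values) for sub_type in type_)


toy_type = ('terminal', ('terminal', 'terminal'), ())
instance = (1, (2, 3), ())
flat = sketch_flatten(toy_type, instance)
print(flat)
print(sketch_unflatten(toy_type, iter(flat)) == instance)
```

Taking an iterator rather than a list in `sketch_unflatten` lets each nested type consume exactly the terminals it needs and leave the rest for its siblings.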