NeuralClassifier is configured using JSON.
- task_info
- label_type: Candidates: "single-label", "multi-label".
- hierarchical: Boolean. Indicates whether it is a hierarchical classification.
- hierar_taxonomy: A text file that describes the taxonomy.
- hierar_penalty: Float.
- device: Candidates: "cuda", "cpu".
- model_name: Candidates: "FastText", "TextCNN", "TextRNN", "TextRCNN", "DRNN", "VDCNN", "DPCNN", "AttentiveConvNet", "Transformer".
- checkpoint_dir: checkpoint directory
- model_dir: model directory
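As a rough illustration, the fields above might be assembled and serialized as follows. This is a minimal sketch: the nesting of `label_type`, `hierarchical`, `hierar_taxonomy`, and `hierar_penalty` under `task_info` is an assumption, and every value (paths, penalty) is a placeholder rather than a recommended setting.

```python
import json

# Only the key names and candidate strings come from the list above;
# all values below are hypothetical placeholders.
config = {
    "task_info": {
        "label_type": "single-label",
        "hierarchical": False,
        "hierar_taxonomy": "data/example.taxonomy",  # placeholder path
        "hierar_penalty": 1e-6,                      # placeholder value
    },
    "device": "cpu",
    "model_name": "TextCNN",
    "checkpoint_dir": "checkpoints",  # placeholder directory
    "model_dir": "models",            # placeholder directory
}

# Serialize to JSON text that could be saved as a configuration file.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Building the configuration as a Python dict and dumping it with `json.dumps` avoids hand-written JSON mistakes such as trailing commas or `False` instead of `false`.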
- data
- train_json_files: train input data.
- validate_json_files: validation input data.
- test_json_files: test input data.
- generate_dict_using_json_files: generate dict using train data.
- generate_dict_using_all_json_files: generate dict using train, validate, test data.
- generate_dict_using_pretrained_embedding: generate dict from pre-trained embedding.
- dict_dir: dict directory.
- num_worker: number of processes used to load data.
- feature_names: Candidates: "token", "char".
- min_token_count
- min_char_count
- token_ngram: N-Gram, for example, 2 means bigram.
- min_token_ngram_count
- min_keyword_count
- min_topic_count
- max_token_dict_size
- max_char_dict_size
- max_token_ngram_dict_size
- max_keyword_dict_size
- max_topic_dict_size
- max_token_len
- max_char_len
- max_char_len_per_token
- token_pretrained_file: token pre-trained embedding.
- keyword_pretrained_file: keyword pre-trained embedding.
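To make `token_ngram` and the related count and size limits concrete, here is a small sketch. The way a minimum count and a maximum dictionary size are applied is an assumption based on the parameter names (`min_token_ngram_count`, `max_token_ngram_dict_size`), not a description of NeuralClassifier's actual code; the helper names are hypothetical.

```python
from collections import Counter


def token_ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_ngram_dict(documents, n, min_count, max_size):
    """Keep n-grams seen at least min_count times, capped at max_size entries.

    Assumed semantics: rare n-grams are dropped first, then the most
    frequent max_size entries are kept.
    """
    counts = Counter(g for doc in documents for g in token_ngrams(doc, n))
    kept = [(g, c) for g, c in counts.most_common() if c >= min_count]
    return dict(kept[:max_size])


docs = [["a", "b", "c"], ["a", "b", "d"]]
bigrams = token_ngrams(docs[0], 2)  # n = 2 means bigram
ngram_dict = build_ngram_dict(docs, n=2, min_count=2, max_size=10)
print(bigrams, ngram_dict)
```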
- batch_size
- eval_train_data: whether to evaluate on the training data during training.
- start_epoch: epoch number to start from.
- num_epochs: number of epochs.
- num_epochs_static_embedding: number of epochs during which the input embedding is not updated.
- decay_steps: decay learning rate every decay_steps.
- decay_rate: Rate of decay for learning rate.
- clip_gradients: Clip gradients whose absolute value exceeds this threshold.
- l2_lambda: l2 regularization lambda value.
- loss_type: Candidates: "SoftmaxCrossEntropy", "SoftmaxFocalCrossEntropy", "SigmodFocalCrossEntropy", "BCEWithLogitsLoss".
- sampler: Required if loss type is NCE. Candidates: "fixed", "log", "learned", "uniform".
- num_sampled: Number of negative labels to sample if loss type is NCE.
- hidden_layer_dropout: dropout of hidden layer.
- visible_device_list: GPU list to use.
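The interaction of `decay_steps`, `decay_rate`, and `clip_gradients` can be sketched in plain Python. Both functions are assumptions about the intended behavior (a staircase exponential schedule and element-wise value clipping); they are hypothetical helpers, not NeuralClassifier's implementation.

```python
def decayed_learning_rate(base_lr, decay_rate, decay_steps, step):
    """Staircase decay: multiply by decay_rate once every decay_steps steps."""
    return base_lr * decay_rate ** (step // decay_steps)


def clip_by_value(gradients, threshold):
    """Clamp each gradient component to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, g)) for g in gradients]


# After 2500 steps with decay_steps=1000, the rate has been decayed twice.
lr = decayed_learning_rate(base_lr=0.01, decay_rate=0.5, decay_steps=1000, step=2500)
clipped = clip_by_value([0.5, -3.0, 7.0], threshold=1.0)
print(lr, clipped)
```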
- type: Candidates: "embedding", "region_embedding".
- dimension: dimension of embedding.
- region_embedding_type: config for Region embedding. Candidates: "word_context", "context_word".
- region_size: region size, must be an odd number. Config for region embedding.
- initializer: Candidates: "uniform", "normal", "xavier_uniform", "xavier_normal", "kaiming_uniform", "kaiming_normal", "orthogonal".
- fan_mode: Candidates: "FAN_IN", "FAN_OUT".
- uniform_bound: If embedding_initializer is uniform, this param will be used as bound. e.g. [-embedding_uniform_bound,embedding_uniform_bound].
- random_stddev: If embedding_initializer is random, this param will be used as stddev.
- dropout: dropout of embedding layer.
- optimizer_type: Candidates: "Adam", "Adadelta".
- learning_rate: learning rate.
- adadelta_decay_rate: useful when optimizer_type is Adadelta.
- adadelta_epsilon: useful when optimizer_type is Adadelta.
- text_file
- threshold: Float, truncation threshold for predicted probabilities.
- dir: output dir of evaluation.
- batch_size: batch size of evaluation.
- is_flat: Boolean, flat evaluation or hierarchical evaluation.
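One plausible reading of `threshold` is that predicted labels whose probability falls below it are discarded before evaluation. The sketch below illustrates that reading only; the function name and the example labels are hypothetical, and the actual evaluation code may differ.

```python
def labels_above_threshold(probabilities, threshold):
    """Return (label, probability) pairs whose probability is at least threshold."""
    return [(label, p) for label, p in probabilities.items() if p >= threshold]


# Hypothetical per-label probabilities for a single document.
predicted = {"sports": 0.7, "politics": 0.2, "tech": 0.55}
kept = labels_above_threshold(predicted, threshold=0.5)
print(kept)
```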
- logger_file: log file path.
- log_level: Candidates: "debug", "info", "warn", "error".
- kernel_sizes: kernel size.
- num_kernels: number of kernels.
- top_k_max_pooling: max top-k pooling.
- hidden_dimension: dimension of hidden layer.
- rnn_type: Candidates: "RNN", "LSTM", "GRU".
- num_layers: number of layers.
- doc_embedding_type: Candidates: "AVG", "Attention", "LastHidden".
- attention_dimension: dimension of self-attention.
- bidirectional: Boolean, use Bi-RNNs.
see TextCNN and TextRNN
- hidden_dimension: dimension of hidden layer.
- window_size: window size.
- rnn_type: Candidates: "RNN", "LSTM", "GRU".
- bidirectional: Boolean.
- cell_hidden_dropout
- vdcnn_depth: depth of VDCNN.
- top_k_max_pooling: max top-k pooling.
- kernel_size: kernel size.
- pooling_stride: stride of pooling.
- num_kernels: number of kernels.
- blocks: number of blocks for DPCNN.
- attention_type: Candidates: "dot", "bilinear", "additive_projection".
- margin_size: attentive width, must be odd.
- type: Candidates: "light", "advanced".
- hidden_size: size of hidden layer.
- d_inner: dimension of inner nodes.
- d_k: dimension of key.
- d_v: dimension of value.
- n_head: number of heads.
- n_layers: number of layers.
- dropout
- use_star: whether to use Star-Transformer; see Star-Transformer.
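Several fields in this reference accept only a fixed set of values or must be odd. A small check like the following can catch mistakes before training starts. It is a sketch under assumptions: `find_problems` is a hypothetical helper that is not part of NeuralClassifier, and it takes a flat mapping of field names to values rather than the nested JSON structure.

```python
# Candidate values copied from the list above.
ALLOWED = {
    "device": {"cuda", "cpu"},
    "model_name": {"FastText", "TextCNN", "TextRNN", "TextRCNN", "DRNN",
                   "VDCNN", "DPCNN", "AttentiveConvNet", "Transformer"},
    "optimizer_type": {"Adam", "Adadelta"},
    "log_level": {"debug", "info", "warn", "error"},
}
# Fields documented as requiring an odd number.
MUST_BE_ODD = {"region_size", "margin_size"}


def find_problems(settings):
    """Return a list of human-readable problems found in a flat settings dict."""
    problems = []
    for name, value in settings.items():
        if name in ALLOWED and value not in ALLOWED[name]:
            problems.append(f"{name}: unexpected value {value!r}")
        if name in MUST_BE_ODD and value % 2 == 0:
            problems.append(f"{name}: must be odd, got {value}")
    return problems


print(find_problems({"device": "cpu", "region_size": 5}))
print(find_problems({"device": "tpu", "margin_size": 4}))
```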