A multi-layer perceptron network for artificial text and character generation
CharNet is a feed-forward multilayer perceptron network in which each hidden layer receives the outputs of all previous layers as input.
CharNet serves as a human-readable example demonstrating improvements over common multilayer perceptron models and RNNs for sequential data.
Various parameters are supported, most of them boolean switches, so you can configure your own model.
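The core idea, each hidden layer seeing the concatenation of the input and all earlier layer outputs (DenseNet-style connectivity applied to an MLP), can be sketched in plain NumPy. This is an illustrative sketch, not CharNet's actual implementation; the layer sizes and ReLU activation below are assumptions chosen for brevity:

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass of a densely connected MLP: every hidden layer
    receives the concatenation of the input and all previous outputs."""
    outputs = [x]
    for W, b in zip(weights, biases):
        concat = np.concatenate(outputs, axis=-1)  # input + all previous layers
        outputs.append(np.maximum(0.0, concat @ W + b))  # ReLU activation
    return outputs[-1]

rng = np.random.default_rng(0)
n_in, n_hidden, n_layers = 8, 4, 3
weights, biases = [], []
fan_in = n_in
for _ in range(n_layers):
    weights.append(rng.normal(size=(fan_in, n_hidden)) * 0.1)
    biases.append(np.zeros(n_hidden))
    fan_in += n_hidden  # the next layer also sees this layer's output

y = forward(rng.normal(size=(2, n_in)), weights, biases)
print(y.shape)  # (2, 4)
```

Note how the weight matrix of each successive layer grows: the fan-in increases by the width of every layer added before it.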
To get started, clone this repository, import charnet and execute it with your configuration. For a more in-depth tutorial, you can also follow the notebook here or execute the notebook using a cloud GPU on Google Colaboratory.
A 3GB example dataset created from all books found on the Full Books site can be downloaded here.
A 500MB dataset based on the data from textfiles can be downloaded here. This dataset is significantly noisier.
Both datasets have been formatted to use a minimalistic character set.
A third, significantly smaller dataset contains all tweets by Donald Trump, as seen here. It is only 5MB, still contains links and has not been formatted yet. It can be found here.
Lastly, there is also a dump of the Linux kernel with comments removed, which can be found here. It has not been formatted either.
If you want to try this project out, it is recommended to copy these datasets to your Google Drive and then use them in Google Colab.
Parameter name | Description | Datatype | Default value |
---|---|---|---|
leakyReLU | LeakyReLU is a rectifier function which can be used as an activation layer. See this graphic for more information. | Boolean | False |
batchNorm | Batch Normalization is a technique that can be used in keras to achieve better results by normalizing the activations of the previous layer. | Boolean | False |
trainNewModel | Specifies whether a new model should be trained or an old one should be loaded to continue training. | Boolean | True |
repeatInput | Append the input layer to the input of every layer. | Boolean | True |
unroll | When using an LSTM on a non-GPU device, the LSTM can be unrolled to reduce computational cost at the price of drastically increased RAM usage. | Boolean | True |
splitInputs | If a large number of inputs is used, the inputs can be split to many perceptrons which all receive only part of the input. | Boolean | False |
initialLSTM | Use an LSTM as the first layer after the input layer. LSTMs tend to have a better performance for sequence prediction, yet are against the idea of an MLP. | Boolean | False |
inputDense | Adds a smaller layer in front of the hidden layers to reduce calculation. The number of neurons in it is equal to the number of inputs. | Boolean | False |
splitLayer | Similar to splitInputs, layers can be split as well when they receive many inputs. This particular implementation also allows data to be fed from left to right through the model. Using this parameter is highly discouraged, as it disables every form of parallelization. | Boolean | False |
concatDense | Concatenate all previous layers to one big input layer for the next hidden layer. See model.png for more information. This is the main point of research of CharNet. | Boolean | True |
bidirectional | If an LSTM is used, it can be made bidirectional. Bidirectional LSTMs tend to understand constructs better but also require twice the time to train and evaluate. | Boolean | True |
concatBeforeOutput | If activated, all previous layers will be concatenated to one large layer to be fed into the output layer. See model.png for more information. | Boolean | True |
drawModel | Draws a graphic as seen in model.png using keras' built-in functions to help visualize the network's architecture. | Boolean | True |
gpu | Specifies whether a NVIDIA GPU is used or not. Using an NVIDIA GPU allows for massive optimizations in LSTM calculation. | Boolean | True |
indexIn | If activated, indexes or labels are used as inputs instead of one-hot encoded labels. For more information, see this blog post. | Boolean | False |
classNeurons | If activated, the provided number of neurons per layer is automatically multiplied by the number of classes. Each hidden layer then has significantly more neurons, allowing for higher performance when using one-hot encoding. | Boolean | True |
inputs | Specifies the number of characters used as an input for the neural network. | Integer | 60 |
neuronsPerLayer | Specifies the number of neurons in each hidden layer. In most cases 2*inputs is sufficient. | Integer | 120 |
layerCount | The total number of hidden layers used in the network. Generally four is the highest recommended value. | Integer | 4 |
epochs | Number of real epochs where the entire dataset was passed through the neural network. | Integer | 1 |
kerasEpochsPerEpoch | To receive frequent updates on large datasets, one can set this parameter so that keras calls the callbacks more often. Warning: This does interfere with the training optimizer, potentially causing negative results. | Integer | 256 |
learningRate | The rate the neural network learns with. The most important hyperparameter to test and tweak. | Float | 0.005 |
outputs | Number of outputs in characters. It is highly discouraged to use more than one output. | Integer | 1 |
dropout | Dropout is a regularization technique to make sure the network learns instead of memorizing training data. Values from 0.15 to 0.4 can be used without significantly harming the network. | Float | 0.35 |
batchSize | The batch size is the number of training samples the neural network uses to make one update to the weights. A higher number improves the speed and reduces overfitting but might also harm the network. | Integer | 1024 |
valSplit | Specifies how much of the input dataset should be used for validation at the end of every keras epoch. Generally 100MB or more should be set aside for validation. | Float | 0.1 |
verbose | Used to set a verbosity level in keras. According to keras' documentation: 0 = silent, 1 = progress bar, 2 = one line per epoch. | Integer | 1 |
outCharCount | Number of characters to be recursively generated using a test string at every end of a keras epoch. | Integer | 512 |
changePerKerasEpoch | Sets the percentage of the original batch size that should be added to the current batch size after every keras epoch. Example: Epoch 1: Batch Size 100; Epoch 2: Batch Size 200; Epoch 3: Batch Size 300. | Float | 0.25 |
activation | Activation function used in the neural network. The default is gelu but other activation functions can be used as well. | String | 'gelu' |
weightFolderName | Defines the name of the folder weights are saved to and loaded from. | String | 'MLP_Weights' |
testString | A string that is used as an input for the neural network to predict at the end of every keras epoch. If set to None, it will default to this string. | String | None |
charSet | The set of characters used in the input text. Leaving it at None will make the network assume it is this char set. A text can be formatted to this char set by passing prepareText=True to your charnet instance or by explicitly calling charnet.prepareText(). | String | None |
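The defaults above can be collected into a plain Python dict using the parameter names from the table. Note this is only a sketch: how (or whether) such a dict is passed to charnet depends on the current interface, and a config dict is still listed as a TODO below.

```python
# Default configuration, mirroring the parameter table above.
config = {
    "leakyReLU": False,
    "batchNorm": False,
    "trainNewModel": True,
    "repeatInput": True,
    "unroll": True,
    "splitInputs": False,
    "initialLSTM": False,
    "inputDense": False,
    "splitLayer": False,       # highly discouraged: disables parallelization
    "concatDense": True,       # the main point of research of CharNet
    "bidirectional": True,
    "concatBeforeOutput": True,
    "drawModel": True,
    "gpu": True,
    "indexIn": False,
    "classNeurons": True,
    "inputs": 60,
    "neuronsPerLayer": 120,    # 2 * inputs is usually sufficient
    "layerCount": 4,
    "epochs": 1,
    "kerasEpochsPerEpoch": 256,
    "learningRate": 0.005,
    "outputs": 1,              # more than one output is discouraged
    "dropout": 0.35,
    "batchSize": 1024,
    "valSplit": 0.1,
    "verbose": 1,
    "outCharCount": 512,
    "changePerKerasEpoch": 0.25,
    "activation": "gelu",
    "weightFolderName": "MLP_Weights",
    "testString": None,
    "charSet": None,
}
```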
- Explain parameters
- Make code humanly readable
- Add config dict instead of enforcing parameters to be assigned
- Create Notebooks to link to
- Add link to example datasets
- Add a proper example showing most interface functions, using the Trump dataset.
- Add example output