Glove

Cython general implementation of the Glove multi-threaded training.

GloVe is an unsupervised learning algorithm for generating vector representations for words. Training is done using a co-occcurence matrix from a corpus. The resulting representations contain structure useful for many other tasks.

The paper describing the model is here.

The original implementation for this Machine Learning model can be found here.

@author Jonathan Raiman

Example

To use this package you need a sparse co-occurence matrix. This matrix is represented by nested dictionaries that use ints as keys with a 0-index.

For instance below we have a corpus of 3 indices. Below 0 co-occurs with 2, 3.5 times:

import glove

cooccur = {
	0: {
		0: 1.0,
		2: 3.5
	},
	1: {
		2: 0.5
	},
	2: {
		0: 3.5,
		1: 0.5,
		2: 1.2
	}
}

model = glove.Glove(cooccur, d=50, alpha=0.75, x_max=100.0)

for epoch in range(25):
    err = model.train(batch_size=200, workers=9, batch_size=50)
    print("epoch %d, error %.3f" % (epoch, err), flush=True)

The trained embeddings are now present under model.W.

Usage

The model is controlled by setting several hyperpameters.

Glove.init()

cooccurence dict<int, dict<int, float>> : the co-occurence matrix
alpha float : (default 0.75) hyperparameter for controlling the exponent for normalized co-occurence counts.
x_max float : (default 100.0) hyperparameter for controlling smoothing for common items in co-occurence matrix.
d int : (default 50) how many embedding dimensions for learnt vectors
seed int : (default 1234) the random seed

Glove.train

step_size float : the learning rate for the model
workers int : number of worker threads used for training
batch_size int : how many examples should each thread receive (controls the size of the job queue)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
glove		glove
.gitignore		.gitignore
MANIFEST.in		MANIFEST.in
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Glove

Example

Usage

Glove.init()

Glove.train

About

Releases

Packages

Languages

JonathanRaiman/glove

Folders and files

Latest commit

History

Repository files navigation

Glove

Example

Usage

Glove.init()

Glove.train

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages