An R package for estimating and doing statistical inference on context-specific word embeddings.

prodriguezsosa/conText

About

conText provides a fast, flexible and transparent framework for two tasks: estimating context-specific word and short-document embeddings using the "à la carte" embeddings approach developed by Khodak et al. (2018), and evaluating hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021).
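
To make the "à la carte" idea concrete, here is a minimal base-R sketch, not the package's own implementation. It assumes the usual formulation from Khodak et al. (2018): average the pre-trained embeddings of the words in a context, then multiply by a transformation matrix. The toy values and the helper name `alc_embedding` are made up for illustration.

```r
# Toy pre-trained embeddings: 4 words, 2 dimensions (values are made up).
pre_trained <- matrix(
  c(1, 0,
    0, 1,
    1, 1,
    2, 0),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("tax", "cut", "border", "wall"), NULL)
)

# Toy transformation matrix (made up; in practice it is estimated from a
# large corpus and is specific to the pre-trained embeddings).
A <- matrix(
  c(2, 0,
    0, 0.5),
  ncol = 2, byrow = TRUE
)

# Hypothetical helper: a la carte embedding of a single context.
alc_embedding <- function(context_words, pre_trained, A) {
  # Keep only words that have a pre-trained embedding (repeats are kept).
  known <- context_words[context_words %in% rownames(pre_trained)]
  stopifnot(length(known) > 0)
  # Average the context word embeddings, then apply the transformation.
  avg <- colMeans(pre_trained[known, , drop = FALSE])
  drop(A %*% avg)
}

alc_embedding(c("tax", "cut"), pre_trained, A)
```

In the package itself these steps are handled by its own functions; the sketch only shows the arithmetic behind a single embedding.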

How to Install

install.packages("conText")

Datasets

To use conText you will need three objects:

  1. A (quanteda) corpus with the documents and corresponding document variables you want to evaluate.
  2. A set of (GloVe) pre-trained embeddings.
  3. A transformation matrix specific to the pre-trained embeddings.

conText includes sample objects for all three, but keep in mind that these are only meant to illustrate function implementations. In this Dropbox folder we have included the raw versions of these objects, including the full Stanford GloVe 300-dimensional embeddings (labeled glove.rds) and the corresponding transformation matrix estimated by Khodak et al. (2018) (labeled khodakA.rds).
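
Once the two .rds files are downloaded, they can be read with base R. The sketch below makes several assumptions: the helper `load_context_inputs` is hypothetical, the folder path is a placeholder, and the embeddings are taken to be a matrix with one row per word, with a square transformation matrix of matching dimension.

```r
# Hypothetical helper: read the raw embeddings and transformation matrix
# from a local folder and check that their dimensions agree.
load_context_inputs <- function(dir) {
  glove_path <- file.path(dir, "glove.rds")
  transform_path <- file.path(dir, "khodakA.rds")
  stopifnot(file.exists(glove_path), file.exists(transform_path))

  glove <- readRDS(glove_path)          # assumed: matrix, one row per word
  transform <- readRDS(transform_path)  # assumed: square matrix

  # The transformation matrix must match the embedding dimensionality.
  stopifnot(
    is.matrix(glove), is.matrix(transform),
    nrow(transform) == ncol(transform),
    ncol(transform) == ncol(glove)
  )
  list(pre_trained = glove, transform = transform)
}

# Usage (the folder name is a placeholder, not a path from the package):
# inputs <- load_context_inputs("~/data/conText")
```

The dimension check is there because a transformation matrix only applies to the embeddings it was estimated for, as item 3 in the list above notes.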

Quick Start Guides

Check out this Quick Start Guide to get going with conText.
