RE-DS-Word-Attention-Models

Paper Link: http://www.akbc.ws/2017/papers/15_paper.pdf

Abstract: Distant Supervision (DS) is a popular technique for developing relation extractors starting with limited supervision. We note that most of the sentences in the distant supervision setting are very long and may benefit from word attention for better sentence representation. Our contributions in this paper are threefold. Firstly, we propose two novel word attention models for distantly-supervised relation extraction: (1) a Bi-GRU based word attention model (BGWA), (2) an entity-centric attention model (EA), and a combination model which combines multiple complementary models using weighted voting method for improved relation extraction. Secondly, we introduce GDS, a new distant supervision dataset for relation extraction. GDS removes test data noise present in all previous distant-supervision benchmark datasets, making credible automatic evaluation possible. Thirdly, through extensive experiments on multiple real-world datasets, we demonstrate effectiveness of the proposed methods.

Software required

python 2.7, pytorch 0.1.12, cuda 8.0, numpy 1.12.1, sklearn 0.18.2

Running the Model Files

The folder Codes/Models/ has the files for the 3 models and their ensemble:

1. BGWA.py : Bi-GRU based word attention model
2. EA.py : Entity-centric attention model
3. PCNN.py : Piecewise convolutional neural model
4. ENSEMBLE.py : Ensemble model for relation extraction

Files 1, 2 and 3 (BGWA.py, EA.py and PCNN.py) can be run in the following way:

python2.7 <file> <data directory> <train file name> <test file name> <dev file name> <word embedding file name>

The command takes 5 arguments after the model file:

<data directory> : The name of the directory containing the processed files
<train file name> : The name (not the path) of the processed train file
<test file name> : The name (not the path) of the processed test file
<dev file name> : The name (not the path) of the processed dev file
<word embedding file name> : The name (not the path) of the processed word embedding file
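The argument order above can be checked before launching a long training run. The following is a minimal sketch, not part of the repository: the helper name build_command, the directory my_data/ and the file name embeddings.p are hypothetical and used only for illustration.

```python
import os


def build_command(model_file, data_dir, train, test, dev, embeddings, check=True):
    """Assemble the argument list for BGWA.py, EA.py or PCNN.py.

    The order follows the usage line: data directory first, then the
    train, test, dev and word embedding file names (names, not paths).
    """
    names = [train, test, dev, embeddings]
    if check:
        # Each name is expected to live directly inside data_dir.
        missing = [n for n in names
                   if not os.path.isfile(os.path.join(data_dir, n))]
        if missing:
            raise IOError("not found in %s: %s" % (data_dir, ", ".join(missing)))
    return ["python2.7", model_file, data_dir] + names


if __name__ == "__main__":
    # Hypothetical directory and embedding file name; check=False skips
    # the existence test so the sketch runs anywhere.
    cmd = build_command("BGWA.py", "my_data/", "train_final.p",
                        "test_final.p", "dev_final.p", "embeddings.p",
                        check=False)
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.call from inside Codes/Models/ once the files exist.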

File 4 (ENSEMBLE.py) can be run using the following command:

python2.7 ENSEMBLE.py <path_to_dataset_ensemble_files>

e.g: Codes/Models$ python2.7 ENSEMBLE.py ../../Data/Ensemble_Data/gids/
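For readers unfamiliar with weighted voting, the idea mentioned in the abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the logic of ENSEMBLE.py: the function name weighted_vote, the score layout (one list of per-relation scores per model) and the weight normalisation are all hypothetical.

```python
def weighted_vote(model_scores, weights):
    """Combine per-model relation scores for one instance by weighted voting.

    model_scores: one list per model, each holding one score per relation.
    weights: one non-negative weight per model.
    Returns (index of the winning relation, combined scores).
    """
    if len(model_scores) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_rel = len(model_scores[0])
    combined = [0.0] * n_rel
    for scores, w in zip(model_scores, weights):
        if len(scores) != n_rel:
            raise ValueError("all models must score the same relations")
        for r, s in enumerate(scores):
            # Normalised weight times this model's score for relation r.
            combined[r] += (w / total) * s

    best = max(range(n_rel), key=lambda r: combined[r])
    return best, combined


if __name__ == "__main__":
    # Three hypothetical models scoring two relations for one instance.
    scores = [[0.7, 0.3], [0.2, 0.8], [0.4, 0.6]]
    print(weighted_vote(scores, [1, 1, 1]))
    print(weighted_vote(scores, [3, 1, 0]))
```

With equal weights the second relation wins; giving the first model most of the weight flips the decision to the first relation.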

Before using the ensemble model on the Riedel2010 dataset, extract its preprocessed files by running the following command:

Codes/Models$ unzip ../../Data/Ensemble_Data/riedel2010/preprocessed_dataset.zip

Preprocessing the files

The folder Codes/Preprocess/ has the files for preprocessing the data. For the Riedel2010 dataset, just run the file preprocess.sh to get the output files in the same folder. There will be some intermediate files, but the final processed files will have the following names:

train_final.p : The processed train files
test_final.p : The processed test files
dev_final.p : The processed dev files

For the GIDS dataset, just run the file preprocess_GIDS.sh to get the output files in the same folder. There will be some intermediate files, but the final processed files will have the following names:

train_final.p : The processed train files
test_final.p : The processed test files
dev_final.p : The processed dev files

For the two datasets used in the paper, pre-processed files are available at 'Data/Ensemble_Data/gids/preprocessed_dataset/' and 'Data/Ensemble_Data/riedel2010/preprocessed_dataset.zip' (extract the archive in the same folder).
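To inspect a processed file, a sketch like the one below may help. It assumes the .p suffix means the files are Python pickles, which this README does not state; the helper name load_processed is hypothetical, and the structure of the loaded object is not documented here, so the example only prints its type.

```python
import pickle
import sys


def load_processed(path):
    """Load one processed file, ASSUMING it is a pickle (suggested by '.p')."""
    with open(path, "rb") as f:
        if sys.version_info[0] >= 3:
            # Pickles written under Python 2 often need an explicit
            # encoding when read under Python 3.
            return pickle.load(f, encoding="latin1")
        return pickle.load(f)


if __name__ == "__main__":
    # Pass the path to a processed file, e.g. train_final.p.
    if len(sys.argv) > 1:
        data = load_processed(sys.argv[1])
        print(type(data))
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.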

Saved Models and PR files

All saved models and PR files are available at the following public link:

Results_&_ Models

(Generated Precision-Recall (PR) files have precision in 1st column and recall in the second column)
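The column layout above can be read back as in the following sketch. It assumes each PR file is plain text with whitespace-separated columns and no header row, which is not confirmed here; the function names read_pr_file and area_under_pr are hypothetical, and the trapezoidal area is one common summary, not necessarily the one used in the paper.

```python
import sys


def read_pr_file(path):
    """Read (precision, recall) pairs: column 1 is precision, column 2 is recall."""
    points = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or short lines
            points.append((float(parts[0]), float(parts[1])))
    return points


def area_under_pr(points):
    """Trapezoidal area under the precision-recall curve, integrated over recall."""
    pts = sorted(points, key=lambda p: p[1])  # order by recall
    area = 0.0
    for (p0, r0), (p1, r1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


if __name__ == "__main__":
    # Pass the path to a downloaded PR file as the first argument.
    if len(sys.argv) > 1:
        print(area_under_pr(read_pr_file(sys.argv[1])))
```

If the files turn out to be comma-separated or to carry a header line, the parsing in read_pr_file needs adjusting accordingly.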
