seq2rel: A sequence-to-sequence approach for document-level relation extraction

The corresponding code for our paper: A sequence-to-sequence approach for document-level relation extraction. Check out our demo here!

Installation

This repository requires Python 3.7.1 or later.

Setting up a virtual environment

Before installing, you should create and activate a Python virtual environment. If you need pointers on setting up a virtual environment, please see the AllenNLP install instructions.

Installing the library and dependencies

If you don't plan on modifying the source code, install from git using pip:

pip install git+https://github.com/JohnGiorgi/seq2rel.git

Otherwise, clone the repository and install from source using Poetry:

# Install poetry for your system: https://python-poetry.org/docs/#installation
curl -sSL https://install.python-poetry.org | python3 -

# Clone and move into the repo
git clone https://github.com/JohnGiorgi/seq2rel
cd seq2rel

# Install the package with poetry
poetry install

Usage

Preparing a dataset

Datasets are tab-separated files, with one example per line. The first column contains the text, and the last column contains the relation. Relations themselves must be serialized to strings.

Take the following example, which denotes an adverse drug event ("@ADE@") between the drug benzodiazepine ("@DRUG@") and the effect coma ("@EFFECT@"):

A review of the literature showed no previous description of this pattern in benzodiazepine coma.	@ADE@ benzodiazepine @DRUG@ coma @EFFECT@ @EOR@
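
The format above can be produced and read back with plain string handling. The following is a minimal sketch and not part of seq2rel: the helper names `format_example` and `parse_example` are hypothetical, and it assumes that neither the text nor the relation string contains a tab or a newline.

```python
def format_example(text: str, relation: str) -> str:
    """Join a text and its serialized relation into one tab-separated line."""
    for field in (text, relation):
        # Tabs or newlines inside a field would break the one-example-per-line layout.
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return f"{text}\t{relation}"


def parse_example(line: str) -> tuple:
    """Return (text, relation): the first and last columns of a dataset line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) < 2:
        raise ValueError("expected at least two tab-separated columns")
    return columns[0], columns[-1]


text = (
    "A review of the literature showed no previous description "
    "of this pattern in benzodiazepine coma."
)
relation = "@ADE@ benzodiazepine @DRUG@ coma @EFFECT@ @EOR@"
line = format_example(text, relation)
print(parse_example(line) == (text, relation))  # prints True
```

Reading the first and last columns, rather than exactly two, follows the description above and tolerates files that carry extra columns in between.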

For convenience, we provide a second package, seq2rel-ds, which makes it easy to generate data in this format for various popular corpora.

Training

To train the model, use the allennlp train command with one of our configs (or write your own!).

For example, to train a model on the Adverse Drug Event (ADE) corpus, first preprocess this data with seq2rel-ds:

seq2rel-ds preprocess ade "path/to/preprocessed/ade"

Then, call allennlp train with the ADE config we have provided:

allennlp train "training_config/transformer_copynet_ade.jsonnet" \
    --serialization-dir "output" \
    --overrides "{'train_data_path': 'path/to/preprocessed/ade/train.tsv'}" \
    --include-package "seq2rel"

The --overrides flag allows you to override any field in the config with a JSON-formatted string, but you can equivalently update the config itself if you prefer. During training, models, vocabulary, configuration, and log files will be saved to the directory provided by --serialization-dir. This can be changed to any directory you like.
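
If you launch training from a script, the overrides string can be generated from a dictionary instead of being written by hand, which avoids quoting mistakes. The sketch below is an illustration under assumptions, not part of seq2rel: `build_train_command` is a hypothetical helper, and it only assembles the same command shown above without running it.

```python
import json
import shlex


def build_train_command(config_path: str, serialization_dir: str, overrides: dict) -> list:
    """Assemble the `allennlp train` invocation as an argument list."""
    return [
        "allennlp", "train", config_path,
        "--serialization-dir", serialization_dir,
        # json.dumps produces a valid JSON string for the --overrides flag.
        "--overrides", json.dumps(overrides),
        "--include-package", "seq2rel",
    ]


command = build_train_command(
    "training_config/transformer_copynet_ade.jsonnet",
    "output",
    {"train_data_path": "path/to/preprocessed/ade/train.tsv"},
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(argument) for argument in command))
```

The argument list can be passed directly to `subprocess.run(command, check=True)`, which sidesteps shell quoting entirely.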

Inference

To use the model as a library, import Seq2Rel and pass it some text (it accepts both strings and lists of strings):

from seq2rel import Seq2Rel

# Pretrained models are stored on GitHub and are downloaded and cached automatically. This model is ~500 MB.
pretrained_model = "ade"

# Models are loaded via a dead-simple interface
seq2rel = Seq2Rel(pretrained_model)

# Extremely flexible inputs. User can provide...
# - a string
# - a list of strings
# - a text file (local path or URL)
input_text = "Ciprofloxacin-induced renal insufficiency in cystic fibrosis."

seq2rel(input_text)
>>> ['ciprofloxacin @DRUG@ renal insufficiency @EFFECT@ @ADE@']
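
The model returns relations as flat strings, so downstream code usually needs to split them back into entities and labels. The following is a minimal, independent sketch of one way to do that. It is not the library's own parsing logic, and `parse_linearized` is a hypothetical name. It assumes that special tokens are uppercase letters or underscores wrapped in `@`, that text immediately before a special token is an entity of that type, and that a special token with no preceding text is a relation label or marker (such as `@ADE@` or `@EOR@`). A string holding several relations would be flattened into one list by this sketch.

```python
import re

# Special tokens such as @DRUG@, @EFFECT@, @ADE@ or @EOR@ (assumed pattern).
SPECIAL_TOKEN = re.compile(r"@([A-Z_]+)@")


def parse_linearized(output: str):
    """Split a linearized relation string into typed entities and bare markers."""
    entities = []  # (entity text, entity type) pairs
    markers = []   # special tokens with no preceding text
    position = 0
    for match in SPECIAL_TOKEN.finditer(output):
        preceding = output[position:match.start()].strip()
        label = match.group(1)
        if preceding:
            entities.append((preceding, label))
        else:
            markers.append(label)
        position = match.end()
    return entities, markers


entities, markers = parse_linearized(
    "ciprofloxacin @DRUG@ renal insufficiency @EFFECT@ @ADE@"
)
print(entities)  # [('ciprofloxacin', 'DRUG'), ('renal insufficiency', 'EFFECT')]
print(markers)   # ['ADE']
```

Because the function does not depend on where the relation label appears, it also handles the training example from "Preparing a dataset", where `@ADE@` comes first and `@EOR@` closes the relation.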

See the list of available PRETRAINED_MODELS in seq2rel/seq2rel.py:

python -c "from seq2rel import PRETRAINED_MODELS ; print(list(PRETRAINED_MODELS.keys()))"
