Add relevance transfer package #14

achyudh · 2019-04-19T21:35:32Z

No description provided.

* Fix package imports
* Update README.md
* Fix bug due to TAR/AR attribute check
* Add BERT models
* Add BERT tokenizer
* Return logits from the model.py
* Remove unused classes in models/bert
* Return logits from the model.py (#12)
* Remove unused classes in models/bert (#13)
* Add initial main file
* Add args for BERT
* Add partial support for BERT
* Initialize training and optimization
* Draft the structure of Trainers for BERT
* Remove duplicate tokenizer
* Add utils
* Move optimization to utils
* Add more structure for trainer
* Refactor the trainer (#15)
* Refactor the trainer
* Add more edits
* Add support for our datasets
* Add evaluator
* Split data4bert module into multiple processors
* Refactor BERT tokenizer
* Integrate BERT into Castor framework (#17)
* Remove unused classes in models/bert
* Split data4bert module into multiple processors
* Refactor BERT tokenizer
* Add multilabel support in BertTrainer
* Add multilabel support in BertEvaluator
* Add get_test_samples method in dataset processors
* Fix args.py for BERT
* Add support for Reuters, IMDB datasets for BERT
* Revert "Integrate BERT into Castor framework (#17)" (this reverts commit e4244ec)
* Fix paths to datasets in dataset classes and args
* Add SST dataset
* Add hedwig-data instructions to README.md
* Fix KimCNN README
* Fix RegLSTM README
* Fix typos in README
* Remove trec_eval from README
* Add tensorboardX to requirements.txt
* Rename processors module to bert_processors
* Add method to print metrics after training
* Add model check-pointing and early stopping for BERT
* Add logos
* Update README.md
* Fix code comments in classification trainer
* Add support for AAPD, Sogou, AGNews and Yelp2014
* Fix bug that deleted saved models
* Update README for HAN
* Update README for XML-CNN
* Remove redundant TODOs from the READMEs
* Fix logo in README.md
* Update README for Char-CNN
* Fix all the READMEs
* Resolve conflict
* Fix Typos
* Re-Add SST2 Processor
* Add support for evaluating trained model
* Update args.py
* Resolve issues due to DataParallel wrapper on saved model
* Remove redundant Yelp processor
* Fix bug for safely creating the saving directory
* Change checkpoint paths to timestamps
* Remove unwanted string.strip() from tokenizer
* Create save path if it doesn't exist
* Decouple model checkpoints from code
* Remove model choice restrictions for BERT
* Remove model/distill driver
* Simplify checkpoint directory creation

Ashutosh-Adhikari

LGTM!

achyudh and others added 11 commits April 13, 2019 23:25

Resolve conflicts in the dev fork

cb14201

Merge branch 'karkaroff-master'

8346514

Resolve merge conflicts in README.md

fff8e0a

Add TREC relevance datasets

0979f77

Add relevance transfer trainer and evaluator

e5f2ee0

Add re-ranking module

57f0680

Add ImbalancedDatasetSampler

7d26d71

Add relevance transfer package

eab4fc2

Fix import in classification trainer

a08b2d1

Merge remote-tracking branch 'castorini/master'

cb3ca31

achyudh requested a review from Ashutosh-Adhikari April 19, 2019 21:35

achyudh self-assigned this Apr 19, 2019

achyudh added the "enhancement" (New feature or request) label Apr 19, 2019

Ashutosh-Adhikari approved these changes Apr 20, 2019

achyudh merged commit 99a01c6 into castorini:master Apr 20, 2019
