Fine-tuning BERT with Distilled Data for Semantic Similarity, Textual Entailment and Word Sense Disambiguation
In this project:
- Create distilled training data for the QQP, RTE, and WiC tasks (from the GLUE and SuperGLUE benchmarks)
- Fine-tune BERT on the distilled samples (a minimal fine-tuning sketch follows this list)
- Compare its performance, in terms of accuracy and compute time, with models fine-tuned on the full datasets.
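
Below is a minimal sketch of the fine-tuning and evaluation step, assuming `bert-base-uncased` and the Hugging Face Transformers/Datasets libraries. It is not the project's actual training script: the distilled set is stood in for by a random subset of RTE, and the subset size `N_DISTILLED` is a hypothetical placeholder. The same loop applies to QQP and WiC with the appropriate dataset and text columns.

```python
import time
import numpy as np
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

MODEL_NAME = "bert-base-uncased"
N_DISTILLED = 500  # hypothetical size of the distilled training set

# RTE is a sentence-pair task with columns: sentence1, sentence2, label
raw = load_dataset("glue", "rte")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, padding="max_length", max_length=128)

encoded = raw.map(tokenize, batched=True)

# Stand-in for the distilled data: a small random subset of the training split.
distilled_train = encoded["train"].shuffle(seed=42).select(range(N_DISTILLED))

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

args = TrainingArguments(
    output_dir="rte-distilled",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=distilled_train,
    eval_dataset=encoded["validation"],
    compute_metrics=accuracy,
)

# Track wall-clock training time so the distilled run can be compared
# against a run on the full training set.
start = time.time()
trainer.train()
print(f"Training time: {time.time() - start:.1f}s")
print(trainer.evaluate())
```

Running the same script with `encoded["train"]` in place of `distilled_train` gives the full-dataset baseline for the accuracy and compute-time comparison.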