Deep QA

This repository contains code for training deep learning systems to do question answering tasks. Our primary focus is on Aristo's science questions, though we can run various models on several popular datasets.

This code is a mix of scala (for data processing / pipeline management) and python (for actually training and executing deep models with Keras / Theano / TensorFlow).

Implemented models

This repository implements several variants of memory networks, including the models found in these papers:

We also implement some of our own, as-yet-unpublished variants. The models in these papers have a lot in common, and our code is structured to make switching between them easy. For a description of how we've built an extensible memory network architecture in this library, see this readme.

Datasets

This code allows for easy experimentation with the following datasets:

And more to come... In the near future, we hope to also include easy experimentation with CNN/Daily Mail and SimpleQuestions.

Usage Guide

This code is a mix of scala and python. The intent is that the data processing and experiment pipeline code is in scala, and the deep learning code is in python. The recommended approach is to set up your experiments in scala code, then run them through sbt. Some documentation on how to do this is found in the README for the org.allenai.deep_qa.experiments package.

Running experiments with python

If you would rather not use the scala pipeline when running experiments, you can run the python code manually. To do this, from the base directory, run the command python src/main/python/run_solver.py [model_config]. You must use python >= 3.5, as we make heavy use of the type hints introduced in python 3.5 to aid in code readability (I recommend using anaconda to set up python 3, if you don't have it set up already).
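As a minimal sketch of the invocation described above, the snippet below builds the command line for one configuration file and checks the interpreter version first. The helper name build_run_command and the file name my_experiment.json are hypothetical and not part of this repository; only the script path and the version requirement come from the text above.

```python
import sys
from typing import List

# Minimum interpreter version stated in this README.
MIN_PYTHON = (3, 5)


def build_run_command(model_config: str) -> List[str]:
    """Build the argv list that runs run_solver.py on one config file.

    `build_run_command` is a hypothetical helper, not part of the repository.
    """
    if sys.version_info < MIN_PYTHON:
        raise RuntimeError("python >= 3.5 is required")
    # The script path is relative to the repository's base directory.
    return [sys.executable, "src/main/python/run_solver.py", model_config]


# "my_experiment.json" is a placeholder for your own model config file.
command = build_run_command("my_experiment.json")
```

Passing the resulting list to subprocess.run(command, check=True) from the base directory should be equivalent to typing the command by hand.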

You can see some examples of what model configuration files look like in the example experiments directory. We try to keep these up to date, but the way parameters are specified is still sometimes in a state of flux, so we make no promises that these are actually usable with the current master (and you'll have to provide your own input files to use them, in any event). The most recently added or changed example experiment is your best bet for an accurate format. And if you find one that's out of date, submitting a pull request to fix it would be really nice!

The best way currently to get an idea of what options are available in this configuration file, and what those options mean, is to look at the class mentioned in the solver_class field. Looking at the dynamic_memory_network.json example, we can see that it's using a MultipleTrueFalseMemoryNetworkSolver as its solver_class. If we go to that class's __init__ method in the code, we don't see any parameters, because MultipleTrueFalseMemoryNetworkSolver has no unique parameters of its own. So, we continue up the class hierarchy to MemoryNetworkSolver, and we can see the parameters that it takes: things like num_memory_layers, knowledge_encoder, entailment_model, and so on. If you continue on to its superclass, TextTrainer, you'll find more parameters, this time for things that deal with word embeddings and sentence encoders. Finally, you can continue to the base class, Trainer, to see parameters for things like whether and where models should be saved, how to run training, specifying debug output, running pre-training, and other things. It would be nice to automatically generate some website to document all of these parameters, but I don't know how to do that and don't have the time to dedicate to making it happen. So for now, just read the comments that are in the code.
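The walk up the class hierarchy can be pictured with the simplified sketch below. It is not the repository's actual code: the class names, solver_class and num_memory_layers come from the text above, while num_epochs, embedding_size, all default values, and the rejection of unrecognized keys are assumptions made for illustration.

```python
from typing import Any, Dict


class Trainer:
    """Base class: options for saving models, running training, debugging."""

    def __init__(self, params: Dict[str, Any]) -> None:
        # "num_epochs" and its default are hypothetical.
        self.num_epochs = params.pop("num_epochs", 20)
        if params:
            # Assumption: keys no class in the chain claimed are an error.
            raise ValueError("unrecognized parameters: " + ", ".join(sorted(params)))


class TextTrainer(Trainer):
    """Adds options for word embeddings and sentence encoders."""

    def __init__(self, params: Dict[str, Any]) -> None:
        # "embedding_size" and its default are hypothetical.
        self.embedding_size = params.pop("embedding_size", 50)
        super().__init__(params)


class MemoryNetworkSolver(TextTrainer):
    """Adds the memory-network options."""

    def __init__(self, params: Dict[str, Any]) -> None:
        # Option name from the README; the default value is a placeholder.
        self.num_memory_layers = params.pop("num_memory_layers", 1)
        super().__init__(params)


class MultipleTrueFalseMemoryNetworkSolver(MemoryNetworkSolver):
    """No unique parameters, so no __init__ of its own."""


# A hypothetical config, as it might look after loading the JSON file.
config = {
    "solver_class": "MultipleTrueFalseMemoryNetworkSolver",
    "num_memory_layers": 2,
    "num_epochs": 5,
}
params = dict(config)
solver_class_name = params.pop("solver_class")
solver = MultipleTrueFalseMemoryNetworkSolver(params)
```

Each class takes the options it knows about and hands the rest to its parent, which is why the options for one solver_class end up spread over several __init__ methods.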

There are several places where we give lists of available choices for particular options. For example, there is a list of concrete solver classes that are valid options for the solver_class parameter in a model config file. One way to find lists of available options for these parameters (other than just by tracing the handling of parameters in the code) is by searching github for get_choice or get_choice_with_default. This might point you, for instance, to the knowledge_encoders field in memory_network.py, which is imported from layers/knowledge_encoders.py, where it is defined at the bottom of the file. In general, the places where there are these kinds of options are in the solver class (already mentioned), and the various layers we have implemented - each kind of Layer will typically specify a list of options either at the bottom of the corresponding file, or in an associated __init__.py file (as is done with the sentence encoders).
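To make the search for get_choice and get_choice_with_default easier to interpret, here is a rough sketch of what helpers like these could look like. The signatures, the error handling, and the contents of the knowledge_encoders dictionary (encoder_a, encoder_b and their classes) are assumptions for illustration, not the repository's real definitions.

```python
from typing import Any, Dict, Iterable


def _check_choice(key: str, value: str, choices: Iterable[str]) -> str:
    """Return value if it is one of the allowed choices, else raise."""
    allowed = sorted(choices)
    if value not in allowed:
        raise ValueError("%s must be one of %s, got %r" % (key, allowed, value))
    return value


def get_choice(params: Dict[str, Any], key: str, choices: Iterable[str]) -> str:
    """Pop a required option; a missing key raises KeyError."""
    return _check_choice(key, params.pop(key), choices)


def get_choice_with_default(params: Dict[str, Any], key: str,
                            choices: Iterable[str], default: str) -> str:
    """Pop an optional option, falling back to default (assumed signature)."""
    return _check_choice(key, params.pop(key, default), choices)


class EncoderA:  # placeholder class
    pass


class EncoderB:  # placeholder class
    pass


# Hypothetical stand-in for the options listed at the bottom of a layer file.
knowledge_encoders = {"encoder_a": EncoderA, "encoder_b": EncoderB}

params = {"knowledge_encoder": "encoder_b"}
name = get_choice_with_default(params, "knowledge_encoder",
                               knowledge_encoders, "encoder_a")
encoder_class = knowledge_encoders[name]
```

Under this reading, the dictionary keys are exactly the strings that are valid in a model config file, so finding the dictionary tells you the available options.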

We've tried to also give reasonable documentation throughout the code, both in docstring comments and in READMEs distributed throughout the code packages, so browsing github should be pretty informative if you're confused about something. If you're still confused about how something works, open an issue asking to improve documentation of a particular piece of the code (or, if you've figured it out after searching a bit, submit a pull request containing documentation improvements that would have helped you).

Contributing

If you use this code and think something could be improved, pull requests are very welcome. Opening an issue is ok, too, but we're a lot more likely to respond to a PR. The primary maintainer of this code is Matt Gardner, with a lot of help from Pradeep Dasigi (who was the initial author of this codebase) and Mark Neumann.

License

This code is released under the terms of the Apache 2 license.