deep_question_answering

Implementation of "Teaching Machines to Read and Comprehend" proposed by Google DeepMind
git clone https://esimon.eu/repos/deep_question_answering.git
DeepMind : Teaching Machines to Read and Comprehend
===================================================

This repository contains an implementation of the two models (the Deep LSTM and the Attentive Reader) described in *Teaching Machines to Read and Comprehend* by Karl Moritz Hermann et al., NIPS, 2015. This repository also contains an implementation of a Deep Bidirectional LSTM.

The three models implemented in this repository are:

- `deepmind_deep_lstm` reproduces the experimental settings of the DeepMind paper for the Deep LSTM reader
- `deepmind_attentive_reader` reproduces the experimental settings of the DeepMind paper for the Attentive Reader
- `deep_bidir_lstm_2x128` implements a two-layer bidirectional LSTM reader
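
To give an idea of what the Attentive Reader computes, here is a minimal NumPy sketch of its attention step (simplified, with hypothetical variable names; the actual models in this repository are implemented with Theano and Blocks):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_read(Y, u, W_ym, W_um, w_ms):
    """One attention step of a simplified Attentive Reader.

    Y    : (T, d) document token encodings (bidirectional LSTM outputs)
    u    : (d,)   query encoding
    W_ym, W_um : (d, d) projection matrices
    w_ms : (d,)  scoring vector
    Returns the attention weights s and the document representation r.
    """
    M = np.tanh(Y @ W_ym + u @ W_um)   # (T, d) token/query match
    s = softmax(M @ w_ms)              # (T,)   attention over tokens
    r = s @ Y                          # (d,)   attention-weighted document
    return s, r

# Toy usage with random values
rng = np.random.default_rng(0)
T, d = 5, 8
s, r = attentive_read(rng.normal(size=(T, d)), rng.normal(size=d),
                      rng.normal(size=(d, d)), rng.normal(size=(d, d)),
                      rng.normal(size=d))
```

The attention weights `s` sum to one over the document tokens, which is what makes visualizations like the one shown below possible.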

## Our results

We trained each of the three models for 2 to 4 days on a Titan Black GPU and obtained the following results:

<table width="416" cellpadding="2" cellspacing="2">
<tr>
<td valign="top" align="center"> </td>
<td colspan="2" valign="top" align="center">DeepMind </td>
<td colspan="2" valign="top" align="center">Us </td>
</tr>
<tr>
<td valign="top" align="center"> </td>
<td colspan="2" valign="top" align="center">CNN </td>
<td colspan="2" valign="top" align="center">CNN </td>
</tr>
<tr>
<td valign="top" align="center"> </td>
<td valign="top" align="center">Valid </td>
<td valign="top" align="center">Test </td>
<td valign="top" align="center">Valid </td>
<td valign="top" align="center">Test </td>
</tr>
<tr>
<td valign="top" align="center">Attentive Reader </td>
<td valign="top" align="center"><b>61.6</b> </td>
<td valign="top" align="center"><b>63.0</b> </td>
<td valign="top" align="center">59.37 </td>
<td valign="top" align="center">61.07 </td>
</tr>
<tr>
<td valign="top" align="center">Deep Bidir LSTM </td>
<td valign="top" align="center">- </td>
<td valign="top" align="center">- </td>
<td valign="top" align="center"><b>59.76</b> </td>
<td valign="top" align="center"><b>61.62</b> </td>
</tr>
<tr>
<td valign="top" align="center">Deep LSTM Reader</td>
<td valign="top" align="center">55.0</td>
<td valign="top" align="center">57.0</td>
<td valign="top" align="center">46</td>
<td valign="top" align="center">47</td>
</tr>
</table>

Here is an example of the attention weights used by the Attentive Reader model:

<img src="https://raw.githubusercontent.com/thomasmesnard/DeepMind-Teaching-Machines-to-Read-and-Comprehend/master/doc/attention_weights_example.png" width="816px" height="652px" />


## Requirements

Software dependencies:

* [Theano](https://github.com/Theano/Theano) GPU computing library
* [Blocks](https://github.com/mila-udem/blocks) deep learning framework
* [Fuel](https://github.com/mila-udem/fuel) data pipeline for Blocks

Optional dependencies:

* Blocks Extras and a Bokeh server for the plot

We recommend using [Anaconda 2](https://www.continuum.io/downloads) and installing the dependencies with the following commands (where `pip` refers to the `pip` command from Anaconda):

    pip install git+git://github.com/Theano/Theano.git
    pip install git+git://github.com/mila-udem/fuel.git
    pip install git+git://github.com/mila-udem/blocks.git -r https://raw.githubusercontent.com/mila-udem/blocks/master/requirements.txt

Anaconda also includes a Bokeh server, but you still need to install `blocks-extras` if you want the plot:

    pip install git+git://github.com/mila-udem/blocks-extras.git

The corresponding dataset is provided by [DeepMind](https://github.com/deepmind/rc-data), but if the generation script does not work (or you are tired of waiting) you can use [this preprocessed version of the dataset](http://cs.nyu.edu/~kcho/DMQA/) by [Kyunghyun Cho](http://www.kyunghyuncho.me/).

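To the best of our knowledge, each `.question` file in the preprocessed dataset is plain text with five blank-line-separated fields: URL, passage, cloze-style question, answer, and entity map. A minimal parser sketch under that assumption (hypothetical helper, not part of this repository):

```python
def parse_question_file(text):
    """Split a CNN/Daily Mail `.question` file into its five fields.

    Assumed layout (fields separated by one blank line):
    URL, passage, question, answer, entity map.
    """
    url, passage, question, answer, entities = text.strip().split("\n\n", 4)
    # Entity map lines are assumed to look like "@entity0:Some Name"
    entity_map = dict(line.split(":", 1) for line in entities.splitlines())
    return {"url": url, "passage": passage, "question": question,
            "answer": answer, "entities": entity_map}

# Toy example in the assumed format
sample = """http://example.com/article

@entity0 visited @entity1 today .

@placeholder visited @entity1

@entity0

@entity0:Alice
@entity1:Paris"""

doc = parse_question_file(sample)
```

The entity markers (`@entity0`, `@placeholder`, ...) come from the anonymization step described in the paper, which replaces named entities with abstract ids.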

## Running

Set the environment variable `DATAPATH` to the folder containing the DeepMind QA dataset. The training questions are expected to be in `$DATAPATH/deepmind-qa/cnn/questions/training`.

Run:

    cp deepmind-qa/* $DATAPATH/deepmind-qa/

This will copy our vocabulary list `vocab.txt`, which contains a subset of all the words appearing in the dataset.
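
A sketch of how such a vocabulary file can be used to map tokens to integer ids (hypothetical code, with out-of-vocabulary words sent to an `<unk>` id; the repository's actual loading code may differ):

```python
def load_vocab(lines):
    """Map each word (one per line) to an integer id."""
    return {word.strip(): i for i, word in enumerate(lines)}

def encode(tokens, vocab, unk="<unk>"):
    """Replace each token by its id, falling back to the <unk> id."""
    unk_id = vocab[unk]
    return [vocab.get(t, unk_id) for t in tokens]

# Toy vocabulary; the real vocab.txt is read from disk
vocab = load_vocab(["<unk>", "the", "cat", "sat"])
ids = encode("the dog sat".split(), vocab)
```

Because `vocab.txt` only contains a subset of the dataset's words, the unknown-word fallback is what keeps rare tokens from crashing the pipeline.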

To train a model (see the list of models at the beginning of this file), run:

    ./train.py model_name

Be careful to set your `THEANO_FLAGS` correctly! For instance, you might want to use `THEANO_FLAGS=device=gpu0` if you have a GPU (highly recommended!).
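
Putting the steps together, a hypothetical training session might look like this (the paths and the `floatX` setting are placeholders for your own setup):

```shell
# Point DATAPATH at the folder holding the deepmind-qa dataset
export DATAPATH="$HOME/data"
# Run Theano on the first GPU (use device=cpu if you have none)
export THEANO_FLAGS="device=gpu0,floatX=float32"

# Copy the vocabulary file into the dataset folder, then train:
# cp deepmind-qa/* "$DATAPATH/deepmind-qa/"
# ./train.py deepmind_attentive_reader
echo "training with THEANO_FLAGS=$THEANO_FLAGS"
```

The actual `cp` and `./train.py` invocations are commented out above so the snippet can be pasted safely before the dataset is in place.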


## Reference

[Teaching Machines to Read and Comprehend](https://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend.pdf), by Karl Moritz Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman and Phil Blunsom, Neural Information Processing Systems, 2015.


## Credits

[Thomas Mesnard](https://github.com/thomasmesnard)

[Alex Auvolat](https://github.com/Alexis211)

[Étienne Simon](https://github.com/ejls)


## Acknowledgments

We would like to thank the developers of Theano, Blocks and Fuel at MILA for their excellent work.

We thank Simon Lacoste-Julien from the SIERRA team at INRIA for providing us with access to two Titan Black GPUs.