# seq2seq

**Repository Path**: fuyongxu/seq2seq

## Basic Information

- **Project Name**: seq2seq
- **Description**: Attention-based sequence to sequence learning
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-02-13
- **Last Updated**: 2022-02-13

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# seq2seq

Attention-based sequence to sequence learning

## Dependencies

* [TensorFlow 1.2+ for Python 3](https://www.tensorflow.org/get_started/os_setup.html)
* YAML and Matplotlib modules for Python 3: `sudo apt-get install python3-yaml python3-matplotlib`
* A recent NVIDIA GPU

## How to use

Train a model (CONFIG is a YAML configuration file, such as `config/default.yaml`):

    ./seq2seq.sh CONFIG --train -v

Translate text using an existing model:

    ./seq2seq.sh CONFIG --decode FILE_TO_TRANSLATE --output OUTPUT_FILE

or for interactive decoding:

    ./seq2seq.sh CONFIG --decode

#### Example English→French model

This is the same model and dataset as [Bahdanau et al. 2015](https://arxiv.org/abs/1409.0473).

    config/WMT14/download.sh    # download WMT14 data into raw_data/WMT14
    config/WMT14/prepare.sh     # preprocess the data, and copy the files to data/WMT14
    ./seq2seq.sh config/WMT14/baseline.yaml --train -v    # train a baseline model on this data

You should get BLEU scores similar to these (our model was trained on a single Titan X for about 4 days):

| Dev   | Test  | +beam | Steps | Time |
|:-----:|:-----:|:-----:|:-----:|:----:|
| 25.04 | 28.64 | 29.22 | 240k  | 60h  |
| 25.25 | 28.67 | 29.28 | 330k  | 80h  |

Download this model [here](https://drive.google.com/file/d/1Qe4yZTYSTF-mlRlP_NTFGwXgacZnBwdp/view?usp=sharing). To use it, just extract the archive into the `seq2seq/models` folder and run:

    ./seq2seq.sh models/WMT14/config.yaml --decode -v

#### Example German→English model

This is the same dataset as [Ranzato et al. 2015](https://arxiv.org/abs/1511.06732).

    config/IWSLT14/prepare.sh
    ./seq2seq.sh config/IWSLT14/baseline.yaml --train -v

| Dev   | Test  | +beam | Steps |
|:-----:|:-----:|:-----:|:-----:|
| 28.32 | 25.33 | 26.74 | 44k   |

The model is available for download [here](https://drive.google.com/file/d/1qCL3ZRxZ13fC45f74Nt6qiQ8tVAYFF9H/view?usp=sharing).

## Audio pre-processing

If you want to use the toolkit for Automatic Speech Recognition (ASR) or Automatic Speech Translation (AST), you'll need to pre-process your audio files accordingly. This [README](https://github.com/eske/seq2seq/tree/master/config/BTEC) details how it can be done. You'll need to install the **Yaafe** library, and use `scripts/speech/extract-audio-features.py` to extract MFCCs from a set of wav files.
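For reference, here is a minimal sketch of MFCC extraction with Yaafe's Python API, to give an idea of what that step involves. It is not a substitute for `extract-audio-features.py`, and the parameter values (16 kHz input, 13 coefficients, 25 ms windows with a 10 ms step) are illustrative assumptions:

    # Illustrative sketch of MFCC extraction with Yaafe (not the repository's
    # extract-audio-features.py); all parameter values are assumptions
    from yaafelib import AudioFileProcessor, Engine, FeaturePlan

    # Declare the features to extract: 13 MFCC coefficients over 25 ms
    # windows (400 samples) with a 10 ms step, assuming 16 kHz wav input
    plan = FeaturePlan(sample_rate=16000)
    plan.addFeature('mfcc: MFCC CepsNbCoeffs=13 blockSize=400 stepSize=160')

    # Compile the feature plan into a processing engine
    engine = Engine()
    engine.load(plan.getDataFlow())

    # Run the engine on one file and read back the features
    processor = AudioFileProcessor()
    processor.processFile(engine, 'example.wav')
    mfcc = engine.readAllOutputs()['mfcc']  # numpy array: frames x coefficients
    print(mfcc.shape)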
## Features

* **YAML configuration files**
* **Beam-search decoder**
* **Ensemble decoding**
* **Multiple encoders**
* **Hierarchical encoder**
* **Bidirectional encoder**
* **Local attention model**
* **Convolutional attention model**
* **Detailed logging**
* **Periodic BLEU evaluation**
* **Periodic checkpoints**
* **Multi-task training:** train on several tasks at once (e.g. French→English and German→English MT)
* **Subword training and decoding**
* **Input binary features instead of text**
* **Pre-processing script:** we provide a fully-featured Python script for data pre-processing (vocabulary creation, lowercasing, tokenizing, splitting, etc.)
* **Dynamic RNNs:** we use symbolic loops instead of statically unrolled RNNs. This means that we don't need to manually configure bucket sizes, and that model creation is much faster (a minimal sketch of this pattern is given after the Credits section).

## Credits

* This project is based on [TensorFlow's reference implementation](https://www.tensorflow.org/tutorials/seq2seq)
* We include some of the pre-processing scripts from [Moses](http://www.statmt.org/moses/)
* The scripts for subword units come from [github.com/rsennrich/subword-nmt](https://github.com/rsennrich/subword-nmt)
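#### Example: dynamic RNNs in TensorFlow

To illustrate the **Dynamic RNNs** feature above: `tf.nn.dynamic_rnn` runs the cell inside a symbolic loop, so a single small graph handles sequences of any length. This is a generic sketch of the TensorFlow 1.x API, not code taken from this repository; the tensor shapes and cell size are arbitrary:

    import tensorflow as tf  # TensorFlow 1.2+ API, as required by this toolkit

    # A batch of variable-length sequences: [batch, time, features].
    # The `None` dimensions are resolved at run time, so no bucketing
    # of sequence lengths is needed.
    inputs = tf.placeholder(tf.float32, shape=[None, None, 512])
    lengths = tf.placeholder(tf.int32, shape=[None])  # true length of each sequence

    cell = tf.nn.rnn_cell.LSTMCell(num_units=256)

    # dynamic_rnn unrolls the cell with a symbolic while-loop instead of
    # statically replicating it once per time step, so graph construction
    # is fast and one graph covers all sequence lengths
    outputs, final_state = tf.nn.dynamic_rnn(
        cell, inputs, sequence_length=lengths, dtype=tf.float32)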