# DeepPavlov
**Repository Path**: jempson/DeepPavlov
## Basic Information
- **Project Name**: DeepPavlov
- **Description**: DeepPavlov 是一个开源的对话 AI 库,基于 TensorFlow 和 Keras 构建,其作用是: NLP 和对话系统研究; 实现和评估复杂对话系统
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 9
- **Created**: 2018-02-27
- **Last Updated**: 2021-11-03
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
[](https://github.com/deepmipt/DeepPavlov/blob/master/LICENSE)

#
DeepPavlov
### *We are in a really early Alpha release. You have to be ready for hard adventures.*
An open-source conversational AI library, built on TensorFlow and Keras, and designed for
* NLP and dialog systems research
* implementation and evaluation of complex conversational systems
Our goal is to provide researchers with:
* a framework for implementing and testing their own dialog models with subsequent sharing of that models
* set of predefined NLP models / dialog system components (ML/DL/Rule-based) and pipeline templates
* benchmarking environment for conversational models and systematized access to relevant datasets
and AI-application developers with:
* framework for building conversational software
* tools for application integration with adjacent infrastructure (messengers, helpdesk software etc.)
## Features
| Component | Description |
| --------- | ----------- |
| [Slot filling component](deeppavlov/models/ner/README.md) | is based on neural Named Entity Recognition network and fuzzy Levenshtein search to extract normalized slot values from the text. The NER network component reproduces architecture from the paper [Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition](https://arxiv.org/pdf/1709.09686.pdf), which is inspired by LSTM+CRF architecture from https://arxiv.org/pdf/1603.01360.pdf. |
| [Intent classification component](deeppavlov/models/classifiers/intents/README.md) | Based on shallow-and-wide Convolutional Neural Network architecture from [Kim Y. Convolutional neural networks for sentence classification – 2014](https://arxiv.org/pdf/1408.5882). The model allows multilabel classification of sentences. |
| [Automatic spelling correction component](deeppavlov/models/spellers/error_model/README.md) | Based on [An Improved Error Model for Noisy Channel Spelling Correction by Eric Brill and Robert C. Moore](http://www.aclweb.org/anthology/P00-1037) and uses statistics based error model, a static dictionary and an ARPA language model to correct spelling errors. |
| **Skill** | |
| [Goal-oriented bot](deeppavlov/skills/go_bot/README.md) | Based on Hybrid Code Networks (HCNs) architecture from [Jason D. Williams, Kavosh Asadi, Geoffrey Zweig, Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning – 2017](https://arxiv.org/abs/1702.03274). It allows to predict responses in the goal-oriented task dialogue. The model is quite customizable: embeddings, slot filler and intent classifier can be used or not on demand. |
| **Embeddings** | |
| [Pre-trained embeddings for Russian language](pretrained-vectors.md) | Pre-trained on joint [Russian Wikipedia](https://ru.wikipedia.org/wiki/%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0) and [Lenta.ru](https://lenta.ru/) corpora word vectors for Russian language. |
## Basic examples
View video demo of deploy goal-oriented bot and slot-filling model with Telegram UI
[](https://youtu.be/yzoiCa_sMuY)
* Run goal-oriented bot with Telegram interface:
```
python deep.py interactbot skills/go_bot/config.json -t
```
* Run goal-oriented bot with console interface:
```
python deep.py interact skills/go_bot/config.json
```
* Run slot-filling model with Telegram interface
```
python deep.py interactbot models/ner/config.json -t
```
* Run slot-filling model with console interface
```
python deep.py interact models/ner/config.json
```
## Conceptual overview
### Principles
The library is designed following the principles:
* end-to-end deep learning architecture as long-term goal
* hybrid ML/DL/Rule-based architecture as a current approach
* modular dialog system architecture
* component-based software engineering, reusability maximization
* easy to extend and benchmark
* multiple components by one NLP task with data-driven selection of suitable components
### Target Architecture
Target architecture of our library:
DeepPavlov is built on top of machine learning frameworks (TensorFlow, Keras). Other external libraries can be used to build basic components.
### Key Concepts
* `Agent` - conversational agent communicating with users in natural language (text)
* `Skill` - unit of interaction that fulfills a user’s need. Typically, a user’s need is fulfilled by presenting information or completing a transaction (e.g. answer question by FAQ, booking tickets etc.); however, for some experiences success is defined as continued engagement (e.g. chit-chat)
* `Components` - atomic functionality blocks
* `Rule-based Components` - can not be trained
* `Machine Learning Components` - can be trained only separately
* `Deep Learning Components` - can be trained separately and in end-to-end mode being joined in chain
* `Switcher` - mechanism by which agent ranks and selects the final response shown to the user
* `Components Chainer` - tool for agents/components pipeline building from heterogeneous components (rule-based/ml/dl), which allow to train and inference pipeline as a whole.
### Contents
* [Installation](#installation)
* [Quick start](#quick-start)
* [Technical overview](#technical-overview)
* [Project modules](#project-modules)
* [Config](#config)
* [DatasetReader](#datasetreader)
* [Dataset](#dataset)
* [Vocab](#vocab)
* [Model](#model)
* [Training](#training)
* [Inferring](#inferring)
* [License](#license)
* [Support and collaboration](#support-and-collaboration)
* [The Team](#the-team)
## Installation
1. Create a virtual environment with `Python 3.6`
```
virtualenv env
```
2. Activate the environment.
```
source ./env/bin/activate
```
3. Clone the repo and `cd` to project root
```
git clone https://github.com/deepmipt/DeepPavlov.git
cd DeepPavlov
```
4. Install the requirements:
```
python setup.py install
```
5. Clean the installation:
```
python setup.py clean --all
```
6. Install `spacy` dependencies:
```
python -m spacy download en
```
## Quick start
To interact with our pre-trained models, they should be downloaded first:
```
python download.py [-all]
```
* `[-all]` option is not required for basic examples; it will download **all** our pre-trained models.
* Warning! `[-all]` requires about 10 GB of free space on disk.
Then models can be interacted or trained with the following command:
```
python deep.py
```
* `` can be 'train', 'interact' or 'interactbot'
* `` should be a path to an NLP pipeline json config
For 'interactbot' mode you should specify Telegram bot token in `-t` parameter or in `TELEGRAM_TOKEN` environment variable.
Available model configs are:
*skills/go_bot/config.json*
*models/classifiers/intents/config_dstc2.json*
*models/ner/config.json*
*models/spellers/error_model/config_en.json*
---
## Technical overview
### Project modules
| deeppavlov.core.commands |
basic training and inferring functions |
| deeppavlov.core.common |
registration and classes initialization functionality, class method decorators |
| deeppavlov.core.data |
basic Dataset, DatasetReader and Vocab classes |
| deeppavlov.core.models |
abstract model classes and interfaces |
| deeppavlov.dataset_readers |
concrete DatasetReader classes |
| deeppavlov.datasets |
concrete Dataset classes |
| deeppavlov.models |
concrete Model classes |
| deeppavlov.skills |
Skill classes. Skills are dialog models. |
| deeppavlov.vocabs |
concrete Vocab classes |
### Config
An NLP pipeline config is a JSON file, which consists of four required elements:
```javascript
{
"dataset_reader": {
},
"dataset": {
},
"vocabs": {
},
"model": {
}
}
```
Each class in the config has `name` parameter, which is its registered codename
and can have any other parameters, repeating its `__init__()` method arguments.
Default values of `__init__()` arguments will be overridden with the config values
during class instance initialization.
### DatasetReader
`DatasetReader` class reads data and returns it in a specified format.
A concrete `DatasetReader` class should be inherited from base
`deeppavlov.data.dataset_reader.DatasetReader` class and registered with a codename:
```python
@register('dstc2_datasetreader')
class DSTC2DatasetReader(DatasetReader):
```
### Dataset
`Dataset` forms needed sets of data ('train', 'valid', 'test') and forms data batches.
A concrete `Dataset` class should be registered and can be inherited from
`deeppavlov.data.dataset_reader.Dataset` class. `deeppavlov.data.dataset_reader.Dataset`
is not an abstract class and can be used as `Dataset` as well.
### Vocab
`Vocab` is a trainable class, which forms and serialize vocabs. Vocabs index any data.
For example, tokens to indices and backwards, chars to indices, classes to indices, etc.
It can index X (features) and y (answers) types of data. A concrete `Vocab` class
should be registered and can be inherited from `deeppavlov.data.vocab.DefaultVocabulary` class.
`deeppavlov.data.vocab.DefaultVocabulary` is not an abstract class and can be used as `Vocab` as well.
### Model
`Model` is the main class which rules the training/inferring process and feature generation.
If a model requires other models to produce features, they need to be passed in its constructor
and config. All models can be nested as much as needed. For example, a skeleton of
`deeppavlov.skills.go_bot.go_bot.GoalOrientedBot` consists of 11 separate model classes,
3 of which are neural networks:
```javascript
{
"model": {
"name": "go_bot",
"network": {
"name": "go_bot_rnn"
},
"slot_filler": {
"name": "dstc_slotfilling",
"ner_network": {
"name": "ner_tagging_network",
}
},
"intent_classifier": {
"name": "intent_model",
"embedder": {
"name": "fasttext"
},
"tokenizer": {
"name": "nltk_tokenizer"
}
},
"embedder": {
"name": "fasttext"
},
"bow_encoder": {
"name": "bow"
},
"tokenizer": {
"name": "spacy_tokenizer"
},
"tracker": {
"name": "featurized_tracker"
}
}
}
```
All models should be registered and inherited from `deeppavlov.core.models.inferable.Inferable`
or from both `Inferable` and `deeppavlov.core.models.trainable.Trainable` interfaces.
Models inherited from `Trainable` interface can be trained. Models inherited from `Inferable`
interface can be only inferred. Usually `Inferable` models are rule-based models or
pre-trained models that we import from third-party libraries (like `NLTK`, `Spacy`, etc.).
### Training
All models inherited from `deeppavlov.core.models.trainable.Trainable` interface can be trained.
The training process should be described in `train()` method:
```python
@register("my_model")
class MyModel(Inferable, Trainable):
def train(*args, **kwargs):
"""
Implement training here.
"""
```
All parameters for training which can be changed during experiments (like *num of epochs*,
*batch size*, *patience*, *learning rate*, *optimizer*) should be passed to a model's
`__init__()`. The default parameters values from `__init__()` are overridden with JSON config values.
To change these values, there is no need to rewrite the code, only the config should be changed.
The training process is managed by `train_now` attribute. If `train_now` is *True*,
a model is being trained. This parameter is useful when using `Vocab`, because in a single
model run some vocabs can be trained, while some only inferred by other models in pipeline.
The training parameters in JSON config can look like this:
```javascript
{
"model": {
"name": "my_model",
"train_now": true,
"optimizer": "Adam",
"learning_rate": 0.2,
"num_epochs": 1000
}
}
```
Training is triggered by `deeppavlov.core.commands.train.train_model_from_config()` function.
### Inferring
All models inherited from `deeppavlov.core.models.inferable.Inferable` interface can be inferred.
The `infer()` method should return what a model can do. For example, a *tokenizer* should return
*tokens*, a *NER recognizer* should return *recognized entities*, a *bot* should return a *replica*.
A particular format of returned data should be defined in `infer()`.
Inferring is triggered by `deeppavlov.core.commands.train.infer_model_from_config()` function.
There is no need in s separate JSON for inferring. `train_now` parameter is ignored during
inferring.
## License
DeepPavlov is Apache 2.0 - licensed.
## Support and collaboration
If you have any questions, bug reports or feature requests, please feel free to post on our [Github Issues](https://github.com/deepmipt/DeepPavlov/issues) page. Please tag your issue with 'bug', 'feature request', or 'question'. Also we’ll be glad to see your pull-requests to add new datasets, models, embeddings and etc.
## The Team
DeepPavlov is built and maintained by [Neural Networks and Deep Learning Lab](https://mipt.ru/english/research/labs/neural-networks-and-deep-learning-lab) at [MIPT](https://mipt.ru/english/) within [iPavlov](http://ipavlov.ai/) project (part of [National Technology Initiative](https://asi.ru/eng/nti/)) and in partnership with [Sberbank](http://www.sberbank.com/).