# TextClassification-Keras
**Repository Path**: wwfcoder/TextClassification-Keras
## Basic Information
- **Project Name**: TextClassification-Keras
- **Description**: Text classification models implemented in Keras, including: FastText, TextCNN, TextRNN, TextBiRNN, TextAttBiRNN, HAN, RCNN, RCNNVariant, etc.
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-08-20
- **Last Updated**: 2020-12-19
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# TextClassification-Keras
This repository implements a variety of **deep learning models** for **text classification** using the **Keras** framework, including: **FastText**, **TextCNN**, **TextRNN**, **TextBiRNN**, **TextAttBiRNN**, **HAN**, **RCNN**, **RCNNVariant**, and so on. In addition to the model implementations, a simplified application is provided for each model.
- [English documents](README.md)
- [Chinese documents](README-ZH.md)
## Guide
1. [Environment](#environment)
2. [Usage](#usage)
3. [Models](#models)
    1. [FastText](#1-fasttext)
    2. [TextCNN](#2-textcnn)
    3. [TextRNN](#3-textrnn)
    4. [TextBiRNN](#4-textbirnn)
    5. [TextAttBiRNN](#5-textattbirnn)
    6. [HAN](#6-han)
    7. [RCNN](#7-rcnn)
    8. [RCNNVariant](#8-rcnnvariant)
    9. [To be continued...](#to-be-continued)
4. [References](#references)
## Environment
- Python 3.6
- NumPy 1.15.2
- Keras 2.2.0
- Tensorflow 1.8.0
## Usage
All of the model code lives under the `/model` directory, with one subdirectory per model containing both the model code and the application code.
For example, the model and application code of FastText are both located in `/model/FastText`: the model part is `fast_text.py`, and the application part is `main.py`.
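A hypothetical training run might look like the sketch below. The class name `FastText`, its constructor arguments, and the `get_model()` method are assumptions for illustration only and may not match the actual interface in `fast_text.py`; the IMDB dataset is used here simply because the referenced Keras examples do.

```python
# Hypothetical usage sketch -- the real interface lives in /model/FastText/fast_text.py
from keras.preprocessing import sequence
from keras.datasets import imdb

max_features, maxlen = 5000, 400

# Load and pad the IMDB reviews to a fixed length
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

# Assumed interface: FastText(maxlen, max_features, embedding_dims).get_model()
from fast_text import FastText  # hypothetical import path
model = FastText(maxlen, max_features, embedding_dims=50).get_model()
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=3,
          validation_data=(x_test, y_test))
```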
## Models
### 1 FastText
FastText was proposed in the paper [Bag of Tricks for Efficient Text Classification](https://arxiv.org/pdf/1607.01759.pdf).
#### 1.1 Description in the paper
1. Using a look-up table, **bags of n-grams** are converted into **word representations**.
2. The word representations are **averaged** into a text representation, which is a hidden variable.
3. The text representation is in turn fed to a **linear classifier**.
4. The **softmax** function is used to compute the probability distribution over the predefined classes.
#### 1.2 Implementation here
Network structure of FastText:
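As an illustration only, a minimal Keras sketch of this structure (look-up table, average pooling, linear softmax classifier) might look like the following; all hyperparameter values are placeholders rather than the repository's defaults, and the actual implementation lives in `/model/FastText/fast_text.py`.

```python
from keras.models import Model
from keras.layers import Input, Embedding, GlobalAveragePooling1D, Dense

maxlen = 400          # placeholder sequence length
max_features = 5000   # placeholder vocabulary size (word and n-gram ids)
embedding_dims = 50   # placeholder embedding size
class_num = 2         # placeholder number of classes

inputs = Input(shape=(maxlen,))
# Look-up table: token / n-gram ids -> word representations
x = Embedding(max_features, embedding_dims, input_length=maxlen)(inputs)
# Average the word representations into a single text representation
x = GlobalAveragePooling1D()(x)
# Linear classifier with softmax over the predefined classes
outputs = Dense(class_num, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```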
### 2 TextCNN
TextCNN was proposed in the paper [Convolutional Neural Networks for Sentence Classification](http://www.aclweb.org/anthology/D14-1181).
#### 2.1 Description in the paper
1. Represent the sentence with **static and non-static channels**.
2. **Convolve** with multiple filter widths and feature maps.
3. Use **max-over-time pooling**.
4. Use a **fully connected layer** with **dropout** and a **softmax** output.
#### 2.2 Implementation here
Network structure of TextCNN:
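For orientation, a minimal single-channel Keras sketch of the idea (parallel convolutions with several filter widths, max-over-time pooling, dropout, softmax); the paper's static/non-static channels are not reproduced here, and all sizes are placeholders rather than the values used in `/model/TextCNN`.

```python
from keras.models import Model
from keras.layers import (Input, Embedding, Conv1D, GlobalMaxPooling1D,
                          Concatenate, Dropout, Dense)

maxlen, max_features, embedding_dims, class_num = 400, 5000, 50, 2  # placeholders

inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims, input_length=maxlen)(inputs)

# Convolve with multiple filter widths, then max-over-time pooling per width
pooled = []
for kernel_size in (3, 4, 5):
    conv = Conv1D(filters=128, kernel_size=kernel_size, activation='relu')(x)
    pooled.append(GlobalMaxPooling1D()(conv))
x = Concatenate()(pooled)

# Fully connected layer with dropout and softmax output
x = Dropout(0.5)(x)
outputs = Dense(class_num, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
```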
### 3 TextRNN
TextRNN is mentioned in the paper [Recurrent Neural Network for Text Classification with Multi-Task Learning](https://www.ijcai.org/Proceedings/16/Papers/408.pdf), although it was not originally proposed there.
#### 3.1 Description in the paper
#### 3.2 Implementation here
Network structure of TextRNN:
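A minimal Keras sketch of this kind of model: an embedding layer, a single recurrent encoder whose final hidden state summarizes the text, and a softmax classifier. The choice of LSTM and all sizes are placeholders rather than the repository's exact configuration.

```python
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense

maxlen, max_features, embedding_dims, class_num = 400, 5000, 50, 2  # placeholders

inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims, input_length=maxlen)(inputs)
# The final hidden state of the RNN serves as the text representation
x = LSTM(128)(x)
outputs = Dense(class_num, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
```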
### 4 TextBiRNN
TextBiRNN is an improved version of TextRNN in which the RNN layer is replaced with a bidirectional RNN layer, so that the model can take both the forward and the backward encodings into account. No related paper has been found for it so far.
Network structure of TextBiRNN:
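A minimal Keras sketch of the bidirectional variant; the `Bidirectional` wrapper concatenates the forward and backward encodings. Cell type and sizes are again placeholders.

```python
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense

maxlen, max_features, embedding_dims, class_num = 400, 5000, 50, 2  # placeholders

inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims, input_length=maxlen)(inputs)
# Forward and backward encodings are computed and concatenated
x = Bidirectional(LSTM(128))(x)
outputs = Dense(class_num, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
```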
### 5 TextAttBiRNN
TextAttBiRNN is an improved version of TextBiRNN that introduces an attention mechanism. Given the representation vectors produced by the bidirectional RNN encoder, the attention mechanism lets the model focus on the information most relevant to the decision. The attention mechanism was first proposed in the paper [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/pdf/1409.0473.pdf), while the implementation of attention here follows the paper [Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems](https://arxiv.org/pdf/1512.08756.pdf).
#### 5.1 Description in the paper
In the paper [Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems](https://arxiv.org/pdf/1512.08756.pdf), the **feed-forward attention** is simplified as follows.
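Restated from the paper, with `h_t` the hidden state at time step `t` and `T` the sequence length:

$$
e_t = a(h_t), \qquad
\alpha_t = \frac{\exp(e_t)}{\sum_{k=1}^{T} \exp(e_k)}, \qquad
c = \sum_{t=1}^{T} \alpha_t h_t
$$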
The function `a`, a learnable function, is regarded as a **feed-forward network**. In this formulation, attention can be seen as producing a fixed-length embedding `c` of the input sequence by computing an **adaptive weighted average** of the state sequence `h`.
#### 5.2 Implementation here
The implementation of attention is not detailed here; please refer directly to the source code.
Network structure of TextAttBiRNN:
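As a rough guide to what such a layer looks like, below is a minimal Keras sketch of feed-forward attention on top of a bidirectional LSTM. The layer name `SimpleAttention` and all hyperparameter values are placeholders, and the repository's own attention layer may differ in details such as bias terms and masking.

```python
from keras import backend as K
from keras.models import Model
from keras.layers import Layer, Input, Embedding, Bidirectional, LSTM, Dense

class SimpleAttention(Layer):
    """Feed-forward attention: score each time step, normalize, weighted sum."""
    def build(self, input_shape):
        # Single-layer scorer: e_t = tanh(h_t . w)
        self.w = self.add_weight(name='att_w', shape=(int(input_shape[-1]), 1),
                                 initializer='glorot_uniform', trainable=True)
        super(SimpleAttention, self).build(input_shape)

    def call(self, h):
        e = K.squeeze(K.tanh(K.dot(h, self.w)), axis=-1)     # (batch, time)
        a = K.exp(e - K.max(e, axis=1, keepdims=True))       # softmax over time steps
        a = a / K.sum(a, axis=1, keepdims=True)
        return K.sum(K.expand_dims(a) * h, axis=1)           # adaptive weighted average c

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

maxlen, max_features, embedding_dims, class_num = 400, 5000, 50, 2  # placeholders

inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims, input_length=maxlen)(inputs)
x = Bidirectional(LSTM(128, return_sequences=True))(x)  # keep per-step states for attention
x = SimpleAttention()(x)
outputs = Dense(class_num, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
```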
### 6 HAN
HAN was proposed in the paper [Hierarchical Attention Networks for Document Classification](http://www.aclweb.org/anthology/N16-1174).
#### 6.1 Description in the paper
1. **Word Encoder**. Encoding with a **bidirectional GRU**: an annotation for a given word is obtained by concatenating the forward and backward hidden states, which summarizes the information of the whole sentence centered around the word at the current time step.
2. **Word Attention**. A one-layer **MLP** and the softmax function compute normalized importance weights over the word annotations; the sentence vector is then computed as a **weighted sum** of the word annotations based on these weights.
3. **Sentence Encoder**. In a similar way to the word encoder, a **bidirectional GRU** encodes the sentences to obtain sentence annotations.
4. **Sentence Attention**. Similar to word attention, a one-layer **MLP** and the softmax function compute weights over the sentence annotations; the document vector is then calculated as a **weighted sum** of the sentence annotations based on these weights.
5. **Document Classification**. The **softmax** function is used to calculate the probability of each class.
#### 6.2 Implementation here
The attention here uses the FeedForwardAttention implementation, the same as the attention used in TextAttBiRNN.
Network structure of HAN:
The TimeDistributed wrapper is used here so that the parameters of the Embedding, Bidirectional RNN, and Attention layers are shared across the time-step dimension.
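A minimal Keras sketch of this hierarchical structure, reusing the `SimpleAttention` layer sketched in the TextAttBiRNN section. Shapes and sizes are placeholders, and the code in `/model/HAN` may differ in detail.

```python
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, GRU, TimeDistributed, Dense

# Assumes the SimpleAttention layer defined in the TextAttBiRNN sketch above.
# Placeholder shapes: a document is padded to max_sentences sentences of maxlen_word words.
max_sentences, maxlen_word = 10, 25
max_features, embedding_dims, class_num = 5000, 50, 2

# Word-level encoder: the words of one sentence -> one sentence vector
word_input = Input(shape=(maxlen_word,))
w = Embedding(max_features, embedding_dims, input_length=maxlen_word)(word_input)
w = Bidirectional(GRU(64, return_sequences=True))(w)
w = SimpleAttention()(w)
word_encoder = Model(word_input, w)

# Sentence-level encoder: TimeDistributed applies (and shares) the word encoder per sentence
doc_input = Input(shape=(max_sentences, maxlen_word))
s = TimeDistributed(word_encoder)(doc_input)
s = Bidirectional(GRU(64, return_sequences=True))(s)
s = SimpleAttention()(s)
outputs = Dense(class_num, activation='softmax')(s)

model = Model(inputs=doc_input, outputs=outputs)
```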
### 7 RCNN
RCNN was proposed in the paper [Recurrent Convolutional Neural Networks for Text Classification](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745/9552).
#### 7.1 Description in the paper
1. **Word Representation Learning**. RCNN uses a recurrent structure, a **bi-directional recurrent neural network**, to capture the contexts. Each word is then represented by combining the word itself with its contexts, and a **linear transformation** with the `tanh` activation function is applied to that representation.
2. **Text Representation Learning**. Once the representations of all words are calculated, an element-wise **max-pooling** layer is applied to capture the most important information throughout the entire text. Finally, a **linear transformation** is applied, followed by the **softmax** function.
#### 7.2 Implementation here
Network structure of RCNN:
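A minimal Keras sketch of this structure with three inputs (the word sequence plus its left- and right-context sequences). How the context sequences are constructed and fed is an assumption for illustration, and the code in `/model/RCNN` may differ.

```python
from keras import backend as K
from keras.models import Model
from keras.layers import (Input, Embedding, LSTM, Lambda, Concatenate,
                          Dense, GlobalMaxPooling1D)

maxlen, max_features, embedding_dims, class_num = 400, 5000, 50, 2  # placeholders

# Three inputs: each word, its left-context words, and its right-context words
input_current = Input(shape=(maxlen,))
input_left = Input(shape=(maxlen,))
input_right = Input(shape=(maxlen,))

embedder = Embedding(max_features, embedding_dims, input_length=maxlen)
x_current = embedder(input_current)
x_left = embedder(input_left)
x_right = embedder(input_right)

# Recurrent structure: forward RNN over the left context, backward RNN over the right context
c_left = LSTM(128, return_sequences=True)(x_left)
c_right = LSTM(128, return_sequences=True, go_backwards=True)(x_right)
c_right = Lambda(lambda t: K.reverse(t, axes=1))(c_right)  # restore original time order

# Combine each word with its contexts, then apply a tanh linear transformation
x = Concatenate(axis=2)([c_left, x_current, c_right])
x = Dense(128, activation='tanh')(x)

# Element-wise max-pooling over time, then linear transformation + softmax
x = GlobalMaxPooling1D()(x)
outputs = Dense(class_num, activation='softmax')(x)

model = Model(inputs=[input_current, input_left, input_right], outputs=outputs)
```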
### 8 RCNNVariant
RCNNVariant is an improved version of RCNN with the following changes. No related paper has been found for it so far.
1. The three inputs are replaced by a **single input**; the separate left- and right-context inputs are removed.
2. A **bidirectional LSTM/GRU** replaces the traditional RNN for encoding.
3. A **multi-channel CNN** is used to build the semantic representation.
4. A **ReLU activation** replaces the Tanh activation.
5. Both **AveragePooling** and **MaxPooling** are used for pooling.
Network structure of RCNNVariant:
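A minimal Keras sketch reflecting the changes listed above; the sizes, kernel widths, and the choice of LSTM over GRU are placeholders rather than the repository's exact configuration.

```python
from keras.models import Model
from keras.layers import (Input, Embedding, Bidirectional, LSTM, Conv1D,
                          GlobalAveragePooling1D, GlobalMaxPooling1D,
                          Concatenate, Dense)

maxlen, max_features, embedding_dims, class_num = 400, 5000, 50, 2  # placeholders

# Single input instead of the three RCNN inputs
inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims, input_length=maxlen)(inputs)

# Bidirectional LSTM replaces the plain recurrent context encoders
x = Bidirectional(LSTM(128, return_sequences=True))(x)

# Multi-channel CNN with ReLU activation instead of Tanh
pooled = []
for kernel_size in (2, 3, 4):
    conv = Conv1D(filters=128, kernel_size=kernel_size, activation='relu')(x)
    # Both average pooling and max pooling are applied
    pooled.append(GlobalAveragePooling1D()(conv))
    pooled.append(GlobalMaxPooling1D()(conv))
x = Concatenate()(pooled)

outputs = Dense(class_num, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)
```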
### To be continued...
## References
1. [Bag of Tricks for Efficient Text Classification](https://arxiv.org/pdf/1607.01759.pdf)
2. [Keras Example IMDB FastText](https://github.com/keras-team/keras/blob/master/examples/imdb_fasttext.py)
3. [Convolutional Neural Networks for Sentence Classification](http://www.aclweb.org/anthology/D14-1181)
4. [Keras Example IMDB CNN](https://github.com/keras-team/keras/blob/master/examples/imdb_cnn.py)
5. [Recurrent Neural Network for Text Classification with Multi-Task Learning](https://www.ijcai.org/Proceedings/16/Papers/408.pdf)
6. [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/pdf/1409.0473.pdf)
7. [Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems](https://arxiv.org/pdf/1512.08756.pdf)
8. [cbaziotis's Attention](https://gist.github.com/cbaziotis/6428df359af27d58078ca5ed9792bd6d)
9. [Hierarchical Attention Networks for Document Classification](http://www.aclweb.org/anthology/N16-1174)
10. [Richard's HAN](https://richliao.github.io/supervised/classification/2016/12/26/textclassifier-HATN/)
11. [Recurrent Convolutional Neural Networks for Text Classification](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745/9552)
12. [airalcorn2's RCNN](https://github.com/airalcorn2/Recurrent-Convolutional-Neural-Network-Text-Classifier)