# bertviz
**Repository Path**: srwpf/bertviz
## Basic Information
- **Project Name**: bertviz
- **Description**: Tool for visualizing attention in the Transformer model (BERT, GPT-2, XLNet, and RoBERTa)
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-10-25
- **Last Updated**: 2020-12-19
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# BertViz
Tool for visualizing attention in BERT, GPT-2, XLNet, and RoBERTa. Extends the [Tensor2Tensor visualization tool](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/visualization) by [Llion Jones](https://medium.com/@llionj) and the [pytorch-transformers](https://github.com/huggingface/pytorch-transformers) library from [HuggingFace](https://github.com/huggingface).
Blog posts:
* [Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention](https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1)
* [OpenAI GPT-2: Understanding Language Generation through Visualization](https://towardsdatascience.com/openai-gpt-2-understanding-language-generation-through-visualization-8252f683b2f8)
* [Deconstructing BERT: Distilling 6 Patterns from 100 Million Parameters](https://towardsdatascience.com/deconstructing-bert-distilling-6-patterns-from-100-million-parameters-b49113672f77)
Paper:
* [A Multiscale Visualization of Attention in the Transformer Model](https://arxiv.org/pdf/1906.05714.pdf)
## Attention-head view
The *attention-head view* visualizes the attention patterns produced by one or more attention heads in a given transformer layer.
 
* BERT: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/head_view_bert.ipynb) [[Colab]](https://colab.research.google.com/drive/1pS-eegmUz9EqXJw22VbVIHlHoXjNaYuc)
* GPT-2: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/head_view_gpt2.ipynb) [[Colab]](https://colab.research.google.com/drive/1qEJ4HiKy9XUKgu0t5SBNurkPAdEPnzke)
* XLNet: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/head_view_xlnet.ipynb) [[Colab]](https://colab.research.google.com/drive/1zy_yHOdcz3KpCA6HYrxrEx6cL1VZ4sSq)
* RoBERTa: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/head_view_roberta.ipynb) [[Colab]](https://colab.research.google.com/drive/1nk72i9cwaYNujpzKoqoEoaWtB76f9a0i)
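
For orientation, here is a minimal sketch of how the head view is typically invoked from a notebook. It follows the pattern in the notebooks above, but the `transformers` calls and the `head_view` signature are assumptions about your installed versions; treat the linked notebooks as canonical.

```
# Minimal head-view sketch (not the canonical usage; see the notebooks above).
# Assumes a `transformers` version whose models return attention weights as
# the last output element when output_attentions=True.
from transformers import BertModel, BertTokenizer

from bertviz import head_view

model_version = 'bert-base-uncased'
model = BertModel.from_pretrained(model_version, output_attentions=True)
tokenizer = BertTokenizer.from_pretrained(model_version)

sentence_a = "The cat sat on the mat"
sentence_b = "The cat lay on the rug"
inputs = tokenizer.encode_plus(sentence_a, sentence_b, return_tensors='pt')

# Attention comes back as one (batch, heads, seq_len, seq_len) tensor per layer.
attention = model(inputs['input_ids'], token_type_ids=inputs['token_type_ids'])[-1]
tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])

head_view(attention, tokens)  # renders the interactive view in the notebook cell
```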
## Model view
The *model view* provides a bird's-eye view of attention across all of the model's layers and heads.

* BERT: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/model_view_bert.ipynb) [[Colab]](https://colab.research.google.com/drive/1A6xDAwAY-8MGHs3sCmy1QKUjC5O9f-4K)
* GPT-2: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/model_view_gpt2.ipynb) [[Colab]](https://colab.research.google.com/drive/1lwo9tG1ncmCX7FAEnqSxbPVECdZRy8Nn)
* XLNet: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/model_view_xlnet.ipynb) [[Colab]](https://colab.research.google.com/drive/1u4f3GIwusU9xiIRbxOxDUlDaMUUq6-Ba)
* RoBERTa: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/model_view_roberta.ipynb) [[Colab]](https://colab.research.google.com/drive/190BnK5UPeoVrRA9VkmQxWng2-K2RhfyH)
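
The same pattern works for the model view; the sketch below uses GPT-2 and carries the same assumptions about `transformers` and the `model_view` entry point as the head-view example:

```
# Minimal model-view sketch for GPT-2 (same caveats as the head-view sketch).
from transformers import GPT2Model, GPT2Tokenizer

from bertviz import model_view

model = GPT2Model.from_pretrained('gpt2', output_attentions=True)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

text = "The quick brown fox jumps over the lazy dog"
input_ids = tokenizer.encode(text, return_tensors='pt')

# Attentions are again the last element of the model output.
attention = model(input_ids)[-1]
tokens = tokenizer.convert_ids_to_tokens(input_ids[0])

model_view(attention, tokens)  # one thumbnail per layer/head; click to enlarge
```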
## Neuron view
The *neuron view* visualizes the individual neurons in the query and key vectors and shows how they are used to compute attention.

* BERT: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/neuron_view_bert.ipynb) [[Colab]](https://colab.research.google.com/drive/1m37iotFeubMrp9qIf9yscXEL1zhxTN2b)
* GPT-2: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/neuron_view_gpt2.ipynb) [[Colab]](https://colab.research.google.com/drive/1s8XCCyxsKvNRWNzjWi5Nl8ZAYZ5YkLm_)
* RoBERTa: [[Notebook]](https://github.com/jessevig/bertviz/blob/master/neuron_view_roberta.ipynb) [[Colab]](https://colab.research.google.com/drive/1rsHzCWXibhp4sgfVlqE39W3UB5NnongL)
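
The neuron view reads out the query and key vectors, which the stock `transformers` models do not expose, so it relies on model classes bundled with BertViz. A minimal sketch, assuming the module layout and `show` signature used in the neuron-view notebooks (these may differ across versions):

```
# Minimal neuron-view sketch. The module path and the show() signature are
# assumptions based on the neuron-view notebooks; check them against your copy.
from bertviz.neuron_view import show
from bertviz.transformers_neuron_view import BertModel, BertTokenizer

model_version = 'bert-base-uncased'
model = BertModel.from_pretrained(model_version)
tokenizer = BertTokenizer.from_pretrained(model_version)

sentence_a = "The cat sat on the mat"
sentence_b = "The cat lay on the rug"

# 'bert' tells the view which model family it is rendering.
show(model, 'bert', tokenizer, sentence_a, sentence_b)
```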
## Requirements
* [PyTorch](https://pytorch.org/) >=0.4.1
* [Jupyter](https://jupyter.org/install)
* [tqdm](https://pypi.org/project/tqdm/)
* [boto3](https://pypi.org/project/boto3/)
* [IPython](https://pypi.org/project/ipython/)
* [requests](https://pypi.org/project/requests/)
* [regex](https://pypi.org/project/regex/)
* [sentencepiece](https://pypi.org/project/sentencepiece/)
(See [requirements.txt](https://github.com/jessevig/bertviz/blob/master/requirements.txt))
## Execution
```
git clone https://github.com/jessevig/bertviz.git
cd bertviz
pip install -r requirements.txt
jupyter notebook
```
## Authors
* [Jesse Vig](https://github.com/jessevig)
## Citation
When referencing BertViz, please cite [this paper](https://arxiv.org/abs/1906.05714).
```
@article{vig2019transformervis,
  author = {Jesse Vig},
  title = {A Multiscale Visualization of Attention in the Transformer Model},
  journal = {arXiv preprint arXiv:1906.05714},
  year = {2019},
  url = {https://arxiv.org/abs/1906.05714}
}
```
## License
This project is licensed under the Apache 2.0 License; see the [LICENSE](LICENSE) file for details.
## Acknowledgments
This project incorporates code from the following repos:
* https://github.com/tensorflow/tensor2tensor
* https://github.com/huggingface/pytorch-pretrained-BERT