# efficient_densenet_pytorch

**Repository Path**: mrzhanzhan/efficient_densenet_pytorch

## Basic Information

- **Project Name**: efficient_densenet_pytorch
- **Description**: Mirrored from https://github.com/gpleiss/efficient_densenet_pytorch.git; a PyTorch implementation of DenseNet
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2021-07-10
- **Last Updated**: 2022-03-06

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# efficient_densenet_pytorch

A PyTorch >=1.0 implementation of DenseNets, optimized to save GPU memory.

## Recent updates

1. **Now works on PyTorch 1.0!** It uses the checkpointing feature, which makes this code WAY more efficient!!!

## Motivation

While DenseNets are fairly easy to implement in deep learning frameworks, most implementations (such as the [original](https://github.com/liuzhuang13/DenseNet)) tend to be memory-hungry. In particular, the number of intermediate feature maps generated by batch normalization and concatenation operations grows quadratically with network depth. *It is worth emphasizing that this is not a property inherent to DenseNets, but rather to the implementation.*

This implementation uses a new strategy to reduce the memory consumption of DenseNets. We use [checkpointing](https://pytorch.org/docs/stable/checkpoint.html?highlight=checkpointing) to compute the Batch Norm and concatenation feature maps. These intermediate feature maps are discarded during the forward pass and recomputed for the backward pass. This adds 15-20% time overhead for training, but **reduces feature map consumption from quadratic to linear** (a minimal sketch of this checkpointing pattern appears at the end of the Usage section below).

This implementation is inspired by this [technical report](https://arxiv.org/pdf/1707.06990.pdf), which outlines a strategy for efficient DenseNets via memory sharing.

## Requirements

- PyTorch >=1.0.0
- CUDA

## Usage

**In your existing project:** There is one file in the `models` folder.

- `models/densenet.py` is an implementation based on the [torchvision](https://github.com/pytorch/vision/blob/master/torchvision/models/densenet.py) and [project killer](https://github.com/felixgwu/img_classification_pk_pytorch/blob/master/models/densenet.py) implementations.

If you care about speed, and memory is not a concern, pass the `efficient=False` argument into the `DenseNet` constructor. Otherwise, pass in `efficient=True`.

**Options:**
- All options are described in [the docstrings of the model files](https://github.com/gpleiss/efficient_densenet_pytorch/blob/master/models/densenet_efficient.py#L189)
- The depth is controlled by the `block_config` option
- `efficient=True` uses the memory-efficient version
- If you want to use the model for ImageNet, set `small_inputs=False`. For CIFAR or SVHN, set `small_inputs=True`. (A construction sketch using these options follows this list.)
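As a usage sketch, constructing a DenseNet-BC for CIFAR-10 with these options might look like the following (argument values here are illustrative, not prescribed defaults; it assumes you run from the repository root so that `models` is importable, as `demo.py` does):

```python
import torch
from models import DenseNet  # models/densenet.py in this repository

# Illustrative configuration: 3 dense blocks of 16 layers each,
# i.e. block_config controls the depth of the network.
model = DenseNet(
    growth_rate=12,
    block_config=(16, 16, 16),
    num_classes=10,
    small_inputs=True,   # True for CIFAR/SVHN, False for ImageNet
    efficient=True,      # memory-efficient (checkpointed) version
)

x = torch.randn(2, 3, 32, 32)   # CIFAR-sized inputs
logits = model(x)               # -> shape (2, 10)
```

To make the checkpointing strategy from the Motivation section concrete, here is a minimal, self-contained sketch of the pattern. It is not the repository's actual layer code; the class name and exact layer stack are illustrative. The key idea is that `torch.utils.checkpoint.checkpoint` wraps the concatenation + Batch Norm bottleneck, so those intermediate feature maps are freed after the forward pass and recomputed during backward:

```python
import torch
import torch.nn as nn
import torch.utils.checkpoint as cp

class BottleneckLayer(nn.Module):
    """Illustrative DenseNet bottleneck: concat -> BN -> ReLU -> 1x1 conv.

    With efficient=True, the concatenated/normalized feature maps are not
    cached for the backward pass; checkpoint() recomputes them instead.
    """

    def __init__(self, num_input_features, growth_rate, efficient=True):
        super().__init__()
        self.norm = nn.BatchNorm2d(num_input_features)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(num_input_features, growth_rate,
                              kernel_size=1, stride=1, bias=False)
        self.efficient = efficient

    def bn_function(self, *prev_features):
        # The memory-hungry part: concatenate all earlier feature maps,
        # then normalize. These intermediates are what checkpointing discards.
        concatenated = torch.cat(prev_features, dim=1)
        return self.conv(self.relu(self.norm(concatenated)))

    def forward(self, *prev_features):
        if self.efficient and any(f.requires_grad for f in prev_features):
            # Recompute bn_function during backward instead of caching it.
            return cp.checkpoint(self.bn_function, *prev_features)
        return self.bn_function(*prev_features)
```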
**Running the demo:**

The only extra package you need to install is [python-fire](https://github.com/google/python-fire):

```sh
pip install fire
```

- Single GPU:

```sh
CUDA_VISIBLE_DEVICES=0 python demo.py --efficient True --data <path_to_folder_with_cifar10> --save <path_to_save_dir>
```

- Multiple GPUs:

```sh
CUDA_VISIBLE_DEVICES=0,1,2 python demo.py --efficient True --data <path_to_folder_with_cifar10> --save <path_to_save_dir>
```

Options:
- `--depth` (int) - depth of the network (number of convolution layers) (default 40)
- `--growth_rate` (int) - number of features added per DenseNet layer (default 12)
- `--n_epochs` (int) - number of epochs for training (default 300)
- `--batch_size` (int) - size of minibatch (default 256)
- `--seed` (int) - manually set the random seed (default None)

## Performance

A comparison of the two implementations (each is a DenseNet-BC with 100 layers, batch size 64, tested on an NVIDIA Pascal Titan-X):

| Implementation | Memory consumption (GB/GPU) | Speed (sec/minibatch) |
|------------------------|------------------------|------------------------|
| Naive | 2.863 | 0.165 |
| Efficient | 1.605 | 0.207 |
| Efficient (multi-GPU) | 0.985 | - |

## Other efficient implementations

- [LuaTorch](https://github.com/liuzhuang13/DenseNet/tree/master/models) (by Gao Huang)
- [Tensorflow](https://github.com/joeyearsley/efficient_densenet_tensorflow) (by Joe Yearsley)
- [Caffe](https://github.com/Tongcheng/DN_CaffeScript) (by Tongcheng Li)

## Reference

```
@article{pleiss2017memory,
  title={Memory-Efficient Implementation of DenseNets},
  author={Pleiss, Geoff and Chen, Danlu and Huang, Gao and Li, Tongcheng and van der Maaten, Laurens and Weinberger, Kilian Q},
  journal={arXiv preprint arXiv:1707.06990},
  year={2017}
}
```