# ml-spin

**Repository Path**: mirrors_apple/ml-spin

## Basic Information

- **Project Name**: ml-spin
- **Description**: This repository contains the official implementation for the ECCV'22 paper, "SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks".
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-07-20
- **Last Updated**: 2026-03-21

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# SPIN
This repository contains the official implementation for the ECCV'22 paper, ["SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks"](https://arxiv.org/abs/2207.10237).

## Code Overview
We provide an implementation of a weight-sharing version of the [ConvMixer](https://openreview.net/pdf?id=TVHS5Y4dNvM) model. The main code for the implementation is in the `models` directory. The model is configured through the files in `configs`. We provide three example configs:
* `configs/ConvMixer.yaml` for the vanilla ConvMixer model.
* `configs/WS-ConvMixer.yaml` for the Weight-shared ConvMixer (WS-ConvMixer) model.
* `configs/WFWS-ConvMixer.yaml` for the Weight-fusion Weight-shared ConvMixer (WFWS-ConvMixer) model.

Note that to run the model in `configs/WFWS-ConvMixer.yaml`, you must have a corresponding pretrained ConvMixer model. Please refer to our paper for details on each technique.

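To make the idea of weight sharing concrete, here is a minimal sketch of how layers could be mapped to a smaller number of shared weight sets. It is not taken from the `models` directory: the function name `sequential_sharing_map` is hypothetical, and it assumes that a "sequential" mapping means consecutive layers reuse the same weight set and that a "uniform" distribution gives every weight set the same number of layers. These are the settings listed for the released WS-ConvMixer checkpoint (sharing rate 2, uniform, sequential).

```python
from typing import List


def sequential_sharing_map(depth: int, sharing_rate: int) -> List[int]:
    """Return, for each layer index, the index of the weight set it reuses.

    Assumption (not from the repository): with a sequential mapping,
    `sharing_rate` consecutive layers share one weight set.
    """
    if sharing_rate < 1:
        raise ValueError("sharing_rate must be at least 1")
    if depth % sharing_rate != 0:
        # A uniform split needs every weight set to serve the same number of layers.
        raise ValueError("depth must be divisible by sharing_rate")
    return [layer // sharing_rate for layer in range(depth)]


# Small example: 8 layers, sharing rate 2 -> 4 distinct weight sets.
mapping = sequential_sharing_map(depth=8, sharing_rate=2)
print(mapping)            # [0, 0, 1, 1, 2, 2, 3, 3]
print(len(set(mapping)))  # 4
```

Under the same assumptions, a depth-32 model with sharing rate 2 would keep 16 distinct weight sets for its repeated layers.
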
## Installation
First, clone this repo with
```
git clone https://github.com/apple/ml-spin.git
```
The implementation of SPIN reuses the infrastructure of Meta Research's open source project [SlowFast](https://github.com/facebookresearch/SlowFast). Our modifications to the SlowFast code are stored in `spin-slowfast.patch`. To download the SlowFast code and apply our changes, run
```
bash setup.sh
```
After getting the codebase ready, follow this [link](https://github.com/facebookresearch/SlowFast/blob/main/INSTALL.md) from the SlowFast repo to set up your environment and install the other dependencies.

## Training
After the environment is set up, you can run the following example training script to train a weight-sharing ConvMixer model. The script assumes you have a machine with 4 GPUs.
```
bash run.sh
```
### Pre-trained ConvMixer Models on ImageNet1K
We provide our pretrained ConvMixer, WS-ConvMixer, and WFWS-ConvMixer models in the following table. For WFWS-ConvMixer, we first initialized the model using the proposed weight fusion technique with the mean strategy, and then ran `models/fuse_weights.py` to export the fused model after training. To re-run this model, please use the WS-ConvMixer configuration. Please note that we did light hyperparameter tuning, so the accuracies are slightly higher than the numbers reported in the paper.

| C/D/P/K | Weight Sharing? | Weight Fusion? | Sharing Rate | Share Distribution | Sharing Mapping | Accuracy | Model Size |
| ------- | --------------- | -------------- | ------------ | ------------------ | --------------- | -------- | ---------- |
| 768/32/14/3 | No  | No   | - | -       | -          | 76.32% | [79MB](pretrained/ConvMixer_768_32_14_3-Stripped.pyth) |
| 768/32/14/3 | Yes | No   | 2 | Uniform | Sequential | 74.27% | [43MB](pretrained/WS-ConvMixer_768_32_14_3-Stripped.pyth) |
| 768/32/14/3 | Yes | Mean | 2 | Uniform | Sequential | 75.21% | [43MB](pretrained/WF-Mean-WS-ConvMixer_768_32_14_3-Fused-Stripped.pyth) |

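As a rough illustration of what a mean fusion strategy could look like, the sketch below averages the weights of all pretrained layers that map to the same shared weight set. This is an assumption about the technique based on its name, not the code in `models/fuse_weights.py`; the function `fuse_mean` and the toy weight vectors are hypothetical, and plain Python lists stand in for real tensors. Please refer to the paper for the actual method.

```python
from typing import Dict, List


def fuse_mean(layer_weights: List[List[float]], sharing_map: List[int]) -> List[List[float]]:
    """Average the weight vectors of layers assigned to the same shared set.

    layer_weights[i] is the (flattened) weight vector of pretrained layer i,
    sharing_map[i] is the index of the shared weight set that layer i uses.
    """
    if len(layer_weights) != len(sharing_map):
        raise ValueError("need exactly one sharing index per layer")

    # Group the pretrained layers by the shared weight set they map to.
    groups: Dict[int, List[List[float]]] = {}
    for weights, shared_idx in zip(layer_weights, sharing_map):
        groups.setdefault(shared_idx, []).append(weights)

    # Element-wise mean within each group, ordered by shared-set index.
    fused = []
    for shared_idx in sorted(groups):
        members = groups[shared_idx]
        fused.append([sum(values) / len(members) for values in zip(*members)])
    return fused


# Toy example: 4 pretrained layers fused into 2 shared weight sets.
pretrained = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(fuse_mean(pretrained, sharing_map=[0, 0, 1, 1]))  # [[2.0, 3.0], [6.0, 7.0]]
```

In this reading, the fused weights serve as the initialization of the weight-shared model, which is then trained as usual.
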
## Citation
If you find our code or paper helpful, please consider citing:
```
@inproceedings{spin_eccv22,
    author    = {Lin, Chien-Yu and Prabhu, Anish and Merth, Thomas and Mehta, Sachin and Ranjan, Anurag and Horton, Maxwell and Rastegari, Mohammad},
    title     = {SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks},
    booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
    year      = {2022}
}
```