# DeepSpeed

**Repository Path**: xuuu3/DeepSpeed

## Basic Information

- **Project Name**: DeepSpeed
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-05-15
- **Last Updated**: 2025-05-16

## Categories & Tags

- **Categories**: Uncategorized
- **Tags**: None

## README

[License](https://github.com/deepspeedai/DeepSpeed/blob/master/LICENSE) | [PyPI](https://pypi.org/project/deepspeed/) | [Downloads](https://pepy.tech/project/deepspeed) | [Build](#build-pipeline-status) | [Best Practices](https://www.bestpractices.dev/projects/9530) | [Twitter](https://twitter.com/intent/follow?screen_name=DeepSpeedAI) | [Twitter (JP)](https://twitter.com/DeepSpeedAI_JP) | [Zhihu](https://www.zhihu.com/people/deepspeed)

## DeepSpeed-Training
DeepSpeed offers a confluence of system innovations that have made large-scale DL training effective and efficient, greatly improved ease of use, and redefined the DL training landscape in terms of the scale that is possible. Innovations such as ZeRO, 3D-Parallelism, DeepSpeed-MoE, and ZeRO-Infinity fall under the training pillar. Learn more: [DeepSpeed-Training](https://www.deepspeed.ai/training/)
## DeepSpeed-Inference
DeepSpeed brings together innovations in parallelism technology such as tensor, pipeline, expert and ZeRO-parallelism, and combines them with high performance custom inference kernels, communication optimizations and heterogeneous memory technologies to enable inference at an unprecedented scale, while achieving unparalleled latency, throughput and cost reduction. This systematic composition of system technologies for inference falls under the inference pillar. Learn more: [DeepSpeed-Inference](https://www.deepspeed.ai/inference)
## DeepSpeed-Compression
To further increase inference efficiency, DeepSpeed offers easy-to-use and flexible-to-compose compression techniques that let researchers and practitioners compress their models while delivering faster inference, smaller model size, and significantly reduced compression cost. Moreover, SoTA compression innovations such as ZeroQuant and XTC are included under the compression pillar. Learn more: [DeepSpeed-Compression](https://www.deepspeed.ai/compression)
## DeepSpeed4Science
In line with Microsoft's mission to solve humanity's most pressing challenges, the DeepSpeed team at Microsoft has launched a new initiative called *DeepSpeed4Science*, which aims to build unique capabilities through AI system technology innovations that help domain experts unlock today's biggest science mysteries. Learn more: [tutorials](https://www.deepspeed.ai/deepspeed4science/)

---
# DeepSpeed Software Suite
## DeepSpeed Library
The [DeepSpeed](https://github.com/deepspeedai/deepspeed) library (this repository) implements and packages the innovations and technologies in the DeepSpeed Training, Inference and Compression pillars into a single easy-to-use, open-source repository. It allows for easy composition of a multitude of features within a single training, inference or compression pipeline. The DeepSpeed library is heavily adopted by the DL community and has been used to enable some of the most powerful models (see [DeepSpeed Adoption](#deepspeed-adoption)).
## Model Implementations for Inference (MII)
[Model Implementations for Inference (MII)](https://github.com/deepspeedai/deepspeed-mii) is an open-source repository that makes low-latency, high-throughput inference accessible to all data scientists by removing the need to apply complex system optimization techniques themselves. Out of the box, MII supports thousands of widely used DL models, optimized using DeepSpeed-Inference, that can be deployed with a few lines of code while achieving significant latency reduction compared to their vanilla open-source versions.
## DeepSpeed on Azure
DeepSpeed users are diverse and have access to different environments. We recommend trying DeepSpeed on Azure, as it is the simplest and easiest method. The recommended way to do so is through AzureML [recipes](https://github.com/Azure/azureml-examples/tree/main/v1/python-sdk/workflows/train/deepspeed). The job submission and data preparation scripts are available [here](https://github.com/deepspeedai/Megatron-DeepSpeed/tree/main/examples_deepspeed/azureml). For more details on how to use DeepSpeed on Azure, please follow the [Azure tutorial](https://www.deepspeed.ai/tutorials/azure/).

---
# DeepSpeed Adoption
DeepSpeed was an important part of Microsoft’s
[AI at Scale](https://www.microsoft.com/en-us/research/project/ai-at-scale/)
initiative to enable next-generation AI capabilities at scale; more information is
available [here](https://innovation.microsoft.com/en-us/exploring-ai-at-scale).

DeepSpeed has been used to train many different large-scale models. Below is a list of several examples that we are aware of (if you'd like to include your model, please submit a PR):
* [Megatron-Turing NLG (530B)](https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/)
* [Jurassic-1 (178B)](https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf)
* [BLOOM (176B)](https://huggingface.co/blog/bloom-megatron-deepspeed)
* [GLM (130B)](https://github.com/THUDM/GLM-130B)
* [xTrimoPGLM (100B)](https://www.biorxiv.org/content/10.1101/2023.07.05.547496v2)
* [YaLM (100B)](https://github.com/yandex/YaLM-100B)
* [GPT-NeoX (20B)](https://github.com/EleutherAI/gpt-neox)
* [AlexaTM (20B)](https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning)
* [Turing NLG (17B)](https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/)
* [METRO-LM (5.4B)](https://arxiv.org/pdf/2204.06644.pdf)
DeepSpeed has been integrated with several different popular open-source DL frameworks such as:

| Framework    | Documentation |
| ------------ | ------------- |
| Transformers | [Transformers with DeepSpeed](https://huggingface.co/docs/transformers/deepspeed) |
| Accelerate   | [Accelerate with DeepSpeed](https://huggingface.co/docs/accelerate/usage_guides/deepspeed) |
| MMEngine     | [MMEngine with DeepSpeed](https://mmengine.readthedocs.io/en/latest/common_usage/large_model_training.html#deepspeed) |

---
# Build Pipeline Status
| Description | Status |
| ----------- | ------ |
| NVIDIA | [nv-torch110-p40](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-torch110-p40.yml) [nv-torch110-v100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-torch110-v100.yml) [nv-torch-latest-v100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-torch-latest-v100.yml) [nv-h100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-h100.yml) [nv-inference](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-inference.yml) [nv-nightly](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-nightly.yml) |
| AMD | [amd-mi200](https://github.com/deepspeedai/DeepSpeed/actions/workflows/amd-mi200.yml) |
| CPU | [cpu-torch-latest](https://github.com/deepspeedai/DeepSpeed/actions/workflows/cpu-torch-latest.yml) [cpu-inference](https://github.com/deepspeedai/DeepSpeed/actions/workflows/cpu-inference.yml) |
| Intel Gaudi | [hpu-gaudi2](https://github.com/deepspeedai/DeepSpeed/actions/workflows/hpu-gaudi2.yml) |
| Intel XPU | [xpu-max1100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/xpu-max1100.yml) |
| PyTorch Nightly | [nv-torch-nightly-v100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-torch-nightly-v100.yml) |
| Integrations | [nv-transformers-v100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-transformers-v100.yml) [nv-lightning-v100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-lightning-v100.yml) [nv-accelerate-v100](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-accelerate-v100.yml) [nv-mii](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-mii.yml) [nv-ds-chat](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-ds-chat.yml) [nv-sd](https://github.com/deepspeedai/DeepSpeed/actions/workflows/nv-sd.yml) |
| Misc | [formatting](https://github.com/deepspeedai/DeepSpeed/actions/workflows/formatting.yml) [pages-build-deployment](https://github.com/deepspeedai/DeepSpeed/actions/workflows/pages/pages-build-deployment) [readthedocs](https://deepspeed.readthedocs.io/en/latest/?badge=latest) [python](https://github.com/deepspeedai/DeepSpeed/actions/workflows/python.yml) |
| Huawei Ascend NPU | [Ascend-CI deepspeed](https://github.com/Ascend/Ascend-CI/actions/workflows/deepspeed.yaml) |
# Installation
The quickest way to get started with DeepSpeed is via pip, which installs
the latest release of DeepSpeed. This release is not tied to specific PyTorch
or CUDA versions. DeepSpeed includes several C++/CUDA extensions that we
commonly refer to as our 'ops'. By default, all of these extensions/ops are built
just-in-time (JIT) using [torch's JIT C++ extension loader, which relies on
ninja](https://pytorch.org/docs/stable/cpp_extension.html) to build and
dynamically link them at runtime.
## Requirements
* [PyTorch](https://pytorch.org/) must be installed _before_ installing DeepSpeed.
* For full feature support we recommend a version of PyTorch that is >= 1.9 and ideally the latest PyTorch stable release.
* A CUDA or ROCm compiler such as [nvcc](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#introduction) or [hipcc](https://github.com/ROCm-Developer-Tools/HIPCC), which is used to compile the C++/CUDA/HIP extensions.
* The specific GPUs we develop and test against are listed below. This doesn't mean your GPU will not work if it isn't listed; it's just that DeepSpeed is most thoroughly tested on the following:
  * NVIDIA: Pascal, Volta, Ampere, and Hopper architectures
  * AMD: MI100 and MI200
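
The requirements above can be checked with a short script before installing. This is a minimal sketch, not part of DeepSpeed: the helper name `check_prerequisites` is hypothetical, and it only checks that a `torch` package can be found and that `nvcc` or `hipcc` is on the `PATH`. It does not verify the PyTorch version or the GPU architecture.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a dict describing which DeepSpeed prerequisites were found.

    Hypothetical helper for illustration; DeepSpeed does not ship it.
    """
    return {
        # PyTorch must be installed before DeepSpeed is installed.
        "torch": importlib.util.find_spec("torch") is not None,
        # A CUDA or ROCm compiler is used to build the C++/CUDA/HIP ops.
        # shutil.which returns the executable's path, or None if not found.
        "nvcc": shutil.which("nvcc"),
        "hipcc": shutil.which("hipcc"),
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {found if found else 'not found'}")
```

A missing compiler is not necessarily fatal. It only means that ops requiring compilation cannot be built on this machine.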
## Contributed HW support
* DeepSpeed now supports various HW accelerators.

| Contributor | Hardware | Accelerator Name | Contributor validated | Upstream validated |
|-------------|-------------------------------------|------------------| --------------------- |--------------------|
| Huawei | Huawei Ascend NPU | npu | Yes | No |
| Intel | Intel(R) Gaudi(R) 2 AI accelerator | hpu | Yes | Yes |
| Intel | Intel(R) Xeon(R) Processors | cpu | Yes | Yes |
| Intel | Intel(R) Data Center GPU Max series | xpu | Yes | Yes |
| Tecorigin | Scalable Data Analytics Accelerator | sdaa | Yes | No |
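
To see which accelerator name from the table applies on a given machine, one option is to query DeepSpeed's accelerator abstraction. The sketch below assumes the installed DeepSpeed version exposes `deepspeed.accelerator.get_accelerator()` with a `device_name()` method. The wrapper `detect_accelerator_name` is a hypothetical helper that returns `None` when DeepSpeed cannot be imported.

```python
def detect_accelerator_name():
    """Return the accelerator name DeepSpeed selected, or None.

    Hypothetical helper for illustration only.
    """
    try:
        # Assumption: this import path exists in the installed DeepSpeed version.
        from deepspeed.accelerator import get_accelerator
    except ImportError:
        # DeepSpeed (or PyTorch, which it depends on) is not installed.
        return None
    # device_name() with no index returns the bare device type string.
    return get_accelerator().device_name()


if __name__ == "__main__":
    print(detect_accelerator_name())
```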
## PyPI
We regularly push releases to [PyPI](https://pypi.org/project/deepspeed/) and encourage users to install from there in most cases.
```bash
pip install deepspeed
```
After installation, you can validate your install and see which extensions/ops
your machine is compatible with via the DeepSpeed environment report.
```bash
ds_report
```
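
The same validation step can be scripted, for example in a setup check. This is a minimal sketch under stated assumptions: `run_ds_report` is a hypothetical wrapper, it relies only on the `ds_report` command shown above being on the `PATH`, and it makes no assumptions about the format of the report.

```python
import shutil
import subprocess


def run_ds_report(timeout=300):
    """Run `ds_report` if it is on PATH and return its stdout, else None.

    Hypothetical wrapper for illustration; the report text is returned
    unparsed because its layout may differ between DeepSpeed versions.
    """
    exe = shutil.which("ds_report")
    if exe is None:
        # DeepSpeed is not installed in this environment.
        return None
    result = subprocess.run(
        [exe], capture_output=True, text=True, timeout=timeout
    )
    return result.stdout


if __name__ == "__main__":
    report = run_ds_report()
    print(report if report is not None else "ds_report not found on PATH")
```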
If you would like to pre-install any of the DeepSpeed extensions/ops (instead
of JIT compiling) or install pre-compiled ops via PyPI please see our [advanced
installation instructions](https://www.deepspeed.ai/tutorials/advanced-install/).
## Windows
Many DeepSpeed features are supported on Windows for both training and inference. You can read more about this in the original blog post [here](https://github.com/deepspeedai/DeepSpeed/tree/master/blogs/windows/08-2024/README.md). Features that are currently not supported include async I/O (AIO) and GDS (which does not support Windows).

1. Install PyTorch, such as PyTorch 2.3+cu121.
2. Install Visual C++ build tools, such as VS2022 C++ x64/x86 build tools.
3. Launch a Cmd console with Administrator permissions (needed to create the required symlink folders) and ensure MSVC tools are added to your PATH, or launch the Developer Command Prompt for Visual Studio 2022 with Administrator permissions.
4. Run `build_win.bat` to build a wheel in the `dist` folder.
# Features
Please check out the [DeepSpeed-Training](https://www.deepspeed.ai/training), [DeepSpeed-Inference](https://www.deepspeed.ai/inference) and [DeepSpeed-Compression](https://www.deepspeed.ai/compression) pages for the full set of features offered along each of these three pillars.
# Further Reading
All DeepSpeed documentation, tutorials, and blogs can be found on our website: [deepspeed.ai](https://www.deepspeed.ai/)
| | Description |
| ---------------------------------------------------------------------------------------------- | -------------------------------------------- |
| [Getting Started](https://www.deepspeed.ai/getting-started/) | First steps with DeepSpeed |
| [DeepSpeed JSON Configuration](https://www.deepspeed.ai/docs/config-json/) | Configuring DeepSpeed |
| [API Documentation](https://deepspeed.readthedocs.io/en/latest/) | Generated DeepSpeed API documentation |
| [Tutorials](https://www.deepspeed.ai/tutorials/) | Tutorials |
| [Blogs](https://www.deepspeed.ai/posts/) | Blogs |
# Contributing
DeepSpeed welcomes your contributions! Please see our
[contributing](CONTRIBUTING.md) guide for more details on formatting, testing,
etc.