diff --git a/README.md b/README.md
index d15f9dc95f08f0155e40d6a2be19bacea0fc3738..d8156156e6410e6744ec9caa4e5c599817591715 100644
--- a/README.md
+++ b/README.md
@@ -183,23 +183,29 @@ DeepSparkInference将按季度进行版本更新，后续会逐步丰富模型
 
 ## LLM (Large Language Model)
 
-| Model | vLLM | TRT-LLM | TGI |
-|-------|------|---------|-----|
-| Baichuan2-7B | [✅](models/nlp/large_language_model/baichuan2-7b/vllm/README.md) | | |
-| ChatGLM-3-6B | [✅](models/nlp/large_language_model/chatglm3-6b/vllm/README.md) | | |
-| ChatGLM-3-6B-32K | [✅](models/nlp/large_language_model/chatglm3-6b-32k/vllm/README.md) | | |
-| Llama2-7B | [✅](models/nlp/large_language_model/llama2-7b/vllm/README.md) | [✅](models/nlp/large_language_model/llama2-7b/trtllm/README.md) | |
-| Llama2-13B | | [✅](models/nlp/large_language_model/llama2-13b/trtllm/README.md) | |
-| Llama2-70B | | [✅](models/nlp/large_language_model/llama2-70b/trtllm/README.md) | |
-| Llama3-70B | [✅](models/nlp/large_language_model/llama3-70b/vllm/README.md) | | |
-| Qwen-7B | [✅](models/nlp/large_language_model/qwen-7b/vllm/README.md) | | |
-| Qwen1.5-7B | [✅](models/nlp/large_language_model/qwen1.5-7b/vllm/README.md) | | [✅](models/nlp/large_language_model/qwen1.5-7b/text-generation-inference/README.md) |
-| Qwen1.5-14B | [✅](models/nlp/large_language_model/qwen1.5-14b/vllm/README.md) | | |
-| Qwen1.5-32B Chat | [✅](models/nlp/large_language_model/qwen1.5-32b/vllm/README.md) | | |
-| Qwen1.5-72B | [✅](models/nlp/large_language_model/qwen1.5-72b/vllm/README.md) | | |
-| Qwen2-7B Instruct | [✅](models/nlp/large_language_model/qwen2-7b/vllm/README.md) | | |
-| Qwen2-72B Instruct | [✅](models/nlp/large_language_model/qwen2-72b/vllm/README.md) | | |
-| StableLM2-1.6B | [✅](models/nlp/large_language_model/stablelm/vllm/README.md) | | |
+| Model | vLLM | TRT-LLM | TGI |
+|-------|------|---------|-----|
+| Baichuan2-7B | [✅](models/nlp/large_language_model/baichuan2-7b/vllm/README.md) | | |
+| ChatGLM-3-6B | [✅](models/nlp/large_language_model/chatglm3-6b/vllm/README.md) | | |
+| ChatGLM-3-6B-32K | [✅](models/nlp/large_language_model/chatglm3-6b-32k/vllm/README.md) | | |
+| DeepSeek-R1-Distill-Llama-8B | [✅](models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/README.md) | | |
+| DeepSeek-R1-Distill-Llama-70B | [✅](models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/README.md) | | |
+| DeepSeek-R1-Distill-Qwen-1.5B | [✅](models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/README.md) | | |
+| DeepSeek-R1-Distill-Qwen-7B | [✅](models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/README.md) | | |
+| DeepSeek-R1-Distill-Qwen-14B | [✅](models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/README.md) | | |
+| DeepSeek-R1-Distill-Qwen-32B | [✅](models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/README.md) | | |
+| Llama2-7B | [✅](models/nlp/large_language_model/llama2-7b/vllm/README.md) | [✅](models/nlp/large_language_model/llama2-7b/trtllm/README.md) | |
+| Llama2-13B | | [✅](models/nlp/large_language_model/llama2-13b/trtllm/README.md) | |
+| Llama2-70B | | [✅](models/nlp/large_language_model/llama2-70b/trtllm/README.md) | |
+| Llama3-70B | [✅](models/nlp/large_language_model/llama3-70b/vllm/README.md) | | |
+| Qwen-7B | [✅](models/nlp/large_language_model/qwen-7b/vllm/README.md) | | |
+| Qwen1.5-7B | [✅](models/nlp/large_language_model/qwen1.5-7b/vllm/README.md) | | [✅](models/nlp/large_language_model/qwen1.5-7b/text-generation-inference/README.md) |
+| Qwen1.5-14B | [✅](models/nlp/large_language_model/qwen1.5-14b/vllm/README.md) | | |
+| Qwen1.5-32B Chat | [✅](models/nlp/large_language_model/qwen1.5-32b/vllm/README.md) | | |
+| Qwen1.5-72B | [✅](models/nlp/large_language_model/qwen1.5-72b/vllm/README.md) | | |
+| Qwen2-7B Instruct | [✅](models/nlp/large_language_model/qwen2-7b/vllm/README.md) | | |
+| Qwen2-72B Instruct | [✅](models/nlp/large_language_model/qwen2-72b/vllm/README.md) | | |
+| StableLM2-1.6B | [✅](models/nlp/large_language_model/stablelm/vllm/README.md) | | |
 
 ## Multimodal
 
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/README.md b/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/README.md
index 26049a59f9beae20d1382ba6b597b4f8f8c6808f..af8b5abbcfc6eb665cb6f5a935dfa2bc5e3f1781 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/README.md
+++ b/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/README.md
@@ -2,7 +2,9 @@
 
 ## Description
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by
+DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B,
+8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## Setup
 
@@ -31,6 +33,7 @@ ln -s /path/to/DeepSeek-R1-Distill-Llama-70B ./data/
 ```bash
 python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Llama-70B --max-tokens 256 -tp 8 --temperature 0.0 --max-model-len 3096
 ```
+
 ## Inference with serve
 
 ```bash
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/ci/prepare.sh b/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/ci/prepare.sh
index 75fb19458942e1d61a674c1d6dd9bbdb521bc00c..0fa3df9b4017331b2579cf5e039676248f79fff9 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/ci/prepare.sh
+++ b/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/ci/prepare.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
 # All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/offline_inference.py b/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/offline_inference.py
index 9b7d87fd797c78fcedba7cd4c9a9a0e7642c251f..7653847bb042e59cc174d93a0bfc98f521e743e3 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/offline_inference.py
+++ b/models/nlp/large_language_model/deepseek-r1-distill-llama-70b/vllm/offline_inference.py
@@ -1,3 +1,18 @@
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 import sys
 from pathlib import Path
 import os
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/README.md b/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/README.md
index 903a7a160a1e77f7ff35a1f320b810477e7a5455..b5b9c6d0fd73099ac11bac6ebbc18f99c19201f5 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/README.md
+++ b/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/README.md
@@ -2,7 +2,9 @@
 
 ## Description
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by
+DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B,
+8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## Setup
 
@@ -31,6 +33,7 @@ ln -s /path/to/DeepSeek-R1-Distill-Llama-8B ./data/
 ```bash
 python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Llama-8B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096
 ```
+
 ## Inference with serve
 
 ```bash
@@ -39,9 +42,9 @@ vllm serve data/DeepSeek-R1-Distill-Llama-8B --tensor-parallel-size 2 --max-mode
 
 ## Results
 
-| Model | QPS |
-| ---------- | ----- |
-| DeepSeek-R1-Distill-Llama-8B | 105.33|
+| Model                        | QPS    |
+|------------------------------|--------|
+| DeepSeek-R1-Distill-Llama-8B | 105.33 |
 
 ## Reference
 
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/ci/prepare.sh b/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/ci/prepare.sh
index 75fb19458942e1d61a674c1d6dd9bbdb521bc00c..0fa3df9b4017331b2579cf5e039676248f79fff9 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/ci/prepare.sh
+++ b/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/ci/prepare.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
 # All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/offline_inference.py b/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/offline_inference.py
index 9b7d87fd797c78fcedba7cd4c9a9a0e7642c251f..7653847bb042e59cc174d93a0bfc98f521e743e3 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/offline_inference.py
+++ b/models/nlp/large_language_model/deepseek-r1-distill-llama-8b/vllm/offline_inference.py
@@ -1,3 +1,18 @@
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 import sys
 from pathlib import Path
 import os
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/README.md b/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/README.md
index c52fc0f55f7692753cea62ebd1d171f80a739136..88eb51633c2fc459fccaada7df287349d3c89808 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/README.md
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/README.md
@@ -2,7 +2,9 @@
 
 ## Description
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by
+DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B,
+8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## Setup
 
@@ -31,6 +33,7 @@ ln -s /path/to/DeepSeek-R1-Distill-Qwen-1.5B ./data/
 ```bash
 python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-1.5B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096
 ```
+
 ## Inference with serve
 
 ```bash
@@ -39,8 +42,8 @@ vllm serve data/DeepSeek-R1-Distill-Qwen-1.5B --tensor-parallel-size 2 --max-mod
 
 ## Results
 
-| Model | QPS |
-| ---------- | ----- |
+| Model                         | QPS    |
+|-------------------------------|--------|
 | DeepSeek-R1-Distill-Qwen-1.5B | 259.42 |
 
 ## Reference
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/ci/prepare.sh b/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/ci/prepare.sh
index 75fb19458942e1d61a674c1d6dd9bbdb521bc00c..0fa3df9b4017331b2579cf5e039676248f79fff9 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/ci/prepare.sh
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/ci/prepare.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
 # All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/offline_inference.py b/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/offline_inference.py
index 9b7d87fd797c78fcedba7cd4c9a9a0e7642c251f..7653847bb042e59cc174d93a0bfc98f521e743e3 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/offline_inference.py
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-1.5b/vllm/offline_inference.py
@@ -1,3 +1,18 @@
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 import sys
 from pathlib import Path
 import os
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/README.md b/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/README.md
index 7d20eeeec6f29e20af0a1aed438f5f6a26f92986..f24c6904ade62b7028a17bf01bc4ef143d969935 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/README.md
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/README.md
@@ -2,7 +2,9 @@
 
 ## Description
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by
+DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B,
+8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## Setup
 
@@ -31,6 +33,7 @@ ln -s /path/to/DeepSeek-R1-Distill-Qwen-14B ./data/
 ```bash
 python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-14B --max-tokens 256 -tp 2 --temperature 0.0 --max-model-len 3096
 ```
+
 ## Inference with serve
 
 ```bash
@@ -39,9 +42,9 @@ vllm serve data/DeepSeek-R1-Distill-Qwen-14B --tensor-parallel-size 2 --max-mode
 
 ## Results
 
-| Model | QPS |
-| ---------- | ----- |
-| DeepSeek-R1-Distill-Qwen-14B | 88.01|
+| Model                        | QPS   |
+|------------------------------|-------|
+| DeepSeek-R1-Distill-Qwen-14B | 88.01 |
 
 ## Reference
 
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/ci/prepare.sh b/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/ci/prepare.sh
index 75fb19458942e1d61a674c1d6dd9bbdb521bc00c..0fa3df9b4017331b2579cf5e039676248f79fff9 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/ci/prepare.sh
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/ci/prepare.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
 # All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/offline_inference.py b/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/offline_inference.py
index 9b7d87fd797c78fcedba7cd4c9a9a0e7642c251f..7653847bb042e59cc174d93a0bfc98f521e743e3 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/offline_inference.py
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-14b/vllm/offline_inference.py
@@ -1,3 +1,18 @@
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 import sys
 from pathlib import Path
 import os
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/README.md b/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/README.md
index e1c8ca81e97b877136c80e293816845e75c1dec0..4a2b85bc6a9618d4a1d6543503bed7768dc7c919 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/README.md
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/README.md
@@ -2,7 +2,9 @@
 
 ## Description
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by
+DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B,
+8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## Setup
 
@@ -31,6 +33,7 @@ ln -s /path/to/DeepSeek-R1-Distill-Qwen-32B ./data/
 ```bash
 python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-32B --max-tokens 256 -tp 4 --temperature 0.0 --max-model-len 3096
 ```
+
 ## Inference with serve
 
 ```bash
@@ -39,9 +42,9 @@ vllm serve data/DeepSeek-R1-Distill-Qwen-32B --tensor-parallel-size 4 --max-mode
 
 ## Results
 
-| Model | QPS |
-| ---------- | ----- |
-| DeepSeek-R1-Distill-Qwen-32B | 68.30|
+| Model                        | QPS   |
+|------------------------------|-------|
+| DeepSeek-R1-Distill-Qwen-32B | 68.30 |
 
 ## Reference
 
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/ci/prepare.sh b/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/ci/prepare.sh
index 75fb19458942e1d61a674c1d6dd9bbdb521bc00c..0fa3df9b4017331b2579cf5e039676248f79fff9 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/ci/prepare.sh
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/ci/prepare.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
 # All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/offline_inference.py b/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/offline_inference.py
index 9b7d87fd797c78fcedba7cd4c9a9a0e7642c251f..7653847bb042e59cc174d93a0bfc98f521e743e3 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/offline_inference.py
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-32b/vllm/offline_inference.py
@@ -1,3 +1,18 @@
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 import sys
 from pathlib import Path
 import os
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/README.md b/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/README.md
index 8d72e0c77645cd2ade78fd30f4b065e183508870..fba0db66aa7ea985df7b962be5b6ebb5cea59289 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/README.md
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/README.md
@@ -2,7 +2,9 @@
 
 ## Description
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by
+DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B,
+8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.
 
 ## Setup
 
@@ -31,6 +33,7 @@ ln -s /path/to/DeepSeek-R1-Distill-Qwen-7B ./data/
 ```bash
 python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-7B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096
 ```
+
 ## Inference with serve
 
 ```bash
@@ -39,9 +42,9 @@ vllm serve data/DeepSeek-R1-Distill-Qwen-7B --tensor-parallel-size 2 --max-model
 
 ## Results
 
-| Model | QPS |
-| ---------- | ----- |
-| DeepSeek-R1-Distill-Qwen-7B | 90.48|
+| Model                       | QPS   |
+|-----------------------------|-------|
+| DeepSeek-R1-Distill-Qwen-7B | 90.48 |
 
 ## Reference
 
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/ci/prepare.sh b/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/ci/prepare.sh
index 75fb19458942e1d61a674c1d6dd9bbdb521bc00c..0fa3df9b4017331b2579cf5e039676248f79fff9 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/ci/prepare.sh
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/ci/prepare.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2024, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
 # All Rights Reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may
diff --git a/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/offline_inference.py b/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/offline_inference.py
index 9b7d87fd797c78fcedba7cd4c9a9a0e7642c251f..7653847bb042e59cc174d93a0bfc98f521e743e3 100644
--- a/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/offline_inference.py
+++ b/models/nlp/large_language_model/deepseek-r1-distill-qwen-7b/vllm/offline_inference.py
@@ -1,3 +1,18 @@
+# Copyright (c) 2025, Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 import sys
 from pathlib import Path
 import os
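The `vllm serve` commands added in these READMEs expose vLLM's OpenAI-compatible HTTP API (port 8000 by default). As a minimal client sketch under those assumptions, the helpers below build a `/v1/chat/completions` request payload and post it with only the Python standard library; the function names and the example prompt are illustrative, not part of this repository.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, max_tokens: int = 256,
                       temperature: float = 0.0) -> dict:
    """Build an OpenAI-style chat-completion payload for a vLLM server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def post_chat_request(payload: dict,
                      base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to the OpenAI-compatible endpoint and parse the reply.

    Requires a running `vllm serve` instance at base_url.
    """
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a server started as in a README (e.g. `vllm serve data/DeepSeek-R1-Distill-Qwen-7B --tensor-parallel-size 2 ...`), `post_chat_request(build_chat_request("data/DeepSeek-R1-Distill-Qwen-7B", "..."))` returns the usual OpenAI-style response, with the generated text under `choices[0]["message"]["content"]`.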