diff --git a/docs/vllm_mindspore/docs/source_en/index.rst b/docs/vllm_mindspore/docs/source_en/index.rst
index 6bdf807d69e848b567e30d1d1952ea9d938be05a..6cbeafd7013635673d4b38c882385c1b718f1f82 100644
--- a/docs/vllm_mindspore/docs/source_en/index.rst
+++ b/docs/vllm_mindspore/docs/source_en/index.rst
@@ -1,9 +1,9 @@
-vLLM-MindSpore Plugin
+vLLM-MindSpore Plugin Documentation
 =========================================
 
-Overview
+vLLM-MindSpore Plugin Overview
 -----------------------------------------------------
 
-vLLM-MindSpore Plugin (`vllm-mindspore`) is a plugin brewed by the `MindSpore community <https://www.mindspore.cn/en>`_ , which aims to integrate MindSpore LLM inference capabilities into `vLLM <https://github.com/vllm-project/vllm>`_ . With vLLM-MindSpore Plugin, technical strengths of Mindspore and vLLM will be organically combined to provide a full-stack open-source, high-performance, easy-to-use LLM inference solution.
+vLLM-MindSpore Plugin is a plugin incubated by the `MindSpore community <https://www.mindspore.cn/en>`_ , which aims to integrate MindSpore LLM inference capabilities into `vLLM <https://github.com/vllm-project/vllm>`_ . With vLLM-MindSpore Plugin, the technical strengths of MindSpore and vLLM are organically combined to provide a full-stack open-source, high-performance, easy-to-use LLM inference solution.
 
 vLLM, an opensource and community-driven project initiated by Sky Computing Lab, UC Berkeley, has been widely used in academic research and industry applications. On the basis of Continuous Batching scheduling mechanism and PagedAttention Key-Value cache management, vLLM provides a rich set of inference service features, including speculative inference, Prefix Caching, Multi-LoRA, etc. vLLM also supports a wide range of open-source large models, including Transformer-based models (e.g., LLaMa), Mixture-of-Expert models (e.g., DeepSeek), Embedding models (e.g., E5-Mistral), and multi-modal models (e.g., LLaVA). Because vLLM chooses to use PyTorch to build large models and manage storage resources, it cannot deploy large models built upon MindSpore.
diff --git a/docs/vllm_mindspore/docs/source_zh_cn/index.rst b/docs/vllm_mindspore/docs/source_zh_cn/index.rst
index 7fe2761a461e52526a17e1bfbbadca7abbbd4a93..d59bd8650b1f93326d2b7e28a4370fdcde598c70 100644
--- a/docs/vllm_mindspore/docs/source_zh_cn/index.rst
+++ b/docs/vllm_mindspore/docs/source_zh_cn/index.rst
@@ -1,9 +1,9 @@
-vLLM-MindSpore插件文档
+vLLM-MindSpore Plugin 文档
 =========================================
 
-vLLM-MindSpore插件简介
+vLLM-MindSpore Plugin 简介
 -----------------------------------------------------
 
-vLLM-MindSpore插件(`vllm-mindspore`)是一个由 `MindSpore社区 <https://www.mindspore.cn>`_ 孵化的vLLM后端插件。其将基于MindSpore构建的大模型推理能力接入 `vLLM <https://github.com/vllm-project/vllm>`_ ,从而有机整合MindSpore和vLLM的技术优势,提供全栈开源、高性能、易用的大模型推理解决方案。
+vLLM-MindSpore插件(vLLM-MindSpore Plugin)是一个由 `MindSpore社区 <https://www.mindspore.cn>`_ 孵化的vLLM后端插件。其将基于MindSpore构建的大模型推理能力接入 `vLLM <https://github.com/vllm-project/vllm>`_ ,从而有机整合MindSpore和vLLM的技术优势,提供全栈开源、高性能、易用的大模型推理解决方案。
 
 vLLM是由加州大学伯克利分校Sky Computing Lab创建的社区开源项目,已广泛用于学术研究和工业应用。vLLM以Continuous Batching调度机制和PagedAttention Key-Value缓存管理为基础,提供了丰富的推理服务功能,包括投机推理、Prefix Caching、Multi-LoRA等。同时,vLLM已支持种类丰富的开源大模型,包括Transformer类(如LLaMa)、混合专家类(如DeepSeek)、Embedding类(如E5-Mistral)、多模态类(如LLaVA)等。由于vLLM选用PyTorch构建大模型和管理计算存储资源,此前无法使用其部署基于MindSpore大模型的推理服务。