# localai-run-llm

**Repository Path**: fly-llm/localai-run-llm

- **Default Branch**: master
- **Created**: 2024-04-08
- **Last Updated**: 2025-03-10

## 1. LocalAI project

Download the binary from: https://github.com/mudler/LocalAI/releases

```bash
# CUDA 11 build
wget https://github.com/mudler/LocalAI/releases/download/v2.12.3/local-ai-cuda11-Linux-x86_64

# CPU (AVX2) build
wget https://github.com/mudler/LocalAI/releases/download/v2.12.3/local-ai-avx2-Linux-x86_64
```

Replace the address https://huggingface.co with the mirror https://hf-mirror.com/

How to start models: https://localai.io/models/

## 2. Create the embedding endpoint

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/bert-embeddings.yaml",
  "name": "text-embedding-ada-002"
}'
```

Test:

```bash
curl -X 'POST' http://0.0.0.0:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "测试embeddings",
    "model": "text-embedding-ada-002"
  }'
```

## 3. LLM: qwen1.5-0.5b-chat (fast)

Reference: https://github.com/mudler/LocalAI/issues/1110

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/qwen1.5-0.5b.yaml",
  "name": "qwen1.5-0.5b-chat"
}'
```

Test the endpoint:

```bash
# The prompt "北京景点?" means "Beijing attractions?"
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "qwen1.5-0.5b-chat",
    "messages": [
      {
        "role": "user",
        "content": "北京景点?"
      }
    ],
    "max_tokens": 512,
    "temperature": 1
  }'
```

## LLM: qwen1.5-1.8b-chat (fast)

Reference: https://github.com/mudler/LocalAI/issues/1110

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/qwen1.5-1.8b.yaml",
  "name": "qwen1.5-1.8b-chat"
}'
```

Test the endpoint:

```bash
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "qwen1.5-1.8b-chat",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "北京景点"
      }
    ]
  }'
```

## LLM: qwen1.5-7b-chat (fast)

Reference: https://github.com/mudler/LocalAI/issues/1110

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/qwen1.5-7b.yaml",
  "name": "qwen1.5-7b-chat"
}'
```

Test the endpoint:

```bash
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "qwen1.5-7b-chat",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "北京景点"
      }
    ]
  }'
```

## LLM: qwen1.5-14b-chat (fast)

Reference: https://github.com/mudler/LocalAI/issues/1110

```bash
# Download the GGUF weights from ModelScope
axel -a "https://www.modelscope.cn/api/v1/models/qwen/Qwen1.5-14B-Chat-GGUF/repo?Revision=master&FilePath=qwen1_5-14b-chat-q4_0.gguf"

# Register the model with LocalAI
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/qwen1.5-14b.yaml",
  "name": "qwen1.5-14b-chat"
}'
```

Test the endpoint:

```bash
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "qwen1.5-14b-chat",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "北京景点"
      }
    ]
  }'
```

## LLM: qwen1.5-32b-chat (fast)

```bash
# Download the GGUF weights from ModelScope
wget "https://modelscope.cn/api/v1/models/qwen/Qwen1.5-32B-Chat-GGUF/repo?Revision=master&FilePath=qwen1_5-32b-chat-q4_0.gguf"

# Register the model with LocalAI
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/qwen1.5-32b.yaml",
  "name": "qwen1.5-32b-chat"
}'
```

Test the endpoint:

```bash
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "qwen1.5-32b-chat",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "北京景点"
      }
    ]
  }'
```

## Start the local image with Docker

```bash
# Enable debug logging:
docker run -p 8080:8080 -e DEBUG=true --name local-ai -it \
  -v "$(pwd)"/aio:/aio -v "$(pwd)"/models:/build/models localai/localai:latest-aio-cpu
```

## Image generation with stablediffusion-cpp

Image generation requires the Docker image:

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/stablediffusion.yaml",
  "name": "stablediffusion"
}'
```

Test the endpoint:

```bash
curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
  "prompt": "floating hair, portrait, ((loli)), ((one girl)), cute face, hidden hands, asymmetrical bangs, beautiful detailed eyes, eye shadow, hair ornament, ribbons, bowties, buttons, pleated skirt, (((masterpiece))), ((best quality)), colorful|((part of the head)), ((((mutated hands and fingers)))), deformed, blurry, bad anatomy, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, poorly drawn hands, missing limb, blurry, floating limbs, disconnected limbs, malformed hands, blur, out of focus, long neck, long body, Octane renderer, lowres, bad anatomy, bad hands, text",
  "size": "256x256"
}'
```

## LLM: chatglm3-6b (does not run)

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/chatglm3-6b.yaml",
  "name": "chatglm3-6b"
}'
```

Test the endpoint:

```bash
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "chatglm3-6b",
    "messages": [
      {
        "role": "user",
        "content": "北京景点?"
      }
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
```

## LLM: Yi-6B-200K

```bash
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery/yi-6b-200k.yaml",
  "name": "yi-6b-200k"
}'
```

Test the endpoint:

```bash
curl -X 'POST' 'http://0.0.0.0:8080/v1/chat/completions' \
  -H 'Content-Type: application/json' -d '{
    "model": "yi-6b-200k",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "北京景点?"
      }
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
```
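
## Helper script (sketch)

Every model in this README follows the same two steps: register a gallery YAML through `/models/apply`, then send a test request to `/v1/chat/completions`. The following is a minimal sketch of how those steps could be wrapped in shell functions. The function names (`apply_payload`, `chat_payload`, `apply_model`, `chat`) and the `LOCALAI_URL` and `GALLERY_BASE` variables are hypothetical and not part of LocalAI; the sketch assumes the server listens on `localhost:8080` and reuses the `max_tokens` and `temperature` values from the examples in this README.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative, not part of LocalAI.
LOCALAI_URL="${LOCALAI_URL:-http://localhost:8080}"
GALLERY_BASE="https://gitee.com/fly-llm/localai-run-llm/raw/master/model-gallery"

# Print the JSON body for /models/apply.
# $1: gallery file name (e.g. qwen1.5-7b.yaml), $2: model name to register.
apply_payload() {
  printf '{"url": "%s/%s", "name": "%s"}' "$GALLERY_BASE" "$1" "$2"
}

# Print the JSON body for /v1/chat/completions.
# $1: model name, $2: prompt. This sketch does no JSON escaping, so the
# prompt must not contain double quotes or backslashes.
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": 512, "temperature": 0.7}' "$1" "$2"
}

# Register a model from this repository's gallery.
apply_model() {
  curl -sS "$LOCALAI_URL/models/apply" \
    -H "Content-Type: application/json" \
    -d "$(apply_payload "$1" "$2")"
}

# Send a single-turn chat request to a registered model.
chat() {
  curl -sS -X POST "$LOCALAI_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$1" "$2")"
}

# Usage (requires a running LocalAI server):
#   apply_model qwen1.5-7b.yaml qwen1.5-7b-chat
#   chat qwen1.5-7b-chat "北京景点"
```

Keeping the payload builders separate from the `curl` calls makes it possible to inspect the exact JSON before anything is sent, for example with `apply_payload yi-6b-200k.yaml yi-6b-200k`.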