# candle-run-llm

**Repository Path**: fly-llm/candle-run-llm

## Basic Information

- **Project Name**: candle-run-llm
- **Description**: candle-run-llm
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 3
- **Forks**: 1
- **Created**: 2024-02-18
- **Last Updated**: 2025-03-01

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

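## Platform

This project is meant to be run on AutoDL:
[https://www.autodl.com/create](https://www.autodl.com/create)


Select the PyTorch 2.1 image with Python 3.10.

First create a container with that configuration, then clone this project and run the script for the model you want: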

```bash
git clone https://gitee.com/fly-llm/candle-run-llm.git

# Download the candle project

git clone https://github.com/huggingface/candle.git

```

## Use a Rust mirror for faster downloads and set the environment variables in one place

https://rsproxy.cn/

```bash
# Temporary override (applies to the current shell only)
export RUSTUP_DIST_SERVER="https://rsproxy.cn"
export RUSTUP_UPDATE_ROOT="https://rsproxy.cn/rustup"

export RUSTUP_HOME=/root/autodl-tmp/cargo
export CARGO_HOME=/root/autodl-tmp/cargo

source "/root/autodl-tmp/cargo/env"

export HF_HOME=/root/autodl-tmp/hf_cache

```
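The variables above only redirect `rustup`. Building candle also downloads crates, so it may help to point `cargo` at the same mirror. The following is a minimal sketch, assuming `CARGO_HOME` is set as above (falling back to `~/.cargo`) and using the registry URLs that rsproxy.cn publishes; check them against the site before relying on them. The variable names `cargo_home` and `config_file` are illustrative only.

```bash
# Sketch: write a cargo config that uses the rsproxy.cn crates.io mirror.
# Assumption: CARGO_HOME is set as in the block above; otherwise ~/.cargo is used.
cargo_home="${CARGO_HOME:-$HOME/.cargo}"
config_file="$cargo_home/config.toml"

mkdir -p "$cargo_home"

if [ -e "$config_file" ]; then
    # Do not overwrite an existing configuration.
    echo "keeping existing $config_file"
else
    cat > "$config_file" <<'EOF'
[source.crates-io]
replace-with = "rsproxy-sparse"

[source.rsproxy]
registry = "https://rsproxy.cn/crates.io-index"

[source.rsproxy-sparse]
registry = "sparse+https://rsproxy.cn/index/"

[registries.rsproxy]
index = "https://rsproxy.cn/crates.io-index"

[net]
git-fetch-with-cli = true
EOF
    echo "wrote $config_file"
fi
```

If a `config.toml` already exists, the script leaves it alone, so any mirror settings would need to be merged by hand.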

## Run the Qwen-0.5B chat model

Model page: https://hf-mirror.com/Qwen/Qwen1.5-0.5B-Chat

```bash

# Download the model weights
python3 download.py Qwen/Qwen1.5-0.5B-Chat

# Run on the GPU (CUDA build)
cargo run --example qwen --features cuda -- --model-id Qwen/Qwen1.5-0.5B-Chat --prompt 北京景点推荐

# Run on the CPU
cargo run --example qwen -- --model-id Qwen/Qwen1.5-0.5B-Chat --prompt 北京景点推荐

```
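The contents of `download.py` are not shown here. As a rough alternative for the download step, the sketch below fetches a model through hf-mirror.com with `huggingface-cli`. It assumes the `huggingface_hub` Python package (which provides `huggingface-cli`) is installed and that the cache path matches the `HF_HOME` set earlier; `fetch_model` is a hypothetical helper name, and `download.py` may well work differently.

```bash
# Sketch: download a model into the Hugging Face cache via hf-mirror.com.
# Assumption: huggingface_hub is installed, so huggingface-cli is on PATH.
export HF_ENDPOINT="https://hf-mirror.com"
export HF_HOME="${HF_HOME:-/root/autodl-tmp/hf_cache}"

# Hypothetical helper: $1 is the model id, e.g. Qwen/Qwen1.5-0.5B-Chat.
fetch_model() {
    huggingface-cli download "$1"
}

# Example (needs network access, so it is left commented out):
# fetch_model Qwen/Qwen1.5-0.5B-Chat
```

`HF_ENDPOINT` only affects the download tool. Whether the candle examples then pick the files up from `HF_HOME` without another download should be verified on the target machine.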

## Run the Qwen-4B model

Model page: https://hf-mirror.com/Qwen/Qwen1.5-4B-Chat


```bash

python3 download.py Qwen/Qwen1.5-4B-Chat

cargo run --example qwen --features cuda -- --model-id Qwen/Qwen1.5-4B-Chat --prompt 北京景点推荐
```
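The Qwen commands above differ only in the model id and in whether the `cuda` feature is enabled. A small helper can assemble the command for either case. This is a sketch, not part of the project: `qwen_cmd` is a hypothetical name, and it treats the presence of `nvidia-smi` on `PATH` as a stand-in for "a CUDA GPU is usable", which is an assumption.

```bash
# Sketch: print the cargo command for the qwen example.
# Assumption: nvidia-smi on PATH means a CUDA build is wanted.
qwen_cmd() {
    model_id="$1"   # e.g. Qwen/Qwen1.5-4B-Chat
    prompt="$2"     # prompt text without spaces
    features=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        features=" --features cuda"
    fi
    # Print rather than execute, so the command can be inspected first.
    printf 'cargo run --example qwen%s -- --model-id %s --prompt %s\n' \
        "$features" "$model_id" "$prompt"
}

qwen_cmd Qwen/Qwen1.5-4B-Chat 北京景点推荐
```

The printed command is meant to be run from inside the cloned `candle` directory.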


## Run the Yi-6B model

Model page: https://hf-mirror.com/01-ai/Yi-6B-Chat-4bits

```bash

python3 download.py 01-ai/Yi-6B-Chat-4bits

cargo run --example yi --features cuda -- --model-id 01-ai/Yi-6B --prompt 北京景点推荐
```

## Run the ChatGLM model (needs 24 GB+ of GPU memory; failed to start)

Model page: https://hf-mirror.com/THUDM/chatglm3-6b

```bash

python3 download.py THUDM/chatglm3-6b
python3 download.py lmz/candle-chatglm

cargo run --example chatglm --features cuda -- --prompt 北京景点推荐
```