# mnemo

**Repository Path**: yeasincode/mnemo

## Basic Information

- **Project Name**: mnemo
- **Description**: Mnemo 是一个基于 Go 开发的 LLM Gateway，提供了完整的 API 转发、内存管理和中间件处理能力。
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-14
- **Last Updated**: 2026-04-17

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Mnemo

LLM 网关服务，带有记忆管理功能。

## 项目简介

Mnemo 是一个基于 Go 开发的 LLM Gateway，提供了完整的 API 转发、内存管理和中间件处理能力。

## 技术栈

- **Go 1.21+** - 开发语言
- **Gin** - HTTP 框架
- **Wire** - 依赖注入
- **Zap** - 日志库

## 架构设计

```
cmd/server → gin route → gateway → pipeline → composer → client
```

### 目录结构

```
mnemo/
├── cmd/server/              # 入口程序
│   └── main.go             # 主入口
├── internal/
│   ├── api/                # API 接口
│   │   ├── controller.go   # HTTP 控制器
│   │   ├── suite.go        # 测试套件
│   │   └── chat_test.go    # 接口测试
│   ├── client/             # 模型客户端
│   │   ├── interface.go    # 客户端接口
│   │   ├── openai.go      # OpenAI 客户端
│   │   ├── vllm.go        # VLLM 客户端
│   │   ├── sglang.go      # SGLang 客户端
 │   │   ├── manager.go     # 客户端管理
 │   │   ├── transport.go    # HTTP 传输层
 │   │   └── utils.go       # 转换工具
│   ├── gateway/           # 网关核心
│   │   └── service.go     # 业务逻辑
│   ├── inject/            # 依赖注入
│   │   ├── wire.go        # Wire 提供者
│   │   └── wire_gen.go    # Wire 生成代码
│   └── pipeline/          # 中间件
│       ├── trace.go       # 全局 Trace
       ├── auth.go        # Gateway 认证
│       ├── cors.go        # CORS
│       ├── logger.go      # 日志
│       ├── limiter.go     # 限流
│       ├── rewrite.go     # 消息重写
│       └── middleware.go  # 会话中间件
├── pkg/
│   ├── chain/              # 责任链模式
│   ├── composer/           # 记忆管理
│   │   ├── composer.go    # 组合器接口
│   │   ├── history.go     # 历史截断
│   │   ├── system.go      # 系统提示词
│   │   ├── dedup.go       # 去重
│   │   └── context.go     # 上下文构建
│   ├── config/            # 配置加载
│   │   └── config.go      # 配置结构
│   ├── log/               # 日志封装
│   ├── protocol/          # 协议定义
│   │   ├── chat.go        # Chat 请求/响应
│   │   ├── context.go     # 协议上下文
│   │   └── session.go     # 会话
│   ├── server/            # Gin 服务封装
│   ├── stream/            # SSE 流式响应
 │   └── util/              # 工具函数
       ├── trace.go       # Trace 工具
       ├── id.go          # ID 生成
       ├── time.go        # 时间工具
       └── strings.go     # 字符串工具
└── configs/                # 配置文件
    └── application-dev.yaml
```

## 快速开始

### 安装依赖

```bash
go mod tidy
```

### 运行服务

```bash
go run cmd/server/main.go
```

服务默认监听 `http://localhost:8080`

### 配置文件

编辑 `configs/application-dev.yaml`：

```yaml
server:
  port: 8080

upstream:
  default_model: gpt-3.5-turbo
   # 多客户端多模型配置（推荐格式）
   clients:
     - name: openai-main
       type: openai
       endpoint: https://api.openai.com
       handler: "/v1/chat/completions"
       api_key: your-openai-key
       models: [gpt-3.5-turbo, gpt-4, gpt-4o, gpt-4o-mini]
     - name: openai-azure
       type: openai
       endpoint: https://your-openai-azure.openai.azure.com
       handler: "/openai/deployments/gpt-4o/chat/completions"
       api_key: your-azure-key
       models: [gpt-4o-azure]
     - name: local-vllm
       type: vllm
       endpoint: http://localhost:8000
       handler: "/v1/chat/completions"
       models: [llama-3-8b-instruct, qwen-7b-chat, deepseek-r1:8b]
```

### Handler 说明
- 如果提供了 `handler`，完整 URL 为 `endpoint + handler`
- 如果不提供 `handler`：
  - openai: 使用 `endpoint` 直接传入 SDK
  - vllm/sglang: 默认使用 `endpoint + "/v1"`

### API Key 配置
每个客户端必须单独配置自己的 `api_key`，不支持全局默认 API Key。

### 模型路由

支持两种模型指定方式：

1. **按模型名自动匹配**：
```json
{
  "model": "gpt-4o",
  "messages": [...]
}
```

2. **显式指定客户端**：格式 `client@model` (使用 `@` 避免和模型名中 `:` 冲突)
```json
{
  "model": "openai-main@gpt-4o",
  "messages": [...]
}
```

## API 接口

### Chat Completion

```
POST /v1/chat/completions
```

**请求体：**

```json
{
  model: 'gpt-3.5-turbo',
  messages: [
    { role: 'system', content: '你是一个有帮助的助手' },
    { role: 'user', content: '你好' }
  ],
  stream: false,
  temperature: 0.7
}
```

**响应：**

```json
{
  id: 'chatcmpl-xxx',
  object: 'chat.completion',
  created: 1234567890,
  model: 'gpt-3.5-turbo',
  choices: [
    {
      index: 0,
      message: { role: 'assistant', content: '你好！有什么可以帮助你的吗？' },
      finish_reason: 'stop'
    }
  ]
}
```

### 流式响应

设置 `stream: true`，返回 SSE 格式。

### 请求头和响应头

**请求头：**
- `X-Gateway-Auth: role:uid` - Gateway 认证 (必需)

**响应头：**
- `X-Trace-ID: <uuid>` - 全局唯一请求 Trace ID

### 认证说明

Mnemo 使用 Gateway 认证，请求头必须包含 `X-Gateway-Auth`，格式为 `role:uid`。

支持的角色：
- `admin` - 管理员权限
- `user` - 普通用户权限

示例：
```
X-Gateway-Auth: user:12345
```

## 全局 Trace 系统

Mnemo 实现了完整的全局 Trace 系统，为每个请求分配唯一 TraceID，贯穿整个请求生命周期。

### TraceID 特点

- **全局唯一**: 基于 UUID 生成，无碰撞
- **全链路透传**: HTTP → 网关 → 管道 → 记忆编排 → 模型客户端
- **自动注入**: Gin 中间件自动生成，无需手动处理
- **日志自动带 Trace**: 所有日志自动包含 `trace_id` 字段
- **上下游透传**: 请求头 `X-Trace-ID` 传递给上游 API

### 日志示例

```
[INFO] trace_id=54c88f3a-2b6d-4c1a-9b1a-33f2c8e7d1ef 请求开始
[INFO] trace_id=54c88f3a-2b6d-4c1a-9b1a-33f2c8e7d1ef 路由到模型: gpt-4o
[INFO] trace_id=54c88f3a-2b6d-4c1a-9b1a-33f2c8e7d1ef 调用上游API完成
```

## 测试

```bash
go test ./internal/api/... -v
```

## 主要特性

- **全局 Trace 系统** - 全链路唯一 TraceID，自动注入和透传
- **多模型支持** - OpenAI、VLLM、SGLang，支持自定义 Handler
- **责任链模式** - 灵活的中间件编排和 Pipeline 处理
- **内存管理** - 历史截断、系统提示词注入、消息去重
- **SSE 流式** - 支持实时流式输出
- **依赖注入** - 使用 Wire 管理依赖
- **Gateway 认证** - 基于角色的访问控制