# audio_parse

**Repository Path**: queqijingfeng/audio_parse

## Basic Information

- **Project Name**: audio_parse
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-10
- **Last Updated**: 2026-04-10

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Voice Clone TTS

A tool for recording, voice cloning, and text-to-speech, built on the Coqui XTTS v2 model.

## Installation

```bash
pip install -r requirements.txt
```

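> Requires Python 3.9 to 3.11. An NVIDIA GPU is recommended (CPU also works, but synthesis is slower).
> scipy is used for audio resampling; if it is not installed, the tool prompts you to run `pip install scipy`.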
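
The requirements above can be checked before launching the server. The following is a minimal sketch using only the standard library; the function names `python_version_ok` and `install_hint` are hypothetical and are not part of this repository.

```python
import importlib.util
import sys

# Version range stated in the note above (inclusive on both ends).
SUPPORTED_MIN = (3, 9)
SUPPORTED_MAX = (3, 11)


def python_version_ok(version=None):
    """Return True if the (major, minor) version is within 3.9 to 3.11."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


def install_hint(module_name):
    """Return a pip command if the top-level module is missing, else None."""
    if importlib.util.find_spec(module_name) is None:
        return f"pip install {module_name}"
    return None


if __name__ == "__main__":
    if not python_version_ok():
        print("Unsupported Python version; 3.9 to 3.11 is required.")
    hint = install_hint("scipy")
    if hint:
        print(f"scipy is missing, run: {hint}")
```

Running the file directly prints a message only when something is missing, so it can be used as a quick pre-flight check before `python server.py`.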

## Usage

```bash
python server.py
```

Open http://localhost:5000 in a browser.

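### Steps
1. On the "Recording" page, read each prompt aloud and record it (record at least 3 clips).
2. Recordings are saved in the `data/` directory (WAV, 16 kHz, mono).
3. Switch to the "Speech Synthesis" page, enter text, and click Synthesize.
4. The first synthesis downloads the XTTS v2 model (about 1.8 GB); it is cached afterwards.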
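
The steps above can also be approximated from a script. The sketch below is an assumption about how the pieces fit together, not a description of `server.py`, whose internals are not shown here. It checks that the clips in `data/` are 16 kHz mono WAV files, requires at least 3 of them, and then calls XTTS v2 through the Coqui `TTS` Python package. The helper names (`is_valid_clip`, `collect_reference_clips`, `synthesize`) and the output path are hypothetical.

```python
import wave
from pathlib import Path

# Format and minimum count taken from the steps above.
EXPECTED_RATE = 16000
EXPECTED_CHANNELS = 1
MIN_CLIPS = 3


def is_valid_clip(path):
    """Return True if the file is a non-empty 16 kHz mono WAV."""
    try:
        with wave.open(str(path), "rb") as wav:
            return (
                wav.getframerate() == EXPECTED_RATE
                and wav.getnchannels() == EXPECTED_CHANNELS
                and wav.getnframes() > 0
            )
    except (wave.Error, EOFError, OSError):
        return False


def collect_reference_clips(data_dir="data"):
    """Return sorted paths of valid clips, or raise if fewer than MIN_CLIPS."""
    clips = sorted(p for p in Path(data_dir).glob("*.wav") if is_valid_clip(p))
    if len(clips) < MIN_CLIPS:
        raise ValueError(
            f"Need at least {MIN_CLIPS} valid clips in {data_dir}, found {len(clips)}"
        )
    return [str(p) for p in clips]


def synthesize(text, language="zh-cn", data_dir="data", out_path="output.wav"):
    """Clone the voice in data_dir and write synthesized speech to out_path.

    Assumes the Coqui TTS package is installed. The first call downloads
    the XTTS v2 model, as noted in step 4.
    """
    from TTS.api import TTS  # imported lazily: heavy dependency

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=collect_reference_clips(data_dir),
        language=language,
        file_path=out_path,
    )
    return out_path
```

Calling `synthesize("你好，世界")` after recording would write `output.wav` in the working directory. Validating the clips first means a bad recording fails fast, before the model is loaded.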

### Supported Languages
17 languages, including Chinese, English, Japanese, Korean, French, German, and Spanish.
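
When calling the model from code, languages are selected by short codes. The table below is a sketch based on the 17 codes listed in the XTTS v2 model card, taken here as an assumption. This tool's own UI may expose a different subset, so verify against the installed model. `normalize_language` is a hypothetical helper.

```python
# Language codes as listed for XTTS v2 (assumption; verify against your model).
XTTS_V2_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "pl": "Polish",
    "tr": "Turkish",
    "ru": "Russian",
    "nl": "Dutch",
    "cs": "Czech",
    "ar": "Arabic",
    "zh-cn": "Chinese",
    "hu": "Hungarian",
    "ko": "Korean",
    "ja": "Japanese",
    "hi": "Hindi",
}


def normalize_language(code):
    """Map user input such as 'ZH_CN' or 'zh' to a supported code."""
    cleaned = code.strip().lower().replace("_", "-")
    if cleaned == "zh":
        cleaned = "zh-cn"
    if cleaned not in XTTS_V2_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return cleaned
```

Normalizing the code up front gives a clear error for an unsupported language instead of a failure deep inside the synthesis call.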