# hallucination_detection

**Repository Path**: AthenaCrafter/hallucination_detection

## Basic Information

- **Project Name**: hallucination_detection
- **Description**: ###########
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-05-04
- **Last Updated**: 2026-05-17

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# RAG Hallucination Detection Experiment Framework

## Project Structure

```
hallucination_detection/
├── data/
│   ├── raw/                    # Place raw datasets here
│   │   ├── qa_data.json
│   │   ├── summarization_data.json
│   │   └── dialogue_data.json
│   └── processed/              # Processed data
├── src/
│   ├── __init__.py
│   ├── data_loader.py          # Data loading and preprocessing
│   ├── splitter.py             # Answer splitting module
│   ├── nli_detector.py         # NLI detection module
│   ├── llm_judge.py            # LLM-as-Judge module
│   ├── aggregator.py           # Aggregation strategy module
│   └── evaluator.py            # Evaluation metrics module
├── experiments/
│   ├── run_chapter3.py         # Chapter 3: main script for the ablation experiments
│   └── run_chapter4.py         # Chapter 4: main script for the influencing-factor analysis (to be implemented later)
├── results/
│   ├── chapter3/               # Saved experiment results
│   └── chapter4/
├── configs/
│   └── config.yaml             # Unified configuration file
└── README.md
```

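## Module Overview

- **data_loader.py**: Loads the HaluEval dataset; supports the QA, Summarization, and Dialogue subsets
- **splitter.py**: Splits a model answer into multiple clauses to enable fine-grained detection
- **nli_detector.py**: Uses an NLI model to check whether each clause is consistent with the context
- **llm_judge.py**: Uses an LLM as a judge to decide whether an answer contains hallucinations
- **aggregator.py**: Aggregates multiple detection results into a final hallucination verdict
- **evaluator.py**: Computes detection performance metrics (accuracy, F1, etc.)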
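The split, score, and aggregate steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function names (`split_answer`, `toy_unsupported_score`, `aggregate`, `detect`), the regex-based splitter, the `max`/`mean` strategies, and the thresholds are all assumptions. In particular, `toy_unsupported_score` is a word-overlap placeholder standing in for a real NLI model so that the example runs without downloading any weights.

```python
import re
from statistics import mean
from typing import List


def split_answer(answer: str) -> List[str]:
    """Split an answer into sentence-like units (hypothetical stand-in for splitter.py).

    Splits after Latin end punctuation followed by whitespace, or directly
    after CJK end punctuation.
    """
    parts = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", answer.strip())
    return [p.strip() for p in parts if p.strip()]


def toy_unsupported_score(context: str, sentence: str) -> float:
    """Placeholder for an NLI model: share of sentence words absent from the context.

    A real detector would return something like a contradiction or
    non-entailment probability instead.
    """
    context_words = set(re.findall(r"\w+", context.lower()))
    words = re.findall(r"\w+", sentence.lower())
    if not words:
        return 0.0
    return sum(w not in context_words for w in words) / len(words)


def aggregate(scores: List[float], strategy: str = "max", threshold: float = 0.5) -> bool:
    """Combine per-unit scores into one verdict (True means 'hallucinated')."""
    if not scores:
        return False
    if strategy == "max":      # flag the answer if any single unit looks unsupported
        value = max(scores)
    elif strategy == "mean":   # flag the answer only if it is unsupported on average
        value = mean(scores)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return value >= threshold


def detect(context: str, answer: str, strategy: str = "max", threshold: float = 0.5) -> bool:
    """Run split -> score -> aggregate for a single (context, answer) pair."""
    scores = [toy_unsupported_score(context, s) for s in split_answer(answer)]
    return aggregate(scores, strategy, threshold)


context = "Paris is the capital of France."
answer = "Paris is the capital of France. It has 90 million residents."
print(detect(context, answer, strategy="max", threshold=0.6))
print(detect(context, answer, strategy="mean", threshold=0.6))
```

The two strategies can disagree on the same answer: `max` reacts to a single unsupported unit, while `mean` dilutes it across supported ones. That difference is the kind of effect an ablation over aggregation strategies would measure.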

## Dependencies

- transformers
- torch
- scikit-learn (imported as `sklearn`)
- pandas
- numpy
- nltk
- tqdm

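## Experiment Workflow

1. Load the dataset
2. Preprocess the data
3. Run the different detection configurations
4. Evaluate performance
5. Analyze the results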
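The workflow above can be sketched as a small loop that runs several detector configurations over labeled samples and scores each one. Everything here is an illustrative assumption rather than the repository's implementation: the sample fields (`context`, `answer`, `label`), the two toy detectors, and the helper names `evaluate` and `run_configs`. The metrics are computed by hand to keep the example dependency-free; `sklearn.metrics.accuracy_score` and `sklearn.metrics.f1_score` compute the same accuracy and F1.

```python
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]


def evaluate(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Binary metrics with label 1 = hallucinated, 0 = faithful."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def run_configs(
    samples: List[Sample],
    detectors: Dict[str, Callable[[Sample], int]],
) -> Dict[str, Dict[str, float]]:
    """Run every named detector over all samples and collect its metrics."""
    y_true = [int(s["label"]) for s in samples]
    return {
        name: evaluate(y_true, [int(detector(s)) for s in samples])
        for name, detector in detectors.items()
    }


# Toy data in a hypothetical schema; real HaluEval records would need mapping into it.
samples = [
    {"context": "Paris is the capital of France.",
     "answer": "Paris is the capital of France.", "label": 0},
    {"context": "Paris is the capital of France.",
     "answer": "Lyon is the capital of France.", "label": 1},
    {"context": "Water boils at 100 degrees Celsius at sea level.",
     "answer": "Water boils at 100 degrees Celsius at sea level.", "label": 0},
    {"context": "Water boils at 100 degrees Celsius at sea level.",
     "answer": "Water boils at 50 degrees Celsius at sea level.", "label": 1},
]

# Two deliberately simple configurations, used only to exercise the loop.
detectors = {
    "verbatim_check": lambda s: int(s["answer"] not in s["context"]),
    "never_flag": lambda s: 0,
}

results = run_configs(samples, detectors)
for name, metrics in results.items():
    print(name, metrics)
```

Keeping each configuration behind the same `Sample -> int` interface means an NLI-based detector, an LLM judge, or a combination of both could be swapped in without changing the evaluation code.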