# Alphahunt **Repository Path**: mirrors_microsoft/Alphahunt ## Basic Information - **Project Name**: Alphahunt - **Description**: Training threat hunting agents through scalable realistic synthetic data - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-11-16 - **Last Updated**: 2026-03-29 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # ๐ŸŽฏ AlphaHunt: Synthetic Environment Builder > Generate high-quality synthetic cybersecurity data at scale for training reinforcement learning models in incident investigation, threat hunting, and vulnerability detection. [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/) [![uv](https://img.shields.io/badge/uv-package%20manager-orange)](https://github.com/astral-sh/uv) [![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE) ## ๐Ÿ“‹ Table of Contents - [Overview](#-overview) - [Features](#-features) - [Project Structure](#-project-structure) - [Quick Start](#-quick-start) - [Usage](#-usage) - [Configuration](#-configuration) - [Testing](#-testing) - [Development](#-development) - [Customization](#-customization) - [Troubleshooting](#-troubleshooting) - [Contributing](#-contributing) ## ๐Ÿ” Overview AlphaHunt simulates realistic digital environments by creating: - **Synthetic companies** with organizational structures - **Entities** including users, processes, and devices - **Realistic security logs** for training ML models - **Attack scenarios** across the MITRE ATT&CK framework ## โœจ Features - ๐Ÿข **Synthetic Company Generation** - Create realistic organizational profiles - ๐Ÿ‘ฅ **Entity Simulation** - Generate users, devices, and processes with relationships - ๐Ÿ“Š **Security Log Generation** - Produce realistic event logs at scale - ๐ŸŽญ **Attack Simulation** - Implement full attack chains from reconnaissance to impact - ๐Ÿ”ง **Highly Configurable** - YAML-based scenario configuration - โšก **Fast & Modern** - Built with uv for lightning-fast dependency management ## ๐Ÿ“ Project Structure ``` Alphahunt/ โ”œโ”€โ”€ config/ # YAML scenario configurations โ”‚ โ”œโ”€โ”€ environment_config.yaml โ”‚ โ””โ”€โ”€ *.yaml # Attack chain scenarios โ”‚ โ”œโ”€โ”€ src/ โ”‚ โ”œโ”€โ”€ attack_simulation/ # Attack simulation engine โ”‚ โ”‚ โ”œโ”€โ”€ components/ # MITRE ATT&CK technique implementations โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ Benign/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ Reconnaissance/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ InitialAccess/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ CredentialAccess/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ Execution/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ LateralMovement/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ Collection/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ CommandAndControl/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ Exfiltration/ โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ Impact/ โ”‚ โ”‚ โ”‚ โ””โ”€โ”€ Persistence/ โ”‚ โ”‚ โ””โ”€โ”€ scenarios/ # KC7 / SimuLand mappings โ”‚ โ”‚ โ”‚ โ”œโ”€โ”€ data_generation/ # Defender-XDR schema builders โ”‚ โ”‚ โ”œโ”€โ”€ simulator.py # CLI entry point โ”‚ โ”‚ โ”œโ”€โ”€ environment_builder.py # Core environment builder โ”‚ โ”‚ โ””โ”€โ”€ defender_xdr/ # Schema-specific generators โ”‚ โ”‚ โ”‚ โ””โ”€โ”€ utils/ # Utilities โ”‚ โ”œโ”€โ”€ logging_utils.py โ”‚ โ”œโ”€โ”€ config_utils.py โ”‚ โ””โ”€โ”€ pydantic_models/ # Data models โ”‚ โ”œโ”€โ”€ tests/ # Unit and integration tests โ”œโ”€โ”€ pyproject.toml # Project dependencies โ”œโ”€โ”€ uv.lock # Locked versions โ””โ”€โ”€ .python-version # Python version pin ``` ## ๐Ÿš€ Quick Start ### Prerequisites - **Python 3.10+** (managed automatically by uv) - **uv** package manager ### Installation **1. Install uv** ```bash # Linux/WSL/macOS curl -LsSf https://astral.sh/uv/install.sh | sh source $HOME/.cargo/env # Windows (PowerShell) powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" ``` **2. Clone and setup** ```bash git clone https://github.com/microsoft/Alphahunt cd Alphahunt uv sync ``` That's it! ๐ŸŽ‰ The `uv sync` command automatically: - Creates a virtual environment in `.venv/` - Installs all dependencies from `uv.lock` - Sets up the correct Python version ## ๐Ÿ’ป Usage ### Generate All Attack Chains ```bash uv run python src/generate_data.py ``` ### Generate Specific Attack Chain ```bash uv run python src/generate_data.py --config config/initial_access_malware_to_ransomware.yaml ``` ### What It Does The generator: 1. โœ… Creates a synthetic company profile with org structure 2. โœ… Generates entities (users, processes, devices) with relationships 3. โœ… Simulates realistic activity logs over time 4. โœ… Outputs data samples to console (or saves to file) ## โš™๏ธ Configuration Configure scenarios using YAML files in `config/`: ```yaml # config/environment_config.yaml num_entities_range: [50, 150] # Entity count range num_logs_range: [500, 2000] # Log event count range simulation_days: 15 # Simulation duration ``` ### Available Scenarios - `initial_access_malware_to_ransomware.yaml` - Ransomware attack chain - Add more scenarios in `config/` following the same structure ## ๐Ÿงช Testing ### Run All Tests ```bash uv run python -m unittest discover -s tests ``` ### Run Specific Test Module ```bash uv run python -m unittest tests.test_environment_builder ``` ### Run With Pytest (if installed) ```bash uv add --dev pytest uv run pytest tests/ -v ``` ## ๐Ÿ› ๏ธ Development ### Adding Dependencies ```bash # Production dependency uv add numpy pandas # Development dependency uv add --dev pytest black ruff # With version constraint uv add "requests>=2.31.0" ``` ### Updating Dependencies ```bash # Update all packages uv lock --upgrade # Update specific package uv add package-name --upgrade # Sync after pulling changes uv sync ``` ### Code Quality ```bash # Format code uv add --dev black uv run black src/ tests/ # Lint code uv add --dev ruff uv run ruff check src/ tests/ # Type checking uv add --dev mypy uv run mypy src/ ``` ### View Installed Packages ```bash uv pip list ``` ## ๐ŸŽจ Customization ### Integrate LLMs for Realistic Data Add OpenAI for more realistic company profiles and entity attributes: ```bash uv add openai python-dotenv ``` Create a `.env` file: ```bash OPENAI_API_KEY=your-api-key-here ``` Update `environment_builder.py`: ```python import openai import os from dotenv import load_dotenv load_dotenv() class LLM: def __init__(self, model_name='gpt-4', verbose=False): self.client = openai.OpenAI(api_key=os.getenv('OPENAI_API_KEY')) self.model_name = model_name self.verbose = verbose def __call__(self, messages): response = self.client.chat.completions.create( model=self.model_name, messages=messages ) return response.choices[0].message.content ``` ### Add Custom Attack Techniques 1. Create a new module in `src/attack_simulation/components/` 2. Implement the attack logic following MITRE ATT&CK 3. Register it in the scenario configuration YAML ### Extend Entity Types Add new entity types (network devices, applications, etc.): 1. Extend `create_entities()` in `environment_builder.py` 2. Implement attribute generation in `generate_entity_attributes()` 3. Update `generate_log_details()` for new entity interactions ## ๐Ÿ› Troubleshooting ### Command not found: uv Ensure uv is in your PATH: ```bash source $HOME/.cargo/env ``` Or restart your terminal. ### Dependencies out of sync After pulling changes: ```bash uv sync ``` ### Import errors Always use `uv run` from the project root: ```bash uv run python src/generate_data.py ``` Verify `__init__.py` files exist in: - `src/` - `src/data_generation/` - `tests/` ### Clean install If you encounter persistent issues: ```bash rm -rf .venv uv.lock uv sync ``` ### Python version issues Check version: ```bash uv run python --version ``` Pin specific version: ```bash uv python pin 3.12 uv sync ``` ## ๐Ÿค Contributing Contributions are welcome! Please follow these steps: 1. **Fork the repository** 2. **Create a feature branch** ```bash git checkout -b feature/amazing-feature ``` 3. **Make your changes** 4. **Run tests and formatting** ```bash uv run pytest uv run black src/ tests/ uv run ruff check src/ tests/ ``` 5. **Commit with descriptive message** ```bash git commit -m "Add amazing feature" ``` 6. **Push and create Pull Request** ```bash git push origin feature/amazing-feature ``` ### Development Setup ```bash git clone cd Alphahunt uv sync uv add --dev pytest black ruff mypy ipython ``` ### Commit Checklist - [ ] Tests pass: `uv run pytest` - [ ] Code formatted: `uv run black src/ tests/` - [ ] Linting clean: `uv run ruff check src/ tests/` - [ ] Dependencies locked: `uv lock` (commit `uv.lock`) - [ ] Documentation updated ## ๐Ÿ“š Resources - [MITRE ATT&CK Framework](https://attack.mitre.org/) - [Microsoft Defender XDR Schema](https://learn.microsoft.com/en-us/microsoft-365/security/) - [uv Documentation](https://github.com/astral-sh/uv)