# catalyst **Repository Path**: mirrors_mozilla/catalyst ## Basic Information - **Project Name**: catalyst - **Description**: Catalyst is a python project designed for analyzing and generating performance reports based on Firefox telemetry data. - **Primary Language**: Unknown - **License**: MPL-2.0 - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-02-26 - **Last Updated**: 2026-03-21 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Catalyst - Telemetry Performance Reports ## Overview Catalyst is a python project designed for analyzing and generating performance reports based on telemetry data with a heavy focus on Nimbus experiments. ## Dependencies - **Python 3.10+** Make sure Python 3.10 or above is installed on your system. - **Just** (task runner) Install Just for cross-platform task management: ```bash # macOS brew install just # Windows (Chocolatey) choco install just # Rust/Cargo cargo install just ``` - [**Google Cloud SDK**](https://cloud.google.com/sdk/docs/install) Install and set up the Cloud SDK, then setup your Google account: ```bash gcloud config set project mozdata ``` ## Quick Start 1. **Authenticate your Google account:** ```bash gcloud auth login gcloud auth application-default login ``` 2. **Install dependencies:** ```bash just install ``` 3. **Create or use existing configs:** ```bash # Start with the template (includes default histograms and events) cp configs/template.yaml configs/my-experiment.yaml # Edit the config file to match your experiment # See Configuration section below for details ``` 4. **Generate performance report:** ```bash # Simple syntax (recommended) just run configs/my-experiment.yaml # Or using --config flag just run --config configs/my-experiment.yaml ``` 5. **Open the generated report:** `reports/{slug}.html` ## Available Commands ### ๐Ÿ“ฆ `just install` Install all dependencies (both production and development) ### ๐Ÿงช `just test` Run the complete test suite including: - Unit tests (21 comprehensive tests) - Coverage reporting (HTML + terminal output) ### ๐ŸŽจ `just lint` Automatically fix code formatting issues using Black ### Binary Commands ### ๐Ÿš€ `just run [ARGS]` Generate performance reports from experiment configs ```bash # Simple syntax just run configs/experiment-name.yaml # Or with --config flag just run --config configs/experiment-name.yaml # With additional options just run configs/experiment-name.yaml --skip-cache ``` ### ๐Ÿ” `just find-latest-experiment [ARGS]` Find and process latest experiments (automatically includes CPU time percentiles) ```bash just find-latest-experiment index.html failures.json ``` **Default metrics included:** - Memory (total) - Page load metrics (FCP, load time, LCP) - CPU time per process type (percentiles: median, p75, p95) - Crash events - Pageload events (FCP, LCP, load time, response time) ### ๐Ÿ“„ `just update-index [ARGS]` Update experiment index files ```bash just update-index index.html ``` ### ๐Ÿ”ง `just update-probe-index [ARGS]` Update probe index from telemetry schemas ```bash just update-probe-index ``` ## Configuration Catalyst uses **YAML configuration files** to define experiments and metrics. ### Getting Started with Template Start with the provided template that includes default histograms and events: ```bash cp configs/template.yaml configs/my-experiment.yaml ``` ### Configuration Format ```yaml slug: experiment-name segments: - Windows - Linux - Mac histograms: - payload.histograms.memory_total - metrics.timing_distribution.performance_pageload_fcp events: # Include crash events to track total crash counts for each experiment branch - crash # Include pageload events with custom metric configurations - pageload: fcp_time: max: 20000 lcp_time: max: 30000 load_time: max: 25000 ``` ### Key Configuration Fields - **slug**: Experiment identifier - **segments**: Target platforms/segments - **histograms**: List of histogram metrics to analyze (see Histogram Configuration below) - **events**: Event metrics configuration (replaces legacy pageload_event_metrics) - **crash**: Simple crash event tracking - **pageload**: Pageload event metrics with max values (min is always 0) ### Histogram Configuration Histograms can be specified as a simple list or as a dictionary with additional properties: **Simple list format:** ```yaml histograms: - metrics.timing_distribution.performance_pageload_fcp - metrics.custom_distribution.networking_http_3_upload_throughput ``` **Dictionary format with properties:** ```yaml histograms: metrics.custom_distribution.networking_http_3_upload_throughput: higher_is_better: true metrics.timing_distribution.performance_pageload_fcp: higher_is_better: false # default ``` ### Labeled Counter Metrics `labeled_counter` metrics (like `power_cpu_time_per_process_type_ms`) require an **aggregate mode** to specify how to analyze the per-label data: **Available aggregate modes:** **1. `aggregate: sum`** - Sum all values per label (for event counters) ```yaml histograms: - metrics.labeled_counter.javascript_gc_slice_was_long: aggregate: sum ``` - Shows: Total counts per label in categorical bar charts - Layout: Individual tables per label, branches as rows - Use for: Event counters where you want to see total occurrences **2. `aggregate: percentiles`** - Calculate percentiles per label (for measurements) ```yaml histograms: - metrics.labeled_counter.power_cpu_time_per_process_type_ms: aggregate: percentiles ``` - Shows: Median, p75, p95, and their individual uplifts per label - Layout: Individual tables per label, branches as rows - Use for: Understanding distribution characteristics (median, tails) ### Event Configuration Options The `events` section supports flexible configuration: **Simple crash tracking:** ```yaml events: - crash ``` **Pageload events with defaults:** ```yaml events: - pageload # Uses defaults: fcp_time, lcp_time, load_time (all max: 30000) ``` **Custom pageload metrics:** ```yaml events: - pageload: fcp_time: max: 25000 custom_metric: max: 15000 ``` **Both crash and pageload events:** ```yaml events: - crash - pageload: fcp_time: max: 20000 ``` ### Advanced Configuration Options **Custom Conditions:** Add custom SQL conditions to filter your data: ```yaml custom_conditions: - "country_code = 'US'" - "app_version >= '100.0'" ``` **Prerequisite CTEs:** Define reusable SQL CTEs for complex queries: ```yaml prerequisite_ctes: - name: "filtered_users" query: "SELECT client_id FROM table WHERE condition = true" - name: "cohort_data" query: "SELECT * FROM other_table WHERE date >= '2024-01-01'" ``` **Sample Percentage:** Control data sampling for faster processing (only applies to histograms): ```yaml sample_pct: 10 # Use 10% of available data ``` **Parallel Query Threads:** Control the number of parallel BigQuery threads: ```yaml max_parallel_queries: 8 # Default is 4 ``` ### Example Configs Available configuration files: - **`configs/template.yaml`** - Template with default histograms and events (start here!) - **`configs/`** - Production YAML configurations for existing experiments