# sql-engines-common-test-infra

**Repository Path**: mirrors_mongodb/sql-engines-common-test-infra

## Basic Information

- **Project Name**: sql-engines-common-test-infra
- **Description**: This repository contains common test infrastructure for the various SQL Engines team repositories.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-06-12
- **Last Updated**: 2026-01-31

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# sql-engines-common-test-infra

This repository contains common test infrastructure for SQL Engines projects. The binaries, libraries, scripts, and Evergreen configuration files are only intended for use with SQL Engines projects. The repository is broken down into several components, detailed below. See the end of this README for tips and info about how to use changes in this repository in downstream SQL Engines projects.

## Data Loader Binary

The `data-loader` is a standalone executable that loads test data for SQL Engines integration tests. This tool must connect to a mongod to write data and may connect to an ADF to write schema. Test data must be specified in YAML or JSON files (using the `.y[a]ml` or `.json` extensions); such files must follow the format demonstrated in the [data-loader/sample_files](data-loader/sample_files). See the `--help` output for a full description of the binary.

When run with the `adf` flag enabled, or with an `adf_uri` provided, this tool connects to an ADF instance in addition to a mongod. In this mode, data and indexes are written to the mongod, and schemas are written to ADF (via `sqlSetSchema` or `sqlGenerateSchema`, depending on the presence of schema info in the data files). In this mode, views are not written to mongod, as they are assumed to be ADF views which are specified separately, in the ADF config.

When run without the `adf` flag enabled, and without an `adf_uri` provided, this tool only connects to a mongod. In this mode, documents, indexes, views, and schema are written directly to the mongod.

To run:

```shell
cargo run --bin data-loader --
```

## Test Generator Library

The `test-generator` library is a Rust utility library that provides the primitives needed to auto-generate Rust tests from YAML files as part of a `cargo test` run. Specifying tests via YAML is a common feature of SQL Engines projects written in Rust. See the [test-generator README](test-generator/README.md) for more details.

## Evergreen Configs and Scripts

The [evergreen](evergreen) directory contains useful common Evergreen configuration files and scripts to be used across SQL Engines projects.

The configs are separated by theme: `benchmark_util.yml` contains common benchmarking functions and tasks (e.g. `install heaptrack`), `rust_util.yml` contains common Rust functions and tasks (e.g. `install rust toolchain` and `check clippy`), and so on. If you need to add new common functions and/or tasks, consider the existing configs before creating a new one. If your new functions and/or tasks do not match any of the existing config themes, create a new config file in [evergreen/configs](evergreen/configs) and follow the naming convention `_util.yml`.

The scripts are all grouped together in [evergreen/scripts](evergreen/scripts). Each script focuses on one function and is named appropriately. If you need to add a new Evergreen function, you should strongly consider writing it as a shell script in that directory and using the Evergreen `subprocess.exec` command to invoke that script. See existing configs and scripts for details on this.

## ADF Test Environment

The [test-environment](test-environment) directory contains the [run_adf.sh](test-environment/run_adf.sh) script, which is useful for deploying a local ADF instance. This directory also contains the relevant configuration files for ADF.
The instance created is general purpose, but the config information is geared toward JDBC and ODBC integration testing. Feel free to use this script to run ADF for yourself locally.

## Using this repository as an Evergreen module

As noted above, this repository is intended to be used by the other SQL Engines projects. Typically, that is achieved by having those projects depend on this one as an ["Evergreen module"](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#modules). At time of writing, all existing SQL Engines projects already define `sql-engines-common-test-infra` as a module. That typically looks like this in a project's Evergreen configuration file:

```yaml
modules:
  - name: sql-engines-common-test-infra
    owner: mongodb
    repo: sql-engines-common-test-infra
    branch: main
    auto_update: true
```

To ensure the module is pulled from GitHub for Evergreen patches, projects typically include this in their `fetch source` functions:

```yaml
functions:
  "fetch source":
    - command: git.get_project
      params:
        directory:
        revisions:
          sql-engines-common-test-infra: ${sql-engines-common-test-infra_rev}
```

Note that the variable `${_rev}` is automatically provided by Evergreen.

By specifying the `sql-engines-common-test-infra` module in the `modules` list, and ensuring the appropriate revision is fetched in the `fetch source` function, the module is effectively available for use throughout the Evergreen config. To include configs from this module, you can update the downstream project's `include` list like this:

```yaml
include:
  - filename: evergreen/configs/mongodb_util.yml
    module: sql-engines-common-test-infra
  - filename: evergreen/configs/rust_util.yml
    module: sql-engines-common-test-infra
```

Be sure to check each config file to see if there are any necessary Evergreen expansions that need to be set. For example, it is common for the configs in this module to require the `${working_dir}` expansion to be set appropriately.

**Importantly**, the final step to ensure the module is available on the `buildvariant` you need it on is to update the `buildvariant` definition to include the module like this:

```yaml
buildvariants:
  - name:
    display_name:
    run_on:
    modules:
      - sql-engines-common-test-infra
    tasks:
```

If you omit the `modules` field from a buildvariant definition or omit `sql-engines-common-test-infra` from the list of `modules`, the module will not be available on that buildvariant despite being specified at the top level of the config and fetched with the project's source code. You _must_ specify the module in the `modules` field for each `buildvariant` on which you need the module present. (For the most part, this should already be set up on all relevant buildvariants for all existing projects.)

### ⚠️ IMPORTANT: Using updates to this module ⚠️

If you make updates to this repository and then need to use those updates in a downstream repo, you may encounter challenges on Evergreen. In particular, at time of writing, Evergreen has a [bug](https://jira.mongodb.org/browse/DEVPROD-22792) where the `auto_update` flag is ignored even when set to `true`.

What that means in practice is that if you commit a change to this repository, you will not be able to utilize that change in downstream repos until the downstream repos themselves have unrelated commits made to their `main` branches. That is because the bug causes Evergreen to always use the module revision (i.e., version) from the "base commit" (i.e., the last commit from the `main` branch off of which your feature branch was created). The intent of using `auto_update: true` for a module is to ensure the latest revision is used as opposed to the revision from the base commit.

Until that bug is properly addressed, you can work around it by either waiting for an unrelated change to merge into `main`, or pushing a dummy commit to `main` yourself.
Either way, the new commit to `main` will pull the latest version of the module on the Evergreen waterfall. After that, you can create a new branch off `main` that uses the new changes in the module. Note that if you already have a branch that attempted to use the newer module version but failed, you'll need to **rebase** that branch on `main` (**not** merge). Rebasing ensures the base commit is the one that pulled in the latest module revision; merging does not accomplish this.
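
The rebase step above can be sketched as a small shell helper. This is only an illustration: the function name `rebase_on_latest_main` is hypothetical, and it assumes the downstream repo's remote is named `origin` and its default branch is `main`.

```shell
# Hypothetical helper: rebase the currently checked-out feature branch onto
# the latest main so that its base commit is the newest commit on main.
# Assumes the remote is named "origin" and the default branch is "main".
rebase_on_latest_main() {
  # Update the remote-tracking refs, including origin/main.
  git fetch origin &&
    # Replay the feature branch's commits on top of origin/main (no merge commit).
    git rebase origin/main
}
```

Run it from the feature branch in the downstream repo. Because rebasing rewrites the branch's history, updating an already-pushed branch afterwards generally requires a force push, for example `git push --force-with-lease`.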