# skill-slice-insights

**Repository Path**: mirrors_microsoft/skill-slice-insights

## Basic Information

- **Project Name**: skill-slice-insights
- **License**: MIT
- **Default Branch**: main
- **Created**: 2025-03-19
- **Last Updated**: 2026-03-21

## README

# Overview

This is the official code repository for the paper ["Unearthing Skill-level Insights for Understanding Tradeoffs of Foundation Models"](https://arxiv.org/abs/2410.13826). All rationales, localized skills, and skill-slices for the 12 datasets studied in the paper can also be accessed through this repo.

# Set up

**Quick start**: Complete the installation below, then check out `example.ipynb` for key functionality for navigating our skill-slice annotations.

After installing the relevant packages (see `requirements.txt`), download and unzip the following zip file, which contains our annotations and other useful pre-computed entities: [cached.zip](https://umd.box.com/s/5w26f4t1mbokyugufem3nsq07uqdjr5w). Click the link and download from there. We plan to post our annotations to Hugging Face soon, to make downloading and viewing previews easier.

Also, **be sure to update `_CACHE_ROOT` and `_DATA_ROOT` in `constants.py`**. These paths are described below:

- `_CACHED_ROOT` is where all model outputs and embeddings are cached. **This path should point to the `cached` directory downloaded and unzipped from the above link.**
- `_DATA_ROOT` is where all dataset images are downloaded to. You may also want to update your environment variable `$HF_DATASETS_CACHE` so that Hugging Face downloads all relevant files to the same place. Note that most of the datasets we use are downloaded through Hugging Face.
- If you end up making plots in `analysis.py`, update `_PLOTS_ROOT` as well.
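As a rough illustration of the path setup described above, the following sketch checks that the two directories exist and points `HF_DATASETS_CACHE` at the data directory for the current process. It is not part of the repository: the helper names `check_roots` and `point_hf_cache_at` and the example paths are hypothetical, and the values you pass should mirror whatever you set in `constants.py`.

```python
import os
from pathlib import Path


def check_roots(cache_root, data_root):
    """Return a list of problems found with the two configured directories.

    Hypothetical helper: pass the same values you put in constants.py.
    """
    problems = []
    cache_root, data_root = Path(cache_root), Path(data_root)
    if not cache_root.is_dir():
        problems.append(f"cache root not found: {cache_root}")
    if not data_root.is_dir():
        problems.append(f"data root not found: {data_root}")
    return problems


def point_hf_cache_at(data_root):
    """Set HF_DATASETS_CACHE for this process and return the value set.

    Assumption: this runs before the `datasets` library is imported,
    since the variable is typically read when the library loads.
    """
    os.environ["HF_DATASETS_CACHE"] = str(data_root)
    return os.environ["HF_DATASETS_CACHE"]


if __name__ == "__main__":
    # Placeholder paths; replace them with your own locations.
    for problem in check_roots("/path/to/cached", "/path/to/data"):
        print(problem)
```

Running a check like this before opening `example.ipynb` may surface a misconfigured path earlier than a failed file load would. Exporting `HF_DATASETS_CACHE` in your shell profile is an equally reasonable alternative to setting it from Python.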
Note: our implementation of commercial models follows the [Eureka codebase](https://github.com/microsoft/eureka-ml-insights). **Critically, the API keys are missing**: you will need to add your own keys if you would like to use those commercial models. Be sure to update any `SECRET_KEY_PARAMS` in `models/models.py`.

# Citation

If you find this work insightful or its code useful, we'd appreciate it if you could cite us. Here is the BibTeX:

```
@misc{moayeri2024unearthingskilllevelinsightsunderstanding,
      title={Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models},
      author={Mazda Moayeri and Vidhisha Balachandran and Varun Chandrasekaran and Safoora Yousefi and Thomas Fel and Soheil Feizi and Besmira Nushi and Neel Joshi and Vibhav Vineet},
      year={2024},
      eprint={2410.13826},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2410.13826},
}
```