# AnyInsertion_V1

**Repository Path**: hf-datasets/AnyInsertion_V1

## Basic Information

- **Project Name**: AnyInsertion_V1
- **Description**: Mirror of https://huggingface.co/datasets/WensongSong/AnyInsertion_V1
- **Default Branch**: main
- **Created**: 2025-07-15
- **Last Updated**: 2025-07-15

## README

---
dataset_info:
- config_name: mask_prompt
  features:
  - name: id
    dtype: string
  - name: split
    dtype: string
  - name: category
    dtype: string
  - name: label
    dtype: string
  - name: ref_image
    dtype: image
  - name: ref_mask
    dtype: image
  - name: tar_image
    dtype: image
  - name: tar_mask
    dtype: image
  splits:
  - name: train
    num_bytes: 58055297653.968
    num_examples: 58318
  download_size: 57605566640
  dataset_size: 58055297653.968
- config_name: text_prompt
  features:
  - name: id
    dtype: string
  - name: split
    dtype: string
  - name: category
    dtype: string
  - name: label
    dtype: string
  - name: src_label
    dtype: string
  - name: src_type
    dtype: string
  - name: ref_image
    dtype: image
  - name: ref_mask
    dtype: image
  - name: src_image
    dtype: image
  - name: tar_image
    dtype: image
  splits:
  - name: train
    num_bytes: 183941024225.888
    num_examples: 78234
  download_size: 183594500291
  dataset_size: 183941024225.888
configs:
- config_name: mask_prompt
  data_files:
  - split: train
    path: mask_prompt/train-*
- config_name: text_prompt
  data_files:
  - split: train
    path: text_prompt/train-*
---
---
license: mit
task_categories:
- image-to-image
language:
- en
pretty_name: a
size_categories:
- 10M
---

Wensong Song · Hong Jiang · Zongxing Yang · Ruijie Quan · Yi Yang

Zhejiang University | Harvard University | Nanyang Technological University

## News

* **[2025.5.9]** Release new **AnyInsertion** v1 text- and mask-prompt dataset on [HuggingFace](https://huggingface.co/datasets/WensongSong/AnyInsertion_V1).
* **[2025.5.7]** Release **AnyInsertion** v1 text prompt dataset on HuggingFace.
* **[2025.4.24]** Release **AnyInsertion** v1 mask prompt dataset on HuggingFace.

## Summary

This is the dataset proposed in our paper [**Insert Anything: Image Insertion via In-Context Editing in DiT**](https://arxiv.org/abs/2504.15009).

The AnyInsertion dataset consists of training and testing subsets. The training set includes 136,385 samples across two prompt types: 58,188 mask-prompt image pairs and 78,197 text-prompt image pairs. The test set includes 158 data pairs: 120 mask-prompt pairs and 38 text-prompt pairs.

The AnyInsertion dataset covers diverse categories including human subjects, daily necessities, garments, furniture, and various objects.

![alt text](dataset_categories.png)

## Directory

```
├── text_prompt/
│   ├── train/
│   │   ├── accessory/
│   │   │   ├── ref_image/     # Reference image containing the element to be inserted
│   │   │   ├── ref_mask/      # The mask corresponding to the inserted element
│   │   │   ├── tar_image/     # Ground truth
│   │   │   └── src_image/     # Source images
│   │   │       ├── add/       # Source image with the inserted element from Ground Truth removed
│   │   │       └── replace/   # Source image where the inserted element in Ground Truth is replaced
│   │   ├── object/
│   │   │   ├── ref_image/
│   │   │   ├── ref_mask/
│   │   │   ├── tar_image/
│   │   │   └── src_image/
│   │   │       ├── add/
│   │   │       └── replace/
│   │   └── person/
│   │       ├── ref_image/
│   │       ├── ref_mask/
│   │       ├── tar_image/
│   │       └── src_image/
│   │           ├── add/
│   │           └── replace/
│   └── test/
│       ├── garment/
│       │   ├── ref_image/
│       │   ├── ref_mask/
│       │   ├── tar_image/
│       │   └── src_image/
│       └── object/
│           ├── ref_image/
│           ├── ref_mask/
│           ├── tar_image/
│           └── src_image/
│
├── mask_prompt/
│   ├── train/
│   │   ├── accessory/
│   │   │   ├── ref_image/
│   │   │   ├── ref_mask/
│   │   │   ├── tar_image/
│   │   │   ├── tar_mask/      # The mask corresponding to the edited area of target image
│   │   ├── object/
│   │   │   ├── ref_image/
│   │   │   ├── ref_mask/
│   │   │   ├── tar_image/
│   │   │   ├── tar_mask/
│   │   └── person/
│   │       ├── ref_image/
│   │       ├── ref_mask/
│   │       ├── tar_image/
│   │       ├── tar_mask/
│   └── test/
│       ├── garment/
│       │   ├── ref_image/
│       │   ├── ref_mask/
│       │   ├── tar_image/
│       │   ├── tar_mask/
│       ├── object/
│       │   ├── ref_image/
│       │   ├── ref_mask/
│       │   ├── tar_image/
│       │   ├── tar_mask/
│       └── person/
│           ├── ref_image/
│           ├── ref_mask/
│           ├── tar_image/
│           ├── tar_mask/
```

## Example
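
As a minimal sketch of how the two configurations declared in the dataset card (`mask_prompt` and `text_prompt`) might be accessed, the snippet below assumes the Hugging Face `datasets` library is installed and that the repository ID matches the HuggingFace URL in the News section. The helper names `missing_columns` and `first_example` are hypothetical and only serve this illustration; the column lists are copied from the `dataset_info` metadata.

```python
from typing import Dict, Iterable, List

# Repository ID taken from the HuggingFace URL given in this README.
REPO_ID = "WensongSong/AnyInsertion_V1"

# Column names per configuration, copied from the dataset card's `dataset_info`.
EXPECTED_COLUMNS: Dict[str, List[str]] = {
    "mask_prompt": [
        "id", "split", "category", "label",
        "ref_image", "ref_mask", "tar_image", "tar_mask",
    ],
    "text_prompt": [
        "id", "split", "category", "label", "src_label", "src_type",
        "ref_image", "ref_mask", "src_image", "tar_image",
    ],
}


def missing_columns(config: str, columns: Iterable[str]) -> List[str]:
    """Return the expected columns of `config` that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS[config] if name not in present]


def first_example(config: str) -> dict:
    """Stream one training example so the full download is not required.

    Assumption: network access and `pip install datasets`.
    """
    from datasets import load_dataset  # imported lazily to keep the helpers importable

    stream = load_dataset(REPO_ID, name=config, split="train", streaming=True)
    return next(iter(stream))
```

Calling `first_example("mask_prompt")` and passing its keys to `missing_columns` is one way to check that a streamed record carries the fields listed in the card. Streaming is used here because the card reports download sizes of roughly 58 GB and 184 GB for the two configurations.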

### Text Prompt

Add Prompt: Add [label from `tar_image` (in label.json)]

Replace Prompt: Replace [label from `src_image` (in src_image/replace/replace_label.json)] with [label from `tar_image` (in label.json)]

## Citation

```
@article{song2025insert,
  title={Insert Anything: Image Insertion via In-Context Editing in DiT},
  author={Song, Wensong and Jiang, Hong and Yang, Zongxing and Quan, Ruijie and Yang, Yi},
  journal={arXiv preprint arXiv:2504.15009},
  year={2025}
}
```
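
The Add and Replace templates from the Text Prompt section can be sketched as a small formatting helper. This is an illustration only: the function name `build_text_prompt` and the example labels are hypothetical, and the sketch assumes the labels have already been read from `label.json` and `replace_label.json`.

```python
from typing import Optional


def build_text_prompt(label: str, src_label: Optional[str] = None) -> str:
    """Format an editing instruction from the README's two templates.

    label:     label of the element in tar_image (from label.json)
    src_label: label of the element in src_image that is replaced
               (from src_image/replace/replace_label.json); passing None
               selects the Add template.
    """
    if not label:
        raise ValueError("label must be a non-empty string")
    if src_label is None:
        return f"Add {label}"
    return f"Replace {src_label} with {label}"


# Hypothetical labels, used only to show the two output shapes.
print(build_text_prompt("a red scarf"))
print(build_text_prompt("a red scarf", src_label="a blue hat"))
```

Selecting the template by the presence of `src_label` keeps the helper independent of how a particular loader encodes the add/replace distinction.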