# story-adapter

**Repository Path**: danielxvcg/story-adapter

## Basic Information

- **Project Name**: story-adapter
- **Description**: No description available
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-01-02
- **Last Updated**: 2025-01-22

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README
## News 🚀
* **2024.10.10**: [Paper](https://arxiv.org/abs/2410.06244) is released on ArXiv.
* **2024.10.04**: Code released.
## Framework 🤖
> Story-Adapter framework. Illustration of the proposed iterative paradigm, which consists of initialization, iterations in Story-Adapter, and implementation of Global Reference Cross-Attention (GRCA).
Story-Adapter first generates each image from the story's text prompts alone and uses all of the results as reference images for the next round.
In the iterative paradigm, Story-Adapter inserts GRCA into Stable Diffusion (SD). In the i-th iteration, when visualizing each image, GRCA aggregates information from all reference images through cross-attention during the denoising process.
All results from an iteration then serve as the reference images that guide the update of the story visualization in the next iteration.
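
The iterative paradigm described above can be summarized in a short sketch. This is only an illustration of the control flow, not the repository's implementation: `generate` is a hypothetical callable standing in for the SD pipeline with GRCA, and `visualize_story`, `toy_generate`, and `num_iterations` are names chosen for this example.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in: any object can represent an image in this sketch.
Image = object
# generate(prompt, references) -> image; references is None in the text-only round.
Generator = Callable[[str, Optional[Sequence[Image]]], Image]


def visualize_story(
    prompts: Sequence[str], generate: Generator, num_iterations: int
) -> List[Image]:
    """Run a text-only initialization followed by reference-guided rounds."""
    # Initialization: each image is generated from its text prompt alone.
    references: List[Image] = [generate(p, None) for p in prompts]

    for _ in range(num_iterations):
        # Every image in this round sees ALL results of the previous round.
        # The new list is built in full before it replaces the old references.
        references = [generate(p, references) for p in prompts]
    return references


def toy_generate(prompt: str, references: Optional[Sequence[Image]]) -> str:
    """Toy generator (assumption): records how many references were visible."""
    seen = 0 if references is None else len(references)
    return f"{prompt} (refs={seen})"


if __name__ == "__main__":
    story = ["a fox wakes up", "the fox crosses a river", "the fox finds a den"]
    for frame in visualize_story(story, toy_generate, num_iterations=2):
        print(frame)
```

The point of the sketch is that the reference set is global: each frame is conditioned on the previous round's results for the whole story, not only on its own previous result.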
## Quick Start 🔧
### Installation
The project is built with Python 3.10.14, PyTorch 2.2.2, CUDA 12.1, and cuDNN 8.9.02.
To install, follow these instructions:
~~~
# git clone this repository
git clone https://github.com/UCSC-VLAA/story-adapter.git
cd story-adapter
# create new anaconda env
conda create -n StoryAdapter python=3.10
conda activate StoryAdapter
# install packages
pip install -r requirements.txt
~~~
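
After installation, a quick check of the environment against the versions listed above can save debugging time later. The following is a minimal sketch; the helper name `environment_report` is an assumption for this example and is not part of the repository.

```python
import importlib.util
import platform


def environment_report() -> dict:
    """Collect the versions relevant to the setup described above."""
    report = {"python": platform.python_version()}
    # torch is imported only if it is installed, so this also runs
    # before `pip install -r requirements.txt` has been executed.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
        report["cuda"] = torch.version.cuda  # None for CPU-only builds
    else:
        report["torch"] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```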
### Download the checkpoint
- Download [RealVisXL_V4.0](https://huggingface.co/SG161222/RealVisXL_V4.0/tree/main) and put it into "./RealVisXL_V4.0".
- Download [clip_image_encoder](https://huggingface.co/h94/IP-Adapter/tree/main/sdxl_models/image_encoder) and put it into "./IP-Adapter/sdxl_models/image_encoder".
- Download [ip-adapter_sdxl](https://huggingface.co/h94/IP-Adapter/resolve/main/sdxl_models/ip-adapter_sdxl.bin?download=true) and save it as "./IP-Adapter/sdxl_models/ip-adapter_sdxl.bin".
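
Before running the demo, it may help to confirm that the three downloads ended up where the list above expects them. The sketch below uses only the standard library; `missing_checkpoints` and the two constants are names invented for this example, and the check only looks at the paths, not at the contents of the files.

```python
from pathlib import Path
from typing import List

# Locations taken from the list above, relative to the repository root.
EXPECTED_DIRS = ["RealVisXL_V4.0", "IP-Adapter/sdxl_models/image_encoder"]
EXPECTED_FILES = ["IP-Adapter/sdxl_models/ip-adapter_sdxl.bin"]


def missing_checkpoints(root: str = ".") -> List[str]:
    """Return the expected checkpoint paths that are absent under root."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_DIRS:
        path = base / rel
        # A directory counts as present only if it exists and is non-empty.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    for rel in EXPECTED_FILES:
        if not (base / rel).is_file():
            missing.append(rel)
    return missing


if __name__ == "__main__":
    absent = missing_checkpoints(".")
    if absent:
        print("Missing checkpoints:")
        for rel in absent:
            print(f"  {rel}")
    else:
        print("All expected checkpoint paths are present.")
```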
### Running Demo
~~~
python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin
~~~
### Customized Running
~~~
python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin \
  --story "your prompt1" "your prompt2" "your prompt3" ... "your promptN"
~~~
Note: For custom stories, we suggest the template [Character Definition + Interaction Definition + Scene Definition] for better story visualization performance. For example, if the Character Definition is "One man wearing yellow robe", the Interaction Definition is "dancing", and the Scene Definition is "the palace hall", then the input prompt is "One man wearing yellow robe dancing in the palace hall."
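
The template can also be applied programmatically when a story has many frames. The sketch below composes prompts from the three parts and assembles the `run.py` command line using only the flags shown above. `compose_prompt` and `build_command` are hypothetical helpers, and joining the scene with the word "in" is an assumption that matches the example prompt but will not suit every scene.

```python
import shlex
from typing import List, Sequence


def compose_prompt(character: str, interaction: str, scene: str) -> str:
    """Join the three template parts into one prompt.

    Assumption: the scene is attached with "in", as in the example above.
    """
    return f"{character} {interaction} in {scene}."


def build_command(
    base_model_path: str,
    image_encoder_path: str,
    ip_ckpt: str,
    story: Sequence[str],
) -> List[str]:
    """Build the argument list for run.py with one --story entry per prompt."""
    return [
        "python", "run.py",
        "--base_model_path", base_model_path,
        "--image_encoder_path", image_encoder_path,
        "--ip_ckpt", ip_ckpt,
        "--story", *story,
    ]


if __name__ == "__main__":
    character = "One man wearing yellow robe"
    story = [
        compose_prompt(character, "dancing", "the palace hall"),
        compose_prompt(character, "reading a book", "the garden"),
    ]
    cmd = build_command(
        "your_path/RealVisXL_V4.0",
        "your_path/IP-Adapter/sdxl_models/image_encoder",
        "your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin",
        story,
    )
    # shlex.join quotes each prompt so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Reusing the same Character Definition string across all prompts keeps the character description identical from frame to frame.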
## Performance 🎨
### Regular-length Story Visualization
- Download the [StorySalon](https://huggingface.co/datasets/haoningwu/StorySalon/resolve/main/testset.zip?download=true) test set.
### Long Story Visualization
### Running with Different Style
Comic style:
~~~
python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin --style comic
~~~