# MY_HERO_VIDEO_FEATURE_EXTRACTOR

**Repository Path**: littezheng/MY_HERO_VIDEO_FEATURE_EXTRACTOR

## Basic Information

- **Project Name**: MY_HERO_VIDEO_FEATURE_EXTRACTOR
- **Description**: Video feature extractor
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-04-23
- **Last Updated**: 2024-04-24

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Video Feature Extraction for HERO, partially modified (some bug fixes; several files were also changed, CLIP text feature output was added, and Docker was dropped)

[Original project](https://github.com/linjieli222/HERO)

## Environment

conda + RTX 3090 + Python 3.7 + torch 1.9.0 + CUDA 11.1 (other configurations have not been tested yet)

## Installation

First install torch:

`pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html`

Then install the remaining dependencies (if errors are reported during installation, the cause is the installation order; adjust the order as the messages suggest):

`pip install -r requirements_zheng.txt`

### SlowFast feature extraction (video output shape [frames, 2304])

```bash
cd MY_HERO_VIDEO_FEATURE_EXTRACTOR/slowfast/extract_feature
```

1. Generate the CSV file for SlowFast feature inference

```bash
python gather_video_paths.py
```

```
video_path,feature_path
/video/video1.mp4, /output/slowfast_features/video1.npz
/video/video2.webm, /output/slowfast_features/video2.npz
...
```

2. Extract video features (one feature every 2 s)

```bash
python extract.py --num_decoding_thread 4
```

## CLIP image and text feature extraction (image feature output shape [frames, 512]; text feature output shape [words + 2 (start + end), 512])

Note: each punctuation mark also counts as one word.

```bash
cd ./clip
```

The text output of the `clip` package installed under site-packages needs a small change. In the `encode_text(self, text)` method, replace

```
x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection
return x
```

with

```
eos_x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection
return dict(last_hidden_state=x, pooler_output=eos_x)
```

1. Generate the CSV file for CLIP inference

```bash
python gather_video_paths.py
```

```
video_path,feature_path
/video/video1.mp4, /output/clip_vit_features/video1.npz
/video/video2.webm, /output/clip_vit_features/video2.npz
...
```

2. Extract CLIP image and text features

```bash
python extract.py --csv ./output/csv/clip-vit_info.csv --num_decoding_thread 4 --model_version ViT-B/32
```
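The CSV files used by both extractors pair each video with the `.npz` file its features are written to. As a rough illustration of that layout only (this is not the repository's `gather_video_paths.py`, whose internals are not shown in this README), a minimal sketch could look like the following. The function name `build_manifest`, its arguments, and the extension set are hypothetical.

```python
import csv
from pathlib import Path

# Hypothetical extension set; the README's sample rows show .mp4 and .webm.
VIDEO_EXTS = {".mp4", ".webm"}


def build_manifest(video_dir, feature_dir, csv_path):
    """Write a video_path,feature_path CSV and return the rows written."""
    video_dir, feature_dir = Path(video_dir), Path(feature_dir)
    rows = []
    for video in sorted(video_dir.rglob("*")):
        if video.is_file() and video.suffix.lower() in VIDEO_EXTS:
            # One .npz per video, named after the video's file stem.
            feature = feature_dir / (video.stem + ".npz")
            rows.append((str(video), str(feature)))

    csv_path = Path(csv_path)
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["video_path", "feature_path"])
        writer.writerows(rows)
    return rows
```

Calling `build_manifest("/video", "/output/clip_vit_features", "./output/csv/clip-vit_info.csv")` would produce a file with the same two columns as the samples above, which can then be passed to `extract.py --csv`.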
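The `encode_text` change relies on one indexing trick: `text.argmax(dim=-1)` finds the end-of-text token in each sequence, because CLIP's tokenizer gives that token the highest id. The toy sketch below reproduces only that selection step with plain Python lists, so it runs without PyTorch. The helper names and the tiny hidden states are made up for illustration, and the real code additionally multiplies the selected state by `self.text_projection`.

```python
def eot_positions(token_ids):
    """Per sequence, return the index of the largest token id (an argmax over the last axis)."""
    return [max(range(len(seq)), key=seq.__getitem__) for seq in token_ids]


def split_outputs(hidden, token_ids):
    """Mimic the returned dict: all per-token states plus the state at the end-of-text position."""
    pooled = [states[pos] for states, pos in zip(hidden, eot_positions(token_ids))]
    return {"last_hidden_state": hidden, "pooler_output": pooled}


# Toy batch: 2 sequences, context length 5, hidden width 2.
token_ids = [
    [49406, 320, 1929, 49407, 0],  # start, two words, end, padding
    [49406, 1929, 49407, 0, 0],    # start, one word, end, padding
]
hidden = [
    [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1], [4.0, 4.1]],
    [[5.0, 5.1], [6.0, 6.1], [7.0, 7.1], [8.0, 8.1], [9.0, 9.1]],
]

out = split_outputs(hidden, token_ids)
print(eot_positions(token_ids))  # [3, 2]
print(out["pooler_output"])      # [[3.0, 3.1], [7.0, 7.1]]
```

Returning both entries keeps the pooled sentence-level vector available while also exposing one vector per token, which is what the per-word text features described in the heading need.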