# conv2d_direct

**Repository Path**: magicor/conv2d_direct

## Basic Information

- **Project Name**: conv2d_direct
- **Description**: https://github.com/Pegessi/conv2d_direct.git
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-06-05
- **Last Updated**: 2024-06-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# conv2d_direct

A CUDA operator implementation that computes conv2d directly (direct convolution).

For a detailed walkthrough, see: https://zhuanlan.zhihu.com/p/613538649

## Build

Make sure CUDA and cuDNN are installed on the machine.

`nvcc main.cu -o test -lcudnn`

If cuDNN cannot be found, specify its directory manually (for example with nvcc's `-I` and `-L` options).

## Testing

Generate an nsys report: `nsys profile --stats=true -o report_conv ./test`

Profile with ncu: `sudo LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cuda-11.7/lib64 /path/to/cuda-11.7/bin/ncu test`

The `cudnn` and `v1_conv` columns are the measured times for cuDNN and for this implementation. `speedup` is the `cudnn` time divided by the `v1_conv` time, so values above 1 mean `v1_conv` is faster.

| N (batch size) | inC | inH  | inW  | outC | outH | outW | kernelH | kernelW | cudnn    | v1_conv  | speedup (cudnn / v1_conv) |
|----------------|-----|------|------|------|------|------|---------|---------|----------|----------|---------------------------|
| 1              | 3   | 768  | 512  | 3    |      |      | 3       | 3       | 0.106353 | 0.119194 | 0.892268067               |
| 1              | 3   | 840  | 1200 | 3    |      |      | 3       | 3       | 0.29952  | 0.282706 | 1.059475215               |
| 1              | 3   | 960  | 1440 | 3    |      |      | 3       | 3       | 0.407859 | 0.406088 | 1.004361124               |
| 1              | 3   | 1200 | 1680 | 3    |      |      | 3       | 3       | 0.593111 | 0.547789 | 1.082736236               |
| 1              | 3   | 1440 | 1920 | 3    |      |      | 3       | 3       | 0.7454   | 0.745237 | 1.000218722               |
| 1              | 3   | 1920 | 2400 | 3    |      |      | 3       | 3       | 1.349059 | 1.243177 | 1.085170495               |
|                |     |      |      |      |      |      |         |         |          |          |                           |
| 1              | 3   | 768  | 512  | 3    |      |      | 6       | 6       | 0.256932 | 0.160492 | 1.600902226               |
| 1              | 3   | 840  | 1200 | 3    |      |      | 6       | 6       | 0.667402 | 0.378142 | 1.764950733               |
| 1              | 3   | 960  | 1440 | 3    |      |      | 6       | 6       | 0.808141 | 0.448737 | 1.800923481               |
| 1              | 3   | 1200 | 1680 | 3    |      |      | 6       | 6       | 1.326254 | 0.742543 | 1.786097236               |
| 1              | 3   | 1440 | 1920 | 3    |      |      | 6       | 6       | 1.771069 | 0.970639 | 1.824642323               |
| 1              | 3   | 1920 | 2400 | 3    |      |      | 6       | 6       | 2.904648 | 1.582961 | 1.834946028               |
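
When checking a direct conv2d kernel like the one benchmarked above, a plain CPU reference can be handy for comparing results. The following is a minimal sketch and not the repository's kernel. It assumes NCHW layout, stride 1, no padding, and no dilation, none of which the README states, and the names `conv_out_dim` and `conv2d_direct_ref` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: output extent along one axis.
// Assumption: stride 1, no padding, no dilation ("valid" convolution).
inline int conv_out_dim(int in, int kernel) { return in - kernel + 1; }

// CPU reference for direct convolution (cross-correlation, no kernel flip).
// Assumed layouts: input  [N][inC][inH][inW]
//                  weight [outC][inC][kH][kW]
//                  output [N][outC][outH][outW]
std::vector<float> conv2d_direct_ref(const std::vector<float>& input,
                                     const std::vector<float>& weight,
                                     int N, int inC, int inH, int inW,
                                     int outC, int kH, int kW) {
    const int outH = conv_out_dim(inH, kH);
    const int outW = conv_out_dim(inW, kW);
    assert(outH > 0 && outW > 0);
    assert(input.size() == static_cast<std::size_t>(N) * inC * inH * inW);
    assert(weight.size() == static_cast<std::size_t>(outC) * inC * kH * kW);

    std::vector<float> output(
        static_cast<std::size_t>(N) * outC * outH * outW, 0.0f);

    for (std::size_t n = 0; n < static_cast<std::size_t>(N); ++n)
    for (std::size_t oc = 0; oc < static_cast<std::size_t>(outC); ++oc)
    for (std::size_t oh = 0; oh < static_cast<std::size_t>(outH); ++oh)
    for (std::size_t ow = 0; ow < static_cast<std::size_t>(outW); ++ow) {
        float acc = 0.0f;
        // Accumulate over every input channel and every kernel tap.
        for (std::size_t ic = 0; ic < static_cast<std::size_t>(inC); ++ic)
        for (std::size_t kh = 0; kh < static_cast<std::size_t>(kH); ++kh)
        for (std::size_t kw = 0; kw < static_cast<std::size_t>(kW); ++kw) {
            const std::size_t in_idx =
                ((n * inC + ic) * inH + (oh + kh)) * inW + (ow + kw);
            const std::size_t w_idx =
                ((oc * inC + ic) * kH + kh) * kW + kw;
            acc += input[in_idx] * weight[w_idx];
        }
        output[((n * outC + oc) * outH + oh) * outW + ow] = acc;
    }
    return output;
}
```

Under these assumptions, the first benchmark row (768 x 512 input, 3 x 3 kernel) would produce a 766 x 510 output. If the actual kernel uses padding or a different stride, `conv_out_dim` and the input index computation would need to change accordingly.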