diff --git a/README.md b/README.md
index 8db7efcfb3ebd73bb325ce8d2d8ed8dc754492e0..d2dc0a499778e5994dbe581c36ff38a789fe9340 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,16 @@
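# HPCRunner: Jarvis Intelligent Assistant
-## ***Give every HPC application a warm home***
+## ***Vision: deploy optimally tuned HPC applications in any directory on any machine***
### Background
-HPC applications are complex: installing dependencies, configuring the environment, compiling, running, and collecting and analyzing CPU/GPU performance data all have a fairly high barrier to entry, so porting and tuning take a lot of work. Different people running the same application and test case on different machines essentially start from scratch, which costs time and effort. In many cases both ARM and X86 environments must be deployed for validation, which adds a great deal of repetitive work and leaves little room to focus on optimizing the software's algorithms.
+HPC is often described as the "pearl at the top of the pyramid" of the IT industry. Deploying, compiling, running, and profiling HPC applications has a very high barrier to entry, and deploying them on different machines takes a great deal of effort. In many cases both ARM and X86 environments must be deployed for validation, which adds a great deal of repetitive work and leaves little room to focus on optimizing the core algorithms.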
+
+
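### Features
-- Supports Kunpeng/X86, one-command dependency download and installation, an industry-standard directory layout for managing large numbers of dependencies, and automatic module file generation
-- Generates environment scripts, builds, runs, collects performance data, and benchmarks with one command each, driven by the HPC configuration.
+- Supports ARM/X86, one-command deployment, an industry-standard directory layout for managing large numbers of dependencies, and automatic module file generation
+- Builds and runs, collects CPU/GPU performance data, and benchmarks with one command each, driven by the HPC configuration.
- All configuration lives in a single file; deploying an HPC application to a different machine only requires editing that file.
- The logging system automatically records everything that happens while an HPC application is deployed.
- The tool itself needs no compilation and works out of the box; it depends only on a Python environment.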
@@ -68,7 +70,7 @@ source ./init.sh
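| Section | Description | Example |
| :----------: | :----------------------------------------------------------- | :----------------------------------------------------------- |
| [SERVER] | List of server nodes, one per line; used to generate the hostfile automatically for multi-node runs | 11.11.11.11 |
-| [DOWNLOAD] | One package per line: version and download link, saved to the downloads directory by default (an alias can be set) | cmake/3.16.4 https://cmake.org/files/v3.16/cmake-3.16.4.tar.gz alias |
+| [DOWNLOAD] | One package per line: version and download link, saved to the downloads directory by default (an alias can be set) | cp2k/8.2 https://xxx cp2k.8.2.tar.gz |
| [DEPENDENCY] | Installation script for the HPC application's dependencies | ./jarvis -install gcc/9.3.1 com<br>module use ./software/modulefiles<br>module load gcc9 |
| [ENV] | Environment setup for building and running the HPC application | source env.sh |
| [APP] | HPC application information: application name, build path, binary path, and test case path | app_name = CP2K<br>build_dir = /home/cp2k-8.2/<br>binary_dir = /home/CP2K/cp2k-8.2/bin/<br>case_dir = /home/CP2K/cp2k-8.2/benchmarks/QS/ |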
@@ -78,7 +80,7 @@ source ./init.sh
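| [BATCH] | Batch run commands for the HPC application | #!/bin/bash<br>nvidia-smi -pm 1<br>nvidia-smi -ac 1215,1410 |
| [PERF] | Extra arguments for the performance tools | perf= -o<br>nsys=<br>ncu=--target-processes all --launch-skip 71434 --launch-count 1 |
-3. Download dependencies with one command (only for links that need no authentication; otherwise download the files into the downloads directory yourself)
+3. Download the HPC application with one command (only for links that need no authentication; otherwise download the files into the downloads directory yourself)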
```
./jarvis -d
@@ -87,7 +89,7 @@ source ./init.sh
4. Install a single dependency
```
-./jarvis -install [name/version/other] [option]
+./jarvis -install [package/][name/version/other] [option]
```
The supported values for option are listed below
@@ -113,6 +115,7 @@ eg:
```
./jarvis -install bisheng/2.1.0 com #install the BiSheng compiler
+./jarvis -install package/bisheng/2.1.0 com #same as above, with the optional package/ prefix
./jarvis -install fftw/3.3.8 gcc+mpi #build fftw 3.3.8 with the current gcc and mpi
./jarvis -install openmpi/4.1.2 gcc #build openmpi 4.1.2 with the current gcc
```
@@ -123,31 +126,31 @@ eg:
./jarvis -remove openblas/0.3.18
```
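-6. Install all dependencies with one command
+6. Download and install all dependencies with one command (reads the [DEPENDENCY] section of the configuration file and runs its lines in order)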
```
./jarvis -dp
```
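-7. Generate the environment variables with one command (only needed when running outside Jarvis)
+7. Generate the environment variables with one command (reads the [ENV] section of the configuration file, then generates and runs the env.sh script; this happens automatically by default)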
```
./jarvis -e && source ./env.sh
```
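-8. Build with one command
+8. Build with one command (reads the [BUILD] section of the configuration file, then generates and runs the build.sh script)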
```
./jarvis -b
```
-9. Run with one command
+9. Run with one command (reads the [RUN] section of the configuration file, then generates and runs the run.sh script)
```
./jarvis -r
```
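-10. Collect performance data with one command (perf)
+10. Collect performance data with one command (uses the perf value from the [PERF] section of the configuration file)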
```
./jarvis -p
@@ -180,19 +183,21 @@ eg:
./jarvis -use XXX.config
```
-15. View other features (network check)
+15. Generate a Singularity container definition file from the current configuration
```
-./jarvis -h
+./jarvis -container docker-hub-address
```
-16. Generate a Singularity container definition file from the current configuration
+16. View other features (network check and more)
```
-./jarvis -container docker-hub-address
+./jarvis -h
```
+### Roadmap
+
### Contributions welcome
@@ -210,8 +215,10 @@ eg:
Join the openEuler HPC SIG WeChat group to learn more about porting and tuning HPC applications
-
+
### Technical articles
-Unveiling the mystery of HPC applications: https://zhuanlan.zhihu.com/p/489828346
\ No newline at end of file
+Unveiling the mystery of HPC applications: https://zhuanlan.zhihu.com/p/489828346
+
+A date with containers: https://zhuanlan.zhihu.com/p/489828346
\ No newline at end of file
diff --git a/images/jarvis.png b/images/jarvis.png
new file mode 100644
index 0000000000000000000000000000000000000000..1889eec3ae9f5f81fa30d5943067746eb8db27bb
Binary files /dev/null and b/images/jarvis.png differ
diff --git a/images/roadmap.png b/images/roadmap.png
new file mode 100644
index 0000000000000000000000000000000000000000..8081d3ac0a3fdf3ca14eb0996771ead30d711463
Binary files /dev/null and b/images/roadmap.png differ
diff --git a/images/wechat-group-qr.png b/images/wechat-group-qr.png
new file mode 100644
index 0000000000000000000000000000000000000000..a7bab6c908197e6d26634b9fd316b03390d1123c
Binary files /dev/null and b/images/wechat-group-qr.png differ
diff --git a/package/bisheng/1.3.3/install.sh b/package/bisheng/1.3.3/install.sh
index 118c96b8b082dbd679d71042ae02df4bccd876ea..ba7c3d3527f638f67b8895a687209c56fafea097 100644
--- a/package/bisheng/1.3.3/install.sh
+++ b/package/bisheng/1.3.3/install.sh
@@ -1,5 +1,5 @@
#!/bin/bash
-#download from https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/bisheng-compiler-2.1.0-aarch64-linux.tar.gz
set -e
+. ${DOWNLOAD_TOOL} -u https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/bisheng-compiler-1.3.3-aarch64-linux.tar.gz
cd ${JARVIS_TMP}
tar xzvf ${JARVIS_DOWNLOAD}/bisheng-compiler-1.3.3-aarch64-linux.tar.gz -C $1 --strip-components=1
\ No newline at end of file
diff --git a/package/bisheng/2.1.0/install.sh b/package/bisheng/2.1.0/install.sh
index 717c1e1931552d3b44b27886383823ea757884d4..9bf1856bd0960f15d92f6d45d2fbd2e8d320190c 100644
--- a/package/bisheng/2.1.0/install.sh
+++ b/package/bisheng/2.1.0/install.sh
@@ -1,6 +1,6 @@
-#download from https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/bisheng-compiler-2.1.0-aarch64-linux.tar.gz
#!/bin/bash
set -e
+. ${DOWNLOAD_TOOL} -u https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/bisheng-compiler-2.1.0-aarch64-linux.tar.gz
cd ${JARVIS_TMP}
yum -y install libatomic libstdc++ libstdc++-devel
tar xzvf ${JARVIS_DOWNLOAD}/bisheng-compiler-2.1.0-aarch64-linux.tar.gz -C $1 --strip-components=1
\ No newline at end of file
diff --git a/package/cmake/3.20.5/install.sh b/package/cmake/3.20.5/install.sh
deleted file mode 100644
index fb01ef8d0b6c2904f4b859ffa3d6bb0a719d6add..0000000000000000000000000000000000000000
--- a/package/cmake/3.20.5/install.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/bash
-set -e
-cd ${JARVIS_TMP}
-tar -xvf ${JARVIS_DOWNLOAD}/cmake-3.20.5-linux-aarch64.tar.gz -C $1 --strip-components=1
\ No newline at end of file
diff --git a/package/cmake/3.23.1/install.sh b/package/cmake/3.23.1/install.sh
new file mode 100644
index 0000000000000000000000000000000000000000..48f77a2c6f6f46445be4481583e141dd9d59d3cb
--- /dev/null
+++ b/package/cmake/3.23.1/install.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+set -e
+. ${DOWNLOAD_TOOL} -u https://github.com/Kitware/CMake/releases/download/v3.23.1/cmake-3.23.1-linux-aarch64.tar.gz
+cd ${JARVIS_TMP}
+tar -xvf ${JARVIS_DOWNLOAD}/cmake-3.23.1-linux-aarch64.tar.gz -C $1 --strip-components=1
\ No newline at end of file
diff --git a/package/hmpi/1.1.0/gcc/install.sh b/package/hmpi/1.1.0/gcc/install.sh
deleted file mode 100644
index 254a5d9e3d8a84c33970a2eb70a1e7c395265068..0000000000000000000000000000000000000000
--- a/package/hmpi/1.1.0/gcc/install.sh
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/bash
-set -e
-cd ${JARVIS_TMP}
-tar -xvf ${JARVIS_DOWNLOAD}/Hyper-MPI_1.1.0_aarch64_CentOS7.6_GCC9.3_MLNX-OFED4.9.tar.gz -C $1 --strip-components=1
\ No newline at end of file
diff --git a/package/hmpi/1.1.1/install.sh b/package/hmpi/1.1.1/install.sh
index 0a1bd108c7b5fd9bd3a40d0fcb29e516ab4e1a0f..235fe842fbd75482eafe4e956ffbef861747e72d 100644
--- a/package/hmpi/1.1.1/install.sh
+++ b/package/hmpi/1.1.1/install.sh
@@ -1,6 +1,9 @@
#!/bin/bash
set -x
set -e
+. ${DOWNLOAD_TOOL} -u https://github.com/kunpengcompute/hucx/archive/refs/tags/v1.1.1-huawei.zip -f hucx-1.1.1-huawei.zip
+. ${DOWNLOAD_TOOL} -u https://github.com/kunpengcompute/xucg/archive/refs/tags/v1.1.1-huawei.zip -f xucg-1.1.1-huawei.zip
+. ${DOWNLOAD_TOOL} -u https://github.com/kunpengcompute/hmpi/archive/refs/tags/v1.1.1-huawei.zip -f hmpi-1.1.1-huawei.zip
cd ${JARVIS_TMP}
yum install -y perl-Data-Dumper autoconf automake libtool binutils
rm -rf hmpi-1.1.1-huawei hucx-1.1.1-huawei xucg-1.1.1-huawei
diff --git a/package/kml/1.4.0/bisheng/install.sh b/package/kml/1.4.0/bisheng/install.sh
index 129c8eaf1b3ba04aa344007dc4278786e1364f2d..94154c678cfb6e250396af7f8261b15af745024d 100644
--- a/package/kml/1.4.0/bisheng/install.sh
+++ b/package/kml/1.4.0/bisheng/install.sh
@@ -1,11 +1,13 @@
#!/bin/bash
set -x
set -e
+. ${DOWNLOAD_TOOL} -u https://kunpeng-repo.obs.cn-north-4.myhuaweicloud.com/Kunpeng%20BoostKit/Kunpeng%20BoostKit%2021.0.1/BoostKit-kml_1.4.0_bisheng.zip
cd ${JARVIS_TMP}
if [ -d /usr/local/kml ];then
rpm -e boostkit-kml
fi
-rpm --force --nodeps -ivh ${JARVIS_ROOT}/package/kml/1.4.0/bisheng/*.rpm
+unzip -o ${JARVIS_DOWNLOAD}/BoostKit-kml_1.4.0_bisheng.zip
+rpm --force --nodeps -ivh boostkit-kml-1.4.0-1.aarch64.rpm
# generate full lapack
netlib=${JARVIS_DOWNLOAD}/lapack-3.9.1.tar.gz
klapack=/usr/local/kml/lib/libklapack.a
diff --git a/package/kml/1.4.0/gcc/install.sh b/package/kml/1.4.0/gcc/install.sh
index 2f80fe7cf1a7c0cde96355b16757fd44937c9ead..5e0b92d2ef15bd975100bf079ba51d6500799cab 100644
--- a/package/kml/1.4.0/gcc/install.sh
+++ b/package/kml/1.4.0/gcc/install.sh
@@ -1,11 +1,13 @@
#!/bin/bash
set -x
set -e
+. ${DOWNLOAD_TOOL} -u https://kunpeng-repo.obs.cn-north-4.myhuaweicloud.com/Kunpeng%20BoostKit/Kunpeng%20BoostKit%2021.0.1/BoostKit-kml_1.4.0.zip -f BoostKit-kml_1.4.0-gcc.zip
cd ${JARVIS_TMP}
if [ -d /usr/local/kml ];then
rpm -e boostkit-kml
fi
-rpm --force --nodeps -ivh ${JARVIS_ROOT}/package/kml/1.4.0/gcc/*.rpm
+unzip -o ${JARVIS_DOWNLOAD}/BoostKit-kml_1.4.0-gcc.zip
+rpm --force --nodeps -ivh boostkit-kml-1.4.0-1.aarch64.rpm
# generate full lapack
netlib=${JARVIS_DOWNLOAD}/lapack-3.9.1.tar.gz
diff --git a/package/openblas/0.3.18/install.sh b/package/openblas/0.3.18/install.sh
index d475d9e78dd32c9d39a627f87615b6e00937e43f..edc231ae5ab6c2f2ebc2b8d54ffd4e15b378e95f 100644
--- a/package/openblas/0.3.18/install.sh
+++ b/package/openblas/0.3.18/install.sh
@@ -1,6 +1,7 @@
#!/bin/bash
set -x
set -e
+. ${DOWNLOAD_TOOL} -u https://github.com/xianyi/OpenBLAS/releases/download/v0.3.18/OpenBLAS-0.3.18.tar.gz
cd ${JARVIS_TMP}
tar -xzvf ${JARVIS_DOWNLOAD}/OpenBLAS-0.3.18.tar.gz
cd OpenBLAS-0.3.18
diff --git a/package/scalapack/2.1.0/install.sh b/package/scalapack/2.1.0/install.sh
index e79a4709e95a28c04cf4abdbc0db79914a88ccc4..bee6239d78c45a09cdd1b86e776176cba423cfdf 100644
--- a/package/scalapack/2.1.0/install.sh
+++ b/package/scalapack/2.1.0/install.sh
@@ -2,6 +2,7 @@
set -x
set -e
+. ${DOWNLOAD_TOOL} -u http://www.netlib.org/scalapack/scalapack-2.1.0.tgz
cd ${JARVIS_TMP}
tar -xvf ${JARVIS_DOWNLOAD}/scalapack-2.1.0.tgz
cd scalapack-2.1.0
cp SLmake.inc.example SLmake.inc
diff --git a/package/scalapack/2.1.0/kml/install.sh b/package/scalapack/2.1.0/kml/install.sh
index 26da61aa5d306a2a6c53101f40a0af2fd4e4c70a..d0dff549d8b23838b6cc6959418264b9f058270f 100644
--- a/package/scalapack/2.1.0/kml/install.sh
+++ b/package/scalapack/2.1.0/kml/install.sh
@@ -1,6 +1,7 @@
#!/bin/bash
set -x
set -e
+. ${DOWNLOAD_TOOL} -u http://www.netlib.org/scalapack/scalapack-2.1.0.tgz
cd ${JARVIS_TMP}
rm -rf scalapack-2.1.0
tar -xvf ${JARVIS_DOWNLOAD}/scalapack-2.1.0.tgz
diff --git a/software/compiler/bisheng/2.1.0/installed b/software/compiler/bisheng/2.1.0/installed
index c227083464fb9af8955c90d2924774ee50abb547..56a6051ca2b02b04ef92d5150c9ef600403cb1de 100644
--- a/software/compiler/bisheng/2.1.0/installed
+++ b/software/compiler/bisheng/2.1.0/installed
@@ -1 +1 @@
-0
\ No newline at end of file
+1
\ No newline at end of file
diff --git a/src/installService.py b/src/installService.py
index a044b72a78983f7da5230dde923081b9d31788ef..9b50b62879f3abdd23a16862b80910be2f1024ac 100644
--- a/src/installService.py
+++ b/src/installService.py
@@ -33,7 +33,10 @@ class InstallService:
         self.UTILS_PATH = os.path.join(self.SOFTWARE_PATH, 'utils')
     def get_version_info(self, info):
-        return re.search( r'(\d+)\.(\d+)\.',info).group(1)
+        matched_group = re.search(r'(\d+)\.(\d+)\.', info)
+        if not matched_group:
+            return None
+        return matched_group.group(1)
     # some command don't generate output, must redirect to a tmp file
     def get_cmd_output(self, cmd):
@@ -49,6 +52,9 @@ class InstallService:
         gcc_info_list = self.get_cmd_output('gcc -v')
         gcc_info = gcc_info_list[-1].strip()
         version = self.get_version_info(gcc_info)
+        if not version:
+            print("GCC not found, please install gcc first")
+            sys.exit(1)  # non-zero status so callers can detect the failure
         name = 'gcc'
         if 'kunpeng' in gcc_info.lower():
             name = 'kgcc'
@@ -58,6 +64,9 @@ class InstallService:
         clang_info_list = self.get_cmd_output('clang -v')
         clang_info = clang_info_list[0].strip()
         version = self.get_version_info(clang_info)
+        if not version:
+            print("clang not found, please install clang first")
+            sys.exit(1)  # non-zero status so callers can detect the failure
         name = 'clang'
         if 'bisheng' in clang_info.lower():
             name = 'bisheng'
@@ -74,6 +83,9 @@ class InstallService:
         mpi_info = mpi_info_list[0].strip()
         name = 'openmpi'
         version = self.get_version_info(mpi_info)
+        if not version:
+            print("MPI not found, please install MPI first")
+            sys.exit(1)  # non-zero status so callers can detect the failure
         hmpi_info = self.get_cmd_output('ompi_info | grep "MCA coll: ucx"')[0]
         if hmpi_info != "":
             name = 'hmpi'
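Review note on the `src/installService.py` change above: the same "parse the version, bail out if it is missing" guard now appears three times. The standalone sketch below shows one way the pattern could be factored out. It is not code from the repository: `get_major_version` and `require_major_version` are hypothetical names, and the sketch relies only on the regular expression visible in the diff and on the documented behaviour that `SystemExit` raised with a string prints it to stderr and exits with status 1.

```python
import re

# Same pattern as in the diff: "<major>.<minor>." somewhere in the tool output.
VERSION_PATTERN = re.compile(r'(\d+)\.(\d+)\.')


def get_major_version(info):
    """Return the major version found in `info` as a string, or None if absent."""
    matched = VERSION_PATTERN.search(info)
    if matched is None:
        return None
    return matched.group(1)


def require_major_version(tool, info):
    """Return the major version, or stop the program if none can be parsed.

    Hypothetical helper: raising SystemExit with a message exits with
    status 1, so shell callers can tell that the check failed.
    """
    version = get_major_version(info)
    if version is None:
        raise SystemExit(f"{tool} not found, please install {tool} first")
    return version
```

With a helper like this, each compiler or MPI check would reduce to a single call such as `require_major_version('gcc', gcc_info)`.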
diff --git a/templates/CP2K/8.2/data.CP2K.arm.gpu.config b/templates/CP2K/8.2/data.CP2K.arm.gpu.config
index 2012254a25a02c42d0f5972fed052c8eeccff1fe..d2314db8402cd4ca3557bdb6e6825cdb8c1354ad 100644
--- a/templates/CP2K/8.2/data.CP2K.arm.gpu.config
+++ b/templates/CP2K/8.2/data.CP2K.arm.gpu.config
@@ -2,12 +2,7 @@
11.11.11.11
[DOWNLOAD]
-libint/2.6.0 https://github.com/evaleev/libint/archive/v2.6.0.tar.gz
-libXC/5.1.4 https://www.cp2k.org/static/downloads/libxc-5.1.4.tar.gz
-fftw/3.3.8 https://www.cp2k.org/static/downloads/fftw-3.3.8.tar.gz
-lapack/3.8.0 https://www.cp2k.org/static/downloads/lapack-3.8.0.tgz
-scalapack/2.1.0 https://www.cp2k.org/static/downloads/scalapack-2.1.0.tgz
-cmake/3.16.4 https://cmake.org/files/v3.16/cmake-3.16.4.tar.gz
+cp2k/8.2 https://github.com/cp2k/cp2k/releases/download/v8.2.0/cp2k-8.2.tar.bz2
[DEPENDENCY]
./jarvis -install kgcc/9.3.1 com
@@ -32,6 +27,8 @@ module load openblas/0.3.18
module load gsl/2.6
./jarvis -install plumed/2.6.2 gcc+mpi
./jarvis -install libvori/21.04.12 gcc
+#extract the CP2K source archive
+tar -jxvf downloads/cp2k-8.2.tar.bz2
[ENV]
module purge
@@ -41,9 +38,9 @@ module load gsl/2.6
[APP]
app_name = CP2K
-build_dir = /home/HT3/HPCRunner2/cp2k-8.2/
-binary_dir = /home/HT3/HPCRunner2/cp2k-8.2/exe/local-cuda/
-case_dir = /home/HT3/HPCRunner2/cp2k-8.2/benchmarks/QS/
+build_dir = ${JARVIS_ROOT}/cp2k-8.2/
+binary_dir = ${JARVIS_ROOT}/cp2k-8.2/exe/local-cuda/
+case_dir = ${JARVIS_ROOT}/cp2k-8.2/benchmarks/QS/
[BUILD]
make -j 128 ARCH=local-cuda VERSION=psmp
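Review note on the `[APP]` change above: the absolute paths are replaced with `${JARVIS_ROOT}`-relative ones, which is what lets the template work from any directory. The diff does not show how HPCRunner expands that placeholder, so the sketch below only illustrates one possible approach using Python's `string.Template`. The function name `expand_app_paths` and the sample root `/opt/hpcrunner` are assumptions made for the example.

```python
import os
from string import Template


def expand_app_paths(app_section, env=None):
    """Expand ${NAME} placeholders in every value of an [APP]-style mapping.

    Hypothetical helper: `env` defaults to the process environment, and
    placeholders without a matching variable are left untouched.
    """
    variables = os.environ if env is None else env
    return {
        key: Template(value).safe_substitute(variables)
        for key, value in app_section.items()
    }


app = {
    "app_name": "CP2K",
    "build_dir": "${JARVIS_ROOT}/cp2k-8.2/",
    "binary_dir": "${JARVIS_ROOT}/cp2k-8.2/exe/local-cuda/",
}
expanded = expand_app_paths(app, {"JARVIS_ROOT": "/opt/hpcrunner"})
print(expanded["build_dir"])
```

Using `safe_substitute` rather than `substitute` means a missing variable leaves the placeholder visible in the result instead of raising an exception, which may make a misconfigured environment easier to spot.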
diff --git a/templates/qe/6.4/data.qe.test.config b/templates/qe/6.4/data.qe.test.config
index b46531a8738f5e287285fb69a2bffcd92f281df0..bcab7d4a116d994a0be7c2bc39d3629bcebe276c 100644
--- a/templates/qe/6.4/data.qe.test.config
+++ b/templates/qe/6.4/data.qe.test.config
@@ -1,10 +1,6 @@
[SERVER]
11.11.11.11
-[DOWNLOAD]
-kgcc/9.3.1 https://mirrors.huaweicloud.com/kunpeng/archive/compiler/kunpeng_gcc/gcc-9.3.1-2021.03-aarch64-linux.tar.gz
-openmpi/4.1.2 https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.2.tar.gz
-
[DEPENDENCY]
./jarvis -install kgcc/9.3.1 com
module purge
diff --git a/templates/qe/6.4/data.qe.test.opt.config b/templates/qe/6.4/data.qe.test.opt.config
index 78cf7e01af1340095e60369ba277a622123887fa..b191dcb5b6b7a92e60b412fba6ed046f2da15061 100644
--- a/templates/qe/6.4/data.qe.test.opt.config
+++ b/templates/qe/6.4/data.qe.test.opt.config
@@ -1,13 +1,6 @@
[SERVER]
11.11.11.11
-[DOWNLOAD]
-bisheng/2.1.0 https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/bisheng-compiler-2.1.0-aarch64-linux.tar.gz
-hmpi/1.1.1 https://github.com/kunpengcompute/hucx/archive/refs/tags/v1.1.1-huawei.zip hucx-1.1.1-huawei.zip
-hmpi/1.1.1 https://github.com/kunpengcompute/hmpi/archive/refs/tags/v1.1.1-huawei.zip hmpi-1.1.1-huawei.zip
-hmpi/1.1.1 https://github.com/kunpengcompute/xucg/archive/refs/tags/v1.1.1-huawei.zip xucg-1.1.1-huawei.zip
-openblas/0.3.18 https://github.com/xianyi/OpenBLAS/releases/download/v0.3.18/OpenBLAS-0.3.18.tar.gz
-
[DEPENDENCY]
set -x
set -e
diff --git a/wechat-group-qr.png b/wechat-group-qr.png
deleted file mode 100644
index 558033342971cfcb4a72434d89ffbff19a737bb1..0000000000000000000000000000000000000000
Binary files a/wechat-group-qr.png and /dev/null differ