diff --git a/README.md b/README.md index 783b75a550891008ad9e9c43d71af1c3c10f45b0..f613b2d9e9f64986e1276e80258cb92b9b3edb61 100644 --- a/README.md +++ b/README.md @@ -13,12 +13,12 @@ openEuler currently provides the following two types of versions: - Long-Term Support (LTS) versions: released at a fixed interval of 2 years with 4 years of community support, providing security, maintenance, and feature updates over an extended period. - Community innovation versions: a community innovation version is released every 6 months, offering the latest hardware and software support. -Docs currently uses the following three types of branches, taking 20.03 LTS SP3 and 21.09 as examples of the LTS and community innovation versions: +Docs currently uses the following three types of branches, taking 22.03 LTS SP3 and 23.09 as examples of the LTS and community innovation versions: | Branch | Description | Content Presentation | |-----|----|----| | master | Development branch; the default branch |-| -| stable2-20.03_LTS_SP3 | 20.03 LTS SP3 branch | Branch content is presented on the [openEuler community](https://openeuler.org/) website under "Documentation -> 20.03 LTS SP3" | -| stable2-21.09 | 21.09 branch | Branch content is presented on the [openEuler community](https://openeuler.org/) website under "Documentation -> 21.09" | +| stable2-22.03_LTS_SP3 | 22.03 LTS SP3 branch | Branch content is presented on the [openEuler community](https://openeuler.org/) website under "Documentation -> 22.03 LTS SP3" | +| stable2-23.09 | 23.09 branch | Branch content is presented on the [openEuler community](https://openeuler.org/) website under "Documentation -> 23.09" | ## Document List @@ -30,24 +30,44 @@ Docs currently uses the following three types of branches, with the LTS and comm | Installation Guide | Updated | Installation | | Administrator Guide | Updated | Administration | | Security Hardening Guide | Updated | SecHarden | -| Virtualization User Guide | Updated | Virtualization | +| GCC Plugin Framework Feature User Guide | Updated | GCC | +| iSulad Support for cgroup v2, CDI, and CRI v1alpha2 Interfaces | New | Container | +| CTinspector User Guide | Updated | CTinspector | | StratoVirt User Guide | Updated | StratoVirt | +| Security Certificate User Guide | New | CertSignature | | Container User Guide | Updated | Container | | A-Tune User Guide | Updated | A-Tune | -| Embedded User Guide | New | Embedded | -| Kernel Live Upgrade Guide | Updated | KernelLiveUpgrade | +| AI-EulerCopilot Intelligent Q&A Service (Web) User Guide | New | AI | +| Embedded User Guide | Updated | Embedded | +| SysCare User Guide | Updated | SysCare | +| ods pipeline User Guide | Updated | Ods-Pipeline | +| sysMaster User Guide | Updated | sysMaster | | Application Development Guide | Updated | ApplicationDev | +| Waas-lite User Guide | New | Waas-lite | +| sysBoost User Guide | New | sysBoost | | secGear Development Guide | Updated | secGear | | Kubernetes Cluster Deployment Guide | Updated | Kubernetes | -| Third-Party Software Installation Guide | Updated | thirdparty_migration | +| DPU-OS User Guide | Updated | DPU-OS | +| utshell User Guide | Updated | memsafety | +| utsudo User Guide | Updated | memsafety | +| migration-tools User Guide | Updated | Migration-tools | +| cve-ease User Guide | Updated | CVE-ease | | Desktop Environment User Guide | Updated | desktop | -| Toolset User Guide | Updated | userguide | +| PilotGo User Guide | Updated | PilotGo | | A-Ops User Guide | Updated | A-Ops | -| Container OS Upgrade Guide | New | KubeOS | +| Container OS Upgrade Guide | Updated | KubeOS | +| Cloud-Native Hybrid Deployment rubik User Guide | Updated | rubik | +| CPDS User Guide | Updated | CPDS | +| Gazelle User Guide | Updated | Gazelle | +| Kmesh User Guide | Updated | Kmesh | +| GCC for openEuler User Guide | Updated | GCC | +| Kernel Live Upgrade Guide | Updated | KernelLiveUpgrade | +| iSula Container Engine | New | Container | +| ROS User Guide | New | ROS | ## How to Find Documents in Docs -Go to the [Docs repository](https://gitee.com/openeuler/docs), select the stable2-21.09 branch, and open the "docs" folder, which contains documents in Chinese (the "zh" folder) and English (the "en" folder). Taking the Chinese documents as an example: in the "zh" folder, the "docs" folder contains the document content, and "menu" shows the mapping between specific documents and the directory on the Docs website. For the mapping between the "docs" folder and each manual, see the document list. +Go to the [Docs repository](https://gitee.com/openeuler/docs), select the stable2-24.03_LTS branch, and open the "docs" folder, which contains documents in Chinese (the "zh" folder) and English (the "en" folder). Taking the Chinese documents as an example: in the "zh" folder, the "docs" folder contains the document content, and "menu" shows the mapping between specific documents and the directory on the Docs website. For the mapping between the "docs" folder and each manual, see the document list. ## How to Modify Documents @@ -77,4 +97,3 @@ ### How to Contact Us Mailing list: - diff --git a/docs/en/docs/A-Ops/figures/attach_process.png b/docs/en/docs/A-Ops/figures/attach_process.png new file mode 100644 index 0000000000000000000000000000000000000000..f76e8f4513cb45fbece12e6237039c41786b0467 Binary files /dev/null and b/docs/en/docs/A-Ops/figures/attach_process.png differ diff --git a/docs/zh/docs/A-Ops/figures/deadlock.png b/docs/en/docs/A-Ops/figures/deadlock.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/deadlock.png rename to docs/en/docs/A-Ops/figures/deadlock.png diff --git a/docs/zh/docs/A-Ops/figures/deadlock2.png b/docs/en/docs/A-Ops/figures/deadlock2.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/deadlock2.png rename to docs/en/docs/A-Ops/figures/deadlock2.png diff --git a/docs/zh/docs/A-Ops/figures/deadlock3.png b/docs/en/docs/A-Ops/figures/deadlock3.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/deadlock3.png rename to docs/en/docs/A-Ops/figures/deadlock3.png diff --git a/docs/zh/docs/A-Ops/figures/flame_muti_ins.png
b/docs/en/docs/A-Ops/figures/flame_muti_ins.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/flame_muti_ins.png rename to docs/en/docs/A-Ops/figures/flame_muti_ins.png diff --git a/docs/en/docs/A-Ops/figures/gala-gopher_architecture.png b/docs/en/docs/A-Ops/figures/gala-gopher_architecture.png new file mode 100644 index 0000000000000000000000000000000000000000..f151965a21d11dd7a3e215cc4ef23d70d059f4b1 Binary files /dev/null and b/docs/en/docs/A-Ops/figures/gala-gopher_architecture.png differ diff --git a/docs/en/docs/A-Ops/figures/gala-gopher_start_success.png b/docs/en/docs/A-Ops/figures/gala-gopher_start_success.png new file mode 100644 index 0000000000000000000000000000000000000000..ab16e9d3661db3fd4adc6c605b2d2d08e79fdc1c Binary files /dev/null and b/docs/en/docs/A-Ops/figures/gala-gopher_start_success.png differ diff --git a/docs/zh/docs/A-Ops/figures/lockcompete1.png b/docs/en/docs/A-Ops/figures/lockcompete1.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/lockcompete1.png rename to docs/en/docs/A-Ops/figures/lockcompete1.png diff --git a/docs/zh/docs/A-Ops/figures/lockcompete2.png b/docs/en/docs/A-Ops/figures/lockcompete2.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/lockcompete2.png rename to docs/en/docs/A-Ops/figures/lockcompete2.png diff --git a/docs/zh/docs/A-Ops/figures/lockcompete3.png b/docs/en/docs/A-Ops/figures/lockcompete3.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/lockcompete3.png rename to docs/en/docs/A-Ops/figures/lockcompete3.png diff --git a/docs/zh/docs/A-Ops/figures/lockcompete4.png b/docs/en/docs/A-Ops/figures/lockcompete4.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/lockcompete4.png rename to docs/en/docs/A-Ops/figures/lockcompete4.png diff --git a/docs/zh/docs/A-Ops/figures/lockcompete5.png b/docs/en/docs/A-Ops/figures/lockcompete5.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/lockcompete5.png rename to 
docs/en/docs/A-Ops/figures/lockcompete5.png diff --git a/docs/zh/docs/A-Ops/figures/lockcompete6.png b/docs/en/docs/A-Ops/figures/lockcompete6.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/lockcompete6.png rename to docs/en/docs/A-Ops/figures/lockcompete6.png diff --git a/docs/zh/docs/A-Ops/figures/tprofiling-dashboard-detail.png b/docs/en/docs/A-Ops/figures/tprofiling-dashboard-detail.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/tprofiling-dashboard-detail.png rename to docs/en/docs/A-Ops/figures/tprofiling-dashboard-detail.png diff --git a/docs/zh/docs/A-Ops/figures/tprofiling-dashboard.png b/docs/en/docs/A-Ops/figures/tprofiling-dashboard.png similarity index 100% rename from docs/zh/docs/A-Ops/figures/tprofiling-dashboard.png rename to docs/en/docs/A-Ops/figures/tprofiling-dashboard.png diff --git a/docs/en/docs/A-Ops/figures/tprofiling-run-arch.png b/docs/en/docs/A-Ops/figures/tprofiling-run-arch.png new file mode 100644 index 0000000000000000000000000000000000000000..e18e28672beb6306050c42cab1b46f588de83eaa Binary files /dev/null and b/docs/en/docs/A-Ops/figures/tprofiling-run-arch.png differ diff --git a/docs/en/docs/A-Ops/using-gala-gopher.md b/docs/en/docs/A-Ops/using-gala-gopher.md index 6277655b7051d11c1254fbf98b63c5285e6d2846..e3bfae8300c64ada481ea7ef755c785aaacfebe6 100644 --- a/docs/en/docs/A-Ops/using-gala-gopher.md +++ b/docs/en/docs/A-Ops/using-gala-gopher.md @@ -4,21 +4,21 @@ As a data collection module, gala-gopher provides OS-level monitoring capabiliti This chapter describes how to deploy and use the gala-gopher service. -#### Installation +## Installation Mount the repo sources. 
```basic -[oe-2209] # openEuler 22.09 officially released repository -name=oe2209 -baseurl=http://119.3.219.20:82/openEuler:/22.09/standard_x86_64 +[oe-{version}] # openEuler official released repository +name=oe{version} +baseurl=https://repo.openeuler.org/openEuler-{version}/everything/$basearch/ enabled=1 gpgcheck=0 priority=1 -[oe-2209:Epol] # openEuler 22.09: Epol officially released repository -name=oe2209_epol -baseurl=http://119.3.219.20:82/openEuler:/22.09:/Epol/standard_x86_64/ +[oe-{version}:Epol] # openEuler:Epol official released repository +name=oe{version}_epol +baseurl=https://repo.openeuler.org/openEuler-{version}/EPOL/main/$basearch/ enabled=1 gpgcheck=0 priority=1 @@ -26,78 +26,71 @@ priority=1 Install gala-gopher. +RPM installation applies to single-node observation scenarios without containers. openEuler 22.03 LTS SP1 or later is supported. + ```bash # yum install gala-gopher ``` +## Configuration +### Configuration Description -#### Configuration - -##### Configuration Description - -The configuration file of gala-gopher is **/opt/gala-gopher/gala-gopher.conf**. The configuration items in the file are described as follows (the parts that do not need to be manually configured are not described): +The configuration file of gala-gopher is **/etc/gala-gopher/gala-gopher.conf**. The configuration items in the file are described as follows (the parts that do not need to be manually configured are not described): The following configurations can be modified as required: -- `global`: gala-gopher global configuration information. - - `log_directory`: gala-gopher log file name. - - `pin_path`: path for storing the map shared by the eBPF probe. You are advised to retain the default value. -- `metric`: metric output mode. - - `out_channel`: metric output channel. The value can be `web_server` or `kafka`. If this parameter is left empty, the output channel is disabled. - - `kafka_topic`: topic configuration information if the output channel is Kafka. 
-- `event`: output mode of abnormal events. - - `out_channel`: event output channel. The value can be `logs` or `kafka`. If this parameter is left empty, the output channel is disabled. - - `kafka_topic`: topic configuration information if the output channel is Kafka. -- `meta`: metadata output mode. - - `out_channel`: metadata output channel. The value can be `logs` or `kafka`. If this parameter is left empty, the output channel is disabled. - - `kafka_topic`: topic configuration information if the output channel is Kafka. -- `imdb`: cache specification configuration. - - `max_tables_num`: maximum number of cache tables. In the **/opt/gala-gopher/meta** directory, each meta corresponds to a table. - - `max_records_num`: maximum number of records in each cache table. Generally, each probe generates at least one observation record in an observation period. - - `max_metrics_num`: maximum number of metrics contained in each observation record. - - `record_timeout`: aging time of the cache table. If a record in the cache table is not updated within the aging time, the record is deleted. The unit is second. -- `web_server`: configuration of the web_server output channel. - - `port`: listening port. -- `kafka`: configuration of the Kafka output channel. - - `kafka_broker`: IP address and port number of the Kafka server. -- `logs`: configuration of the logs output channel. - - `metric_dir`: path for storing metric data logs. - - `event_dir`: path for storing abnormal event data logs. - - `meta_dir`: metadata log path. - - `debug_dir`: path of gala-gopher run logs. -- `probes`: native probe configuration. - - `name`: probe name, which must be the same as the native probe name. For example, the name of the **example.probe** probe is **example**. - - `param`: probe startup parameters. For details about the supported parameters, see [Startup Parameters](#startup-parameters). - - `switch`: whether to start a probe. The value can be `on` or `off`. 
-- `extend_probes`: third-party probe configuration. - - `name`: probe name. - - `command`: command for starting a probe. - - `param`: probe startup parameters. For details about the supported parameters, see [Startup Parameters](#startup-parameters). - - `start_check`: If `switch` is set to `auto`, the system determines whether to start the probe based on the execution result of `start_check`. - - `switch`: whether to start a probe. The value can be `on`, `off`, or `auto`. The value `auto` determines whether to start the probe based on the result of `start_check`. - -##### Startup Parameters - -| Parameter| Description | -| ------ | ------------------------------------------------------------ | -| -l | Whether to enable the function of reporting abnormal events. | -| -t | Sampling period, in seconds. By default, the probe reports data every 5 seconds. | -| -T | Delay threshold, in ms. The default value is **0**. | -| -J | Jitter threshold, in ms. The default value is **0**. | -| -O | Offline time threshold, in ms. The default value is **0**. | -| -D | Packet loss threshold. The default value is **0**. | -| -F | If this parameter is set to `task`, data is filtered by **task_whitelist.conf**. If this parameter is set to the PID of a process, only the process is monitored.| -| -P | Range of probe programs loaded to each probe. Currently, the tcpprobe and taskprobe probes are involved.| -| -U | Resource usage threshold (upper limit). The default value is **0** (%). | -| -L | Resource usage threshold (lower limit). The default value is **0** (%). | -| -c | Whether the probe (TCP) identifies `client_port`. The default value is **0** (no). | -| -N | Name of the observation process of the specified probe (ksliprobe). The default value is **NULL**. | -| -p | Binary file path of the process to be observed, for example, `nginx_probe`. You can run `-p /user/local/sbin/nginx` to specify the Nginx file path. 
The default value is **NULL**.| -| -w | Filtering scope of monitored applications, for example, `-w /opt/gala-gopher/task_whitelist.conf`. You can write the names of the applications to be monitored to the **task_whitelist.conf** file. The default value is **NULL**, indicating that the applications are not filtered.| -| -n | NIC to mount tc eBPF. The default value is **NULL**, indicating that all NICs are mounted. Example: `-n eth0`| - -##### Configuration File Example +- **global**: gala-gopher global configuration information. + - **log_file_name**: gala-gopher log file name. + - **log_level**: gala-gopher log level (not supported currently) + - **pin_path**: path for storing the map shared by the eBPF probe. You are advised to retain the default value. +- **metric**: metric output mode. + - **out_channel**: metric output channel. The value can be **web_server**, **logs**, or **kafka**. If this parameter is left empty, the output channel is disabled. + - **kafka_topic**: topic configuration information if the output channel is Kafka. +- **event**: output mode of abnormal events. + - **out_channel**: event output channel. The value can be **logs** or **kafka**. If this parameter is left empty, the output channel is disabled. + - **kafka_topic**: topic configuration information if the output channel is Kafka. + - **timeout**: interval for reporting the same abnormal event + - **desc_language**: Language of the abnormal event description. Currently, the value can be **zh_CN** or **en_US**. +- **meta**: metadata output mode. + - **out_channel**: metadata output channel. The value can be **logs** or **kafka**. If this parameter is left empty, the output channel is disabled. + - **kafka_topic**: topic configuration information if the output channel is Kafka. 
+- **ingress**: configuration related to probe data reporting + - **interval**: not in use +- **egress**: configuration related to database reporting + - **interval**: not in use + - **time_range**: not in use +- **imdb**: cache specification configuration. + - **max_tables_num**: maximum number of cache tables. In the **/opt/gala-gopher/meta** directory, each meta corresponds to a table. + - **max_records_num**: maximum number of records in each cache table. Generally, each probe generates at least one observation record in an observation period. + - **max_metrics_num**: maximum number of metrics contained in each observation record. + - **record_timeout**: aging time of the cache table. If a record in the cache table is not updated within the aging time, the record is deleted. The unit is second. +- **web_server**: configuration of the web_server output channel. + - **port**: listening port. + - **ssl_auth**: whether to enable HTTPS encryption and authentication for the web server. The value can be **on** or **off**. You are advised to enable it in the production environment. + - **private_key**: absolute path of the server private key file used for web server HTTPS encryption. This parameter is mandatory when **ssl_auth** is **on**. + - **cert_file**: absolute path of the server certificate used for web server HTTPS encryption. This parameter is mandatory when **ssl_auth** is **on**. + - **ca_file**: absolute path of the CA certificate used by the web server to authenticate the client. This parameter is mandatory when **ssl_auth** is **on**. +- **rest_api_server**: configuration of the RESTful API server. + - **port**: RESTful API listening port. + - **ssl_auth**: whether to enable HTTPS encryption and authentication for the RESTful API. The value can be **on** or **off**. You are advised to enable it in the production environment. + - **private_key**: absolute path of the server private key file used for RESTful API HTTPS encryption. This parameter is mandatory when **ssl_auth** is **on**.
+ - **cert_file**: absolute path of the server certificate used for RESTful API HTTPS encryption. This parameter is mandatory when **ssl_auth** is **on**. + - **ca_file**: absolute path of the CA certificate used by the RESTful API to authenticate the client. This parameter is mandatory when **ssl_auth** is **on**. +- **kafka**: configuration of the Kafka output channel. + - **kafka_broker**: IP address and port number of the Kafka server. + - **batch_num_messages**: number of messages sent in each batch. + - **compression_codec**: message compression type. + - **queue_buffering_max_messages**: maximum number of messages allowed in the producer buffer. + - **queue_buffering_max_kbytes**: maximum number of bytes allowed in the producer buffer. + - **queue_buffering_max_ms**: maximum time for a producer to wait for more messages to join a batch before sending the batch. +- **logs**: configuration of the logs output channel. + - **metric_dir**: path for storing metric data logs. + - **event_dir**: path for storing abnormal event data logs. + - **meta_dir**: metadata log path. + - **debug_dir**: path of gala-gopher run logs. + +### Configuration File Example - Select the data output channels. @@ -127,6 +120,10 @@ The following configurations can be modified as required: web_server = { port = 8888; + ssl_auth = "off"; + private_key = ""; + cert_file = ""; + ca_file = ""; }; kafka = @@ -135,31 +132,7 @@ The following configurations can be modified as required: }; ``` -- Select the probe to be enabled. The following is an example. - - ```yaml - probes = - ( - { - name = "system_infos"; - param = "-t 5 -w /opt/gala-gopher/task_whitelist.conf -l warn -U 80"; - switch = "on"; - }, - ); - extend_probes = - ( - { - name = "tcp"; - command = "/opt/gala-gopher/extend_probes/tcpprobe"; - param = "-l warn -c 1 -P 7"; - switch = "on"; - } - ); - ``` - - - -#### Start +## Start After the configuration is complete, start gala-gopher. 
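Under systemd, starting the service can be sketched as follows (a sketch, assuming the RPM installs a unit named `gala-gopher` and that the commands run as root):

```shell
# Start gala-gopher and report whether it is active.
# Assumption: the gala-gopher RPM provides a systemd unit named "gala-gopher".
if command -v systemctl >/dev/null 2>&1; then
    systemctl start gala-gopher
    systemctl is-active gala-gopher \
        && echo "gala-gopher is running" \
        || echo "gala-gopher is not active; check the configuration file and run logs"
else
    echo "systemd is not available in this environment"
fi
```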
@@ -175,31 +148,917 @@ Query the status of the gala-gopher service. If the following information is displayed, the service is started successfully: Check whether the enabled probe is started. If the probe thread does not exist, check the configuration file and the gala-gopher run log file. -![gala-gopher成功启动状态](./figures/gala-gopher成功启动状态.png) +![gala-gopher successful start](./figures/gala-gopher_start_success.png) > Note: The root permission is required for deploying and running gala-gopher. +### Dynamic Configuration RESTful APIs +The port number of the web server is configurable (9999 by default). The URL format is **http://\[IP address of the gala-gopher node]:\[Port number]/\[Function]**. For example, the URL of the flame graph function on the local node is **http://localhost:9999/flamegraph**. (The following uses the flame graph as an example.) -#### How to Use +#### Configuring the Probe Monitoring Scope -##### Deployment of External Dependent Software +The probe is disabled by default. You can dynamically enable a probe and set its monitoring scope through the API. Take the flame graph as an example: use the REST API to enable the oncpu, offcpu, and mem flame graph capabilities. The monitoring scope can be set by process ID, process name, container ID, or Pod. -![gopher软件架构图](./figures/gopher软件架构图.png) +The following is an example of enabling oncpu and offcpu information collection for a flame graph: -As shown in the preceding figure, the green parts are external dependent components of gala-gopher. gala-gopher outputs metric data to Prometheus, and metadata and abnormal events to Kafka. gala-anteater and gala-spider, in the gray rectangles, obtain data from Prometheus and Kafka.
+```bash +curl -X PUT http://localhost:9999/flamegraph --data-urlencode json=' +{ + "cmd": { + "bin": "/opt/gala-gopher/extend_probes/stackprobe", + "check_cmd": "", + "probe": [ + "oncpu", + "offcpu" + ] + }, + "snoopers": { + "proc_id": [ + 101, + 102 + ], + "proc_name": [ + { + "comm": "app1", + "cmdline": "", + "debugging_dir": "" + }, + { + "comm": "app2", + "cmdline": "", + "debugging_dir": "" + } + ], + "pod_id": [ + "pod1", + "pod2" + ], + "container_id": [ + "container1", + "container2" + ] + } +}' +``` -> Note: Obtain the installation packages of Kafka and Prometheus from the official websites. +The collection features are described as follows: + +| Feature | Description | Collection Subitems | Supported Monitored Object Range | Startup File | Startup Condition | +| ------------- | -------------------------------------------- | ------------------------------------------------------------ | ---------------------------------------- | ---------------------------------- |---------------------------| +| flamegraph | On-line performance flame graph observation capability | oncpu, offcpu, mem | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/stackprobe | NA | +| l7 | Application layer (layer 7) protocol observation capability | l7_bytes_metrics,l7_rpc_metrics,l7_rpc_trace | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/l7probe | NA | +| tcp | TCP exception and status observation capability | tcp_abnormal, tcp_rtt, tcp_windows, tcp_rate, tcp_srtt, tcp_sockbuf, tcp_stats,tcp_delay | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/tcpprobe | NA | +| socket | Socket (TCP/UDP) exception observation capability | tcp_socket, udp_socket | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/endpoint | NA | +| io | Block layer I/O observation capability | io_trace, io_err, io_count, page_cache | NA | /opt/gala-gopher/extend_probes/ioprobe | NA | +| proc | Process system 
call, I/O, DNS, VFS, and ioctl observation capabilities| proc_syscall, proc_fs, proc_io, proc_dns,proc_pagecache,proc_net,proc_offcpu,proc_ioctl | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/taskprobe | NA | +| jvm | JVM layer GC, thread, memory, and cache observation capabilities | NA | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/jvmprobe | NA | +| ksli | Redis performance SLI (access latency) observation capability | NA | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/ksliprobe | NA | +| postgre_sli | PostgreSQL database performance SLI (access latency) observation capability | NA | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/pgsliprobe | NA | +| opengauss_sli | openGauss access throughput observation capability | NA | \[ip, port, dbname, user,password] | /opt/gala-gopher/extend_probes/pg_stat_probe.py | NA | +| dnsmasq | DNS session observation capability | NA | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/rabbitmq_probe.sh | NA | +| lvs | LVS session observation capability | NA | NA | /opt/gala-gopher/extend_probes/trace_lvs | lsmod\|grep ip_vs\| wc -l | +| nginx | Nginx layer 4/layer 7 session observation capability | NA | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/nginx_probe | NA | +| haproxy | HAProxy layer 4/layer 7 session observation capability | NA | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/trace_haproxy | NA | +| kafka | Kafka producer/consumer topic observation capability | NA | NA | /opt/gala-gopher/extend_probes/kafkaprobe | NA | +| baseinfo | Basic system information | cpu, mem, nic, disk, net, fs, proc, host, con | proc_id, proc_name, pod_id, container_id | system_infos | NA | +| virt | Virtualization management information | NA | NA | virtualized_infos | NA | +| tprofiling | Thread-level performance profiling observation capability | oncpu, 
syscall_file, syscall_net, syscall_lock, syscall_sched | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/tprofiling | NA | +| container | Container information | NA | proc_id, proc_name, container_id | /opt/gala-gopher/extend_probes/cadvisor_probe.py | NA | +| sermant | Java application layer 7 protocol observation capability. Currently, the dubbo protocol is supported.| l7_bytes_metrics, l7_rpc_metrics | proc_id, proc_name, pod_id, container_id | /opt/gala-gopher/extend_probes/sermant_probe.py | NA | + +#### Configuring the Probe Monitoring Parameters + +During probe running, you need to set some parameters, such as the sampling period and reporting period of the flame graph. + +```bash +curl -X PUT http://localhost:9999/flamegraph --data-urlencode json=' +{ + "params": { + "report_period": 180, + "sample_period": 180, + "metrics_type": [ + "raw", + "telemetry" + ] + } +}' +``` + +The detailed running parameters are as follows: + +| Parameter | Description | Default Value & Range | Unit | Supported Features | Supported or Not| +| :-----------------: | :--------------------------------: | :----------------------------------------------------------: |:-------:| :-----------------------------------------: | :--------: | +| sample_period | Sampling period | 5000, \[100~10000] | ms | io, tcp | Y | +| report_period | Report period | 60, \[5~600] | s | ALL | Y | +| latency_thr | Latency report threshold | 0, \[10~100000] | ms | tcp, io, proc, ksli | Y | +| offline_thr | Process offline report threshold | 0, \[10~100000] | ms | proc | Y | +| drops_thr | Packet loss report threshold | 0, \[10~100000] | package | tcp, nic | Y | +| res_lower_thr | Lower resource threshold | 0, \[0~100] | percent | ALL | Y | +| res_upper_thr | Upper resource threshold | 0, \[0~100] | percent | ALL | Y | +| report_event | Exception event report | 0, \[0, 1] | - | ALL | Y | +| metrics_type | Telemetry metric report | "raw", \["raw", "telemetry"] | - | ALL | N | +| env 
| Environment type | "node", \["node", "container", "kubenet"] | - | ALL | N | +| l7_protocol | Layer 7 protocol scope | "",\["http", "pgsql", "redis","mysql", "kafka", "mongo", "dns"] | - | l7 | Y | +| support_ssl | SSL observation support | 0, \[0, 1] | - | l7 | Y | +| multi_instance | Independent flame graph for each process | 0, \[0, 1] | - | flamegraph | Y | +| native_stack | Local language stack display (for Java processes)| 0, \[0, 1] | - | flamegraph | Y | +| cluster_ip_backend | Cluster IP backend conversion | 0, \[0, 1] | - | tcp, l7 | Y | +| pyroscope_server | IP address of the flame graph UI server | "localhost:4040" | - | flamegraph | Y | +| svg_period | Flame graph SVG file generation period | 180, \[30, 600] | s | flamegraph | Y | +| perf_sample_period | Stack information collection period for **oncpu** flame graphs | 10, \[10, 1000] | ms | flamegraph | Y | +| svg_dir | Directory for storing flame graph SVG files | "/var/log/gala-gopher/stacktrace" | - | flamegraph | Y | +| flame_dir | Directory for storing original stack information of flame graphs | "/var/log/gala-gopher/flamegraph" | - | flamegraph | Y | +| dev_name | Observed NIC/drive device name | "" | - | io, kafka, ksli, postgre_sli, baseinfo, tcp| Y | +| continuous_sampling | Continuous sampling | 0, \[0, 1] | - | ksli | Y | +| elf_path | Observed executable file path | "" | - | baseinfo, nginx, haproxy, dnsmasq | Y | +| kafka_port | Observed Kafka port | 9092, \[1, 65535] | - | kafka | Y | +| cadvisor_port | Started cAdvisor port | 8080, \[1, 65535] | - | cadvisor | Y | + +#### Starting and Stopping a Probe + +```bash +curl -X PUT http://localhost:9999/flamegraph --data-urlencode json=' +{ + "state": "running" // optional: running,stopped +}' +``` + +#### Restrictions + +1. The interface is stateless. The settings uploaded each time are the final running results of the probe, including the status, parameters, and monitoring scope. +2. 
The monitored objects can be combined arbitrarily, and the monitoring scope is the combination. +3. The startup file must be authentic and valid. +4. You can enable some or all subitems of a collection feature as required. However, you can only disable all subitems of a collection feature at once. +5. The monitored object of openGauss is the database instance (**ip**/**port**/**dbname**/**user**/**password**). +6. The interface can receive a maximum of 2048 bytes of data each time. + +#### Obtaining the Probe Configurations and Running Status + +```bash +curl -X GET http://localhost:9999/flamegraph +{ + "cmd": { + "bin": "/opt/gala-gopher/extend_probes/stackprobe", + "check_cmd": "", + "probe": [ + "oncpu", + "offcpu" + ] + }, + "snoopers": { + "proc_id": [ + 101, + 102 + ], + "proc_name": [ + { + "comm": "app1", + "cmdline": "", + "debugging_dir": "" + }, + { + "comm": "app2", + "cmdline": "", + "debugging_dir": "" + } + ], + "pod_id": [ + "pod1", + "pod2" + ], + "container_id": [ + "container1", + "container2" + ] + }, + "params": { + "report_period": 180, + "sample_period": 180, + "metrics_type": [ + "raw", + "telemetry" + ] + }, + "state": "running" +} +``` + +## Introduction to stackprobe + +stackprobe generates performance flame graphs for cloud native environments. + +### Features + +- C/C++, Go, Rust, and Java applications can be observed. + +- The call stack supports container and process granularities. For processes in a container, the pod name and container name of the workload are prefixed with **\[Pod]** and **\[Con]** at the bottom of the call stack, respectively. A process name is prefixed with **\[__]**. Threads and functions (methods) do not have prefixes. + +- The flame graph in SVG format can be generated locally, or the call stack data can be uploaded to the middleware. + +- Flame graphs can be generated or uploaded by multiple instances based on the process granularity.
+ +- For the flame graph of a Java process, both local and Java methods can be displayed. + +- Supports multiple types of flame graphs, such as **oncpu**, **offcpu**, and **mem**. + +- The sampling period can be customized. + +### Usage Instructions + +Startup command example (basic): Use the default parameters to start the performance flame graph. + +```shell +curl -X PUT http://localhost:9999/flamegraph -d json='{ "cmd": {"probe": ["oncpu"] }, "snoopers": {"proc_name": [{ "comm": "cadvisor"}] }, "state": "running"}' +``` + +Startup command example (advanced): Use custom parameters to start the performance flame graph. For details about the configurable parameters, see [Configuring the Probe Monitoring Parameters](#configuring-the-probe-monitoring-parameters). + +```shell +curl -X PUT http://localhost:9999/flamegraph -d json='{ "cmd": { "check_cmd": "", "probe": ["oncpu", "offcpu", "mem"] }, "snoopers": { "proc_name": [{ "comm": "cadvisor", "cmdline": "", "debugging_dir": "" }, { "comm": "java", "cmdline": "", "debugging_dir": "" }] }, "params": { "perf_sample_period": 100, "svg_period": 300, "svg_dir": "/var/log/gala-gopher/stacktrace", "flame_dir": "/var/log/gala-gopher/flamegraph", "pyroscope_server": "localhost:4040", "multi_instance": 1, "native_stack": 0 }, "state": "running"}' +``` + +The main configuration items are described as follows: + +- Set the type of the flame graph to be enabled. + + The **probe** parameter can be set to **oncpu**, **offcpu**, or **mem**, indicating the CPU occupation time, blocking time, and memory size of a process, respectively. + + Example: + + ```json + "probe": ["oncpu", "offcpu", "mem"] + ``` + +- Set the period for generating a local flame graph SVG file. + + The **svg_period** parameter can be set to an integer in the range \[30, 600]. The unit is second. The default value is **180**.
+ + Example: + + ```json + "svg_period": 300 + ``` + +- Enable or disable stack information uploading to Pyroscope + + Set the **pyroscope_server** parameter. The parameter value must contain **addr** and **port**. If the parameter is empty or the format is incorrect, the probe does not attempt to upload stack information. + + The upload period is 30s. + + Example: + + ```json + "pyroscope_server": "localhost:4040" + ``` + +- Set the call stack sampling period. + + Set the **perf_sample_period** parameter. The unit is millisecond. The default value is **10**. You can set this parameter to an integer ranging from \[10, 1000]. This parameter is valid only for flame graphs of the **oncpu** type. + + Example: + + ```json + "perf_sample_period": 100 + ``` + +- Enable or disable flame graph generation for multiple instances + + Set the **multi_instance** parameter. The value can be **0** or **1**. The default value is **0**. **0** indicates that the flame graphs of all processes are combined. **1** indicates that the flame graph of each process is generated separately. + + Example: + + ```json + "multi_instance": 1 + ``` + +- Enable or disable custom local call stack collection. + + Set the **native_stack** parameter. The value can be **0** or **1**. The default value is **0**. This parameter is valid only for Java processes. **0** indicates that the JVM call stack is not collected. **1** indicates that the JVM call stack is collected. + + Example: + + ```json + "native_stack": 1 + ``` + + Output: (left: **"native_stack": 1**; right: **"native_stack": 0**) + + ![image-20230804172905729](./figures/flame_muti_ins.png) + +### Implementation + +#### 1. User-Mode Program Logic + +The program periodically (every 30s) converts the stack information reported by the kernel mode from an address to a symbol based on the symbol table. Then, the flamegraph plugin or pyroscope is used to convert the symbolic call stack into a flame graph. 
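The address-to-symbol conversion described above can be sketched as a lookup in a sorted symbol table (a simplified Python illustration; the symbol table, addresses, and function names below are hypothetical, while the real probe resolves addresses from **/proc/kallsyms** or ELF symbol tables):

```python
import bisect

# Hypothetical symbol table: sorted (start_address, symbol_name) pairs,
# e.g. parsed from /proc/kallsyms or an ELF symbol table.
SYMBOLS = [
    (0x1000, "main"),
    (0x1800, "parse_args"),
    (0x2400, "do_work"),
]

def addr_to_symbol(addr):
    """Map an address to the symbol whose range starts at or before it."""
    starts = [start for start, _ in SYMBOLS]
    i = bisect.bisect_right(starts, addr) - 1
    return SYMBOLS[i][1] if i >= 0 else "[unknown]"

def symbolize(stack):
    """Convert a raw address call stack (leaf first) into a folded-stack line."""
    return ";".join(addr_to_symbol(addr) for addr in reversed(stack))

# A raw stack reported by the kernel side (leaf first):
print(symbolize([0x2410, 0x1810, 0x1004]))  # main;parse_args;do_work
```

The folded `a;b;c` line format is what flame graph tooling typically consumes.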
+ +The method for obtaining the symbol table differs by code segment type. + +- Obtaining the kernel symbol table: Read **/proc/kallsyms**. + +- Obtaining the local language symbol table: Query the virtual memory mapping file (**/proc/***{pid}***/maps**) of the process to obtain the address mapping of each code segment in the process memory, and then use the libelf library to load the symbol table of the module corresponding to each code segment. + +- Obtaining the Java language symbol table: + + Because Java methods do not statically map to the virtual address space of the process, other ways are used to obtain the symbolic Java call stack. + +##### Method 1: perf + +Load the JVM agent dynamic library to the Java process to trace JVM method compilation and loading events, obtain and record the mapping between memory addresses and Java symbols, and generate the symbol table of the Java process in real time. This method requires the Java process to enable the `-XX:+PreserveFramePointer` startup parameter. The advantage of this method is that the call stack of the JVM can be displayed in the flame graph, and the Java flame graph generated in this method can be displayed together with the flame graphs of other processes. + +##### Method 2: JFR + +Track various events and metrics of Java applications by dynamically turning on JFR (Java Flight Recorder), the JVM's built-in profiler. To enable JFR, load the Java agent to the Java process. The Java agent invokes the JFR API. The advantage of this method is that the collection of Java method call stacks is more accurate and detailed. + +The preceding two performance analysis methods for Java processes can be loaded in real time (without restarting the Java process) and have the advantage of low overhead. If the stackprobe startup parameters are **"multi_instance": 1** and **"native_stack": 0**, stackprobe uses method 2 to generate Java process flame graphs. Otherwise, method 1 is used. + +#### 2.
Kernel-Mode Program Logic + +The kernel-mode logic is implemented based on eBPF. Different flame graph types correspond to different eBPF programs. The eBPF programs traverse the call stacks of the current user mode and kernel mode periodically or upon event triggers, and report the call stacks to the user mode. + +##### 2.1 oncpu Flame Graph + +Attach the sampling eBPF program to the perf software event **PERF_COUNT_SW_CPU_CLOCK** to periodically sample the call stack. + +##### 2.2 offcpu Flame Graph + +Mount the sampling eBPF program to the tracepoint (**sched_switch**) for process scheduling. The sampling eBPF program records the time when the process is scheduled out and the process ID. When the process is scheduled back, the call stack is sampled. + +##### 2.3 mem Flame Graph + +Mount the sampling eBPF program to the tracepoint (**page_fault_user**) for page faults. When the event is triggered, the call stack is sampled. + +#### 3. Java Language Support + +- Main stackprobe process: + + 1. Receive an IPC message to obtain the Java process to be observed. + 2. Use the Java agent loading module to load the JVM agent program **jvm_agent.so** to the Java process to be observed (corresponding to [Method 1](#method-1-perf)) or **JstackProbeAgent.jar** (corresponding to [Method 2](#method-2-jfr)). + 3. The main process of method 1 loads the **java-symbols.bin** file of the corresponding Java process for query during address conversion. The main process of method 2 loads the **stacks-**_{flame_type}_**.txt** file of the corresponding Java process, which can be directly used to generate a flame graph. + +- Java agent loading module + + 1. If a new Java process is found, copy the JVM agent program to the process space **/proc/**_{pid}_**/root/tmp** (because the agent program needs to be visible to the JVM in the container during attach). + + 2.
Set the owner of the preceding directory and JVM agent program to be the same as that of the Java process to be observed. + + 3. Start the jvm_attach subprocess and transfer parameters related to the Java process to be observed. + +- JVM agent program + + - **jvm_agent.so**: Registers the JVMTI callback function. + + When the JVM loads a Java method or dynamically compiles a local method, the JVM invokes the callback function. The callback function writes the Java class name, method name, and corresponding memory address to the observed Java process space (**/proc/**_{pid}_**/root/tmp/java-data-**_{pid}_**/java-symbols.bin**). + + - **JstackProbeAgent.jar**: Calls the JFR API. + + Enable the JFR function for 30 seconds, convert the JFR statistics result to the stack format available for the flame graph, and export the result to the observed Java process space (**/proc/**_{pid}_**/root/tmp/java-data-**_{pid}_**/stacks-**_{flame_type}_**.txt**). For details, see [Introduction to JstackProbe](https://gitee.com/openeuler/gala-gopher/blob/dev/src/probes/extends/java.probe/jstack.probe/readme.md). + +- jvm_attach: Loads the JVM agent program to the JVM of the observed process in real time. + For details, see **sun.tools.attach.LinuxVirtualMachine** and **jattach** in the JDK source code. + + 1. Set its own namespace. (When the JVM loads the agent, the namespace of the loading process must be the same as that of the observed process.) + + 2. Check whether the JVM attach listener is started (whether the UNIX socket file **/proc/**_{pid}_**/root/tmp/.java\_pid**_{pid}_ exists). + + 3. If the listener is not started, create **/proc/**_{pid}_**/cwd/.attach_pid**_{pid}_ and send the SIGQUIT signal to the JVM. + + 4. Connect to the UNIX socket. + + 5. If the read response is **0**, the attach operation is successful. + + The following figure shows the attach agent process.
+ + ![attach process](./figures/attach_process.png) + +### Notes + +- To obtain the best observation effect on Java applications, set the stackprobe startup option to **"multi_instance": 1, "native_stack": 0** to enable JFR observation (JDK8u262+). Otherwise, stackprobe generates a Java flame graph using perf. In perf mode, enable the JVM option `-XX:+PreserveFramePointer` (JDK8 or later). + +### Constraints + +- HotSpot JVM-based Java application observation is supported. + +## Introduction to tprofiling + +tprofiling is an eBPF-based thread-level application performance diagnosis tool provided by gala-gopher. It uses the eBPF technology to observe key system performance events of threads and associates rich event content with them. In this way, the running status and key actions of threads are recorded in real time, helping users quickly identify application performance problems. + +### Features + +From the perspective of the OS, a running application program consists of multiple processes, and each process consists of multiple running threads. tprofiling observes and records key actions (events) performed by these threads during thread running, and displays the actions on a timeline on the front-end page. In this way, you can intuitively analyze what a thread is doing in a certain period of time, for example, whether it is running on the CPU or blocked on a file or network operation. When a performance issue occurs in an application, you can analyze the execution sequence of key performance events of the corresponding thread to quickly demarcate and locate the issue. + +Based on the implemented event observation scope, tprofiling can locate application performance issues in the following scenarios: + +- File I/O time consumption and blocking +- Network I/O time consumption and blocking +- Lock contention +- Deadlock + +As more types of events are supplemented and improved, tprofiling will be able to cover more types of application performance issue scenarios.
+ +### Event Observation Scope + +Currently, tprofiling supports two types of system performance events: system call events and **oncpu** events. + +**System call events** + +Application performance issues are usually caused by system resource bottlenecks, such as high CPU usage and I/O resource waiting. Applications often access these system resources through system calls. Therefore, key system call events can be observed to identify time-consuming or blocked resource access operations. + +For details about the observed system call events, see [Supported System Call Events](#supported-system-call-events). The events include file operations (**file**), network operations (**net**), lock operations (**lock**), and scheduling operations (**sched**). Some of the observed system call events are as follows: + +- File operations (**file**) + - **read**/**write**: Reading or writing drive files or network resources, which may be time-consuming and blocked. + - **sync**/**fsync**: Synchronizing files to drives. The thread is blocked until the synchronization is complete. +- Network operations (**net**) + - **send**/**recv**: Reading or writing network resources, which may be time-consuming and blocked. +- Lock operations (**lock**) + - **futex**: System call related to user-mode lock implementation. If **futex** is triggered, lock contention occurs and the thread may be blocked. +- Scheduling operations (**sched**): System call events that may change the thread status, such as CPU release, sleep, and waiting for other threads. + - **nanosleep**: The thread enters the sleep state. + - **epoll_wait**: The thread is waiting for the arrival of an I/O event. The thread is blocked before the event arrives. + +**oncpu events** + +The running state of a thread can be **oncpu** or **offcpu** according to whether the thread is running on the CPU. By observing the **oncpu** event of a thread, you can identify whether the thread is performing time-consuming CPU operations. 
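As a rough sketch of how **oncpu** events expose CPU-heavy threads, the snippet below sums per-thread **oncpu** durations over an observation window (the event records and numbers are hypothetical, not probe output):

```python
from collections import defaultdict

# Hypothetical tprofiling events: (thread id, event type, duration in seconds).
events = [
    (101, "oncpu", 0.8),
    (101, "oncpu", 0.9),
    (102, "oncpu", 0.1),
    (102, "file", 1.5),   # thread 102 is mostly blocked on file I/O
]

window = 2.0  # observation window in seconds

# Accumulate oncpu time per thread.
oncpu_time = defaultdict(float)
for tid, etype, duration in events:
    if etype == "oncpu":
        oncpu_time[tid] += duration

for tid, t in sorted(oncpu_time.items()):
    print(f"thread {tid}: {t / window:.0%} oncpu")
# thread 101: 85% oncpu -> likely a time-consuming CPU operation
# thread 102: 5% oncpu  -> mostly blocked
```

A high oncpu share points at computation-bound threads; a low share combined with long file, net, or lock events points at blocking.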
+ +### Event Information + +Thread profiling events include the following information: + +- Event sources: thread ID, thread name, process ID, process name, container ID, container name, host ID, and host name of the event. + + - **thread.pid**: ID of the thread to which the event belongs. + - **thread.comm**: name of the thread to which the event belongs. + - **thread.tgid**: ID of the process to which the event belongs. + - **proc.name**: name of the process to which the event belongs. + - **container.id**: ID of the container to which the event belongs. + - **container.name**: name of the container to which the event belongs. + - **host.id**: ID of the host to which the event belongs. + - **host.name**: name of the host to which the event belongs. + +- Event attributes: common event attributes and extended event attributes. + + - Common event attributes: + + - **event.name**: event name. + - **event.type**: event type. Currently, **oncpu**, **file**, **net**, **lock**, and **sched** are supported. + - **start_time**: start time of the event or the first event in an aggregated event. For details about aggregated events, see [Aggregated Events](#aggregated-events). + - **end_time**: event end time or the end time of the last event in an aggregated event. + - **duration**: event execution time. The value is (**end_time** - **start_time**). + - **count**: number of events that are aggregated. + + - Extended event attributes: + + - **func.stack**: information about the function call stack of an event. + - **file.path**: file path of a file event. + - **sock.conn**: TCP connection information about a network event. + - **futex.op**: operation type of a **futex** system call event. The value can be **wait** or **wake**. + + For details about the extended event attributes supported by different event types, see [Supported System Call Events](#supported-system-call-events). 
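To make the attribute relationships concrete, the sketch below builds one event record from the fields above and checks the stated invariant that **duration** equals (**end_time** - **start_time**) (all values are hypothetical; timestamps and duration in milliseconds):

```python
# A hypothetical tprofiling event record as a flat dict.
event = {
    # event source
    "thread.pid": 10, "thread.tgid": 10, "thread.comm": "java",
    # common event attributes
    "event.name": "read", "event.type": "file",
    "start_time": 1661088145000, "end_time": 1661088146000,
    "duration": 1000, "count": 1,
    # extended attributes (valid for file events)
    "file.path": "/test.txt", "func.stack": "read;",
}

# The invariant stated for non-aggregated events.
assert event["duration"] == event["end_time"] - event["start_time"]

print(f'{event["event.type"]}/{event["event.name"]} on '
      f'{event["file.path"]} took {event["duration"]} ms')
```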
+ +### Event Output + +As an extended eBPF probe program provided by gala-gopher, tprofiling sends generated system events to gala-gopher for processing. gala-gopher then outputs the events in the open source OpenTelemetry event format, and sends them to the Kafka message queue in JSON format. The frontend can connect to Kafka to consume tprofiling events. + +The following is the example output of a thread profiling event: + +```json +{ + "Timestamp": 1661088145000, + "SeverityText": "INFO", + "SeverityNumber": 9, + "Body": "", + "Resource": { + "host.id": "", + "host.name": "", + "thread.pid": 10, + "thread.tgid": 10, + "thread.comm": "java", + "proc.name": "xxx.jar", + "container.id": "", + "container.name": "" + }, + "Attributes": { + "values": [ + { + // common info + "event.name": "read", + "event.type": "file", + "start_time": 1661088145000, + "end_time": 1661088146000, + "duration": 1000, + "count": 1, + // extend info + "func.stack": "read;", + "file.path": "/test.txt" + }, + { + "event.name": "oncpu", + "event.type": "oncpu", + "start_time": 1661088146000, + "end_time": 1661088147000, + "duration": 1000, + "count": 1 + } + ] + } +} +``` + +Field description: + +- **Timestamp**: time when the event is reported. +- **Resource**: event source information. +- **Attributes**: event attribute information, which contains a **values** list. Each item in the list indicates a tprofiling event from the same source and contains the attribute information of the event. + +### Getting Started + +#### Installation and Deployment + +tprofiling is an extended eBPF probe program provided by gala-gopher. Therefore, you need to install and deploy gala-gopher before enabling tprofiling. + +In addition, to use the tprofiling capability on the frontend user interface (UI), [gala-ops](https://gitee.com/openeuler/gala-docs) provides a demonstration UI for the tprofiling function based on the open source Kafka + Logstash + Elasticsearch + Grafana solution.
You can use the deployment tool provided by gala-ops for quick deployment. + +#### Operating Architecture + +![](./figures/tprofiling-run-arch.png) + +Frontend software description: + +- Kafka: open source message queue middleware used to receive and store tprofiling events collected by gala-gopher. +- Logstash: real-time open source log collection engine used to consume tprofiling events from Kafka, filter and convert the events, and send them to Elasticsearch. +- Elasticsearch: open source distributed search and analysis engine used to store processed tprofiling events for Grafana query and visualization. +- Grafana: open source visualization tool used to query and visualize collected tprofiling events. You can use the tprofiling function on the UI provided by Grafana to analyze application performance issues. + +#### Deploying the tprofiling Probe + +You need to install gala-gopher first. For details about how to install and deploy gala-gopher, see the _Gala-gopher Documentation_. tprofiling events are sent to Kafka. Therefore, you need to configure the Kafka service address during deployment. + +After installing and running gala-gopher, use the HTTP-based dynamic configuration interface provided by gala-gopher to start the tprofiling probe. + +```sh +curl -X PUT http://<gala-gopher-node-IP>:9999/tprofiling -d json='{"cmd": {"probe": ["oncpu", "syscall_file", "syscall_net", "syscall_sched", "syscall_lock"]}, "snoopers": {"proc_name": [{"comm": "java"}]}, "state": "running"}' +``` + +Parameter description: + +- **\<gala-gopher-node-IP>**: IP address of the node where gala-gopher is deployed. +- **probe**: The **probe** list under **cmd** specifies the range of system events observed by the tprofiling probe. **oncpu**, **syscall_file**, **syscall_net**, **syscall_sched**, and **syscall_lock** correspond to the **oncpu** event and the **file**, **net**, **sched**, and **lock** system call events. You can choose which tprofiling event types to observe.
+- **proc_name**: **proc_name** under **snoopers** filters the processes to be observed by process name. You can also use the **proc_id** configuration item to filter by process ID. For details, see [Dynamic Configuration RESTful APIs](#dynamic-configuration-restful-apis). + +To disable the tprofiling probe, run the following command: + +```sh +curl -X PUT http://<gala-gopher-node-IP>:9999/tprofiling -d json='{"state": "stopped"}' +``` + +#### Deploying the Frontend Software + +The software required for using the tprofiling function includes Kafka, Logstash, Elasticsearch, and Grafana. They are installed on the management node. You can use the deployment tool provided by gala-ops to quickly install and deploy them. For details, see the gala-ops documentation. + +On the management node, run the following command to obtain the deployment script and install the middleware Kafka, Logstash, and Elasticsearch: + +```sh +sh deploy.sh middleware -K -E -A -p +``` + +Run the following command to install Grafana: + +```sh +sh deploy.sh grafana -P -E +``` + +#### Usage + +After the deployment is complete, visit **http://**_{Grafana-node-IP}_**:3000** and log in to Grafana to use A-Ops. Both the default username and password are **admin**. + +After logging in to Grafana, find the dashboard named **ThreadProfiling**. + +![image-20230628155002410](./figures/tprofiling-dashboard.png) + +Click it to access the frontend page of the tprofiling function. + +![image-20230628155249009](./figures/tprofiling-dashboard-detail.png) + +### Usage Examples + +#### Case 1: Locating a Deadlock Issue + +![image-20230628095802499](./figures/deadlock.png) + +The preceding figure shows the thread profiling result of a deadlock demo process. The pie chart of process event execution time shows that lock events (in gray) account for a high proportion during this period. The lower part displays the thread profiling result of the entire process.
The vertical axis displays the execution sequence of profiling events of different threads in the process. The **java** thread is the main thread, which stays blocked. The service threads **LockThd1** and **LockThd2**, after executing some **oncpu** events and **file** events, intermittently execute **lock** events for a long time. The **lock** event details (as shown in the following figure) show that the **futex** system call event is triggered and the execution time is 60 seconds. + +![image-20230628101056732](./figures/deadlock2.png) + +Based on the preceding observation, it can be found that the service threads **LockThd1** and **LockThd2** may be abnormal. The thread view shows the thread profiling results of the two service threads **LockThd1** and **LockThd2**. + +![image-20230628102138540](./figures/deadlock3.png) + +The preceding figure shows the profiling result of each thread. The vertical axis displays the execution sequence of different event types in the thread. As shown in the figure, threads **LockThd1** and **LockThd2** periodically execute events, including **oncpu**, **file**, and **lock** events. However, at a certain moment (near 10:17:00), they simultaneously execute a **futex** event of the **lock** type for a long time, and no **oncpu** event occurs during this period, indicating that they enter the blocking state. **futex** is a system call related to user-mode lock implementation. If **futex** is triggered, lock contention occurs and threads may enter the blocking state. + +Based on the preceding analysis, deadlocks may occur on threads **LockThd1** and **LockThd2**. + +#### Case 2: Locating a Lock Contention Issue + +![image-20230628111119499](./figures/lockcompete1.png) + +The preceding figure shows the thread profiling result of a lock contention demo process. As shown in the figure, the process executes three types of events: **lock**, **net**, and **oncpu**. The process contains three running service threads.
From 11:05:45 to 11:06:45, the event execution time of the three service threads becomes long, which may indicate performance issues. Similarly, the thread view shows the thread profiling result of each thread. In addition, the time range can be narrowed down to a point near the time when the exception may have occurred. + +![image-20230628112709827](./figures/lockcompete2.png) + +By viewing the event execution sequence of each thread, we can understand what each thread is doing during this period. + +- Thread **CompeteThd1**: Short **oncpu** events are triggered at intervals to execute computing tasks. However, a long **oncpu** event is triggered around 11:05:45, indicating that a time-consuming computing task is being executed. + + ![image-20230628113336435](./figures/lockcompete3.png) + +- Thread **CompeteThd2**: Short **net** events are triggered at intervals. The event details show that the thread is sending network messages using the **write** system call, and the corresponding TCP connection information is displayed. At 11:05:45, a **futex** event is executed for a long time and the thread enters the blocking state. In this case, the execution interval of the **write** network events is prolonged. + + ![image-20230628113759887](./figures/lockcompete4.png) + + ![image-20230628114340386](./figures/lockcompete5.png) + +- Thread **tcp-server**: The TCP server continuously reads requests sent by the client using the **read** system call. The **read** event execution time is prolonged around 11:05:45, indicating that the server is waiting for network requests. + + ![image-20230628114659071](./figures/lockcompete6.png) + +Based on the preceding analysis, it can be found that each time thread **CompeteThd1** performs a time-consuming **oncpu** operation, thread **CompeteThd2** uses the **futex** system call and enters the blocking state.
Once thread **CompeteThd1** completes the **oncpu** operation, thread **CompeteThd2** obtains the CPU and performs the network **write** operation. Therefore, there is a high probability that lock contention occurs between threads **CompeteThd1** and **CompeteThd2**. TCP network communication exists between the **tcp-server** thread and the **CompeteThd2** thread. The **CompeteThd2** thread cannot send network requests because it is waiting for the lock. As a result, the **tcp-server** thread waits for **read** network requests most of the time. + +### Topics + +#### Supported System Call Events + +The basic principles for selecting the system call events to be observed are as follows: + +1. Select events (such as file operations, network operations, and lock operations) that may be time-consuming and blocked. Such events usually involve access to system resources. +2. Select events that affect the status of threads. + +| Event/System Call | Description | Default Event Type | Extended Event Content | +| ----------------- | ----------------------------------------------------- | -------------- | -------------------------------- | +| read | Reading or writing drive files or network resources, which may be time-consuming and blocked. | file | file.path, sock.conn, func.stack | +| write | Reading or writing drive files or network resources, which may be time-consuming and blocked. | file | file.path, sock.conn, func.stack | +| readv | Reading or writing drive files or network resources, which may be time-consuming and blocked. | file | file.path, sock.conn, func.stack | +| writev | Reading or writing drive files or network resources, which may be time-consuming and blocked. | file | file.path, sock.conn, func.stack | +| preadv | Reading or writing drive files or network resources, which may be time-consuming and blocked. | file | file.path, sock.conn, func.stack | +| pwritev | Reading or writing drive files or network resources, which may be time-consuming and blocked.
| file | file.path, sock.conn, func.stack | +| sync | Synchronizing files to drives. The thread is blocked until the synchronization is complete. | file | func.stack | +| fsync | Synchronizing files to drives. The thread is blocked until the synchronization is complete. | file | file.path, sock.conn, func.stack | +| fdatasync | Synchronizing files to drives. The thread is blocked until the synchronization is complete. | file | file.path, sock.conn, func.stack | +| sched_yield | Proactively releasing the CPU for rescheduling. | sched | func.stack | +| nanosleep | Entering the sleep state. | sched | func.stack | +| clock_nanosleep | Entering the sleep state. | sched | func.stack | +| wait4 | Waiting for a child process to change state. The thread is blocked. | sched | func.stack | +| waitpid | Waiting for a child process to change state. The thread is blocked. | sched | func.stack | +| select | When no event arrives, the thread is blocked and waits. | sched | func.stack | +| pselect6 | When no event arrives, the thread is blocked and waits. | sched | func.stack | +| poll | When no event arrives, the thread is blocked and waits. | sched | func.stack | +| ppoll | When no event arrives, the thread is blocked and waits. | sched | func.stack | +| epoll_wait | When no event arrives, the thread is blocked and waits. | sched | func.stack | +| sendto | Reading or writing network resources, which may be time-consuming and blocked. | net | sock.conn, func.stack | +| recvfrom | Reading or writing network resources, which may be time-consuming and blocked. | net | sock.conn, func.stack | +| sendmsg | Reading or writing network resources, which may be time-consuming and blocked. | net | sock.conn, func.stack | +| recvmsg | Reading or writing network resources, which may be time-consuming and blocked. | net | sock.conn, func.stack | +| sendmmsg | Reading or writing network resources, which may be time-consuming and blocked. | net | sock.conn, func.stack | +| recvmmsg | Reading or writing network resources, which may be time-consuming and blocked.
| net | sock.conn, func.stack | +| futex | Usually indicates lock wait and thread blocking state. | lock | futex.op, func.stack | + +#### Aggregated Events + +Currently, tprofiling supports two types of system performance events: system call events and **oncpu** events. The **oncpu** event and some system call events (such as **read**/**write**) may be frequently triggered in a specific application scenario. As a result, a large number of system events are generated, which greatly affects the performance of the observed application program and the performance of the tprofiling probe. + +To optimize performance, tprofiling aggregates multiple system events that belong to the same thread and have the same event name within a period of time (1s) into one event for reporting. Therefore, a tprofiling event actually refers to an aggregated event that contains one or more identical system events. Compared with a real system event, the meanings of some attributes of an aggregated event are changed as follows: + +- **start_time**: start time of the first event in the aggregated event. +- **end_time**: event end time, which is (**start_time** + **duration**). +- **duration**: accumulated execution time of all system events in the aggregated event. +- **count**: number of system events in the aggregated event. When the value is **1**, an aggregated event is equivalent to a system event. +- Extended event attributes: extended attributes of the first system event in the aggregated event. + +## Introduction to L7Probe + +L7Probe observes layer 7 traffic, covering common protocols such as HTTP 1.x, PostgreSQL (PG), MySQL, Redis, Kafka, HTTP 2.0, MongoDB, and RocketMQ, as well as encrypted traffic. + +It covers node, container, and pod (K8s) scenarios.
+ +### Code Framework Design + +```text +L7Probe + | --- include // Common header files + + | --- connect.h // L7 connect object definition + + | --- pod.h // Pod/Container object definition + + | --- conn_tracker.h // L7 protocol tracker object definition + + | --- protocol // L7 protocol parsing + + | --- http // HTTP1.X L7 message structure definition and parsing + + | --- mysql // MySQL L7 message structure definition and parsing + + | --- pgsql // PostgreSQL L7 message structure definition and parsing + + | --- bpf // Kernel BPF code + + | --- L7.h // BPF program for parsing L7 protocol types + + | --- kern_sock.bpf.c // Kernel socket layer observation + + | --- libssl.bpf.c // OpenSSL layer observation + + | --- gossl.bpf.c // Go SSL layer observation + + | --- cgroup.bpf.c // Pod life cycle observation + + | --- pod_mng.c // Pod/Container instance management (pod/container lifecycle awareness) + + | --- conn_mng.c // L7 connect instance management (processing BPF observation events, such as Open/Close events and Stats statistics) + + | --- conn_tracker.c // L7 traffic tracker (tracking BPF observation data, such as data generated by system events such as send/write and read/recv) + + | --- bpf_mng.c // BPF program life cycle management (opening, loading, attaching, and unloading BPF programs on demand and in real time, including uprobe BPF programs) + + | --- session_conn.c // JSSE session management (recording the mapping between JSSE sessions and sock connections, and reporting JSSE connection information) + + | --- L7Probe.c // Main program of the probe +``` + +### Probe Output + +| metrics_name | table_name | metrics_type | unit | metrics description | +| --------------- | ---------- | ------------ | ---- | ------------------------------------------------------------ | +| tgid | NA | key | NA | Process ID of l7 session. | +| client_ip | NA | key | NA | Client IP address of l7 session. | +| server_ip | NA | key | NA | Server IP address of l7 session.
Remarks: In the K8s scenario, the cluster IP address can be converted into the backend IP address.| +| server_port | NA | key | NA | Server Port of l7 session.
Remarks: In the K8s scenario, the cluster port can be converted to the backend port.| +| l4_role | NA | key | NA | Role of l4 protocol (TCP Client/Server or UDP) | +| l7_role | NA | key | NA | Role of l7 protocol (Client or Server) | +| protocol | NA | key | NA | Name of l7 protocol (http/http2/mysql...) | +| ssl | NA | label | NA | Indicates whether an SSL-encrypted l7 session is used. | +| bytes_sent | l7_link | gauge | NA | Number of bytes sent by an l7 session. | +| bytes_recv | l7_link | gauge | NA | Number of bytes received by an l7 session. | +| segs_sent | l7_link | gauge | NA | Number of segments sent by an l7 session. | +| segs_recv | l7_link | gauge | NA | Number of segments received by an l7 session. | +| throughput_req | l7_rpc | gauge | qps | Request throughput of an l7 session. | +| throughput_resp | l7_rpc | gauge | qps | Response throughput of an l7 session. | +| req_count | l7_rpc | gauge | NA | Request count of an l7 session. | +| resp_count | l7_rpc | gauge | NA | Response count of an l7 session. | +| latency_avg | l7_rpc | gauge | ns | Average latency of an l7 session. | +| latency | l7_rpc | histogram | ns | Latency histogram of an l7 session. | +| latency_sum | l7_rpc | gauge | ns | Total latency of an l7 session. | +| err_ratio | l7_rpc | gauge | % | Error rate of an l7 session. | +| err_count | l7_rpc | gauge | NA | Error count of an l7 session. | + +### Dynamic Control + +#### Controlling the Scope of Observed Pods + +1. REST->gala-gopher. +2. gala-gopher->L7Probe. +3. L7Probe obtains related containers based on pods. +4. L7Probe obtains the cgroup ID (**cpuacct_cgrp_id**) based on the container and writes the cgroup ID to the object module (the **cgrp_add** API). +5. In the socket system event context, obtain the cgroup (**cpuacct_cgrp_id**) to which the process belongs. For details, see the Linux code (**task_cgroup**). +6. During observation, data is filtered through the object module (the **is_cgrp_exist** API). + +#### Observation Capability Control + +1. REST->gala-gopher. +2. gala-gopher->L7Probe. +3.
L7Probe dynamically enables or disables BPF observation capabilities (including throughput, latency, trace, and protocol type) based on the input parameters. + +### Observation Points + +#### Kernel Socket System Calls + +TCP-related system calls + +```c +int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen); + +int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen); + +int accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags); + +ssize_t write(int fd, const void *buf, size_t count); + +ssize_t send(int sockfd, const void *buf, size_t len, int flags); + +ssize_t read(int fd, void *buf, size_t count); + +ssize_t recv(int sockfd, void *buf, size_t len, int flags); + +ssize_t writev(int fd, const struct iovec *iov, int iovcnt); + +ssize_t readv(int fd, const struct iovec *iov, int iovcnt); +``` + +TCP&UDP-related system calls + +```c +ssize_t sendto(int sockfd, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen); + +ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen); + +ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags); + +ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags); + +int close(int fd); +``` + +Notes: + +1. The **read**/**write** and **readv**/**writev** operations may be confused with common file I/O operations. You can observe the **security_socket_sendmsg** function of the kernel to determine whether the FD refers to a socket. +2. **sendto**/**recvfrom** and **sendmsg**/**recvmsg** are used by both TCP and UDP. For details, see the following manuals. +3. **sendmmsg**/**recvmmsg** and **sendfile** are not supported currently. 
+ +[**sendto** manual](https://man7.org/linux/man-pages/man2/send.2.html): If **sendto()** is used on a connection-mode (**SOCK_STREAM**, **SOCK_SEQPACKET**) socket, the arguments _dest\_addr_ and _addrlen_ are ignored (and the error **EISCONN** may be returned when they are not NULL and 0), and the error **ENOTCONN** is returned when the socket was not actually connected. Otherwise, the address of the target is given by dest_addr with addrlen specifying its size. + +If the value of **dest_addr** is NULL, TCP is used. Otherwise, UDP is used. + +[**recvfrom** manual](https://linux.die.net/man/2/recvfrom): The **recvfrom()** and **recvmsg()** calls are used to receive messages from a socket, and may be used to receive data on a socket whether or not it is connection-oriented. + +If the value of **src_addr** is NULL, TCP is used. Otherwise, UDP is used. + +[**sendmsg** manual](https://man7.org/linux/man-pages/man3/sendmsg.3p.html): The **sendmsg()** function shall send a message through a connection-mode or connectionless-mode socket. If the socket is a connectionless-mode socket, the message shall be sent to the address specified by msghdr if no pre-specified peer address has been set. If a peer address has been pre-specified, either the message shall be sent to the address specified in msghdr (overriding the pre-specified peer address), or the function shall return -1 and set errno to \[EISCONN]. If the socket is connection-mode, the destination address in msghdr shall be ignored. + +If the value of **msghdr->msg_name** is NULL, TCP is used. Otherwise, UDP is used. + +[**recvmsg** manual](https://man7.org/linux/man-pages/man3/recvmsg.3p.html): The **recvmsg()** function shall receive a message from a connection-mode or connectionless-mode socket. It is normally used with connectionless-mode sockets because it permits the application to retrieve the source address of received data. + +If the value of **msghdr->msg_name** is NULL, TCP is used. Otherwise, UDP is used. 
+ +#### libSSL API + +SSL_write + +SSL_read + +#### Go SSL API + +#### JSSE API + +sun/security/ssl/SSLSocketImpl$AppInputStream + +sun/security/ssl/SSLSocketImpl$AppOutputStream + +### JSSE Observation Scheme + +#### Loading JSSEProbe + +In the **main** function, **l7_load_jsse_agent** is used to load JSSEProbe. + +Processes in the allowlist (**g_proc_obj_map_fd**) are observed in polling mode. If a process is a Java process, **jvm_attach** is used to load **JSSEProbeAgent.jar** into it. After the loading is successful, the Java process exports the observation information to the jsse-metrics output file (**/tmp/java-data-**_\<pid\>_**/jsse-metrics.txt**) at the specified observation point. For details, see [JSSE API](#jsse-api). + +#### Processing JSSEProbe Messages + +The **l7_jsse_msg_handler** thread processes JSSEProbe messages. + +Processes in the allowlist (**g_proc_obj_map_fd**) are observed in polling mode. If a process has the corresponding jsse-metrics output file, the file is read line by line, and the JSSE read and write information is parsed, converted, and reported. + +##### 1. Parsing JSSE Read and Write Information + +The output format of **jsse-metrics.txt** is as follows, from which the PID, session ID, time, read/write operation, IP address, port number, and payload information of a JSSE request are obtained: + +```text +|jsse_msg|662220|Session(1688648699909|TLS_AES_256_GCM_SHA384)|1688648699989|Write|127.0.0.1|58302|This is test message| +``` + +The parsed original information is stored in **session_data_args_s**. + +##### 2. Converting JSSE Read and Write Information + +Information in **session_data_args_s** is converted to **sock_conn** and **conn_data**. + +During conversion, the following hash maps are queried: + +**session_head**: records the mapping between session IDs and sock connection IDs of JSSE connections. If the process IDs and the quadruplet information are the same, the session corresponds to the sock connection. 
+ +**file_conn_head**: records the last session ID of the Java process in case L7Probe does not read the JSSEProbe output from the beginning of the request and cannot find the session ID. + +##### 3. Reporting JSSE Read and Write Information + +**sock_conn** and **conn_data** are reported to the map. + +## Usage + +### Deployment of External Dependencies + +![gala-gopher architecture](./figures/gala-gopher_architecture.png) + +As shown in the preceding figure, the green parts are external dependent components of gala-gopher. gala-gopher outputs metric data to Prometheus, and outputs metadata and abnormal events to Kafka. gala-anteater and gala-spider, in the gray rectangles, obtain data from Prometheus and Kafka. + +> Note: Obtain the installation packages of Kafka and Prometheus from the official websites. -##### Output Data +### Output Data - **Metric** Prometheus Server has a built-in Express Browser UI. You can use PromQL statements to query metric data. For details, see [Using the expression browser](https://prometheus.io/docs/prometheus/latest/getting_started/#using-the-expression-browser) in the official document. The following is an example. - If the specified metric is `gala_gopher_tcp_link_rcv_rtt`, the metric data displayed on the UI is as follows: + If the specified metric is **gala_gopher_tcp_link_rcv_rtt**, the metric data displayed on the UI is as follows: ```basic gala_gopher_tcp_link_rcv_rtt{client_ip="x.x.x.165",client_port="1234",hostname="openEuler",instance="x.x.x.172:8888",job="prometheus",machine_id="1fd3774xx",protocol="2",role="0",server_ip="x.x.x.172",server_port="3742",tgid="1516"} 1 @@ -207,7 +1066,7 @@ As shown in the preceding figure, the green parts are external dependent compone - **Metadata** - You can directly consume data from the Kafka topic `gala_gopher_metadata`. The following is an example. + You can directly consume data from the Kafka topic **gala_gopher_metadata**. The following is an example. 
```bash # Input request @@ -218,7 +1077,7 @@ As shown in the preceding figure, the green parts are external dependent compone - **Abnormal events** - You can directly consume data from the Kafka topic `gala_gopher_event`. The following is an example. + You can directly consume data from the Kafka topic **gala_gopher_event**. The following is an example. ```bash # Input request diff --git a/docs/en/docs/Administration/trusted-computing.md b/docs/en/docs/Administration/trusted-computing.md index 0f18f3830c87139ecb13abfb79fd7802135a1230..5ba4816f745e1d327673467a9882645ae6b1d100 100644 --- a/docs/en/docs/Administration/trusted-computing.md +++ b/docs/en/docs/Administration/trusted-computing.md @@ -1836,7 +1836,7 @@ The TPCM interacts with other components as follows: ### Constraints -Supported server: TaiShan 200 server (model 2280) +Supported server: TaiShan 200 server (model 2280) Supported BMC card: BC83SMMC ### Application Scenarios diff --git a/docs/en/docs/CertSignature/figures/cert-tree.png b/docs/en/docs/CertSignature/figures/cert-tree.png new file mode 100644 index 0000000000000000000000000000000000000000..930a664600b31140c3939b1abd005cc2572cdbf9 Binary files /dev/null and b/docs/en/docs/CertSignature/figures/cert-tree.png differ diff --git a/docs/en/docs/CertSignature/figures/mokutil-db.png b/docs/en/docs/CertSignature/figures/mokutil-db.png new file mode 100644 index 0000000000000000000000000000000000000000..82dbe6e04cafe3e9ac039ba19acd5996d4cf2259 Binary files /dev/null and b/docs/en/docs/CertSignature/figures/mokutil-db.png differ diff --git a/docs/en/docs/CertSignature/figures/mokutil-sb-off.png b/docs/en/docs/CertSignature/figures/mokutil-sb-off.png new file mode 100644 index 0000000000000000000000000000000000000000..f3018c9fd0236e9c2cf560f0da3827ed2a877f6d Binary files /dev/null and b/docs/en/docs/CertSignature/figures/mokutil-sb-off.png differ diff --git a/docs/en/docs/CertSignature/figures/mokutil-sb-on.png 
b/docs/en/docs/CertSignature/figures/mokutil-sb-on.png new file mode 100644 index 0000000000000000000000000000000000000000..449b6774dc61a601cf884845fbd0be5d314108e1 Binary files /dev/null and b/docs/en/docs/CertSignature/figures/mokutil-sb-on.png differ diff --git a/docs/en/docs/CertSignature/figures/mokutil-sb-unsupport.png b/docs/en/docs/CertSignature/figures/mokutil-sb-unsupport.png new file mode 100644 index 0000000000000000000000000000000000000000..525c72f78b897ffaba0d356406ab9d9e64024d91 Binary files /dev/null and b/docs/en/docs/CertSignature/figures/mokutil-sb-unsupport.png differ diff --git a/docs/en/docs/CertSignature/introduction_to_signature_certificates.md b/docs/en/docs/CertSignature/introduction_to_signature_certificates.md new file mode 100644 index 0000000000000000000000000000000000000000..3720dea42fdad92c6b1cae08087d1e6713307d60 --- /dev/null +++ b/docs/en/docs/CertSignature/introduction_to_signature_certificates.md @@ -0,0 +1,46 @@ +# Introduction to Signature Certificates + +openEuler supports two signature mechanisms: openPGP and CMS, which are used for different file types. + +| File Type | Signature Type | Signature Format| +| --------------- | ------------ | -------- | +| EFI files | authenticode | CMS | +| Kernel module files | modsig | CMS | +| IMA digest lists| modsig | CMS | +| RPM software packages | RPM | openPGP | + +## openPGP Certificate Signing + +openEuler uses openPGP certificates to sign RPM software packages. The signature certificates are released with the OS image. You can obtain certificates used by openEuler in either of the following ways: + +Method 1: Download the certificate from the repository. For example, download the certificate of openEuler 24.03 LTS from the following address: + +```text +https://repo.openeuler.org/openEuler-24.03-LTS/OS/aarch64/RPM-GPG-KEY-openEuler +``` + +Method 2: Log in to the system and obtain the file from the specified path. 
+ +```shell +cat /etc/pki/rpm-gpg/RPM-GPG-KEY-openEuler +``` + +## CMS Certificate Signing + +The openEuler signature platform uses a three-level certificate chain to manage signature private keys and certificates. + +![](./figures/cert-tree.png) + +Certificates of different levels have different validity periods. The current plan is as follows: + +| Type| Validity Period| +| -------- | ------ | +| Root certificate | 30 years | +| Level-2 certificate| 10 years | +| Level-3 certificate| 3 years | + +The openEuler root certificate can be downloaded from the community certificate center. + +```text +https://www.openeuler.org/en/security/certificate-center/ +``` diff --git a/docs/en/docs/CertSignature/overview_of_certificates_and_signatures.md b/docs/en/docs/CertSignature/overview_of_certificates_and_signatures.md new file mode 100644 index 0000000000000000000000000000000000000000..5b34fb2790887ffa71ef519565d902264e69afd3 --- /dev/null +++ b/docs/en/docs/CertSignature/overview_of_certificates_and_signatures.md @@ -0,0 +1,29 @@ +# Overview of Certificates and Signatures + +## Overview + +Digital signature is an important technology for protecting the integrity of OSs. By adding signatures to key system components and verifying the signatures in subsequent processes such as component loading and running, you can effectively check component integrity and prevent security problems caused by component tampering. Multiple system integrity protection mechanisms are supported in the industry to protect the integrity of different types of components in each phase of system running. Typical technical mechanisms include: + +- Secure boot +- Kernel module signing +- Integrity measurement architecture (IMA) +- RPM signature verification + +The preceding integrity protection security mechanisms depend on signatures (usually integrated in the component release phase). However, open source communities generally lack signature private keys and certificate management mechanisms. 
Therefore, Linux distributions released by open source communities generally either provide no signatures by default or sign components only with private keys temporarily generated in the build phase. Usually, these integrity protection security mechanisms can be enabled only after users or downstream OSVs perform secondary signing, which increases the cost of the security functions and reduces usability. + +## Solution + +The openEuler community infrastructure supports the signature service. The signature platform manages signature private keys and certificates in a unified manner and works with the EulerMaker build platform to automatically sign key files during the software package build process of the community edition. Currently, the following file types are supported: + +- EFI files +- Kernel module files +- IMA digest lists +- RPM software packages + +## Constraints + +The signature service of the openEuler community has the following constraints: + +- Currently, only official releases of the openEuler community can be signed. Private builds cannot be signed. +- Currently, only EFI files related to OS secure boot can be signed, including shim, GRUB, and kernel files. +- Currently, only the kernel module files provided by the kernel software package can be signed. diff --git a/docs/en/docs/CertSignature/secure_boot.md b/docs/en/docs/CertSignature/secure_boot.md new file mode 100644 index 0000000000000000000000000000000000000000..da3a678ef7c3c7a0678d98df08319d98cc097c9d --- /dev/null +++ b/docs/en/docs/CertSignature/secure_boot.md @@ -0,0 +1,45 @@ +# Secure Boot + +## Overview + +Secure Boot relies on public and private key pairs to sign and verify components in the booting process. During booting, the previous component authenticates the digital signature of the next component. If the authentication is successful, the next component runs. If the authentication fails, the booting stops. 
Secure Boot ensures the integrity of each component during system booting and prevents unauthenticated components from being loaded and run, thereby preventing security threats to the system and user data. +The components authenticated and loaded in sequence during Secure Boot are the BIOS, shim, GRUB, and vmlinuz (kernel image). +Related EFI startup components are signed by the openEuler signature platform using signcode. The public key certificate is integrated into the signature database by the BIOS. During boot, the BIOS verifies the shim. The shim and GRUB components obtain the public key certificate from the signature database of the BIOS and verify the next-level components. + +## Background and Solutions + +In earlier openEuler versions, secure boot components were not signed. Therefore, the secure boot function cannot be directly used to ensure the integrity of system components. +In openEuler 22.03 LTS SP3 and later versions, openEuler uses the community signature platform to sign OS components, including the GRUB and vmlinuz components, and integrates the community signature root certificate into the shim component. +For the shim component, to facilitate end-to-end secure boot, the signature platform of the openEuler community is used for signing. After external CAs officially operate the secure boot component signature service, the signatures of these CAs will be integrated into the shim module of openEuler. + +## Usage + +### Obtaining the openEuler Certificate + +To obtain the openEuler root certificate, visit the openEuler community certificate center and download it from the **Certificate Center** directory. +The root certificate names on the web page are **openEuler Shim Default CA** and **default-x509ca.cert**. + +### Operation in the BIOS + +Import the openEuler root certificate to the certificate database of the BIOS and enable secure boot in the BIOS. +For details about how to import the BIOS certificate and enable secure boot, see the documents provided by the BIOS vendor. 
+ +### Operation in the OS + +Check the certificate information in the database: `mokutil --db` +![](./figures/mokutil-db.png) +Note: There is a large amount of certificate information. Only some important information is displayed in the screenshot. +Check the secure boot status: `mokutil --sb` + +- **SecureBoot disabled**: Secure boot is disabled. +![](./figures/mokutil-sb-off.png) +- **SecureBoot enabled**: Secure boot is enabled. +![](./figures/mokutil-sb-on.png) +- **not supported**: The system does not support secure boot. +![](./figures/mokutil-sb-unsupport.png) + +## Constraints + +- **Software**: The OS must be booted in UEFI mode. +- **Architecture**: Arm or x86 +- **Hardware**: The BIOS must support the verification function related to secure boot. diff --git a/docs/en/docs/Container/CRI_API_v1.md b/docs/en/docs/Container/CRI_API_v1.md new file mode 100644 index 0000000000000000000000000000000000000000..f34be40a1a616a201694c0b9d2c8e903c87e6742 --- /dev/null +++ b/docs/en/docs/Container/CRI_API_v1.md @@ -0,0 +1,202 @@ +# CRI API v1 + +## Overview + +Container Runtime Interface (CRI) is the main protocol used by kubelet to communicate with container engines. +Kubernetes 1.25 and earlier versions support CRI v1alpha2 and CRI v1. Kubernetes 1.26 and later versions support only CRI v1. + +iSulad supports both [CRI v1alpha2](./CRI_API_v1alpha2.md) and CRI v1. +For CRI v1, iSulad supports the functions described in [CRI v1alpha2](./CRI_API_v1alpha2.md) and the new interfaces and fields defined in CRI v1. + +Currently, iSulad supports CRI v1 1.29. The API described on the official website is as follows: + +[https://github.com/kubernetes/cri-api/blob/kubernetes-1.29.0/pkg/apis/runtime/v1/api.proto](https://github.com/kubernetes/cri-api/blob/kubernetes-1.29.0/pkg/apis/runtime/v1/api.proto) + +The API description file used by iSulad is slightly different from the official API. The interfaces in this document prevail. 
+ +## New Fields of CRI v1 + +- **CgroupDriver** + + Enum values for cgroup drivers. + + | Member| Description | + | :----------------: | :----------------: | + | SYSTEMD = 0 | systemd-cgroup driver| + | CGROUPFS = 1 | cgroupfs driver | +- **LinuxRuntimeConfiguration** + + cgroup driver used by the container engine + + | Member | Description | + | :------------------------: | :------------------------------: | + | CgroupDriver cgroup_driver | Enum value for the cgroup driver used by the container engine| +- **ContainerEventType** + + Enum values for container event types + + | Member | Description| + | :-------------------------: | :------------: | + | CONTAINER_CREATED_EVENT = 0 | Container creation event | + | CONTAINER_STARTED_EVENT = 1 | Container startup event | + | CONTAINER_STOPPED_EVENT = 2 | Container stop event | + | CONTAINER_DELETED_EVENT = 3 | Container deletion event | +- **SwapUsage** + + Virtual memory usage + + | Member | Description | + | :------------------------------: | :------------------: | + | int64 timestamp | Timestamp information | + | UInt64Value swap_available_bytes | Available virtual memory bytes| + | UInt64Value swap_usage_bytes | Used virtual memory bytes| + +## New Interfaces + +### RuntimeConfig + +#### Interface Prototype + +```text +rpc RuntimeConfig(RuntimeConfigRequest) returns (RuntimeConfigResponse) {} +``` + +#### Interface Description + +Obtains the cgroup driver configuration (cgroupfs or systemd-cgroup). 
+ +#### Parameter: RuntimeConfigRequest + +No such field + +#### Returns: RuntimeConfigResponse + +| Return | Description | +| :------------------------------ | :------------------------------------------------- | +| LinuxRuntimeConfiguration linux | CgroupDriver enum value for cgroupfs or systemd-cgroup| + +### GetContainerEvents + +#### Interface Prototype + +```text +rpc GetContainerEvents(GetEventsRequest) returns (stream ContainerEventResponse) {} +``` + +#### Interface Description + +Obtains the pod lifecycle event stream. + +#### Parameter: GetEventsRequest + +No such field + +#### Returns: ContainerEventResponse + +| Return | Description | +| :------------------------------------------- | :-------------------------------- | +| string container_id | Container ID | +| ContainerEventType container_event_type | Container event type | +| int64 created_at | Time when the container event is generated | +| PodSandboxStatus pod_sandbox_status | Status of the pod to which the container belongs | +| repeated ContainerStatus containers_statuses | Status of all containers in the pod to which the container belongs| + +## Change Description + +### CRI V1.29 + +#### [Obtaining the cgroup Driver Configuration](https://github.com/kubernetes/kubernetes/pull/118770) + +`RuntimeConfig` obtains the cgroup driver configuration (cgroupfs or systemd-cgroup). + +#### [GetContainerEvents Supports Pod Lifecycle Events](https://github.com/kubernetes/kubernetes/pull/111384) + +`GetContainerEvents` provides event streams related to the pod lifecycle. + +`PodSandboxStatus` is adjusted accordingly. `ContainerStatuses` is added to provide sandbox content status information. + +#### [ContainerStats Virtual Memory Information](https://github.com/kubernetes/kubernetes/pull/118865) + +The virtual memory usage information `SwapUsage` is added to `ContainerStats`. 
+ +#### [OOMKilled Setting in the Reason Field of ContainerStatus](https://github.com/kubernetes/kubernetes/pull/112977) + +The **Reason** field in **ContainerStatus** should be set to OOMKilled when cgroup out-of-memory occurs. + +#### [Modification of PodSecurityContext.SupplementalGroups Description](https://github.com/kubernetes/kubernetes/pull/113047) + +The description is modified to optimize the comments of **PodSecurityContext.SupplementalGroups**. The behavior when the primary UID defined by the container image is not in the list is clarified. + +#### [ExecSync Output Restriction](https://github.com/kubernetes/kubernetes/pull/110435) + +The output of the **ExecSync** return value is limited to 16 MB. + +## User Guide + +### Configuring iSulad to Support CRI V1 + +Configure iSulad to support CRI v1 1.29 used by the new Kubernetes version. + +For CRI v1 1.25 or earlier, the functions of v1alpha2 are the same as those of v1. The new features of CRI v1 1.26 or later are supported only in CRI v1. +The functions and features of this upgrade are supported only in CRI v1. Therefore, you need to enable CRI v1 as follows. + +Enable CRI v1: + +Set **enable-cri-v1** in **daemon.json** of iSulad to **true** and restart iSulad. + +```json +{ + "group": "isula", + "default-runtime": "runc", + ... + "enable-cri-v1": true +} +``` + +If iSulad is installed from source, enable the **ENABLE_CRI_API_V1** compile option. + +```bash +cmake ../ -D ENABLE_CRI_API_V1=ON +``` + +### Using RuntimeConfig to Obtain the cgroup Driver Configuration + +#### systemd-cgroup Configuration + +iSulad supports both the systemd and cgroupfs cgroup drivers. +By default, cgroupfs is used. You can configure iSulad to support systemd-cgroup. +iSulad supports systemd-cgroup only when the runtime is runc. In the iSulad configuration file **daemon.json**, +set **systemd-cgroup** to **true** and restart iSulad to use the systemd-cgroup driver. + +```json +{ + "group": "isula", + "default-runtime": "runc", + ... 
+ "enable-cri-v1": true, + "systemd-cgroup": true +} +``` + +### Using GetContainerEvents to Generate Pod Lifecycle Events + +#### Pod Events Configuration + +In the iSulad configuration file **daemon.json**, +set **enable-pod-events** to **true** and restart iSulad. + +```json +{ + "group": "isula", + "default-runtime": "runc", + ... + "enable-cri-v1": true, + "enable-pod-events": true +} +``` + +## Constraints + +1. The preceding new features are supported by iSulad only when the container runtime is runc. +2. cgroup out-of-memory (OOM) triggers the deletion of the cgroup path of the container. If iSulad processes the OOM event after the cgroup path is deleted, iSulad cannot capture the OOM event of the container. As a result, the **Reason** field in **ContainerStatus** may be incorrect. +3. iSulad does not support the mixed use of different cgroup drivers to manage containers. After a container is started, the cgroup driver configuration in iSulad should not change. diff --git a/docs/en/docs/Container/CRI_API_v1alpha2.md b/docs/en/docs/Container/CRI_API_v1alpha2.md new file mode 100644 index 0000000000000000000000000000000000000000..4d5c173f92591366183123e55480d0a5cb3a384d --- /dev/null +++ b/docs/en/docs/Container/CRI_API_v1alpha2.md @@ -0,0 +1,1220 @@ +# CRI API v1alpha2 + +## Description + +CRI API is the container runtime API provided by Kubernetes. CRI defines service interfaces for containers and images. iSulad uses CRI API to interconnect with Kubernetes. + +The lifecycle of a container is isolated from that of an image. Therefore, two services are required. CRI API is defined using [Protocol Buffers](https://developers.google.com/protocol-buffers/) and is based on [gRPC](https://grpc.io/). + +Currently, the default CRI API version used by iSulad is v1alpha2. 
The official API description file is as follows: + +[https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto](https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto), + +iSulad uses the API description file of version 1.14 used by Pass, which is slightly different from the official API. The interfaces in this document prevail. + +> ![](./public_sys-resources/icon-note.gif) **NOTE:** +> For the WebSocket streaming service of CRI API, the listening address of the server is 127.0.0.1, and the port number is 10350. The port number can be configured through the `--websocket-server-listening-port` command option or in the **daemon.json** configuration file. + +## Interfaces + +The following tables list the parameters that may be used by the interfaces. Some parameters cannot be configured. + +### Interface Parameters + +- **DNSConfig** + + Specifies the DNS servers and search domains of a sandbox. + + | Member | Description | + | :----------------------: | :--------------------------------------------------------: | + | repeated string servers | List of DNS servers of the cluster | + | repeated string searches | List of DNS search domains of the cluster | + | repeated string options | List of DNS options. See .| +- **Protocol** + + Enum values of the protocols. + + | Member| Description | + | :------: | :-----: | + | TCP = 0 | TCP| + | UDP = 1 | UDP| +- **PortMapping** + + Specifies the port mapping configurations of a sandbox. + + | Member | Description | + | :------------------: | :----------------: | + | Protocol protocol | Protocol of the port mapping | + | int32 container_port | Port number within the container | + | int32 host_port | Port number on the host | + | string host_ip | Host IP address | +- **MountPropagation** + + Enum values for mount propagation. 
+ + | Member | Description | + | :-------------------------------: | :--------------------------------------------------: | + | PROPAGATION_PRIVATE = 0 | No mount propagation ("rprivate" in Linux) | + | PROPAGATION_HOST_TO_CONTAINER = 1 | Mounts get propagated from the host to the container ("rslave" in Linux) | + | PROPAGATION_BIDIRECTIONAL = 2 | Mounts get propagated from the host to the container and from the container to the host ("rshared" in Linux) | +- **Mount** + + Specifies a host volume to mount into a container. (Only files and folders are supported.) + + | Member | Description | + | :--------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------: | + | string container_path | Path in the container | + | string host_path | Path on the host | + | bool readonly | Whether the configuration is read-only in the container. The default value is **false**. | + | bool selinux_relabel | Whether to set the SELinux label (not supported) | + | MountPropagation propagation | Mount propagation configuration. The value can be **0**, **1**, or **2**, corresponding to **rprivate**, **rslave**, or **rshared**. The default value is **0**. | +- **NamespaceOption** + + | Member | Description | + | :---------------: | :------------------------: | + | bool host_network | Whether to use the network namespace of the host | + | bool host_pid | Whether to use the PID namespace of the host | + | bool host_ipc | Whether to use the IPC namespace of the host | +- **Capability** + + Contains information about the capabilities to add or drop. + + | Member | Description | + | :-------------------------------: | :----------: | + | repeated string add_capabilities | Capabilities to add | + | repeated string drop_capabilities | Capabilities to drop | +- **Int64Value** + + Wrapper of the int64 type. 
+ + | Member| Description| + | :----------------: | :------------: | + | int64 value | Actual int64 value | +- **UInt64Value** + + Wrapper of the uint64 type. + + | Member| Description| + | :----------------: | :------------: | + | uint64 value | Actual uint64 value | +- **LinuxSandboxSecurityContext** + + Specifies Linux security options for a sandbox. + + Note that these security options are not applied to containers in the sandbox and may not be applicable to a sandbox without any running process. + + | Member | Description | + | :--------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | + | NamespaceOption namespace_options | Options for namespaces of the sandbox | + | SELinuxOption selinux_options | SELinux options (not supported) | + | Int64Value run_as_user | UID to run sandbox processes | + | bool readonly_rootfs | Whether the root file system of the sandbox is read-only | + | repeated int64 supplemental_groups | User group information of process 1 in the sandbox besides the primary group | + | bool privileged | Whether the sandbox can run a privileged container | + | string seccomp_profile_path | Path of the seccomp configuration file. Valid values are:
<br>**unconfined**: seccomp is not used.<br>**localhost/\<full path of configuration file\>**: path of the configuration file installed in the system.<br>**\<full path of configuration file\>**: full path of the configuration file.<br>By default, this parameter is not set, which is identical to **unconfined**.| +- **LinuxPodSandboxConfig** + + Sets configurations related to Linux hosts and containers. + + | Member | Description | + | :------------------------------------------: | :-------------------------------------------------------------------------------------: | + | string cgroup_parent | Parent cgroup path of the sandbox. The runtime can convert it to the cgroupfs or systemd semantics as required. (Not configurable)| + | LinuxSandboxSecurityContext security_context | Security attributes of the sandbox | + | map\<string, string\> sysctls | Linux sysctls configurations of the sandbox | +- **PodSandboxMetadata** + + Stores all necessary information for building the sandbox name. The container runtime is encouraged to expose the metadata in its user interface for better user experience. For example, the runtime can construct a unique sandbox name based on the metadata. + + | Member| Description | + | :----------------: | :----------------------------: | + | string name | Sandbox name | + | string uid | Sandbox UID | + | string namespace | Sandbox namespace | + | uint32 attempt | Number of attempts to create the sandbox. The default value is **0**.| +- **PodSandboxConfig** + + Contains all the required and optional fields for creating a sandbox. + + | Member | Description | + | :--------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------: | + | PodSandboxMetadata metadata | Metadata of the sandbox. This information uniquely identifies the sandbox, and the runtime should leverage this to ensure correct operation. 
The runtime may also use this information to improve user experience, such as by constructing a readable sandbox name.| + | string hostname | Host name of the sandbox | + | string log_directory | Directory for storing log files of containers in the sandbox | + | DNSConfig dns_config | DNS configuration of the sandbox | + | repeated PortMapping port_mappings | Port mappings of the sandbox | + | map\<string, string\> labels | Key-value pairs that may be used to identify a single sandbox or a series of sandboxes | + | map\<string, string\> annotations | Key-value pair holding arbitrary data. The value cannot be modified and can be queried by using **PodSandboxStatus**. | + | LinuxPodSandboxConfig linux | Options related to the Linux host | +- **PodSandboxNetworkStatus** + + Describes the network status of the sandbox. + + | Member| Description | + | :----------------: | :-------------------: | + | string ip | IP address of the sandbox | + | string name | Name of the network interface in the sandbox | + | string network | Name of the additional network | +- **Namespace** + + Stores namespace options. + + | Member | Description | + | :---------------------: | :----------------: | + | NamespaceOption options | Linux namespace options | +- **LinuxPodSandboxStatus** + + Describes the status of the Linux sandbox. + + | Member | Description| + | :---------------------------: | :-------------: | + | Namespace namespaces | Sandbox namespace | +- **PodSandboxState** + + Enum values for sandbox states. + + | Member | Description | + | :------------------: | :--------------------: | + | SANDBOX_READY = 0 | Ready state of the sandbox | + | SANDBOX_NOTREADY = 1 | Non-ready state of the sandbox | +- **PodSandboxStatus** + + Describes the PodSandbox status. 
+ + | Member | Description | + | :---------------------------------------: | :-----------------------------------------------: | + | string id | Sandbox ID | + | PodSandboxMetadata metadata | Sandbox metadata | + | PodSandboxState state | Sandbox state | + | int64 created_at | Creation timestamp of the sandbox, in nanoseconds | + | repeated PodSandboxNetworkStatus networks | Multi-plane network status of the sandbox | + | LinuxPodSandboxStatus linux | Status specific to Linux sandboxes | + | map\<string, string\> labels | Key-value pairs that may be used to identify a single sandbox or a series of sandboxes | + | map\<string, string\> annotations | Key-value pair holding arbitrary data. The value cannot be modified by the runtime.| +- **PodSandboxStateValue** + + Wrapper of **PodSandboxState**. + + | Member | Description| + | :-------------------: | :-------------: | + | PodSandboxState state | Sandbox state | +- **PodSandboxFilter** + + Filtering conditions when listing sandboxes. The intersection of multiple conditions is displayed. + + | Member | Description | + | :--------------------------------: | :--------------------------------------------------: | + | string id | Sandbox ID | + | PodSandboxStateValue state | Sandbox state | + | map\<string, string\> label_selector | Sandbox labels. Only full match is supported. Regular expressions are not supported.| +- **PodSandbox** + + Minimal data that describes a sandbox. + + | Member | Description | + | :-----------------------------: | :-----------------------------------------------: | + | string id | Sandbox ID | + | PodSandboxMetadata metadata | Sandbox metadata | + | PodSandboxState state | Sandbox state | + | int64 created_at | Creation timestamp of the sandbox, in nanoseconds | + | map\<string, string\> labels | Key-value pairs that may be used to identify a single sandbox or a series of sandboxes | + | map\<string, string\> annotations | Key-value pair holding arbitrary data. The value cannot be modified by the runtime | +- **KeyValue** + + Wrapper of a key-value pair. 
+ + | Member| Description| + | :----------------: | :------------: | + | string key | Key | + | string value | Value | +- **SELinuxOption** + + SELinux labels to be applied to the container. + + | Member| Description| + | :----------------: | :------------: | + | string user | User | + | string role | Role | + | string type | Type | + | string level | Level | +- **ContainerMetadata** + + ContainerMetadata contains all necessary information for building the container name. The container runtime is encouraged to expose the metadata in its user interface for better user experience. For example, the runtime can construct a unique container name based on the metadata. + + | Member| Description | + | :----------------: | :------------------------------: | + | string name | Name of a container | + | uint32 attempt | Number of attempts to create the container. The default value is **0**.| +- **ContainerState** + + Enum values for container states. + + | Member | Description | + | :-------------------: | :-------------------: | + | CONTAINER_CREATED = 0 | The container is created | + | CONTAINER_RUNNING = 1 | The container is running | + | CONTAINER_EXITED = 2 | The container is in the exit state | + | CONTAINER_UNKNOWN = 3 | The container state is unknown | +- **ContainerStateValue** + + Wrapper of **ContainerState**. + + | Member | Description| + | :-------------------: | :------------: | + | ContainerState state | Container state value | +- **ContainerFilter** + + Filtering conditions when listing containers. The intersection of multiple conditions is displayed. + + | Member | Description | + | :--------------------------------: | :----------------------------------------------------: | + | string id | Container ID | + | ContainerStateValue state | Container state | + | string pod_sandbox_id | Sandbox ID | + | map\<string, string\> label_selector | Container labels. Only full match is supported. 
Regular expressions are not supported.| +- **LinuxContainerSecurityContext** + + Security configuration that will be applied to a container. + + | Member | Description | + | :--------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------- | + | Capability capabilities | Capabilities to add or drop | + | bool privileged | Whether the container is in privileged mode. The default value is **false**. | + | NamespaceOption namespace_options | Namespace options of the container | + | SELinuxOption selinux_options | SELinux context to be optionally applied (**not supported currently**) | + | Int64Value run_as_user | UID to run container processes. Only one of **run_as_user** and **run_as_username** can be specified at a time. **run_as_username** takes effect preferentially. | + | string run_as_username | User name to run container processes. If specified, the user must exist in the container image (that is, in **/etc/passwd** inside the image) and be resolved there by the runtime. Otherwise, the runtime must throw an error.| + | bool readonly_rootfs | Whether the root file system in the container is read-only. The default value is configured in **config.json**. | + | repeated int64 supplemental_groups | List of groups of the first process in the container besides the primary group | + | string apparmor_profile | AppArmor configuration file for the container (**not supported currently**) | + | string seccomp_profile_path | Seccomp configuration file for the container | + | bool no_new_privs | Whether to set the **no_new_privs** flag on the container | +- **LinuxContainerResources** + + Resource specification for the Linux container. + + | Member | Description | + | :-------------------------- | :------------------------------------------------------------- | + | int64 cpu_period | CPU Completely Fair Scheduler (CFS) period. The default value is **0**. 
| + | int64 cpu_quota | CPU CFS quota. The default value is **0**. | + | int64 cpu_shares | CPU shares (weight relative to other containers). The default value is **0**.| + | int64 memory_limit_in_bytes | Memory limit, in bytes. The default value is **0**. | + | int64 oom_score_adj | oom-killer score. The default value is **0**. | + | string cpuset_cpus | CPU cores to be used by the container. The default value is **""**. | + | string cpuset_mems | Memory nodes to be used by the container. The default value is **""**. | +- **Image** + + Basic information about a container image. + + | Member | Description | + | :--------------------------- | :--------------------- | + | string id | Image ID | + | repeated string repo_tags | Image tag name (**repo_tags**) | + | repeated string repo_digests | Image digest information | + | uint64 size | Image size | + | Int64Value uid | UID of the default image user | + | string username | Name of the default image user | +- **ImageSpec** + + Internal data structure that represents an image. Currently, **ImageSpec** wraps only the container image name. + + | Member| Description| + | :----------------: | :------------: | + | string image | Container image name | +- **StorageIdentifier** + + Unique identifier of a storage device. 
+ + | Member| Description| + | :----------------: | :------------: | + | string uuid | UUID of the device | +- **FilesystemUsage** + + | Member | Description | + | :--------------------------- | :------------------------- | + | int64 timestamp | Timestamp at which the information was collected | + | StorageIdentifier storage_id | UUID of the file system that stores the image | + | UInt64Value used_bytes | Space size used for storing image metadata | + | UInt64Value inodes_used | Number of inodes for storing image metadata | +- **AuthConfig** + + | Member | Description | + | :-------------------- | :------------------------------------- | + | string username | User name used for downloading images | + | string password | Password used for downloading images | + | string auth | Base64-encoded authentication information used for downloading images | + | string server_address | Address of the server for downloaded images (not supported currently) | + | string identity_token | Token information used for authentication with the image repository (not supported currently) | + | string registry_token | Token information used for interaction with the image repository (not supported currently) | +- **Container** + + Container description information, such as the ID and state. + + | Member | Description | + | :-----------------------------: | :---------------------------------------------------------: | + | string id | Container ID | + | string pod_sandbox_id | ID of the sandbox to which the container belongs | + | ContainerMetadata metadata | Container metadata | + | ImageSpec image | Image specifications | + | string image_ref | Reference to the image used by the container. 
For most runtimes, this is an image ID.| + | ContainerState state | Container state | + | int64 created_at | Creation timestamps of the container in nanoseconds | + | map\ labels | Key-value pairs that may be used to identify a single container or a series of containers | + | map\ annotations | Key-value pair holding arbitrary data. The value cannot be modified by the runtime | +- **ContainerStatus** + + Container status information. + + | Member | Description | + | :-----------------------------: | :-----------------------------------------------------------------------: | + | string id | Container ID | + | ContainerMetadata metadata | Container metadata | + | ContainerState state | Container state | + | int64 created_at | Creation timestamps of the container in nanoseconds | + | int64 started_at | Startup timestamps of the container in nanoseconds | + | int64 finished_at | Exit timestamps of the container in nanoseconds | + | int32 exit_code | Container exit code | + | ImageSpec image | Image specifications | + | string image_ref | Reference to the image used by the container. For most runtimes, this is an image ID. | + | string reason | Brief explanation of why the container is in its current state | + | string message | Human-readable message explaining why the container is in its current state | + | map\ labels | Key-value pairs that may be used to identify a single container or a series of containers | + | map\ annotations | Key-value pair holding arbitrary data. The value cannot be modified by the runtime. | + | repeated Mount mounts | Container mount point information | + | string log_path | Container log file path. The file is in the **log_directory** folder configured in **PodSandboxConfig**.| +- **ContainerStatsFilter** + + Filtering conditions when listing container states. The intersection of multiple conditions is displayed. 
+ + | Member | Description | + | :--------------------------------: | :----------------------------------------------------: | + | string id | Container ID | + | string pod_sandbox_id | Sandbox ID | + | map\<string, string\> label_selector | Container labels. Only full match is supported. Regular expressions are not supported.| +- **ContainerStats** + + Resource usage statistics of a single container. + + | Member | Description| + | :----------------------------: | :------------: | + | ContainerAttributes attributes | Container information | + | CpuUsage cpu | CPU usage | + | MemoryUsage memory | Memory usage | + | FilesystemUsage writable_layer | Usage of the writable layer | +- **ContainerAttributes** + + Basic information about the container. + + | Member | Description | + | :----------------------------: | :-----------------------------------------------: | + | string id | Container ID | + | ContainerMetadata metadata | Container metadata | + | map\<string, string\> labels | Key-value pairs that may be used to identify a single container or a series of containers | + | map\<string, string\> annotations | Key-value pair holding arbitrary data. The value cannot be modified by the runtime.| +- **CpuUsage** + + Container CPU usage. + + | Member | Description | + | :---------------------------------: | :--------------------: | + | int64 timestamp | Timestamp | + | UInt64Value usage_core_nano_seconds | CPU usage duration, in nanoseconds | +- **MemoryUsage** + + Container memory usage. + + | Member | Description| + | :---------------------------: | :------------: | + | int64 timestamp | Timestamp | + | UInt64Value working_set_bytes | Memory usage | +- **FilesystemUsage** + + Usage of the writable layer of the container. 
+ + | Member | Description | + | :--------------------------: | :-----------------------: | + | int64 timestamp | Timestamp | + | StorageIdentifier storage_id | Writable layer directory | + | UInt64Value used_bytes | Number of bytes occupied by the image at the writable layer | + | UInt64Value inodes_used | Number of inodes occupied by the image at the writable layer | +- **Device** + + Host volume to mount into a container. + + | Member | Description | + | :-------------------- | :--------------------------------------------------------------------------------------------------------- | + | string container_path | Mount path within the container | + | string host_path | Mount path on the host | + | string permissions | cgroup permissions of the device (**r** allows the container to read from the specified device; **w** allows the container to write to the specified device; **m** allows the container to create device files that do not yet exist).| +- **LinuxContainerConfig** + + Configuration specific to Linux containers. + + | Member | Description | + | :--------------------------------------------- | :---------------------- | + | LinuxContainerResources resources | Container resource specifications | + | LinuxContainerSecurityContext security_context | Linux container security configuration | +- **ContainerConfig** + + Required and optional fields for creating a container. + + | Member | Description | + | :------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------- | + | ContainerMetadata metadata | Container metadata. This information uniquely identifies the container, and the runtime should leverage this to ensure correct operation. The runtime may also use this information to improve user experience, such as by constructing a readable container name. (**Required**)| + | ImageSpec image | Image used by the container. 
(Required) | + | repeated string command | Command to be executed. The default value is **"/bin/sh"**. | + | repeated string args | Arguments of the command to be executed | + | string working_dir | Current working directory of the command to be executed | + | repeated KeyValue envs | Environment variables to set in the container | + | repeated Mount mounts | Mount points in the container | + | repeated Device devices | Devices to be mapped in the container | + | map\<string, string\> labels | Key-value pairs that may be used to index and select individual resources | + | map\<string, string\> annotations | Unstructured key-value map that may be used to store and retrieve arbitrary metadata | + | string log_path | Path relative to **PodSandboxConfig.LogDirectory** for the container to store the logs (STDOUT and STDERR) on the host | + | bool stdin | Whether to enable STDIN of the container | + | bool stdin_once | Whether to immediately disconnect all data streams connected to STDIN when a data stream connected to STDIN is disconnected (**not supported currently**) | + | bool tty | Whether to use a pseudo terminal to connect to STDIO of the container | + | LinuxContainerConfig linux | Configuration specific to Linux containers | +- **NetworkConfig** + + Runtime network configuration. + + | Member| Description | + | :----------------- | :-------------------- | + | string pod_cidr | CIDR for pod IP addresses | +- **RuntimeConfig** + + Runtime configuration. + + | Member | Description | + | :--------------------------- | :---------------- | + | NetworkConfig network_config | Runtime network configuration | +- **RuntimeCondition** + + Runtime condition information. 
+ + | Member| Description | + | :----------------- | :------------------------------------------ | + | string type | Runtime condition type | + | bool status | Runtime status | + | string reason | Brief description of the reason for the runtime condition change | + | string message | Human-readable message describing the reason for the runtime condition change | +- **RuntimeStatus** + + Runtime status. + + | Member | Description | + | :----------------------------------- | :------------------------ | + | repeated RuntimeCondition conditions | Current runtime conditions | + +### Runtime Service + +The runtime service contains interfaces for operating pods and containers, and interfaces for querying the configuration and status of the runtime service. + +#### RunPodSandbox + +#### Interface Prototype + +```text +rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {} +``` + +#### Interface Description + +Creates and starts a pod sandbox. The sandbox is in the ready state on success. + +#### Precautions + +1. The default image for starting the sandbox is **rnd-dockerhub.huawei.com/library/pause-$\{machine\}:3.0**, where **$\{machine\}** indicates the architecture. On x86\_64, the value of **machine** is **amd64**; on ARM64, it is **aarch64**. Currently, only the **amd64** and **aarch64** images can be downloaded from the rnd-dockerhub repository. If the images do not exist on the host, ensure that the host can download them from the rnd-dockerhub repository. +2. The sandbox name is constructed from the fields in **PodSandboxMetadata**, joined by underscores (\_). Therefore, the data in the metadata cannot contain underscores. Otherwise, the sandbox runs successfully, but the **ListPodSandbox** interface cannot query it. 
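The underscore-joined naming rule in precaution 2 can be sketched as follows. This is a hypothetical illustration of the constraint, not iSulad code; the helper names are invented for the example:

```python
# Hypothetical sketch of the naming constraint above: the sandbox name is
# built by joining the PodSandboxMetadata fields with underscores, so the
# fields themselves must not contain underscores.

def validate_metadata(name, namespace, uid):
    """Reject metadata fields that contain underscores."""
    for field in (name, namespace, uid):
        if "_" in field:
            raise ValueError(f"metadata field {field!r} must not contain '_'")

def build_sandbox_name(name, namespace, uid, attempt=0):
    """Join the metadata fields into a sandbox name, underscore-separated."""
    validate_metadata(name, namespace, uid)
    return "_".join([name, namespace, uid, str(attempt)])
```

A name such as `my_pod` would make the joined string ambiguous, which is why such a sandbox cannot be found again by **ListPodSandbox**.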
+ +#### Parameter + +| Member | Description | +| :---------------------- | :-------------------------------------------------------------------- | +| PodSandboxConfig config | Sandbox configuration | +| string runtime_handler | Runtime to use for the sandbox. Currently, **lcr** and **kata-runtime** are supported.| + +#### Returns + +| Return | Description | +| :-------------------- | :--------------------- | +| string pod_sandbox_id | The response data is returned on success.| + +#### StopPodSandbox + +#### Interface Prototype + +```text +rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {} +``` + +#### Interface Description + +Stops the pod sandbox, stops the sandbox container, and reclaims the network resources (such as IP addresses) allocated to the sandbox. If any running container belongs to the sandbox, the container must be forcibly terminated. + +#### Parameter + +| Member | Description| +| :-------------------- | :------------- | +| string pod_sandbox_id | Sandbox ID | + +#### Returns + +| Return| Description| +| :--------------- | :------------- | +| None | None | + +#### RemovePodSandbox + +#### Interface Prototype + +```text +rpc RemovePodSandbox(RemovePodSandboxRequest) returns (RemovePodSandboxResponse) {} +``` + +#### Interface Description + +Removes a sandbox. If there are any running containers in the sandbox, they must be forcibly terminated and removed. This interface must not return an error if the sandbox has already been removed. + +#### Precautions + +1. When a sandbox is deleted, the network resources of the sandbox are not deleted. Before deleting the pod, you must call **StopPodSandbox** to remove the network resources. Ensure that **StopPodSandbox** is called at least once before deleting the sandbox. +2. If a container in the sandbox fails to be deleted when the sandbox is deleted, the sandbox is deleted but the container remains. In this case, you need to manually delete the residual container. 
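The stop-before-remove ordering and the "no error if already removed" rule can be modeled with a small in-memory sketch. This is a hypothetical illustration of the lifecycle rules, not iSulad code:

```python
# In-memory model of the sandbox lifecycle rules above: StopPodSandbox
# reclaims network resources and marks the sandbox not ready, and
# RemovePodSandbox must succeed even if the sandbox is already gone.

class SandboxStore:
    def __init__(self):
        self._state = {}   # sandbox id -> "ready" / "notready"

    def run(self, sandbox_id):
        # RunPodSandbox: the sandbox is ready on success.
        self._state[sandbox_id] = "ready"

    def stop(self, sandbox_id):
        # StopPodSandbox: reclaim network resources, mark not ready.
        if sandbox_id in self._state:
            self._state[sandbox_id] = "notready"

    def remove(self, sandbox_id):
        # RemovePodSandbox: must not raise if already removed (idempotent).
        self._state.pop(sandbox_id, None)

    def state(self, sandbox_id):
        return self._state.get(sandbox_id)
```
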
+ +#### Parameter + +| Member | Description| +| :-------------------- | :------------- | +| string pod_sandbox_id | Sandbox ID | + +#### Returns + +| Return| Description| +| :--------------- | :------------- | +| None | None | + +#### PodSandboxStatus + +#### Interface Prototype + +```text +rpc PodSandboxStatus(PodSandboxStatusRequest) returns (PodSandboxStatusResponse) {} +``` + +#### Interface Description + +Queries the status of the sandbox. If the sandbox does not exist, this interface returns an error. + +#### Parameter + +| Member | Description | +| :-------------------- | :-------------------------------------------------- | +| string pod_sandbox_id | Sandbox ID | +| bool verbose | Whether to return extra information about the sandbox (not configurable currently) | + +#### Returns + +| Return | Description | +| :----------------------- | :--------------------------------------------------------------------------------------------------------------------------------------- | +| PodSandboxStatus status | Sandbox status information | +| map\<string, string\> info | Extra information about the sandbox. The **key** can be an arbitrary string, and **value** is in JSON format. **info** can include any debug information. When **verbose** is set to **true**, **info** cannot be empty (not configurable currently). | + +#### ListPodSandbox + +#### Interface Prototype + +```text +rpc ListPodSandbox(ListPodSandboxRequest) returns (ListPodSandboxResponse) {} +``` + +#### Interface Description + +Returns sandbox information. Conditional filtering is supported. 
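The filtering behavior of **ListPodSandbox** (conditions are intersected, and **label_selector** requires exact matches) can be sketched as follows. The helper names are hypothetical, not part of the CRI:

```python
# Sketch of PodSandboxFilter semantics: every supplied condition must
# match (intersection), and label_selector entries require full, exact
# matches -- regular expressions are not supported.

def sandbox_matches(sandbox, flt):
    """Return True if `sandbox` satisfies every condition in `flt`."""
    if flt.get("id") and sandbox["id"] != flt["id"]:
        return False
    if flt.get("state") and sandbox["state"] != flt["state"]:
        return False
    for key, value in flt.get("label_selector", {}).items():
        if sandbox.get("labels", {}).get(key) != value:
            return False
    return True

def list_pod_sandbox(sandboxes, flt):
    """Return the sandboxes that pass the filter."""
    return [s for s in sandboxes if sandbox_matches(s, flt)]
```
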
+ +#### Parameter + +| Member | Description| +| :---------------------- | :------------- | +| PodSandboxFilter filter | Conditional filtering parameters | + +#### Returns + +| Return | Description | +| :------------------------ | :---------------- | +| repeated PodSandbox items | Sandboxes | + +#### CreateContainer + +#### Interface Prototype + +```text +rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {} +``` + +#### Interface Description + +Creates a container in a PodSandbox. + +#### Precautions + +- **sandbox\_config** in **CreateContainerRequest** is the same as the configuration passed to **RunPodSandboxRequest** to create the PodSandbox. It is passed again for reference. **PodSandboxConfig** is immutable and remains unchanged throughout the lifecycle of a pod. +- The container names use the field in **ContainerMetadata** and are separated by underscores (\_). Therefore, the data in metadata cannot contain underscores. Otherwise, the container runs successfully, but the **ListContainers** interface cannot query the container. +- **CreateContainerRequest** does not contain the **runtime\_handler** field. The runtime type of the created container is the same as that of the corresponding sandbox. + +#### Parameter + +| Member | Description | +| :------------------------------ | :--------------------------------- | +| string pod_sandbox_id | ID of the PodSandbox where the container is to be created | +| ContainerConfig config | Container configuration information | +| PodSandboxConfig sandbox_config | PodSandbox configuration information | + +#### Supplementary Information + +Unstructured key-value map that may be used to store and retrieve arbitrary metadata. Some fields can be transferred through this field because CRI does not provide specific parameters. 
+ +- Customization + + | Custom Key:Value| Description | + | :------------------------- | :------------------------------------------------ | + | cgroup.pids.max:int64_t | Limits the number of processes/threads in a container. (Set **-1** for unlimited.)| + +#### Returns + +| Return | Description | +| :------------------ | :--------------- | +| string container_id | ID of the created container | + +#### StartContainer + +#### Interface Prototype + +```text +rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {} +``` + +#### Interface Description + +Starts a container. + +#### Parameter + +| Member | Description| +| :------------------ | :------------- | +| string container_id | Container ID | + +#### Returns + +| Return| Description| +| :--------------- | :------------- | +| None | None | + +#### StopContainer + +#### Interface Prototype + +```text +rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {} +``` + +#### Interface Description + +Stops a running container. The graceful stop timeout can be configured. If the container has already been stopped, an error must not be returned. + +#### Parameter + +| Member | Description | +| :------------------ | :---------------------------------------------------- | +| string container_id | Container ID | +| int64 timeout | Waiting time before a container is forcibly stopped. The default value is **0**, indicating that the container is forcibly stopped immediately.| + +#### Returns + +None + +#### RemoveContainer + +#### Interface Prototype + +```text +rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {} +``` + +#### Interface Description + +Deletes a container. If the container is running, it must be forcibly stopped. If the container has already been deleted, an error must not be returned. 
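The stop and remove semantics above (timeout 0 forces an immediate kill, a positive timeout allows a graceful stop first, and acting on an absent container is not an error) can be sketched as follows. This is a hypothetical model, not the iSulad implementation:

```python
# Sketch of the StopContainer timeout semantics: return the signals a
# runtime would send. Stopping a container that is not running is not an
# error; timeout <= 0 (the default) means force-stop immediately.

def stop_signals(running, timeout=0):
    """Return the ordered list of signals sent for the given timeout."""
    if not running:
        return []                      # already stopped: not an error
    if timeout <= 0:
        return ["SIGKILL"]             # default 0: force stop immediately
    return ["SIGTERM", "SIGKILL"]      # graceful attempt, then force after timeout
```
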
+ +#### Parameter + +| Member | Description| +| :------------------ | :------------- | +| string container_id | Container ID | + +#### Returns + +None + +#### ListContainers + +#### Interface Prototype + +```text +rpc ListContainers(ListContainersRequest) returns (ListContainersResponse) {} +``` + +#### Interface Description + +Returns container information. Conditional filtering is supported. + +#### Parameter + +| Member | Description| +| :--------------------- | :------------- | +| ContainerFilter filter | Conditional filtering parameters | + +#### Returns + +| Return | Description| +| :---------------------------- | :------------- | +| repeated Container containers | Containers | + +#### ContainerStatus + +#### Interface Prototype + +```text +rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {} +``` + +#### Interface Description + +Returns container status information. If the container does not exist, an error is returned. + +#### Parameter + +| Member | Description | +| :------------------ | :-------------------------------------------------- | +| string container_id | Container ID | +| bool verbose | Whether to display additional information about the container (not configurable currently) | + +#### Returns + +| Return | Description | +| :----------------------- | :--------------------------------------------------------------------------------------------------------------------------------------- | +| ContainerStatus status | Container status information | +| map\<string, string\> info | Extra information about the container. The **key** can be an arbitrary string, and **value** is in JSON format. **info** can include any debug information. 
When **verbose** is set to **true**, **info** cannot be empty (not configurable currently).| + +#### UpdateContainerResources + +#### Interface Prototype + +```text +rpc UpdateContainerResources(UpdateContainerResourcesRequest) returns (UpdateContainerResourcesResponse) {} +``` + +#### Interface Description + +Updates container resource configurations. + +#### Precautions + +- This interface is used exclusively to update the resource configuration of a container, not a pod. +- Currently, the **oom\_score\_adj** configuration of containers cannot be updated. + +#### Parameter + +| Member | Description | +| :---------------------------- | :---------------- | +| string container_id | Container ID | +| LinuxContainerResources linux | Linux resource configuration information | + +#### Returns + +None + +#### ExecSync + +#### Interface Prototype + +```text +rpc ExecSync(ExecSyncRequest) returns (ExecSyncResponse) {} +``` + +#### Interface Description + +Runs a command synchronously in a container and communicates using gRPC. + +#### Precautions + +This interface runs a single command and cannot open a terminal to interact with the container. + +#### Parameter + +| Member | Description | +| :------------------ | :------------------------------------------------------------------ | +| string container_id | Container ID | +| repeated string cmd | Command to be executed | +| int64 timeout | Timeout interval before a command to be stopped is forcibly terminated, in seconds. The default value is **0**, indicating that there is no timeout limit (**not supported currently**).| + +#### Returns + +| Return| Description | +| :--------------- | :------------------------------------- | +| bytes stdout | Captures the standard output of the command | +| bytes stderr | Captures the standard error output of the command | +| int32 exit_code | Exit code the command finished with. 
The default value is **0**, indicating success.| + +#### Exec + +#### Interface Prototype + +```text +rpc Exec(ExecRequest) returns (ExecResponse) {} +``` + +#### Interface Description + +Runs a command in the container, obtains the URL from the CRI server using gRPC, and establishes a persistent connection with the WebSocket server based on the obtained URL to interact with the container. + +#### Precautions + +This interface runs a single command and can open a terminal to interact with the container. One of **stdin**, **stdout**, or **stderr** must be true. If **tty** is true, **stderr** must be false, because multiplexing is not supported; in that case, the outputs of **stdout** and **stderr** are combined into a single stream. + +#### Parameter + +| Member | Description | +| :------------------ | :------------------- | +| string container_id | Container ID | +| repeated string cmd | Command to be executed | +| bool tty | Whether to run the command in a TTY | +| bool stdin | Whether to stream standard input | +| bool stdout | Whether to stream standard output | +| bool stderr | Whether to stream standard error output | + +#### Returns + +| Return| Description | +| :--------------- | :------------------------ | +| string url | Fully qualified URL of the exec streaming server | + +#### Attach + +#### Interface Prototype + +```text +rpc Attach(AttachRequest) returns (AttachResponse) {} +``` + +#### Interface Description + +Takes over process 1 of the container, obtains the URL from the CRI server using gRPC, and establishes a persistent connection with the WebSocket server based on the obtained URL to interact with the container. 
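The stream-flag rules stated in the **Exec** precautions apply to **Attach** as well and can be checked up front. A minimal sketch of the validation (the helper name is invented for the example):

```python
# Sketch of the Exec/Attach stream-flag rules: at least one of
# stdin/stdout/stderr must be requested, and tty=True is incompatible
# with stderr because multiplexing is not supported.

def validate_streams(tty, stdin, stdout, stderr):
    """Raise ValueError if the flag combination is invalid."""
    if not (stdin or stdout or stderr):
        raise ValueError("one of stdin, stdout, or stderr must be true")
    if tty and stderr:
        raise ValueError("tty=True requires stderr=False")
```
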
#### Parameter

| Member | Description |
| :--- | :--- |
| string container_id | Container ID |
| bool tty | Whether to run the command in a TTY |
| bool stdin | Whether to stream standard input |
| bool stdout | Whether to stream standard output |
| bool stderr | Whether to stream standard error output |

#### Returns

| Return | Description |
| :--- | :--- |
| string url | Fully qualified URL of the attach streaming server |

#### ContainerStats

#### Interface Prototype

```text
rpc ContainerStats(ContainerStatsRequest) returns (ContainerStatsResponse) {}
```

#### Interface Description

Returns information about the resources occupied by a single container. Only containers whose runtime type is lcr are supported.

#### Parameter

| Member | Description |
| :--- | :--- |
| string container_id | Container ID |

#### Returns

| Return | Description |
| :--- | :--- |
| ContainerStats stats | Container information. Information about drives and inodes can be returned only for containers started using images in OCI format. |

#### ListContainerStats

#### Interface Prototype

```text
rpc ListContainerStats(ListContainerStatsRequest) returns (ListContainerStatsResponse) {}
```

#### Interface Description

Returns information about resources occupied by multiple containers. Conditional filtering is supported.

#### Parameter

| Member | Description |
| :--- | :--- |
| ContainerStatsFilter filter | Conditional filtering parameters |

#### Returns

| Return | Description |
| :--- | :--- |
| repeated ContainerStats stats | List of container information. Information about drives and inodes can be returned only for containers started using images in OCI format. |

#### UpdateRuntimeConfig

#### Interface Prototype

```text
rpc UpdateRuntimeConfig(UpdateRuntimeConfigRequest) returns (UpdateRuntimeConfigResponse);
```

#### Interface Description

Provides the standard CRI API for updating the pod CIDR of the network plugin. Currently, CNI network plugins do not need to update the pod CIDR. Therefore, this interface only records access logs.

#### Precautions

This interface does not modify the system management information, but only records logs.

#### Parameter

| Member | Description |
| :--- | :--- |
| RuntimeConfig runtime_config | Information to be configured for the runtime |

#### Returns

None

#### Status

#### Interface Prototype

```text
rpc Status(StatusRequest) returns (StatusResponse) {};
```

#### Interface Description

Obtains the network status of the runtime and pod. When the network status is obtained, the network configuration is updated.

#### Precautions

If the network configuration fails to be updated, the original configuration is not affected. The original configuration is overwritten only when the network configuration is updated successfully.

#### Parameter

| Member | Description |
| :--- | :--- |
| bool verbose | Whether to display additional runtime information (not supported currently) |

#### Returns

| Return | Description |
| :--- | :--- |
| RuntimeStatus status | Runtime status |
| map<string, string> info | Additional runtime information. The key of **info** can be any value, and the value is in JSON format and can contain any debug information. Additional information is displayed only when **verbose** is set to **true**. |

### Image Service

Provides gRPC APIs for pulling, viewing, and removing images from the image repository.

#### ListImages

#### Interface Prototype

```text
rpc ListImages(ListImagesRequest) returns (ListImagesResponse) {}
```

#### Interface Description

Lists information about existing images.

#### Precautions

This interface is a unified interface. Images of embedded format can be queried using **cri images**. However, because embedded images do not comply with the OCI standard, the query result has the following restrictions:

- The displayed image ID is the **digest** of the image **config** because embedded images do not have image IDs.
- The **digest** cannot be displayed because embedded images have only the **digest** of their **config**, not a **digest** of their own, and that digest does not comply with OCI specifications.

#### Parameter

| Member | Description |
| :--- | :--- |
| ImageSpec filter | Name of images to be filtered |

#### Returns

| Return | Description |
| :--- | :--- |
| repeated Image images | List of images |

#### ImageStatus

#### Interface Prototype

```text
rpc ImageStatus(ImageStatusRequest) returns (ImageStatusResponse) {}
```

#### Interface Description

Queries the details about a specified image.

#### Precautions

1. This interface is used to query information about a specified image. If the image does not exist, **ImageStatusResponse** is returned with **Image** set to **nil**.
2. This interface is a unified interface. Images of embedded format cannot be queried because they do not comply with the OCI specification and lack some fields.

#### Parameter

| Member | Description |
| :--- | :--- |
| ImageSpec image | Image name |
| bool verbose | Queries extra information. This parameter is not supported currently and no extra information is returned. |

#### Returns

| Return | Description |
| :--- | :--- |
| Image image | Image information |
| map<string, string> info | Extra image information. This parameter is not supported currently and no extra information is returned. |

#### PullImage

#### Interface Prototype

```text
rpc PullImage(PullImageRequest) returns (PullImageResponse) {}
```

#### Interface Description

Downloads an image.

#### Precautions

You can download public images, or download private images using a username, a password, and authentication information. The **server_address**, **identity_token**, and **registry_token** fields in **AuthConfig** are not supported.

#### Parameter

| Member | Description |
| :--- | :--- |
| ImageSpec image | Name of the image to download |
| AuthConfig auth | Authentication information for downloading a private image |
| PodSandboxConfig sandbox_config | Downloads an image in the pod context (not supported currently). |

#### Returns

| Return | Description |
| :--- | :--- |
| string image_ref | Information about the downloaded image |

#### RemoveImage

#### Interface Prototype

```text
rpc RemoveImage(RemoveImageRequest) returns (RemoveImageResponse) {}
```

#### Interface Description

Deletes a specified image.

#### Precautions

This interface is a unified interface. Images of embedded format cannot be deleted by image ID because they do not comply with the OCI specification and lack some fields.
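The **auth** field of **AuthConfig** used by PullImage above is Base64-encoded. A common convention (an assumption here, matching the Docker registry config format rather than anything iSulad documents) is to encode `username:password`; `make_auth` is a hypothetical helper:

```python
import base64

def make_auth(username: str, password: str) -> str:
    # Base64-encode "username:password", the usual registry auth format.
    # Assumption: iSulad's auth field follows this Docker-style convention.
    return base64.b64encode(f"{username}:{password}".encode()).decode()

print(make_auth("user", "secret"))  # dXNlcjpzZWNyZXQ=
```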
#### Parameter

| Member | Description |
| :--- | :--- |
| ImageSpec image | Name or ID of the image to be deleted |

#### Returns

None

#### ImageFsInfo

#### Interface Prototype

```text
rpc ImageFsInfo(ImageFsInfoRequest) returns (ImageFsInfoResponse) {}
```

#### Interface Description

Queries information about the file systems of an image.

#### Precautions

The queried information is the file system information recorded in the image metadata.

#### Parameter

None

#### Returns

| Return | Description |
| :--- | :--- |
| repeated FilesystemUsage image_filesystems | Image file system information |

### Constraints

1. If **log_directory** is configured in **PodSandboxConfig** when a sandbox is created, **log_path** must be specified in **ContainerConfig** when a container of the sandbox is created. Otherwise, the container may fail to be started or even deleted using the CRI API.

    The actual **LOGPATH** of the container is **log_directory/log_path**. If **log_path** is not configured, the final **LOGPATH** changes to **log_directory**.

    - If the path does not exist, iSulad creates a soft link pointing to the final path of container logs when starting the container, and **log_directory** becomes a soft link. In this case, there are two situations:

        1. If **log_path** is not configured for other containers in the sandbox, when other containers are started, **log_directory** is deleted and points to **log_path** of the newly started container. As a result, the logs of the previously started container point to the logs of the container started later.
        2. If **log_path** is configured for other containers in the sandbox, **LOGPATH** of the container is **log_directory/log_path**. Because **log_directory** is a soft link, if **log_directory/log_path** is used as the soft link target to point to the actual log path of the container, the container creation fails.
    - If the path exists, iSulad attempts to delete the path (non-recursively) when starting the container. If the path is a folder that contains content, the deletion fails. As a result, the soft link fails to be created and the container fails to be started. The same symptom occurs when the container is deleted, causing the container deletion to fail.
2. If **log_directory** is configured in **PodSandboxConfig** when a sandbox is created and **log_path** is configured in **ContainerConfig** when a container is created, the final **LOGPATH** is **log_directory/log_path**. iSulad does not create **LOGPATH** recursively. Therefore, you must ensure that **dirname(LOGPATH)**, that is, the parent directory of the final log directory, exists.
3. If **log_directory** is configured in **PodSandboxConfig** when a sandbox is created, and the same **log_path** is specified in **ContainerConfig** when two or more containers are created, or containers in different sandboxes point to the same **LOGPATH**, when the containers are started successfully, the log path of the container that is started later overwrites that of the container that is started earlier.
4. If the image content in the remote image repository changes and the CRI image pulling interface is used to download the image again, the image name and tag of the local original image (if it exists) change to "none."

    Example:

    Local image:

    ```text
    IMAGE                                       TAG     IMAGE ID       SIZE
    rnd-dockerhub.huawei.com/pproxyisulad/test  latest  99e59f495ffaa  753kB
    ```

    After the **rnd-dockerhub.huawei.com/pproxyisulad/test:latest** image in the remote repository is updated and downloaded again:

    ```text
    IMAGE                                       TAG     IMAGE ID       SIZE
                                                        99e59f495ffaa  753kB
    rnd-dockerhub.huawei.com/pproxyisulad/test  latest  d8233ab899d41  1.42MB
    ```

    Run the `isula images` command. **REF** is displayed as **-**.

    ```text
    REF                                                IMAGE ID       CREATED              SIZE
    rnd-dockerhub.huawei.com/pproxyisulad/test:latest  d8233ab899d41  2019-02-14 19:19:37  1.42MB
    -                                                  99e59f495ffaa  2016-05-04 02:26:41  753kB
    ```

5. The exec and attach interfaces of the iSulad CRI API are implemented using WebSocket. Clients interact with iSulad using the same protocol. When using the exec or attach interface, do not transfer a large amount of data or files over the serial port; these interfaces are intended only for basic command interaction. If the client side does not process the data or files in a timely manner, data may be lost. In addition, do not use the exec or attach interface to transfer binary data or files.
6. The exec and attach interfaces of the iSulad CRI API depend on libwebsockets (LWS). It is recommended that the streaming API be used only for persistent connection interaction and not in high-concurrency scenarios, because connections may fail due to insufficient host resources. It is recommended that the number of concurrent connections be less than or equal to 100.
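The log path rules in constraints 1-3 can be sketched as follows. This is an illustrative model written for this document, not iSulad code:

```python
import os
from typing import Optional

def final_log_path(log_directory: str, log_path: Optional[str]) -> str:
    # Constraint 1: LOGPATH is log_directory/log_path; if log_path is not
    # configured, LOGPATH degenerates to log_directory itself (which then
    # becomes a soft link to the container's real log file).
    if not log_path:
        return log_directory
    return os.path.join(log_directory, log_path)

def required_parent(log_directory: str, log_path: str) -> str:
    # Constraint 2: iSulad does not create LOGPATH recursively, so this
    # parent directory must already exist before the container starts.
    return os.path.dirname(final_log_path(log_directory, log_path))

print(final_log_path("/var/log/pods/p1", "c1/0.log"))   # /var/log/pods/p1/c1/0.log
print(final_log_path("/var/log/pods/p1", None))         # /var/log/pods/p1
print(required_parent("/var/log/pods/p1", "c1/0.log"))  # /var/log/pods/p1/c1
```

Constraint 3 follows from the same function: two containers whose inputs yield the same `final_log_path` overwrite each other's logs.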
diff --git a/docs/en/docs/Container/cri.md b/docs/en/docs/Container/cri.md
deleted file mode 100644
index 72f71923c393bd0ca0c7c04833401f378374642e..0000000000000000000000000000000000000000
--- a/docs/en/docs/Container/cri.md
+++ /dev/null
@@ -1,2902 +0,0 @@

# CRI

- [CRI](#cri)
    - [Description](#description)
    - [APIs](#apis)
        - [API Parameters](#api-parameters)
        - [Runtime Service](#runtime-service)
            - [RunPodSandbox](#runpodsandbox)
            - [StopPodSandbox](#stoppodsandbox)
            - [RemovePodSandbox](#removepodsandbox)
            - [PodSandboxStatus](#podsandboxstatus)
            - [ListPodSandbox](#listpodsandbox)
            - [CreateContainer](#createcontainer)
                - [Supplement](#supplement)
            - [StartContainer](#startcontainer)
            - [StopContainer](#stopcontainer)
            - [RemoveContainer](#removecontainer)
            - [ListContainers](#listcontainers)
            - [ContainerStatus](#containerstatus)
            - [UpdateContainerResources](#updatecontainerresources)
            - [ExecSync](#execsync)
            - [Exec](#exec)
            - [Attach](#attach)
            - [ContainerStats](#containerstats)
            - [ListContainerStats](#listcontainerstats)
            - [UpdateRuntimeConfig](#updateruntimeconfig)
            - [Status](#status)
        - [Image Service](#image-service)
            - [ListImages](#listimages)
            - [ImageStatus](#imagestatus)
            - [PullImage](#pullimage)
            - [RemoveImage](#removeimage)
            - [ImageFsInfo](#imagefsinfo)
    - [Constraints](#constraints)

## Description

The Container Runtime Interface (CRI) provided by Kubernetes defines container and image service APIs. iSulad uses the CRI to interconnect with Kubernetes.

Because the container runtime is isolated from the image lifecycle, two services need to be defined. The API is defined using [Protocol Buffers](https://developers.google.com/protocol-buffers/) based on [gRPC](https://grpc.io/).

The current CRI version is v1alpha1. For the official API description, see:

[https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto](https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto)

iSulad uses the API description file of version 1.14 used by PaaS, which differs slightly from the official API description file. The API description in this document prevails.

>![](./public_sys-resources/icon-note.gif) **NOTE:**
>The listening IP address of the CRI WebSocket streaming service is **127.0.0.1** and the port number is **10350**. The port number can be configured using the **--websocket-server-listening-port** command line option or in the **daemon.json** configuration file.

## APIs

The following tables list the parameters that may be used in each API. Parameters that do not take effect now are noted in the corresponding descriptions.

### API Parameters

- **DNSConfig**

    The API is used to configure DNS servers and search domains of a sandbox.

Parameter

-

Description

-

repeated string servers

-

DNS server list of a cluster.

-

repeated string searches

-

DNS search domain list of a cluster.

-

repeated string options

-

DNS option list. For details, see https://linux.die.net/man/5/resolv.conf.

-
- -- **Protocol** - - The API is used to specify enum values of protocols. - - - - - - - - - - - - -

Parameter

-

Description

-

TCP = 0↵

-

Transmission Control Protocol (TCP).

-

UDP = 1

-

User Datagram Protocol (UDP).

-
- -- **PortMapping** - - The API is used to configure the port mapping for a sandbox. - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

Protocol protocol

-

Protocol used for port mapping.

-

int32 container_port

-

Port number in the container.

-

int32 host_port

-

Port number on the host.

-

string host_ip

-

Host IP address.

-
- -- **MountPropagation** - - The API is used to specify enums of mount propagation attributes. - - - - - - - - - - - - - - - -

Parameter

-

Description

-

PROPAGATION_PRIVATE = 0

-

No mount propagation attributes, that is, private in Linux.

-

PROPAGATION_HOST_TO_CONTAINER = 1

-

Mount attribute that can be propagated from the host to the container, that is, rslave in Linux.

-

PROPAGATION_BIDIRECTIONAL = 2

-

Mount attribute that can be propagated between a host and a container, that is, rshared in Linux.

-
- -- **Mount** - - The API is used to mount a volume on the host to a container. \(Only files and folders are supported.\) - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string container_path

-

Path in the container.

-

string host_path

-

Path on the host.

-

bool readonly

-

Whether the configuration is read-only in the container.

-

Default value: false

-

bool selinux_relabel

-

Whether to set the SELinux label. This parameter does not take effect now.

-

MountPropagation propagation

-

Mount propagation attribute.

-

The value can be 0, 1, or 2, corresponding to the private, rslave, and rshared propagation attributes respectively.

-

Default value: 0

-
- -- **NamespaceOption** - - - - - - - - - - - - - - - -

Parameter

-

Description

-

bool host_network

-

Whether to use host network namespaces.

-

bool host_pid

-

Whether to use host PID namespaces.

-

bool host_ipc

-

Whether to use host IPC namespaces.

-
- -- **Capability** - - This API is used to specify the capabilities to be added and deleted. - - - - - - - - - - - - -

Parameter

-

Description

-

repeated string add_capabilities

-

Capabilities to be added.

-

repeated string drop_capabilities

-

Capabilities to be deleted.

-
- -- **Int64Value** - - The API is used to encapsulate data of the signed 64-bit integer type. - - - - - - - - - -

Parameter

-

Description

-

int64 value

-

Actual value of the signed 64-bit integer type.

-
- -- **UInt64Value** - - The API is used to encapsulate data of the unsigned 64-bit integer type. - - - - - - - - - -

Parameter

-

Description

-

uint64 value

-

Actual value of the unsigned 64-bit integer type.

-
- -- **LinuxSandboxSecurityContext** - - The API is used to configure the Linux security options of a sandbox. - - Note that these security options are not applied to containers in the sandbox, and may not be applied to the sandbox without any running process. - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

NamespaceOption namespace_options

-

Sandbox namespace options.

-

SELinuxOption selinux_options

-

SELinux options. This parameter does not take effect now.

-

Int64Value run_as_user

-

Process UID in the sandbox.

-

bool readonly_rootfs

-

Whether the root file system of the sandbox is read-only.

-

repeated int64 supplemental_groups

-

Information of the user group of the init process in the sandbox (except the primary GID).

-

bool privileged

-

Whether the sandbox is a privileged container.

-

string seccomp_profile_path

-

Path of the seccomp configuration file. Valid values are as follows:

-

// unconfined: Seccomp is not configured.

-

// localhost/ Full path of the configuration file: configuration file path installed in the system.

-

// Full path of the configuration file: full path of the configuration file.

-

// unconfined is the default value.

-
- -- **LinuxPodSandboxConfig** - - The API is used to configure information related to the Linux host and containers. - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string cgroup_parent

-

Parent path of the cgroup of the sandbox. The runtime can use the cgroupfs or systemd syntax based on site requirements. This parameter does not take effect now.

-

LinuxSandboxSecurityContext security_context

-

Security attribute of the sandbox.

-

map<string, string> sysctls

-

Linux sysctls configuration of the sandbox.

-
- -- **PodSandboxMetadata** - - Sandbox metadata contains all information that constructs a sandbox name. It is recommended that the metadata be displayed on the user interface during container running to improve user experience. For example, a unique sandbox name can be generated based on the metadata during running. - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string name

-

Sandbox name.

-

string uid

-

Sandbox UID.

-

string namespace

-

Sandbox namespace.

-

uint32 attempt

-

Number of attempts to create a sandbox.

-

Default value: 0

-
- -- **PodSandboxConfig** - - This API is used to specify all mandatory and optional configurations for creating a sandbox. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

PodSandboxMetadata metadata

-

Sandbox metadata, which uniquely identifies a sandbox. The runtime must use the information to ensure that operations are correctly performed, and to improve user experience, for example, construct a readable sandbox name.

-

string hostname

-

Host name of the sandbox.

-

string log_directory

-

Folder for storing container log files in the sandbox.

-

DNSConfig dns_config

-

Sandbox DNS configuration.

-

repeated PortMapping port_mappings

-

Sandbox port mapping.

-

map<string, string> labels

-

Key-value pair that can be used to identify a sandbox or a series of sandboxes.

-

map<string, string> annotations

-

Key-value pair that stores any information, whose values cannot be changed and can be queried by using the PodSandboxStatus API.

-

LinuxPodSandboxConfig linux

-

Options related to the Linux host.

-
- -- **PodSandboxNetworkStatus** - - The API is used to describe the network status of a sandbox. - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string ip

-

IP address of the sandbox.

-

string name

-

Network interface name in the sandbox.

-

string network

-

Name of the additional network.

-
- -- **Namespace** - - The API is used to set namespace options. - - - - - - - - - -

Parameter

-

Description

-

NamespaceOption options

-

Linux namespace options.

-
- -- **LinuxPodSandboxStatus** - - The API is used to describe the status of a Linux sandbox. - - - - - - - - - -

Parameter

-

Description

-

Namespace namespaces

-

Sandbox namespace.

-
- -- **PodSandboxState** - - The API is used to specify enum data of the sandbox status values. - - - - - - - - - - - - -

Parameter

-

Description

-

SANDBOX_READY = 0

-

The sandbox is ready.

-

SANDBOX_NOTREADY = 1

-

The sandbox is not ready.

-
- -- **PodSandboxStatus** - - The API is used to describe the PodSandbox status. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string id

-

Sandbox ID.

-

PodSandboxMetadata metadata

-

Sandbox metadata.

-

PodSandboxState state

-

Sandbox status value.

-

int64 created_at

-

Sandbox creation timestamp (unit: ns).

-

repeated PodSandboxNetworkStatus networks

-

Multi-plane network status of the sandbox.

-

LinuxPodSandboxStatus linux

-

Sandbox status complying with the Linux specifications.

-

map<string, string> labels

-

Key-value pair that can be used to identify a sandbox or a series of sandboxes.

-

map<string, string> annotations

-

Key-value pair that stores any information, whose values cannot be changed by the runtime.

-
- -- **PodSandboxStateValue** - - The API is used to encapsulate [PodSandboxState](#en-us_topic_0182207110_li1818214574195). - - - - - - - - - -

Parameter

-

Description

-

PodSandboxState state

-

Sandbox status value.

-
- -- **PodSandboxFilter** - - The API is used to add filter criteria for the sandbox list. The intersection of multiple filter criteria is displayed. - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string id

-

Sandbox ID.

-

PodSandboxStateValue state

-

Sandbox status.

-

map<string, string> label_selector

-

Sandbox label, which does not support regular expressions and must be fully matched.

-
- -- **PodSandbox** - - This API is used to provide a minimum description of a sandbox. - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string id

-

Sandbox ID.

-

PodSandboxMetadata metadata

-

Sandbox metadata.

-

PodSandboxState state

-

Sandbox status value.

-

int64 created_at

-

Sandbox creation timestamp (unit: ns).

-

map<string, string> labels

-

Key-value pair that can be used to identify a sandbox or a series of sandboxes.

-

map<string, string> annotations

-

Key-value pair that stores any information, whose values cannot be changed by the runtime.

-
- -- **KeyValue** - - The API is used to encapsulate key-value pairs. - - - - - - - - - - - - -

Parameter

-

Description

-

string key

-

Key

-

string value

-

Value

-
- -- **SELinuxOption** - - The API is used to specify the SELinux label of a container. - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string user

-

User

-

string role

-

Role

-

string type

-

Type

-

string level

-

Level

-
- -- **ContainerMetadata** - - Container metadata contains all information that constructs a container name. It is recommended that the metadata be displayed on the user interface during container running to improve user experience. For example, a unique container name can be generated based on the metadata during running. - - - - - - - - - - - - -

Parameter

-

Description

-

string name

-

Container name.

-

uint32 attempt

-

Number of attempts to create a container.

-

Default value: 0

-
- -- **ContainerState** - - The API is used to specify enums of container status values. - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

CONTAINER_CREATED = 0

-

The container is created.

-

CONTAINER_RUNNING = 1

-

The container is running.

-

CONTAINER_EXITED = 2

-

The container exits.

-

CONTAINER_UNKNOWN = 3

-

Unknown container status.

-
- -- **ContainerStateValue** - - The API is used to encapsulate the data structure of [ContainerState](#en-us_topic_0182207110_li65182518309). - - - - - - - - - -

Parameter

-

Description

-

ContainerState state

-

Container status value.

-
- -- **ContainerFilter** - - The API is used to add filter criteria for the container list. The intersection of multiple filter criteria is displayed. - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string id

-

Container ID.

-

PodSandboxStateValue state

-

Container status.

-

string pod_sandbox_id

-

Sandbox ID.

-

map<string, string> label_selector

-

Container label, which does not support regular expressions and must be fully matched.

-
- -- **LinuxContainerSecurityContext** - - The API is used to specify container security configurations. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

Capability capabilities

-

Added or removed capabilities.

-

bool privileged

-

Whether the container is in privileged mode. Default value: false

-

NamespaceOption namespace_options

-

Container namespace options.

-

SELinuxOption selinux_options

-

SELinux context, which is optional. This parameter does not take effect now.

-

Int64Value run_as_user

-

UID for running container processes. Only run_as_user or run_as_username can be specified at a time. run_as_username takes effect preferentially.

-

string run_as_username

-

Username for running container processes. If specified, the user must exist in /etc/passwd in the container image and be parsed by the runtime. Otherwise, an error must occur during running.

-

bool readonly_rootfs

-

Whether the root file system in a container is read-only. The default value is configured in config.json.

-

repeated int64 supplemental_groups

-

List of user groups of the init process running in the container (except the primary GID).

-

string apparmor_profile

-

AppArmor configuration file of the container. This parameter does not take effect now.

-

string seccomp_profile_path

-

Path of the seccomp configuration file of the container.

-

bool no_new_privs

-

Whether to set the no_new_privs flag in the container.

-
- -- **LinuxContainerResources** - - The API is used to specify configurations of Linux container resources. - - - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

int64 cpu_period

-

CPU CFS period. Default value: 0

-

int64 cpu_quota

-

CPU CFS quota. Default value: 0

-

int64 cpu_shares

-

CPU share (relative weight). Default value: 0

-

int64 memory_limit_in_bytes

-

Memory limit (unit: byte). Default value: 0

-

int64 oom_score_adj

-

OOMScoreAdj that is used to adjust the OOM killer. Default value: 0

-

string cpuset_cpus

-

CPU core used by the container. Default value: null

-

string cpuset_mems

-

Memory nodes used by the container. Default value: null

-
- -- **Image** - - The API is used to describe the basic information about an image. - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string id

-

Image ID.

-

repeated string repo_tags

-

Image tag name repo_tags.

-

repeated string repo_digests

-

Image digest information.

-

uint64 size

-

Image size.

-

Int64Value uid

-

Default image UID.

-

string username

-

Default image username.

-
- -- **ImageSpec** - - The API is used to represent the internal data structure of an image. Currently, ImageSpec encapsulates only the container image name. - - - - - - - - - -

Parameter

-

Description

-

string image

-

Container image name.

-
- -- **StorageIdentifier** - - The API is used to specify the unique identifier for defining the storage. - - - - - - - - - -

Parameter

-

Description

-

string uuid

-

Device UUID.

-
- -- **FilesystemUsage** - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

int64 timestamp

-

Timestamp when file system information is collected.

-

StorageIdentifier storage_id

-

UUID of the file system that stores images.

-

UInt64Value used_bytes

-

Size of the metadata that stores images.

-

UInt64Value inodes_used

-

Number of inodes of the metadata that stores images.

-
- -- **AuthConfig** - - - - - - - - - - - - - - - - - - - - - - - -

Parameter

-

Description

-

string username

-

Username used for downloading images.

-

string password

-

Password used for downloading images.

-

string auth

-

Authentication information used for downloading images. The value is encoded by using Base64.

-

string server_address

-

IP address of the server where images are downloaded. This parameter does not take effect now.

-

string identity_token

-

Information about the token used for the registry authentication. This parameter does not take effect now.

-

string registry_token

-

Information about the token used for the interaction with the registry. This parameter does not take effect now.

-
- **Container**

    The API is used to describe container information, such as the ID and status.

    | Parameter | Description |
    |-----------|-------------|
    | string id | Container ID. |
    | string pod_sandbox_id | ID of the sandbox to which the container belongs. |
    | ContainerMetadata metadata | Container metadata. |
    | ImageSpec image | Image specifications. |
    | string image_ref | Image used by the container. This parameter is an image ID for most runtimes. |
    | ContainerState state | Container status. |
    | int64 created_at | Container creation timestamp (unit: ns). |
    | map\<string, string\> labels | Key-value pair that can be used to identify a container or a series of containers. |
    | map\<string, string\> annotations | Key-value pair that stores any information, whose values cannot be changed by the runtime. |
- **ContainerStatus**

    The API is used to describe the container status information.

    | Parameter | Description |
    |-----------|-------------|
    | string id | Container ID. |
    | ContainerMetadata metadata | Container metadata. |
    | ContainerState state | Container status. |
    | int64 created_at | Container creation timestamp (unit: ns). |
    | int64 started_at | Container start timestamp (unit: ns). |
    | int64 finished_at | Container exit timestamp (unit: ns). |
    | int32 exit_code | Container exit code. |
    | ImageSpec image | Image specifications. |
    | string image_ref | Image used by the container. This parameter is an image ID for most runtimes. |
    | string reason | Brief description of the reason why the container is in the current status. |
    | string message | Human-readable message indicating the reason why the container is in the current status. |
    | map\<string, string\> labels | Key-value pair that can be used to identify a container or a series of containers. |
    | map\<string, string\> annotations | Key-value pair that stores any information, whose values cannot be changed by the runtime. |
    | repeated Mount mounts | Information about the container mount points. |
    | string log_path | Path of the container log file, relative to the log_directory folder configured in PodSandboxConfig. |
- **ContainerStatsFilter**

    The API is used to add filter criteria for the container stats list. The intersection of multiple filter criteria is displayed.

    | Parameter | Description |
    |-----------|-------------|
    | string id | Container ID. |
    | string pod_sandbox_id | Sandbox ID. |
    | map\<string, string\> label_selector | Container label, which does not support regular expressions and must be fully matched. |
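The full-match semantics of `label_selector` (no regular expressions; every entry must match exactly, and criteria are intersected) can be sketched as follows. The function and sample data are illustrative only, not an iSulad API:

```python
def label_selector_matches(selector: dict, labels: dict) -> bool:
    """A container passes the filter only if every selector entry is
    present in its labels with exactly the same value (no regex)."""
    return all(labels.get(key) == value for key, value in selector.items())

containers = [
    {"id": "c1", "labels": {"app": "web", "tier": "frontend"}},
    {"id": "c2", "labels": {"app": "web"}},
]
selector = {"app": "web", "tier": "frontend"}
matched = [c["id"] for c in containers if label_selector_matches(selector, c["labels"])]
print(matched)  # only c1 carries both labels
```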
- **ContainerStats**

    The API is used to describe the resource usage information of a container.

    | Parameter | Description |
    |-----------|-------------|
    | ContainerAttributes attributes | Container information. |
    | CpuUsage cpu | CPU usage information. |
    | MemoryUsage memory | Memory usage information. |
    | FilesystemUsage writable_layer | Information about the writable layer usage. |
- **ContainerAttributes**

    The API is used to list basic container information.

    | Parameter | Description |
    |-----------|-------------|
    | string id | Container ID. |
    | ContainerMetadata metadata | Container metadata. |
    | map\<string,string\> labels | Key-value pair that can be used to identify a container or a series of containers. |
    | map\<string,string\> annotations | Key-value pair that stores any information, whose values cannot be changed by the runtime. |
- **CpuUsage**

    The API is used to list the CPU usage information of a container.

    | Parameter | Description |
    |-----------|-------------|
    | int64 timestamp | Timestamp. |
    | UInt64Value usage_core_nano_seconds | CPU usage (unit: ns). |
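Assuming **usage_core_nano_seconds** is a cumulative counter (as in the Kubernetes CRI definition of this field), the average core usage between two samples follows from the deltas; a minimal sketch with made-up sample values:

```python
def cores_used(prev_usage_ns: int, prev_ts_ns: int,
               cur_usage_ns: int, cur_ts_ns: int) -> float:
    """Average number of CPU cores used between two stats samples.

    Both the usage counter and the timestamps are in nanoseconds,
    so the ratio of the two deltas is directly a core count.
    """
    elapsed = cur_ts_ns - prev_ts_ns
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (cur_usage_ns - prev_usage_ns) / elapsed

# 0.5 s of CPU time consumed over a 1 s wall-clock interval -> 0.5 cores.
print(cores_used(0, 0, 500_000_000, 1_000_000_000))
```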
- **MemoryUsage**

    The API is used to list the memory usage information of a container.

    | Parameter | Description |
    |-----------|-------------|
    | int64 timestamp | Timestamp. |
    | UInt64Value working_set_bytes | Memory usage. |
- **FilesystemUsage**

    The API is used to list the read/write layer information of a container.

    | Parameter | Description |
    |-----------|-------------|
    | int64 timestamp | Timestamp. |
    | StorageIdentifier storage_id | Writable layer directory. |
    | UInt64Value used_bytes | Number of bytes occupied by images at the writable layer. |
    | UInt64Value inodes_used | Number of inodes occupied by images at the writable layer. |
- **Device**

    The API is used to specify the host device to be mapped into a container.

    | Parameter | Description |
    |-----------|-------------|
    | string container_path | Mounting path in the container. |
    | string host_path | Mounting path on the host. |
    | string permissions | Cgroup permissions of the device. (**r** indicates that the container can read from the device. **w** indicates that the container can write to the device. **m** indicates that the container can create new device files.) |
- **LinuxContainerConfig**

    The API is used to specify Linux configurations.

    | Parameter | Description |
    |-----------|-------------|
    | LinuxContainerResources resources | Container resource specifications. |
    | LinuxContainerSecurityContext security_context | Linux container security configuration. |
- **ContainerConfig**

    The API is used to specify all mandatory and optional fields for creating a container.

    | Parameter | Description |
    |-----------|-------------|
    | ContainerMetadata metadata | Container metadata. The information will uniquely identify a container and should be used at runtime to ensure correct operations. The information can also be used at runtime to optimize the user experience (UX) design, for example, to construct a readable name. This parameter is mandatory. |
    | ImageSpec image | Image used by the container. This parameter is mandatory. |
    | repeated string command | Command to be executed. Default value: **/bin/sh**. |
    | repeated string args | Arguments of the command to be executed. |
    | string working_dir | Current working directory of the command. |
    | repeated KeyValue envs | Environment variables configured in the container. |
    | repeated Mount mounts | Information about the mount points to be mounted in the container. |
    | repeated Device devices | Information about the devices to be mapped in the container. |
    | map\<string, string\> labels | Key-value pair that can be used to index and select a resource. |
    | map\<string, string\> annotations | Unstructured key-value mappings that can be used to store and retrieve any metadata. |
    | string log_path | Relative path to PodSandboxConfig.LogDirectory, which is used to store logs (STDOUT and STDERR) on the container host. |
    | bool stdin | Whether to open stdin of the container. |
    | bool stdin_once | Whether to immediately disconnect other data flows connected with stdin when a data flow connected with stdin is disconnected. This parameter does not take effect now. |
    | bool tty | Whether to use a pseudo terminal to connect to stdio of the container. |
    | LinuxContainerConfig linux | Container configuration information in the Linux system. |
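For illustration, a hypothetical minimal ContainerConfig in the JSON form commonly passed to CRI command-line clients; the field names follow the message above, and all values are made up:

```json
{
    "metadata": { "name": "busybox-app", "attempt": 1 },
    "image": { "image": "busybox:latest" },
    "command": ["/bin/sh"],
    "args": ["-c", "sleep 1000"],
    "working_dir": "/",
    "log_path": "busybox-app_0.log",
    "labels": { "app": "busybox-app" },
    "linux": {}
}
```

Here **log_path** is resolved relative to the **log_directory** configured for the sandbox.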
- **NetworkConfig**

    This API is used to specify runtime network configurations.

    | Parameter | Description |
    |-----------|-------------|
    | string pod_cidr | CIDR used by pod IP addresses. |
- **RuntimeConfig**

    This API is used to specify runtime network configurations.

    | Parameter | Description |
    |-----------|-------------|
    | NetworkConfig network_config | Runtime network configurations. |
- **RuntimeCondition**

    The API is used to describe runtime status information.

    | Parameter | Description |
    |-----------|-------------|
    | string type | Runtime status type. |
    | bool status | Runtime status. |
    | string reason | Brief description of the reason for the runtime status change. |
    | string message | Human-readable message indicating the reason for the runtime status change. |
- **RuntimeStatus**

    The API is used to describe the runtime status.

    | Parameter | Description |
    |-----------|-------------|
    | repeated RuntimeCondition conditions | List of current runtime conditions. |
### Runtime Service

The runtime service provides APIs for operating pods and containers, and APIs for querying the configuration and status information of the runtime service.

#### RunPodSandbox

##### Prototype

```shell
rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
```

##### Description

This API is used to create and start a PodSandbox. If the PodSandbox is run successfully, the sandbox is in the ready state.

##### Precautions

1. The default image for starting a sandbox is **rnd-dockerhub.huawei.com/library/pause-$\{**_machine_**\}:3.0**, where **$\{**_machine_**\}** indicates the architecture. On x86\_64, the value of _machine_ is **amd64**. On ARM64, the value of _machine_ is **aarch64**. Currently, only the **amd64** or **aarch64** image can be downloaded from the rnd-dockerhub registry. If the image does not exist on the host, ensure that the host can download the image from the rnd-dockerhub registry. If you want to use another image, refer to **pod-sandbox-image** in the _iSulad Deployment Configuration_.
2. The container name is obtained from fields in [PodSandboxMetadata](#apis.md#en-us_topic_0182207110_li2359918134912) and separated by underscores \(\_\). Therefore, the metadata cannot contain underscores \(\_\). Otherwise, the [ListPodSandbox](#listpodsandbox.md#EN-US_TOPIC_0184808098) API cannot be used for query even when the sandbox is running successfully.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| PodSandboxConfig config | Sandbox configuration. |
| string runtime_handler | Runtime for the created sandbox. Currently, lcr and kata-runtime are supported. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| string pod_sandbox_id | ID of the created sandbox, returned if the operation is successful. |
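For illustration, a hypothetical PodSandboxConfig in the JSON form used by CRI command-line clients; all values are made up, and the metadata values avoid underscores in line with the precautions above:

```json
{
    "metadata": {
        "name": "test-sandbox",
        "namespace": "default",
        "uid": "b1c3a6f2",
        "attempt": 1
    },
    "log_directory": "/var/log/pods/test-sandbox",
    "linux": {}
}
```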
#### StopPodSandbox

##### Prototype

```shell
rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}
```

##### Description

This API is used to stop PodSandboxes and sandbox containers, and reclaim the network resources \(such as IP addresses\) allocated to a sandbox. If any running container belongs to the sandbox, the container must be forcibly stopped.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string pod_sandbox_id | Sandbox ID. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| None | None |
#### RemovePodSandbox

##### Prototype

```shell
rpc RemovePodSandbox(RemovePodSandboxRequest) returns (RemovePodSandboxResponse) {}
```

##### Description

This API is used to delete a sandbox. If any running container belongs to the sandbox, the container must be forcibly stopped and deleted. If the sandbox has been deleted, no errors will be returned.

##### Precautions

1. When a sandbox is deleted, network resources of the sandbox are not deleted. Before deleting a pod, you must call StopPodSandbox to clear network resources. Ensure that StopPodSandbox is called at least once before deleting the sandbox.
2. If a sandbox is deleted but some containers in the sandbox fail to be deleted, you need to manually delete those containers.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string pod_sandbox_id | Sandbox ID. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| None | None |
#### PodSandboxStatus

##### Prototype

```shell
rpc PodSandboxStatus(PodSandboxStatusRequest) returns (PodSandboxStatusResponse) {}
```

##### Description

This API is used to query the sandbox status. If the sandbox does not exist, an error is returned.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string pod_sandbox_id | Sandbox ID. |
| bool verbose | Whether to display additional information about the sandbox. This parameter does not take effect now. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| PodSandboxStatus status | Status of the sandbox. |
| map\<string, string\> info | Additional information about the sandbox. The key can be any string, and the value is a JSON character string. The information can be any debugging content. When verbose is set to true, info cannot be empty. This parameter does not take effect now. |
#### ListPodSandbox

##### Prototype

```shell
rpc ListPodSandbox(ListPodSandboxRequest) returns (ListPodSandboxResponse) {}
```

##### Description

This API is used to return the sandbox information list. Filtering based on criteria is supported.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| PodSandboxFilter filter | Filter criteria. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| repeated PodSandbox items | Sandbox information list. |
#### CreateContainer

##### Prototype

```shell
grpc::Status CreateContainer(grpc::ServerContext *context, const runtime::CreateContainerRequest *request, runtime::CreateContainerResponse *reply) {}
```

##### Description

This API is used to create a container in a PodSandbox.

##### Precautions

- **sandbox\_config** in **CreateContainerRequest** is the same as the configuration transferred to **RunPodSandboxRequest** to create a PodSandbox. It is transferred again for reference only. PodSandboxConfig must remain unchanged throughout the lifecycle of a pod.
- The container name is obtained from fields in [ContainerMetadata](#apis.md#en-us_topic_0182207110_li17135914132319) and separated by underscores \(\_\). Therefore, the metadata cannot contain underscores \(\_\). Otherwise, the [ListContainers](#listcontainers.md#EN-US_TOPIC_0184808103) API cannot be used for query even when the container is running successfully.
- **CreateContainerRequest** does not contain the **runtime\_handler** field. The runtime type of the container is the same as that of the corresponding sandbox.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string pod_sandbox_id | ID of the PodSandbox where a container is to be created. |
| ContainerConfig config | Container configuration information. |
| PodSandboxConfig sandbox_config | PodSandbox configuration information. |
#### Supplement

Annotations are unstructured key-value mappings that can be used to store and retrieve any metadata. The field can be used to transfer parameters for which the CRI does not provide dedicated fields.

- Custom fields:

    | Custom key:value | Description |
    |------------------|-------------|
    | cgroup.pids.max:int64_t | Used to limit the number of processes or threads in a container. (Set the parameter to -1 for an unlimited number.) |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| string container_id | ID of the created container. |
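Building on the custom **cgroup.pids.max** key described in the Supplement, a hypothetical ContainerConfig that caps a container at 100 processes might look as follows; all names and values are illustrative only:

```json
{
    "metadata": { "name": "worker", "attempt": 1 },
    "image": { "image": "busybox:latest" },
    "command": ["/bin/sh", "-c", "sleep 1000"],
    "log_path": "worker_0.log",
    "annotations": { "cgroup.pids.max": "100" },
    "linux": {}
}
```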
#### StartContainer

##### Prototype

```shell
rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
```

##### Description

This API is used to start a container.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| None | None |
#### StopContainer

##### Prototype

```shell
rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {}
```

##### Description

This API is used to stop a running container. A graceful stop timeout can be set. If the container has been stopped, no errors will be returned.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
| int64 timeout | Waiting time before a container is forcibly stopped. The default value is 0, indicating an immediate forcible stop. |
##### Return Values

None

#### RemoveContainer

##### Prototype

```shell
rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {}
```

##### Description

This API is used to delete a container. If the container is running, it must be forcibly stopped. If the container has been deleted, no errors will be returned.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
##### Return Values

None

#### ListContainers

##### Prototype

```shell
rpc ListContainers(ListContainersRequest) returns (ListContainersResponse) {}
```

##### Description

This API is used to return the container information list. Filtering based on criteria is supported.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| ContainerFilter filter | Filter criteria. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| repeated Container containers | Container information list. |
#### ContainerStatus

##### Prototype

```shell
rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {}
```

##### Description

This API is used to return the container status information. If the container does not exist, an error will be returned.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
| bool verbose | Whether to display additional information about the container. This parameter does not take effect now. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| ContainerStatus status | Container status information. |
| map\<string, string\> info | Additional information about the container. The key can be any string, and the value is a JSON character string. The information can be any debugging content. When verbose is set to true, info cannot be empty. This parameter does not take effect now. |
#### UpdateContainerResources

##### Prototype

```shell
rpc UpdateContainerResources(UpdateContainerResourcesRequest) returns (UpdateContainerResourcesResponse) {}
```

##### Description

This API is used to update container resource configurations.

##### Precautions

- This API cannot be used to update the pod resource configurations.
- The value of **oom\_score\_adj** of any container cannot be updated.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
| LinuxContainerResources linux | Linux resource configuration information. |
##### Return Values

None

#### ExecSync

##### Prototype

```shell
rpc ExecSync(ExecSyncRequest) returns (ExecSyncResponse) {}
```

##### Description

The API is used to run a command in a container in synchronous mode through the gRPC communication method.

##### Precautions

The interaction between the terminal and the container must be disabled when a single command is executed.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
| repeated string cmd | Command to be executed. |
| int64 timeout | Timeout period for stopping the command (unit: second). The default value is 0, indicating that there is no timeout limit. This parameter does not take effect now. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| bytes stdout | Captured standard output of the command. |
| bytes stderr | Captured standard error output of the command. |
| int32 exit_code | Exit code, which represents the completion of command execution. The default value is 0, indicating that the command is executed successfully. |
#### Exec

##### Prototype

```shell
rpc Exec(ExecRequest) returns (ExecResponse) {}
```

##### Description

This API is used to run commands in a container through the gRPC communication method, that is, obtain a URL from the CRI server and then use the obtained URL to establish a long connection to the WebSocket server, implementing the interaction with the container.

##### Precautions

The interaction between the terminal and the container can be enabled when a single command is executed. One of **stdin**, **stdout**, and **stderr** must be true. If **tty** is true, **stderr** must be false. Multiplexing is not supported. In this case, the output of **stdout** and **stderr** is combined into a single stream.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
| repeated string cmd | Command to be executed. |
| bool tty | Whether to run the command in a TTY. |
| bool stdin | Whether to generate the standard input stream. |
| bool stdout | Whether to generate the standard output stream. |
| bool stderr | Whether to generate the standard error output stream. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| string url | Fully qualified URL of the exec streaming server. |
#### Attach

##### Prototype

```shell
rpc Attach(AttachRequest) returns (AttachResponse) {}
```

##### Description

This API is used to take over the init process of a container through the gRPC communication method, that is, obtain a URL from the CRI server and then use the obtained URL to establish a long connection to the WebSocket server, implementing the interaction with the container. Only containers whose runtime is of the LCR type are supported.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
| bool tty | Whether to run the command in a TTY. |
| bool stdin | Whether to generate the standard input stream. |
| bool stdout | Whether to generate the standard output stream. |
| bool stderr | Whether to generate the standard error output stream. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| string url | Fully qualified URL of the attach streaming server. |
#### ContainerStats

##### Prototype

```shell
rpc ContainerStats(ContainerStatsRequest) returns (ContainerStatsResponse) {}
```

##### Description

This API is used to return information about resources occupied by a container. Only containers whose runtime is of the LCR type are supported.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| string container_id | Container ID. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| ContainerStats stats | Container information. Note: Disk and inode information can be queried only for containers started from OCI images. |
#### ListContainerStats

##### Prototype

```shell
rpc ListContainerStats(ListContainerStatsRequest) returns (ListContainerStatsResponse) {}
```

##### Description

This API is used to return the information about resources occupied by multiple containers. Filtering based on criteria is supported.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| ContainerStatsFilter filter | Filter criteria. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| repeated ContainerStats stats | Container information list. Note: Disk and inode information can be queried only for containers started from OCI images. |
#### UpdateRuntimeConfig

##### Prototype

```shell
rpc UpdateRuntimeConfig(UpdateRuntimeConfigRequest) returns (UpdateRuntimeConfigResponse);
```

##### Description

This API is used as a standard CRI to update the pod CIDR of the network plug-in. Currently, the CNI network plug-in does not need to update the pod CIDR. Therefore, this API only records access logs.

##### Precautions

This API does not modify the system management information; it only records a log.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| RuntimeConfig runtime_config | Information to be configured for the runtime. |
##### Return Values

None

#### Status

##### Prototype

```shell
rpc Status(StatusRequest) returns (StatusResponse) {};
```

##### Description

This API is used to obtain the network status of the runtime and pod. Obtaining the network status triggers an update of the network configuration. Only containers whose runtime is of the LCR type are supported.

##### Precautions

If the network configuration fails to be updated, the original configuration is not affected. The original configuration is overwritten only when the update is successful.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| bool verbose | Whether to display additional runtime information. This parameter does not take effect now. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| RuntimeStatus status | Runtime status. |
| map\<string, string\> info | Additional information about the runtime. The key of info can be any value. The value must be in JSON format and can contain any debugging information. When verbose is set to true, info cannot be empty. |
### Image Service

The service provides the gRPC API for pulling, viewing, and removing images from the registry.

#### ListImages

##### Prototype

```shell
rpc ListImages(ListImagesRequest) returns (ListImagesResponse) {}
```

##### Description

This API is used to list existing image information.

##### Precautions

This is a unified API. You can run the **cri images** command to query embedded images. However, embedded images are not standard OCI images. Therefore, query results have the following restrictions:

- An embedded image does not have an image ID. Therefore, the value of **image ID** is the config digest of the image.
- An embedded image has only a config digest, and it does not comply with the OCI image specifications. Therefore, the value of **digest** cannot be displayed.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| ImageSpec filter | Name of the image to be filtered. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| repeated Image images | Image information list. |
#### ImageStatus

##### Prototype

```shell
rpc ImageStatus(ImageStatusRequest) returns (ImageStatusResponse) {}
```

##### Description

The API is used to query the information about a specified image.

##### Precautions

1. If the image to be queried does not exist, **ImageStatusResponse** is returned with **Image** set to **nil** in the return value.
2. This is a unified API. Since embedded images do not comply with the OCI image specifications and do not contain the required fields, the images cannot be queried by using this API.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| ImageSpec image | Image name. |
| bool verbose | Whether to query additional information. This parameter does not take effect now. No additional information is returned. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| Image image | Image information. |
| map\<string, string\> info | Additional image information. This parameter does not take effect now. No additional information is returned. |
#### PullImage

##### Prototype

```shell
rpc PullImage(PullImageRequest) returns (PullImageResponse) {}
```

##### Description

This API is used to download images.

##### Precautions

Currently, you can download public images, and use the username, password, and auth information to download private images. The **server\_address**, **identity\_token**, and **registry\_token** fields in **authconfig** cannot be configured.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| ImageSpec image | Name of the image to be downloaded. |
| AuthConfig auth | Verification information for downloading a private image. |
| PodSandboxConfig sandbox_config | Whether to download an image in the pod context. This parameter does not take effect now. |
##### Return Values

| Return Value | Description |
|--------------|-------------|
| string image_ref | Information about the downloaded image. |
#### RemoveImage

##### Prototype

```shell
rpc RemoveImage(RemoveImageRequest) returns (RemoveImageResponse) {}
```

##### Description

This API is used to delete specified images.

##### Precautions

This is a unified API. Since embedded images do not comply with the OCI image specifications and do not contain the required fields, you cannot delete embedded images by using this API and the image ID.

##### Parameters

| Parameter | Description |
|-----------|-------------|
| ImageSpec image | Name or ID of the image to be deleted. |
##### Return Values

None

#### ImageFsInfo

##### Prototype

```shell
rpc ImageFsInfo(ImageFsInfoRequest) returns (ImageFsInfoResponse) {}
```

##### Description

This API is used to query the information about the file system that stores images.

##### Precautions

Queried results are the file system information in the image metadata.

##### Parameters

None

##### Return Values

| Return Value | Description |
|--------------|-------------|
| repeated FilesystemUsage image_filesystems | Information about the file system that stores images. |
### Constraints

1. If **log\_directory** is configured in the **PodSandboxConfig** parameter when a sandbox is created, **log\_path** must be specified in **ContainerConfig** when all containers that belong to the sandbox are created. Otherwise, the containers may not be started or deleted by using the CRI.

    The actual value of **LOGPATH** of containers is **log\_directory/log\_path**. If **log\_path** is not set, the final value of **LOGPATH** is changed to **log\_directory**.

    - If the path does not exist, iSulad will create a soft link pointing to the actual path of container logs when starting a container. Then **log\_directory** becomes a soft link. There are two cases:
        1. In the first case, if **log\_path** is not configured for other containers in the sandbox, **log\_directory** will be deleted and point to **log\_path** of the newly started container. As a result, logs of the first started container point to logs of the later started container.
        2. In the second case, if **log\_path** is configured for other containers in the sandbox, the value of **LOGPATH** of the container is **log\_directory/log\_path**. Because **log\_directory** is a soft link, the creation fails when **log\_directory/log\_path** is used as the soft link to point to the actual path of container logs.
    - If the path exists, iSulad will attempt to delete the path \(non-recursive\) when starting a container. If the path is a folder containing content, the deletion fails. As a result, the soft link fails to be created, the container fails to be started, and the same error occurs when the container is going to be deleted.

2. If **log\_directory** is configured in the **PodSandboxConfig** parameter when a sandbox is created, and **log\_path** is specified in **ContainerConfig** when a container is created, the final value of **LOGPATH** is **log\_directory/log\_path**. iSulad does not recursively create **LOGPATH**. Therefore, you must ensure that **dirname\(LOGPATH\)**, that is, the upper-level path of the final log file path, exists.
3. If **log\_directory** is configured in the **PodSandboxConfig** parameter when a sandbox is created, and the same **log\_path** is specified in **ContainerConfig** when multiple containers are created, or if containers in different sandboxes point to the same **LOGPATH**, the latest container log path will overwrite the previous path after the containers are started successfully.
4. If the image content in the remote registry changes and the original image is stored in the local host, the name and tag of the original image are changed to **none** when you call the CRI Pull image API to download the image again.

    An example is as follows:

    Locally stored images:

    ```console
    IMAGE                                        TAG      IMAGE ID        SIZE
    rnd-dockerhub.huawei.com/pproxyisulad/test   latest   99e59f495ffaa   753kB
    ```

    After the **rnd-dockerhub.huawei.com/pproxyisulad/test:latest** image in the remote registry is updated and downloaded again:

    ```console
    IMAGE                                        TAG      IMAGE ID        SIZE
                                                          99e59f495ffaa   753kB
    rnd-dockerhub.huawei.com/pproxyisulad/test   latest   d8233ab899d41   1.42MB
    ```

    Run the **isula images** command. The value of **REF** is displayed as **-**.

    ```console
    REF                                                 IMAGE ID        CREATED               SIZE
    rnd-dockerhub.huawei.com/pproxyisulad/test:latest   d8233ab899d41   2019-02-14 19:19:37   1.42MB
    -                                                   99e59f495ffaa   2016-05-04 02:26:41   753kB
    ```

diff --git a/docs/en/docs/Container/iSulad_support_for_CDI.md b/docs/en/docs/Container/iSulad_support_for_CDI.md
new file mode 100644
index 0000000000000000000000000000000000000000..e7e3cd7ab551a524f2f898fbb3c018d4cc0ea422
--- /dev/null
+++ b/docs/en/docs/Container/iSulad_support_for_CDI.md

# iSulad Support for CDI

## Overview

Container Device Interface (CDI) is a container runtime specification used to support third-party devices.
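Under the CDI specification, a fully qualified device name has the form `vendor/class=device` (for example, `vendor.com/device=myDevice`). A small parsing sketch, for illustration only (the function name is not an iSulad API):

```python
def parse_qualified_device(name: str):
    """Split a CDI fully qualified device name into (vendor, class, device).

    Expected form: "<vendor>/<class>=<device>", e.g. "vendor.com/device=myDevice".
    """
    kind, sep, device = name.partition("=")
    if not sep or "/" not in kind:
        raise ValueError(f"not a fully qualified CDI device name: {name!r}")
    vendor, _, cls = kind.partition("/")
    return vendor, cls, device

print(parse_qualified_device("vendor.com/device=myDevice"))
```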
CDI solves the following problems:

In Linux, exposing a single device node in a container used to be enough to make the container device-aware. However, as devices and software become more complex, vendors want to perform more operations, such as:

- Exposing multiple device nodes to a container, mounting files from a runtime namespace into the container, or hiding procfs entries.
- Checking the compatibility between containers and devices, for example, checking whether a container can run on a specified device.
- Performing runtime-specific operations, for example, for virtual machines versus Linux container-based runtimes.
- Performing device-specific operations, such as GPU memory cleanup and FPGA reprogramming.

In the absence of a third-party device standard, vendors often have to write and maintain multiple plugins for different runtimes, or even contribute vendor-specific code directly to a runtime. In addition, runtimes do not expose their plugin systems in a unified manner (or at all), which leads to duplicated functionality in higher-level abstractions such as Kubernetes device plugins.

To solve the preceding problems, CDI describes a mechanism that allows third-party vendors to interact with devices without modifying the container runtime. The mechanism is exposed as a JSON file (similar to the Container Network Interface, CNI), which allows vendors to describe the operations that the container runtime should perform on an OCI-based container.

Currently, iSulad supports the [CDI v0.6.0](https://github.com/cncf-tags/container-device-interface/blob/v0.6.0/SPEC.md) specification.

## Configuring iSulad to Support CDI

Modify the **daemon.json** file as follows and restart iSulad:

```json
{
    ...
    "enable-cri-v1": true,
    "cdi-spec-dirs": ["/etc/cdi", "/var/run/cdi"],
    "enable-cdi": true
}
```

**cdi-spec-dirs** specifies the directories in which CDI specifications are stored.
If this parameter is not specified, the default value **/etc/cdi** or **/var/run/cdi** is used. + +## Examples + +### CDI Specification Example + +For details about each field, see [CDI v0.6.0](https://github.com/cncf-tags/container-device-interface/blob/v0.6.0/SPEC.md). + +```bash +$ mkdir /etc/cdi +$ cat > /etc/cdi/vendor.json < configuration file \> default configuration in code. - ->![](./public_sys-resources/icon-note.gif) **NOTE:** ->If systemd is used to manage the iSulad process, modify the **OPTIONS** field in the **/etc/sysconfig/iSulad** file, which functions the same as using the CLI. - -- **CLI** - - During service startup, configure iSulad using the CLI. To view the configuration options, run the following command: - - ``` - $ isulad --help - lightweight container runtime daemon - - Usage: isulad [global options] - - GLOBAL OPTIONS: - - --authorization-plugin Use authorization plugin - --cgroup-parent Set parent cgroup for all containers - --cni-bin-dir The full path of the directory in which to search for CNI plugin binaries. Default: /opt/cni/bin - --cni-conf-dir The full path of the directory in which to search for CNI config files. 
Default: /etc/cni/net.d - --default-ulimit Default ulimits for containers (default []) - -e, --engine Select backend engine - -g, --graph Root directory of the iSulad runtime - -G, --group Group for the unix socket(default is isulad) - --help Show help - --hook-spec Default hook spec file applied to all containers - -H, --host The socket name used to create gRPC server - --image-layer-check Check layer intergrity when needed - --image-opt-timeout Max timeout(default 5m) for image operation - --insecure-registry Disable TLS verification for the given registry - --insecure-skip-verify-enforce Force to skip the insecure verify(default false) - --log-driver Set daemon log driver, such as: file - -l, --log-level Set log level, the levels can be: FATAL ALERT CRIT ERROR WARN NOTICE INFO DEBUG TRACE - --log-opt Set daemon log driver options, such as: log-path=/tmp/logs/ to set directory where to store daemon logs - --native.umask Default file mode creation mask (umask) for containers - --network-plugin Set network plugin, default is null, support null and cni - -p, --pidfile Save pid into this file - --pod-sandbox-image The image whose network/ipc namespaces containers in each pod will use. 
(default "rnd-dockerhub.huawei.com/library/pause-${machine}:3.0") - --registry-mirrors Registry to be prepended when pulling unqualified images, can be specified multiple times - --start-timeout timeout duration for waiting on a container to start before it is killed - -S, --state Root directory for execution state files - --storage-driver Storage driver to use(default overlay2) - -s, --storage-opt Storage driver options - --tls Use TLS; implied by --tlsverify - --tlscacert Trust certs signed only by this CA (default "/root/.iSulad/ca.pem") - --tlscert Path to TLS certificate file (default "/root/.iSulad/cert.pem") - --tlskey Path to TLS key file (default "/root/.iSulad/key.pem") - --tlsverify Use TLS and verify the remote - --use-decrypted-key Use decrypted private key by default(default true) - -V, --version Print the version - --websocket-server-listening-port CRI websocket streaming service listening port (default 10350) - ``` - - Example: Start iSulad and change the log level to DEBUG. - - ``` - $ isulad -l DEBUG - ``` - - -- **Configuration file** - - The iSulad configuration file is **/etc/isulad/daemon.json**. The parameters in the file are described as follows: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

| Parameter | Example | Description | Remarks |
|-----------|---------|-------------|---------|
| -e, --engine | "engine": "lcr" | iSulad runtime, which is lcr by default. | None |
| -G, --group | "group": "isulad" | Socket group. | None |
| --hook-spec | "hook-spec": "/etc/default/isulad/hooks/default.json" | Default hook configuration file for all containers. | None |
| -H, --host | "hosts": "unix:///var/run/isulad.sock" | Communication mode. | In addition to the local socket, the tcp://ip:port mode is supported. The port number ranges from 0 to 65535, excluding occupied ports. |
| --log-driver | "log-driver": "file" | Log driver configuration. | None |
| -l, --log-level | "log-level": "ERROR" | Log output level. | None |
| --log-opt | "log-opts": {"log-file-mode": "0600", "log-path": "/var/lib/isulad", "max-file": "1", "max-size": "30KB"} | Log-related configuration. | You can specify max-file, max-size, and log-path. max-file indicates the number of log files. max-size indicates the threshold for triggering log anti-explosion; if max-file is 1, max-size is invalid. log-path specifies the path for storing log files. log-file-mode sets the permissions to read and write log files; the value must be in octal format, for example, 0666. |
| --start-timeout | "start-timeout": "2m" | Time required for starting a container. | None |
| --runtime | "default-runtime": "lcr" | Container runtime, which is lcr by default. | If neither the CLI nor the configuration file specifies the runtime, lcr is used by default. The priorities of the three specifying methods are as follows: CLI > configuration file > default value lcr. Currently, lcr and kata-runtime are supported. |
| None | "runtimes": {"kata-runtime": {"path": "/usr/bin/kata-runtime", "runtime-args": ["--kata-config", "/usr/share/defaults/kata-containers/configuration.toml"]}} | When starting a container, set this parameter to specify multiple runtimes. Runtimes in this set are valid for container startup. | Runtime whitelist of a container. The customized runtimes in this set are valid. kata-runtime is used as the example. |
| -p, --pidfile | "pidfile": "/var/run/isulad.pid" | File for storing PIDs. | This parameter is required only when more than two container engines need to be started. |
| -g, --graph | "graph": "/var/lib/isulad" | Root directory for iSulad runtimes. | None |
| -S, --state | "state": "/var/run/isulad" | Root directory of the execution state files. | None |
| --storage-driver | "storage-driver": "overlay2" | Image storage driver, which is overlay2 by default. | Only overlay2 is supported. |
| -s, --storage-opt | "storage-opts": [ "overlay2.override_kernel_check=true" ] | Image storage driver configuration options. | The options are as follows: overlay2.override_kernel_check=true (ignore the kernel version check), overlay2.size=${size} (set the rootfs quota to ${size}), overlay2.basesize=${size} (equivalent to overlay2.size). |
| --image-opt-timeout | "image-opt-timeout": "5m" | Image operation timeout interval, which is 5m by default. | The value -1 indicates that the timeout interval is not limited. |
| --registry-mirrors | "registry-mirrors": [ "docker.io" ] | Registry address. | None |
| --insecure-registry | "insecure-registries": [ ] | Registry without TLS verification. | None |
| --native.umask | "native.umask": "secure" | Container umask policy. The default value is secure. The value normal indicates an insecure configuration. | Sets the container umask value. The value can be null (0027 by default), normal (the umask of the started container is 0022), or secure (the umask of the started container is 0027, the default). |
| --pod-sandbox-image | "pod-sandbox-image": "rnd-dockerhub.huawei.com/library/pause-aarch64:3.0" | Image used by the pod by default. The default value is rnd-dockerhub.huawei.com/library/pause-${machine}:3.0. | None |
| --network-plugin | "network-plugin": "" | Specifies a network plugin. The value is empty by default, indicating that no network configuration is available and the created sandbox has only the loopback NIC. | The value cni and the empty value are supported. Other invalid values will cause iSulad startup failure. |
| --cni-bin-dir | "cni-bin-dir": "" | Specifies the location of the binary files on which the CNI plugin depends. | The default value is /opt/cni/bin. |
| --cni-conf-dir | "cni-conf-dir": "" | Specifies the location of the CNI network configuration files. | The default value is /etc/cni/net.d. |
| --image-layer-check=false | "image-layer-check": false | Image layer integrity check. Set it to true to enable the function; it is disabled by default. | When iSulad is started, the image layer integrity is checked. If the image layer is damaged, the related images are unavailable. iSulad cannot verify empty files, directories, and link files; therefore, if such files are lost due to a power failure, the integrity check may fail to detect it. When the iSulad version changes, check whether the parameter is supported; if not, delete it from the configuration file. |
| --insecure-skip-verify-enforce=false | "insecure-skip-verify-enforce": false | Whether to forcibly skip the verification of the certificate host name/domain name. Boolean. If set to true, the verification is skipped. | The default value is false (not skipped). Note: restricted by the YAJL JSON parsing library, if a non-Boolean value that meets the JSON format requirements is configured in /etc/isulad/daemon.json, iSulad falls back to the default value false. |
| --use-decrypted-key=true | "use-decrypted-key": true | Whether to use an unencrypted private key. Boolean. If set to true, an unencrypted private key is used; if set to false, an encrypted private key is used, that is, two-way authentication is required. | The default value is true (an unencrypted private key is used). Note: restricted by the YAJL JSON parsing library, if a non-Boolean value that meets the JSON format requirements is configured in /etc/isulad/daemon.json, iSulad falls back to the default value true. |
| --tls | "tls": false | Whether to use TLS. Boolean. | Used only in -H tcp://IP:PORT mode. The default value is false. |
| --tlsverify | "tlsverify": false | Whether to use TLS and verify remote access. Boolean. | Used only in -H tcp://IP:PORT mode. |
| --tlscacert, --tlscert, --tlskey | "tls-config": {"CAFile": "/root/.iSulad/ca.pem", "CertFile": "/root/.iSulad/server-cert.pem", "KeyFile": "/root/.iSulad/server-key.pem"} | TLS certificate-related configuration. | Used only in -H tcp://IP:PORT mode. |
| --authorization-plugin | "authorization-plugin": "authz-broker" | User permission authentication plugin. | Only authz-broker is supported. |
| --cgroup-parent | "cgroup-parent": "lxc/mycgroup" | Default cgroup parent path of a container, of the string type. | Specifies the cgroup parent path of a container. If --cgroup-parent is specified on the client, the client parameter prevails. Note: if container A is started before container B and the cgroup parent path of container B is the cgroup path of container A, delete container B before container A; otherwise, residual cgroup resources remain. |
| --default-ulimits | "default-ulimits": {"nofile": {"Name": "nofile", "Hard": 6400, "Soft": 3200}} | Specifies the ulimit restriction type, soft value, and hard value. | Specifies the restricted resource type, for example, nofile. The two field names must be the same, that is, nofile; otherwise, an error is reported. The value of Hard must be greater than or equal to that of Soft. If the Hard or Soft field is not set, the default value 0 is used. |
| --websocket-server-listening-port | "websocket-server-listening-port": 10350 | Specifies the listening port of the CRI WebSocket streaming service. The default port number is 10350. | If the client specifies --websocket-server-listening-port, the specified value is used. The port number must range from 1024 to 49151. |
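When hand-editing **/etc/isulad/daemon.json** against the parameters above, a quick syntax check before restarting the daemon catches malformed JSON early. A minimal sketch using python3's bundled JSON tool (the temporary file and sample contents are placeholders; point `CFG` at the real file on an iSulad host):

```shell
# Syntax-check a daemon.json-style file before restarting iSulad.
# CFG defaults to a throwaway sample; on a real host use CFG=/etc/isulad/daemon.json.
CFG="${CFG:-$(mktemp)}"
[ -s "$CFG" ] || cat > "$CFG" <<'EOF'
{
    "group": "isulad",
    "log-level": "ERROR",
    "storage-driver": "overlay2"
}
EOF
if python3 -m json.tool "$CFG" > /dev/null; then
    echo "daemon.json syntax: OK"
else
    echo "daemon.json syntax: INVALID"
fi
```

Note that this validates only JSON syntax, not whether the keys are ones iSulad accepts.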
- - Example: - - ``` - $ cat /etc/isulad/daemon.json - { - "group": "isulad", - "default-runtime": "lcr", - "graph": "/var/lib/isulad", - "state": "/var/run/isulad", - "engine": "lcr", - "log-level": "ERROR", - "pidfile": "/var/run/isulad.pid", - "log-opts": { - "log-file-mode": "0600", - "log-path": "/var/lib/isulad", - "max-file": "1", - "max-size": "30KB" - }, - "log-driver": "stdout", - "hook-spec": "/etc/default/isulad/hooks/default.json", - "start-timeout": "2m", - "storage-driver": "overlay2", - "storage-opts": [ - "overlay2.override_kernel_check=true" - ], - "registry-mirrors": [ - "docker.io" - ], - "insecure-registries": [ - "rnd-dockerhub.huawei.com" - ], - "pod-sandbox-image": "", - "image-opt-timeout": "5m", - "native.umask": "secure", - "network-plugin": "", - "cni-bin-dir": "", - "cni-conf-dir": "", - "image-layer-check": false, - "use-decrypted-key": true, - "insecure-skip-verify-enforce": false - } - ``` - - >![](./public_sys-resources/icon-notice.gif) **NOTICE:** - >The default configuration file **/etc/isulad/daemon.json** is for reference only. Configure it based on site requirements. - - -### Storage Description - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

| File | Directory | Description |
|------|-----------|-------------|
| \* | /etc/default/isulad/ | Stores the OCI configuration file and hook template file of iSulad. The file configuration permission is set to 0640, and the sysmonitor check permission is set to 0550. |
| \* | /etc/isulad/ | Default configuration files of iSulad and seccomp. |
| isulad.sock | /var/run/ | Pipe communication file, used for the communication between the client and iSulad. |
| isulad.pid | /var/run/ | File for storing the iSulad PIDs. It is also a file lock that prevents multiple iSulad instances from being started. |
| \* | /run/lxc/ | Lock file, created during iSulad running. |
| \* | /var/run/isulad/ | Real-time communication cache file, created during iSulad running. |
| \* | /var/run/isula/ | Real-time communication cache file, created during iSulad running. |
| \* | /var/lib/lcr/ | Temporary directory of the LCR component. |
| \* | /var/lib/isulad/ | Root directory where iSulad runs, which stores the created container configuration, default log path, database file, and mount points. /var/lib/isulad/mnt/: mount point of the container rootfs. /var/lib/isulad/engines/lcr/: directory for storing LCR container configurations; each container has a directory named after it. |
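The paths in the table can be checked quickly on a host. A small sketch that only reports presence (which paths exist depends on whether iSulad is installed and running):

```shell
# Report which iSulad-related paths from the table exist on this host.
for p in /etc/default/isulad /etc/isulad /var/run/isulad.sock \
         /var/run/isulad.pid /var/lib/lcr /var/lib/isulad; do
    if [ -e "$p" ]; then
        echo "$p: present"
    else
        echo "$p: absent"
    fi
done
```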
### Constraints

- In high concurrency scenarios (200 containers started concurrently), the memory management mechanism of glibc may cause memory holes and large virtual memory usage (for example, 10 GB). This problem is caused by a restriction of the glibc memory management mechanism under high concurrency, not by a memory leak, so memory consumption does not increase indefinitely. You can set the **MALLOC\_ARENA\_MAX** environment variable to reduce virtual memory usage and increase the rate at which physical memory is reclaimed. However, this environment variable degrades iSulad concurrency performance, so set it based on site requirements.

    ```
    To balance performance and memory usage, set MALLOC_ARENA_MAX to 4. (The iSulad performance on the ARM64 server is affected by less than 10%.)

    Configuration method:
    1. To manually start iSulad, run the export MALLOC_ARENA_MAX=4 command and then start iSulad.
    2. If systemd manages iSulad, you can modify the /etc/sysconfig/iSulad file by adding MALLOC_ARENA_MAX=4.
    ```

- Precautions for specifying the daemon running directories

    Take **--root** as an example. When **/new/path/** is used as the new daemon root directory, if a file already exists in **/new/path/** and its name conflicts with a directory or file that iSulad requires (for example, **engines** or **mnt**), iSulad may update the attributes of the original directory or file, including the owner and permissions.

    Therefore, note the impact of re-specifying various running directories and files on their attributes. You are advised to specify a new directory or file for iSulad to avoid file attribute changes and security issues caused by conflicts.

- Log file management:

    >![](./public_sys-resources/icon-notice.gif) **NOTICE:**
    >Log function interconnection: iSulad logs are managed by systemd and transmitted to rsyslogd. By default, rsyslog restricts the log writing speed.
You can add the configuration item **$imjournalRatelimitInterval 0** to the **/etc/rsyslog.conf** file and restart the rsyslogd service. - -- Restrictions on command line parameter parsing - - When the iSulad command line interface is used, the parameter parsing mode is slightly different from that of Docker. For flags with parameters in the command line, regardless of whether a long or short flag is used, only the first space after the flag or the character string after the equal sign \(=\) directly connected to the flag is used as the flag parameter. The details are as follows: - - 1. When a short flag is used, each character in the character string connected to the hyphen \(-\) is considered as a short flag. If there is an equal sign \(=\), the character string following the equal sign \(=\) is considered as the parameter of the short flag before the equal sign \(=\). - - **isula run -du=root busybox** is equivalent to **isula run -du root busybox**, **isula run -d -u=root busybox**, or **isula run -d -u root busybox**. When **isula run -du:root** is used, as **-:** is not a valid short flag, an error is reported. The preceding command is equivalent to **isula run -ud root busybox**. However, this method is not recommended because it may cause semantic problems. - - 1. When a long flag is used, the character string connected to **--** is regarded as a long flag. If the character string contains an equal sign \(=\), the character string before the equal sign \(=\) is a long flag, and the character string after the equal sign \(=\) is a parameter. - - ``` - isula run --user=root busybox - ``` - - or - - ``` - isula run --user root busybox - ``` - - -- After an iSulad container is started, you cannot run the **isula run -i/-t/-ti** and **isula attach/exec** commands as a non-root user. -- When iSulad connects to an OCI container, only kata-runtime can be used to start the OCI container. 
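Following the glibc constraint above, when systemd manages iSulad, the environment variable can be set in **/etc/sysconfig/iSulad**. A sketch (the OPTIONS value is an illustrative placeholder; keep your existing options):

```
# /etc/sysconfig/iSulad
# Cap glibc malloc arenas to curb virtual memory growth under high concurrency.
MALLOC_ARENA_MAX=4
OPTIONS='-l ERROR'
```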
### Daemon Multi-Port Binding

#### Description

The daemon can bind multiple UNIX sockets or TCP ports and listen on these ports. The client can interact with the daemon through any of them.

#### Port

Users can configure one or more ports in the hosts field in the **/etc/isulad/daemon.json** file, or choose not to specify hosts.

```
{
    "hosts": [
        "unix:///var/run/isulad.sock",
        "tcp://localhost:5678",
        "tcp://127.0.0.1:6789"
    ]
}
```

Users can also use the **-H** or **--host** option in the **/etc/sysconfig/iSulad** file to configure ports, or choose not to specify hosts.

```
OPTIONS='-H unix:///var/run/isulad.sock --host tcp://127.0.0.1:6789'
```

If hosts are not specified in either the **daemon.json** file or the iSulad options, the daemon listens on **unix:///var/run/isulad.sock** by default after startup.

#### Restrictions

- Users cannot specify hosts in the **/etc/isulad/daemon.json** and **/etc/sysconfig/iSulad** files at the same time. Otherwise, an error will occur and iSulad cannot be started.

    ```
    unable to configure the isulad with file /etc/isulad/daemon.json: the following directives are specified both as a flag and in the configuration file: hosts: (from flag: [unix:///var/run/isulad.sock tcp://127.0.0.1:6789], from file: [unix:///var/run/isulad.sock tcp://localhost:5678 tcp://127.0.0.1:6789])
    ```

- If the specified host is a UNIX socket, it must start with **unix://** followed by a valid absolute path.
- If the specified host is a TCP port, it must start with **tcp://** followed by a valid IP address and port number. The IP address can be that of the local host.
- A maximum of 10 valid ports can be specified. If more than 10 ports are specified, an error will occur and iSulad cannot be started.

### Configuring TLS Authentication and Enabling Remote Access

#### Description

iSulad is designed in client/server (C/S) mode.
By default, the iSulad daemon process listens only on the local **/var/run/isulad.sock** socket, so containers can be operated only through the local iSula client. To enable remote access from iSula, the iSulad daemon needs to listen on a remote access port over TCP/IP. However, if listening is enabled by simply configuring tcp://ip:port, any IP address can communicate with iSulad by calling **isula -H tcp://**_remote server IP address_**:port**, which may cause security problems. Therefore, it is recommended that the more secure Transport Layer Security (TLS) protocol be used for remote access.

#### Generating TLS Certificate

- Example of generating a plaintext private key and certificate

```
#!/bin/bash
set -e
echo -n "Enter pass phrase:"
read password
echo -n "Enter public network ip:"
read publicip
echo -n "Enter host:"
read HOST

echo " => Using hostname: $publicip, You MUST connect to iSulad using this host!"
- - mkdir -p $HOME/.iSulad - cd $HOME/.iSulad - rm -rf $HOME/.iSulad/* - - echo " => Generating CA key" - openssl genrsa -passout pass:$password -aes256 -out ca-key.pem 4096 - echo " => Generating CA certificate" - openssl req -passin pass:$password -new -x509 -days 365 -key ca-key.pem -sha256 -out ca.pem -subj "/C=CN/ST=zhejiang/L=hangzhou/O=Huawei/OU=iSulad/CN=iSulad@huawei.com" - echo " => Generating server key" - openssl genrsa -passout pass:$password -out server-key.pem 4096 - echo " => Generating server CSR" - openssl req -passin pass:$password -subj /CN=$HOST -sha256 -new -key server-key.pem -out server.csr - echo subjectAltName = DNS:$HOST,IP:$publicip,IP:127.0.0.1 >> extfile.cnf - echo extendedKeyUsage = serverAuth >> extfile.cnf - echo " => Signing server CSR with CA" - openssl x509 -req -passin pass:$password -days 365 -sha256 -in server.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out server-cert.pem -extfile extfile.cnf - echo " => Generating client key" - openssl genrsa -passout pass:$password -out key.pem 4096 - echo " => Generating client CSR" - openssl req -passin pass:$password -subj '/CN=client' -new -key key.pem -out client.csr - echo " => Creating extended key usage" - echo extendedKeyUsage = clientAuth > extfile-client.cnf - echo " => Signing client CSR with CA" - openssl x509 -req -passin pass:$password -days 365 -sha256 -in client.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out cert.pem -extfile extfile-client.cnf - rm -v client.csr server.csr extfile.cnf extfile-client.cnf - chmod -v 0400 ca-key.pem key.pem server-key.pem - chmod -v 0444 ca.pem server-cert.pem cert.pem - ``` - - -- Example of generating an encrypted private key and certificate request file - - ``` - #!/bin/bash - - echo -n "Enter public network ip:" - read publicip - echo -n "Enter pass phrase:" - read password - - # remove certificates from previous execution. 
rm -f *.pem *.srl *.csr *.cnf

# validity period (in days) for the generated certificates
DAYS=365

# generate CA private and public keys
echo 01 > ca.srl
openssl genrsa -aes256 -out ca-key.pem -passout pass:$password 2048
openssl req -subj '/C=CN/ST=zhejiang/L=hangzhou/O=Huawei/OU=iSulad/CN=iSulad@huawei.com' -new -x509 -days $DAYS -passin pass:$password -key ca-key.pem -out ca.pem

# create a server key and certificate signing request (CSR)
openssl genrsa -aes256 -out server-key.pem -passout pass:$password 2048
openssl req -new -key server-key.pem -out server.csr -passin pass:$password -subj '/CN=iSulad'

echo subjectAltName = DNS:iSulad,IP:${publicip},IP:127.0.0.1 > extfile.cnf
echo extendedKeyUsage = serverAuth >> extfile.cnf
# sign the server key with our CA
openssl x509 -req -days $DAYS -passin pass:$password -in server.csr -CA ca.pem -CAkey ca-key.pem -out server-cert.pem -extfile extfile.cnf

# create a client key and certificate signing request (CSR)
openssl genrsa -aes256 -out key.pem -passout pass:$password 2048
openssl req -subj '/CN=client' -new -key key.pem -out client.csr -passin pass:$password

# create an extensions config file and sign
echo extendedKeyUsage = clientAuth > extfile.cnf
openssl x509 -req -days $DAYS -passin pass:$password -in client.csr -CA ca.pem -CAkey ca-key.pem -out cert.pem -extfile extfile.cnf

# remove the passphrase from the client and server key
openssl rsa -in server-key.pem -out server-key.pem -passin pass:$password
openssl rsa -in key.pem -out key.pem -passin pass:$password

# remove generated files that are no longer required
rm -f ca-key.pem ca.srl client.csr extfile.cnf server.csr
```

#### APIs

```
{
    "tls": true,
    "tls-verify": true,
    "tls-config": {
        "CAFile": "/root/.iSulad/ca.pem",
        "CertFile": "/root/.iSulad/server-cert.pem",
        "KeyFile": "/root/.iSulad/server-key.pem"
    }
}
```

#### Restrictions

The server supports the following modes:

- Mode 1 (client verified): tlsverify, tlscacert, tlscert, tlskey
- Mode 2 (client not verified): tls, tlscert, tlskey

The client supports the following modes:

- Mode 1 (verify the identity based on the client certificate, and verify the server based on the specified CA): tlsverify, tlscacert, tlscert, tlskey
- Mode 2 (server verified): tlsverify, tlscacert

If the two-way authentication mode is used for communication, mode 1 is used for the server and mode 2 for the client.

If the unidirectional authentication mode is used for communication, mode 2 is used for both the server and the client.

>![](./public_sys-resources/icon-notice.gif) **NOTICE:**
>- If RPM is used for installation, the server configuration can be modified in the **/etc/isulad/daemon.json** and **/etc/sysconfig/iSulad** files.
>- Two-way authentication is recommended as it is more secure than non-authentication or unidirectional authentication.
>- gRPC open-source component logs are not taken over by iSulad. To view gRPC logs, set the environment variables **gRPC\_VERBOSITY** and **gRPC\_TRACE** as required.

#### Example

On the server:

```
 isulad -H=tcp://0.0.0.0:2376 --tlsverify --tlscacert ~/.iSulad/ca.pem --tlscert ~/.iSulad/server-cert.pem --tlskey ~/.iSulad/server-key.pem
```

On the client:

```
 isula version -H=tcp://$HOSTIP:2376 --tlsverify --tlscacert ~/.iSulad/ca.pem --tlscert ~/.iSulad/cert.pem --tlskey ~/.iSulad/key.pem
```

### devicemapper Storage Driver Configuration

To use the devicemapper storage driver, you need to configure a thinpool device, which requires an independent block device with sufficient free space. Take the independent block device **/dev/xvdf** as an example. The configuration method is as follows:

1. Configuring a thinpool

1. Stop the iSulad service.

```
# systemctl stop isulad
```

2. Create a logical volume manager (LVM) physical volume based on the block device.

```
# pvcreate /dev/xvdf
```

3. Create a volume group based on the created physical volume.
- - ``` - # vgcreate isula /dev/xvdf - Volume group "isula" successfully created: - ``` - -4. Create two logical volumes named **thinpool** and **thinpoolmeta**. - - ``` - # lvcreate --wipesignatures y -n thinpool isula -l 95%VG - Logical volume "thinpool" created. - ``` - - ``` - # lvcreate --wipesignatures y -n thinpoolmeta isula -l 1%VG - Logical volume "thinpoolmeta" created. - ``` - -5. Convert the two logical volumes into a thinpool and the metadata used by the thinpool. - - ``` - # lvconvert -y --zero n -c 512K --thinpool isula/thinpool --poolmetadata isula/thinpoolmeta - - WARNING: Converting logical volume isula/thinpool and isula/thinpoolmeta to - thin pool's data and metadata volumes with metadata wiping. - THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.) - Converted isula/thinpool to thin pool. - ``` - - -   - -2. Modifying the iSulad configuration files - -1. If iSulad has been used in the environment, back up the running data first. - - ``` - # mkdir /var/lib/isulad.bk - # mv /var/lib/isulad/* /var/lib/isulad.bk - ``` - -2. Modify configuration files. - - Two configuration methods are provided. Select one based on site requirements. - - - Edit the **/etc/isulad/daemon.json** file, set **storage-driver** to **devicemapper**, and set parameters related to the **storage-opts** field. For details about related parameters, see [Parameter Description](#en-us_topic_0222861454_section1712923715282). The following lists the configuration reference: - - ``` - { - "storage-driver": "devicemapper" - "storage-opts": [ - "dm.thinpooldev=/dev/mapper/isula-thinpool", - "dm.fs=ext4", - "dm.min_free_space=10%" - ] - } - ``` - - - You can also edit **/etc/sysconfig/iSulad** to explicitly specify related iSulad startup parameters. For details about related parameters, see [Parameter Description](#en-us_topic_0222861454_section1712923715282). 
          The following lists the configuration reference:

            ```
            OPTIONS="--storage-driver=devicemapper --storage-opt dm.thinpooldev=/dev/mapper/isula-thinpool --storage-opt dm.fs=ext4 --storage-opt dm.min_free_space=10%"
            ```

    3. Start iSulad for the settings to take effect.

        ```
        # systemctl start isulad
        ```

#### Parameter Description

For details about parameters supported by storage-opts, see [Table 1](#en-us_topic_0222861454_table3191161993812).

**Table 1** Parameter description

| Parameter | Mandatory or Not | Description |
| --------- | ---------------- | ----------- |
| dm.fs | Yes | Specifies the type of the file system used by a container. This parameter must be set to ext4, that is, dm.fs=ext4. |
| dm.basesize | No | Specifies the maximum storage space of a single container. The unit can be k, m, g, t, or p. An uppercase letter can also be used, for example, dm.basesize=50G. This parameter is valid only during the first initialization. |
| dm.mkfsarg | No | Specifies the additional mkfs parameters when a basic device is created. For example: dm.mkfsarg=-O ^has_journal |
| dm.mountopt | No | Specifies additional mount parameters when a container is mounted. For example: dm.mountopt=nodiscard |
| dm.thinpooldev | No | Specifies the thinpool device used for container or image storage. |
| dm.min_free_space | No | Specifies the minimum percentage of reserved space. For example, dm.min_free_space=10% indicates that storage-related operations such as container creation will fail when the remaining storage space falls below 10%. |
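Several of the options described above can be combined in the **storage-opts** field of **/etc/isulad/daemon.json**. The following fragment is an illustrative sketch: the **dm.basesize** and **dm.mountopt** values are examples, not recommendations, and the thinpool device name assumes the **isula-thinpool** created earlier.

```json
{
    "storage-driver": "devicemapper",
    "storage-opts": [
        "dm.thinpooldev=/dev/mapper/isula-thinpool",
        "dm.fs=ext4",
        "dm.basesize=50G",
        "dm.mountopt=nodiscard",
        "dm.min_free_space=10%"
    ]
}
```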
#### Precautions

- When configuring devicemapper, if the system does not have sufficient space for automatic capacity expansion of the thinpool, disable the automatic capacity expansion function.

    To disable automatic capacity expansion, set both **thin\_pool\_autoextend\_threshold** and **thin\_pool\_autoextend\_percent** in the **/etc/lvm/profile/isula-thinpool.profile** file to **100**.

    ```
    activation {
        thin_pool_autoextend_threshold=100
        thin_pool_autoextend_percent=100
    }
    ```

- When devicemapper is used, use Ext4 as the container file system. That is, add **--storage-opt dm.fs=ext4** to the iSulad configuration parameters.
- If the graphdriver is devicemapper and the metadata files are damaged, the files cannot be restored automatically; you need to restore them manually. Do not directly operate on or tamper with the metadata of the devicemapper storage driver in Docker daemon.
- When the devicemapper LVM is used and the thinpool is damaged due to an abnormal power-off, neither data integrity nor the restorability of the damaged thinpool can be guaranteed. In this case, you need to rebuild the thinpool.

**Precautions for Switching the devicemapper Storage Pool When the User Namespace Feature Is Enabled on iSula**

- Generally, the path of the deviceset-metadata file is **/var/lib/isulad/devicemapper/metadata/deviceset-metadata** during container startup.
- If user namespaces are used, the path of the deviceset-metadata file is **/var/lib/isulad/**_userNSUID.GID_**/devicemapper/metadata/deviceset-metadata**.
- When the devicemapper storage driver is used and a container is switched between the user namespace scenario and the common scenario, the **BaseDeviceUUID** content in the corresponding deviceset-metadata file needs to be cleared. You also need to clear the **BaseDeviceUUID** content in the deviceset-metadata file when expanding or rebuilding the thinpool. Otherwise, the iSulad service fails to restart.
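Clearing the **BaseDeviceUUID** field can be scripted. The following sketch demonstrates the idea on a throwaway sample file only: the real file path is the one given above, the surrounding JSON field names in the sample are hypothetical placeholders, and before touching the real metadata you must stop the iSulad service and back the file up.

```shell
# Work on a throwaway copy; the real file is
# /var/lib/isulad/devicemapper/metadata/deviceset-metadata
# (or the userNSUID.GID variant when user namespaces are enabled).
# Stop iSulad and back the file up before editing the real metadata.
meta=$(mktemp)

# Sample content: field names other than BaseDeviceUUID are hypothetical.
printf '%s' '{"next-id":3,"BaseDeviceUUID":"f00dfeed-1234","fs":"ext4"}' > "$meta"

# Blank the BaseDeviceUUID value while keeping the rest of the file intact.
sed -i 's/"BaseDeviceUUID":"[^"]*"/"BaseDeviceUUID":""/' "$meta"

cat "$meta"   # -> {"next-id":3,"BaseDeviceUUID":"","fs":"ext4"}
rm -f "$meta"
```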
- diff --git a/docs/en/docs/Container/interconnecting-isula-shim-v2-with-stratovirt.md b/docs/en/docs/Container/interconnecting-isula-shim-v2-with-stratovirt.md index 180081851db7c53da48c45ade53e9d0d1ccfc85a..649c5ccadabe6183ce2fa013d1640dfb61991c8e 100644 --- a/docs/en/docs/Container/interconnecting-isula-shim-v2-with-stratovirt.md +++ b/docs/en/docs/Container/interconnecting-isula-shim-v2-with-stratovirt.md @@ -4,11 +4,11 @@ shim v2 is a next-generation shim solution. Compared with shim v1, shim v2 features shorter call chains, clearer architecture, and lower memory overhead in multi-service container scenarios. iSula can run secure containers through isulad-shim or containerd-shim-kata-v2. The isulad-shim component is the implementation of the shim v1 solution, and the containerd-shim-kata-v2 component is the implementation of the shim v2 solution in the secure container scenario. This document describes how to interconnect iSula with containerd-shim-kata-v2. -## Interconnecting with containerd-shim-v2-kata +## Interconnecting with containerd-shim-kata-v2 ### Prerequisites -Before interconnecting iSula with containerd-shim-v2-kata, ensure that the following prerequisites are met: +Before interconnecting iSula with containerd-shim-kata-v2, ensure that the following prerequisites are met: - iSulad and kata-containers have been installed. - StratoVirt supports only the devicemapper storage driver. Therefore, you need to configure the devicemapper environment and ensure that the devicemapper storage driver used by iSulad works properly. diff --git a/docs/en/docs/Container/isula-build.md b/docs/en/docs/Container/isula-build.md index 011d7acf83879834de58d96fc4ee5f1c97193480..a26ac516a723e384b765671803b5aafcb0a6be6a 100644 --- a/docs/en/docs/Container/isula-build.md +++ b/docs/en/docs/Container/isula-build.md @@ -11,1034 +11,3 @@ The isula-build uses the server/client mode. 
The isula-build functions as a clie >![](./public_sys-resources/icon-note.gif) **Note:** > > Currently, isula-build supports OCI image format ([OCI Image Format Specification](https://github.com/opencontainers/image-spec/blob/main/spec.md/)) and Docker image format ([Image Manifest Version 2, Schema 2](https://docs.docker.com/registry/spec/manifest-v2-2/)). Use the `export ISULABUILD_CLI_EXPERIMENTAL=enabled` command to enable the experimental feature for supporting OCI image format. When the experimental feature is disabled, isula-build will take Docker image format as the default image format. Otherwise, isula-build will take OCI image format as the default image format. - -## Installation - -### Preparations - -To ensure that isula-build can be successfully installed, the following software and hardware requirements must be met: - -- Supported architectures: x86_64 and AArch64 -- Supported OS: openEuler -- You have the permissions of the root user. - -#### Installing isula-build - -Before using isula-build to build a container image, you need to install the following software packages: - -##### (Recommended) Method 1: Using Yum - -1. Configure the openEuler Yum source. - -2. Log in to the target server as the root user and install isula-build. - - ```shell - sudo yum install -y isula-build - ``` - -##### Method 2: Using the RPM Package - -1. Obtain an **isula-build-*.rpm** installation package from the openEuler Yum source, for example, **isula-build-0.9.6-4.oe1.x86_64.rpm**. - -2. Upload the obtained RPM software package to any directory on the target server, for example, **/home/**. - -3. Log in to the target server as the root user and run the following command to install isula-build: - - ```shell - sudo rpm -ivh /home/isula-build-*.rpm - ``` - ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> After the installation is complete, you need to manually start the isula-build service. 
For details about how to start the service, see [Managing the isula-build Service](#managing-the-isula-build-service).

## Configuring and Managing the isula-build Service

### Configuring the isula-build Service

After the isula-build software package is installed, systemd starts the isula-build service based on the default configuration contained in the isula-build software package on the isula-build server. If the default configuration file on the isula-build server cannot meet your requirements, perform the following operations to customize the configuration file. After the default configuration is modified, restart the isula-build server for the new configuration to take effect. For details, see [Managing the isula-build Service](#managing-the-isula-build-service).

Currently, the isula-build server contains the following configuration files:

- **/etc/isula-build/configuration.toml**: general isula-builder configuration file, which is used to set the isula-builder log level, persistency directory, runtime directory, and OCI runtime. Parameters in the configuration file are described as follows:

| Configuration Item | Mandatory or Optional | Description | Value |
| --------- | -------- | --------------------------------- | ----------------------------------------------- |
| debug | Optional | Indicates whether to enable the debug log function. | **true**: Enables the debug log function. **false**: Disables the debug log function. |
| loglevel | Optional | Sets the log level. | debug, info, warn, or error |
| run_root | Mandatory | Sets the root directory of runtime data. | For example, **/var/run/isula-build/** |
| data_root | Mandatory | Sets the local persistency directory. | For example, **/var/lib/isula-build/** |
| runtime | Optional | Sets the runtime type. Currently, only **runc** is supported. | runc |
| group | Optional | Sets the owner group for the local socket file **isula_build.sock** so that non-privileged users in the group can use isula-build. | isula |
| experimental | Optional | Indicates whether to enable experimental features. | **true**: Enables experimental features. **false**: Disables experimental features. |

- **/etc/isula-build/storage.toml**: configuration file for local persistent storage, including the configuration of the storage driver in use.

| Configuration Item | Mandatory or Optional | Description |
| ------ | -------- | ------------------------------ |
| driver | Optional | Storage driver type. Currently, **overlay2** is supported. |

For more settings, see [containers-storage.conf.5](https://github.com/containers/storage/blob/main/docs/containers-storage.conf.5.md).

- **/etc/isula-build/registries.toml**: configuration file for each image repository.

| Configuration Item | Mandatory or Optional | Description |
| ------------------- | -------- | ------------------------------------------------------------ |
| registries.search | Optional | Search domain of the image repository. Only listed image repositories can be found. |
| registries.insecure | Optional | Accessible insecure image repositories. Listed image repositories cannot pass the authentication and are not recommended. |

For more settings, see [containers-registries.conf.5](https://github.com/containers/image/blob/main/docs/containers-registries.conf.5.md).

- **/etc/isula-build/policy.json**: image pull/push policy file. Currently, this file cannot be configured.
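Collecting the parameters above, a complete **/etc/isula-build/configuration.toml** might look like the following sketch. The values shown mirror the defaults mentioned in this document (for example, the default dataroot, runroot, and owner group of isula-builder); they are illustrative, not mandatory settings.

```toml
# /etc/isula-build/configuration.toml (illustrative sketch)
debug = false
loglevel = "info"                    # debug, info, warn, or error
run_root = "/var/run/isula-build/"   # root directory of runtime data
data_root = "/var/lib/isula-build/"  # local persistency directory
runtime = "runc"
group = "isula"                      # owner group of isula_build.sock
experimental = false
```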
- ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> - isula-build supports the preceding configuration file with the maximum size of 1 MB. -> - The persistent working directory dataroot cannot be configured on the memory disk, for example, tmpfs. -> - Currently, only overlay2 can be used as the underlying storage driver. -> - Before setting the `--group` option, ensure that the corresponding user group has been created on a local OS and non-privileged users have been added to the group. After isula-builder is restarted, non-privileged users in the group can use the isula-build function. In addition, to ensure permission consistency, the owner group of the isula-build configuration file directory **/etc/isula-build** is set to the group specified by `--group`. - -### Managing the isula-build Service - -Currently, openEuler uses systemd to manage the isula-build service. The isula-build software package contains the systemd service files. After installing the isula-build software package, you can use the systemd tool to start or stop the isula-build service. You can also manually start the isula-builder software. Note that only one isula-builder process can be started on a node at a time. - ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> Only one isula-builder process can be started on a node at a time. 
- -#### (Recommended) Using systemd for Management - -You can run the following systemd commands to start, stop, and restart the isula-build service: - -- Run the following command to start the isula-build service: - - ```sh - sudo systemctl start isula-build.service - ``` - -- Run the following command to stop the isula-build service: - - ```sh - sudo systemctl stop isula-build.service - ``` - -- Run the following command to restart the isula-build service: - - ```sh - sudo systemctl restart isula-build.service - ``` - -The systemd service file of the isula-build software installation package is stored in the `/usr/lib/systemd/system/isula-build.service` directory. If you need to modify the systemd configuration of the isula-build service, modify the file and run the following command to make the modification take effect. Then restart the isula-build service based on the systemd management command. - -```sh -sudo systemctl daemon-reload -``` - -#### Directly Running isula-builder - -You can also run the `isula-builder` command on the server to start the service. The `isula-builder` command can contain flags for service startup. The following flags are supported: - -- `-D, --debug`: whether to enable the debugging mode. -- `--log-level`: log level. The options are **debug**, **info**, **warn**, and **error**. The default value is **info**. -- `--dataroot`: local persistency directory. The default value is **/var/lib/isula-build/**. -- `--runroot`: runtime directory. The default value is **/var/run/isula-build/**. -- `--storage-driver`: underlying storage driver type. -- `--storage-opt`: underlying storage driver configuration. -- `--group`: sets the owner group for the local socket file **isula_build.sock** so that non-privileged users in the group can use isula-build. The default owner group is **isula**. -- `--experimental`: whether to enable experimental features. 
- ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> If the command line parameters contain the same configuration items as those in the configuration file, the command line parameters are preferentially used for startup. - -Start the isula-build service. For example, to specify the local persistency directory **/var/lib/isula-build** and disable debugging, run the following command: - -```sh -sudo isula-builder --dataroot "/var/lib/isula-build" --debug=false -``` - -## Usage Guidelines - -### Prerequisites - -isula-build depends on the executable file **runc** to build the **RUN** instruction in the Dockerfile. Therefore, runc must be pre-installed in the running environment of isula-build. The installation method depends on the application scenario. If you do not need to use the complete docker-engine tool chain, you can install only the docker-runc RPM package. - -```sh -sudo yum install -y docker-runc -``` - -If you need to use a complete docker-engine tool chain, install the docker-engine RPM package, which contains the executable file **runc** by default. - -```sh -sudo yum install -y docker-engine -``` - ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> Ensure the security of OCI runtime (runc) executable files to prevent malicious replacement. - -### Overview - -The isula-build client provides a series of commands for building and managing container images. Currently, the isula-build client provides the following commands: - -- `ctr-img`: manages container images. The `ctr-img` command contains the following subcommands: - - `build`: builds a container image based on the specified Dockerfile. - - `images`: lists local container images. - - `import`: imports a basic container image. - - `load`: imports a cascade image. - - `rm`: deletes a local container image. - - `save`: exports a cascade image to a local disk. - - `tag`: adds a tag to a local container image. - - `pull`: pulls an image to a local host. 
- `push`: pushes a local image to a remote repository.
- `info`: displays the running environment and system information of isula-build.
- `login`: logs in to the remote container image repository.
- `logout`: logs out of the remote container image repository.
- `version`: displays the versions of isula-build and isula-builder.
- `manifest` (experimental): manages the manifest list.

>![](./public_sys-resources/icon-note.gif) **Note:**
>
> - The `isula-build completion` and `isula-builder completion` commands are used to generate the bash command completion script. These commands are implicitly provided by the command line framework and are not displayed in the help information.
> - The isula-build client does not have any configuration file. To use isula-build experimental features, enable the environment variable **ISULABUILD_CLI_EXPERIMENTAL** on the client using the `export ISULABUILD_CLI_EXPERIMENTAL=enabled` command.

The following describes how to use these commands in detail.

### ctr-img: Container Image Management

The isula-build command groups all container image management commands into the `ctr-img` command. The command format is as follows:

```shell
isula-build ctr-img [command]
```

#### build: Container Image Build

The `build` subcommand of the `ctr-img` command is used to build container images. The command format is as follows:

```shell
isula-build ctr-img build [flags]
```

The `build` command contains the following flags:

- `--build-arg`: string list containing variables required during the build process.
- `--build-static`: key value, which is used to build binary equivalence. Currently, the following key value is supported:
    - `build-time`: string indicating that a container image is built at a specified timestamp. The timestamp format is *YYYY-MM-DD HH-MM-SS*.
- `-f, --filename`: string indicating the path of the Dockerfile. If this parameter is not specified, the current path is used.
-- `--format`: string indicating the image format **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled). -- `--iidfile`: string indicating a local file to which the ID of the image is output. -- `-o, --output`: string indicating the image export mode and path. -- `--proxy`: boolean, which inherits the proxy environment variable on the host. The default value is **true**. -- `--tag`: string indicating the tag value of the image that is successfully built. -- `--cap-add`: string list containing permissions required by the **RUN** instruction during the build process. - -**The following describes the flags in detail.** - -##### \--build-arg - -Parameters in the Dockerfile are inherited from the commands. The usage is as follows: - -```sh -$ echo "This is bar file" > bar.txt -$ cat Dockerfile_arg -FROM busybox -ARG foo -ADD ${foo}.txt . -RUN cat ${foo}.txt -$ sudo isula-build ctr-img build --build-arg foo=bar -f Dockerfile_arg -STEP 1: FROM busybox -Getting image source signatures -Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad -Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b -Writing manifest to image destination -Storing signatures -STEP 2: ARG foo -STEP 3: ADD ${foo}.txt . -STEP 4: RUN cat ${foo}.txt -This is bar file -Getting image source signatures -Copying blob sha256:6194458b07fcf01f1483d96cd6c34302ffff7f382bb151a6d023c4e80ba3050a -Copying blob sha256:6bb56e4a46f563b20542171b998cb4556af4745efc9516820eabee7a08b7b869 -Copying config sha256:39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2 -Writing manifest to image destination -Storing signatures -Build success with image id: 39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2 -``` - -##### \--build-static - -Specifies a static build. 
That is, when isula-build is used to build a container image, differences between all timestamps and other build factors (such as the container ID and hostname) are eliminated, and a container image that meets the static requirements is finally built.

When isula-build is used to build a container image, assume that a fixed timestamp is given to the `build` subcommand and the following conditions are met:

- The build environment is consistent before and after the upgrade.
- The Dockerfile is consistent before and after the build.
- The intermediate data generated before and after the build is consistent.
- The build commands are the same.
- The versions of the third-party libraries are the same.

Then, for the same Dockerfile, the image content and image ID generated by multiple builds are the same.

`--build-static` supports the key-value pair option in the *key=value* format. Currently, the following option is supported:

- build-time: string, which indicates the fixed timestamp for building a static image. The value is in the format of *YYYY-MM-DD HH-MM-SS*. The timestamp affects the creation and modification time attributes of files at the diff layer.

    Example:

    ```sh
    sudo isula-build ctr-img build -f Dockerfile --build-static='build-time=2020-05-23 10:55:33' .
    ```

    In this way, the container images and image IDs built in the same environment multiple times are the same.

##### \--format

This option can be used when the experimental feature is enabled. The default image format is **oci**. You can specify the image format to build. For example, the following commands build an OCI image and a Docker image, respectively.

```sh
export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format oci .
```

```sh
export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format docker .
```

##### \--iidfile

Run the following command to output the ID of the built image to a file:

```shell
isula-build ctr-img build --iidfile filename
```

For example, to export the container image ID to the **testfile** file, run the following command:

```sh
sudo isula-build ctr-img build -f Dockerfile_arg --iidfile testfile
```

Check the container image ID in the **testfile** file.

```sh
$ cat testfile
76cbeed38a8e716e22b68988a76410eaf83327963c3b29ff648296d5cd15ce7b
```

##### \-o, --output

Currently, `-o` and `--output` support the following formats:

- `isulad:image:tag`: directly pushes the image that is successfully built to iSulad, for example, `-o isulad:busybox:latest`. The following restrictions apply:

    - isula-build and iSulad must be on the same node.
    - The tag must be configured.
    - On the isula-build client, the successfully built image is temporarily saved as **/var/tmp/isula-build-tmp-%v.tar** and then imported to iSulad. Ensure that the **/var/tmp/** directory has sufficient disk space.

- `docker-daemon:image:tag`: directly pushes the successfully built image to Docker daemon, for example, `-o docker-daemon:busybox:latest`. The following restrictions apply:

    - isula-build and Docker must be on the same node.
    - The tag must be configured.

- `docker://registry.example.com/repository:tag`: directly pushes the successfully built image to the remote image repository in Docker image format, for example, `-o docker://localhost:5000/library/busybox:latest`.

- `docker-archive:/:image:tag`: saves the successfully built image to the local host in Docker image format, for example, `-o docker-archive:/root/image.tar:busybox:latest`.
When the experimental feature is enabled, you can also build an image in OCI image format with:

- `oci://registry.example.com/repository:tag`: directly pushes the successfully built image to the remote image repository in OCI image format (the remote repository must support the OCI image format), for example, `-o oci://localhost:5000/library/busybox:latest`.

- `oci-archive:/:image:tag`: saves the successfully built image to the local host in OCI image format, for example, `-o oci-archive:/root/image.tar:busybox:latest`.

In addition to the flags, the `build` subcommand also supports a string argument whose meaning is context, that is, the context of the Dockerfile build environment. The default value of this argument is the current path where isula-build is executed. This path affects the paths retrieved by the **ADD** and **COPY** instructions of the Dockerfile as well as the lookup of the .dockerignore file.

##### \--proxy

Specifies whether the container started by the **RUN** instruction inherits the proxy-related environment variables **http_proxy**, **https_proxy**, **ftp_proxy**, **no_proxy**, **HTTP_PROXY**, **HTTPS_PROXY**, and **FTP_PROXY**. The default value is **true**.

When a user configures proxy-related **ARG** or **ENV** in the Dockerfile, the inherited environment variables are overwritten.

>![](./public_sys-resources/icon-note.gif) **Note:**
>
> If the client and daemon are running on different terminals, the environment variables of the terminal where the daemon is running are inherited.

##### \--tag

Specifies the tag of the image stored on the local disk after the image is successfully built.
##### \--cap-add

Run the following command to add the permissions required by the **RUN** instruction during the build process:

```shell
isula-build ctr-img build --cap-add ${CAP}
```

Example:

```sh
sudo isula-build ctr-img build --cap-add CAP_SYS_ADMIN --cap-add CAP_SYS_PTRACE -f Dockerfile
```

> **Note:**
>
> - A maximum of 100 container images can be concurrently built.
> - isula-build supports Dockerfiles with a maximum size of 1 MB.
> - isula-build supports a .dockerignore file with a maximum size of 1 MB.
> - Ensure that only the current user has the read and write permissions on the Dockerfiles to prevent other users from tampering with the files.
> - During the build, the **RUN** instruction starts a container and performs the build inside the container. Currently, isula-build supports the host network only.
> - isula-build supports only the tar compression format.
> - isula-build commits once after each image build stage is complete, instead of each time a Dockerfile line is executed.
> - isula-build does not support cached builds.
> - isula-build starts the build container only when the **RUN** instruction is built.
> - Currently, the history function of Docker images is not supported.
> - The stage name can start with a digit.
> - The stage name can contain a maximum of 64 characters.
> - isula-build does not support resource restriction on a single Dockerfile build. If resource restriction is required, you can configure a resource limit on isula-builder.
> - Currently, isula-build does not support a remote URL as the data source of the **ADD** instruction in the Dockerfile.
> - The local tar packages exported using the **docker-archive** and **oci-archive** types are not compressed. You can manually compress them as required.

#### images: Viewing Local Persistent Build Images

You can run the `images` command to view the images in the local persistent storage.
- -```sh -$ sudo isula-build ctr-img images ---------------------------------------- ----------- ----------------- ------------------------ ------------ -REPOSITORY TAG IMAGE ID CREATED SIZE ---------------------------------------- ----------- ----------------- ------------------------ ------------ -localhost:5000/library/alpine latest a24bb4013296 2022-01-17 10:02:19 5.85 MB - 39b62a3342ee 2022-01-17 10:01:12 1.45 MB ---------------------------------------- ----------- ----------------- ------------------------ ------------ -``` - ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> The image size displayed by running the `isula-build ctr-img images` command may be different from that displayed by running the `docker images` command. When calculating the image size, `isula-build` directly calculates the total size of .tar packages at each layer, while `docker` calculates the total size of files by decompressing the .tar packages and traversing the diff directory. Therefore, the statistics are different. - -#### import: Importing a Basic Container Image - -A tar file in rootfs form can be imported into isula-build via the `ctr-img import` command. 
- -The command format is as follows: - -```shell -isula-build ctr-img import [flags] -``` - -Example: - -```sh -$ sudo isula-build ctr-img import busybox.tar mybusybox:latest -Getting image source signatures -Copying blob sha256:7b8667757578df68ec57bfc9fb7754801ec87df7de389a24a26a7bf2ebc04d8d -Copying config sha256:173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1 -Writing manifest to image destination -Storing signatures -Import success with image id: "173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1" -$ sudo isula-build ctr-img images ---------------------------------------- ----------- ----------------- ------------------------ ------------ -REPOSITORY TAG IMAGE ID CREATED SIZE ---------------------------------------- ----------- ----------------- ------------------------ ------------ -mybusybox latest 173b3cf612f8 2022-01-12 16:02:31 1.47 MB ---------------------------------------- ----------- ----------------- ------------------------ ------------ -``` - ->![](./public_sys-resources/icon-note.gif) **Note** -> -> isula-build supports the import of container basic images with a maximum size of 1 GB. - -#### load: Importing Cascade Images - -Cascade images are images that are saved to the local computer by running the `docker save` or `isula-build ctr-img save` command. The compressed image package contains a layer-by-layer image package named **layer.tar**. You can run the `ctr-img load` command to import the image to isula-build. - -The command format is as follows: - -```shell -isula-build ctr-img load [flags] -``` - -Currently, the following flags are supported: - -- `-i, --input`: path of the local .tar package. 
- -Example: - -```sh -$ sudo isula-build ctr-img load -i ubuntu.tar -Getting image source signatures -Copying blob sha256:cf612f747e0fbcc1674f88712b7bc1cd8b91cf0be8f9e9771235169f139d507c -Copying blob sha256:f934e33a54a60630267df295a5c232ceb15b2938ebb0476364192b1537449093 -Copying blob sha256:943edb549a8300092a714190dfe633341c0ffb483784c4fdfe884b9019f6a0b4 -Copying blob sha256:e7ebc6e16708285bee3917ae12bf8d172ee0d7684a7830751ab9a1c070e7a125 -Copying blob sha256:bf6751561805be7d07d66f6acb2a33e99cf0cc0a20f5fd5d94a3c7f8ae55c2a1 -Copying blob sha256:c1bd37d01c89de343d68867518b1155cb297d8e03942066ecb44ae8f46b608a3 -Copying blob sha256:a84e57b779297b72428fc7308e63d13b4df99140f78565be92fc9dbe03fc6e69 -Copying blob sha256:14dd68f4c7e23d6a2363c2320747ab88986dfd43ba0489d139eeac3ac75323b2 -Copying blob sha256:a2092d776649ea2301f60265f378a02405539a2a68093b2612792cc65d00d161 -Copying blob sha256:879119e879f682c04d0784c9ae7bc6f421e206b95d20b32ce1cb8a49bfdef202 -Copying blob sha256:e615448af51b848ecec00caeaffd1e30e8bf5cffd464747d159f80e346b7a150 -Copying blob sha256:f610bd1e9ac6aa9326d61713d552eeefef47d2bd49fc16140aa9bf3db38c30a4 -Copying blob sha256:bfe0a1336d031bf5ff3ce381e354be7b2bf310574cc0cd1949ad94dda020cd27 -Copying blob sha256:f0f15db85788c1260c6aa8ad225823f45c89700781c4c793361ac5fa58d204c7 -Copying config sha256:c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e -Writing manifest to image destination -Storing signatures -Loaded image as c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e -``` - ->![](./public_sys-resources/icon-note.gif) **Note:** -> -> - isula-build allows you to import a container image with a maximum size of 50 GB. -> - isula-build automatically recognizes the image format and loads it from the cascade image file. - -#### rm: Deleting a Local Persistent Image - -You can run the `rm` command to delete an image from the local persistent storage. 
The command format is as follows: - -```shell -isula-build ctr-img rm IMAGE [IMAGE...] [FLAGS] -``` - -Currently, the following flags are supported: - -- `-a, --all`: deletes all images stored locally. -- `-p, --prune`: deletes all images that are stored locally and do not have tags. - -Example: - -```sh -$ sudo isula-build ctr-img rm -p -Deleted: sha256:78731c1dde25361f539555edaf8f0b24132085b7cab6ecb90de63d72fa00c01d -Deleted: sha256:eeba1bfe9fca569a894d525ed291bdaef389d28a88c288914c1a9db7261ad12c -``` - -#### save: Exporting Cascade Images - -You can run the `save` command to export the cascade images to the local disk. The command format is as follows: - -```shell -isula-build ctr-img save [REPOSITORY:TAG]|imageID -o xx.tar -``` - -Currently, the following flags are supported: - -- `-f, --format`: which indicates the exported image format: **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled) -- `-o, --output`: which indicates the local path for storing the exported images. 
- -The following example shows how to export an image using *image/tag*: - -```sh -$ sudo isula-build ctr-img save busybox:latest -o busybox.tar -Getting image source signatures -Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a -Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f -Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef -Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 -Writing manifest to image destination -Storing signatures -Save success with image: busybox:latest -``` - -The following example shows how to export an image using *ImageID*: - -```sh -$ sudo isula-build ctr-img save 21c3e96ac411 -o busybox.tar -Getting image source signatures -Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a -Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f -Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef -Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 -Writing manifest to image destination -Storing signatures -Save success with image: 21c3e96ac411 -``` - -The following example shows how to export multiple images to the same tarball: - -```sh -$ sudo isula-build ctr-img save busybox:latest nginx:latest -o all.tar -Getting image source signatures -Copying blob sha256:eb78099fbf7fdc70c65f286f4edc6659fcda510b3d1cfe1caa6452cc671427bf -Copying blob sha256:29f11c413898c5aad8ed89ad5446e89e439e8cfa217cbb404ef2dbd6e1e8d6a5 -Copying blob sha256:af5bd3938f60ece203cd76358d8bde91968e56491daf3030f6415f103de26820 -Copying config sha256:b8efb18f159bd948486f18bd8940b56fd2298b438229f5bd2bcf4cedcf037448 -Writing manifest to image destination -Storing signatures -Getting image source signatures -Copying blob sha256:e2d6930974a28887b15367769d9666116027c411b7e6c4025f7c850df1e45038 -Copying config 
sha256:a33de3c85292c9e65681c2e19b8298d12087749b71a504a23c576090891eedd6
-Writing manifest to image destination
-Storing signatures
-Save success with image: [busybox:latest nginx:latest]
-```
-
->![](./public_sys-resources/icon-note.gif) **NOTE:**
->
->- Save exports an image in .tar format by default. If necessary, you can save the image and then manually compress it.
->- When exporting an image using its image name, specify the entire image name in the *REPOSITORY:TAG* format.
-
-#### tag: Tagging Local Persistent Images
-
-You can run the `tag` command to add a tag to a local persistent container image. The command format is as follows:
-
-```shell
-isula-build ctr-img tag <imageID>/<imageName> busybox:latest
-```
-
-Example:
-
-```sh
-$ sudo isula-build ctr-img images
----------------------------------------  -----------  -----------------  --------------------------  ------------
-REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
----------------------------------------  -----------  -----------------  --------------------------  ------------
-alpine  latest  a24bb4013296  2020-05-29 21:19:46  5.85 MB
----------------------------------------  -----------  -----------------  --------------------------  ------------
-$ sudo isula-build ctr-img tag a24bb4013296 alpine:v1
-$ sudo isula-build ctr-img images
----------------------------------------  -----------  -----------------  ------------------------  ------------
-REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
----------------------------------------  -----------  -----------------  ------------------------  ------------
-alpine  latest  a24bb4013296  2020-05-29 21:19:46  5.85 MB
-alpine  v1  a24bb4013296  2020-05-29 21:19:46  5.85 MB
----------------------------------------  -----------  -----------------  ------------------------  ------------
-```
-
-#### pull: Pulling an Image to a Local Host
-
-Run the `pull` command to pull an image from a remote image repository to a local host.
Command format:
-
-```shell
-isula-build ctr-img pull REPOSITORY[:TAG]
-```
-
-Example:
-
-```sh
-$ sudo isula-build ctr-img pull example-registry/library/alpine:latest
-Getting image source signatures
-Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad
-Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b
-Writing manifest to image destination
-Storing signatures
-Pull success with image: example-registry/library/alpine:latest
-```
-
-#### push: Pushing a Local Image to a Remote Repository
-
-Run the `push` command to push a local image to a remote repository. Command format:
-
-```shell
-isula-build ctr-img push REPOSITORY[:TAG]
-```
-
-Currently, the following flags are supported:
-
-- `-f, --format`: indicates the pushed image format, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled).
-
-Example:
-
-```sh
-$ sudo isula-build ctr-img push example-registry/library/mybusybox:latest
-Getting image source signatures
-Copying blob sha256:d2421964bad195c959ba147ad21626ccddc73a4f2638664ad1c07bd9df48a675
-Copying config sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f
-Writing manifest to image destination
-Storing signatures
-Push success with image: example-registry/library/mybusybox:latest
-```
-
->![](./public_sys-resources/icon-note.gif) **NOTE:**
->
->- Before pushing an image, log in to the corresponding image repository.
-
-### info: Viewing the Operating Environment and System Information
-
-You can run the `isula-build info` command to view the running environment and system information of isula-build. The command format is as follows:
-
-```shell
- isula-build info [flags]
-```
-
-The following flags are supported:
-
-- `-H, --human-readable`: Boolean. Prints memory information in a human-readable format; values are calculated as powers of 1000.
-- `-V, --verbose`: Boolean. Also displays memory usage information while the system is running.
- -Example: - -```sh -$ sudo isula-build info -H - General: - MemTotal: 7.63 GB - MemFree: 757 MB - SwapTotal: 8.3 GB - SwapFree: 8.25 GB - OCI Runtime: runc - DataRoot: /var/lib/isula-build/ - RunRoot: /var/run/isula-build/ - Builders: 0 - Goroutines: 12 - Store: - Storage Driver: overlay - Backing Filesystem: extfs - Registry: - Search Registries: - oepkgs.net - Insecure Registries: - localhost:5000 - oepkgs.net - Runtime: - MemSys: 68.4 MB - HeapSys: 63.3 MB - HeapAlloc: 7.41 MB - MemHeapInUse: 8.98 MB - MemHeapIdle: 54.4 MB - MemHeapReleased: 52.1 MB -``` - -### login: Logging In to the Remote Image Repository - -You can run the `login` command to log in to the remote image repository. The command format is as follows: - -```shell - isula-build login SERVER [FLAGS] -``` - -Currently, the following flags are supported: - -```shell - Flags: - -p, --password-stdin Read password from stdin - -u, --username string Username to access registry -``` - -Enter the password through stdin. In the following example, the password in creds.txt is transferred to the stdin of isula-build through a pipe for input. - -```sh - $ cat creds.txt | sudo isula-build login -u cooper -p mydockerhub.io - Login Succeeded -``` - -Enter the password in interactive mode. - -```sh - $ sudo isula-build login mydockerhub.io -u cooper - Password: - Login Succeeded -``` - -### logout: Logging Out of the Remote Image Repository - -You can run the `logout` command to log out of the remote image repository. The command format is as follows: - -```shell - isula-build logout [SERVER] [FLAGS] -``` - -Currently, the following flags are supported: - -```shell - Flags: - -a, --all Logout all registries -``` - -Example: - -```sh - $ sudo isula-build logout -a - Removed authentications -``` - -### version: Querying the isula-build Version - -You can run the `version` command to view the current version information. 
-
-```sh
-$ sudo isula-build version
-Client:
-  Version: 0.9.6-4
-  Go Version: go1.15.7
-  Git Commit: 83274e0
-  Built: Wed Jan 12 15:32:55 2022
-  OS/Arch: linux/amd64
-
-Server:
-  Version: 0.9.6-4
-  Go Version: go1.15.7
-  Git Commit: 83274e0
-  Built: Wed Jan 12 15:32:55 2022
-  OS/Arch: linux/amd64
-```
-
-### manifest: Manifest List Management
-
-The manifest list contains the image information corresponding to different system architectures. You can use the same manifest (for example, **openeuler:latest**) in different architectures to obtain the image of the corresponding architecture. The manifest contains the create, annotate, inspect, and push subcommands.
-
->![](./public_sys-resources/icon-note.gif) **NOTE:**
->
-> manifest is an experimental feature. When using this feature, you need to enable the experimental options on the client and server. For details, see Client Overview and Configuring Services.
-
-#### create: Manifest List Creation
-
-The `create` subcommand of the `manifest` command is used to create a manifest list. The command format is as follows:
-
-```shell
-isula-build manifest create MANIFEST_LIST MANIFEST [MANIFEST...]
-```
-
-You can specify the name of the manifest list and the remote images to be added to the list. If no remote image is specified, an empty manifest list is created.
-
-Example:
-
-```sh
-sudo isula-build manifest create openeuler localhost:5000/openeuler_x86:latest localhost:5000/openeuler_aarch64:latest
-```
-
-#### annotate: Manifest List Update
-
-The `annotate` subcommand of the `manifest` command is used to update the manifest list. The command format is as follows:
-
-```shell
-isula-build manifest annotate MANIFEST_LIST MANIFEST [flags]
-```
-
-You can specify the manifest list to be updated and the images in the manifest list, and use flags to specify the options to be updated. This command can also be used to add new images to the manifest list.
-
-Currently, the following flags are supported:
-
-- --arch: Overrides the architecture to which the image applies. The value is a string.
-- --os: Overrides the OS to which the image applies. The value is a string.
-- --os-features: Specifies the OS features required by the image. The value is a string and is rarely used.
-- --variant: Variant of the image recorded in the list. The value is a string.
-
-Example:
-
-```sh
-sudo isula-build manifest annotate --os linux --arch arm64 openeuler:latest localhost:5000/openeuler_aarch64:latest
-```
-
-#### inspect: Manifest List Inspect
-
-The `inspect` subcommand of the `manifest` command is used to query the manifest list. The command format is as follows:
-
-```shell
-isula-build manifest inspect MANIFEST_LIST
-```
-
-Example:
-
-```sh
-$ sudo isula-build manifest inspect openeuler:latest
-{
-    "schemaVersion": 2,
-    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
-    "manifests": [
-        {
-            "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
-            "size": 527,
-            "digest": "sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff",
-            "platform": {
-                "architecture": "amd64",
-                "os": "linux"
-            }
-        },
-        {
-            "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
-            "size": 527,
-            "digest": "sha256:f814888b4bb6149bd39ba8375a1932fb15071b4dbffc7f76c7b602b06abbb820",
-            "platform": {
-                "architecture": "arm64",
-                "os": "linux"
-            }
-        }
-    ]
-}
-```
-
-#### push: Manifest List Push to the Remote Repository
-
-The `push` subcommand of the `manifest` command is used to push the manifest list to the remote repository.
The command format is as follows:
-
-```shell
-isula-build manifest push MANIFEST_LIST DESTINATION
-```
-
-Example:
-
-```sh
-sudo isula-build manifest push openeuler:latest localhost:5000/openeuler:latest
-```
-
-## Directly Integrating a Container Engine
-
-isula-build can be integrated with iSulad or Docker to import the built container image to the local storage of the container engine.
-
-### Integration with iSulad
-
-Images that are successfully built can be directly exported to iSulad.
-
-Example:
-
-```sh
-sudo isula-build ctr-img build -f Dockerfile -o isulad:busybox:2.0
-```
-
-Specify `isulad` in the `-o` parameter to export the built container image to iSulad. You can then run `isula images` to query the image.
-
-```sh
-$ sudo isula images
-REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
-busybox  2.0  2d414a5cad6d  2020-08-01 06:41:36  5.577 MB
-```
-
->![](./public_sys-resources/icon-note.gif) **Note:**
->
-> - It is required that isula-build and iSulad be on the same node.
-> - When an image is directly exported to iSulad, the isula-build client needs to temporarily store the successfully built image as `/var/lib/isula-build/tmp/[build_id]/isula-build-tmp-%v.tar` and then import it to iSulad. Ensure that this directory has sufficient disk space. If the isula-build client process is killed or Ctrl+C is pressed during the export, you need to manually clear the `/var/lib/isula-build/tmp/[build_id]/isula-build-tmp-%v.tar` file.
-
-### Integration with Docker
-
-Images that are successfully built can be directly exported to the Docker daemon.
-
-Example:
-
-```sh
-sudo isula-build ctr-img build -f Dockerfile -o docker-daemon:busybox:2.0
-```
-
-Specify `docker-daemon` in the `-o` parameter to export the built container image to Docker. You can run the `docker images` command to query the image.
-
-```sh
-$ sudo docker images
-REPOSITORY  TAG  IMAGE ID  CREATED  SIZE
-busybox  2.0  2d414a5cad6d  2 months ago  5.22MB
-```
-
->![](./public_sys-resources/icon-note.gif) **Note:**
->
-> isula-build and Docker must be on the same node.
-
-## Precautions
-
-This chapter describes the constraints and limitations of building images with isula-builder, and how it differs from `docker build`.
-
-### Constraints or Limitations
-
-1. When exporting an image to [iSulad](https://gitee.com/openeuler/iSulad/blob/master/README.md/), a tag is necessary.
-2. Because the OCI runtime, for example, **runc**, will be called by isula-builder when executing the **RUN** instruction, the integrity of the runtime binary should be guaranteed by the user.
-3. DataRoot should not be set to **tmpfs**.
-4. **Overlay2** is the only storage driver supported by isula-builder currently.
-5. Docker image is the only image format supported by isula-builder currently.
-6. You are advised to set the file permission of the Dockerfile to **0600** to avoid tampering by other users.
-7. Only the host network is supported by the **RUN** instruction currently.
-8. When exporting an image to a tar package, only the tar compression format is supported by isula-builder currently.
-9. The base image size is limited to 1 GB when importing a base image using `import`.
-
-### Differences from `docker build`
-
-`isula-build` complies with the [Dockerfile specification](https://docs.docker.com/engine/reference/builder), but there are also some subtle differences between `isula-builder` and `docker build`, as follows:
-
-1. isula-builder commits after each build stage, not after every line.
-2. Build cache is not supported by isula-builder.
-3. Only the **RUN** instruction is executed in the build container.
-4. Build history is not supported currently.
-5. A stage name can start with a number.
-6. The length of a stage name is limited to 64 characters in `isula-builder`.
-7. The **ADD** instruction source cannot be a remote URL currently.
-8.
Resource restriction on a single build is not supported. If resource restriction is required, you can configure a resource limit on isula-builder.
-9. `isula-builder` sums the sizes of the original layer tarballs to calculate the image size, whereas docker uses only the diff content of each layer. Therefore, the image size listed by `isula-build ctr-img images` is different from that listed by docker.
-10. The image name should be in the *NAME:TAG* format, for example, **busybox:latest**, where **latest** must not be omitted.
-
-## Appendix
-
-### Command Line Parameters
-
-**Table 1** Parameters of the `ctr-img build` command
-
-| **Command** | **Parameter** | **Description** |
-| ------------- | -------------- | ------------------------------------------------------------ |
-| ctr-img build | --build-arg | String list, which contains variables required during the build. |
-| | --build-static | Key value, which is used to build binary equivalence. Currently, the following key values are included: - build-time: string, which indicates that a fixed timestamp is used to build a container image. The timestamp format is YYYY-MM-DD HH-MM-SS. |
-| | -f, --filename | String, which indicates the path of the Dockerfiles. If this parameter is not specified, the current path is used. |
-| | --format | String, which indicates the image format **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled). |
-| | --iidfile | String, which indicates a local file to which the image ID is output. |
-| | -o, --output | String, which indicates the image export mode and path. |
-| | --proxy | Boolean, which inherits the proxy environment variable on the host. The default value is true. |
-| | --tag | String, which indicates the tag value of the image that is successfully built.
|
-| | --cap-add | String list, which contains permissions required by the **RUN** instruction during the build process. |
-
-**Table 2** Parameters of the `ctr-img load` command
-
-| **Command** | **Parameter** | **Description** |
-| ------------ | ----------- | --------------------------------- |
-| ctr-img load | -i, --input | String, path of the local .tar package to be imported. |
-
-**Table 3** Parameters of the `ctr-img push` command
-
-| **Command** | **Parameter** | **Description** |
-| ------------ | ----------- | --------------------------------- |
-| ctr-img push | -f, --format | String, which indicates the pushed image format, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled). |
-
-**Table 4** Parameters of the `ctr-img rm` command
-
-| **Command** | **Parameter** | **Description** |
-| ---------- | ----------- | --------------------------------------------- |
-| ctr-img rm | -a, --all | Boolean, which is used to delete all local persistent images. |
-| | -p, --prune | Boolean, which is used to delete all images that are stored persistently on the local host and do not have tags. |
-
-**Table 5** Parameters of the `ctr-img save` command
-
-| **Command** | **Parameter** | **Description** |
-| ------------ | ------------ | ---------------------------------- |
-| ctr-img save | -o, --output | String, which indicates the local path for storing the exported images. |
-| | -f, --format | String, which indicates the exported image format, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled). |
-
-**Table 6** Parameters of the `login` command
-
-| **Command** | **Parameter** | **Description** |
-| -------- | -------------------- | ------------------------------------------------------- |
-| login | -p, --password-stdin | Boolean, which indicates whether to read the password from stdin instead of entering it in interactive mode.
| -| | -u, --username | String, which indicates the username for logging in to the image repository.| - -**Table 7** Parameters of the `logout` command - -| **Command** | **Parameter** | **Description** | -| -------- | --------- | ------------------------------------ | -| logout | -a, --all | Boolean, which indicates whether to log out of all logged-in image repositories. | - -**Table 8** Parameters of the `manifest annotate` command - -| **Command** | **Parameter** | **Description** | -| ----------------- | ------------- | ---------------------------- | -| manifest annotate | --arch | Set architecture | -| | --os | Set operating system | -| | --os-features | Set operating system feature | -| | --variant | Set architecture variant | - -### Communication Matrix - -The isula-build component processes communicate with each other through the Unix socket file. No port is used for communication. - -### File and Permission - -- All isula-build operations must be performed by the **root** user. To perform operations as a non-privileged user, you need to configure the `--group` option. - -- The following table lists the file permissions involved in the running of isula-build. - -| **File Path** | **File/Folder Permission** | **Description** | -| ------------------------------------------- | ------------------- | ------------------------------------------------------------ | -| /usr/bin/isula-build | 550 | Binary file of the command line tool. | -| /usr/bin/isula-builder | 550 | Binary file of the isula-builder process. | -| /usr/lib/systemd/system/isula-build.service | 640 | systemd configuration file, which is used to manage the isula-build service. | -| /usr/isula-build | 650 | Root directory of the isula-builder configuration file. | -| /etc/isula-build/configuration.toml | 600 | General isula-builder configuration file, including the settings of the isula-builder log level, persistency directory, runtime directory, and OCI runtime. 
| -| /etc/isula-build/policy.json | 600 | Syntax file of the signature verification policy file. | -| /etc/isula-build/registries.toml | 600 | Configuration file of each image repository, including the available image repository list and image repository blacklist. | -| /etc/isula-build/storage.toml | 600 | Configuration file of the local persistent storage, including the configuration of the used storage driver. | -| /etc/isula-build/isula-build.pub | 400 | Asymmetric encryption public key file. | -| /var/run/isula_build.sock | 660 | Local socket of isula-builder. | -| /var/lib/isula-build | 700 | Local persistency directory. | -| /var/run/isula-build | 700 | Local runtime directory. | -| /var/lib/isula-build/tmp/\[build_id\]/isula-build-tmp-*.tar | 644 | Local temporary directory for storing the images when they are exported to iSulad. | diff --git a/docs/en/docs/Container/isula-build_user_guide.md b/docs/en/docs/Container/isula-build_user_guide.md new file mode 100644 index 0000000000000000000000000000000000000000..078424ace9c2969abe3306ab7799af6398bd45a8 --- /dev/null +++ b/docs/en/docs/Container/isula-build_user_guide.md @@ -0,0 +1,1030 @@ +# Installation + +## Preparations + +To ensure that isula-build can be successfully installed, the following software and hardware requirements must be met: + +- Supported architectures: x86_64 and AArch64 +- Supported OS: openEuler +- You have the permissions of the root user. + +### Installing isula-build + +Before using isula-build to build a container image, you need to install the following software packages: + +#### (Recommended) Method 1: Using Yum + +1. Configure the openEuler Yum source. + +2. Log in to the target server as the root user and install isula-build. + + ```shell + sudo yum install -y isula-build + ``` + +#### Method 2: Using the RPM Package + +1. Obtain an **isula-build-*.rpm** installation package from the openEuler Yum source, for example, **isula-build-0.9.6-4.oe1.x86_64.rpm**. + +2. 
Upload the obtained RPM software package to any directory on the target server, for example, **/home/**.
+
+3. Log in to the target server as the root user and run the following command to install isula-build:
+
+   ```shell
+   sudo rpm -ivh /home/isula-build-*.rpm
+   ```
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> After the installation is complete, you need to manually start the isula-build service. For details about how to start the service, see [Managing the isula-build Service](#managing-the-isula-build-service).
+
+# Configuring and Managing the isula-build Service
+
+## Configuring the isula-build Service
+
+After the isula-build software package is installed, systemd starts the isula-build service on the isula-build server based on the default configuration contained in the software package. If the default configuration files on the isula-build server cannot meet your requirements, customize them as described below. After the default configuration is modified, restart the isula-build server for the new configuration to take effect. For details, see [Managing the isula-build Service](#managing-the-isula-build-service).
+
+Currently, the isula-build server uses the following configuration files:
+
+- **/etc/isula-build/configuration.toml**: general isula-builder configuration file, which is used to set the isula-builder log level, persistency directory, runtime directory, and OCI runtime. Parameters in the configuration file are described as follows:
+
+| Configuration Item | Mandatory or Optional | Description | Value |
+| --------- | -------- | --------------------------------- | ----------------------------------------------- |
+| debug | Optional | Indicates whether to enable the debug log function. | **true**: Enables the debug log function. **false**: Disables the debug log function. |
+| loglevel | Optional | Sets the log level. | debug, info, warn, or error |
+| run_root | Mandatory | Sets the root directory of runtime data. | For example, **/var/run/isula-build/** |
+| data_root | Mandatory | Sets the local persistency directory. | For example, **/var/lib/isula-build/** |
+| runtime | Optional | Sets the runtime type. Currently, only **runc** is supported. | runc |
+| group | Optional | Sets the owner group for the local socket file **isula_build.sock** so that non-privileged users in the group can use isula-build. | isula |
+| experimental | Optional | Indicates whether to enable experimental features. | **true**: Enables experimental features. **false**: Disables experimental features. |
+
+- **/etc/isula-build/storage.toml**: configuration file for local persistent storage, including the configuration of the storage driver in use.
+
+| Configuration Item | Mandatory or Optional | Description |
+| ------ | -------- | ------------------------------ |
+| driver | Optional | Storage driver type. Currently, **overlay2** is supported. |
+
+  For more settings, see [containers-storage.conf.5](https://github.com/containers/storage/blob/main/docs/containers-storage.conf.5.md).
+
+- **/etc/isula-build/registries.toml**: configuration file for each image repository.
+
+| Configuration Item | Mandatory or Optional | Description |
+| ------------------- | -------- | ------------------------------------------------------------ |
+| registries.search | Optional | Search domain of the image repository. Only listed image repositories can be found. |
+| registries.insecure | Optional | Accessible insecure image repositories. Listed image repositories cannot pass the authentication and are not recommended. |
+
+  For more settings, see [containers-registries.conf.5](https://github.com/containers/image/blob/main/docs/containers-registries.conf.5.md).
+
+- **/etc/isula-build/policy.json**: image pull/push policy file. Currently, this file cannot be configured.
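As a quick reference, the parameters described above can be combined into a minimal **/etc/isula-build/configuration.toml** sketch. The values below are illustrative assumptions, not the packaged defaults; adjust them to your environment:

```toml
# Illustrative configuration.toml sketch; keys follow the parameter table above.
debug = false                         # disable debug logs
loglevel = "info"                     # debug, info, warn, or error
run_root = "/var/run/isula-build/"    # root directory of runtime data (mandatory)
data_root = "/var/lib/isula-build/"   # local persistency directory (mandatory)
runtime = "runc"                      # only runc is currently supported
group = "isula"                       # owner group of isula_build.sock
experimental = false                  # keep experimental features off
```

Remember to restart the isula-build server after editing the file so that the new values take effect.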
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> - isula-build supports the preceding configuration files, each with a maximum size of 1 MB.
+> - The persistent working directory dataroot cannot be configured on a memory disk, for example, tmpfs.
+> - Currently, only overlay2 can be used as the underlying storage driver.
+> - Before setting the `--group` option, ensure that the corresponding user group has been created on the local OS and that non-privileged users have been added to the group. After isula-builder is restarted, non-privileged users in the group can use the isula-build function. In addition, to ensure permission consistency, the owner group of the isula-build configuration file directory **/etc/isula-build** is set to the group specified by `--group`.
+
+## Managing the isula-build Service
+
+Currently, openEuler uses systemd to manage the isula-build service. The isula-build software package contains the systemd service files. After installing the isula-build software package, you can use the systemd tool to start or stop the isula-build service. You can also start the isula-builder process manually.
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> Only one isula-builder process can be started on a node at a time.
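Because only one isula-builder process may run on a node at a time, a small pre-start check can help avoid accidentally launching a duplicate instance. This is a hedged sketch, not part of isula-build itself; it assumes the `pgrep` utility (from procps) is available:

```shell
# Sketch: detect an already-running isula-builder before a manual start.
# pgrep -x matches the exact process name and exits non-zero when none is found.
if pgrep -x isula-builder >/dev/null 2>&1; then
    echo "an isula-builder instance is already running"
else
    echo "no running instance found; safe to start isula-builder"
fi
```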
+ +### (Recommended) Using systemd for Management + +You can run the following systemd commands to start, stop, and restart the isula-build service: + +- Run the following command to start the isula-build service: + + ```sh + sudo systemctl start isula-build.service + ``` + +- Run the following command to stop the isula-build service: + + ```sh + sudo systemctl stop isula-build.service + ``` + +- Run the following command to restart the isula-build service: + + ```sh + sudo systemctl restart isula-build.service + ``` + +The systemd service file of the isula-build software installation package is stored in the `/usr/lib/systemd/system/isula-build.service` directory. If you need to modify the systemd configuration of the isula-build service, modify the file and run the following command to make the modification take effect. Then restart the isula-build service based on the systemd management command. + +```sh +sudo systemctl daemon-reload +``` + +### Directly Running isula-builder + +You can also run the `isula-builder` command on the server to start the service. The `isula-builder` command can contain flags for service startup. The following flags are supported: + +- `-D, --debug`: whether to enable the debugging mode. +- `--log-level`: log level. The options are **debug**, **info**, **warn**, and **error**. The default value is **info**. +- `--dataroot`: local persistency directory. The default value is **/var/lib/isula-build/**. +- `--runroot`: runtime directory. The default value is **/var/run/isula-build/**. +- `--storage-driver`: underlying storage driver type. +- `--storage-opt`: underlying storage driver configuration. +- `--group`: sets the owner group for the local socket file **isula_build.sock** so that non-privileged users in the group can use isula-build. The default owner group is **isula**. +- `--experimental`: whether to enable experimental features. 
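The `--group` flag above only has an effect if the group actually exists and the intended non-privileged users are members of it. The following hedged sketch checks for the default group name `isula` taken from the flag description (creating groups and modifying memberships requires root):

```shell
# Sketch: verify that the default "isula" group exists before relying on --group.
if getent group isula >/dev/null 2>&1; then
    echo "group isula already exists"
else
    # As root: groupadd isula && usermod -aG isula <non-privileged user>
    echo "group isula not found; create it before starting isula-builder with --group isula"
fi
```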
+ +>![](./public_sys-resources/icon-note.gif) **Note:** +> +> If the command line parameters contain the same configuration items as those in the configuration file, the command line parameters are preferentially used for startup. + +Start the isula-build service. For example, to specify the local persistency directory **/var/lib/isula-build** and disable debugging, run the following command: + +```sh +sudo isula-builder --dataroot "/var/lib/isula-build" --debug=false +``` + +# Usage Guidelines + +## Prerequisites + +isula-build depends on the executable file **runc** to build the **RUN** instruction in the Dockerfile. Therefore, runc must be pre-installed in the running environment of isula-build. The installation method depends on the application scenario. If you do not need to use the complete docker-engine tool chain, you can install only the docker-runc RPM package. + +```sh +sudo yum install -y docker-runc +``` + +If you need to use a complete docker-engine tool chain, install the docker-engine RPM package, which contains the executable file **runc** by default. + +```sh +sudo yum install -y docker-engine +``` + +>![](./public_sys-resources/icon-note.gif) **Note:** +> +> Ensure the security of OCI runtime (runc) executable files to prevent malicious replacement. + +## Overview + +The isula-build client provides a series of commands for building and managing container images. Currently, the isula-build client provides the following commands: + +- `ctr-img`: manages container images. The `ctr-img` command contains the following subcommands: + - `build`: builds a container image based on the specified Dockerfile. + - `images`: lists local container images. + - `import`: imports a basic container image. + - `load`: imports a cascade image. + - `rm`: deletes a local container image. + - `save`: exports a cascade image to a local disk. + - `tag`: adds a tag to a local container image. + - `pull`: pulls an image to a local host. 
+ - `push`: pushes a local image to a remote repository.
+- `info`: displays the running environment and system information of isula-build.
+- `login`: logs in to the remote container image repository.
+- `logout`: logs out of the remote container image repository.
+- `version`: displays the versions of isula-build and isula-builder.
+- `manifest` (experimental): manages the manifest list.
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> - The `isula-build completion` and `isula-builder completion` commands are used to generate the bash command completion script. These commands are implicitly provided by the command line framework and are not displayed in the help information.
+> - The isula-build client does not have any configuration file. To use the isula-build experimental features, enable the environment variable **ISULABUILD_CLI_EXPERIMENTAL** on the client using the `export ISULABUILD_CLI_EXPERIMENTAL=enabled` command.
+
+The following describes how to use these commands in detail.
+
+## ctr-img: Container Image Management
+
+The isula-build command groups all container image management commands into the `ctr-img` command. The command format is as follows:
+
+```shell
+isula-build ctr-img [command]
+```
+
+### build: Container Image Build
+
+The `build` subcommand of the `ctr-img` command is used to build container images. The command format is as follows:
+
+```shell
+isula-build ctr-img build [flags]
+```
+
+The `build` command contains the following flags:
+
+- `--build-arg`: string list containing variables required during the build process.
+- `--build-static`: key value, which is used to build binary equivalence. Currently, the following key value is supported:
+ - `build-time`: string indicating that a container image is built at a specified timestamp. The timestamp format is *YYYY-MM-DD HH-MM-SS*.
+- `-f, --filename`: string indicating the path of the Dockerfiles. If this parameter is not specified, the current path is used.
+- `--format`: string indicating the image format **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled).
+- `--iidfile`: string indicating a local file to which the ID of the image is output.
+- `-o, --output`: string indicating the image export mode and path.
+- `--proxy`: boolean, which inherits the proxy environment variable on the host. The default value is **true**.
+- `--tag`: string indicating the tag value of the image that is successfully built.
+- `--cap-add`: string list containing permissions required by the **RUN** instruction during the build process.
+
+**The following describes the flags in detail.**
+
+#### \--build-arg
+
+Passes values for the **ARG** parameters in the Dockerfile from the command line. The usage is as follows:
+
+```sh
+$ echo "This is bar file" > bar.txt
+$ cat Dockerfile_arg
+FROM busybox
+ARG foo
+ADD ${foo}.txt .
+RUN cat ${foo}.txt
+$ sudo isula-build ctr-img build --build-arg foo=bar -f Dockerfile_arg
+STEP 1: FROM busybox
+Getting image source signatures
+Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad
+Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b
+Writing manifest to image destination
+Storing signatures
+STEP 2: ARG foo
+STEP 3: ADD ${foo}.txt .
+STEP 4: RUN cat ${foo}.txt
+This is bar file
+Getting image source signatures
+Copying blob sha256:6194458b07fcf01f1483d96cd6c34302ffff7f382bb151a6d023c4e80ba3050a
+Copying blob sha256:6bb56e4a46f563b20542171b998cb4556af4745efc9516820eabee7a08b7b869
+Copying config sha256:39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2
+Writing manifest to image destination
+Storing signatures
+Build success with image id: 39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2
+```
+
+#### \--build-static
+
+Specifies a static build. 
That is, when isula-build is used to build a container image, differences between all timestamps and other build factors (such as the container ID and hostname) are eliminated, and a container image that meets the static requirements is built.
+
+When isula-build is used to build a container image, assume that a fixed timestamp is given to the `build` subcommand and the following conditions are met:
+
+- The build environment is consistent before and after the upgrade.
+- The Dockerfile is consistent before and after the build.
+- The intermediate data generated before and after the build is consistent.
+- The build commands are the same.
+- The versions of the third-party libraries are the same.
+
+Then, for the same Dockerfile, the image content and image ID generated by multiple builds are the same.
+
+`--build-static` supports key-value pair options in the *key=value* format. Currently, the following option is supported:
+
+- `build-time`: string, which indicates the fixed timestamp for building a static image. The value is in the format of *YYYY-MM-DD HH-MM-SS*. The timestamp affects the creation and modification time attributes of files at the diff layer.
+
+ Example:
+
+ ```sh
+ sudo isula-build ctr-img build -f Dockerfile --build-static='build-time=2020-05-23 10:55:33' .
+ ```
+
+ In this way, the container images and image IDs built in the same environment for multiple times are the same.
+
+#### \--format
+
+This option can be used when the experimental feature is enabled. The default image format is **oci**. You can specify the image format to build. For example, the following commands build an OCI image and a Docker image, respectively.
+
+```sh
+export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format oci .
+```
+
+```sh
+export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format docker .
+```
+
+#### \--iidfile
+
+Run the following command to output the ID of the built image to a file:
+
+```shell
+isula-build ctr-img build --iidfile filename
+```
+
+For example, to export the container image ID to the **testfile** file, run the following command:
+
+```sh
+sudo isula-build ctr-img build -f Dockerfile_arg --iidfile testfile
+```
+
+Check the container image ID in the **testfile** file:
+
+```sh
+$ cat testfile
+76cbeed38a8e716e22b68988a76410eaf83327963c3b29ff648296d5cd15ce7b
+```
+
+#### \-o, --output
+
+Currently, `-o` and `--output` support the following formats:
+
+- `isulad:image:tag`: directly pushes the image that is successfully built to iSulad, for example, `-o isulad:busybox:latest`. The following restrictions apply:
+
+ - isula-build and iSulad must be on the same node.
+ - The tag must be configured.
+ - On the isula-build client, you need to temporarily save the successfully built image as **/var/tmp/isula-build-tmp-%v.tar** and then import it to iSulad. Ensure that the **/var/tmp/** directory has sufficient disk space.
+
+- `docker-daemon:image:tag`: directly pushes the successfully built image to Docker daemon, for example, `-o docker-daemon:busybox:latest`. The following restrictions apply:
+
+ - isula-build and Docker must be on the same node.
+ - The tag must be configured.
+
+- `docker://registry.example.com/repository:tag`: directly pushes the successfully built image to the remote image repository in Docker image format, for example, `-o docker://localhost:5000/library/busybox:latest`.
+
+- `docker-archive:<path>:image:tag`: saves the successfully built image to the local host in Docker image format, for example, `-o docker-archive:/root/image.tar:busybox:latest`.
+
+When the experimental feature is enabled, you can also export the built image in OCI image format:
+
+- `oci://registry.example.com/repository:tag`: directly pushes the successfully built image to the remote image repository in OCI image format (the remote repository must support the OCI image format), for example, `-o oci://localhost:5000/library/busybox:latest`.
+
+- `oci-archive:<path>:image:tag`: saves the successfully built image to the local host in OCI image format, for example, `-o oci-archive:/root/image.tar:busybox:latest`.
+
+In addition to the flags, the `build` subcommand also supports a string argument that specifies the context, that is, the context directory of the Dockerfile build environment. The default value is the current path where isula-build is executed. This path affects the .dockerignore file lookup and the files retrieved by the **ADD** and **COPY** instructions of the Dockerfile.
+
+#### \--proxy
+
+Specifies whether the container started by the **RUN** instruction inherits the proxy-related environment variables **http_proxy**, **https_proxy**, **ftp_proxy**, **no_proxy**, **HTTP_PROXY**, **HTTPS_PROXY**, and **FTP_PROXY**. The default value is **true**.
+
+When a user configures proxy-related **ARG** or **ENV** in the Dockerfile, the inherited environment variables are overwritten.
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> If the client and daemon are running on different terminals, the environment variables of the terminal where the daemon is running are inherited.
+
+#### \--tag
+
+Specifies the tag of the image stored on the local disk after the image is successfully built.
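+
+For example, a hypothetical invocation (the image name **busybox:2.0** is illustrative) that tags the built image so that it can later be referenced by name:
+
+```sh
+sudo isula-build ctr-img build -f Dockerfile --tag busybox:2.0 .
+```
+
+The tag must be in the *NAME:TAG* format; the tag part must not be omitted.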
+
+#### \--cap-add
+
+Run the following command to add the permissions required by the **RUN** instruction during the build process:
+
+```shell
+isula-build ctr-img build --cap-add ${CAP}
+```
+
+Example:
+
+```sh
+sudo isula-build ctr-img build --cap-add CAP_SYS_ADMIN --cap-add CAP_SYS_PTRACE -f Dockerfile
+```
+
+> **Note:**
+>
+> - A maximum of 100 container images can be concurrently built.
+> - isula-build supports Dockerfiles with a maximum size of 1 MB.
+> - isula-build supports a .dockerignore file with a maximum size of 1 MB.
+> - Ensure that only the current user has the read and write permissions on the Dockerfiles to prevent other users from tampering with the files.
+> - During the build, the **RUN** instruction starts a container in which the build is performed. Currently, isula-build supports the host network only.
+> - isula-build only supports the tar compression format.
+> - isula-build commits once after each image build stage is complete, instead of after each line of the Dockerfile is executed.
+> - isula-build does not support build cache.
+> - isula-build starts the build container only when the **RUN** instruction is executed.
+> - Currently, the history function of Docker images is not supported.
+> - The stage name can start with a digit.
+> - The stage name can contain a maximum of 64 characters.
+> - isula-build does not support resource restriction on a single Dockerfile build. If resource restriction is required, you can configure a resource limit on isula-builder.
+> - Currently, isula-build does not support a remote URL as the data source of the **ADD** instruction in the Dockerfile.
+> - The local tar packages exported using the **docker-archive** and **oci-archive** types are not compressed. You can manually compress the files as required.
+
+### images: Viewing Local Persistent Build Images
+
+You can run the `images` command to view the images in the local persistent storage.
+ +```sh +$ sudo isula-build ctr-img images +--------------------------------------- ----------- ----------------- ------------------------ ------------ +REPOSITORY TAG IMAGE ID CREATED SIZE +--------------------------------------- ----------- ----------------- ------------------------ ------------ +localhost:5000/library/alpine latest a24bb4013296 2022-01-17 10:02:19 5.85 MB + 39b62a3342ee 2022-01-17 10:01:12 1.45 MB +--------------------------------------- ----------- ----------------- ------------------------ ------------ +``` + +>![](./public_sys-resources/icon-note.gif) **Note:** +> +> The image size displayed by running the `isula-build ctr-img images` command may be different from that displayed by running the `docker images` command. When calculating the image size, `isula-build` directly calculates the total size of .tar packages at each layer, while `docker` calculates the total size of files by decompressing the .tar packages and traversing the diff directory. Therefore, the statistics are different. + +### import: Importing a Basic Container Image + +A tar file in rootfs form can be imported into isula-build via the `ctr-img import` command. 
+ +The command format is as follows: + +```shell +isula-build ctr-img import [flags] +``` + +Example: + +```sh +$ sudo isula-build ctr-img import busybox.tar mybusybox:latest +Getting image source signatures +Copying blob sha256:7b8667757578df68ec57bfc9fb7754801ec87df7de389a24a26a7bf2ebc04d8d +Copying config sha256:173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1 +Writing manifest to image destination +Storing signatures +Import success with image id: "173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1" +$ sudo isula-build ctr-img images +--------------------------------------- ----------- ----------------- ------------------------ ------------ +REPOSITORY TAG IMAGE ID CREATED SIZE +--------------------------------------- ----------- ----------------- ------------------------ ------------ +mybusybox latest 173b3cf612f8 2022-01-12 16:02:31 1.47 MB +--------------------------------------- ----------- ----------------- ------------------------ ------------ +``` + +>![](./public_sys-resources/icon-note.gif) **Note** +> +> isula-build supports the import of container basic images with a maximum size of 1 GB. + +### load: Importing Cascade Images + +Cascade images are images that are saved to the local computer by running the `docker save` or `isula-build ctr-img save` command. The compressed image package contains a layer-by-layer image package named **layer.tar**. You can run the `ctr-img load` command to import the image to isula-build. + +The command format is as follows: + +```shell +isula-build ctr-img load [flags] +``` + +Currently, the following flags are supported: + +- `-i, --input`: path of the local .tar package. 
+ +Example: + +```sh +$ sudo isula-build ctr-img load -i ubuntu.tar +Getting image source signatures +Copying blob sha256:cf612f747e0fbcc1674f88712b7bc1cd8b91cf0be8f9e9771235169f139d507c +Copying blob sha256:f934e33a54a60630267df295a5c232ceb15b2938ebb0476364192b1537449093 +Copying blob sha256:943edb549a8300092a714190dfe633341c0ffb483784c4fdfe884b9019f6a0b4 +Copying blob sha256:e7ebc6e16708285bee3917ae12bf8d172ee0d7684a7830751ab9a1c070e7a125 +Copying blob sha256:bf6751561805be7d07d66f6acb2a33e99cf0cc0a20f5fd5d94a3c7f8ae55c2a1 +Copying blob sha256:c1bd37d01c89de343d68867518b1155cb297d8e03942066ecb44ae8f46b608a3 +Copying blob sha256:a84e57b779297b72428fc7308e63d13b4df99140f78565be92fc9dbe03fc6e69 +Copying blob sha256:14dd68f4c7e23d6a2363c2320747ab88986dfd43ba0489d139eeac3ac75323b2 +Copying blob sha256:a2092d776649ea2301f60265f378a02405539a2a68093b2612792cc65d00d161 +Copying blob sha256:879119e879f682c04d0784c9ae7bc6f421e206b95d20b32ce1cb8a49bfdef202 +Copying blob sha256:e615448af51b848ecec00caeaffd1e30e8bf5cffd464747d159f80e346b7a150 +Copying blob sha256:f610bd1e9ac6aa9326d61713d552eeefef47d2bd49fc16140aa9bf3db38c30a4 +Copying blob sha256:bfe0a1336d031bf5ff3ce381e354be7b2bf310574cc0cd1949ad94dda020cd27 +Copying blob sha256:f0f15db85788c1260c6aa8ad225823f45c89700781c4c793361ac5fa58d204c7 +Copying config sha256:c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e +Writing manifest to image destination +Storing signatures +Loaded image as c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e +``` + +>![](./public_sys-resources/icon-note.gif) **Note:** +> +> - isula-build allows you to import a container image with a maximum size of 50 GB. +> - isula-build automatically recognizes the image format and loads it from the cascade image file. + +### rm: Deleting a Local Persistent Image + +You can run the `rm` command to delete an image from the local persistent storage. 
The command format is as follows:
+
+```shell
+isula-build ctr-img rm IMAGE [IMAGE...] [FLAGS]
+```
+
+Currently, the following flags are supported:
+
+- `-a, --all`: deletes all images stored locally.
+- `-p, --prune`: deletes all images that are stored locally and do not have tags.
+
+Example:
+
+```sh
+$ sudo isula-build ctr-img rm -p
+Deleted: sha256:78731c1dde25361f539555edaf8f0b24132085b7cab6ecb90de63d72fa00c01d
+Deleted: sha256:eeba1bfe9fca569a894d525ed291bdaef389d28a88c288914c1a9db7261ad12c
+```
+
+### save: Exporting Cascade Images
+
+You can run the `save` command to export the cascade images to the local disk. The command format is as follows:
+
+```shell
+isula-build ctr-img save [REPOSITORY:TAG]|imageID -o xx.tar
+```
+
+Currently, the following flags are supported:
+
+- `-f, --format`: exported image format, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled).
+- `-o, --output`: local path for storing the exported images.
+ +The following example shows how to export an image using *image/tag*: + +```sh +$ sudo isula-build ctr-img save busybox:latest -o busybox.tar +Getting image source signatures +Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a +Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f +Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef +Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 +Writing manifest to image destination +Storing signatures +Save success with image: busybox:latest +``` + +The following example shows how to export an image using *ImageID*: + +```sh +$ sudo isula-build ctr-img save 21c3e96ac411 -o busybox.tar +Getting image source signatures +Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a +Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f +Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef +Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 +Writing manifest to image destination +Storing signatures +Save success with image: 21c3e96ac411 +``` + +The following example shows how to export multiple images to the same tarball: + +```sh +$ sudo isula-build ctr-img save busybox:latest nginx:latest -o all.tar +Getting image source signatures +Copying blob sha256:eb78099fbf7fdc70c65f286f4edc6659fcda510b3d1cfe1caa6452cc671427bf +Copying blob sha256:29f11c413898c5aad8ed89ad5446e89e439e8cfa217cbb404ef2dbd6e1e8d6a5 +Copying blob sha256:af5bd3938f60ece203cd76358d8bde91968e56491daf3030f6415f103de26820 +Copying config sha256:b8efb18f159bd948486f18bd8940b56fd2298b438229f5bd2bcf4cedcf037448 +Writing manifest to image destination +Storing signatures +Getting image source signatures +Copying blob sha256:e2d6930974a28887b15367769d9666116027c411b7e6c4025f7c850df1e45038 +Copying config 
sha256:a33de3c85292c9e65681c2e19b8298d12087749b71a504a23c576090891eedd6
+Writing manifest to image destination
+Storing signatures
+Save success with image: [busybox:latest nginx:latest]
+```
+
+>![](./public_sys-resources/icon-note.gif) **NOTE:**
+>
+>- `save` exports an image in .tar format by default. If necessary, you can save the image and then manually compress it.
+>- When exporting an image by image name, specify the entire image name in the *REPOSITORY:TAG* format.
+
+### tag: Tagging Local Persistent Images
+
+You can run the `tag` command to add a tag to a local persistent container image. The command format is as follows:
+
+```shell
+isula-build ctr-img tag <imageID>/<imageName> busybox:latest
+```
+
+Example:
+
+```sh
+$ sudo isula-build ctr-img images
+--------------------------------------- ----------- ----------------- -------------------------- ------------
+REPOSITORY TAG IMAGE ID CREATED SIZE
+--------------------------------------- ----------- ----------------- -------------------------- ------------
+alpine latest a24bb4013296 2020-05-29 21:19:46 5.85 MB
+--------------------------------------- ----------- ----------------- -------------------------- ------------
+$ sudo isula-build ctr-img tag a24bb4013296 alpine:v1
+$ sudo isula-build ctr-img images
+--------------------------------------- ----------- ----------------- ------------------------ ------------
+REPOSITORY TAG IMAGE ID CREATED SIZE
+--------------------------------------- ----------- ----------------- ------------------------ ------------
+alpine latest a24bb4013296 2020-05-29 21:19:46 5.85 MB
+alpine v1 a24bb4013296 2020-05-29 21:19:46 5.85 MB
+--------------------------------------- ----------- ----------------- ------------------------ ------------
+```
+
+### pull: Pulling an Image to a Local Host
+
+Run the `pull` command to pull an image from a remote image repository to a local host. 
Command format:
+
+```shell
+isula-build ctr-img pull REPOSITORY[:TAG]
+```
+
+Example:
+
+```sh
+$ sudo isula-build ctr-img pull example-registry/library/alpine:latest
+Getting image source signatures
+Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad
+Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b
+Writing manifest to image destination
+Storing signatures
+Pull success with image: example-registry/library/alpine:latest
+```
+
+### push: Pushing a Local Image to a Remote Repository
+
+Run the `push` command to push a local image to a remote repository. Command format:
+
+```shell
+isula-build ctr-img push REPOSITORY[:TAG]
+```
+
+Currently, the following flags are supported:
+
+- `-f, --format`: pushed image format, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled).
+
+Example:
+
+```sh
+$ sudo isula-build ctr-img push example-registry/library/mybusybox:latest
+Getting image source signatures
+Copying blob sha256:d2421964bad195c959ba147ad21626ccddc73a4f2638664ad1c07bd9df48a675
+Copying config sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f
+Writing manifest to image destination
+Storing signatures
+Push success with image: example-registry/library/mybusybox:latest
+```
+
+>![](./public_sys-resources/icon-note.gif) **NOTE:**
+>
+>- Before pushing an image, log in to the corresponding image repository.
+
+## info: Viewing the Operating Environment and System Information
+
+You can run the `isula-build info` command to view the running environment and system information of isula-build. The command format is as follows:
+
+```shell
+ isula-build info [flags]
+```
+
+The following flags are supported:
+
+- `-H, --human-readable`: Boolean. Prints the memory information in a human-readable format, using powers of 1000.
+- `-V, --verbose`: Boolean. Displays the memory usage during system running.
+ +Example: + +```sh +$ sudo isula-build info -H + General: + MemTotal: 7.63 GB + MemFree: 757 MB + SwapTotal: 8.3 GB + SwapFree: 8.25 GB + OCI Runtime: runc + DataRoot: /var/lib/isula-build/ + RunRoot: /var/run/isula-build/ + Builders: 0 + Goroutines: 12 + Store: + Storage Driver: overlay + Backing Filesystem: extfs + Registry: + Search Registries: + oepkgs.net + Insecure Registries: + localhost:5000 + oepkgs.net + Runtime: + MemSys: 68.4 MB + HeapSys: 63.3 MB + HeapAlloc: 7.41 MB + MemHeapInUse: 8.98 MB + MemHeapIdle: 54.4 MB + MemHeapReleased: 52.1 MB +``` + +## login: Logging In to the Remote Image Repository + +You can run the `login` command to log in to the remote image repository. The command format is as follows: + +```shell + isula-build login SERVER [FLAGS] +``` + +Currently, the following flags are supported: + +```shell + Flags: + -p, --password-stdin Read password from stdin + -u, --username string Username to access registry +``` + +Enter the password through stdin. In the following example, the password in creds.txt is transferred to the stdin of isula-build through a pipe for input. + +```sh + $ cat creds.txt | sudo isula-build login -u cooper -p mydockerhub.io + Login Succeeded +``` + +Enter the password in interactive mode. + +```sh + $ sudo isula-build login mydockerhub.io -u cooper + Password: + Login Succeeded +``` + +## logout: Logging Out of the Remote Image Repository + +You can run the `logout` command to log out of the remote image repository. The command format is as follows: + +```shell + isula-build logout [SERVER] [FLAGS] +``` + +Currently, the following flags are supported: + +```shell + Flags: + -a, --all Logout all registries +``` + +Example: + +```sh + $ sudo isula-build logout -a + Removed authentications +``` + +## version: Querying the isula-build Version + +You can run the `version` command to view the current version information. 
+
+```sh
+$ sudo isula-build version
+Client:
+ Version: 0.9.6-4
+ Go Version: go1.15.7
+ Git Commit: 83274e0
+ Built: Wed Jan 12 15:32:55 2022
+ OS/Arch: linux/amd64
+
+Server:
+ Version: 0.9.6-4
+ Go Version: go1.15.7
+ Git Commit: 83274e0
+ Built: Wed Jan 12 15:32:55 2022
+ OS/Arch: linux/amd64
+```
+
+## manifest: Manifest List Management
+
+The manifest list contains the image information corresponding to different system architectures. You can use the same manifest (for example, **openeuler:latest**) in different architectures to obtain the image of the corresponding architecture. The manifest contains the `create`, `annotate`, `inspect`, and `push` subcommands.
+
+>![](./public_sys-resources/icon-note.gif) **NOTE:**
+>
+> manifest is an experimental feature. When using this feature, you need to enable the experimental options on both the client and the server. For details, see Client Overview and Configuring Services.
+
+### create: Manifest List Creation
+
+The `create` subcommand of the `manifest` command is used to create a manifest list. The command format is as follows:
+
+```shell
+isula-build manifest create MANIFEST_LIST MANIFEST [MANIFEST...]
+```
+
+You can specify the name of the manifest list and the remote images to be added to the list. If no remote image is specified, an empty manifest list is created.
+
+Example:
+
+```sh
+sudo isula-build manifest create openeuler localhost:5000/openeuler_x86:latest localhost:5000/openeuler_aarch64:latest
+```
+
+### annotate: Manifest List Update
+
+The `annotate` subcommand of the `manifest` command is used to update the manifest list. The command format is as follows:
+
+```shell
+isula-build manifest annotate MANIFEST_LIST MANIFEST [flags]
+```
+
+You can specify the manifest list to be updated and the images in the manifest list, and use flags to specify the options to be updated. This command can also be used to add new images to the manifest list.
+
+Currently, the following flags are supported:
+
+- `--arch`: string, which overrides the applicable architecture of the image.
+- `--os`: string, which overrides the applicable OS of the image.
+- `--os-features`: string, which specifies the OS features required by the image. This flag is rarely used.
+- `--variant`: string, which specifies the variant of the image recorded in the list.
+
+Example:
+
+```sh
+sudo isula-build manifest annotate --os linux --arch arm64 openeuler:latest localhost:5000/openeuler_aarch64:latest
+```
+
+### inspect: Manifest List Inspect
+
+The `inspect` subcommand of the `manifest` command is used to query the manifest list. The command format is as follows:
+
+```shell
+isula-build manifest inspect MANIFEST_LIST
+```
+
+Example:
+
+```sh
+$ sudo isula-build manifest inspect openeuler:latest
+{
+    "schemaVersion": 2,
+    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
+    "manifests": [
+        {
+            "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
+            "size": 527,
+            "digest": "sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff",
+            "platform": {
+                "architecture": "amd64",
+                "os": "linux"
+            }
+        },
+        {
+            "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
+            "size": 527,
+            "digest": "sha256:f814888b4bb6149bd39ba8375a1932fb15071b4dbffc7f76c7b602b06abbb820",
+            "platform": {
+                "architecture": "arm64",
+                "os": "linux"
+            }
+        }
+    ]
+}
+```
+
+### push: Manifest List Push to the Remote Repository
+
+The `push` subcommand of the `manifest` command is used to push the manifest list to the remote repository. 
The command format is as follows:
+
+```shell
+isula-build manifest push MANIFEST_LIST DESTINATION
+```
+
+Example:
+
+```sh
+sudo isula-build manifest push openeuler:latest localhost:5000/openeuler:latest
+```
+
+# Directly Integrating a Container Engine
+
+isula-build can be integrated with iSulad or Docker to import the built container image to the local storage of the container engine.
+
+## Integration with iSulad
+
+Images that are successfully built can be directly exported to iSulad.
+
+Example:
+
+```sh
+sudo isula-build ctr-img build -f Dockerfile -o isulad:busybox:2.0
+```
+
+Specify iSulad in the `-o` option to export the built container image to iSulad. You can then query the image by running the `isula images` command.
+
+```sh
+$ sudo isula images
+REPOSITORY TAG IMAGE ID CREATED SIZE
+busybox 2.0 2d414a5cad6d 2020-08-01 06:41:36 5.577 MB
+```
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> - It is required that isula-build and iSulad be on the same node.
+> - When an image is directly exported to iSulad, the isula-build client needs to temporarily store the successfully built image as `/var/lib/isula-build/tmp/[build_id]/isula-build-tmp-%v.tar` and then import it to iSulad. Ensure that the `/var/lib/isula-build/tmp/` directory has sufficient disk space. If the isula-build client process is killed or Ctrl+C is pressed during the export, you need to manually clear the `/var/lib/isula-build/tmp/[build_id]/isula-build-tmp-%v.tar` file.
+
+## Integration with Docker
+
+Images that are successfully built can be directly exported to the Docker daemon.
+
+Example:
+
+```sh
+sudo isula-build ctr-img build -f Dockerfile -o docker-daemon:busybox:2.0
+```
+
+Specify docker-daemon in the `-o` option to export the built container image to Docker. You can run the `docker images` command to query the image. 
+
+```sh
+$ sudo docker images
+REPOSITORY TAG IMAGE ID CREATED SIZE
+busybox 2.0 2d414a5cad6d 2 months ago 5.22MB
+```
+
+>![](./public_sys-resources/icon-note.gif) **Note:**
+>
+> isula-build and Docker must be on the same node.
+
+# Precautions
+
+This chapter describes the constraints and limitations of building images with isula-builder and its differences from `docker build`.
+
+## Constraints or Limitations
+
+1. When exporting an image to [iSulad](https://gitee.com/openeuler/iSulad/blob/master/README.md/), a tag is required.
+2. Because isula-builder calls the OCI runtime (for example, **runc**) when executing the **RUN** instruction, the integrity of the runtime binary must be guaranteed by the user.
+3. DataRoot must not be set to **tmpfs**.
+4. Currently, **overlay2** is the only storage driver supported by isula-builder.
+5. Currently, the Docker image format is the only image format supported by isula-builder.
+6. You are advised to set the file permission of the Dockerfile to **0600** to avoid tampering by other users.
+7. Currently, only the host network is supported by the **RUN** instruction.
+8. When exporting an image to a tar package, isula-builder currently supports only the tar compression format.
+9. The base image size is limited to 1 GB when importing a base image using `import`.
+
+## Differences with "docker build"
+
+`isula-build` complies with the [Dockerfile specification](https://docs.docker.com/engine/reference/builder), but there are also some subtle differences between `isula-builder` and `docker build`, as follows:
+
+1. isula-builder commits after each build stage, not after every line.
+2. Build cache is not supported by isula-builder.
+3. Only the **RUN** instruction is executed in the build container.
+4. Build history is not supported currently.
+5. A stage name can start with a digit.
+6. The length of a stage name is limited to 64 characters in `isula-builder`.
+7. Currently, the **ADD** instruction source cannot be a remote URL.
+8. 
Resource restriction on a single build is not supported. If resource restriction is required, you can configure a resource limit on isula-builder.
+9. `isula-builder` sums the tar size of each original layer to calculate the image size, whereas docker uses only the diff content of each layer. Therefore, the image size listed by `isula-builder images` differs from that listed by `docker images`.
+10. The image name must be in the *NAME:TAG* format, for example, **busybox:latest**, where **latest** must not be omitted.
+
+# Appendix
+
+## Command Line Parameters
+
+**Table 1** Parameters of the `ctr-img build` command
+
+| **Command** | **Parameter** | **Description** |
+| ------------- | -------------- | ------------------------------------------------------------ |
+| ctr-img build | --build-arg | String list, which contains variables required during the build. |
+| | --build-static | Key value, which is used to build binary equivalence. Currently, the following key value is included: build-time: string, which indicates that a fixed timestamp is used to build a container image. The timestamp format is YYYY-MM-DD HH-MM-SS. |
+| | -f, --filename | String, which indicates the path of the Dockerfile. If this parameter is not specified, the current path is used. |
+| | --format | String, which indicates the image format, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled). |
+| | --iidfile | String, which indicates the local file to which the image ID is written. |
+| | -o, --output | String, which indicates the image export mode and path.|
+| | --proxy | Boolean, which indicates whether to inherit the proxy environment variables on the host. The default value is true. |
+| | --tag | String, which indicates the tag of the image that is successfully built. 
|
+| | --cap-add | String list, which contains permissions required by the **RUN** instruction during the build process.|
+
+**Table 2** Parameters of the `ctr-img load` command
+
+| **Command** | **Parameter** | **Description** |
+| ------------ | ----------- | --------------------------------- |
+| ctr-img load | -i, --input | String, which indicates the path of the local .tar package to be imported.|
+
+**Table 3** Parameters of the `ctr-img push` command
+
+| **Command** | **Parameter** | **Description** |
+| ------------ | ----------- | --------------------------------- |
+| ctr-img push | -f, --format | String, which indicates the format of the pushed image, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled).|
+
+**Table 4** Parameters of the `ctr-img rm` command
+
+| **Command** | **Parameter** | **Description** |
+| ---------- | ----------- | --------------------------------------------- |
+| ctr-img rm | -a, --all | Boolean, which is used to delete all local persistent images. |
+| | -p, --prune | Boolean, which is used to delete all images that are stored persistently on the local host and do not have tags. |
+
+**Table 5** Parameters of the `ctr-img save` command
+
+| **Command** | **Parameter** | **Description** |
+| ------------ | ------------ | ---------------------------------- |
+| ctr-img save | -o, --output | String, which indicates the local path for storing the exported images.|
+| | -f, --format | String, which indicates the format of the exported image, **oci** or **docker** (**ISULABUILD_CLI_EXPERIMENTAL** needs to be enabled).|
+
+**Table 6** Parameters of the `login` command
+
+| **Command** | **Parameter** | **Description** |
+| -------- | -------------------- | ------------------------------------------------------- |
+| login | -p, --password-stdin | Boolean, which indicates whether to read the password through stdin; otherwise, the password is entered in interactive mode. 
|
+| | -u, --username | String, which indicates the username for logging in to the image repository.|
+
+**Table 7** Parameters of the `logout` command
+
+| **Command** | **Parameter** | **Description** |
+| -------- | --------- | ------------------------------------ |
+| logout | -a, --all | Boolean, which indicates whether to log out of all logged-in image repositories. |
+
+**Table 8** Parameters of the `manifest annotate` command
+
+| **Command** | **Parameter** | **Description** |
+| ----------------- | ------------- | ---------------------------- |
+| manifest annotate | --arch | Sets the architecture. |
+| | --os | Sets the operating system. |
+| | --os-features | Sets the operating system features. |
+| | --variant | Sets the architecture variant. |
+
+## Communication Matrix
+
+The isula-build component processes communicate with each other through a Unix socket file. No port is used for communication.
+
+## Files and Permissions
+
+- All isula-build operations must be performed by the **root** user. To perform operations as a non-privileged user, you need to configure the `--group` option.
+
+- The following table lists the file permissions involved in the running of isula-build.
+
+| **File Path** | **File/Folder Permission** | **Description** |
+| ------------------------------------------- | ------------------- | ------------------------------------------------------------ |
+| /usr/bin/isula-build | 550 | Binary file of the command line tool. |
+| /usr/bin/isula-builder | 550 | Binary file of the isula-builder process. |
+| /usr/lib/systemd/system/isula-build.service | 640 | systemd configuration file, which is used to manage the isula-build service. |
+| /usr/isula-build | 650 | Root directory of the isula-builder configuration files. |
+| /etc/isula-build/configuration.toml | 600 | General isula-builder configuration file, including the settings of the isula-builder log level, persistency directory, runtime directory, and OCI runtime. 
| +| /etc/isula-build/policy.json | 600 | Syntax file of the signature verification policy file. | +| /etc/isula-build/registries.toml | 600 | Configuration file of each image repository, including the available image repository list and image repository blacklist. | +| /etc/isula-build/storage.toml | 600 | Configuration file of the local persistent storage, including the configuration of the used storage driver. | +| /etc/isula-build/isula-build.pub | 400 | Asymmetric encryption public key file. | +| /var/run/isula_build.sock | 660 | Local socket of isula-builder. | +| /var/lib/isula-build | 700 | Local persistency directory. | +| /var/run/isula-build | 700 | Local runtime directory. | +| /var/lib/isula-build/tmp/\[build_id\]/isula-build-tmp-*.tar | 644 | Local temporary directory for storing the images when they are exported to iSulad. | diff --git a/docs/en/docs/Container/kuasar-appendix.md b/docs/en/docs/Container/kuasar-appendix.md deleted file mode 100644 index f2d7af6a8de4cb6df136e1976a98be843fa1f3bc..0000000000000000000000000000000000000000 --- a/docs/en/docs/Container/kuasar-appendix.md +++ /dev/null @@ -1,24 +0,0 @@ -# Appendix - -Fields in the **/var/lib/kuasar/config_stratovirt.toml** configuration file: - -```text -[sandbox] -log_level: Kuasar log level. The default value is info. - -[hypervisor] -path: path of the StratoVirt binary file -machine_type: the processor type to be simulated (virt for the Arm architecture and q35 for the x86 architecture) -kernel_path: execution path of the guest kernel -image_path: execution path of the guest image -initrd_path: execution path of the guest initrd (Configure either initrd_path or image_path.) 
-kernel_params: guest kernel parameters -vcpus: default number of vCPUs for each sandbox (default: 1) -memory_in_mb: default memory size of each sandbox (default: 1024 MiB) -block_device_driver: block device driver -debug: whether to enable debug mode -enable_mem_prealloc: whether to enable memory pre-allocation - -[hypervisor.virtiofsd_conf] -path: path of vhost_user_fs -``` diff --git a/docs/en/docs/Gazelle/Gazelle.md b/docs/en/docs/Gazelle/Gazelle.md index 257c1d37a6c8a2cc3c0c0672a5717c671f3ce8be..c93925431235edb7093763d32c66211903a4dee4 100644 --- a/docs/en/docs/Gazelle/Gazelle.md +++ b/docs/en/docs/Gazelle/Gazelle.md @@ -9,7 +9,7 @@ Zero-copy and lock-free packets that can be flexibly scaled out and scheduled ad - Universality Compatible with POSIX without modification, and applicable to different types of applications. -In the single-process scenario where the NIC supports multiple queues, use **liblstack.so** only to shorten the packet path. In other scenarios, use the ltran process to distribute packets to each thread. +In the single-process scenario where the NIC supports multiple queues, use **liblstack.so** only to shorten the packet path. ## Installation @@ -33,21 +33,7 @@ To configure the operating environment and use Gazelle to accelerate application ### 1. Installing the .ko File as the root User -Install the .ko files based on the site requirements to enable the virtual network ports and bind NICs to the user-mode driver. -To enable the virtual network port function, use **rte_kni.ko**. - -```sh -modprobe rte_kni carrier="on" -``` - -Configure NetworkManager not to manage the KNI NIC. - -```sh -$ cat /etc/NetworkManager/conf.d/99-unmanaged-devices.conf -[keyfile] -unmanaged-devices=interface-name:kni -$ systemctl reload NetworkManager -``` +Install the .ko files based on the site requirements to bind NICs to the user-mode driver. Bind the NIC from the kernel driver to the user-mode driver. 
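For reference, the rebinding step can be sketched with DPDK's `dpdk-devbind.py` script. This is an illustrative sketch only: the NIC name `enp3s0` and the `vfio-pci` driver are assumptions; substitute the .ko module you choose in the step below. The commands are wrapped in a function so they can be reviewed before running on real hardware as root.

```shell
# Sketch: rebind a NIC from the kernel driver to a user-mode driver.
# enp3s0 and vfio-pci are assumptions; adjust to your hardware and chosen .ko.
bind_nic_to_userspace() {
    local nic="$1"
    modprobe vfio-pci                      # load the selected user-mode driver
    ip link set "$nic" down                # the NIC must be down before rebinding
    dpdk-devbind.py --bind=vfio-pci "$nic" # hand the NIC to the user-mode driver
    dpdk-devbind.py --status-dev net       # verify the binding result
}
# Example (run as root): bind_nic_to_userspace enp3s0
```

To bind the NIC back to the kernel driver later, exit Gazelle first and rebind with `dpdk-devbind.py --bind=<kernel_driver>`.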
Choose one of the following .ko files based on the site requirements. @@ -96,21 +82,15 @@ Run the **cat** command to query the actual number of reserved pages. If the con ### 4. Mounting Memory Huge Pages -Create two directories for the lstack and ltran processes to access the memory huge pages. Run the following commands: +Create a directory for the lstack process to access the memory huge pages. Run the following commands: ```sh -mkdir -p /mnt/hugepages-ltran mkdir -p /mnt/hugepages-lstack -chmod -R 700 /mnt/hugepages-ltran chmod -R 700 /mnt/hugepages-lstack -mount -t hugetlbfs nodev /mnt/hugepages-ltran -o pagesize=2M mount -t hugetlbfs nodev /mnt/hugepages-lstack -o pagesize=2M ``` ->NOTE: -The huge pages mounted to **/mnt/hugepages-ltran** and **/mnt/hugepages-lstack** must be in the same page size. - ### 5. Enabling Gazelle for an Application Enable Gazelle for an application using either of the following methods as required. @@ -147,22 +127,27 @@ GAZELLE_BIND_PROCNAME=test GAZELLE_THREAD_NAME=test_thread LD_PRELOAD=/usr/lib64 |:---|:---|:---| |dpdk_args|--socket-mem (mandatory)
--huge-dir (mandatory)
--proc-type (mandatory)
--legacy-mem
--map-perfect
-d|DPDK initialization parameter. For details, see the DPDK description.
**--map-perfect** is an extended feature. It is used to prevent the DPDK from occupying excessive address space and ensure that extra address space is available for lstack.
The **-d** option is used to load the specified .so library file.| |listen_shadow| 0/1 | Whether to use the shadow file descriptor for listening. This function is enabled when there is a single listen thread and multiple protocol stack threads.| -|use_ltran| 0/1 | Whether to use ltran.| +|use_ltran| 0/1 | Whether to use ltran. This parameter is no longer supported.| |num_cpus|"0,2,4 ..."|IDs of the CPUs bound to the lstack threads. The number of IDs is the number of lstack threads (less than or equal to the number of NIC queues). You can select CPUs by NUMA nodes.| -|num_wakeup|"1,3,5 ..."|IDs of the CPUs bound to the wakeup threads. The number of IDs is the number of wakeup threads, which is the same as the number of lstack threads. Select CPUs of the same NUMA nodes of the **num_cpus** parameter respectively. If this parameter is not set, the wakeup thread is not used.| |low_power_mode|0/1|Whether to enable the low-power mode. This parameter is not supported currently.| -|kni_switch|0/1|Whether to enable the rte_kni module. The default value is **0**. This module can be enabled only when ltran is not used.| +|kni_switch|0/1|Whether to enable the rte_kni module. The default value is **0**. This parameter is no longer supported.| |unix_prefix|"string"|Prefix string of the Unix socket file used for communication between Gazelle processes. By default, this parameter is left blank. The value must be the same as the value of **unix_prefix** in **ltran.conf** of the ltran process that participates in communication, or the value of the **-u** option for `gazellectl`. The value cannot contain special characters and can contain a maximum of 128 characters.| |host_addr|"192.168.xx.xx"|IP address of the protocol stack, which is also the IP address of the application.| |mask_addr|"255.255.xx.xx"|Subnet mask.| |gateway_addr|"192.168.xx.1"|Gateway address.| -|devices|"aa:bb:cc:dd:ee:ff"|MAC address for NIC communication. 
The value must be the same as that of **bond_macs** in the **ltran.conf** file.| +|devices|"aa:bb:cc:dd:ee:ff"|MAC address for NIC communication. The NIC is used as the primary bond NIC in bond 1 mode. | |app_bind_numa|0/1|Whether to bind the epoll and poll threads of an application to the NUMA node where the protocol stack is located. The default value is 1, indicating that the threads are bound.| |send_connect_number|4|Number of connections for sending packets in each protocol stack loop. The value is a positive integer.| |read_connect_number|4|Number of connections for receiving packets in each protocol stack loop. The value is a positive integer.| |rpc_number|4|Number of RPC messages processed in each protocol stack loop. The value is a positive integer.| |nic_read_num|128|Number of data packets read from the NIC in each protocol stack cycle. The value is a positive integer.| -|mbuf_pool_size|1024000|Size of the mbuf address pool applied for during initialization. Set this parameter based on the NIC configuration. The value must be a positive integer less than 5120000 and not too small, otherwise the startup fails.| +|bond_mode|-1|Bond mode. Currently, two network ports can be bonded. The default value is -1, indicating that the bond mode is disabled. bond1/4/6 is supported.| +|bond_slave_mac|"aa:bb:cc:dd:ee:ff;AA:BB:CC:DD:EE:FF"|MAC addresses of the bond network ports. Separate the MAC addresses with semicolons (;).| +|bond_miimon|10|Listening interval in bond mode. The default value is 10. The value ranges from 0 to 1500.| +|udp_enable|0/1|Whether to enable the UDP function. The default value is 1.| +|nic_vlan_mode|-1|Whether to enable the VLAN mode. The default value is -1, indicating that the VLAN mode is disabled. The value ranges from -1 to 4095. IDs 0 and 4095 are commonly reserved in the industry and have no actual effect.| +|tcp_conn_count|1500|Maximum number of TCP connections. 
The value of this parameter multiplied by **mbuf_count_per_conn** is the size of the mbuf pool applied for during initialization. If the value is too small, the startup fails. The value of (**tcp_conn_count** x **mbuf_count_per_conn** x 2048) cannot be greater than the huge page size.| +|mbuf_count_per_conn|170|Number of mbuf required by each TCP connection. The value of this parameter multiplied by **tcp_conn_count** is the size of the mbuf address pool applied for during initialization. If the value is too small, the startup fails. The value of (**tcp_conn_count** x **mbuf_count_per_conn** x 2048) cannot be greater than the huge page size.| lstack.conf example: @@ -187,55 +172,15 @@ read_connect_number=4 rpc_number=4 nic_read_num=128 mbuf_pool_size=1024000 -``` - -- The **ltran.conf** file is used to specify ltran startup parameters. The default path is **/etc/gazelle/ltran.conf**. To enable ltran, set **use_ltran=1** in the **lstack.conf** file. The configuration parameters are as follows: - -|Options|Value|Remarks| -|:---|:---|:---| -|forward_kit|"dpdk"|Specified transceiver module of an NIC.
This field is reserved and is not used currently.| -|forward_kit_args|-l
--socket-mem (mandatory)
--huge-dir (mandatory)
--proc-TYPE (mandatory)
--legacy-mem (mandatory)
--map-perfect (mandatory)
-d|DPDK initialization parameter. For details, see the DPDK description.
**--map-perfect** is an extended feature. It is used to prevent the DPDK from occupying excessive address space and ensure that extra address space is available for lstack.
The **-d** option is used to load the specified .so library file.| -|kni_switch|0/1|Whether to enable the rte_kni module. The default value is **0**.| -|unix_prefix|"string"|Prefix string of the Unix socket file used for communication between Gazelle processes. By default, this parameter is left blank. The value must be the same as the value of **unix_prefix** in **lstack.conf** of the lstack process that participates in communication, or the value of the **-u** option for `gazellectl`.| -|dispatch_max_clients|n|Maximum number of clients supported by ltran.
The total number of lstack protocol stack threads cannot exceed 32.| -|dispatch_subnet|192.168.xx.xx|Subnet mask, which is the subnet segment of the IP addresses that can be identified by ltran. The value is an example. Set the subnet based on the site requirements.| -|dispatch_subnet_length|n|Length of the Subnet that can be identified by ltran. For example, if the value of length is 4, the value ranges from 192.168.1.1 to 192.168.1.16.| -|bond_mode|n|Bond mode. Currently, only Active Backup(Mode1) is supported. The value is 1.| -|bond_miimon|n|Bond link monitoring time. The unit is millisecond. The value ranges from 1 to 2^64 - 1 - (1000 x 1000).| -|bond_ports|"0x01"|DPDK NIC to be used. The value **0x01** indicates the first NIC.| -|bond_macs|"aa:bb:cc:dd:ee:ff"|MAC address of the bound NIC, which must be the same as the MAC address of the KNI.| -|bond_mtu|n|Maximum transmission unit. The default and maximum value is 1500. The minimum value is 68.| - -ltran.conf example: - -```sh -forward_kit_args="-l 0,1 --socket-mem 1024,0,0,0 --huge-dir /mnt/hugepages-ltran --proc-type primary --legacy-mem --map-perfect --syslog daemon" -forward_kit="dpdk" - -kni_switch=0 - -dispatch_max_clients=30 -dispatch_subnet="192.168.1.0" -dispatch_subnet_length=8 - bond_mode=1 -bond_mtu=1500 -bond_miimon=100 -bond_macs="aa:bb:cc:dd:ee:ff" -bond_ports="0x1" - -tcp_conn_scan_interval=10 +bond_slave_mac="aa:bb:cc:dd:ee:ff;AA:BB:CC:DD:EE:FF" +udp_enable=1 +nic_vlan_mode=-1 ``` -### 7. Starting an Application - -- Start the ltran process. -If there is only one process and the NIC supports multiple queues, the NIC multi-queue is used to distribute packets to each thread. You do not need to start the ltran process. Set the value of **use_ltran** in the **lstack.conf** file to **0**. -If you do not use `-config-file` to specify a configuration file when starting ltran, the default configuration file path **/etc/gazelle/ltran.conf** is used. +- The ltran mode is deprecated. 
If multiple processes are required, try the virtual network mode using SR-IOV network hardware.

-```sh
-ltran --config-file ./ltran.conf
-```
+### 7. Starting an Application

- Start the application. If the environment variable **LSTACK_CONF_PATH** is not used to specify the configuration file before the application is started, the default configuration file path **/etc/gazelle/lstack.conf** is used.

@@ -249,42 +194,31 @@ LD_PRELOAD=/usr/lib64/liblstack.so GAZELLE_BIND_PROCNAME=redis-server redis-ser

Gazelle wraps the POSIX interfaces of the application. The code of the application does not need to be modified.

-### 9. Commissioning Commands
-
-- If the ltran mode is not used, the **gazellectl ltran xxx** and **gazellectl lstack show {ip | pid} -r** commands are not supported.
+### 9. Debugging Commands

```sh
Usage: gazellectl [-h | help]
-  or: gazellectl ltran {quit | show | set} [LTRAN_OPTIONS] [time] [-u UNIX_PREFIX]
  or: gazellectl lstack {show | set} {ip | pid} [LSTACK_OPTIONS] [time] [-u UNIX_PREFIX]

-  quit ltran process exit
-
- where LTRAN_OPTIONS :=
-  show ltran all statistics
-  -r, rate show ltran statistics per second
-  -i, instance show ltran instance register info
-  -b, burst show ltran NIC packet len per second
-  -l, latency show ltran latency
- set:
-  loglevel {error | info | debug} set ltran loglevel
-
 where LSTACK_OPTIONS :=
 show lstack all statistics
 -r, rate show lstack statistics per second
 -s, snmp show lstack snmp
 -c, connect show lstack connect
 -l, latency show lstack latency
+ -x, xstats show lstack xstats
+ -k, nic-features show state of protocol offload and other features
+ -a, aggregation [time] show lstack send/recv aggregation
 set:
 loglevel {error | info | debug} set lstack loglevel
 lowpower {0 | 1} set lowpower enable
 [time] measure latency time default 1S
```

-The `-u` option specifies the prefix of the Unix socket for communication between Gazelle processes.
The value of this parameter must be the same as that of **unix_prefix** in the **ltran.conf** or **lstack.conf** file. +The `-u` option specifies the prefix of the Unix socket for communication between Gazelle processes. The value of this parameter must be the same as that of **unix_prefix** in the **lstack.conf** file. **Packet Capturing Tool** -The NIC used by Gazelle is managed by DPDK. Therefore, tcpdump cannot capture Gazelle packets. As a substitute, Gazelle uses gazelle-pdump provided in the dpdk-tools software package as the packet capturing tool. gazelle-pdump uses the multi-process mode of DPDK to share memory with the lstack or ltran process. In ltran mode, gazelle-pdump can capture only ltran packets that directly communicate with the NIC. By filtering tcpdump data packets, gazelle-pdump can filter packets of a specific lstack process. +The NIC used by Gazelle is managed by DPDK. Therefore, tcpdump cannot capture Gazelle packets. As a substitute, Gazelle uses gazelle-pdump provided in the dpdk-tools software package as the packet capturing tool. gazelle-pdump uses the multi-process mode of DPDK to share memory with the lstack process. [Usage](https://gitee.com/openeuler/gazelle/blob/master/doc/pdump/pdump.md) **Thread Binding** @@ -309,11 +243,9 @@ Restrictions of Gazelle are as follows: - Blocking **accept()** or **connect()** is not supported. - A maximum of 1500 TCP connections are supported. -- Currently, only TCP, ICMP, ARP, and IPv4 are supported. +- Currently, only TCP, ICMP, ARP, IPv4, and UDP are supported. - When a peer end pings Gazelle, the specified packet length must be less than or equal to 14,000 bytes. - Transparent huge pages are not supported. -- ltran does not support the hybrid bonding of multiple types of NICs. 
-- The active/standby mode (bond1 mode) of ltran supports active/standby switchover only when a fault occurs at the link layer (for example, the network cable is disconnected), but does not support active/standby switchover when a fault occurs at the physical layer (for example, the NIC is powered off or removed). - VM NICs do not support multiple queues. ### Operation Restrictions @@ -321,24 +253,25 @@ Restrictions of Gazelle are as follows: - By default, the command lines and configuration files provided by Gazelle requires **root** permissions. Privilege escalation and changing of file owner are required for non-root users. - To bind the NIC from user-mode driver back to the kernel driver, you must exit Gazelle first. - Memory huge pages cannot be remounted to subdirectories created in the mount point. -- The minimum huge page memory required by ltran is 1 GB. - The minimum hugepage memory of each application instance protocol stack thread is 800 MB. - Gazelle supports only 64-bit OSs. - The `-march=native` option is used when building the x86 version of Gazelle to optimize Gazelle based on the CPU instruction set of the build environment (Intel® Xeon® Gold 5118 CPU @ 2.30GHz). Therefore, the CPU of the operating environment must support the SSE4.2, AVX, AVX2, and AVX-512 instruction set extensions. - The maximum number of IP fragments is 10 (the maximum ping packet length is 14,790 bytes). TCP does not use IP fragments. - You are advised to set the **rp_filter** parameter of the NIC to 1 using the `sysctl` command. Otherwise, the Gazelle protocol stack may not be used as expected. Instead, the kernel protocol stack is used. -- If ltran is not used, the KNI cannot be configured to be used only for local communication. In addition, you need to configure the NetworkManager not to manage the KNI network adapter before starting Gazelle. -- The IP address and MAC address of the virtual KNI must be the same as those in the **lstack.conf** file. 
+- The hybrid bonding of multiple types of NICs is not supported.
+- The active/standby mode (bond1 mode) supports active/standby switchover only when a fault occurs at the link layer (for example, the network cable is disconnected), but does not support active/standby switchover when a fault occurs at the physical layer (for example, the NIC is powered off or removed).
+- If the length of UDP packets to be sent exceeds 45952 (32 x 1436) bytes, increase the value of **send_ring_size** to at least 64.

## Precautions

You need to evaluate the use of Gazelle based on application scenarios.

+The ltran mode and the kni module are no longer supported due to changes in the dependencies and the upstream community.
+
**Shared Memory**

- Current situation:

-  The memory huge pages are mounted to the **/mnt/hugepages-lstack** directory. During process initialization, files are created in the **/mnt/hugepages-lstack** directory. Each file corresponds to a huge page, and the mmap function is performed on the files. After receiving the registration information of lstask, ltran also perform the mmap function on the file in the directory based on the huge page memory configuration information to implement shared huge page memory.
-  The procedure also applies to the files in the **/mnt/hugepages-ltran** directory.
+  The memory huge pages are mounted to the **/mnt/hugepages-lstack** directory. During process initialization, files are created in the **/mnt/hugepages-lstack** directory. Each file corresponds to a huge page, and the mmap function is performed on the files.

- Current mitigation measures

  The huge page file permission is **600**. Only the owner can access the files. The default owner is the **root** user. Other users can be configured. Huge page files are locked by DPDK and cannot be directly written or mapped.

@@ -349,4 +282,4 @@ You need to evaluate the use of Gazelle based on application scenarios.

Gazelle does not limit the traffic.
Users can send packets at the maximum NIC line rate to the network, which may congest the network.

**Process Spoofing**
-If two lstack processes A and B are legitimately registered with ltran, A can impersonate B to send spoofing messages to ltran and modify the ltran forwarding control information. As a result, the communication of B becomes abnormal, and information leakage occurs when packets for B are sent to A. Ensure that all lstack processes are trusted.
+Ensure that all lstack processes are trusted.
diff --git a/docs/en/docs/Installation/RISC-V-LicheePi4A.md b/docs/en/docs/Installation/RISC-V-LicheePi4A.md
new file mode 100644
index 0000000000000000000000000000000000000000..d10c5bf708ea9b9dda285614659b81ef937b8072
--- /dev/null
+++ b/docs/en/docs/Installation/RISC-V-LicheePi4A.md
@@ -0,0 +1,96 @@
+# Installing on LicheePi 4A
+
+## Hardware Preparation
+
+- `Sipeed LicheePi 4A` device (either `8 GB` or `16 GB` version)
+- Monitor
+- `USB` keyboard and mouse
+- Equipment/components required for serial operation (optional)
+- `RJ45` network cable and router/switch for wired network connection
+
+## Device Firmware
+
+Different memory versions of the `LicheePi 4A` require different firmware:
+
+- `u-boot-with-spl-lpi4a.bin` is the u-boot file for the 8 GB version.
+- `u-boot-with-spl-lpi4a-16g.bin` is the u-boot file for the 16 GB version.
+
+The following flashing method uses the `16GB + 128GB` core board as an example, assuming you have already downloaded the `base` image and the corresponding u-boot file.
+
+## Flashing Method
+
+### Flashing Tools
+
+Use the `fastboot` command for flashing. You can download `burn_tools.zip` from `https://dl.sipeed.com/shareURL/LICHEE/licheepi4a/07_Tools`. The archive contains flashing tools for `Windows`, `macOS`, and `Linux`.
+
+### Set Hardware to Enter Flashing Mode
+
+> First, check that the DIP switch on the baseboard is set to EMMC boot mode. After confirming, proceed with the flashing.
+
+Hold down the `BOOT` button on the board, then insert the `USB-C` cable to power on the device (the other end of the cable should be connected to a `PC`), entering `USB` flashing mode.
+On `Windows`, check `Device Manager` for the `USB download gadget` device.
+On `Linux`, use `lsusb` to check for the device, which shows: `ID 2345:7654 T-HEAD USB download gadget`.
+
+### Driver Installation on Windows
+
+> Note:
+> The provided image does not include Windows drivers. You can download `burn_tools.zip` [here](https://dl.sipeed.com/shareURL/LICHEE/licheepi4a/07_Tools) and find the `windows/usb_driver-fullmask` folder inside. This folder contains the drivers needed for Windows.
+
+To flash on Windows, you need to enter advanced startup mode and disable digital signature enforcement. Follow the steps below to disable digital signature enforcement:
+
+#### Windows 10
+
+1. Go to `Settings` -> `Update & Security`.
+2. Click `Recovery` on the left, then click `Restart now` under `Advanced startup`. Your computer will restart. Save any ongoing work before proceeding.
+
+#### Windows 11
+
+1. Go to `Settings` -> `System` -> `Recovery`.
+2. Click `Restart now` under `Advanced startup`. Your computer will restart. Save any ongoing work before proceeding.
+
+#### After Restart
+
+1. Click `Troubleshoot`, then `Advanced options` -> `Startup Settings`. The system will restart again.
+2. After restarting, select `Disable driver signature enforcement`. This option is usually number 7 but may vary. After selecting the appropriate option, the system will restart again.
+3. After rebooting into the system, install the driver. Open `Device Manager`, find `USB download gadget` under `Other devices`, and double-click it.
+4. Click `Update driver` under the `General` tab.
+5. On the `Browse my computer for drivers` page, paste the path to the `usb_driver-fullmask` directory.
+6. Click `Next` to install the driver.
+
+### Flashing the Image
+
+After entering flashing mode, use fastboot to flash the image. On macOS or Linux, if you installed fastboot manually, you may need to grant it executable permissions.
+
+#### Windows Steps
+
+First, add `fastboot` to the system environment variable `PATH`, or place `fastboot` in the same directory. Also, extract the image files. Open `PowerShell` and execute the following commands:
+
+```bash
+# Replace with the u-boot file corresponding to your board version
+fastboot flash ram u-boot-with-spl-lpi4a-16g.bin
+fastboot reboot
+# After rebooting, wait 5 seconds before continuing
+# Replace with the u-boot file corresponding to your board version
+fastboot flash uboot u-boot-with-spl-lpi4a-16g.bin
+fastboot flash boot openEuler-24.03-LTS-riscv64-lpi4a-base-boot.ext4
+fastboot flash root openEuler-24.03-LTS-riscv64-lpi4a-base-root.ext4
+```
+
+#### Linux/macOS Steps
+
+You may need to prefix the fastboot commands with `sudo`.
+
+```bash
+# Replace with the u-boot file corresponding to your board version
+sudo fastboot flash ram u-boot-with-spl-lpi4a-16g.bin
+sudo fastboot reboot
+# After rebooting, wait 5 seconds before continuing
+# Replace with the u-boot file corresponding to your board version
+sudo fastboot flash uboot u-boot-with-spl-lpi4a-16g.bin
+sudo fastboot flash boot openEuler-24.03-LTS-riscv64-lpi4a-base-boot.ext4
+sudo fastboot flash root openEuler-24.03-LTS-riscv64-lpi4a-base-root.ext4
+```
+
+## Hardware Availability
+
+The official release is based on the [`openEuler kernel 6.6`](./RISCV-OLK6.6.md) version, and not all kernel modules are fully supported. This version emphasizes a consistent official ecosystem experience. For more complete hardware functionality, use third-party releases.
diff --git a/docs/en/docs/Installation/RISC-V-Pioneer1.3.md b/docs/en/docs/Installation/RISC-V-Pioneer1.3.md new file mode 100644 index 0000000000000000000000000000000000000000..feb321f4e7886cdbaf76dc8070f001c05367e8f2 --- /dev/null +++ b/docs/en/docs/Installation/RISC-V-Pioneer1.3.md @@ -0,0 +1,104 @@ +# Installing on Pioneer Box + +## Hardware Preparation + +- `Milk-V Pioneer v1.3` device or motherboard (with necessary peripherals) - 1 set + +- `m.2 NVMe` solid-state drive - 1 unit + +> If it contains data, format it to clear the data (make sure to back up personal files) +> +> If you have a `PCIe` adapter card, place it in the first `PCIe` slot of the device (recommended) +> +> If no `PCIe` adapter card, use the onboard `NVMe` interface + +- `AMD R5 230` graphics card - 1 unit + +> Place it in the second `PCIe` slot of the device + +- `USB` flash drive - 1 unit + +> Should be at least `16GiB` + +- `microSD card` - 1 unit + +> Should be at least `4GiB` + +- Monitor (the display interface should match the graphics card) + +- `USB` keyboard and mouse - 1 set + +- Equipment/components required for serial operation (optional) + +- `RJ45` network cable - at least 1, and router/switch for wired network connection + +> It is recommended to use the device's onboard `RJ45` network port rather than the manufacturer's provided `PCIe` network card. +> +> The device does not come with a `WiFi` network card and does not support WiFi or Bluetooth connectivity. Please prepare the corresponding equipment if needed. + +## Types of Images + +### ISO + +> `ISO` images support booting via `UEFI`, corresponding to the **UEFI version** firmware described below. + +Download the `ISO` file (e.g., `openEuler-24.03-LTS-riscv64-dvd.iso`) from the official download page and burn it to a **USB flash drive**. + +- It is recommended to use the `Balena Etcher` software for graphical burning [download from `https://github.com/balena-io/etcher/releases/latest`]. 
The burning process is not detailed here. +- In a command-line environment, you can also use the `dd` method to burn the image. Refer to the following command: + +```txt +~$ sudo dd if=openEuler-24.03-LTS-riscv64-dvd.iso of=/dev/sda bs=512K iflag=fullblock oflag=direct conv=fsync status=progress +``` + +### Image + +> `Image` images support booting via `Legacy`, corresponding to the **non-UEFI version** firmware described below. + +Download the `Zip` archive containing the image (e.g., `openEuler-24.03-LTS-riscv64-sg2042.img.zip`) from the official download page and burn it directly to an **SDCARD** or **solid-state drive**. + +## Device Firmware + +> Since the device's factory firmware currently does not support `UEFI`, users of the `ISO` version need to manually replace the firmware with the `UEFI` version based on `EDK2`. + +Download the device firmware archive `sg2042_firmware_uefi.zip` from the official download page under the **Embedded category**, extract it, and burn the `img` file to an **SDCARD**. + +```txt +~$ sudo dd if=firmware_single_sg2042-master.img of=/dev/sda bs=512K iflag=fullblock oflag=direct conv=fsync status=progress +261619712 bytes (262 MB, 250 MiB) copied, 20 s, 13.1 MB/s +268435456 bytes (268 MB, 256 MiB) copied, 20.561 s, 13.1 MB/s + +512+0 records in +512+0 records out +268435456 bytes (268 MB, 256 MiB) copied, 20.5611 s, 13.1 MB/s +``` + +> Due to the older firmware version shipped with the device, Image users who want to use a newer version of the firmware can update to the **non-UEFI version** of the firmware. + +Download the device firmware package `sg2042_firmware_uboot.zip` from the embedded category on the download page, and follow the same procedure as the UEFI firmware to extract and flash the img file to the **SDCARD**. + +After burning, insert the **SDCARD** into the device's card slot. 
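Before writing either image with `dd`, verifying the download against its published checksum avoids flashing a corrupted file. A minimal sketch — `demo.img` is a placeholder, not an actual release artifact; in practice the `.sha256sum` file is downloaded alongside the image rather than generated:

```bash
# Verify an image against its SHA-256 checksum file before burning.
# "demo.img" stands in for the real image; the .sha256sum file is normally
# published next to the image on the download page, not generated locally.
cd "$(mktemp -d)"
printf 'example image data\n' > demo.img
sha256sum demo.img > demo.img.sha256sum   # stand-in for the downloaded checksum file
sha256sum -c demo.img.sha256sum           # prints "demo.img: OK" on success
```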
+ +## Pre-Startup Checks + +For `ISO` version users: + +- Ensure the `microSD card` with the `UEFI` firmware is inserted into the device's card slot. + + > The current `UEFI` firmware does not support manually adjusting or specifying the boot order. + +- If using the factory-provided solid-state drive, or if another bootable `RISC-V` operating system exists on the drive, remove the solid-state drive for formatting or replace it with another empty solid-state drive to avoid interference with the boot order. + +For `Image` version users: + +- If using the factory-provided solid-state drive, or if another bootable `RISC-V` operating system exists on the drive, remove the solid-state drive for formatting or replace it with another empty solid-state drive to avoid interference with the boot order. + +## Notes for Use + +For `ISO` version users: + +- Due to the limitations of the current version of the `UEFI` firmware, the `Grub2` boot menu may take a long time (~15s) to load and respond slowly if the graphics card is inserted into the `PCIe` slot during startup. + +For `Image` version users: + +- Due to the limitations of the current factory firmware, the `RISC-V` serial output is incomplete during device startup, and the serial output is cut off before the operating system is fully loaded. The graphics card needs to be inserted into the `PCIe` slot and connected to a monitor to observe the complete startup process.
diff --git a/docs/en/docs/Installation/RISC-V-QEMU.md b/docs/en/docs/Installation/RISC-V-QEMU.md new file mode 100644 index 0000000000000000000000000000000000000000..b44198e923609dfeab014337d49719e5dcb1984e --- /dev/null +++ b/docs/en/docs/Installation/RISC-V-QEMU.md @@ -0,0 +1,70 @@ +# Installing on QEMU + +## Firmware + +### Standard EFI Firmware + +Download the following binaries from the download page: + +``` text +RISCV_VIRT_CODE.fd +RISCV_VIRT_VARS.fd +``` + +Alternatively, compile the latest EDK2 OVMF firmware locally according to the [official documentation](https://github.com/tianocore/edk2/tree/master/OvmfPkg/RiscVVirt). + +### EFI Firmware with Penglai TEE Support + +Download the following binary from the download page: + +``` text +fw_dynamic_oe_2403_penglai.bin +``` + +## QEMU Version + +>To support UEFI, QEMU version 8.1 or above is required. +> +>During compilation, the libslirp dependency needs to be installed (package name varies by distribution, for openEuler it is libslirp-devel) and the --enable-slirp parameter should be added. + +``` bash +~$ qemu-system-riscv64 --version +QEMU emulator version 8.2.2 +Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers +``` + +## qcow2 Image + +Obtain the qcow2 image file (e.g., `openEuler-24.03-LTS-riscv64.qcow2.xz`). + +``` bash +~$ ls *.qcow2.xz +openEuler-24.03-LTS-riscv64.qcow2.xz +``` + +## Getting the Startup Script + +Obtain the startup script from the download page: + +- `start_vm.sh`: Default script, requires manual installation of the desktop environment. +- `start_vm_penglai.sh`: Script supporting Penglai TEE functionality. + +Script Parameters: + +- `ssh_port`: Local SSH forwarding port, default is 12055. +- `vcpu`: Number of threads for QEMU execution, default is 8 cores, adjustable as needed. +- `memory`: Amount of memory allocated for QEMU execution, default is 8GiB, adjustable as needed. +- `fw`: Firmware payload for booting. +- `drive`: Path to the virtual disk, adjustable as needed.
+- `bios` (optional): Boot firmware, can be used to load firmware with Penglai TEE enabled. + +## Creating a Virtual Hard Disk File + +Create a new virtual hard disk file. In the example below, the virtual hard disk is 40GiB. +> Do not use qcow2 virtual hard disk files that already contain data to avoid unexpected situations during the boot process. +> +> Ensure that there is only one qcow2 virtual hard disk file in the current directory to avoid errors in script recognition. + +``` bash +~$ qemu-img create -f qcow2 qemu.qcow2 40G +``` diff --git a/docs/en/docs/Installation/RISCV-OLK6.6.md b/docs/en/docs/Installation/RISCV-OLK6.6.md new file mode 100644 index 0000000000000000000000000000000000000000..7a91466c592886e355b64ec999eb6f2f7dc63197 --- /dev/null +++ b/docs/en/docs/Installation/RISCV-OLK6.6.md @@ -0,0 +1,55 @@ +# RISCV-OLK6.6 Source-Compatible Version Guide + +## RISCV-OLK6.6 Source-Compatible Plan + +Currently, the kernel versions maintained by various RISC-V SoC manufacturers are inconsistent, while the openEuler system requires a unified kernel version. This inconsistency results in various operating system versions based on different development boards having divergent third-party kernels, increasing maintenance difficulty and causing ecosystem fragmentation. The goal of the riscv-kernel project is to establish a unified kernel ecosystem for the RISC-V architecture in openEuler, sharing the benefits of openEuler's ecosystem development and influence. This project is under development, and contributions from all parties are welcome. + +The project is primarily developed on the OLK-6.6 branch and will be further integrated into OLK's source and artifact repositories. + +![riscv-olk6.6](figures/riscv-olk6.6.jpg) + +The project has nearly completed the source compatibility work for SG2042 and the basic source compatibility work for TH1520.
+ +## Supported Features + +SG2042 Verification Platform: MilkV Pioneer 1.3 + +TH1520 Verification Platform: LicheePi4A + +### Milk-V Pioneer feature status + +| Features | Status | +| ----------------------- | :----: | +| 64 Core CPU | O | +| PCIe Network Card | O | +| PCIe Graphics Card | O | +| PCIe Slots | O | +| 4x DDR4 128GB RAM | O | +| USB | O | +| Reset | O | +| eMMC | O | +| Micro USB debug console | O | +| micro SD card | O | +| SPI flash | O | +| RVV 0.71 | X | + +### LicheePi 4A feature status + +| Features | Status | +| ---------------- | :----: | +| 4 Core CPU | O | +| RAM | O | +| eMMC | O | +| Ethernet | O | +| WIFI | X | +| GPU IMG BXM-4-64 | X | +| NPU 4TOPS@INT8 | X | +| DSP | X | +| USB | O | +| MicroSD | O | +| GPIO | O | +| PWM-fan | O | +| PVT Sensor | O | +| Reboot | O | +| Poweroff | O | +| cpufreq | O | diff --git a/docs/en/docs/Installation/faqs.md b/docs/en/docs/Installation/faqs.md index 6404d9c95f640c8b2fbefbd360c5506d98236f51..636edc6921458bdcf581580ba2037049da8539fe 100644 --- a/docs/en/docs/Installation/faqs.md +++ b/docs/en/docs/Installation/faqs.md @@ -1,25 +1,25 @@ # FAQs -## openEuler Fails to Start After It Is Installed to the Second Disk +## openEuler Fails to Start After It Is Installed to the Second Drive ### Symptom -The OS is installed on the second disk **sdb** during the installation, causing startup failure. +The OS is installed on the second drive **sdb** during the installation, causing startup failure. ### Possible Causes -When openEuler is installed to the second disk, MBR and GRUB are installed to the second disk **sdb** by default. The following two situations may occur: +When openEuler is installed to the second drive, MBR and GRUB are installed to the second drive **sdb** by default. The following two situations may occur: -1. openEuler installed on the first disk is loaded and started if it is complete. -2. openEuler installed on the first disk fails to be started from hard disks if it is incomplete. +1. 
openEuler installed on the first drive is loaded and started if it is complete. +2. openEuler installed on the first drive fails to be started from hard drives if it is incomplete. -The preceding two situations occur because the first disk **sda** is booted by default to start openEuler in the BIOS window. If openEuler is not installed on the **sda** disk, system restart fails. +The preceding two situations occur because the first drive **sda** is booted by default to start openEuler in the BIOS window. If openEuler is not installed on the **sda** drive, system restart fails. ### Solutions This problem can be solved using either of the following two methods: -- During the openEuler installation, select the first disk or both disks, and install the boot loader on the first disk **sda**. +- During the openEuler installation, select the first drive or both drives, and install the boot loader on the first drive **sda**. - After installing openEuler, restart it by modifying the boot option in the BIOS window. ## openEuler Enters Emergency Mode After It Is Started @@ -32,9 +32,9 @@ openEuler enters emergency mode after it is powered on. ### Possible Causes -Damaged OS files result in disk mounting failure, or overpressured I/O results in disk mounting timeout \(threshold: 90s\). +Damaged OS files result in drive mounting failure, or overpressured I/O results in drive mounting timeout \(threshold: 90s\). -An unexpected system power-off and low I/O performance of disks may also cause the problem. +An unexpected system power-off and low I/O performance of drives may also cause the problem. ### Solutions @@ -42,16 +42,16 @@ An unexpected system power-off and low I/O performance of disks may also cause t 2. Check and restore files by using the file system check \(fsck\) tool, and restart openEuler. >![fig](./public_sys-resources/icon-note.gif) **NOTE:** - >The fsck tool checks and maintains inconsistent file systems. 
If the system is powered off or a disk is faulty, run the **fsck** command to check file systems. Run the **fsck.ext3 -h** and **fsck.ext4 -h** commands to view the usage method of the fsck tool. + >The fsck tool checks and maintains inconsistent file systems. If the system is powered off or a drive is faulty, run the **fsck** command to check file systems. Run the **fsck.ext3 -h** and **fsck.ext4 -h** commands to view the usage method of the fsck tool. -If you want to disable the timeout mechanism of disk mounting, add **x-systemd.device-timeout=0** to the **etc/fstab** file. For example: +If you want to disable the timeout mechanism of drive mounting, add **x-systemd.device-timeout=0** to the **/etc/fstab** file. For example: ```sh # # /etc/fstab # Created by anaconda on Mon Sep 14 17:25:48 2015 # -# Accessible filesystems, by reference, are maintained under '/dev/disk' +# Accessible filesystems, by reference, are maintained under '/dev/disk' # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info # /dev/mapper/openEuler-root / ext4 defaults,x-systemd.device-timeout=0 0 0 @@ -64,7 +64,7 @@ UUID=afcc811f-4b20-42fc-9d31-7307a8cfe0df /boot ext4 defaults,x-systemd.device-t ### Symptom -After a disk fails, openEuler fails to be reinstalled because a logical volume group that cannot be activated exists in openEuler. +After a drive fails, openEuler fails to be reinstalled because a logical volume group that cannot be activated exists in openEuler. 
### Possible Causes @@ -205,22 +205,22 @@ The following table describes the parameters of the memory reserved for the kdum -## Fails to Select Only One Disk for Reinstallation When openEuler Is Installed on a Logical Volume Consisting of Multiple Disks +## Fails to Select Only One Drive for Reinstallation When openEuler Is Installed on a Logical Volume Consisting of Multiple Drives ### Symptom -If openEuler is installed on a logical volume consisting of multiple disks, an error message will be displayed as shown in [Figure 1](#fig115949762617) when you attempt to select one of the disks for reinstallation. +If openEuler is installed on a logical volume consisting of multiple drives, an error message will be displayed as shown in [Figure 1](#fig115949762617) when you attempt to select one of the drives for reinstallation. **Figure 1** Error message ![fig](./figures/error-message.png "error-message") ### Possible Causes -The previous logical volume contains multiple disks. If you select one of the disks for reinstallation, the logical volume will be damaged. +The previous logical volume contains multiple drives. If you select one of the drives for reinstallation, the logical volume will be damaged. ### Solutions -The logical volume formed by multiple disks is equivalent to a volume group. Therefore, you only need to delete the corresponding volume group. +The logical volume formed by multiple drives is equivalent to a volume group. Therefore, you only need to delete the corresponding volume group. 1. 
Press **Ctrl**+**Alt**+**F2** to switch to the CLI and run the following command to find the volume group: @@ -305,11 +305,11 @@ After the OS is installed and restarted, perform either of the following two ope ```sh -## Installation Fails when a User Selects Two Disks with OS Installed and Customizes Partitioning +## Installation Fails when a User Selects Two Drives with OS Installed and Customizes Partitioning ### Symptom -During the OS installation, the OS has been installed on two disks. In this case, if you select one disk for custom partitioning, and click **Cancel** to perform custom partitioning on the other disk, the installation fails. +During the OS installation, the OS has been installed on two drives. In this case, if you select one drive for custom partitioning, and click **Cancel** to perform custom partitioning on the other drive, the installation fails. ![fig](./figures/cancle_disk.png) @@ -317,11 +317,11 @@ During the OS installation, the OS has been installed on two disks. In this case ### Possible Causes -A user selects a disk for partitioning twice. After the user clicks **Cancel** and then selects the other disk, the disk information is incorrect. As a result, the installation fails. +A user selects a drive for partitioning twice. After the user clicks **Cancel** and then selects the other drive, the drive information is incorrect. As a result, the installation fails. ### Solutions -Select the target disk for custom partitioning. Do not frequently cancel the operation. If you have to cancel and select another disk, you are advised to reinstall the OS. +Select the target drive for custom partitioning. Do not frequently cancel the operation. If you have to cancel and select another drive, you are advised to reinstall the OS. 
### Learn More About the Issue at @@ -337,7 +337,7 @@ After the Kdump service is deployed, kernel breaks down due to the manual execut ### Possible Causes -The **reset_devices** parameter is configured by default and is enabled during second kernel startup, making MegaRAID driver or disk faulty. An error is reported when the vmcore file is dumped ana accesses the MegaRAID card. As a result, vmcore fails to be generated. +The **reset_devices** parameter is configured by default and is enabled during second kernel startup, making the MegaRAID driver or drive faulty. An error is reported when the vmcore dump accesses the MegaRAID card. As a result, vmcore fails to be generated. ### Solutions diff --git a/docs/en/docs/Installation/figures/riscv-olk6.6.jpg b/docs/en/docs/Installation/figures/riscv-olk6.6.jpg new file mode 100644 index 0000000000000000000000000000000000000000..8a00c4fd2033954b1d0d99eb376ea5c1436db7fb Binary files /dev/null and b/docs/en/docs/Installation/figures/riscv-olk6.6.jpg differ diff --git a/docs/en/docs/Installation/installation-guideline.md b/docs/en/docs/Installation/installation-guideline.md index 99464095ca2af3ad86c5eb2a8d0d9575a978eaf5..8daaf4b7d4efa72da7f340df04af0eeba61adf90 100644 --- a/docs/en/docs/Installation/installation-guideline.md +++ b/docs/en/docs/Installation/installation-guideline.md @@ -48,8 +48,7 @@ Installation wizard options are described as follows: - **Troubleshooting**: Troubleshooting mode, which is used when the system cannot be installed properly. In troubleshooting mode, the following options are available: - **Install openEuler 21.09 in basic graphics mode**: Basic graphics installation mode. In this mode, the video driver is not started before the system starts and runs. - - **Rescue the openEuler system**: Rescue mode, which is used to restore the system. In rescue mode, the installation process is printed in the VNC or BMC, and the serial port is unavailable. 
- + - **Rescue the openEuler system**: Rescue mode, which is used to restore the system. In rescue mode, the installation process is printed to the Virtual Network Computing (VNC) or BMC interface, and the serial port is unavailable. On the installation wizard screen, press **e** to go to the parameter editing screen of the selected option, and press **c** to go to the command line interface (CLI). ### Installation in GUI Mode diff --git a/docs/en/docs/Installation/riscv_more.md b/docs/en/docs/Installation/riscv_more.md deleted file mode 100644 index 7e6cd81b929f3ea58e5ef0ff8c7bc20ee364548e..0000000000000000000000000000000000000000 --- a/docs/en/docs/Installation/riscv_more.md +++ /dev/null @@ -1,4 +0,0 @@ -# Reference - -- Play with OpenEuler on VisionFive -- Play with openEuler on RISC-V platform diff --git a/docs/en/docs/Installation/riscv_qemu.md b/docs/en/docs/Installation/riscv_qemu.md deleted file mode 100644 index 29f5bdf8723d1b9a384e8e98c6963979584afb8b..0000000000000000000000000000000000000000 --- a/docs/en/docs/Installation/riscv_qemu.md +++ /dev/null @@ -1,341 +0,0 @@ -# Installation Guide - -This chapter describes the installation of openEuler using QEMU as an example. For other installation methods, refer to the installation guides for the development boards. - -## Installing QEMU - -### System Environment - -The environments tested so far include WSL2 (Ubuntu 20.04.4 LTS and Ubuntu 22.04.1 LTS) and Ubuntu 22.04.1 live-server LTS. - -## Installing QEMU of the RISC-V Architecture - -Install the qemu-system-riscv64 package provided by the Linux distribution. openEuler 23.09 x86_64 provides QEMU 6.2.0 (**qemu-system-riscv-6.2.0-80.oe2309.x86_64**): - -``` bash -dnf install -y qemu-system-riscv -``` - -QEMU 8.0 and later versions are recommended because they provide a lot of fixes and updates for RISC-V. The following uses QEMU 8.1.2 as an example. 
- -### Manual Compilation and Installation - -If the provided software package is outdated, you can manually compile and install QEMU. - -``` bash -wget https://download.qemu.org/qemu-8.1.2.tar.xz -tar -xvf qemu-8.1.2.tar.xz -cd qemu-8.1.2 -mkdir res -cd res -sudo apt install libspice-protocol-dev libepoxy-dev libgtk-3-dev libspice-server-dev build-essential autoconf automake autotools-dev pkg-config bc curl gawk git bison flex texinfo gperf libtool patchutils mingw-w64 libmpc-dev libmpfr-dev libgmp-dev libexpat-dev libfdt-dev zlib1g-dev libglib2.0-dev libpixman-1-dev libncurses5-dev libncursesw5-dev meson libvirglrenderer-dev libsdl2-dev -y -../configure --target-list=riscv64-softmmu,riscv64-linux-user --prefix=/usr/local/bin/qemu-riscv64 --enable-slirp -make -j$(nproc) -sudo make install -``` - -The above commands will install QEMU to **/usr/local/bin/qemu-riscv64**. Add **/usr/local/bin/qemu-riscv64/bin** to **$PATH**. - -For compilation and installation on other OSs, including openEuler, see the QEMU official documentation. 
- -You can refer to the compilation procedure on RHEL or CentOS for the compilation on openEuler, for example: - -``` bash -sudo dnf install -y git glib2-devel libfdt-devel pixman-devel zlib-devel bzip2 ninja-build python3 \ - libaio-devel libcap-ng-devel libiscsi-devel capstone-devel \ - gtk3-devel vte291-devel ncurses-devel \ - libseccomp-devel nettle-devel libattr-devel libjpeg-devel \ - brlapi-devel libgcrypt-devel lzo-devel snappy-devel \ - librdmacm-devel libibverbs-devel cyrus-sasl-devel libpng-devel \ - libuuid-devel pulseaudio-libs-devel curl-devel libssh-devel \ - systemtap-sdt-devel libusbx-devel -curl -LO https://download.qemu.org/qemu-8.1.2.tar.xz -tar -xvf qemu-8.1.2.tar.xz -cd qemu-8.1.2 -mkdir res -cd res -../configure --target-list=riscv64-softmmu,riscv64-linux-user --prefix=/usr/local/bin/qemu-riscv64 -make -j$(nproc) -sudo make install -``` - -## Preparing the openEuler RISC-V Disk Image - -### Disk Image Download - -Download the boot firmware (**fw_payload_oe_uboot_2304.bin**), disk image (**openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz**) and startup script (**start_vm.sh**). - -### Download Directory - -The current build is located in the [openEuler Repo](https://repo.openeuler.org/openEuler-23.09/virtual_machine_img/riscv64/). You can also visit the [openEuler official website](https://www.openeuler.org/zh/download/) to obtain the image from other mirrored sources. - -### Content Description - -- **fw_payload_oe_uboot_2304.bin**: Boot firmware -- **openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz**: Compressed disk image for the openEuler RISC-V QEMU virtual machine (VM) -- **openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz.sha256sum**: Verification file for the compressed disk image. Run `sha256sum -c openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz.sha256sum` for verification. 
-- **start_vm.sh**: Official VM startup script - -### (Optional) Copy-On-Write (COW) Disk Configuration - -> The copy-on-write (COW) technology does not make changes to the original image file; the modification is written to another image file. This feature is only supported by the QCOW format in QEMU. Multiple disk images can point to the same image for testing multiple configurations without damaging the original image. - -#### Creating an Image - -Run the following command to create an image, and use the new image when starting the virtual machine below. Assume that the original image is **openEuler-23.09-RISC-V-qemu-riscv64.qcow2**, and the new image is **test.qcow2**. - -``` bash -qemu-img create -o backing_file=openEuler-23.09-RISC-V-qemu-riscv64.qcow2,backing_fmt=qcow2 -f qcow2 test.qcow2 -``` - -#### Viewing Image Information - -``` bash -qemu-img info --backing-chain test.qcow2 -``` - -#### Modifying the Base Image Location - -Run the following command to modify the base image location. Assume that the new base image is **another.qcow2**, and the image to be modified is **test.qcow2**. - -``` bash -qemu-img rebase -b another.qcow2 test.qcow2 -``` - -#### Merging Images - -Merge the modified image into the original image. Assume that the new image is **test.qcow2**. - -``` bash -qemu-img commit test.qcow2 -``` - -#### Expanding Root Partition - -To expand the root partition for more available space, perform the operations below. - -Expand the disk image. - -``` bash -qemu-img resize test.qcow2 +100G -``` - -Output - -``` text -Image resized. -``` - -Start the VM and run the following command to check the disk size. - -``` bash -lsblk -``` - -List the partition information. - -``` bash -fdisk -l -``` - -Modify the root Partition. - -``` bash -fdisk /dev/vda -Welcome to fdisk (util-linux 2.35.2). -Changes will remain in memory only, until you decide to write them. -Be careful before using the write command. 
- -Command (m for help): p # Display partition information. -Disk /dev/vda: 70 GiB, 75161927680 bytes, 146800640 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: dos -Disk identifier: 0x247032e6 - -Device Boot Start End Sectors Size Id Type -/dev/vda1 2048 4194303 4192256 2G e W95 FAT16 (LBA) -/dev/vda2 4194304 83886079 79691776 38G 83 Linux - -Command (m for help): d # Delete the existing partition. -Partition number (1,2, default 2): 2 - -Partition 2 has been deleted. - -Command (m for help): n # Create new partition. -Partition type - p primary (1 primary, 0 extended, 3 free) - e extended (container for logical partitions) -Select (default p): p # Select the primary partition. -Partition number (2-4, default 2): 2 -First sector (4194304-146800639, default 4194304): # The starting block here should be consistent with the /dev/vda2 mentioned above. -Last sector, +/-sectors or +/-size{K,M,G,T,P} (4194304-146800639, default 146800639): # Keep the default and allocate directly to the end. - -Created a new partition 2 of type 'Linux' and of size 68 GiB. -Partition #2 contains a ext4 signature.Do you want to remove the signature? [Y]es/[N]o: n - -Command (m for help): p # Check again. - -Disk /dev/vda: 70 GiB, 75161927680 bytes, 146800640 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: dos -Disk identifier: 0x247032e6 - -Device Boot Start End Sectors Size Id Type -/dev/vda1 2048 4194303 4192256 2G e W95 FAT16 (LBA) -/dev/vda2 4194304 146800639 142606336 68G 83 Linux - -Command (m for help): w # Write to the disk. -The partition table has been altered. -Syncing disks. -``` - -Update Disk Information. 
- -``` bash -resize2fs /dev/vda2 -``` - -## Launching the openEuler RISC-V VM - -### Starting the VM - -- Ensure that the current directory contains **fw_payload_oe_uboot_2304.bin**, the disk image compressed file, and the startup script. -- Decompress the image file with `xz -dk openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz`. -- Adjust the startup parameters. -- Run the startup script using `$ bash start_vm.sh`. - -### (Optional) Adjusting Startup Parameters - -- **vcpu** specifies the number of QEMU threads, which does not strictly correspond to the number of CPU cores. If the **vcpu** value exceeds the number of CPU cores of the host machine, the execution may be slowed or blocked. The default value is **4**. -- **memory** specifies the size of the VM memory and can be adjusted as needed. The default value is **2**. -- **drive** specifies the path of the virtual disk. If you configure the COW image as mentioned earlier, enter the path of the created image. -- **fw** specifies the path for the U-Boot image. -- **ssh_port** specifies the port forwarded for SSH, defaulting to **12055**. Set it to blank to disable this feature. - -## Logging into the VM - -The script offers support for SSH logins. - -If the VM is exposed to a public network, change the **root** user password immediately after logging in. - -### SSH Login - -Secure Shell (SSH) is an encrypted network transfer protocol, offering a secure transfer environment for network services over an unsecured network. SSH achieves this by creating a secure tunnel to connect SSH clients and servers. The most common use for SSH is for remote OS login, including the remote CLI and remote command execution. While SSH is most frequently used on Unix-like OSs, some of its functions are also available on Windows. In 2015, Microsoft announced they would provide native SSH protocol support in future Windows OSs. The OpenSSH client is available in Windows 10 1803 and later versions. 
- -- Username: **root** or **openeuler** -- Default Password: **openEuler12#$** -- Login method: See script prompts (or use your preferred SSH client). - -Upon successful login, you will see the following information: - -``` bash -Authorized users only. All activities may be monitored and reported. - -Authorized users only. All activities may be monitored and reported. -Last login: Sun Oct 15 17:19:52 2023 from 10.0.2.2 - -Welcome to 6.4.0-10.1.0.20.oe2309.riscv64 - -System information as of time: Sun Oct 15 19:40:07 CST 2023 - -System load: 0.47 -Processes: 161 -Memory used: .7% -Swap used: 0.0% -Usage On: 11% -IP address: 10.0.2.15 -Users online: 1 - -[root@openeuler ~]# -``` - -### VNC Login - -This method is natively supported by QEMU and is similar to remotely operating a physical machine without sound. - -> Virtual Network Computing (VNC) is a screen sharing and remote operation program that uses the Remote Frame Buffer (RFB) protocol. VNC sends keyboard and mouse actions as well as real-time screen images through the network. -> -> VNC is independent of the OS, which means it can be used across platforms. For instance, a Windows computer can connect to a Linux computer, and vice versa. You can even use VNC through a Java-supported browser on a computer without the client software. - -#### Installing VNC Viewer - -Download [TigerVNC](https://sourceforge.net/projects/tigervnc/files/stable/) or [VNC Viewer](https://www.realvnc.com/en/connect/download/viewer/). - -#### Modifying the Startup Script - -Before the **sleep 2** line in the startup script, add the following content: - -``` bash -vnc_port=12056 -echo -e "\033[37mVNC Port: \033[0m \033[34m"$vnc_port"\033[0m" -cmd="${cmd} -vnc :"$((vnc_port-5900)) -``` - -#### Connecting to VNC - -Launch TigerVNC or VNC Viewer, paste the address, and press **Enter**. The operating interface is similar to that of a physical machine. 
- -## Modifying the Default Software Source Configuration - -The software source for openEuler 23.09 RISC-V version currently only contains the **\[OS]** and **\[source]** repositories. However, the default configuration file includes other repositories not provided in this RISC-V version. - -Before using a package manager to install software packages, edit the software source configuration to keep only the **\[OS]** and **\[source]** sections. - -Connect to the VM via SSH or VNC and log in as the **root** user (if you log in as a non-privileged user, run the commands with `sudo`). Then, perform the following operations. - -### Modify /etc/yum.repos.d/openEuler.repo - -``` bash -vi /etc/yum.repos.d/openEuler.repo -# or: nano /etc/yum.repos.d/openEuler.repo -``` - -Remove the **\[everything]**, **\[EPOL]**, **\[debuginfo]**, **\[update]**, and **\[update-source]** sections to keep only the **\[OS]** and **\[source]** sections. - -After making the changes, the configurations in **openEuler.repo** should be similar to the following: - -``` text -#generic-repos is licensed under the Mulan PSL v2. -#You can use this software according to the terms and conditions of the Mulan PSL v2. -#You may obtain a copy of Mulan PSL v2 at: -# http://license.coscl.org.cn/MulanPSL2 -#THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR -#IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR -#PURPOSE. -#See the Mulan PSL v2 for more details. 
- -[OS] -name=OS -baseurl=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/ -metalink=https://mirrors.openeuler.org/metalink?repo=$releasever/OS&arch=$basearch -metadata_expire=1h -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler - -[source] -name=source -baseurl=http://repo.openeuler.org/openEuler-23.09/source/ -metalink=https://mirrors.openeuler.org/metalink?repo=$releasever&arch=source -metadata_expire=1h -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/source/RPM-GPG-KEY-openEuler -``` - -Then, you can use the DNF package manager to install software packages normally. When installing for the first time, it is necessary to import the openEuler GPG key. If you see the following prompt, enter **y** to confirm. - -``` text -retrieving repo key for OS unencrypted from http://repo.openeuler.org/openEuler-23.09/OS/riscv64/RPM-GPG-KEY-openEuler -OS 18 kB/s | 2.1 kB 00:00 -Importing GPG key 0xB25E7F66: - Userid : "private OBS (key without passphrase) " - Fingerprint: 12EA 74AC 9DF4 8D46 C69C A0BE D557 065E B25E 7F66 - From : http://repo.openeuler.org/openEuler-23.09/OS/riscv64/RPM-GPG-KEY-openEuler -Is this ok [y/N]: y -Key imported successfully -``` diff --git a/docs/en/docs/Kmesh/appendixes.md b/docs/en/docs/Kmesh/appendixes.md deleted file mode 100644 index ffa7b4d28662efb22fcded563f759fd035ac9140..0000000000000000000000000000000000000000 --- a/docs/en/docs/Kmesh/appendixes.md +++ /dev/null @@ -1,3 +0,0 @@ -# Appendixes - -For more details, visit the [Kmesh](https://gitee.com/openeuler/Kmesh#kmesh) repository. 
diff --git a/docs/en/docs/Kmesh/getting-to-know-kmesh.md b/docs/en/docs/Kmesh/getting-to-know-kmesh.md deleted file mode 100644 index 2264bd9b7770b8e0cf044cf3c663d09889dd69ab..0000000000000000000000000000000000000000 --- a/docs/en/docs/Kmesh/getting-to-know-kmesh.md +++ /dev/null @@ -1,36 +0,0 @@ -# Getting to Know Kmesh - -## Introduction - -As more and more applications become cloud-native, the scale of cloud applications and application SLA requirements put high demands on cloud infrastructure. - -Kubernetes-based cloud infrastructure can help applications achieve agile deployment and management, but it lacks the application traffic orchestration ability. The emergence of service mesh has effectively compensated for Kubernetes shortcomings, allowing Kubernetes to completely realize agile cloud application development and O&M. However, as the application of service mesh gradually deepens, the current sidecar-based mesh architecture has obvious performance defects in the data plane, and the following problems have become a consensus in the industry: - -* High latency - Take the Istio service mesh for example. The single-hop access delay of a service is increased by 2.65 ms, which cannot meet the requirements of latency-sensitive applications. - -* High overhead - In Istio, each sidecar consumes 50 MB memory and occupies 2 CPU cores. This causes high overhead in large-scale clusters and decreases the deployment density of service containers. - -Based on the programmable kernel, Kmesh moves mesh traffic management down to the OS level and shortens the data path from 3 hops to 1. This greatly improves the latency performance of the mesh data plane and helps services innovate quickly. - -## Architecture - -![](./figures/kmesh-arch.png) - -Main components of Kmesh include: - -* kmesh-controller: - The management program of Kmesh, which manages the Kmesh lifecycle, XDS interconnection, and O&M observation. 
- -* kmesh-api: - The external API layer of Kmesh, including APIs for converted XDS orchestration and O&M observation. - -* kmesh-runtime: - The runtime for orchestration of traffic in layer 3 to layer 7, which is implemented in the kernel. - -* kmesh-orchestration: - Orchestration of traffic in layer 3 to layer 7 based on eBPF, implementing functions such as routing, gray release, and load balancing. - -* kmesh-probe: - O&M observation probe, which provides end-to-end observation. diff --git a/docs/en/docs/SecHarden/file-permissions.md b/docs/en/docs/SecHarden/file-permissions.md index 62925c84449cc2c3bf75485b33abf4ee47206147..eaee0cb7a3f59b3cb970a7bf89d015934f036e16 100644 --- a/docs/en/docs/SecHarden/file-permissions.md +++ b/docs/en/docs/SecHarden/file-permissions.md @@ -221,7 +221,7 @@ The **cron** command is used to create a routine task. Users who can run the ### Description -A common user can use the **sudo** command to run commands as the user **root**. To harden system security, it is necessary to restrict permissions on the **sudo** command. Only user **root** can use the **sudo** command. By default, openEuler does not retrict the permission of non-root users to run the sudo command. +A common user can use the **sudo** command to run commands as the user **root**. To harden system security, it is necessary to restrict permissions on the **sudo** command. Only user **root** can use the **sudo** command. By default, openEuler does not restrict the permission of non-root users to run the sudo command. 
### Implementation diff --git a/docs/en/docs/StratoVirt/VM_configuration.md b/docs/en/docs/StratoVirt/VM_configuration.md index b705276e53e7a8c4758e91f4045098953238419f..72170ecfb81206f635507287db1746cbcdf6e040 100644 --- a/docs/en/docs/StratoVirt/VM_configuration.md +++ b/docs/en/docs/StratoVirt/VM_configuration.md @@ -679,8 +679,8 @@ This section provides an example of the minimum configuration for creating a sta $ /path/to/stratovirt \ -kernel /path/to/vmlinux.bin \ -append console=ttyAMA0 root=/dev/vda rw reboot=k panic=1 \ - -drive file=/path/to/code_storage_file,if=pflash,unit=0[,readonly=true] \ - -drive file=/path/to/data_storage_file,if=pfalsh,unit=1, \ + -drive file=/path/to/edk2/code_storage_file,if=pflash,unit=0[,readonly=true] \ + -drive file=/path/to/edk2/data_storage_file,if=pflash,unit=1, \ -drive file=/home/rootfs.ext4,id=rootfs,readonly=false \ -device virtio-blk-device,drive=rootfs,bus=pcie.0,addr=0x1 \ -qmp unix:/tmp/stratovirt.socket,server,nowait \ diff --git a/docs/en/docs/TailorCustom/overview.md b/docs/en/docs/TailorCustom/overview.md deleted file mode 100644 index 053bc0b8481c1b95bbdba0abbe17e94ff674f4fa..0000000000000000000000000000000000000000 --- a/docs/en/docs/TailorCustom/overview.md +++ /dev/null @@ -1,3 +0,0 @@ -# Tailoring and Customization Tool Usage Guide - -This document describes the tailoring and customization tool of openEuler, including the introduction, installation, and usage. 
\ No newline at end of file diff --git a/docs/en/docs/Virtualization/configuring-disk-io-suspension.md b/docs/en/docs/Virtualization/configuring-disk-io-suspension.md deleted file mode 100644 index d43141ef3f49277fa242a3b8f80e6f4cae5f11f3..0000000000000000000000000000000000000000 --- a/docs/en/docs/Virtualization/configuring-disk-io-suspension.md +++ /dev/null @@ -1,105 +0,0 @@ -# Configuring Disk I/O Suspension - - - -- [Configuring Disk I/O Suspension](#configuring-disk-io-suspension) - - [Introduction](#introduction) - - [Overview](#overview) - - [Applicable Scenario](#applicable-scenario) - - [Precautions and Restrictions](#precautions-and-restrictions) - - [Disk I/O Suspension Configuration](#disk-io-suspension-configuration) - - [Qemu Command Line Configuration](#qemu-command-line-configuration) - - [XML Configuration](#xml-configuration) - - - -## Introduction - -### Overview - -When a storage fault occurs (for example, the storage link is disconnected), the I/O error of the physical disk is sent to the VM front end through the virtualization layer. After the VM receives the I/O error, the user file system in the VM may change to the read-only state. In this case, the VM needs to be restarted or the user needs to manually recover the file system, which brings extra workload. - -In this case, the virtualization platform provides the disk I/O suspension capability. When a storage fault occurs, the VM I/O being delivered to the host is suspended. During the suspension period, no I/O error is returned to the VM. In this way, the VM file system will not be in read-only state but is hung. At the same time, the VM backend retries I/Os based on the specified suspension interval. If the storage fault is rectified within the suspension time, the suspended I/O can be written to the disk. The internal file system of the VM automatically recovers and the VM does not need to be restarted. 
If the storage fault is not rectified within the suspension time, an error is reported to the VM and the user is notified. - -### Applicable Scenario - -The cloud that may be disconnected from the storage plane is used as the backend of a virtual disk. - -### Precautions and Restrictions - -- Only virtio-blk and virtio-scsi virtual drives support disk I/O suspension. - -- The backend of virtual disks suspended by disk I/O is usually the cloud drive that may cause storage plane link disconnection. - -- The disk I/O suspension can be enabled for read and write I/O errors. The retry interval and timeout interval for read and write I/O errors of the same disk are the same. - -- The disk I/O suspension retry interval does not include the actual I/O overhead on the host. That is, the actual interval between two I/O retry operations is greater than the configured I/O error retry interval. - -- The disk I/O suspension cannot identify the I/O error type (such as storage link disconnection, bad disk, and reservation conflict). As long as the hardware returns an I/O error, the disk I/O suspension is performed. - -- When the disk I/O is suspended, the internal I/O of the VM is not returned. The system commands for accessing the disk, such as fdisk, are suspended. The services that depend on the returned command are also suspended. - -- When the disk I/O is suspended, the I/O cannot be written to the disk. As a result, the VM may fail to be gracefully shut down. In this case, you need to forcibly shut down the VM. - -- When the disk I/O is suspended, the disk data cannot be read. As a result, the VM cannot be restarted. You need to forcibly shut down the VM, wait until the storage fault is rectified, and then restart the VM. - -- After a storage fault occurs, the following problems cannot be solved even though disk I/O suspension exists: - - 1. Failed to execute advanced storage features. 
- - Advanced features include virtual disk hot swapping, virtual disk creation, VM startup, VM shutdown, forcible VM shutdown, VM hibernation and wakeup, VM storage hot migration, VM storage hot migration cancellation, VM storage snapshot creation, VM storage snapshot combination, and VM disk capacity query, VM online scale-out, virtual CD-ROM drive insertion and ejection. - - 2. Failed to execute the VM life cycle. - -- When a VM configured with disk I/O suspension initiates hot migration, the XML configuration of the destination disk must contain the same disk I/O suspension configuration as that of the source disk. - -## Disk I/O Suspension Configuration - -### Qemu Command Line Configuration - -The disk I/O suspension function is enabled by specifying `werror=retry` and `rerror=retry` on the virtual disk device and using `retry_interval` and `retry_timeout` to configure the retry policy. `retry_interval` indicates the I/O error retry interval. The value ranges from 0 to MAX_LONG, in milliseconds. If this parameter is not set, the default value 1000 ms is used. `retry_timeout` indicates the I/O retry timeout interval. The value ranges from 0 to MAX_LONG. The value 0 indicates that no timeout occurs. The unit is millisecond. If this parameter is not set, the default value is 0. 
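Because no explicit retry count is exposed, the two knobs bound the retry behavior together: a rough upper limit on the number of retry attempts is `retry_timeout / retry_interval`. A quick check with the values used in the virtio-blk example in this section (2000 ms interval, 10,000 ms timeout); note the real attempt count is lower, since each retry also pays the host I/O overhead mentioned above:

```shell
# Rough upper bound on I/O retries before an error is reported to the VM.
# Values match the virtio-blk example: retry_interval=2000, retry_timeout=10000.
retry_interval=2000   # ms between retries
retry_timeout=10000   # ms until the error reaches the VM (0 = retry forever)
echo $(( retry_timeout / retry_interval ))   # prints: 5
```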
- -The I/O suspension configuration of the virtio-blk disk is as follows: - -```shell --drive file=/path/to/your/storage,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native \ --device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,\ -drive=drive-virtio-disk0,id=virtio-disk0,write-cache=on,\ -werror=retry,rerror=retry,retry_interval=2000,retry_timeout=10000 -``` - -The I/O suspension configuration of the virtio-scsi disk is as follows: - -```shell --drive file=/path/to/your/storage,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native \ --device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\ -device_id=drive-scsi0-0-0-0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,write-cache=on,\ -werror=retry,rerror=retry,retry_interval=2000,retry_timeout=10000 -``` - -### XML Configuration - -The disk I/O suspension function is enabled by specifying `error_policy='retry'` and `rerror_policy='retry'`in the disk XML configuration file. Configure the values of `retry_interval` and `retry_timeout`. `retry_interval` indicates the I/O error retry interval. The value ranges from 0 to MAX_LONG, in milliseconds. If this parameter is not set, the default value 1000 ms is used. `retry_timeout` indicates the I/O retry timeout interval. The value ranges from 0 to MAX_LONG. The value 0 indicates that no timeout occurs. The unit is millisecond. If this parameter is not set, the default value is 0. - -The disk I/O suspension XML configuration of the virtio-blk disk is as follows: - -```xml - - - - - - -``` - -The disk I/O suspension XML configuration of the virtio-scsi disk is as follows: - -```xml - - - - - -
- -``` diff --git a/docs/en/docs/Virtualization/system-resource-management.md b/docs/en/docs/Virtualization/system-resource-management.md index a581202aa71ea32a1e8ba6b943ccd4125d9f8b77..18030aa275e58f1c0e6650f54c39538bdf45d957 100644 --- a/docs/en/docs/Virtualization/system-resource-management.md +++ b/docs/en/docs/Virtualization/system-resource-management.md @@ -1,11 +1,11 @@ # system Resource Management -The **libvirt** command manages VM system resources, such as vCPU and virtual memory resources. +The `libvirt` command manages VM system resources, such as vCPU and virtual memory resources. Before you start: - Ensure that the libvirtd daemon is running on the host. -- Run the **virsh list --all** command to check that the VM has been defined. +- Run the `virsh list --all` command to check that the VM has been defined. ## Managing vCPU @@ -37,7 +37,7 @@ Change the value of **cpu\_shares** allocated to the VM to balance the schedul iothread_quota : -1 ``` -- Online modification: Run the **virsh schedinfo** command with the **--live** parameter to modify the CPU share of a running VM. +- Online modification: Run the `virsh schedinfo` command with the `--live` parameter to modify the CPU share of a running VM. ```shell virsh schedinfo --live cpu_shares= @@ -61,7 +61,7 @@ Change the value of **cpu\_shares** allocated to the VM to balance the schedul The modification of the **cpu\_shares** value takes effect immediately. The running time of the _openEulerVM_ is twice the original running time. However, the modification will become invalid after the VM is shut down and restarted. -- Permanent modification: Run the **virsh schedinfo** command with the **--config** parameter to change the CPU share of the VM in the libvirt internal configuration. +- Permanent modification: Run the `virsh schedinfo` command with the `--config` parameter to change the CPU share of the VM in the libvirt internal configuration. 
```shell virsh schedinfo --config cpu_shares= @@ -93,7 +93,7 @@ You can bind the QEMU main process to a specific physical CPU range, ensuring th #### Procedure -Run the **virsh emulatorpin** command to bind the QEMU main process to a physical CPU. +Run the `virsh emulatorpin` command to bind the QEMU main process to a physical CPU. - Check the range of the physical CPU bound to the QEMU process: @@ -106,7 +106,7 @@ Run the **virsh emulatorpin** command to bind the QEMU main process to a physi This indicates that the QEMU main process corresponding to VM **openEulerVM** can be scheduled on all physical CPUs of the host. -- Online binding: Run the **virsh emulatorpin** command with the **--live** parameter to modify the binding relationship between the QEMU process and the running VM. +- Online binding: Run the `virsh emulatorpin` command with the `--live` parameter to modify the binding relationship between the QEMU process and the running VM. ```shell $ virsh emulatorpin openEulerVM --live 2-3 @@ -119,7 +119,7 @@ Run the **virsh emulatorpin** command to bind the QEMU main process to a physi The preceding commands bind the QEMU process corresponding to VM **openEulerVM** to physical CPUs **2** and **3**. That is, the QEMU process is scheduled only on the two physical CPUs. The binding relationship takes effect immediately but becomes invalid after the VM is shut down and restarted. -- Permanent binding: Run the **virsh emulatorpin** command with the **--config** parameter to modify the binding relationship between the VM and the QEMU process in the libvirt internal configuration. +- Permanent binding: Run the `virsh emulatorpin` command with the `--config` parameter to modify the binding relationship between the VM and the QEMU process in the libvirt internal configuration. ```shell $ virsh emulatorpin openEulerVM --config 0-3,^1 @@ -140,7 +140,7 @@ The vCPU of a VM is bound to a physical CPU. 
That is, the vCPU is scheduled only #### Procedure -Run the **virsh vcpupin** command to adjust the binding relationship between vCPUs and physical CPUs. +Run the `virsh vcpupin` command to adjust the binding relationship between vCPUs and physical CPUs. - View the vCPU binding information of the VM. @@ -156,7 +156,7 @@ Run the **virsh vcpupin** command to adjust the binding relationship between v This indicates that all vCPUs of VM **openEulerVM** can be scheduled on all physical CPUs of the host. -- Online adjustment: Run the **vcpu vcpupin** command with the **--live** parameter to modify the vCPU binding relationship of a running VM. +- Online adjustment: Run the `vcpu vcpupin` command with the `--live` parameter to modify the vCPU binding relationship of a running VM. ```shell $ virsh vcpupin openEulerVM --live 0 2-3 @@ -172,7 +172,7 @@ Run the **virsh vcpupin** command to adjust the binding relationship between v The preceding commands bind vCPU **0** of VM **openEulerVM** to pCPU **2** and pCPU **3**. That is, vCPU **0** is scheduled only on the two physical CPUs. The binding relationship takes effect immediately but becomes invalid after the VM is shut down and restarted. -- Permanent adjustment: Run the **virsh vcpupin** command with the **--config** parameter to modify the vCPU binding relationship of the VM in the libvirt internal configuration. +- Permanent adjustment: Run the `virsh vcpupin` command with the `--config` parameter to modify the vCPU binding relationship of the VM in the libvirt internal configuration. ```shell $ virsh vcpupin openEulerVM --config 0 0-3,^1 @@ -188,35 +188,39 @@ Run the **virsh vcpupin** command to adjust the binding relationship between v The preceding commands bind vCPU **0** of VM **openEulerVM** to physical CPUs **0**, **2**, and **3**. That is, vCPU **0** is scheduled only on the three physical CPUs. The modification of the binding relationship does not take effect immediately. 
Instead, the modification takes effect after the next startup of the VM and takes effect permanently. -### CPU Hot Add +### CPU Hotplug #### Overview -This feature allows users to hot add CPUs to a running VM without affecting its normal running. When the internal service pressure of a VM keeps increasing, all CPUs will be overloaded. To improve the computing capability of the VM, you can use the CPU hot add function to increase the number of CPUs on the VM without stopping it. +CPU hotplug allows you to increase or decrease the number of CPUs for a running VM without affecting services on it. When the internal service pressure rises to a level where existing CPUs become saturated, CPU hotplug can dynamically boost the computing power of a VM, guaranteeing stable service throughput. CPU hotplug also enables the removal of unused computing resources during low service load, minimizing computing costs. + +Note: CPU hotplug is added for the AArch64 architecture in openEuler 24.03 LTS. However, the new implementation of the mainline community is not compatible with that of earlier openEuler versions. Therefore, the guest OS must match the host OS. That is, the guest and host machines must both run openEuler 24.03 LTS or later versions, or versions earlier than openEuler 24.03 LTS. #### Constraints - For processors using the AArch64 architecture, the specified VM chipset type \(machine\) needs to be virt-4.1 or a later version when a VM is created. For processors using the x86\_64 architecture, the specified VM chipset type \(machine\) needs to be pc-i440fx-1.5 or a later version when a VM is created. -- When configuring Guest NUMA, you need to configure the vCPUs that belong to the same socket in the same vNode. Otherwise, the VM may be soft locked up after the CPU is hot added, which may cause the VM panic. -- VMs do not support CPU hot add during migration, hibernation, wake-up, or snapshot. +- The initial CPU of an AArch64 VM cannot be hot removed. 
+- When configuring Guest NUMA, you need to configure the vCPUs that belong to the same socket in the same vNode. Otherwise, the VM may be soft locked up after the CPU is hot added or removed, which may cause a VM panic.
+- VMs do not support CPU hotplug during migration, hibernation, wake-up, or snapshot.
- Whether the hot added CPU can automatically go online depends on the VM OS logic rather than the virtualization layer.
- CPU hot add is restricted by the maximum number of CPUs supported by the Hypervisor and GuestOS.
- When a VM is being started, stopped, or restarted, the hot added CPU may become invalid. However, the hot added CPU takes effect after the VM is restarted.
-- During VM CPU hot add, if the number of added CPUs is not an integer multiple of the number of cores in the VM CPU topology configuration item, the CPU topology displayed in the VM may be disordered. You are advised to add CPUs whose number is an integer multiple of the number of cores each time.
-- If the hot added CPU needs to take effect online and is still valid after the VM is restarted, the --config and --live options need to be transferred to the virsh setvcpus API to persist the hot added CPU.
+- CPU hotplug may time out when a VM is starting, shutting down, or restarting. Retry when the VM is in the normal running state.
+- During VM CPU hotplug, if the number of added or removed CPUs is not an integer multiple of the number of cores in the VM CPU topology configuration item, the CPU topology displayed in the VM may be disordered. You are advised to add or remove CPUs whose number is an integer multiple of the number of cores each time.
+- If the hot added or removed CPU needs to take effect online and is still valid after the VM is restarted, the `--config` and `--live` options need to be passed to the `virsh setvcpus` interface to persist the hot added or removed CPU.

#### Procedure

**VM XML Configuration**

-1.
To use the CPU hot add function, configure the number of CPUs, the maximum number of CPUs supported by the VM, and the VM chipset type when creating the VM. (For the AArch64 architecture, the virt-4.1 or a later version is required. For the x86\_64 architecture, the pc-i440fx-1.5 or later version is required. The AArch64 VM is used as an example. The configuration template is as follows: +1. To use the CPU hot add function, configure the number of CPUs, the maximum number of CPUs supported by the VM, and the VM chipset type when creating the VM. (For the AArch64 architecture, the virt-4.2 or a later version is required. For the x86\_64 architecture, the pc-i440fx-1.5 or later version is required. The AArch64 VM is used as an example. The configuration template is as follows: ```xml ... n - hvm + hvm ... @@ -231,12 +235,12 @@ This feature allows users to hot add CPUs to a running VM without affecting its ```xml - …… + ... 64 - hvm + hvm - …… + ... ``` **Hot Adding and Bringing CPUs Online** @@ -265,16 +269,36 @@ This feature allows users to hot add CPUs to a running VM without affecting its ``` >![](./public_sys-resources/icon-note.gif) **Note** - >The format for running the virsh setvcpus command to hot add a VM CPU is as follows: + >The format for running the `virsh setvcpus` command to hot add VM CPUs is as follows: > >```shell >virsh setvcpus [--config] [--live] >``` > - >- domain: Parameter, which is mandatory. Specifies the name of a VM. - >- count: Parameter, which is mandatory. Specifies the number of target CPUs, that is, the number of CPUs after hot adding. - >- --config: Option, which is optional. This parameter is still valid when the VM is restarted. - >- --live: Option, which is optional. The configuration takes effect online. + >- `domain`: Parameter, which is mandatory. Specifies the name of a VM. + >- `count`: Parameter, which is mandatory. Specifies the number of target CPUs, that is, the number of CPUs after hot adding. 
+ >- `--config`: Option, which is optional. This parameter is still valid when the VM is restarted. + >- `--live`: Option, which is optional. The configuration takes effect online. + +**Hot Removing CPUs** + +Use the virsh tool to hot remove CPUs from the VM. For example, to set the number of CPUs after hot removal to 4 on the VM named openEulerVM, run the following command: + +```shell +virsh setvcpus openEulerVM 4 --live +``` + +>![](./public_sys-resources/icon-note.gif) **Note** +>The format for running the `virsh setvcpus` command to hot remove VM CPUs is as follows: +> +>```shell +>virsh setvcpus [--config] [--live] +>``` +> +>- `domain`: Parameter, which is mandatory. Specifies the name of a VM. +>- `count`: Parameter, which is mandatory. Specifies the number of target CPUs, that is, the number of CPUs after hot removal. +>- `--config`: Option, which is optional. This parameter is still valid when the VM is restarted. +>- `--live`: Option, which is optional. The configuration takes effect online. ## Managing Virtual Memory @@ -364,8 +388,8 @@ After Guest NUMA is configured in the VM XML configuration file, you can view th > >- **** provides the NUMA topology function for VMs. **cell id** indicates the vNode ID, **cpus** indicates the vCPU ID, and **memory** indicates the memory size on the vNode. >- If you want to use Guest NUMA to provide better performance, configure <**numatune\>** and **** so that the vCPU and memory are distributed on the same physical NUMA node. -> - **cellid** in **** corresponds to **cell id** in ****. **mode** can be set to **strict** \(apply for memory from a specified node strictly. If the memory is insufficient, the application fails.\), **preferred** \(apply for memory from a node first. If the memory is insufficient, apply for memory from another node\), or **interleave** \(apply for memory from a specified node in cross mode\).; **nodeset** indicates the specified physical NUMA node. 
-
> - In ****, you need to bind the vCPU in the same **cell id** to the physical NUMA node that is the same as the **memnode**.
+>- **cellid** in **** corresponds to **cell id** in ****. **mode** can be set to **strict** \(apply for memory from a specified node strictly. If the memory is insufficient, the application fails.\), **preferred** \(apply for memory from a node first. If the memory is insufficient, apply for memory from another node\), or **interleave** \(apply for memory from a specified node in cross mode\). **nodeset** indicates the specified physical NUMA node.
+>- In ****, you need to bind the vCPU in the same **cell id** to the physical NUMA node that is the same as the **memnode**.

### Memory Hot Add

@@ -375,7 +399,7 @@ In virtualization scenarios, the memory, CPU, and external devices of VMs are si

#### Constraints

-- For processors using the AArch64 architecture, the specified VM chipset type \(machine\) needs to be virt-4.1 or a later version when a VM is created.For processors using the x86 architecture, the specified VM chipset type \(machine\) needs to be a later version than pc-i440fx-1.5 when a VM is created.
+- For processors using the AArch64 architecture, the specified VM chipset type \(machine\) needs to be virt-4.2 or a later version when a VM is created. For processors using the x86 architecture, the specified VM chipset type \(machine\) needs to be a later version than pc-i440fx-1.5 when a VM is created.
- Guest NUMA on which the memory hot add feature depends needs to be configured on the VM. Otherwise, the memory hot add process cannot be completed.
- When hot adding memory, you need to specify the ID of Guest NUMA node to which the new memory belongs. Otherwise, the memory hot add fails.
- The VM kernel should support memory hot add. Otherwise, the VM cannot identify the newly added memory or the memory cannot be brought online.
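The constraint above — hot added memory must name its target Guest NUMA node — maps onto a libvirt memory device description. The following is a hedged sketch only: the 1024 MiB size and node ID 0 are illustrative values, and the element layout follows the generic libvirt `<memory model='dimm'>` schema rather than a file from this guide.

```xml
<!-- mem-dimm.xml: hypothetical hot add payload; size and node are example values -->
<memory model='dimm'>
  <target>
    <size unit='MiB'>1024</size>
    <!-- Guest NUMA node that receives the new memory; leaving this out is the
         failure case called out in the constraints above -->
    <node>0</node>
  </target>
</memory>
```

A fragment like this would typically be hot added with `virsh attach-device <domain> mem-dimm.xml --live`.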
@@ -447,7 +471,7 @@ In virtualization scenarios, the memory, CPU, and external devices of VMs are si >![](./public_sys-resources/icon-note.gif) **Note** >If you do not use the udev rules, you can use the root permission to manually bring the hot added memory online by running the following command: -> + > >```text >for i in `grep -l offline /sys/devices/system/memory/memory*/state` >do diff --git a/docs/en/docs/Virtualization/user-and-administrator-guide.md b/docs/en/docs/Virtualization/user-and-administrator-guide.md deleted file mode 100644 index 47646f59b84e0a9b7e9952054286eba74f6ebdd1..0000000000000000000000000000000000000000 --- a/docs/en/docs/Virtualization/user-and-administrator-guide.md +++ /dev/null @@ -1,437 +0,0 @@ -# User and Administrator Guide - -This chapter describes how to create VMs on the virtualization platform, manage VM life cycles, and query information. - - - -- [Best Practices](#best-practices) - - [Performance Best Practices](#performance-best-practices) - - [Halt-Polling](#halt-polling) - - [I/O Thread Configuration](#i-o-thread-configuration) - - [Raw Device Mapping](#raw-device-mapping) - - [kworker Isolation and Binding](#kworker-isolation-and-binding) - - [HugePage Memory](#hugepage-memory) - - [Security Best Practices](#security-best-practices) - - [Libvirt Authentication](#libvirt-authentication) - - [qemu-ga](#qemu-ga) - - [sVirt Protection](#svirt-protection) - - -## Best Practices - -### Performance Best Practices -#### Halt-Polling - -##### Overview - -If compute resources are sufficient, the halt-polling feature can be used to enable VMs to obtain performance similar to that of physical machines. If the halt-polling feature is not enabled, the host allocates CPU resources to other processes when the vCPU exits due to idle timeout. When the halt-polling feature is enabled on the host, the vCPU of the VM performs polling when it is idle. The polling duration depends on the actual configuration. 
If the vCPU is woken up during the polling, the vCPU can continue to run without being scheduled from the host. This reduces the scheduling overhead and improves the VM system performance. - ->![](./public_sys-resources/icon-note.gif) **NOTE:** ->The halt-polling mechanism ensures that the vCPU thread of the VM responds in a timely manner. However, when the VM has no load, the host also performs polling. As a result, the host detects that the CPU usage of the vCPU is high, but the actual CPU usage of the VM is not high. - -##### Instructions - -The halt-polling feature is enabled by default. You can dynamically change the halt-polling time of vCPU by modifying the **halt\_poll\_ns** file. The default value is **500000**, in ns. - -For example, to set the polling duration to 400,000 ns, run the following command: - -``` -# echo 400000 > /sys/module/kvm/parameters/halt_poll_ns -``` - -#### I/O Thread Configuration - -##### Overview - -By default, QEMU main threads handle backend VM read and write operations on the KVM. This causes the following issues: - -- VM I/O requests are processed by a QEMU main thread. Therefore, the single-thread CPU usage becomes the bottleneck of VM I/O performance. -- The QEMU global lock \(qemu\_global\_mutex\) is used when VM I/O requests are processed by the QEMU main thread. If the I/O processing takes a long time, the QEMU main thread will occupy the global lock for a long time. As a result, the VM vCPU cannot be scheduled properly, affecting the overall VM performance and user experience. - -You can configure the I/O thread attribute for the virtio-blk disk or virtio-scsi controller. At the QEMU backend, an I/O thread is used to process read and write requests of a virtual disk. The mapping relationship between the I/O thread and the virtio-blk disk or virtio-scsi controller can be a one-to-one relationship to minimize the impact on the QEMU main thread, enhance the overall I/O performance of the VM, and improve user experience. 
-
-##### Configuration Description
-
-To use I/O threads to process VM disk read and write requests, modify the VM configuration as follows:
-
-- Configure the total number of I/O threads based on the number of high-performance virtual disks on the VM. For example, set **iothreads** to **4** to create four I/O threads.
-
-    ```
-    <domain type='kvm'>
-        <name>VMName</name>
-        <memory unit='KiB'>4194304</memory>
-        <currentMemory unit='KiB'>4194304</currentMemory>
-        <vcpu>4</vcpu>
-        <iothreads>4</iothreads>
-    ```
-
-- Configure the I/O thread attribute for the virtio-blk disk. **iothread** indicates the I/O thread ID. IDs start from 1 and each ID must be unique; the maximum ID is the value of **iothreads**. For example, to allocate I/O thread 2 to the virtio-blk disk, set parameters as follows (the source file path is an example):
-
-    ```
-    <disk type='file' device='disk'>
-        <driver name='qemu' type='raw' cache='none' io='native' iothread='2'/>
-        <source file='/path/to/disk.img'/>
-        <target dev='vdb' bus='virtio'/>
-    </disk>
-    ```
-
-- Configure the I/O thread attribute for the virtio-scsi controller. For example, to allocate I/O thread 2 to the virtio-scsi controller, set parameters as follows:
-
-    ```
-    <controller type='scsi' index='0' model='virtio-scsi'>
-        <driver iothread='2'/>
-    </controller>
-    ```
-
-- Bind I/O threads to a physical CPU.
-
-    Binding I/O threads to specified physical CPUs does not affect the resource usage of vCPU threads. **iothread** indicates I/O thread IDs, and **cpuset** indicates the IDs of the bound physical CPUs.
-
-    ```
-    <cputune>
-        <iothreadpin iothread='1' cpuset='4-6'/>
-        <iothreadpin iothread='2' cpuset='7-9'/>
-    </cputune>
-    ```
-
-#### Raw Device Mapping
-
-##### Overview
-
-When configuring VM storage devices, you can use configuration files to configure virtual disks for VMs, or attach block devices \(such as physical LUNs and LVs\) on the host directly to VMs to improve storage performance. The latter configuration method is called raw device mapping \(RDM\). Through RDM, a virtual disk is presented to the VM as a small computer system interface \(SCSI\) device and supports most SCSI commands.
-
-RDM is classified into virtual RDM and physical RDM based on backend implementation features. Compared with virtual RDM, physical RDM provides better performance and supports more SCSI commands. However, for physical RDM, the entire SCSI disk needs to be mounted to a VM for use. If partitions or logical volumes are used for configuration, the VM cannot identify the disk.
-
-##### Configuration Example
-
-VM configuration files need to be modified for RDM. The following is a configuration example.
-
-- Virtual RDM
-
-    The following is an example of mounting the SCSI disk **/dev/sdc** on the host to the VM as a virtual raw device:
-
-    ```
-    <domain type='kvm'>
-        <devices>
-            ...
-            <disk type='block' device='disk'>
-                <driver name='qemu' type='raw' cache='none' io='native'/>
-                <source dev='/dev/sdc'/>
-                <target dev='sdc' bus='scsi'/>
-            </disk>
-            ...
-        </devices>
-    </domain>
-    ```
-
-- Physical RDM
-
-    The following is an example of mounting the SCSI disk **/dev/sdc** on the host to the VM as a physical raw device:
-
-    ```
-    <domain type='kvm'>
-        <devices>
-            ...
-            <disk type='block' device='lun'>
-                <driver name='qemu' type='raw' cache='none' io='native'/>
-                <source dev='/dev/sdc'/>
-                <target dev='sdc' bus='scsi'/>
-            </disk>
-            ...
-        </devices>
-    </domain>
-    ```
-
-#### kworker Isolation and Binding
-
-##### Overview
-
-kworker is a per-CPU thread implemented by the Linux kernel to execute workqueue requests in the system. kworker threads compete with vCPU threads for physical core resources, resulting in virtualization service performance jitter. To ensure that the VM runs stably and to reduce the interference of kworker threads on the VM, you can bind the kworker threads on the host to specific CPUs.
-
-##### Instructions
-
-You can modify the **/sys/devices/virtual/workqueue/cpumask** file to bind workqueue tasks to the CPUs specified by **cpumask**. The mask is in hexadecimal format. For example, to bind kworker to CPU0 through CPU7, run the following command to change the mask to **ff**:
-
-```
-# echo ff > /sys/devices/virtual/workqueue/cpumask
-```
-
-#### HugePage Memory
-
-##### Overview
-
-In addition to traditional 4 KB memory paging, openEuler supports 2 MB and 1 GB memory paging. HugePage memory effectively reduces TLB misses and significantly improves the performance of memory-intensive services. openEuler uses two technologies to implement HugePage memory.
-
-- Static HugePages
-
-    Static HugePages require that a static HugePage pool be reserved before the host OS is loaded. When creating a VM, you can modify the XML configuration file to specify that the VM memory be allocated from the static HugePage pool. Static HugePages ensure that all memory of a VM exists on the host as HugePages, guaranteeing physical continuity. However, they increase deployment difficulty: after the page size of the static HugePage pool is changed, the host must be restarted for the change to take effect. The size of a static HugePage can be 2 MB or 1 GB.
-
-- THP
-
-    If the transparent HugePage \(THP\) mode is enabled, the VM automatically selects available 2 MB consecutive pages and automatically splits and combines HugePages when allocating memory.
-    When no 2 MB consecutive pages are available, the VM selects available 64 KB \(AArch64 architecture\) or 4 KB \(x86\_64 architecture\) pages for allocation. THP is transparent to users, and 2 MB HugePages are used automatically to improve memory access performance.
-
-If VMs use static HugePages, you can disable THP to reduce the overhead of the host OS and ensure stable VM performance.
-
-##### Instructions
-
-- Configure static HugePages.
-
-    Before creating a VM, modify the XML configuration file to configure static HugePages for the VM.
-
-    ```
-    <memoryBacking>
-        <hugepages>
-            <page size='1' unit='GiB'/>
-        </hugepages>
-    </memoryBacking>
-    ```
-
-    The preceding XML segment indicates that a 1 GB static HugePage is configured for the VM.
-
-    ```
-    <memoryBacking>
-        <hugepages>
-            <page size='2' unit='MiB'/>
-        </hugepages>
-    </memoryBacking>
-    ```
-
-    The preceding XML segment indicates that a 2 MB static HugePage is configured for the VM.
-
-- Configure transparent HugePages.
-
-    Dynamically enable THP through sysfs:
-
-    ```
-    # echo always > /sys/kernel/mm/transparent_hugepage/enabled
-    ```
-
-    Dynamically disable THP:
-
-    ```
-    # echo never > /sys/kernel/mm/transparent_hugepage/enabled
-    ```
-
-### Security Best Practices
-
-#### Libvirt Authentication
-
-##### Overview
-
-When a user uses libvirt remote invocation without authentication, any third-party program that connects to the host's network can operate VMs through the libvirt remote invocation mechanism. This poses security risks. To improve system security, openEuler provides the libvirt authentication function: users can remotely invoke a VM through libvirt only after identity authentication. Only specified users can access the VM, thereby protecting VMs on the network.
-
-##### Enabling Libvirt Authentication
-
-By default, the libvirt remote invocation function is disabled on openEuler. The following describes how to enable the libvirt remote invocation and libvirt authentication functions.
-
-1. Log in to the host.
-2. Modify the libvirt service configuration file **/etc/libvirt/libvirtd.conf** to enable the libvirt remote invocation and libvirt authentication functions. For example, to enable TCP remote invocation based on the Simple Authentication and Security Layer \(SASL\) framework, configure parameters by referring to the following:
-
-    ```
-    # Transport layer security protocol. The value 0 indicates that the protocol is disabled, and the value 1 indicates that it is enabled. Set the value as needed.
-    listen_tls = 0
-    # TCP remote invocation. To enable the libvirt remote invocation and libvirt authentication functions, set the value to 1.
-    listen_tcp = 1
-    # Authentication protocol for TCP remote invocation. The following uses sasl as an example.
-    auth_tcp = "sasl"
-    ```
-
-3. Modify the **/etc/sasl2/libvirt.conf** configuration file to set the SASL mechanism and SASLDB.
-
-    ```
-    # Authentication mechanism of the SASL framework.
-    mech_list: digest-md5
-    # Database for storing usernames and passwords.
-    sasldb_path: /etc/libvirt/passwd.db
-    ```
-
-4. Add the user for SASL authentication and set the password. Take the user **userName** as an example:
-
-    ```
-    # saslpasswd2 -a libvirt userName
-    Password:
-    Again (for verification):
-    ```
-
-5. Modify the **/etc/sysconfig/libvirtd** configuration file to enable the libvirt listening option.
-
-    ```
-    LIBVIRTD_ARGS="--listen"
-    ```
-
-6. Restart the libvirtd service for the modification to take effect.
-
-    ```
-    # systemctl restart libvirtd
-    ```
-
-7. Check whether the authentication function for libvirt remote invocation takes effect. If the connection to the libvirt service succeeds after you enter the username and password as prompted, the function is successfully enabled.
- - ``` - # virsh -c qemu+tcp://192.168.0.1/system - Please enter your authentication name: openeuler - Please enter your password: - Welcome to virsh, the virtualization interactive terminal. - - Type: 'help' for help with commands - 'quit' to quit - - virsh # - ``` - - -##### Managing SASL - -The following describes how to manage SASL users. - -- Query an existing user in the database. - - ``` - # sasldblistusers2 -f /etc/libvirt/passwd.db - user@localhost.localdomain: userPassword - ``` - -- Delete a user from the database. - - ``` - # saslpasswd2 -a libvirt -d user - ``` - - -#### qemu-ga - -##### Overview - -QEMU guest agent \(qemu-ga\) is a daemon running within VMs. It allows users on a host OS to perform various management operations on the guest OS through outband channels provided by QEMU. The operations include file operations \(open, read, write, close, seek, and flush\), internal shutdown, VM suspend \(suspend-disk, suspend-ram, and suspend-hybrid\), and obtaining of VM internal information \(including the memory, CPU, NIC, and OS information\). - -In some scenarios with high security requirements, qemu-ga provides the blacklist function to prevent internal information leakage of VMs. You can use a blacklist to selectively shield some functions provided by qemu-ga. - ->![](./public_sys-resources/icon-note.gif) **NOTE:** ->The qemu-ga installation package is **qemu-guest-agent-**_xx_**.rpm**. It is not installed on openEuler by default. _xx_ indicates the actual version number. - -##### Procedure - -To add a qemu-ga blacklist, perform the following steps: - -1. Log in to the VM and ensure that the qemu-guest-agent service exists and is running. - - ``` - # systemctl status qemu-guest-agent |grep Active - Active: active (running) since Wed 2018-03-28 08:17:33 CST; 9h ago - ``` - -2. Query which **qemu-ga** commands can be added to the blacklist: - - ``` - # qemu-ga --blacklist ? 
-    guest-sync-delimited
-    guest-sync
-    guest-ping
-    guest-get-time
-    guest-set-time
-    guest-info
-    ...
-    ```
-
-3. Set the blacklist. Add the commands to be shielded to **--blacklist** in the **/usr/lib/systemd/system/qemu-guest-agent.service** file. Use spaces to separate different commands. For example, to add the **guest-file-open** and **guest-file-close** commands to the blacklist, configure the file by referring to the following:
-
-    ```
-    [Service]
-    ExecStart=-/usr/bin/qemu-ga \
-    --blacklist=guest-file-open guest-file-close
-    ```
-
-4. Restart the qemu-guest-agent service.
-
-    ```
-    # systemctl daemon-reload
-    # systemctl restart qemu-guest-agent
-    ```
-
-5. Check whether the qemu-ga blacklist function takes effect on the VM, that is, whether the **--blacklist** parameter configured for the qemu-ga process is correct.
-
-    ```
-    # ps -ef|grep qemu-ga|grep -E "blacklist=|b="
-    root 727 1 0 08:17 ? 00:00:00 /usr/bin/qemu-ga --method=virtio-serial --path=/dev/virtio-ports/org.qemu.guest_agent.0 --blacklist=guest-file-open guest-file-close guest-file-read guest-file-write guest-file-seek guest-file-flush -F/etc/qemu-ga/fsfreeze-hook
-    ```
-
-    >![](./public_sys-resources/icon-note.gif) **NOTE:**
-    >For more information about qemu-ga, visit [https://wiki.qemu.org/Features/GuestAgent](https://wiki.qemu.org/Features/GuestAgent).
-
-#### sVirt Protection
-
-##### Overview
-
-In a virtualization environment that uses the discretionary access control \(DAC\) policy only, malicious VMs running on hosts may attack the hypervisor or other VMs. To improve security in virtualization scenarios, openEuler uses sVirt for protection. sVirt is a security protection technology based on SELinux and is applicable to KVM virtualization scenarios. A VM is a common process on the host OS. In the hypervisor, the sVirt mechanism labels QEMU processes corresponding to VMs with SELinux labels.
In addition to the types used to label virtualization processes and files, different categories are used to label different VMs. Each VM can access only file devices of the same category. This prevents VMs from accessing files and devices on unauthorized hosts or other VMs, thereby preventing VM escape and improving host and VM security.
-
-##### Enabling sVirt Protection
-
-1. Enable SELinux on the host.
-    1. Log in to the host.
-    2. Enable the SELinux function on the host.
-        1. Modify the system startup parameter file **grub.cfg** to set **selinux** to **1**.
-
-            ```
-            selinux=1
-            ```
-
-        2. Modify **/etc/selinux/config** to set **SELINUX** to **enforcing**.
-
-            ```
-            SELINUX=enforcing
-            ```
-
-        3. Restart the host.
-
-            ```
-            # reboot
-            ```
-
-2. Create a VM where the sVirt function is enabled.
-    1. Add the following information to the VM configuration file:
-
-        ```
-        <seclabel type='dynamic' model='selinux'/>
-        ```
-
-        Or check whether the following configuration exists in the file:
-
-        ```
-        <seclabel type='dynamic' model='selinux'/>
-        ```
-
-    2. Create a VM.
-
-        ```
-        # virsh define openEulerVM.xml
-        ```
-
-3. Check whether sVirt is enabled.
-
-    Run the following command to check whether sVirt protection has been enabled for the QEMU process of the running VM. If **svirt\_t:s0:c** exists, sVirt protection has been enabled.
-
-    ```
-    # ps -eZ|grep qemu |grep "svirt_t:s0:c"
-    system_u:system_r:svirt_t:s0:c200,c947 11359 ? 00:03:59 qemu-kvm
-    system_u:system_r:svirt_t:s0:c427,c670 13790 ? 19:02:07 qemu-kvm
-    ```
-
diff --git a/docs/en/docs/desktop/Gnome_userguide.md b/docs/en/docs/desktop/Gnome_userguide.md
index a18db4f7bd5820047eb680dd25da7f9e9e412f0f..ac254c0a8dcdf5e7835ff71fd7fc6bfa31cac351 100644
--- a/docs/en/docs/desktop/Gnome_userguide.md
+++ b/docs/en/docs/desktop/Gnome_userguide.md
@@ -90,7 +90,7 @@ If there are so many apps and you know their names, you can enter an app name in
 
 #### 3.1.3 List of Active Apps
 
-Active apps, that is, running apps are displayed one by one after the last app in **Favorites**.
There is a white dot under each active app. +Active apps, that is, running apps are displayed one by one after the last app in **Favorites**. There is a white dot under the icon of each active app. ![](./figures/gnome-8.PNG) diff --git a/docs/en/docs/desktop/HA_use_cases.md b/docs/en/docs/desktop/HA_use_cases.md deleted file mode 100644 index 9358d970242314620efc6c60e2802d8842310c58..0000000000000000000000000000000000000000 --- a/docs/en/docs/desktop/HA_use_cases.md +++ /dev/null @@ -1,712 +0,0 @@ -# HA Usage Example - -This section describes how to get started with the HA cluster and add an instance. If you are not familiar with HA installation, see [Installing and Deploying HA](./installing-and-deploying-HA.md). - -## Quick Start Guide - -- The following operations use the management platform newly developed by the community as an example. - -### Login Page - -The user name is `hacluster`, and the password is the one set on the host by the user. - -![](./figures/HA-api.png) - -### Home Page - -After logging in to the system, the main page is displayed. The main page consists of the side navigation bar, the top operation area, the resource node list area, and the node operation floating area. - -The following describes the features and usage of the four areas in detail. - -![](./figures/HA-home-page.png) - -#### Navigation Bar - -The side navigation bar consists of two parts: the name and logo of the HA cluster software, and the system navigation. The system navigation consists of three parts: System, Cluster Configurations, and Tools. System is the default option and the corresponding item to the home page. It displays the information and operation entries of all resources in the system. Preference Settings and Heartbeat Configurations are under Cluster Configurations. Log Download and Quick Cluster Operation are under Tools. These two items are displayed in a pop-up box after you click them. 
-
-#### Top Operation Area
-
-The top operation area statically displays the current login user. When you hover the mouse cursor over the user icon, the operation menu items are displayed, including Refresh Settings and Log Out. After you click Refresh Settings, the Refresh Settings dialog box is displayed, where you can set the automatic refresh mode for the system. The options are Do not refresh automatically, Refresh every 5 seconds, and Refresh every 10 seconds. By default, Do not refresh automatically is selected. Click Log Out to log out and return to the login page. After that, you must log in again to continue to access the system.
-
-![](./figures/HA-refresh.png)
-
-#### Resource Node List Area
-
-The resource node list displays resource information such as Resource Name, Status, Resource Type, Service, and Running Node of all resources in the system, as well as node information such as all nodes in the system and their running status. In addition, you can Add, Edit, Start, Stop, Clear, Migrate, Migrate Back, and Delete the resources, and set Relationships for the resources.
-
-#### Node Operation Floating Area
-
-By default, the node operation floating area is collapsed. When you click a node in the heading of the resource node list, the node operation area is displayed on the right, as shown in the preceding figure. This area consists of the collapse button, the node name, the stop button, and the standby button, and provides the stop and standby operations. Click the arrow in the upper left corner of the area to collapse it.
-
-### Preference Settings
-
-The following operations can also be performed on the command line; an example is shown below. For more command details, run the `pcs --help` command.
-
-```shell
-pcs property set stonith-enabled=false
-pcs property set no-quorum-policy=ignore
-```
-
-Run `pcs property` to view all settings.
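The two `pcs property` commands above can also be wrapped in a small script; a minimal sketch, which is side-effect free unless `pcs` is installed and a cluster is reachable (otherwise it only prints the commands):

```shell
# Sketch: apply the two preference settings and fall back to printing
# the commands when pcs is not installed.
apply_prefs() {
  for prop in stonith-enabled=false no-quorum-policy=ignore; do
    if command -v pcs >/dev/null 2>&1; then
      pcs property set "$prop"
    else
      echo "pcs property set $prop"
    fi
  done
}
apply_prefs
```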
-
-![](./figures/HA-firstchoice-cmd.png)
-
-- Click Preference Settings in the navigation bar. The Preference Settings dialog box is displayed. Change the values of No Quorum Policy and Stonith Enabled from the default values to the values shown in the figure below. Then, click OK.
-
-![](./figures/HA-firstchoice.png)
-
-#### Adding Resources
-
-##### Adding Common Resources
-
-Click Add Common Resource. The Create Resource dialog box is displayed. All mandatory configuration items of the resource are on the Basic page. After you select a Resource Type on the Basic page, the other mandatory and optional configuration items of the resource are displayed. As you enter the resource configuration information, a gray text area on the right of the dialog box describes the current configuration item. After all mandatory parameters are set, click OK to create the common resource, or click Cancel to cancel the operation. The configuration items on the Instance Attribute, Meta Attribute, and Operation Attribute pages are optional; resource creation is not affected if they are left unset, and the default values are used. You can modify them as required.
-
-The following uses Apache as an example to describe how to add an Apache resource.
-
-```shell
-pcs resource create httpd ocf:heartbeat:apache
-```
-
-Check the resource running status:
-
-```shell
-pcs status
-```
-
-![](./figures/HA-pcs-status.png)
-
-- Add the Apache resource:
-
-![](./figures/HA-add-resource.png)
-
-- If the following information is displayed, the resource is successfully added:
-
-![](./figures/HA-apache-suc.png)
-
-- The resource is successfully created and started, and runs on a node, for example, ha1. The Apache page is displayed.
-
-![](./figures/HA-apache-show.png)
-
-##### Adding Group Resources
-
-Adding group resources requires at least one common resource in the cluster. Click Add Group Resource. The Create Resource dialog box is displayed.
All the parameters on the Basic tab page are mandatory. After setting the parameters, click OK to add the resource or click Cancel to cancel the add operation. - -- **Note: Group resources are started in the sequence of child resources. Therefore, you need to select child resources in sequence.** - -![](./figures/HA-group.png) - -If the following information is displayed, the resource is successfully added: - -![](./figures/HA-group-suc.png) - -##### Adding Clone Resources - -Click Add Clone Resource. The Create Resource dialog box is displayed. On the Basic page, enter the object to be cloned. The resource name is automatically generated. After entering the object name, click OK to add the resource, or click Cancel to cancel the add operation. - -![](./figures/HA-clone.png) - -If the following information is displayed, the resource is successfully added: - -![](./figures/HA-clone-suc.png) - -#### Editing Resources - -- Starting a resource: Select a target resource from the resource node list. The target resource must not be running. Start the resource. -- Stopping a resource: Select a target resource from the resource node list. The target resource must be running. Stop the resource. -- Clearing a resource: Select a target resource from the resource node list. Clear the resource. -- Migrating a resource: Select a target resource from the resource node list. The resource must be a common resource or a group resource in the running status. Migrate the resource to migrate it to a specified node. -- Migrating back a resource: Select a target resource from the resource node list. The resource must be a migrated resource. Migrate back the resource to clear the migration settings of the resource and migrate the resource back to the original node. - After you click Migrate Back, the status change of the resource item in the list is the same as that when the resource is started. -- Deleting a resource: Select a target resource from the resource node list. 
Delete the resource. - -#### Setting Resource Relationships - -Resource relationships are used to set restrictions for the target resources. There are three types of resource restrictions: resource location, resource collaboration, and resource order. - -- Resource location: sets the running level of the nodes in the cluster for the resource to determine the node where the resource runs during startup or switchover. The running levels are Master Node and Slave 1 in descending order. -- Resource collaboration: indicates whether the target resource and other resources in the cluster run on the same node. Same Node indicates that this resource must run on the same node as the target resource. Mutually Exclusive indicates that this resource cannot run on the same node as the target resource. -- Resource order: Set the order in which the target resource and other resources in the cluster are started. Front Resource indicates that this resource must be started before the target resource. Follow-up Resource indicates that this resource can be started only after the target resource is started. - -## HA MySQL Configuration Example - -- Configure three common resources separately, then add them as a group resource. - -### Configuring the Virtual IP Address - -On the home page, choose Add > Add Common Resource and set the parameters as follows: - -![](./figures/HA-vip.png) - -- The resource is successfully created and started and runs on a node, for example, ha1. The resource can be pinged and connected, and allows various operations after login. The resource is switched to ha2 and can be accessed normally. -- If the following information is displayed, the resource is successfully added: - -![](./figures/HA-vip-suc.png) - -### Configuring NFS Storage - -- Configure another host as the NFS server. 
- -Install the software packages: - -```shell -yum install -y nfs-utils rpcbind -``` - -Run the following command to disable the firewall: - -```shell -systemctl stop firewalld && systemctl disable firewalld -``` - -Modify the **/etc/selinux/config** file to change the status of SELINUX to disabled. - -```shell -SELINUX=disabled -``` - -Start the services: - -```shell -systemctl start rpcbind && systemctl enable rpcbind -systemctl start nfs-server && systemctl enable nfs-server -``` - -Create a shared directory on the server: - -```shell -mkdir -p /test -``` - -Modify the NFS configuration file: - -```shell -vim /etc/exports -/test *(rw,no_root_squash) -``` - -Reload the service: - -```shell -systemctl reload nfs -``` - -Install the software packages on the client. Install MySQL first to mount the NFS to the path of the MySQL data. - -```shell -yum install -y nfs-utils mariadb-server -``` - -On the home page, choose Add > Add Common Resource and configure the NFS resource as follows: - -![](./figures/HA-nfs.png) - -- The resource is successfully created and started and runs on a node, for example, ha1. The NFS is mounted to the **/var/lib/mysql** directory. The resource is switched to ha2. The NFS is unmounted from ha1 and automatically mounted to ha2. -- If the following information is displayed, the resource is successfully added: - -![](./figures/HA-nfs-suc.png) - -### Configuring MySQL - -On the home page, choose Add > Add Common Resource and configure the MySQL resource as follows: - -![](./figures/HA-mariadb.png) - -- If the following information is displayed, the resource is successfully added: - -![](./figures/HA-mariadb-suc.png) - -### Adding the Preceding Resources as a Group Resource - -- Add the three resources in the resource startup sequence. - -On the home page, choose Add > Add Group Resource and configure the group resource as follows: - -![](./figures/HA-group-new.png) - -- The group resource is successfully created and started. 
If the command output is the same as that of the preceding common resources, the group resource is successfully added. - -![](./figures/HA-group-new-suc.png) - -- Use ha1 as the standby node and migrate the group resource to the ha2 node. The system is running properly. - -![](./figures/HA-group-new-suc2.png) - -## Quorum Device Configuration - -Note: The current cluster must be normal with cluster attributes set. - -```sh -[root@ha1 ~]# pcs property set no-quorum-policy=stop -[root@ha1 ~]# pcs property set stonith-enabled=false -``` - -Select a new machine as the quorum device. - -### Installing Quorum Software - -- Install corosync-qdevice on a cluster node, for example, ha1. - -```sh -[root@ha1:~]# dnf install corosync-qdevice -y -``` - -- Install pcs and corosync-qnetd on the quorum device host. - -```sh -[root@qdevice:~]# dnf install pcs corosync-qnetd -y -``` - -- Start the pcsd service on the quorum device host and enable the pcsd service to start upon system startup. - -```sh -[root@qdevice:~]# systemctl start pcsd && systemctl enable pcsd -``` - -### Modifying the Host Name and the /etc/hosts File - -**Note: Perform the following operations on all the three hosts. The following uses one host as an example.** - -Before using the quorum function, change the host name, write all host names to the **/etc/hosts** file, and set the password for the **hacluster** user. - -- Change the host name. - -```shell -hostnamectl set-hostname qdevice -``` - -- Write the IP addresses and host names to the **/etc/hosts** file. - -```text -10.1.167.105 ha1 -10.1.167.105 ha2 -10.1.167.106 qdevice -``` - -- Set the password for the **hacluster** user. - -```sh -[root@qdevice:~]# passwd hacluster -``` - -### Configuring the Quorum Device and Adding It to the Cluster - -The following describes how to configure the quorum device and add it to the cluster. - -- The qdevice node is used as the quorum device. -- The model of the quorum device is net. 
-- The cluster nodes are ha1 and ha2.
-
-#### Disabling the Firewall
-
-```sh
-systemctl stop firewalld && systemctl disable firewalld
-```
-
-- Temporarily disable SELinux.
-
-```sh
-setenforce 0
-```
-
-#### Configuring the Quorum Device
-
-On the node that will host the quorum device, run the following command to configure the quorum device. It sets the model of the quorum device to net and configures the device to start during boot.
-
-```sh
-[root@qdevice ~]# pcs qdevice setup model net --enable --start
-Quorum device 'net' initialized
-quorum device enabled
-Starting quorum device...
-quorum device started
-```
-
-After configuring the quorum device, view its status. The current status indicates that the corosync-qnetd daemon is running and no client is connected to it. Use the **--full** option to display detailed output.
-
-```sh
-[root@qdevice ~]# pcs qdevice status net --full
-QNetd address: *:5403
-TLS: Supported (client certificate required)
-Connected clients: 0
-Connected clusters: 0
-Maximum send/receive size: 32768/32768 bytes
-```
-
-#### Authenticating Identities
-
-From a cluster node, authenticate the **hacluster** user on the node hosting the quorum device. This allows pcs on the cluster nodes to connect to pcs on the qdevice host, but not the reverse.
-
-```sh
-[root@ha1 ~]# pcs host auth qdevice
-Username: hacluster
-Password:
-qdevice: Authorized
-```
-
-#### Adding the Quorum Device to the Cluster
-
-Before adding the quorum device, run the **pcs quorum config** command to view its current configuration for later comparison.
-
-```sh
-[root@ha1 ~]# pcs quorum config
-Options:
-```
-
-Run the **pcs quorum status** command to check the current status of the quorum device. The output indicates that the cluster does not use a quorum device and that the Qdevice member status of each node is NR (unregistered).
- -```sh -[root@ha1 ~]# pcs quorum status -Quorum information ------------------- -Date: Mon Sep 4 17:03:29 2023 -Quorum provider: corosync_votequorum -Nodes: 2 -Node ID: 1 -Ring ID: 1.e -Quorate: Yes - -Votequorum information ----------------------- -Expected votes: 2 -Highest expected: 2 -Total votes: 2 -Quorum: 1 -Flags: 2Node Quorate WaitForAll - -Membership information ----------------------- - Nodeid Votes Qdevice Name - 1 1 NR ha1 (local) - 2 1 NR ha2 -``` - -Add the created quorum device to the cluster. Note that multiple quorum devices cannot be used in a cluster at the same time. However, a quorum device can be used by multiple clusters at the same time. This example configures the quorum device to use the ffsplit algorithm. - -```sh -[root@ha1 ~]# pcs quorum device add model net host=qdevice algorithm=ffsplit -Setting up qdevice certificates on nodes... -ha1: Succeeded -ha2: Succeeded -Enabling corosync-qdevice... -ha2: corosync-qdevice enabled -ha1: corosync-qdevice enabled -Sending updated corosync.conf to nodes... -ha1: Succeeded -ha2: Succeeded -ha1: Corosync configuration reloaded -Starting corosync-qdevice... -ha2: corosync-qdevice started -ha1: corosync-qdevice started -``` - -View the corosync-qdevice service status. - -```sh -[root@ha1 ~]# systemctl status corosync-qdevice -● corosync-qdevice.service - Corosync Qdevice daemon - Loaded: loaded (/usr/lib/systemd/system/corosync-qdevice.service; enabled; preset: disabled> - Active: active (running) since Mon 2023-09-04 17:03:49 CST; 20s ago - Docs: man:corosync-qdevice - Main PID: 12756 (corosync-qdevic) - Tasks: 2 (limit: 11872) - Memory: 1.6M - CGroup: /system.slice/corosync-qdevice.service - ├─12756 /usr/sbin/corosync-qdevice -f - └─12757 /usr/sbin/corosync-qdevice -f - -Sep 04 17:03:49 ha1 systemd[1]: Starting Corosync Qdevice daemon... -Sep 04 17:03:49 ha1 systemd[1]: Started Corosync Qdevice daemon. 
-``` - -#### Checking the Configuration Status of the Quorum Device - -Check the configuration changes in the cluster. Run the **pcs quorum config** command to view information about the configured quorum device. - -```shell -[root@ha1 ~]# pcs quorum config -Options: -Device: - Model: net - algorithm: ffsplit - host: qdevice -``` - -The **pcs quorum status** command displays the quorum running status, indicating that the quorum device is in use. The meanings of the member status values of each cluster node are as follows: - -- **A**/**NA**: Whether the quorum device is alive, indicating whether there is heartbeat corosync between qdevice and the cluster. This should always indicate that the quorum device is active. -- **V**/**NV**: **V** is set when the quorum device votes for a node. In this example, both nodes are set to **V** because they can communicate with each other. If the cluster is split into two single-node clusters, one node is set to **V** and the other is set to **NV**. -- **MW**/**NMW**: The internal quorum device flag is set (**MW**) or not set (**NMW**). By default, the flag is not set and the value is **NMW**. - -```sh -[root@ha1 ~]# pcs quorum status -Quorum information ------------------- -Date: Mon Sep 4 17:04:33 2023 -Quorum provider: corosync_votequorum -Nodes: 2 -Node ID: 1 -Ring ID: 1.e -Quorate: Yes - -Votequorum information ----------------------- -Expected votes: 3 -Highest expected: 3 -Total votes: 3 -Quorum: 2 -Flags: Quorate Qdevice - -Membership information ----------------------- - Nodeid Votes Qdevice Name - 1 1 A,V,NMW ha1 (local) - 2 1 A,V,NMW ha2 - 0 1 Qdevice -``` - -Run the **pcs quorum device status** command to view the running status of the quorum device. 
- -```shell -[root@ha1 ~]# pcs quorum device status -Qdevice information -------------------- -Model: Net -Node ID: 1 -Configured node list: - 0 Node ID = 1 - 1 Node ID = 2 -Membership node list: 1, 2 - -Qdevice-net information ----------------------- -Cluster name: hacluster -QNetd host: qdevice:5403 -Algorithm: Fifty-Fifty split -Tie-breaker: Node with lowest node ID -State: Connected -``` - -On the quorum device, run the following command to display the status of the corosync-qnetd daemon: - -```sh -[root@qdevice ~]# pcs qdevice status net --full -QNetd address: *:5403 -TLS: Supported (client certificate required) -Connected clients: 2 -Connected clusters: 1 -Maximum send/receive size: 32768/32768 bytes -Cluster "hacluster": - Algorithm: Fifty-Fifty split (KAP Tie-breaker) - Tie-breaker: Node with lowest node ID - Node ID 1: - Client address: ::ffff:10.211.55.36:43186 - HB interval: 8000ms - Configured node list: 1, 2 - Ring ID: 1.e - Membership node list: 1, 2 - Heuristics: Undefined (membership: Undefined, regular: Undefined) - TLS active: Yes (client certificate verified) - Vote: No change (ACK) - Node ID 2: - Client address: ::ffff:10.211.55.37:55682 - HB interval: 8000ms - Configured node list: 1, 2 - Ring ID: 1.e - Membership node list: 1, 2 - Heuristics: Undefined (membership: Undefined, regular: Undefined) - TLS active: Yes (client certificate verified) - Vote: ACK (ACK) -``` - -### Managing Quorum Device Services - -You can manage the quorum device by starting and stopping the corosync-qnetd service. 
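The stop/start checks in this section can also be scripted. A minimal sketch, hedged: it parses the `State:` field out of `pcs quorum device status` output; the sample text below is copied from this guide, and on a live cluster node you would pipe the real command output instead.

```shell
# Extract the "State:" field from `pcs quorum device status` output.
# The sample is taken from this guide; on a real cluster node, replace it with:
#   state=$(pcs quorum device status | awk -F': ' '/^State:/ {print $2}')
sample='Qdevice-net information
----------------------
Cluster name: hacluster
QNetd host: qdevice:5403
Algorithm: Fifty-Fifty split
Tie-breaker: Node with lowest node ID
State: Connected'

state=$(printf '%s\n' "$sample" | awk -F': ' '/^State:/ {print $2}')
echo "qdevice state: $state"   # prints "qdevice state: Connected"
```

A state other than `Connected` (for example, `Connect failed`, shown later in this section) indicates that the corosync-qnetd daemon is unreachable from the cluster node.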
-
-```sh
-[root@ha1 ~]# pcs quorum device status
-Qdevice information
--------------------
-Model: Net
-Node ID: 1
-Configured node list:
-    0 Node ID = 1
-    1 Node ID = 2
-Membership node list: 1, 2
-
-Qdevice-net information
-----------------------
-Cluster name: hacluster
-QNetd host: qdevice:5403
-Algorithm: Fifty-Fifty split
-Tie-breaker: Node with lowest node ID
-State: Connected
-```
-
-```sh
-[root@qdevice ~]# systemctl stop corosync-qnetd
-[root@qdevice ~]#
-[root@qdevice ~]# systemctl status corosync-qnetd
-○ corosync-qnetd.service - Corosync Qdevice Network daemon
-     Loaded: loaded (/usr/lib/systemd/system/corosync-qnetd.service; enabled; preset: disabled>
-     Active: inactive (dead) since Mon 2023-09-04 17:07:57 CST; 1s ago
-   Duration: 5min 17.639s
-       Docs: man:corosync-qnetd
-    Process: 9297 ExecStart=/usr/bin/corosync-qnetd -f $COROSYNC_QNETD_OPTIONS (code=exited>
-   Main PID: 9297 (code=exited, status=0/SUCCESS)
-
-Sep 04 17:02:39 qdevice systemd[1]: Starting Corosync Qdevice Network daemon...
-Sep 04 17:02:39 qdevice systemd[1]: Started Corosync Qdevice Network daemon.
-Sep 04 17:07:57 qdevice systemd[1]: Stopping Corosync Qdevice Network daemon...
-Sep 04 17:07:57 qdevice systemd[1]: corosync-qnetd.service: Deactivated successfully.
-Sep 04 17:07:57 qdevice systemd[1]: Stopped Corosync Qdevice Network daemon.
-```
-
-```sh
-[root@ha1 ~]# pcs quorum device status
-Qdevice information
--------------------
-Model: Net
-Node ID: 1
-Configured node list:
-    0 Node ID = 1
-    1 Node ID = 2
-Membership node list: 1, 2
-
-Qdevice-net information
-----------------------
-Cluster name: hacluster
-QNetd host: qdevice:5403
-Algorithm: Fifty-Fifty split
-Tie-breaker: Node with lowest node ID
-State: Connect failed
-```
-
-```sh
-[root@qdevice ~]# systemctl start corosync-qnetd
-[root@qdevice ~]#
-[root@qdevice ~]# systemctl status corosync-qnetd
-● corosync-qnetd.service - Corosync Qdevice Network daemon
-     Loaded: loaded (/usr/lib/systemd/system/corosync-qnetd.service; enabled; preset: disabled>
-     Active: active (running) since Mon 2023-09-04 17:08:09 CST; 3s ago
-       Docs: man:corosync-qnetd
-   Main PID: 9323 (corosync-qnetd)
-      Tasks: 1 (limit: 11872)
-     Memory: 6.2M
-     CGroup: /system.slice/corosync-qnetd.service
-             └─9323 /usr/bin/corosync-qnetd -f
-
-Sep 04 17:08:09 qdevice systemd[1]: Starting Corosync Qdevice Network daemon...
-Sep 04 17:08:09 qdevice systemd[1]: Started Corosync Qdevice Network daemon.
-```
-
-```sh
-[root@ha1 ~]# pcs quorum device status
-Qdevice information
--------------------
-Model: Net
-Node ID: 1
-Configured node list:
-    0 Node ID = 1
-    1 Node ID = 2
-Membership node list: 1, 2
-
-Qdevice-net information
-----------------------
-Cluster name: hacluster
-QNetd host: qdevice:5403
-Algorithm: Fifty-Fifty split
-Tie-breaker: Node with lowest node ID
-State: Connected
-```
-
-### Managing the Quorum Device in the Cluster
-
-You can use the **pcs** commands to change quorum device settings, disable the quorum device, and delete the quorum device from the cluster.
-
-#### Changing Quorum Device Settings
-
-**Note: To change the host option of the net quorum device model, run the pcs quorum device remove and pcs quorum device add commands to set up the configuration properly, unless the new host is the same as the old one.**
-
-- Change the quorum device algorithm to lms.
-
-```sh
-[root@ha1 ~]# pcs quorum device update model algorithm=lms
-Sending updated corosync.conf to nodes...
-ha1: Succeeded
-ha2: Succeeded
-ha1: Corosync configuration reloaded
-Reloading qdevice configuration on nodes...
-ha1: corosync-qdevice stopped
-ha2: corosync-qdevice stopped
-ha1: corosync-qdevice started
-ha2: corosync-qdevice started
-```
-
-#### Deleting the Quorum Device
-
-- Delete the quorum device configured on the cluster node.
-
-```sh
-[root@ha1 ~]# pcs quorum device remove
-Disabling corosync-qdevice...
-ha1: corosync-qdevice disabled
-ha2: corosync-qdevice disabled
-Stopping corosync-qdevice...
-ha1: corosync-qdevice stopped
-ha2: corosync-qdevice stopped
-Removing qdevice certificates from nodes...
-ha1: Succeeded
-ha2: Succeeded
-Sending updated corosync.conf to nodes...
-ha1: Succeeded
-ha2: Succeeded
-ha1: Corosync configuration reloaded
-```
-
-After the quorum device is deleted, check the quorum device status. The following error message is displayed:
-
-```shell
-[root@ha1 ~]# pcs quorum device status
-Error: Unable to get quorum status: corosync-qdevice-tool: Can't connect to QDevice socket (is QDevice running?): No such file or directory
-```
-
-#### Destroying the Quorum Device
-
-- Disable and stop the quorum device on the quorum device host and delete all its configuration files.
-
-```shell
-[root@qdevice ~]# pcs qdevice destroy net
-Stopping quorum device...
-quorum device stopped
-quorum device disabled
-Quorum device 'net' configuration files removed
-```
-
-## Encrypting corosync Service Configurations
-
-After the cluster starts normally, modify the **/etc/corosync/corosync.conf** configuration file on both nodes.
-
-```conf
-totem {
-    version: 2
-    cluster_name: hacluster
-    crypto_cipher: aes256
-    crypto_hash: sha256
-}
-```
-
-Run `corosync-keygen` to generate an authentication key for the corosync cluster, then run `scp` to copy the key to the other node.
- -```sh -[root@ha1 ~]# corosync-keygen -Corosync Cluster Engine Authentication key generator. -Gathering 2048 bits for key from /dev/urandom. -Writing corosync key to /etc/corosync/authkey. -[root@ha1 ~]# -[root@ha1 ~]# scp -r /etc/corosync/authkey root@10.211.55.37:/etc/corosync/ -``` - -Restart the corosync service on both nodes. - -```sh -systemctl restart corosync -``` diff --git a/docs/en/docs/desktop/HAuserguide.md b/docs/en/docs/desktop/HAuserguide.md deleted file mode 100644 index bcf248dc345848eb246a392ead53ce4abb91a381..0000000000000000000000000000000000000000 --- a/docs/en/docs/desktop/HAuserguide.md +++ /dev/null @@ -1,358 +0,0 @@ -# Installing, Deploying, and Using HA - - -- [Installing, Deploying, and Using HA](#installing-deploying-and-using-ha) - - [Installation and Configuration](#installation-and-configuration) - - [Modifying the Host Name and the /etc/hosts File](#modifying-the-host-name-and-the-etchosts-file) - - [Configuring the Yum Source](#configuring-the-yum-source) - - [Installing HA Software Package Components](#installing-ha-software-package-components) - - [Setting the hacluster User Password](#setting-the-hacluster-user-password) - - [Modifying the `/etc/corosync/corosync.conf` File](#modifying-the-etccorosynccorosyncconf-file) - - [Managing Services](#managing-services) - - [Disabling the Firewall](#disabling-the-firewall) - - [Managing the pcs Service](#managing-the-pcs-service) - - [Managing the pacemaker Service](#managing-the-pacemaker-service) - - [Managing the corosync Service](#managing-the-corosync-service) - - [Performing Node Authentication](#performing-node-authentication) - - [Accessing the Front-End Management Platform](#accessing-the-front-end-management-platform) - - [Quick User Guide](#quick-user-guide) - - [Login Page](#login-page) - - [Home Page](#home-page) - - [Managing Nodes](#managing-nodes) - - [Node](#node) - - [Preference Setting](#preference-setting) - - [Adding Resources](#adding-resources) - - [Adding 
Common Resources](#adding-common-resources) - - [Adding Group Resources](#adding-group-resources) - - [Adding Clone Resources](#adding-clone-resources) - - [Editing Resources](#editing-resources) - - [Setting Resource Relationships](#setting-resource-relationships) - - [ACLS](#acls) - - - - -## Installation and Configuration - -- Environment preparation: At least two physical machines or VMs with openEuler 20.03 LTS SP2 installed are required. (This section uses two physical machines or VMs as an example.) For details, see the *openEuler 20.03 LTS SP2 Installation Guide*. - -### Modifying the Host Name and the /etc/hosts File - -- **Note: You need to perform the following operations on both hosts. The following takes the operation on one host as an example.** - -Before using the HA software, ensure that the host name has been changed and all host names have been written into the `/etc/hosts` file. - -- Run the following command to change the host name: - -``` -# hostnamectl set-hostname ha1 -``` - -- Edit the `/etc/hosts` file and write the following fields: - -``` -172.30.30.65 ha1 -172.30.30.66 ha2 -``` - -### Configuring the Yum Source - -After the system is successfully installed, the Yum source is configured by default. The file location information is stored in the `/etc/yum.repos.d/openEuler.repo` file. 
The HA software package uses the following sources: - -``` -[OS] -name=OS -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/RPM-GPG-KEY-openEuler - -[everything] -name=everything -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/everything/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/everything/$basearch/RPM-GPG-KEY-openEuler - -[EPOL] -name=EPOL -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/EPOL/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/RPM-GPG-KEY-openEuler -``` - -### Installing HA Software Package Components - -``` -# yum install corosync pacemaker pcs fence-agents fence-virt corosync-qdevice sbd drbd drbd-utils -y -``` - -### Setting the hacluster User Password - -``` -# passwd hacluster -``` - -### Modifying the `/etc/corosync/corosync.conf` File - -``` -totem { - version: 2 - cluster_name: hacluster - crypto_cipher: none - crypto_hash: none -} -logging { - fileline: off - to_stderr: yes - to_logfile: yes - logfile: /var/log/cluster/corosync.log - to_syslog: yes - debug: on - logger_subsys { - subsys: QUORUM - debug: on - } -} -quorum { - provider: corosync_votequorum - expected_votes: 2 - two_node: 1 - } -nodelist { - node { - name: ha1 - nodeid: 1 - ring0_addr: 172.30.30.65 - } - node { - name: ha2 - nodeid: 2 - ring0_addr: 172.30.30.66 - } - } -``` - -### Managing Services - -#### Disabling the Firewall - -``` -# systemctl stop firewalld -``` - -Change the status of SELINUX in the `/etc/selinux/config` file to **disabled**. 
- -``` -# SELINUX=disabled -``` - -#### Managing the pcs Service - -- Run the following command to start the **pcs** service: - -``` -# systemctl start pcsd -``` - -- Run the following command to query service status: - -``` -# systemctl status pcsd -``` - -The service is started successfully if the following information is displayed: - -![](./figures/HA-pcs.png) - -#### Managing the pacemaker Service - -- Run the following command to start the **pacemaker** service: - -``` -# systemctl start pacemaker -``` - -- Run the following command to query service status: - -``` -# systemctl status pacemaker -``` - -The service is started successfully if the following information is displayed: - -![](./figures/HA-pacemaker.png) - -#### Managing the corosync Service - -- Run the following command to start the **corosync** service: - -``` -# systemctl start corosync -``` - -- Run the following command to query service status: - -``` -# systemctl status corosync -``` - -The service is started successfully if the following information is displayed: - -![](./figures/HA-corosync.png) - -### Performing Node Authentication - -- **Note: Perform this operation on only one node.** - -``` -# pcs host auth ha1 ha2 -``` - -### Accessing the Front-End Management Platform - -After the preceding services are started, open the browser (Chrome or Firefox is recommended) and enter `https://IP:2224` in the address box. - -## Quick User Guide - -### Login Page - -The username is **hacluster** and the password is the one set on the host. - -![](./figures/HA-login.png) - -### Home Page - -The home page is the **MANAGE CLUSTERS** page, which includes four functions: remove, add existing, destroy, and create new clusters. - -![](./figures/HA-home-page.png) - -### Managing Nodes - -#### Node - -You can add and remove nodes. The following describes how to add an existing node. 
- -![](./figures/HA-existing-nodes.png) - -Node management includes the following functions: start, stop, restart, standby, maintenance, and configure Fencing. You can view the enabled services and running resources of the node and manage the node. - -![](./figures/HA-node-setting1.png) ![](./figures/HA-node-setting2.png) - -### Preference Setting - -You can perform the following operations using command lines. The following is a simple example. Run the **pcs --help** command to query more commands available. - -``` -# pcs property set stonith-enabled=false -# pcs property set no-quorum-policy=ignore -``` - -Run the **pcs property** command to view all settings. - -![](./figures/HA-firstchoice-cmd.png) - -- Change the default status of **No Quorum Policy** to **ignore**, and the default status of **Stonith Enabled** to **false**, as shown in the following figure: - -![](./figures/HA-firstchoice.png) - -#### Adding Resources - -##### Adding Common Resources - -The multi-option drop-down list box in the system supports keyword matching. You can enter the keyword of the item to be configured and quickly select it. - -Apache and IPaddr are used as examples. - -Run the following commands to add the Apache and IPaddr resources: - -``` -# pcs resource create httpd ocf:heartbeat:apache -# pcs resource create IPaddr ocf:heartbeat:IPaddr2 ip=172.30.30.67 -``` - -Run the following command to check the cluster resource status: - -``` -# pcs status -``` - -![](./figures/HA-pcs-status.png) - -![](./figures/HA-add-resource.png) - -- Add Apache resources. - -![](./figures/HA-apache.png) - -- The resources are successfully added if the following information is displayed: - -![](./figures/HA-apache-suc.png) - -- The resources are created and started successfully, and run on a node, for example, **ha1**. The Apache page is displayed. - -![](./figures/HA-apache-show.png) - -- Add IPaddr resources. 
- -![](./figures/HA-ipaddr.png) - -- The resources are successfully added if the following information is displayed: - -![](./figures/HA-ipaddr-suc.png) - -- The resources are created and started successfully, and run on a node, for example, **ha1**. The HA web login page is displayed, and you can log in to the page and perform operations. When the resources are switched to **ha2**, the web page can still be accessed. - -![](./figures/HA-ipaddr-show.png) - -##### Adding Group Resources - -When you add group resources, at least one common resource is needed in the cluster. Select one or more resources and click **Create Group**. - -- **Note: Group resources are started in the sequence of subresources. Therefore, you need to select subresources in sequence.** - -![](./figures/HA-group.png) - -The resources are successfully added if the following information is displayed: - -![](./figures/HA-group-suc.png) - -##### Adding Clone Resources - -![](./figures/HA-clone.png) - -The resources are successfully added if the following information is displayed: - -![](./figures/HA-clone-suc.png) - -#### Editing Resources - -- **Enable**: Select a target resource that is not running from the resource node list. Enable the resource. -- **Disable**: Select a target resource that is running from the resource node list. Disable the resource. -- **Clearup**: Select a target resource from the resource node list and clear the resource. -- **Porting**: Select a target resource from the resource node list. The resource must be a common resource or group resource that is running. You can port the resource to a specified node. -- **Rollback**: Select a target resource from the resource node list. Before rolling back a resource, ensure that the resource has been ported. You can clear the porting settings of the resource and roll the resource back to the original node. After you click the button, the status of the resource item in the list is the same as that when the resource is enabled. 
-- **Remove**: Select a target resource from the resource node list and remove the resource. - -You can perform the preceding resource operations on the page shown in the following figure: - -![](./figures/HA-resoure-set.png) - -#### Setting Resource Relationships - -The resource relationship is used to set restrictions for target resources. Resource restrictions are classified as follows: **resource location**, **resource colocation**, and **resource ordering**. - -- **Resource location**: Set the runlevel of nodes in the cluster to determine the node where the resource runs during startup or switchover. The runlevels are Master and Slave in descending order. -- **Resource colocation**: Indicate whether the target resource and other resources in the cluster are running on the same node. For resources on the same node, the resource must run on the same node as the target resource. For resources on mutually exclusive nodes, the resource and the target resource must run on different nodes. -- **Resource ordering**: Set the ordering in which the target resource and other resources in the cluster are started. The preamble resource must run before the target resource runs. The postamble resource can run only after the target resource runs. - -After adding common resources or group resources, you can perform the preceding resource operations on the page shown in the following figure: - -![](./figures/HA-resource-relationship.png) - -#### ACLS - -ACLS is an access control list. You can click **Add** to add a user and manage the user access. 
-
-![](./figures/HA-ACLS.png)
\ No newline at end of file
diff --git a/docs/en/docs/desktop/Install_Cinnamon.md b/docs/en/docs/desktop/Install_Cinnamon.md
deleted file mode 100644
index 2b15a6760c7781cd47c91cdd9e4bbee21153d256..0000000000000000000000000000000000000000
--- a/docs/en/docs/desktop/Install_Cinnamon.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Installing Cinnamon on openEuler
-
-Cinnamon is one of the most commonly used desktop environments on Unix-like operating systems. Developed by the Linux Mint team as a fork of GNOME Shell, it offers complete functionality, simple operations, a user-friendly interface, and integrated use and development capabilities.
-
-For users, Cinnamon is a suite that integrates the desktop environment and applications. For developers, Cinnamon is an application development framework consisting of a large number of function libraries. Applications written for Cinnamon can run properly even when the Cinnamon desktop environment is not running.
-
-Cinnamon contains basic software such as a file manager, application store, and text editor, as well as advanced applications and tools such as system sampling analysis, system logs, a software engineering IDE, a web browser, a simple virtual machine monitor, and a developer documentation browser.
-
-You are advised to create an administrator user during the installation.
-
-1. Configure the source and update the system.
-    [Download](https://openeuler.org/en/) the openEuler ISO file, install the system, and update the software source. (You need to configure the Everything and EPOL sources. The following commands install Cinnamon on a minimally installed system.)
-
-    ```shell
-    sudo dnf update
-    ```
-
-2. Install the font library.
-
-    ```shell
-    sudo dnf install dejavu-fonts liberation-fonts gnu-*-fonts google-*-fonts
-    ```
-
-3. Install Xorg.
-
-    ```shell
-    sudo dnf install xorg-*
-    ```
-
-    Unnecessary packages may be installed during the installation.
You can run the following commands to install necessary Xorg packages: - - ```shell - sudo dnf install xorg-x11-apps xorg-x11-drivers xorg-x11-drv-ati \ - xorg-x11-drv-dummy xorg-x11-drv-evdev xorg-x11-drv-fbdev xorg-x11-drv-intel \ - xorg-x11-drv-libinput xorg-x11-drv-nouveau xorg-x11-drv-qxl \ - xorg-x11-drv-synaptics-legacy xorg-x11-drv-v4l xorg-x11-drv-vesa \ - xorg-x11-drv-vmware xorg-x11-drv-wacom xorg-x11-fonts xorg-x11-fonts-others \ - xorg-x11-font-utils xorg-x11-server xorg-x11-server-utils xorg-x11-server-Xephyr \ - xorg-x11-server-Xspice xorg-x11-util-macros xorg-x11-utils xorg-x11-xauth \ - xorg-x11-xbitmaps xorg-x11-xinit xorg-x11-xkb-utils - ``` - -4. Install Cinnamon and components. - - ```shell - sudo dnf install cinnamon cinnamon-control-center cinnamon-desktop \ - cinnamon-menus cinnamon-screensaver cinnamon-session \ - cinnamon-settings-daemon cinnamon-themes cjs \ - nemo nemo-extensions muffin cinnamon-translations inxi \ - perl-XML-Dumper xapps mint-x-icons mint-y-icons mintlocale \ - python3-plum-py caribou mozjs78 python3-pam \ - python3-tinycss2 python3-xapp tint2 gnome-terminal \ - lightdm lightdm-gtk - ``` - -5. Enable LightDM to automatically start upon system startup. - - ```shell - sudo systemctl enable lightdm - ``` - -6. Set the system to log in to the GUI by default. - - ```shell - sudo systemctl set-default graphical.target - ``` - -7. Reboot. - - ```shell - sudo reboot - ``` diff --git a/docs/en/docs/desktop/desktop.md b/docs/en/docs/desktop/desktop.md deleted file mode 100644 index c46639179dd096a706477753175c219a7ac74cd5..0000000000000000000000000000000000000000 --- a/docs/en/docs/desktop/desktop.md +++ /dev/null @@ -1,3 +0,0 @@ -# Desktop Environment User Guide - -This document describes how to install and use four common desktop environments (UKUI, DDE, Xfce, and GNOME), which provide a user-friendly, secure, and reliable GUI for better user experience. 
diff --git a/docs/en/docs/desktop/installing-and-deploying-HA.md b/docs/en/docs/desktop/installing-and-deploying-HA.md deleted file mode 100644 index a297aeffdeca7475a19c1a660b0c261a919375d0..0000000000000000000000000000000000000000 --- a/docs/en/docs/desktop/installing-and-deploying-HA.md +++ /dev/null @@ -1,213 +0,0 @@ -# Installing and Deploying HA - -This chapter describes how to install and deploy an HA cluster. - - -- [Installing and Deploying HA](#installing-and-deploying-ha) - - [Installation and Deployment](#installation-and-deployment) - - [Modifying the Host Name and the /etc/hosts File](#modifying-the-host-name-and-the-etchosts-file) - - [Configuring the Yum Repository](#configuring-the-yum-repository) - - [Installing the HA Software Package Components](#installing-the-ha-software-package-components) - - [Setting the hacluster User Password](#setting-the-hacluster-user-password) - - [Modifying the /etc/corosync/corosync.conf File](#modifying-the-etccorosynccorosyncconf-file) - - [Managing the Services](#managing-the-services) - - [Disabling the firewall](#disabling-the-firewall) - - [Managing the pcs service](#managing-the-pcs-service) - - [Managing the Pacemaker service](#managing-the-pacemaker-service) - - [Managing the Corosync service](#managing-the-corosync-service) - - [Performing Node Authentication](#performing-node-authentication) - - [Accessing the Front-End Management Platform](#accessing-the-front-end-management-platform) - -## Installation and Deployment - -- Prepare the environment: At least two physical machines or VMs with openEuler 20.03 LTS SP2 installed are required. (This section uses two physical machines or VMs as an example.) For details about how to install openEuler 20.03 LTS SP2, see the [_openEuler Installation Guide_](../Installation/Installation.md). - -### Modifying the Host Name and the /etc/hosts File - -- **Note: You need to perform the following operations on both hosts. 
The following takes one host as an example.** - -Before using the HA software, ensure that all host names have been changed and written into the /etc/hosts file. - -- Run the following command to change the host name: - -```shell -hostnamectl set-hostname ha1 -``` - -- Edit the `/etc/hosts` file and write the following fields: - -```text -172.30.30.65 ha1 -172.30.30.66 ha2 -``` - -### Configuring the Yum Repository - -After the system is successfully installed, the Yum source is configured by default. The file location is stored in the `/etc/yum.repos.d/openEuler.repo` file. The HA software package uses the following sources: - -```text -[OS] -name=OS -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/RPM-GPG-KEY-openEuler - -[everything] -name=everything -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/everything/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/everything/$basearch/RPM-GPG-KEY-openEuler - -[EPOL] -name=EPOL -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/EPOL/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/RPM-GPG-KEY-openEuler -``` - -### Installing the HA Software Package Components - -```shell -yum install -y corosync pacemaker pcs fence-agents fence-virt corosync-qdevice sbd drbd drbd-utils -``` - -### Setting the hacluster User Password - -```shell -passwd hacluster -``` - -### Modifying the /etc/corosync/corosync.conf File - -```text -totem { - version: 2 - cluster_name: hacluster - crypto_cipher: none - crypto_hash: none -} -logging { - fileline: off - to_stderr: yes - to_logfile: yes - logfile: /var/log/cluster/corosync.log - to_syslog: yes - debug: on - logger_subsys { - subsys: QUORUM - debug: on - } -} -quorum { - provider: corosync_votequorum - expected_votes: 2 - two_node: 1 - } -nodelist { - node { - 
name: ha1 - nodeid: 1 - ring0_addr: 172.30.30.65 - } - node { - name: ha2 - nodeid: 2 - ring0_addr: 172.30.30.66 - } - } -``` - -### Managing the Services - -#### Disabling the firewall - -```shell -systemctl stop firewalld -``` - -Change the status of SELINUX in the `/etc/selinux/config` file to disabled. - -```text -# SELINUX=disabled -``` - -#### Managing the pcs service - -- Run the following command to start the pcs service: - -```shell -systemctl start pcsd -``` - -- Run the following command to query the pcs service status: - -```shell -systemctl status pcsd -``` - -The service is started successfully if the following information is displayed: - -![](./figures/HA-pcs.png) - -#### Managing the Pacemaker service - -- Run the following command to start the Pacemaker service: - -```shell -systemctl start pacemaker -``` - -- Run the following command to query the Pacemaker service status: - -```shell -systemctl status pacemaker -``` - -The service is started successfully if the following information is displayed: - -![](./figures/HA-pacemaker.png) - -#### Managing the Corosync service - -- Run the following command to start the Corosync service: - -```shell -systemctl start corosync -``` - -- Run the following command to query the Corosync service status: - -```shell -systemctl status corosync -``` - -The service is started successfully if the following information is displayed: - -![](./figures/HA-corosync.png) - -### Performing Node Authentication - -- **Note: Run this command on only one node.** - -```shell -pcs host auth ha1 ha2 -``` - -### Accessing the Front-End Management Platform - -After the preceding services are started, open the browser (Chrome or Firefox is recommended) and enter **https://localhost:2224** in the navigation bar. - -- This page is the native management platform. - -![](./figures/HA-login.png) - -For details about how to install the management platform newly developed by the community, see . 
-
-- The following is the management platform newly developed by the community.
-
-![](./figures/HA-api.png)
-
-- The next chapter describes how to quickly use an HA cluster and add an instance. For details, see the [HA Usage Example](./HA Usage Example.md).
diff --git a/docs/en/docs/desktop/kubesphere.md b/docs/en/docs/desktop/kubesphere.md
deleted file mode 100644
index 6cc3b4ae243b58636bcf5d3cd45075d51b35e323..0000000000000000000000000000000000000000
--- a/docs/en/docs/desktop/kubesphere.md
+++ /dev/null
@@ -1,60 +0,0 @@
-# KubeSphere Deployment Guide
-
-This document describes how to install and deploy Kubernetes and KubeSphere clusters on openEuler 21.09.
-
-## What Is KubeSphere
-
-[KubeSphere](https://kubesphere.io/) is an open source **distributed OS** built on [Kubernetes](https://kubernetes.io/) for cloud-native applications. It supports multi-cloud and multi-cluster management and provides full-stack automated IT O&M capabilities, simplifying DevOps-based workflows for enterprises. Its architecture enables plug-and-play integration between third-party applications and cloud-native ecosystem components. For more information, see the [KubeSphere official website](https://kubesphere.com.cn/).
-
-## Prerequisites
-
-Prepare a physical machine or VM with openEuler 21.09 installed. For details about the installation method, see the [*openEuler Installation Guide*](../Installation/Installation.md).
-
-## Software Installation
-
-1. Install KubeKey.
-
-    ```bash
-    yum install kubekey
-    ```
-
-    > ![](../Virtualization/public_sys-resources/icon-note.gif)**Note**
-    > Before the installation, manually deploy Docker on each node in the cluster in advance or use KubeKey to automatically deploy Docker. The Docker version automatically deployed by KubeKey is 20.10.8.
-
-2. Deploy the KubeSphere cluster.
-
-    ```bash
-    kk create cluster --with-kubesphere v3.1.1
-    ```
-
-    > ![](../Virtualization/public_sys-resources/icon-note.gif)**Note**
-    > After this command is executed, Kubernetes v1.19.8 is installed by default. To specify the Kubernetes version, add `--with-kubernetes <version_number>` to the end of the command line. The supported Kubernetes versions include `v1.17.9`, `v1.18.8`, `v1.19.8`, `v1.19.9`, and `v1.20.6`.
-
-3. Check whether the KubeSphere cluster is successfully installed.
-
-    ```bash
-    kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f
-    ```
-
-    If the following information is displayed, the KubeSphere cluster is successfully installed:
-
-    ![](./figures/kubesphere.png)
-
-    >![](../Virtualization/public_sys-resources/icon-note.gif)**Note**
-    >This document describes how to install KubeSphere in the x86 environment. In the ARM64 environment, you need to install Kubernetes before deploying KubeSphere.
-
-## Accessing the KubeSphere Web Console
-
-**Depending on your network environment, you may need to configure port forwarding rules and firewall policies. Ensure that port 30880 is allowed in the firewall rules.**
-
-After the KubeSphere cluster is successfully deployed, enter `<node IP address>:30880` in the address box of a browser to access the KubeSphere web console.
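For scripted access, the console address described above can be assembled from a node IP. A minimal sketch, assuming a placeholder IP address (not from this guide); 30880 is the NodePort stated in the text:

```shell
# Build the KubeSphere console URL from a node IP.
# NODE_IP is a placeholder; 30880 is the default console NodePort.
NODE_IP="192.168.0.10"
CONSOLE_URL="http://${NODE_IP}:30880"
echo "$CONSOLE_URL"   # prints "http://192.168.0.10:30880"
```

Replace `NODE_IP` with the address of any cluster node reachable from your browser.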
- -![kubesphere-console](./figures/1202_1.jpg) - -## See Also - -[What is KubeSphere](https://v3-1.docs.kubesphere.io/docs/introduction/what-is-kubesphere/) - -[Install a Multi-node Kubernetes and KubeSphere Cluster](https://v3-1.docs.kubesphere.io/docs/installing-on-linux/introduction/multioverview/) - -[Enable Pluggable Components](https://v3-1.docs.kubesphere.io/docs/quick-start/enable-pluggable-components/) diff --git a/docs/en/docs/memsafety/utshell/utshell_guide.md b/docs/en/docs/memsafety/utshell/utshell_guide.md index 77f6d2ed3b85610ea8acc2fd3fbb33e64c210fd0..b722699436ab3eda88c54ef1460d2cefbf8addd8 100644 --- a/docs/en/docs/memsafety/utshell/utshell_guide.md +++ b/docs/en/docs/memsafety/utshell/utshell_guide.md @@ -181,7 +181,7 @@ done **{1..500..2}** indicates that the start number is 1, the end number is 500 (included), and the step is 2. -#### util +#### until ```shell until [condition]; do diff --git a/docs/en/docs/oeAware/figures/dep-failed.png b/docs/en/docs/oeAware/figures/dep-failed.png new file mode 100644 index 0000000000000000000000000000000000000000..afb4750135657876b455978bf9d8f5eff36be91e Binary files /dev/null and b/docs/en/docs/oeAware/figures/dep-failed.png differ diff --git a/docs/en/docs/oeAware/figures/dep.png b/docs/en/docs/oeAware/figures/dep.png new file mode 100644 index 0000000000000000000000000000000000000000..91388d6a860f032c86c0559b232f2d5ef55a40f8 Binary files /dev/null and b/docs/en/docs/oeAware/figures/dep.png differ diff --git a/docs/en/docs/oeAware/figures/dependency.png b/docs/en/docs/oeAware/figures/dependency.png new file mode 100644 index 0000000000000000000000000000000000000000..0cd087fb0c9095e63aa76e0d2464a92225af2399 Binary files /dev/null and b/docs/en/docs/oeAware/figures/dependency.png differ diff --git a/docs/en/docs/oeAware/oeAware_user_guide.md b/docs/en/docs/oeAware/oeAware_user_guide.md new file mode 100644 index 0000000000000000000000000000000000000000..e86cb0022509bc47186e69917b56a367f7fa580e --- /dev/null 
+++ b/docs/en/docs/oeAware/oeAware_user_guide.md @@ -0,0 +1,654 @@ +# oeAware User Guide + +## Introduction + +oeAware is a framework for implementing low-load collection, sensing, and tuning on openEuler. It aims to intelligently enable optimization features after dynamically detecting system behaviors. Traditional optimization features run independently and are statically enabled or disabled. oeAware divides optimization into three layers: collection, sensing, and tuning. Each layer is associated through subscription and is developed as plugins. + +## Installation + +Configure the openEuler Yum repository and run the following `yum` command to install oeAware: + +```shell +yum install oeAware-manager +``` + +## Usage + +Start the oeaware service. Use the `oeawarectl` command to control the service. + +### Service Startup + +Run the `systemctl` command to start the service. + +```shell +systemctl start oeaware +``` + +Configuration file + +Configuration file path: **/etc/oeAware/config.yaml** + +```yaml +log_path: /var/log/oeAware # Log storage path +log_level: 1 # Log level. 1: DEBUG; 2: INFO; 3: WARN; 4: ERROR. +enable_list: # Plugins enabled by default. + - name: libtest.so # Configure the plugin and enable all instances of the plugin. + - name: libtest1.so # Configure the plugin and enable the specified plugin instances. + instances: + - instance1 + - instance2 + ... + ... +plugin_list: # Downloaded packages are supported. + - name: test # The name must be unique. If the name is repeated, the first occurrence is used. + description: hello world + url: https://gitee.com/openeuler/oeAware-manager/raw/master/README.md # url must not be empty. + ... +``` + +After modifying the configuration file, run the following commands to restart the service: + +```shell +systemctl daemon-reload +systemctl restart oeaware +``` + +### Plugin Description + +**Plugin definition**: Each plugin corresponds to an .so file.
Plugins are classified into collection plugins, sensing plugins, and tuning plugins. + +**Instance definition**: The scheduling unit in the service is the instance. A plugin contains multiple instances. For example, a collection plugin includes multiple collection items, and each collection item is an instance. + +**Dependencies Between Instances** + +Before running an instance, ensure that the dependencies between the instances are met. + +![img](./figures/dependency.png) + +- A collection instance does not depend on any other instance. + +- A sensing instance depends on a collection instance and other sensing instances. + +- A tuning instance depends on a collection instance, sensing instance, and other tuning instances. + +### Plugin Loading + +By default, the service loads the plugins in the plugin storage paths. + +Collection plugin path: /usr/lib64/oeAware-plugin/collector + +Sensing plugin path: /usr/lib64/oeAware-plugin/scenario + +Tuning plugin path: /usr/lib64/oeAware-plugin/tune + +You can also manually load the plugins. + +```shell +oeawarectl -l | --load <plugin> -t | --type <plugin_type> # The plugin type can be collector, scenario, or tune. +``` + +Example + +```shell +[root@localhost ~]# oeawarectl -l libthread_collect.so -t collector +Plugin loaded successfully. +``` + +If the operation fails, an error description is returned. + +### Plugin Unloading + +```shell +oeawarectl -r | --remove <plugin> +``` + +Example + +```shell +[root@localhost ~]# oeawarectl -r libthread_collect.so +Plugin remove successfully. +``` + +If the operation fails, an error description is returned. + +### Plugin Query + +#### Querying the Plugin Status + +```shell +oeawarectl -q # Query all loaded plugins. +oeawarectl --query <plugin> # Query a specified plugin. +``` + +Example + +```shell +[root@localhost ~]# oeawarectl -q +Show plugins and instances status.
+------------------------------------------------------------ +libthread_scenario.so + thread_scenario(available, close) +libpmu.so + collector_pmu_sampling(available, close) + collector_pmu_counting(available, close) + collector_pmu_uncore(available, close) + collector_spe(available, close) +libthread_collector.so + thread_collector(available, close) +------------------------------------------------------------ +format: +[plugin] + [instance]([dependency status], [running status]) +dependency status: available means satisfying dependency, otherwise unavailable. +running status: running means that instance is running, otherwise close. +``` + +If the operation fails, an error description is returned. + +#### Querying Plugin Dependencies + +```shell +oeawarectl -Q # Query the dependency graph of loaded instances. +oeawarectl --query-dep=<instance> # Query the dependency graph of a specified instance. +``` + +A **dep.png** file will be generated in the current directory to display the dependencies. + +Example + +Relationship diagram when dependencies are met + +![img](./figures/dep.png) + +Relationship diagram when dependencies are not met + +![img](./figures/dep-failed.png) + +If the operation fails, an error description is returned. + +### Enabling Plugins + +#### Enabling a Plugin Instance + +```shell +oeawarectl -e | --enable <instance> +``` + +If the operation fails, an error description is returned. + +#### Disabling a Plugin Instance + +```shell +oeawarectl -d | --disable <instance> +``` + +If the operation fails, an error description is returned. + +### Downloading and Installing Plugins + +Use the `--list` command to query the RPM packages that can be downloaded and the installed plugins. + +```shell +oeawarectl --list +``` + +The query result is as follows: + +```shell +Supported Packages: # Downloadable packages +[name1] # plugin_list configured in config +[name2] +... +Installed Plugins: # Installed plugins +[name1] +[name2] +...
+``` + +Use the `--install` command to download and install the RPM package. + +```shell +oeawarectl -i | --install <package_name> # Name of a package queried using --list (package in Supported Packages) +``` + +If the operation fails, an error description is returned. + +### Help + +Use the `--help` command for help information. + +```shell +usage: oeawarectl [options]... + options + -l|--load [plugin] load plugin and need plugin type. + -t|--type [plugin_type] assign plugin type. there are three types: + collector: collection plugin. + scenario: awareness plugin. + tune: tune plugin. + -r|--remove [plugin] remove plugin from system. + -e|--enable [instance] enable the plugin instance. + -d|--disable [instance] disable the plugin instance. + -q query all plugins information. + --query [plugin] query the plugin information. + -Q query all instances dependencies. + --query-dep [instance] query the instance dependency. + --list the list of supported plugins. + -i|--install [plugin] install plugin from the list. + --help show this help message. +``` + +## Plugin Development + +### Common Data Structures of Plugins + +```c +struct DataBuf { + int len; + void *data; +}; +``` + +**struct DataBuf** is the data buffer. + +- **data**: specific data. **data** is an array. The data type can be defined as required. +- **len**: size of **data**. + +```c +struct DataHeader { + char type[DATA_HEADER_TYPE_SIZE]; + int index; + uint64_t count; + struct DataBuf *buf; + int buf_len; +}; +``` + +**struct DataHeader** is the structure for transferring data between plugins. It contains a cyclic buffer. + +- **type**: type of the input data. For example, when data is transferred to a sensing plugin, this parameter is used to identify the collection item of the collection plugin. + +- **index**: location of the data that is being written. For example, after a data collection, **index** increases by one. + +- **count**: number of times that an instance is executed. The value is accumulated.
+ +- **buf**: data buffer. For example, some collection items are used by a sensing plugin only after being sampled for multiple times. Therefore, the collection items are saved in a buffer array. + +- **buf_len**: size of the data buffer. **buf_len** is a fixed value after the data buffer is initialized. + +### Collection Plugin + +A collection plugin must have the **int32_t get_instance(CollectorInterface \*ins)** interface to return all collection items contained in the plugin. **CollectorInterface** contains the following content: + +```c +struct CollectorInterface { + char* (*get_version)(); + char* (*get_name)(); + char* (*get_description)(); + char* (*get_type)(); + int (*get_cycle)(); + char* (*get_dep)(); + void (*enable)(); + void (*disable)(); + void* (*get_ring_buf)(); + void (*reflash_ring_buf)(); +}; +``` + +Obtaining the version number + +1. Interface definition + + ```c + char* (*get_version)(); + ``` + +2. Interface description + +3. Parameter description + +4. Return value description + + The specific version number is returned. This interface is reserved. + +Obtaining the instance name + +1. Interface definition + + ```c + char* (*get_name)(); + ``` + +2. Interface description + + Obtains the name of a collection instance. When you run the `-q` command on the client, the instance name is displayed. In addition, you can run the `--enable` command to enable the instance. + +3. Parameter description + +4. Return value description + + The name of the collection instance is returned. Ensure that the instance name is unique. + +Obtaining description information + +1. Interface definition + + ```c + char* (*get_description)(); + ``` + +2. Interface description + +3. Parameter description + +4. Return value description + + The detailed description is returned. This interface is reserved. + +Obtaining the type + +1. Interface definition + + ```c + char* (*get_type)(); + ``` + +2. Interface description + +3. Parameter description + +4. 
Return value description + + The specific type information is returned. This interface is reserved. + +Obtaining the sampling period + +1. Interface definition + + ```c + int (*get_cycle)(); + ``` + +2. Interface description + + Obtains the sampling period. Different collection items can use different collection periods. + +3. Parameter description + +4. Return value description + + The specific sampling period is returned. The unit is ms. + +Obtaining dependencies + +1. Interface definition + + ```c + char* (*get_dep)(); + ``` + +2. Interface description + +3. Parameter description + +4. Return value description + + Information about the dependent instances is returned. This interface is reserved. + +Enabling a collection instance + +1. Interface definition + + ```c + void (*enable)(); + ``` + +2. Interface description + + Enables a collection instance. + +3. Parameter description + +4. Return value description + +Disabling a collection instance + +1. Interface definition + + ```c + void (*disable)(); + ``` + +2. Interface description + + Disables a collection instance. + +3. Parameter description + +4. Return value description + +Obtaining the collection data buffer + +1. Interface definition + + ```c + void* (*get_ring_buf)(); + ``` + +2. Interface description + + Obtains the buffer management pointer of the collection data (the memory is applied for by the plugin). The pointer is used by sensing plugins. + +3. Parameter description + +4. Return value description + + The **struct DataHeader** management pointer is returned, which stores the data of multiple samples. + +Refreshing collection data + +1. Interface definition + + ```c + void (*reflash_ring_buf)(); + ``` + +2. Interface description + + Periodically obtains sampled data based on the sampling period and saves the data to **struct DataBuf**. + +3. Parameter description + +4. 
Return value description + +### Sensing Plugin + +A sensing plugin must have the **int32_t get_instance(ScenarioInterface \*ins)** interface to return all sensing items contained in the plugin. **ScenarioInterface** contains the following content: + +```c +struct ScenarioInterface { + char* (*get_version)(); + char* (*get_name)(); + char* (*get_description)(); + char* (*get_dep)(); + int (*get_cycle)(); + void (*enable)(); + void (*disable)(); + void (*aware)(void*[], int); + void* (*get_ring_buf)(); +}; +``` + +Obtaining the version number + +1. Interface definition + + ```c + char* (*get_version)(); + ``` + +2. Interface description + +3. Parameter description + +4. Return value description + + The specific version number is returned. This interface is reserved. + +Obtaining the instance name + +1. Interface definition + + ```c + char* (*get_name)(); + ``` + +2. Interface description + + Obtains the name of a sensing instance. When you run the `-q` command on the client, the instance name is displayed. In addition, you can run the `--enable` command to enable the instance. + +3. Parameter description + +4. Return value description + + The name of the sensing instance is returned. Ensure that the instance name is unique. + +Obtaining description information + +1. Interface definition + + ```c + char* (*get_description)(); + ``` + +2. Interface description + +3. Parameter description + +4. Return value description + + The detailed description is returned. This interface is reserved. + +Obtaining dependencies + +1. Interface definition + + ```c + char* (*get_dep)(); + ``` + +2. Interface description + +3. Parameter description + +4. Return value description + + Names of the dependent instances are returned. This interface is reserved. Multiple dependent instances are connected by hyphens (-). For example, if the sensing instance depends on collection instance **A** and collection instance **B**, **A-B** is returned.
If instance **A** contains a hyphen in its name, an error is reported. + +Obtaining the sensing period + +1. Interface definition + + ```c + int (*get_cycle)(); + ``` + +2. Interface description + + Obtains the sensing period. + +3. Parameter description + +4. Return value description + + The specific sensing period is returned. The unit is ms. + +Enabling a sensing instance + +1. Interface definition + + ```c + void (*enable)(); + ``` + +2. Interface description + + Enables a sensing instance. + +3. Parameter description + +4. Return value description + +Disabling a sensing instance + +1. Interface definition + + ```c + void (*disable)(); + ``` + +2. Interface description + + Disables a sensing instance. + +3. Parameter description + +4. Return value description + +Performing Sensing + +1. Interface definition + + ```c + void (*aware)(void*[], int); + ``` + +2. Interface description + + Processes and analyzes collected data. For example, the collected uncore PMU data is processed and analyzed for NUMA issues. + +3. Parameter description + + - **void*[]**: array of **struct DataHeader**. + - **int**: array length of **struct DataHeader**. + +4. Return value description + +Obtaining the sensing data buffer + +1. Interface definition + + ```c + void* (*get_ring_buf)(); + ``` + +2. Interface description + + Obtains the buffer management pointer of the sensing data (the memory is applied for by the plugin). The pointer is used by tuning plugins. + +3. Parameter description + +4. Return value description + + The **struct DataHeader** management pointer is returned. + +### Tuning Plugin + +## Constraints + +### Function Constraints + +By default, oeAware integrates the libkperf module for collecting Arm microarchitecture information. This module can be called by only one process at a time. If this module is called by other processes or the perf command is used, conflicts may occur. + +### Operation Constraints + +Currently, only the **root** user can operate oeAware. 
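The collection plugin interface described in the plugin development section above can be sketched as a small skeleton. The following is a minimal, hypothetical example, not an actual oeAware plugin: the instance name `demo_collector`, the 100 ms cycle, the `DATA_HEADER_TYPE_SIZE` value, and the `get_instance` return convention are all assumptions for illustration; a real plugin would also allocate and refresh a `struct DataHeader` ring buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed value; the real oeAware header defines DATA_HEADER_TYPE_SIZE. */
#define DATA_HEADER_TYPE_SIZE 32

struct DataBuf {
    int len;
    void *data;
};

struct CollectorInterface {
    char* (*get_version)();
    char* (*get_name)();
    char* (*get_description)();
    char* (*get_type)();
    int (*get_cycle)();
    char* (*get_dep)();
    void (*enable)();
    void (*disable)();
    void* (*get_ring_buf)();
    void (*reflash_ring_buf)();
};

/* Reserved interfaces may simply return an empty string. */
static char *demo_reserved(void) { return ""; }

/* Unique instance name, as shown by `oeawarectl -q` (hypothetical name). */
static char *demo_name(void) { return "demo_collector"; }

/* Sampling period in ms (assumed value). */
static int demo_cycle(void) { return 100; }

static void demo_enable(void) { /* allocate the ring buffer here */ }
static void demo_disable(void) { /* release the ring buffer here */ }

/* A real plugin returns its struct DataHeader management pointer here. */
static void *demo_ring_buf(void) { return (void *)0; }

/* Called every cycle to append the latest sample to the ring buffer. */
static void demo_reflash(void) {}

/* Entry point looked up by the service; the return convention is assumed. */
int32_t get_instance(struct CollectorInterface *ins)
{
    ins->get_version = demo_reserved;
    ins->get_name = demo_name;
    ins->get_description = demo_reserved;
    ins->get_type = demo_reserved;
    ins->get_cycle = demo_cycle;
    ins->get_dep = demo_reserved;
    ins->enable = demo_enable;
    ins->disable = demo_disable;
    ins->get_ring_buf = demo_ring_buf;
    ins->reflash_ring_buf = demo_reflash;
    return 0;
}
```

Built as a shared object (for example, `gcc -shared -fPIC -o libdemo_collector.so demo_collector.c`) and placed under /usr/lib64/oeAware-plugin/collector, such a plugin could then be enabled with `oeawarectl -e demo_collector`.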
+ +## Notes + +The user group and permission of the oeAware configuration file and plugins are strictly verified. Do not modify the permissions and user group of oeAware-related files. + +Permissions: + +- Plugin files: 440 + +- Client executable file: 750 + +- Server executable file: 750 + +- Service configuration file: 640 diff --git a/docs/en/docs/oncn-bwm/overview.md b/docs/en/docs/oncn-bwm/overview.md index 5068a6ed0285ae1cc217b022337a02a4eeb7a691..32733b231485a32cb19a9aa67a7b5a9acbdf35f3 100644 --- a/docs/en/docs/oncn-bwm/overview.md +++ b/docs/en/docs/oncn-bwm/overview.md @@ -13,31 +13,19 @@ The oncn-bwm tool supports the following functions: - Setting the offline service bandwidth range and online service waterline - Querying internal statistics - - ## Installation -To install the oncn-bwm tool, the operating system must be openEuler 22.09. Run the **yum** command on the host where the openEuler Yum source is configured to install the oncn-bwm tool. - -```shell -# yum install oncn-bwm -``` - -This section describes how to install the oncn-bwm tool. - ### Environmental Requirements -* Operating system: openEuler 22.09 - -### Installation Procedure +- Operating system: openEuler-24.03-LTS with the Yum repository of openEuler-24.03-LTS -To install the oncn-bwm tool, do as follows: +### Installation Procedure -1. Configure the Yum source of openEuler and run the `yum` command to install oncn-bwm. +Run the following command: - ``` - yum install oncn-bwm - ``` +```shell +yum install oncn-bwm +``` ## How to Use @@ -55,16 +43,15 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt > > Upgrading the oncn-bwm package does not affect the enabling status before the upgrade. Uninstalling the oncn-bwm package disables pod bandwidth management for all NICs. 
- ### Command Interfaces #### Pod Bandwidth Management -**Commands and Functions** +##### Commands and Functions | Command Format | Function | | --------------------------- | ------------------------------------------------------------ | -| **bwmcli –e** | Enables pod bandwidth management for a specified NIC.| +| **bwmcli -e** | Enables pod bandwidth management for a specified NIC.| | **bwmcli -d** | Disables pod bandwidth management for a specified NIC.| | **bwmcli -p devs** | Queries pod bandwidth management of all NICs on a node.| @@ -74,14 +61,12 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt > > - Enable pod bandwidth management before running other `bwmcli` commands. - - -**Examples** +##### Examples - Enable pod bandwidth management for NICs eth0 and eth1. ```shell - # bwmcli –e eth0 –e eth1 + # bwmcli -e eth0 -e eth1 enable eth0 success enable eth1 success ``` @@ -89,7 +74,7 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt - Disable pod bandwidth management for NICs eth0 and eth1. ```shell - # bwmcli –d eth0 –d eth1 + # bwmcli -d eth0 -d eth1 disable eth0 success disable eth1 success ``` @@ -107,18 +92,18 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt #### Pod Network Priority -**Commands and Functions** +##### Commands and Functions | Command Format | Function | | ------------------------------------------------------------ | ------------------------------------------------------------ | -| **bwmcli –s** *path* | Sets the network priority of a pod. *path* indicates the cgroup path corresponding to the pod, and *prio* indicates the priority. The value of *path* can be a relative path or an absolute path. The default value of *prio* is **0**. The optional values are **0** and **-1**. The value **0** indicates online services, and the value **-1** indicates offline services.| -| **bwmcli –p** *path* | Queries the network priority of a pod. 
| +| **bwmcli -s** *path* *prio* | Sets the network priority of a pod. *path* indicates the cgroup path corresponding to the pod, and *prio* indicates the priority. The value of *path* can be a relative path or an absolute path. The default value of *prio* is **0**. The optional values are **0** and **-1**. The value **0** indicates online services, and the value **-1** indicates offline services.| +| **bwmcli -p** *path* | Queries the network priority of a pod. | > Note: > > Online and offline network priorities are supported. The oncn-bwm tool controls the bandwidth of pods in real time based on the network priority. The specific policy is as follows: For online pods, the bandwidth is not limited. For offline pods, the bandwidth is limited within the offline bandwidth range. -**Examples** +##### Examples - Set the priority of the pod whose cgroup path is **/sys/fs/cgroup/net_cls/test_online** to **0**. @@ -134,16 +119,14 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt 0 ``` - - #### Offline Service Bandwidth Range | Command Format | Function | | ---------------------------------- | ------------------------------------------------------------ | -| **bwmcli –s bandwidth** | Sets the offline bandwidth for a host or VM. **low** indicates the minimum bandwidth, and **high** indicates the maximum bandwidth. The unit is KB, MB, or GB, and the value range is [1 MB, 9999 GB].| -| **bwmcli –p bandwidth** | Queries the offline bandwidth of a host or VM.
| -> Note: +> Note: > > - All NICs with pod bandwidth management enabled on a host are considered as a whole, that is, the configured online service waterline and offline service bandwidth range are shared. > @@ -151,9 +134,7 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt > > - The offline service bandwidth range and online service waterline are used together to limit the offline service bandwidth. When the online service bandwidth is lower than the configured waterline, the offline services can use the configured maximum bandwidth. When the online service bandwidth is higher than the configured waterline, the offline services can use the configured minimum bandwidth. - - -**Examples** +##### Examples - Set the offline bandwidth to 30 Mbit/s to 100 Mbit/s. @@ -169,24 +150,21 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt bandwidth is 31457280(B),104857600(B) ``` - - - #### Online Service Waterline -**Commands and Functions** +##### Commands and Functions | Command Format | Function | | ---------------------------------------------- | ------------------------------------------------------------ | -| **bwmcli –s waterline** | Sets the online service waterline for a host or VM. *val* indicates the waterline value. The unit is KB, MB, or GB, and the value range is [20 MB, 9999 GB].| -| **bwmcli –p waterline** | Queries the online service waterline of a host or VM. | +| **bwmcli -s waterline** | Sets the online service waterline for a host or VM. *val* indicates the waterline value. The unit is KB, MB, or GB, and the value range is [20 MB, 9999 GB].| +| **bwmcli -p waterline** | Queries the online service waterline of a host or VM. | > Note: > > - When the total bandwidth of all online services on a host is higher than the waterline, the bandwidth that can be used by offline services is limited. 
When the total bandwidth of all online services on a host is lower than the waterline, the bandwidth that can be used by offline services is increased. > - The system determines whether the total bandwidth of online services exceeds or is lower than the configured waterline every 10 ms. Then the system determines the bandwidth limit for offline services based on whether the online bandwidth collected within each 10 ms is higher than the waterline. -**Examples** +##### Examples - Set the online service waterline to 20 MB. @@ -202,16 +180,13 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt waterline is 20971520(B) ``` - - #### Statistics -**Commands and Functions** +##### Commands and Functions | Command Format | Function | | ------------------- | ------------------ | -| **bwmcli –p stats** | Queries internal statistics.| - +| **bwmcli -p stats** | Queries internal statistics.| > Note: > @@ -225,8 +200,7 @@ The oncn-bwm tool provides the `bwmcli` command line tool to enable pod bandwidt > > - **offline_rate**: current offline service rate. - -**Examples** +##### Examples Query internal statistics. @@ -239,15 +213,11 @@ online_rate: 602 offline_rate: 0 ``` - - - - ### Typical Use Case To configure pod bandwidth management on a node, perform the following steps: -``` +```shell bwmcli -p devs #Query the pod bandwidth management status of the NICs in the system. bwmcli -e eth0 # Enable pod bandwidth management for the eth0 NIC. bwmcli -s /sys/fs/cgroup/net_cls/online 0 # Set the network priority of the online service pod to 0 @@ -255,3 +225,16 @@ bwmcli -s /sys/fs/cgroup/net_cls/offline -1 # Set the network priority of the of bwmcli -s bandwidth 20mb,1gb # Set the bandwidth range for offline services. bwmcli -s waterline 30mb # Set the waterline for online services. ``` + +### Constraints + +1. Only the **root** user is allowed to run the bwmcli command. +2. 
Currently, this feature supports only two network QoS priorities: offline and online. +3. If tc qdisc rules have been configured for a NIC, the network QoS function will fail to be enabled for the NIC. +4. After a NIC is removed and then inserted, the original QoS rules will be lost. In this case, you need to manually reconfigure the network QoS function. +5. When you run one command to enable or disable multiple NICs at the same time, if any NIC fails to be operated, operations on subsequent NICs will be stopped. +6. When SELinux is enabled in the environment, if the SELinux policy is not configured for the bwmcli program, some commands (such as setting or querying the waterline, bandwidth, and priority) may fail. You can confirm the failure in SELinux logs. To solve this problem, disable SELinux or configure the SELinux policy for the bwmcli program. +7. Upgrading the software package does not change the enabling status before the upgrade. Uninstalling the software package disables the function for all devices. +8. The NIC name can contain only digits, letters, hyphens (-), and underscores (_). NICs whose names contain other characters cannot be identified. +9. In actual scenarios, bandwidth limiting may cause protocol stack memory overstock. In this case, backpressure depends on transport-layer protocols. For protocols that do not have backpressure mechanisms, such as UDP, packet loss, ENOBUFS errors, and rate limiting deviation may occur. +10. After the network QoS function of a NIC is enabled using bwmcli, the `tc` command cannot be used to modify the tc rules of the NIC. Otherwise, the network QoS function of the NIC may be affected and work abnormally.
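The bandwidth and waterline values above accept KB, MB, or GB units, while `bwmcli -p` reports bytes, as in the query outputs shown earlier. The following is a small illustrative helper, not part of oncn-bwm, that converts such a value to bytes:

```shell
# Illustrative helper only (not part of oncn-bwm): convert a value such as
# "20mb" or "1gb" to bytes, matching the byte counts printed by
# `bwmcli -p bandwidth` and `bwmcli -p waterline`.
to_bytes() {
    num=$(printf '%s' "$1" | sed 's/[^0-9]//g')
    unit=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[0-9]//g')
    case "$unit" in
        kb) echo $((num * 1024)) ;;
        mb) echo $((num * 1024 * 1024)) ;;
        gb) echo $((num * 1024 * 1024 * 1024)) ;;
        *)  echo "unsupported unit: $unit" >&2; return 1 ;;
    esac
}

to_bytes 30mb   # 31457280, as in the bandwidth query output above
to_bytes 20mb   # 20971520, as in the waterline query output above
```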
diff --git a/docs/en/docs/secGear/application-scenarios.md b/docs/en/docs/secGear/application-scenarios.md deleted file mode 100644 index 13a9b7588f9b77c4a2c36b9a366ad54016bbf2dc..0000000000000000000000000000000000000000 --- a/docs/en/docs/secGear/application-scenarios.md +++ /dev/null @@ -1,96 +0,0 @@ -# Application Scenarios - -This chapter describes confidential computing solutions in typical scenarios with examples, helping you understand the application scenarios of secGear and build confidential computing solutions based on your services. - -## TEE-based BJCA Cryptographic Module - -Driven by policies and services, the cryptographic application assurance infrastructure has been evolving towards virtualization. As services are migrated to the cloud, a brand-new cryptographic delivery mode needs to be built to integrate cryptographic services, cloud services, and service applications. Under such circumstance, Beijing Certificate Authority (BJCA) launches a TEE-based cryptographic module. BJCA can not only use the Kunpeng-based TEEs to build compliant cryptographic computing modules to support cryptographic cloud service platforms, but also build a confidential computing platform based on Kunpeng hosts to provide high-speed ubiquitous, elastically deployed, and flexibly scheduled cryptographic services for various scenarios such as cloud computing, privacy computing, and edge computing. The endogenous cryptographic module based on Kunpeng processors has become a revolutionary innovative solution in the cryptographic industry, and becomes a new starting point for endogenous trusted cryptographic computing. - -### Status Quo - -In conventional cryptographic modules, algorithm protocols and processed data are privacy data. Migrating cryptographic modules to the cloud has security risks. - -### Solution - -![](./figures/BJCA_Crypto_Module.png) - -The figure shows a TEE-based cryptographic module solution. 
secGear can divide the cryptographic module into two parts: management service and algorithm protocol. - -- Management service: runs on the REE to provide cryptographic services for the external world and forward requests to the TEE for processing. -- Algorithm protocol: runs on the TEE to encrypt and decrypt user data. - -Cryptographic services may have highly concurrent requests with large data volumes. The switchless feature of secGear reduces the context switches and data copies typically required for processing a large number of requests between the REE and TEE. - -## TEE-based Fully-Encrypted GaussDB - -Cloud databases have become an important growth point for database services in the future. Most traditional database service vendors are accelerating the provision of better cloud database services. However, cloud databases face more complex and diversified risks than traditional databases. Application vulnerabilities, system configuration errors, and malicious administrators may pose great risks to data security and privacy. - -### Status Quo - -The deployment network of cloud databases changes from a private environment to an open environment. The system O&M role is divided into service administrators and O&M administrators. Service administrators have service management permissions and belong to the enterprise service provider. O&M administrators belong to the cloud service provider. Although being defined to be responsible only for system O&M management, the database O&M administrator still has full permissions to use data. The database O&M administrator can access or even tamper with data with O&M management permissions or privilege escalation. In addition, due to the open environment and blurring of network boundaries, user data is more fully exposed to attackers in the entire service process, no matter in transfer, storage, O&M, or running. 
Therefore, in cloud database scenarios, how to solve the third-party trust problem and how to protect data security more reliably are facing greater challenges than traditional databases. Data security and privacy leakage are top concerns of cloud databases. - -### Solution - -To address the preceding challenges, the TEE-based fully-encrypted GaussDB (openGauss) is designed as follows: Users hold data encryption and decryption keys, data is stored in ciphertext in the entire life cycle of the database service, and query operations are completed in the TEE of the database service. - -![](./figures/secret_gaussdb.png) - -The figure shows the TEE-based fully-encrypted database solution. The fully-encrypted database has the following features: - -1. Data files are stored in ciphertext and plaintext key information is not stored. -2. The database data key is stored on the client. -3. When the client initiates a query request, the REE executes the encrypted SQL syntax on the server to obtain related ciphertext records and sends them to the TEE. -4. The client encrypts and transfers the database data key to the server TEE through the secure channel of secGear. The database data key is decrypted in the TEE and used to decrypt the ciphertext records into plaintext records. The SQL statement is executed to obtain the query result. Then the query result is encrypted using the database data key and sent back to the client. - -In step 3, when a large number of concurrent database requests are sent, frequent calls between the REE and TEE will be triggered and a large amount of data needs to be transferred. As a result, the performance deteriorates sharply. The switchless feature of secGear helps reduce context switches in calls and data copies, improving the performance. - -## TEE-based openLooKeng Federated SQL - -openLooKeng federated SQL is a type of cross-DC query. The typical scenario is as follows. There are three DCs: central DC A, edge DC B, and edge DC C. 
The openLooKeng cluster is deployed in the three DCs. When receiving a cross-domain query request, DC A delivers an execution plan to each DC. After the openLookeng clusters in edge DCs B and C complete computing, the result is transferred to the openLookeng cluster in DC A over the network to complete aggregation computing. - -### Status Quo - -In the preceding solution, the computing result is transferred between openLookeng clusters in different DCs, avoiding insufficient network bandwidth and solving the cross-domain query problem to some extent. However, the computing result is obtained from the original data and may contain sensitive information. As a result, security and compliance risks exist when data is transferred out of the domain. How do we protect the computing results of the edge DCs during aggregation computing and ensure that the computing results are available but invisible in the central DC? - -### Solution - -In DC A, the openLookeng cluster splits the aggregation computing logic and operators into independent modules and deploys them in the Kunpeng-based TEE. The computing results of the edge DCs are transferred to the TEE of DC A through the secure channel. All data is finally aggregated and computed in the TEE. In this way, the computing results of the edge DCs are protected from being obtained or tampered with by privileged or malicious programs in the REE of DC A during aggregation computing. - -![](./figures/openLooKeng.png) - -The figure shows the TEE-based federated SQL solution. The query process is as follows: - -1. A user delivers a cross-domain query request in DC A. The coordinator of openLooKeng splits and delivers the execution plan to its worker nodes and the coordinators of edge DCs based on the query SQL statement and data distribution. Then the coordinators of edge DCs deliver the execution plan to their worker nodes. -2. Each worker node executes the plan to obtain the local computing result. -3. 
Edge DCs encrypt their computing results through the secure channel of secGear, transfer the results to the REE of DC A over the Internet, forward the results to the TEE, and decrypt the results in the TEE. -4. DC A performs aggregation computing on the computing results of DCs A, B, and C in the TEE, obtains a final execution result, and returns the result to the user. - -In step 4, when there are a large number of query requests, the REE and TEE will be frequently invoked and a large amount of data is copied. As a result, the performance deteriorates. The switchless feature of secGear is optimized to reduce context switches and data copies to improve the performance. - -## TEE-based MindSpore Feature Protection - -Vertical federated learning (VFL) is an important branch of federated learning. When multiple parties have features about the same set of users, VFL can be used for collaborative training. - -![](./figures/Mindspore_original.png) - -### Status Quo - -The figure shows the data processing flow of the traditional solution. - -1. A party that has features is also called a follower, while a party that has labels is also called a leader. Each follower inputs its features to its bottom model to obtain the intermediate result, and then sends the intermediate result to the leader. -2. The leader uses its labels and the intermediate results of followers to train the top model, and then sends the computed gradient back to the followers to train their bottom models. - -This solution prevents followers from directly uploading their raw data out of the domain, thereby protecting data privacy. However, attackers may derive user information from the uploaded intermediate results, causing privacy leakage risks. Therefore, a stronger privacy protection solution is required for intermediate results and gradients to meet security compliance requirements. 
- -### Solution - -Based on the security risks and solutions in the previous three scenarios, confidential computing is a good choice to make intermediate results "available but invisible" out of the domain. - -![](./figures/Mindspore.png) - -The figure shows the TEE-based VFL feature protection solution. The data processing process is as follows: - -1. Followers encrypt their intermediate results through the secure channel of secGear and transfer the results to the leader. After receiving the results, the leader transfers them to the TEE and decrypts them through the secure channel in the TEE. -2. In the TEE, the intermediate results are input to the computing module at the federated split layer to compute the result. - -In this process, the plaintext intermediate results of followers exist only in the TEE memory, which is inaccessible to the leader, like a black box. diff --git a/docs/en/docs/secGear/using-the-secGear-tool.md b/docs/en/docs/secGear/using-the-secGear-tool.md deleted file mode 100644 index 6ebef34b1b4a1fc52b401df9696034ba00192593..0000000000000000000000000000000000000000 --- a/docs/en/docs/secGear/using-the-secGear-tool.md +++ /dev/null @@ -1,149 +0,0 @@ -# secGear Tools - -secGear provides a tool set to facilitate application development. This document describes the tools and how to use them. - -## Codegener: Code Generation Tool - -### Overview - -secGear codegener is a tool developed based on Intel SGX SDK edger8r. It is used to parse the EDL file to generate intermediate C code, that is, to assist in generating code that is called between the TEE and REE. - -The EDL file format defined by secGear codegener is the same as that defined by Intel SGX SDK edger8r, but the complete syntax definition of Intel is not supported: - -- The public can be used only in methods. Functions without public are declared as private by default. -- Switchless calls from the REE to the TEE and from the TEE to the REE are not supported. 
-- The Outside Call (OCALL) does not support some calling modes (such as cdecl, stdcall, and fastcall). - -The EDL file syntax is similar to the C language syntax. The following describes parts different from the C language syntax: - -| Member | Description | -| ----------------------- | ------------------------------------------------------------ | -| include "my_type.h” | Uses the type defined in the external inclusion file. | -| trusted | Declares that secure functions are available on the trusted application (TA) side. | -| untrusted | Declares that insecure functions are available on the TA side. | -| return_type | Defines the return value type. | -| parameter_type | Defines the parameter type. | -| \[in, size = len] | For the ECALL, this parameter indicates that data needs to be transferred from the REE to the TEE. For the OCALL, this parameter is required for the pointer type, and size indicates the buffer that is actually used. | -| \[out, size = len] | For the ECALL, this parameter indicates that data needs to be transferred from the TEE to the REE. For the OCALL, this parameter needs to be used for the pointer type, and size indicates the buffer that is actually used.| - -### Usage Instructions - -#### **Command Format** - -The format of the codegen command is as follows: - -- x86_64 architecture: - -**codegen_x86_64** < --trustzone | --sgx > \[--trusted-dir \ | **--untrusted-dir** \| --trusted | --untrusted ] edlfile - -ARM architecture - -**codegen_arm64** < --trustzone | --sgx > \[--trusted-dir \ | **--untrusted-dir** \| --trusted | --untrusted ] edlfile - -#### **Parameter Description** - -The parameters are described as follows: - -| **Parameter** | Mandatory/Optional | Description | -| ---------------------- | -------- | ------------------------------------------------------------ | -| --trustzone \| --sgx | Mandatory | Generates the API function corresponding to the confidential computing architecture only in the current command directory. 
If no parameter is specified, the SGX API function is generated by default. | -| --search-path \ | Optional | Specifies the search path of the file that the EDL file to be converted depends on. | -| --use-prefix | Optional | Adds a prefix to the proxy function name. The prefix is the name of the EDL file. | -| --header-only | Optional | Specifies that the code generation tool generates only header files. | -| --trusted-dir \ | Optional | Specifies the directory where the generated TEE auxiliary code is stored. If this parameter is not specified, the current path is used by default. | -| --untrusted-dir \ | Optional | Specifies the directory where the auxiliary code for generating insecure functions is located. | -| --trusted | Optional | Generates TEE auxiliary code. | -| --untrusted | Optional | Generates REE auxiliary code. | -| edlfile | Mandatory | EDL file to be converted, for example, hello.edl. | - -#### Examples - -- Convert *helloworld.edl* to generate TEE auxiliary code in *enclave-directory* and generate REE auxiliary code in *host-directory*. An example command is as follows: - -```shell -codegen_x86_64 --sgx --trusted-dir enclave-directory --untrusted-dir host-directory helloworld.edl -``` - -- Convert *helloworld.edl* to generate TEE auxiliary code in the current directory. The following is a command example for not generating REE auxiliary code: - -```shell -codegen_x86_64 --sgx --trusted helloworld.edl -``` - -- Convert *helloworld.edl* to generate REE auxiliary code in the current directory. The following is a command example that does not generate TEE auxiliary code: - -```shell -codegen_x86_64 --sgx --untrusted helloworld.edl -``` - -- Convert *helloworld.edl*. 
An example of the command for generating TEE and REE auxiliary code in the current directory is as follows: - -```shell -codegen_x86_64 --sgx helloworld.edl -``` - -## Signature Tool: sign_tool - -### Overview - -secGear sign_tool is a command line tool, including the compilation tool chain and signature tool, which are used for enclave signing. The sign_tool has two signature modes: - -- Single-step signature: applies only to the debugging mode. -- Two-step signature: applies to the commercial scenario. Obtain the signature private key from a third-party platform or an independent security device to sign the enclave. - -### Operation Instructions - -#### **Format** - -The sign_tool contains the sign command (for signing the enclave) and the digest command (for generating the digest value). Command format: - -**sign_tool.sh -d** \[sign | digest] **-x** \ **-i** \ **-p** \ **-s** \ \[OPTIONS] **–o** \ - -#### **Parameter Description** - -| sign Command Parameter | Description | Mandatory/Optional | -| -------------- | -------------------------------------------------------------| -------------------------------------------- | -| -a \ | api_level, which identifies the GP API version of the iTrustee TA. The default value is 1. | Optional | -| -c \ | Configuration file | Optional | -| -d \ | Specifies the operation (sign or digest) to be performed by the signature tool. | Only the sign operation is performed in single-step mode. In two-step mode, the digest operation must be performed before the sign operation. | -| -e \ | Public key certificate of the device, which is used to protect the AES key for encrypting rawdata (mandatory for iTrustee). | This parameter is mandatory only for the iTrustee type. | -| -f \ | OTRP_FLAG, which determines whether to support the OTRP standard protocol. The default value is 0. | Optional | -| -i \ | Library file to be signed. | Mandatory | -| -k \ | Private key (PEM file) required for one-step signature. 
| This parameter is mandatory only for the SGX type. | -| -m \ | Security configuration file mainfest.txt, which is configured by users. | Only the iTrustee type is mandatory. | -| -o \ | Output file. | Mandatory | -| -p \ | Public key certificate (PEM file) of the signature server required for two-step signing. | Mandatory | -| -s \ | Signed digest value required for two-step signing. | Mandatory | -| -t \ | TA_TYPA, which identifies TA binary format of the iTrustee. The default value is 1. | Optional | -| -x \ | enclave type (sgx or trustzone) | Mandatory | -| -h | Prints the help information. | Optional | - -#### **Single-Step Signature** - -Set the enclave type is SGX, sign the test.enclave, and generate the signature file signed.enclave. The following is an example: - -```shell -sign_tool.sh –d sign –x sgx –i test.enclave -k private_test.pem –o signed.enclave -``` - -#### **Two-Step Signature** - -The following uses SGX as an example to describe the two-step signature procedure: - -1. Generate digest value. - - Use the sign_tool to generate the digest value digest.data and the temporary intermediate file signdata. The file is used when the signature file is generated and is automatically deleted after being signed. Example: - - ```shell - sign_tool.sh –d digest –x sgx –i input –o digest.data - ``` - -2. Send digest.data to the signature authority or platform and obtain the corresponding signature. - -3. Use the obtained signature to generate the signed dynamic library signed.enclave. - - ```shell - sign_tool.sh –d sign –x sgx–i input –p pub.pem –s signature –o signed.enclave - ``` - -Note: To release an official version of applications supported by Intel SGX, you need to apply for an Intel whitelist. For details about the process, see the Intel document at . 
diff --git a/docs/en/docs/sysBoost/Appendixes.md b/docs/en/docs/sysBoost/Appendixes.md deleted file mode 100644 index 9ffa3b4defb16e8acfd24f15b3d5323f9ca6698a..0000000000000000000000000000000000000000 --- a/docs/en/docs/sysBoost/Appendixes.md +++ /dev/null @@ -1,26 +0,0 @@ -# Appendixes - - -- [Appendixes](#appendixes) - - [Acronyms and Abbreviations](#acronyms-and-abbreviations) - - -## Acronyms and Abbreviations - -**Table 1** Terminology - - - - - - - - - - - -

Term

-

Description

-

-

-

-

-
diff --git a/docs/en/docs/sysBoost/faqs.md b/docs/en/docs/sysBoost/faqs.md deleted file mode 100644 index 95241a8b3bdb785effe3e4f9330f7cc6537e330e..0000000000000000000000000000000000000000 --- a/docs/en/docs/sysBoost/faqs.md +++ /dev/null @@ -1 +0,0 @@ -# FAQs diff --git a/docs/en/docs/sysmonitor/figures/sysmonitor_functions.png b/docs/en/docs/sysmonitor/figures/sysmonitor_functions.png new file mode 100644 index 0000000000000000000000000000000000000000..e9655456ebce192d196e5f55c5fc09c03fa440d8 Binary files /dev/null and b/docs/en/docs/sysmonitor/figures/sysmonitor_functions.png differ diff --git a/docs/en/docs/sysmonitor/sysmonitor-usage.md b/docs/en/docs/sysmonitor/sysmonitor-usage.md new file mode 100644 index 0000000000000000000000000000000000000000..a26a8c1979be20d2bf27017d7e80341ead157cec --- /dev/null +++ b/docs/en/docs/sysmonitor/sysmonitor-usage.md @@ -0,0 +1,797 @@ +# sysmonitor + +## Introduction + +The system monitor (sysmonitor) daemon monitors exceptions that occur during OS running and records the exceptions in the system log file **/var/log/sysmonitor.log**. sysmonitor runs as a service. You can run the `systemctl start|stop|restart|reload sysmonitor` command to start, stop, restart, and reload the service. You are advised to deploy sysmonitor to locate system exceptions. + +![](./figures/sysmonitor_functions.png) + +### Precautions + +- sysmonitor cannot run concurrently. +- Ensure that all configuration files are valid. Otherwise, the monitoring service may be abnormal. +- The root privilege is required for sysmonitor service operations, configuration file modification, and log query. The **root** user has the highest permission in the system. When performing operations as the **root** user, follow the operation guide to avoid system management and security risks caused by improper operations. 
+ +### Configuration Overview + +The sysmonitor configuration file **/etc/sysconfig/sysmonitor** defines the monitoring period of each monitoring item and specifies whether to enable monitoring. Spaces are not allowed between the configuration item, equal sign (=), and configuration value, for example, **PROCESS_MONITOR="on"**. + +The configuration items are as follows: + +| Item | Description | Mandatory| Default Value | +| ------------------------- | ------------------------------------------------------------ | -------- | -------------------------------------- | +| PROCESS_MONITOR | Whether to enable key process monitoring. The value can be **on** or **off**. | No | on | +| PROCESS_MONITOR_PERIOD | Monitoring period of key processes, in seconds. | No | 3 | +| PROCESS_RECALL_PERIOD | Interval for attempting to restart a key process after the process fails to be recovered, in minutes. The value can be an integer ranging from 1 to 1440.| No | 1 | +| PROCESS_RESTART_TIMEOUT | Timeout interval for recovering a key process service from an exception, in seconds. The value can be an integer ranging from 30 to 300.| No | 90 | +| PROCESS_ALARM_SUPRESS_NUM | Number of times an alarm is suppressed when key process monitoring uses the alarm command to report alarms. The value is a positive integer.| No | 5 | +| FILESYSTEM_MONITOR | Whether to enable ext3 and ext4 file system monitoring. The value can be **on** or **off**. | No | on | +| DISK_MONITOR | Whether to enable drive partition monitoring. The value can be **on** or **off**. | No | on | +| DISK_MONITOR_PERIOD | Drive monitoring period, in seconds. | No | 60 | +| INODE_MONITOR | Whether to enable drive inode monitoring. The value can be **on** or **off**. | No | on | +| INODE_MONITOR_PERIOD | Drive inode monitoring period, in seconds. | No | 60 | +| NETCARD_MONITOR | Whether to enable NIC monitoring. The value can be **on** or **off**. | No | on | +| FILE_MONITOR | Whether to enable file monitoring.
The value can be **on** or **off**. | No | on | +| CPU_MONITOR | Whether to enable CPU monitoring. The value can be **on** or **off**. | No | on | +| MEM_MONITOR | Whether to enable memory monitoring. The value can be **on** or **off**. | No | on | +| PSCNT_MONITOR | Whether to enable process count monitoring. The value can be **on** or **off**. | No | on | +| FDCNT_MONITOR | Whether to enable file descriptor (FD) count monitoring. The value can be **on** or **off**. | No | on | +| CUSTOM_DAEMON_MONITOR | Whether to enable custom daemon item monitoring. The value can be **on** or **off**. | No | on | +| CUSTOM_PERIODIC_MONITOR | Whether to enable custom periodic item monitoring. The value can be **on** or **off**. | No | on | +| IO_DELAY_MONITOR | Whether to enable local drive I/O latency monitoring. The value can be **on** or **off**. | No | off | +| PROCESS_FD_NUM_MONITOR | Whether to enable process FD count monitoring. The value can be **on** or **off**. | No | on | +| PROCESS_MONITOR_DELAY | Whether to wait until all monitoring items are normal when sysmonitor is started. The value can be **on** (wait) or **off** (do not wait).| No | on | +| NET_RATE_LIMIT_BURST | NIC route information printing rate, that is, the number of logs printed per second. | No | 5
Valid range: 0 to 100 | +| FD_MONITOR_LOG_PATH | FD monitoring log file | No | /var/log/sysmonitor.log| +| ZOMBIE_MONITOR | Whether to monitor zombie processes | No | off | +| CHECK_THREAD_MONITOR | Whether to enable internal thread self-healing. The value can be **on** or **off**. | No | on
| +| CHECK_THREAD_FAILURE_NUM | Number of internal thread self-healing checks in a period. | No | 3
Valid range: 2 to 10 | + +- After modifying the **/etc/sysconfig/sysmonitor** configuration file, restart the sysmonitor service for the configurations to take effect. +- If an item is not configured in the configuration file, it is enabled by default. +- After the internal thread self-healing function is enabled, if a sub-thread of the monitoring item is suspended and the number of checks in a period exceeds the configured value, the sysmonitor service is restarted for restoration. The configuration is reloaded. The configured key process monitoring and customized monitoring are restarted. If this function affects user experience, you can disable it. + +### Command Reference + +- Start sysmonitor. + +```shell +systemctl start sysmonitor +``` + +- Stop sysmonitor. + +```shell +systemctl stop sysmonitor +``` + +- Restart sysmonitor. + +```shell +systemctl restart sysmonitor +``` + +- Reload sysmonitor for the modified configurations to take effect. + +```shell +systemctl reload sysmonitor +``` + +### Monitoring Logs + +By default, logs are split and dumped to prevent the **sysmonitor.log** file from getting too large. Logs are dumped to a drive directory so that a certain number of logs can be retained. + +The configuration file is **/etc/rsyslog.d/sysmonitor.conf**. Because this rsyslog configuration file is newly added, you need to restart the rsyslog service after sysmonitor is installed for the first time so that the sysmonitor log configuration takes effect.
+ +```text +$template sysmonitorformat,"%TIMESTAMP:::date-rfc3339%|%syslogseverity-text%|%msg%\n" + +$outchannel sysmonitor, /var/log/sysmonitor.log, 2097152, /usr/libexec/sysmonitor/sysmonitor_log_dump.sh +if ($programname == 'sysmonitor' and $syslogseverity <= 6) then { +:omfile:$sysmonitor;sysmonitorformat +stop +} + +if ($msg contains 'Time has been changed') then { +:omfile:$sysmonitor;sysmonitorformat +stop +} + +if ($programname == 'sysmonitor' and $syslogseverity > 6) then { +/dev/null +stop +} +``` + +## ext3/ext4 Filesystem Monitoring + +### Introduction + +A fault in the filesystem may trigger I/O operation errors, which further cause OS faults. File system fault detection can detect the faults in real time so that system administrators or users can rectify them in a timely manner. + +### Configuration File Description + +None + +### Exception Logs + +For a file system to which the errors=remount-ro mounting option is added, if the ext3 or ext4 file system is faulty, the following exception information is recorded in the **sysmonitor.log** file: + +```text +info|sysmonitor[127]: loop0 filesystem error. Remount filesystem read-only. +``` + +In other exception scenarios, if the ext3 or ext4 file system is faulty, the following exception information is recorded in the **sysmonitor.log** file: + +```text +info|sysmonitor[127]: fs_monitor_ext3_4: loop0 filesystem error. flag is 1879113728. +``` + +## Key Process Monitoring + +### Introduction + +Key processes in the system are periodically monitored. When a key process exits abnormally, sysmonitor automatically attempts to recover the key process. If the recovery fails, alarms can be reported. The system administrator can be promptly notified of the abnormal process exit event and whether the process is restarted. Fault locating personnel can locate the time when the process exits abnormally from logs. + +### Configuration File Description + +The configuration file directory is **/etc/sysmonitor/process**.
Each process or module corresponds to a configuration file. + +```text +USER=root +NAME=irqbalance +RECOVER_COMMAND=systemctl restart irqbalance +MONITOR_COMMAND=systemctl status irqbalance +STOP_COMMAND=systemctl stop irqbalance +``` + +The configuration items are as follows: + +| Item | Description | Mandatory| Default Value | +| ---------------------- | ------------------------------------------------------------ | -------- | --------------------------------------------------- | +| NAME | Process or module name | Yes | None | +| RECOVER_COMMAND | Recovery command | No | None | +| MONITOR_COMMAND | Monitoring command
If the command output is 0, the process is normal. If the command output is greater than 0, the process is abnormal.| No | pgrep -f $(which xxx)
*xxx* is the process name configured in the **NAME** field.| +| STOP_COMMAND | Stopping command | No | None | +| USER | User name
User for executing the monitoring, recovery, and stopping commands or scripts | No | If this item is left blank, the **root** user is used by default. | +| CHECK_AS_PARAM | Parameter passing switch
If this item is on, the return value of **MONITOR_COMMAND** is transferred to the **RECOVER_COMMAND** command or script as an input parameter. If this item is set to off or other values, the function is disabled.| No | None | +| MONITOR_MODE | Monitoring mode
- **parallel** or **serial**
| No | serial | +| MONITOR_PERIOD | Monitoring period
- Parallel monitoring period
- This item does not take effect when the monitoring mode is **serial**.| No | 3 | +| USE_CMD_ALARM | Alarm mode
If this parameter is set to **on** or **ON**, alarms are reported using the alarm reporting command. | No | None | +| ALARM_COMMAND | Alarm reporting command | No | None | +| ALARM_RECOVER_COMMAND | Alarm recovery command | No | None | + +- After modifying the configuration file for monitoring key processes, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- The recovery command and monitoring command must not block. Otherwise, the monitoring thread of the key process becomes abnormal. +- When the recovery command is executed for more than 90 seconds, the stopping command is executed to stop the process. +- If the recovery command is empty or not configured, the monitoring command does not attempt to recover the key process when detecting that the key process is abnormal. +- If a key process is abnormal and fails to be started for three consecutive times, the process is started based on the period specified by **PROCESS_RECALL_PERIOD** in the global configuration file. +- If the monitored process is not a daemon process, **MONITOR_COMMAND** is mandatory. +- If the configured key service does not exist in the current system, the monitoring does not take effect and the corresponding information is printed in the log. If a fatal error occurs in other configuration items, the default configuration is used and no error is reported. +- The permission on the configuration file is 600. You are advised to set the monitoring item to the **service** type of systemd (for example, **MONITOR_COMMAND=systemctl status irqbalance**). If a process is monitored, ensure that the **NAME** field is an absolute path. +- The restart, reload, and stop of sysmonitor do not affect the monitored processes or services. +- If **USE_CMD_ALARM** is set to **on**, you must ensure the validity of **ALARM_COMMAND** and **ALARM_RECOVER_COMMAND**. If **ALARM_COMMAND** or **ALARM_RECOVER_COMMAND** is empty or not configured, no alarm is reported.
+- The security of user-defined commands, such as the monitoring, recovery, stopping, alarm reporting, and alarm recovery commands, is ensured by users. Commands are executed by the user **root**. You are advised to set the script command permission to be used only by the user **root** to prevent privilege escalation for common users. +- The monitoring command cannot be longer than 200 characters. Otherwise, the process monitoring fails to be added. +- When the recovery command is set to a systemd service restart command (for example, **RECOVER_COMMAND=systemctl restart irqbalance**), check whether the recovery command conflicts with the open source systemd service recovery mechanism. Otherwise, the behavior of key processes may be affected after exceptions occur. +- The processes started by the sysmonitor service are in the same cgroup as the sysmonitor service, and resources cannot be restricted separately. Therefore, you are advised to use the open source systemd mechanism to recover the processes.
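Putting the fields above together, here is a minimal sketch of an entry that reports alarms through user-supplied commands. The service name `my_daemon` and the two alarm scripts are hypothetical placeholders, not shipped with sysmonitor:

```text
USER=root
NAME=my_daemon
RECOVER_COMMAND=systemctl restart my_daemon
MONITOR_COMMAND=systemctl status my_daemon
STOP_COMMAND=systemctl stop my_daemon
USE_CMD_ALARM=on
ALARM_COMMAND=/usr/local/bin/report_alarm.sh
ALARM_RECOVER_COMMAND=/usr/local/bin/clear_alarm.sh
```

Save the entry as a file with permission 600 under **/etc/sysmonitor/process** and run `systemctl reload sysmonitor`; the new item takes effect after one monitoring period.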
+ +### Exception Logs + +- **RECOVER_COMMAND** configured + + If a process or module exception is detected, the following exception information is recorded in the **/var/log/sysmonitor.log** file: + + ```text + info|sysmonitor[127]: irqbalance is abnormal, check cmd return 1, use "systemctl restart irqbalance" to recover + ``` + + If the process or module recovers, the following information is recorded in the **/var/log/sysmonitor.log** file: + + ```text + info|sysmonitor[127]: irqbalance is recovered + ``` + +- **RECOVER_COMMAND** not configured + + If a process or module exception is detected, the following exception information is recorded in the **/var/log/sysmonitor.log** file: + + ```text + info|sysmonitor[127]: irqbalance is abnormal, check cmd return 1, recover cmd is null, will not recover + ``` + + If the process or module recovers, the following information is recorded in the **/var/log/sysmonitor.log** file: + + ```text + info|sysmonitor[127]: irqbalance is recovered + ``` + +## File Monitoring + +### Introduction + +If key system files are deleted accidentally, the system may run abnormally or even break down. Through file monitoring, you can learn about the deletion of key files or the addition of malicious files in the system in a timely manner, so that administrators and users can learn and rectify faults in a timely manner. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/file**. Each monitoring configuration item occupies a line. A monitoring configuration item contains the file (directory) and event to be monitored. The file (directory) to be monitored is an absolute path. The file (directory) to be monitored and the event to be monitored are separated by one or more spaces. + +The file monitoring configuration items can be added to the **/etc/sysmonitor/file.d** directory. The configuration method is the same as that of the **/etc/sysmonitor/file** directory. 
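As a sketch of the entry format, each configuration line is an absolute path followed by an event value, which is a hexadecimal bitmap (bit *n*, 1-indexed, contributes 2^(n-1)), so the value can be derived in shell. The monitored paths here are hypothetical examples:

```shell
# Deletion events correspond to bit 10 (0x200); addition events to bit 9 (0x100).
del=$(( 1 << 9 ))                  # 0x200: deletion only
both=$(( (1 << 8) | (1 << 9) ))    # 0x300: addition + deletion
printf '/etc/my_app.conf 0x%x\n' "$del"    # hypothetical file, watch deletion
printf '/opt/my_app 0x%x\n' "$both"        # hypothetical directory, watch both
```

Each printed line can be placed in a drop-in file under **/etc/sysmonitor/file.d**, after which `systemctl reload sysmonitor` picks it up.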
+ +- Due to the log length limit, it is recommended that the absolute path of a file or directory be less than 223 characters. Otherwise, the printed logs may be incomplete. + +- Ensure that the path of the monitored file is correct. If the configured file does not exist or the path is incorrect, the file cannot be monitored. + +- Due to the path length limit of the system, the absolute path of the monitored file or directory must be less than 4096 characters. + +- Directories and regular files can be monitored. **/proc**, **/proc/\***, **/dev**, **/dev/\***, **/sys**, **/sys/\***, pipe files, or socket files cannot be monitored. + +- Only deletion events can be monitored in **/var/log** and **/var/log/\***. + +- If multiple identical paths exist in the configuration file, the first valid configuration takes effect. In the log file, you can see messages indicating that the identical paths are ignored. + +- Soft links cannot be monitored. When a hard link file deletion event is configured, the event is printed only after the file and all its hard links are deleted. + +- When a monitored event occurs after the file monitoring is successfully added, the monitoring log records the absolute path of the configured file. + +- Currently, directories cannot be monitored recursively. The configured directory is monitored but not its subdirectories. + +- The events to be monitored are configured using bitmaps as follows. + +```text + ------------------------------- + | 11~32 | 10 | 9 | 1~8 | + ------------------------------- +``` + +Each bit in the event bitmap represents an event. If bit _n_ is set to 1, the event corresponding to bit _n_ is monitored. The hexadecimal number corresponding to the monitoring bitmap is the event monitoring item written to the configuration file. 
+ +| Item| Description | Mandatory| +| ------ | ------------------ | -------- | +| 1~8 | Reserved | No | +| 9 | File or directory addition event| Yes | +| 10 | File or directory deletion event| Yes | +| 11~32 | Reserved | No | + +- After modifying the file monitoring configuration file, run `systemctl reload sysmonitor`. The new configuration takes effect within 60 seconds. +- Strictly follow the preceding rules to configure events to be monitored. If the configuration is incorrect, the events cannot be monitored. If an event to be monitored in the configuration item is empty, only the deletion event is monitored by default, that is, **0x200**. +- After a file or directory is deleted, the deletion event is reported only when all processes that open the file stop. +- If a monitored file is modified by `vi` or `sed`, "File XXX may have been changed" is recorded in the monitoring log. +- Currently, file addition and deletion events can be monitored, that is, the ninth and tenth bits take effect. Other bits are reserved and do not take effect. If a reserved bit is configured, the monitoring log displays a message indicating that the event monitoring is incorrectly configured. + +**Example** + +Monitor the subdirectory addition and deletion events in **/home**. The lower 12-bit bitmap is 001100000000. The configuration is as follows: + +```text +/home 0x300 +``` + +Monitor the file deletion events of **/etc/ssh/sshd_config**. The lower 12-bit bitmap is 001000000000. The configuration is as follows: + +```text +/etc/ssh/sshd_config 0x200 +``` + +### Exception Logs + +If a configured event occurs to the monitored file, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: 1 events queued +info|sysmonitor[127]: 1th events handled +info|sysmonitor[127]: Subfile "111" under "/home" was added.
+``` + +## Drive Partition Monitoring + +### Introduction + +The system periodically monitors the drive partitions mounted to the system. When the drive partition usage is greater than or equal to the configured alarm threshold, the system records a drive space alarm. When the drive partition usage falls below the configured alarm recovery threshold, a drive space recovery alarm is recorded. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/disk**. + +```text +DISK="/var/log" ALARM="90" RESUME="80" +DISK="/" ALARM="95" RESUME="85" +``` + +| Item| Description | Mandatory| Default Value| +| ------ | ---------------------- | -------- | ------ | +| DISK | Mount directory | Yes | None | +| ALARM | Integer indicating the drive space alarm threshold| No | 90 | +| RESUME | Integer indicating the drive space alarm recovery threshold| No | 80 | + +- After modifying the configuration file for drive space monitoring, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- If a mount directory is configured repeatedly, the last configuration item takes effect. +- The value of **ALARM** must be greater than that of **RESUME**. +- Only the mount point or the drive partition of the mount point can be monitored. +- When the CPU usage and I/O usage are high, the `df` command execution may time out. As a result, the drive usage cannot be obtained. +- If a drive partition is mounted to multiple mount points, an alarm is reported for each mount point. + +### Exception Logs + +If a drive space alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +warning|sysmonitor[127]: report disk alarm, /var/log used:90% alarm:90% +info|sysmonitor[127]: report disk recovered, /var/log used:4% resume:10% +``` + +## NIC Status Monitoring + +### Introduction + +During system running, the NIC status or IP address may change due to human factors or exceptions. 
You can monitor the NIC status and IP address changes to detect exceptions in a timely manner and locate exception causes. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/network**. + +```text +#dev event +eth1 UP +``` + +The following table describes the configuration items. + +| Item| Description | Mandatory| Default Value | +| ------ | ------------------------------------------------------------ | -------- | ------------------------------------------------- | +| dev | NIC name | Yes | None | +| event | Event to be monitored. The value can be **UP**, **DOWN**, **NEWADDR**, or **DELADDR**.
- UP: The NIC is up.
- DOWN: The NIC is down.
- NEWADDR: An IP address is added.
- DELADDR: An IP address is deleted.| No | If this item is empty, **UP**, **DOWN**, **NEWADDR**, and **DELADDR** are monitored.| + +- After modifying the configuration file for NIC monitoring, run `systemctl reload sysmonitor` for the new configuration to take effect. +- The **UP** and **DOWN** status of virtual NICs cannot be monitored. +- Ensure that each line in the NIC monitoring configuration file contains less than 4096 characters. Otherwise, a configuration error message will be recorded in the monitoring log. +- By default, all events of all NICs are monitored. That is, if no NIC monitoring is configured, the **UP**, **DOWN**, **NEWADDR**, and **DELADDR** events of all NICs are monitored. +- If a NIC is configured but no event is configured, all events of the NIC are monitored by default. +- The events of route addition can be recorded five times per second. You can change the number of times by setting **NET_RATE_LIMIT_BURST** in **/etc/sysconfig/sysmonitor**. + +### Exception Logs + +If a NIC event is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: lo: ip[::1] prefixlen[128] is added, comm: (ostnamed)[1046], parent comm: systemd[1] +info|sysmonitor[127]: lo: device is up, comm: (ostnamed)[1046], parent comm: systemd[1] +``` + +If a route event is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[881]: Fib4 replace table=255 192.168.122.255/32, comm: daemon-init[1724], parent comm: systemd[1] +info|sysmonitor[881]: Fib4 replace table=254 192.168.122.0/24, comm: daemon-init[1724], parent comm: systemd[1] +info|sysmonitor[881]: Fib4 replace table=255 192.168.122.0/32, comm: daemon-init[1724], parent comm: systemd[1] +info|sysmonitor[881]: Fib6 replace fe80::5054:ff:fef6:b73e/128, comm: kworker/1:3[209], parent comm: kthreadd[2] +``` + +## CPU Monitoring + +### Introduction + +The system monitors the global CPU
usage or the CPU usage in a specified domain. When the CPU usage exceeds the configured alarm threshold, the system runs the configured log collection command. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/cpu**. + +When the global CPU usage of the system is monitored, an example of the configuration file is as follows: + +```text +# cpu usage alarm percent +ALARM="90" + +# cpu usage alarm resume percent +RESUME="80" + +# monitor period (second) +MONITOR_PERIOD="60" + +# stat period (second) +STAT_PERIOD="300" + +# command executed when cpu usage exceeds alarm percent +REPORT_COMMAND="" +``` + +When the CPU usage of a specific domain is monitored, an example of the configuration file is as follows: + +```text +# monitor period (second) +MONITOR_PERIOD="60" + +# stat period (second) +STAT_PERIOD="300" + +DOMAIN="0,1" ALARM="90" RESUME="80" +DOMAIN="2,3" ALARM="50" RESUME="40" + +# command executed when cpu usage exceeds alarm percent +REPORT_COMMAND="" +``` + +| Item | Description | Mandatory| Default Value| +| -------------- | ------------------------------------------------------------ | -------- | ------ | +| ALARM | Number greater than 0, indicating the CPU usage alarm threshold | No | 90 | +| RESUME | Number greater than or equal to 0, indicating the CPU usage alarm recovery threshold | No | 80 | +| MONITOR_PERIOD | Monitoring period, in seconds. The value is greater than 0. | No | 60 | +| STAT_PERIOD | Statistical period, in seconds. The value is greater than 0. | No | 300 | +| DOMAIN | CPU IDs in the domain, represented by decimal numbers
- CPU IDs can be enumerated and separated by commas, for example, **1,2,3**. CPU IDs can be specified as a range in the format of _X_-_Y_, for example, **0-2**. The two representations can be used together, for example, **0,1,2-3** or **0-1,2-3**. Spaces or other characters are not allowed.
- Each monitoring domain has an independent configuration item. Each configuration item supports a maximum of 256 CPUs. A CPU ID must be unique in a domain and across domains.| No | None | +| REPORT_COMMAND | Command for collecting logs after the CPU usage exceeds the alarm threshold | No | None | + +- After modifying the configuration file for CPU monitoring, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- The value of **ALARM** must be greater than that of **RESUME**. +- After CPU domain monitoring is configured, the global average CPU usage of the system is not monitored, and the separately configured **ALARM** and **RESUME** values do not take effect. +- If the configuration of a monitoring domain is invalid, CPU monitoring is not performed at all. +- All CPUs configured in **DOMAIN** must be online. Otherwise, the domain cannot be monitored. +- The command of **REPORT_COMMAND** cannot contain insecure characters such as **&**, **;**, and **>**, and the total length cannot exceed 159 characters. Otherwise, the command cannot be executed. +- Ensure the security and validity of **REPORT_COMMAND**. sysmonitor is responsible only for running the command as the **root** user. +- **REPORT_COMMAND** must not block. When the execution time of the command exceeds 60s, sysmonitor forcibly stops it. +- Even if the CPU usage of multiple domains exceeds the threshold in a monitoring period, **REPORT_COMMAND** is executed only once.
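The **DOMAIN** formats described above can be illustrated with a short sketch. The `expand_domain` helper below is hypothetical, written only to show which CPU IDs a given value denotes; sysmonitor parses the value internally.

```shell
#!/bin/sh
# Hypothetical helper: expand a DOMAIN value such as "0,1,2-3" into the
# individual CPU IDs it denotes. Commas separate entries; X-Y denotes a
# range; the value must not contain spaces.
expand_domain() {
    ids=""
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case "$part" in
            *-*) ids="$ids $(seq -s ' ' "${part%-*}" "${part#*-}")" ;;  # range form
            *)   ids="$ids $part" ;;                                    # single CPU ID
        esac
    done
    printf '%s\n' "${ids# }"
}

expand_domain "0,1,2-3"   # prints: 0 1 2 3
```

For example, `DOMAIN="0,1,2-3"` covers CPUs 0 through 3, while a value containing spaces, such as **0, 1**, is rejected.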
+ +### Exception Logs + +If a global CPU usage alarm is detected or cleared and the log collection command is configured, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: CPU usage alarm: 91.3% +info|sysmonitor[127]: cpu monitor: execute REPORT_COMMAND[sysmoniotrcpu] sucessfully +info|sysmonitor[127]: CPU usage resume 70.1% +``` + +If a domain average CPU usage alarm is detected or cleared and the log collection command is configured, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: CPU 1,2,3 usage alarm: 91.3% +info|sysmonitor[127]: cpu monitor: execute REPORT_COMMAND[sysmoniotrcpu] sucessfully +info|sysmonitor[127]: CPU 1,2,3 usage resume 70.1% +``` + +## Memory Monitoring + +### Introduction + +Monitors the system memory usage and records logs when the memory usage exceeds or falls below the threshold. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/memory**. + +```text +# memory usage alarm percent +ALARM="90" + +# memory usage alarm resume percent +RESUME="80" + +# monitor period(second) +PERIOD="60" +``` + +### Configuration Item Description + +| Item| Description | Mandatory| Default Value| +| ------ | ----------------------------- | -------- | ------ | +| ALARM | Number greater than 0, indicating the memory usage alarm threshold | No | 90 | +| RESUME | Number greater than or equal to 0, indicating the memory usage alarm recovery threshold| No | 80 | +| PERIOD | Monitoring period, in seconds. The value is greater than 0. | No | 60 | + +- After modifying the configuration file for memory monitoring, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- The value of **ALARM** must be greater than that of **RESUME**. +- The average memory usage in three monitoring periods is used to determine whether an alarm is reported or cleared. 
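The usage value compared against **ALARM** and **RESUME** can be approximated from **/proc/meminfo**. The sketch below assumes usage is computed as (MemTotal - MemAvailable) / MemTotal; the exact formula used by sysmonitor may differ.

```shell
#!/bin/sh
# Approximate the memory usage percentage compared against ALARM/RESUME.
# Assumption: usage = (MemTotal - MemAvailable) / MemTotal; the exact
# formula used by sysmonitor may differ. An alternative meminfo-format
# file can be passed as the first argument.
mem_usage_percent() {
    awk '/^MemTotal:/ {total = $2}
         /^MemAvailable:/ {avail = $2}
         END {printf "%d\n", (total - avail) * 100 / total}' "${1:-/proc/meminfo}"
}

mem_usage_percent
```

Comparing this value against the configured thresholds over three consecutive monitoring periods mirrors the averaging rule described above.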
+ +### Exception Logs + +If a memory alarm is detected, sysmonitor obtains the **/proc/meminfo** information and prints the information in the **/var/log/sysmonitor.log** file. The information is as follows: + +```text +info|sysmonitor[127]: memory usage alarm: 90% +info|sysmonitor[127]:---------------show /proc/meminfo: --------------- +info|sysmonitor[127]:MemTotal: 3496388 kB +info|sysmonitor[127]:MemFree: 2738100 kB +info|sysmonitor[127]:MemAvailable: 2901888 kB +info|sysmonitor[127]:Buffers: 165064 kB +info|sysmonitor[127]:Cached: 282360 kB +info|sysmonitor[127]:SwapCached: 4492 kB +...... +info|sysmonitor[127]:---------------show_memory_info end. --------------- +``` + +If the following information is printed, sysmonitor runs `echo m > /proc/sysrq-trigger` to export memory allocation information. You can view the information in **/var/log/messages**. + +```text +info|sysmonitor[127]: sysrq show memory ifno in message. +``` + +When the alarm is recovered, the following information is displayed: + +```text +info|sysmonitor[127]: memory usage resume: 4.6% +``` + +## Process and Thread Monitoring + +### Introduction + +Monitors the number of processes and threads. When the total number of processes or threads exceeds or falls below the threshold, a log is recorded or an alarm is reported. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/pscnt**. 
+ +```text +# number of processes(include threads) when alarm occur +ALARM="1600" + +# number of processes(include threads) when alarm resume +RESUME="1500" + +# monitor period(second) +PERIOD="60" + +# process count usage alarm percent +ALARM_RATIO="90" + +# process count usage resume percent +RESUME_RATIO="80" + +# print top process info with largest num of threads when threads alarm +# (range: 0-1024, default: 10, monitor for thread off:0) +SHOW_TOP_PROC_NUM="10" +``` + +| Item | Description | Mandatory| Default Value| +| ----------------- | ------------------------------------------------------------ | -------- | ------ | +| ALARM | Integer greater than 0, indicating the process count alarm threshold | No | 1600 | +| RESUME | Integer greater than or equal to 0, indicating the process count alarm recovery threshold | No | 1500 | +| PERIOD | Monitoring period, in seconds. The value is greater than 0. | No | 60 | +| ALARM_RATIO | Number greater than 0 and less than or equal to 100, indicating the process count alarm threshold as a percentage of the kernel limit. | No | 90 | +| RESUME_RATIO | Number greater than 0 and less than or equal to 100, indicating the process count alarm recovery threshold as a percentage of the kernel limit. It must be less than **ALARM_RATIO**.| No | 80 | +| SHOW_TOP_PROC_NUM | Number of processes with the most threads whose information is printed when a thread count alarm is generated | No | 10 | + +- After modifying the configuration file for process count monitoring, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- The value of **ALARM** must be greater than that of **RESUME**. +- The process count alarm threshold is the larger of **ALARM** and **ALARM_RATIO** percent of the value in **/proc/sys/kernel/pid_max**. The alarm recovery threshold is the larger of **RESUME** and **RESUME_RATIO** percent of the value in **/proc/sys/kernel/pid_max**. +- The thread count alarm threshold is the larger of **ALARM** and **ALARM_RATIO** percent of the value in **/proc/sys/kernel/threads-max**.
The alarm recovery threshold is the larger of **RESUME** and **RESUME_RATIO** percent of the value in **/proc/sys/kernel/threads-max**. +- The value of **SHOW_TOP_PROC_NUM** ranges from 0 to 1024. 0 indicates that thread monitoring is disabled. A larger value, for example, 1024, means more process information is collected and printed when thread alarms are generated, which affects performance. You are advised to set this parameter to the default value 10 or a smaller value. If the impact is significant, you are advised to set this parameter to 0 to disable thread monitoring. +- The value of **PSCNT_MONITOR** in **/etc/sysconfig/sysmonitor** and the value of **SHOW_TOP_PROC_NUM** in **/etc/sysmonitor/pscnt** determine whether thread monitoring is enabled. + - If **PSCNT_MONITOR** is on and **SHOW_TOP_PROC_NUM** is set to a valid value, thread monitoring is enabled. + - If **PSCNT_MONITOR** is on and **SHOW_TOP_PROC_NUM** is 0, thread monitoring is disabled. + - If **PSCNT_MONITOR** is off, thread monitoring is disabled. +- When a process count alarm is generated, the system FD usage information and memory information (**/proc/meminfo**) are printed. +- When a thread count alarm is generated, the total number of threads, `top` process information, number of processes in the current environment, number of system FDs, and memory information (**/proc/meminfo**) are printed. +- If system resources are insufficient before a monitoring period ends, for example, the thread count exceeds the maximum number allowed, the monitoring cannot run properly due to resource limitation. As a result, the alarm cannot be generated.
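The threshold selection described above can be sketched as follows. The `effective_threshold` helper and the numbers are illustrative, not a sysmonitor interface; the same rule applies to thread counts with **/proc/sys/kernel/threads-max** in place of **pid_max**.

```shell
#!/bin/sh
# Illustrative sketch of the alarm threshold rule described above: the
# effective threshold is the larger of ALARM and ALARM_RATIO percent of
# the kernel limit (pid_max for processes, threads-max for threads).
effective_threshold() {
    alarm=$1 ratio=$2 kernel_max=$3
    ratio_abs=$((kernel_max * ratio / 100))
    if [ "$alarm" -gt "$ratio_abs" ]; then
        printf '%s\n' "$alarm"
    else
        printf '%s\n' "$ratio_abs"
    fi
}

# Example: ALARM=1600, ALARM_RATIO=90, pid_max=32768
effective_threshold 1600 90 32768   # prints: 29491
```

With these example values, the percentage-based limit (29491) dominates, so **ALARM** alone does not determine when the alarm fires.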
+ +### Exception Logs + +If a process count alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]:---------------process count alarm start: --------------- +info|sysmonitor[127]: process count alarm:1657 +info|sysmonitor[127]: process count alarm, show sys fd count: 2592 +info|sysmonitor[127]: process count alarm, show mem info +info|sysmonitor[127]:---------------show /proc/meminfo: --------------- +info|sysmonitor[127]:MemTotal: 3496388 kB +info|sysmonitor[127]:MemFree: 2738100 kB +info|sysmonitor[127]:MemAvailable: 2901888 kB +info|sysmonitor[127]:Buffers: 165064 kB +info|sysmonitor[127]:Cached: 282360 kB +info|sysmonitor[127]:SwapCached: 4492 kB +...... +info|sysmonitor[127]:---------------show_memory_info end. --------------- +info|sysmonitor[127]:---------------process count alarm end: --------------- +``` + +If a process count recovery alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: process count resume: 1200 +``` + +If a thread count alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]:---------------threads count alarm start: --------------- +info|sysmonitor[127]:threads count alarm: 273 +info|sysmonitor[127]:open threads most 10 processes is [top1:pid=1756900,openthreadsnum=13,cmd=/usr/bin/sysmonitor --daemon] +info|sysmonitor[127]:open threads most 10 processes is [top2:pid=3130,openthreadsnum=13,cmd=/usr/lib/gassproxy -D] +..... +info|sysmonitor[127]:---------------threads count alarm end. --------------- +``` + +## System FD Count Monitoring + +### Introduction + +Monitors the number of system FDs. When the total number of system FDs exceeds or is less than the threshold, a log is recorded. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/sys_fd_conf**. 
+ +```text +# system fd usage alarm percent +SYS_FD_ALARM="80" +# system fd usage alarm resume percent +SYS_FD_RESUME="70" +# monitor period (second) +SYS_FD_PERIOD="600" +``` + +Configuration items: + +| Item | Description | Mandatory| Default Value| +| ------------- | --------------------------------------------------------- | -------- | ------ | +| SYS_FD_ALARM | Integer greater than 0 and less than 100, indicating the alarm threshold for the percentage of used FDs against the maximum number of FDs allowed.| No | 80 | +| SYS_FD_RESUME | Integer greater than 0 and less than 100, indicating the alarm recovery threshold for the percentage of used FDs against the maximum number of FDs allowed.| No | 70 | +| SYS_FD_PERIOD | Integer between 100 and 86400, indicating the monitoring period, in seconds | No | 600 | + +- After modifying the configuration file for FD count monitoring, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- The value of **SYS_FD_ALARM** must be greater than that of **SYS_FD_RESUME**. If the value is invalid, the default value is used and a log is recorded. + +### Exception Logs + +An FD count alarm is recorded in the monitoring logs when detected. The following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: sys fd count alarm: 259296 +``` + +When a system FD usage alarm is generated, the top three processes that use the most FDs are printed. + +```text +info|sysmonitor[127]:open fd most three processes is:[top1:pid=23233,openfdnum=5000,cmd=/home/openfile] +info|sysmonitor[127]:open fd most three processes is:[top2:pid=23267,openfdnum=5000,cmd=/home/openfile] +info|sysmonitor[127]:open fd most three processes is:[top3:pid=30144,openfdnum=5000,cmd=/home/openfile] +``` + +## Drive Inode Monitoring + +### Introduction + +Periodically monitors the inodes of mounted drive partitions.
When the drive partition inode usage is greater than or equal to the configured alarm threshold, the system records a drive inode alarm. When the drive inode usage falls below the configured alarm recovery threshold, a drive inode recovery alarm is recorded. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/inode**. + +```text +DISK="/" +DISK="/var/log" +``` + +| Item| Description | Mandatory| Default Value| +| ------ | ------------------------- | -------- | ------ | +| DISK | Mount directory | Yes | None | +| ALARM | Integer indicating the drive inode alarm threshold| No | 90 | +| RESUME | Integer indicating the drive inode alarm recovery threshold| No | 80 | + +- After modifying the configuration file for drive inode monitoring, run `systemctl reload sysmonitor`. The new configuration takes effect after a monitoring period. +- If a mount directory is configured repeatedly, the last configuration item takes effect. +- The value of **ALARM** must be greater than that of **RESUME**. +- Only the mount point or the drive partition of the mount point can be monitored. +- When the CPU usage and I/O usage are high, the `df` command execution may time out. As a result, the drive inode usage cannot be obtained. +- If a drive partition is mounted to multiple mount points, an alarm is reported for each mount point. + +### Exception Logs + +If a drive inode alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[4570]:report disk inode alarm, /var/log used:90% alarm:90% +info|sysmonitor[4570]:report disk inode recovered, /var/log used:79% alarm:80% +``` + +## Local Drive I/O Latency Monitoring + +### Introduction + +Reads the local drive I/O latency data every 5 seconds and collects statistics on 60 groups of data every 5 minutes. 
If more than 30 groups of data are greater than the configured maximum I/O latency, the system records a log indicating excessive drive I/O latency. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/iodelay**. + +```text +DELAY_VALUE="500" +``` + +| Item | Description | Mandatory| Default Value| +| ----------- | -------------------- | -------- | ------ | +| DELAY_VALUE | Maximum drive I/O latency| Yes | 500 | + +### Exception Logs + +If a drive I/O latency alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]:local disk sda IO delay is too large, I/O delay threshold is 70. +info|sysmonitor[127]:disk is sda, io delay data: 71 72 75 87 99 29 78 ...... +``` + +If a drive I/O latency recovery alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]:local disk sda IO delay is normal, I/O delay threshold is 70. +info|sysmonitor[127]:disk is sda, io delay data: 11 22 35 8 9 29 38 ...... +``` + +## Zombie Process Monitoring + +### Introduction + +Monitors the number of zombie processes in the system. If the number is greater than the alarm threshold, an alarm log is recorded. When the number drops lower than the recovery threshold, a recovery alarm is reported. + +### Configuration File Description + +The configuration file is **/etc/sysmonitor/zombie**. + +```text +# Ceiling zombie process counts of alarm +ALARM="500" + +# Floor zombie process counts of resume +RESUME="400" + +# Periodic (second) +PERIOD="600" +``` + +| Item| Description | Mandatory| Default Value| +| ------ | ------------------------------- | -------- | ------ | +| ALARM | Number greater than 0, indicating the zombie process count alarm threshold | No | 500 | +| RESUME | Number greater than or equal to 0, indicating the zombie process count recovery threshold| No | 400 | +| PERIOD | Monitoring period, in seconds. 
The value is greater than 0. | No | 60 | + +### Exception Logs + +If a zombie process count alarm is detected, the following information is displayed in the **/var/log/sysmonitor.log** file: + +```text +info|sysmonitor[127]: zombie process count alarm: 600 +info|sysmonitor[127]: zombie process count resume: 100 +``` + +## Custom Monitoring + +### Introduction + +You can customize monitoring items. The monitoring framework reads the content of the configuration file, parses the monitoring attributes, and calls the monitoring actions to be performed. The monitoring module provides only the monitoring framework. It is not aware of what users are monitoring or how to monitor, and does not report alarms. + +### Configuration File Description + +The configuration files are stored in **/etc/sysmonitor.d/**. Each process or module corresponds to a configuration file. + +```text +MONITOR_SWITCH="on" +TYPE="periodic" +EXECSTART="/usr/sbin/iomonitor_daemon" +PERIOD="1800" +``` + +| Item | Description | Mandatory | Default Value| +| -------------- | ------------------------------------------------------------ | --------------------- | ------ | +| MONITOR_SWITCH | Monitoring switch | No | off | +| TYPE | Custom monitoring item type
**daemon**: background execution
**periodic**: periodic execution| Yes | None | +| EXECSTART | Monitoring command | Yes | None | +| ENVIROMENTFILE | Environment variable file | No | None | +| PERIOD | If the type is **periodic**, this parameter is mandatory and sets the monitoring period. The value is an integer greater than 0.| Yes when the type is **periodic**| None | + +- The absolute path of the configuration file or environment variable file cannot contain more than 127 characters. The environment variable file path cannot be a soft link path. +- The length of the **EXECSTART** command cannot exceed 159 characters. No space is allowed in the key field. +- The execution of the periodic monitoring command cannot time out. Otherwise, the custom monitoring framework will be affected. +- Currently, a maximum of 256 environment variables can be configured. +- The custom monitoring of the daemon type checks whether the `reload` command is delivered or whether the daemon process exits abnormally every 10 seconds. If the `reload` command is delivered, the new configuration is loaded 10 seconds later. If a daemon process exits abnormally, the daemon process is restarted 10 seconds later. +- If the content of the **ENVIROMENTFILE** file changes, for example, an environment variable is added or the environment variable value changes, you need to restart the sysmonitor service for the new environment variable to take effect. +- You are advised to set the permission on the configuration files in the **/etc/sysmonitor.d/** directory to 600. If **EXECSTART** is only an executable file, you are advised to set the permission on the executable file to 550. +- After a daemon process exits abnormally, sysmonitor reloads the configuration file of the daemon process. 
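As a concrete illustration, the sketch below creates a custom monitoring item of the **periodic** type. The item name, **EXECSTART** command, and period are hypothetical, and the file is written to a scratch directory here; on a real system it belongs under **/etc/sysmonitor.d/**.

```shell
#!/bin/sh
# Hypothetical custom monitoring item of the periodic type. The item
# name, EXECSTART command, and period are examples only. CONF_DIR is a
# scratch directory; on a real system use /etc/sysmonitor.d/ instead.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/custom_demo" <<'EOF'
MONITOR_SWITCH="on"
TYPE="periodic"
EXECSTART="/usr/local/bin/custom_demo.sh"
PERIOD="300"
EOF

# Restrict permissions as recommended above.
chmod 600 "$CONF_DIR/custom_demo"
```

After placing such a file under **/etc/sysmonitor.d/**, reload sysmonitor so the framework picks up the new item.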
+ +### Exception Logs + +If a monitoring item of the daemon type exits abnormally, the **/var/log/sysmonitor.log** file records the following information: + +```text +info|sysmonitor[127]: custom daemon monitor: child process[11609] name unetwork_alarm exit code[127],[1] times. +``` diff --git a/docs/en/docs/thirdparty_migration/OpenStack-train.md b/docs/en/docs/thirdparty_migration/OpenStack-train.md deleted file mode 100644 index 7ad42a2867c6e95a01ee866f7d271e663b43335c..0000000000000000000000000000000000000000 --- a/docs/en/docs/thirdparty_migration/OpenStack-train.md +++ /dev/null @@ -1,2961 +0,0 @@ -# OpenStack-Wallaby Deployment Guide - - - -- [OpenStack-Wallaby Deployment Guide](#openstack-wallaby-deployment-guide) - - [OpenStack](#openstack) - - [Conventions](#conventions) - - [Preparing the Environment](#preparing-the-environment) - - [Environment Configuration](#environment-configuration) - - [Installing the SQL Database](#installing-the-sql-database) - - [Installing RabbitMQ](#installing-rabbitmq) - - [Installing Memcached](#installing-memcached) - - [OpenStack Installation](#openstack-installation) - - [Installing Keystone](#installing-keystone) - - [Installing Glance](#installing-glance) - - [Installing Placement](#installing-placement) - - [Installing Nova](#installing-nova) - - [Installing Neutron](#installing-neutron) - - [Installing Cinder](#installing-cinder) - - [Installing Horizon](#installing-horizon) - - [Installing Tempest](#installing-tempest) - - [Installing Ironic](#installing-ironic) - - [Installing Kolla](#installing-kolla) - - [Installing Trove](#installing-trove) - - [Installing Swift](#installing-swift) - - [Installing Cyborg](#installing-cyborg) - - [Installing Aodh](#installing-aodh) - - [Installing Gnocchi](#installing-gnocchi) - - [Installing Ceilometer](#installing-ceilometer) - - [Installing Heat](#installing-heat) - - [OpenStack Quick Installation](#openstack-quick-installation) - - -## OpenStack - -OpenStack is an open source 
cloud computing infrastructure software project developed by the community. It provides an operating platform or tool set for deploying the cloud, offering scalable and flexible cloud computing for organizations. - -As an open source cloud computing management platform, OpenStack consists of several major components, such as Nova, Cinder, Neutron, Glance, Keystone, and Horizon. OpenStack supports almost all cloud environments. The project aims to provide a cloud computing management platform that is easy-to-use, scalable, unified, and standardized. OpenStack provides an infrastructure as a service (IaaS) solution that combines complementary services, each of which provides an API for integration. - -The official source of openEuler 22.03-LTS now supports OpenStack Train. You can configure the Yum source then deploy OpenStack by following the instructions of this document. - -## Conventions - -OpenStack supports multiple deployment modes. This document includes two deployment modes: **All in One** and **Distributed**. The conventions are as follows: - -**All in One** mode: - -```text -Ignores all possible suffixes. -``` - -**Distributed** mode: - -```text -A suffix of (CTL) indicates that the configuration or command applies only to the control node. -A suffix of (CPT) indicates that the configuration or command applies only to the compute node. -A suffix of (STG) indicates that the configuration or command applies only to the storage node. -In other cases, the configuration or command applies to both the control node and compute node. -``` - -***Note*** - -The services involved in the preceding conventions are as follows: - -- Cinder -- Nova -- Neutron - -## Preparing the Environment - -### Environment Configuration - -1. Start the OpenStack Train Yum source. - - ```shell - yum update - yum install openstack-release-train - yum clean all && yum makecache - ``` - - **Note**: Enable the EPOL repository for the Yum source if it is not enabled already. 
- - ```shell - vi /etc/yum.repos.d/openEuler.repo - - [EPOL] - name=EPOL - baseurl=http://repo.openeuler.org/openEuler-22.03-LTS/EPOL/main/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.03-LTS/OS/$basearch/RPM-GPG-KEY-openEuler - EOF - ``` - -2. Change the host name and mapping. - - Set the host name of each node: - - ```shell - hostnamectl set-hostname controller (CTL) - hostnamectl set-hostname compute (CPT) - ``` - - Assuming the IP address of the controller node is **10.0.0.11** and the IP address of the compute node (if any) is **10.0.0.12**, add the following information to the **/etc/hosts** file: - - ```shell - 10.0.0.11 controller - 10.0.0.12 compute - ``` - -### Installing the SQL Database - -1. Run the following command to install the software package: - - ```shell - yum install mariadb mariadb-server python3-PyMySQL - ``` - -2. Run the following command to create and edit the **/etc/my.cnf.d/openstack.cnf** file: - - ```shell - vim /etc/my.cnf.d/openstack.cnf - - [mysqld] - bind-address = 10.0.0.11 - default-storage-engine = innodb - innodb_file_per_table = on - max_connections = 4096 - collation-server = utf8_general_ci - character-set-server = utf8 - ``` - - ***Note*** - - **`bind-address` is set to the management IP address of the controller node.** - -3. Run the following commands to start the database service and configure it to automatically start upon system boot: - - ```shell - systemctl enable mariadb.service - systemctl start mariadb.service - ``` - -4. (Optional) Configure the default database password: - - ```shell - mysql_secure_installation - ``` - - ***Note*** - - **Perform operations as prompted.** - -### Installing RabbitMQ - -1. Run the following command to install the software package: - - ```shell - yum install rabbitmq-server - ``` - -2. 
Start the RabbitMQ service and configure it to automatically start upon system boot: - - ```shell - systemctl enable rabbitmq-server.service - systemctl start rabbitmq-server.service - ``` - -3. Add the OpenStack user: - - ```shell - rabbitmqctl add_user openstack RABBIT_PASS - ``` - - ***Note*** - - **Replace *RABBIT_PASS* to set the password for the openstack user.** - -4. Run the following command to set the permissions of the **openstack** user to allow the user to perform configuration, write, and read operations: - - ```shell - rabbitmqctl set_permissions openstack ".*" ".*" ".*" - ``` - -### Installing Memcached - -1. Run the following command to install the dependency package: - - ```shell - yum install memcached python3-memcached - ``` - -2. Edit the **/etc/sysconfig/memcached** file: - - ```shell - vim /etc/sysconfig/memcached - - OPTIONS="-l 127.0.0.1,::1,controller" - ``` - -3. Run the following command to start the Memcached service and configure it to automatically start upon system boot: - - ```shell - systemctl enable memcached.service - systemctl start memcached.service - ``` - - ***Note*** - - **After the service is started, you can run `memcached-tool controller stats` to ensure that the service is started properly and available. You can replace `controller` with the management IP address of the controller node.** - -## OpenStack Installation - -### Installing Keystone - -1. Create the **keystone** database and grant permissions: - - ```sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE keystone; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost' \ - IDENTIFIED BY 'KEYSTONE_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' \ - IDENTIFIED BY 'KEYSTONE_DBPASS'; - MariaDB [(none)]> exit - ``` - - ***Note*** - - **Replace *KEYSTONE_DBPASS* to set the password for the keystone database.** - -2. 
Install the software package: - - ```shell - yum install openstack-keystone httpd mod_wsgi - ``` - -3. Configure Keystone: - - ```shell - vim /etc/keystone/keystone.conf - - [database] - connection = mysql+pymysql://keystone:KEYSTONE_DBPASS@controller/keystone - - [token] - provider = fernet - ``` - - ***Description*** - - In the **[database]** section, configure the database entry. - - In the **[token]** section, configure the token provider. - - ***Note:*** - - **Replace *KEYSTONE_DBPASS* with the password of the keystone database.** - -4. Synchronize the database: - - ```shell - su -s /bin/sh -c "keystone-manage db_sync" keystone - ``` - -5. Initialize the Fernet keystore: - - ```shell - keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone - keystone-manage credential_setup --keystone-user keystone --keystone-group keystone - ``` - -6. Bootstrap the identity service: - - ```shell - keystone-manage bootstrap --bootstrap-password ADMIN_PASS \ - --bootstrap-admin-url http://controller:5000/v3/ \ - --bootstrap-internal-url http://controller:5000/v3/ \ - --bootstrap-public-url http://controller:5000/v3/ \ - --bootstrap-region-id RegionOne - ``` - - ***Note*** - - **Replace *ADMIN_PASS* to set the password for the admin user.** - -7. Configure the Apache HTTP server: - - ```shell - vim /etc/httpd/conf/httpd.conf - - ServerName controller - ``` - - ```shell - ln -s /usr/share/keystone/wsgi-keystone.conf /etc/httpd/conf.d/ - ``` - - ***Description*** - - Configure **ServerName** to use the control node. - - ***Note*** - - **If the ServerName item does not exist, create it.** - -8. Start the Apache HTTP service: - - ```shell - systemctl enable httpd.service - systemctl start httpd.service - ``` - -9. 
Create environment variables: - - ```shell - cat << EOF >> ~/.admin-openrc - export OS_PROJECT_DOMAIN_NAME=Default - export OS_USER_DOMAIN_NAME=Default - export OS_PROJECT_NAME=admin - export OS_USERNAME=admin - export OS_PASSWORD=ADMIN_PASS - export OS_AUTH_URL=http://controller:5000/v3 - export OS_IDENTITY_API_VERSION=3 - export OS_IMAGE_API_VERSION=2 - EOF - ``` - - ***Note*** - - **Replace *ADMIN_PASS* with the password of the admin user.** - -10. Create domains, projects, users, and roles in sequence. python3-openstackclient must be installed first: - - ```shell - yum install python3-openstackclient - ``` - - Import the environment variables: - - ```shell - source ~/.admin-openrc - ``` - - Create the project **service**. The domain **default** has been created during keystone-manage bootstrap. - - ```shell - openstack domain create --description "An Example Domain" example - ``` - - ```shell - openstack project create --domain default --description "Service Project" service - ``` - - Create the (non-admin) project **myproject**, user **myuser**, and role **myrole**, and add the role **myrole** to **myproject** and **myuser**. - - ```shell - openstack project create --domain default --description "Demo Project" myproject - openstack user create --domain default --password-prompt myuser - openstack role create myrole - openstack role add --project myproject --user myuser myrole - ``` - -11. Perform the verification. - - Cancel the temporary environment variables **OS_AUTH_URL** and **OS_PASSWORD**. 
- - ```shell - source ~/.admin-openrc - unset OS_AUTH_URL OS_PASSWORD - ``` - - Request a token for the **admin** user: - - ```shell - openstack --os-auth-url http://controller:5000/v3 \ - --os-project-domain-name Default --os-user-domain-name Default \ - --os-project-name admin --os-username admin token issue - ``` - - Request a token for user **myuser**: - - ```shell - openstack --os-auth-url http://controller:5000/v3 \ - --os-project-domain-name Default --os-user-domain-name Default \ - --os-project-name myproject --os-username myuser token issue - ``` - -### Installing Glance - -1. Create the database, service credentials, and the API endpoints. - - Create the database: - - ```sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE glance; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'localhost' \ - IDENTIFIED BY 'GLANCE_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'%' \ - IDENTIFIED BY 'GLANCE_DBPASS'; - MariaDB [(none)]> exit - ``` - - ***Note:*** - - **Replace *GLANCE_DBPASS* to set the password for the glance database.** - - Create the service credential: - - ```shell - source ~/.admin-openrc - - openstack user create --domain default --password-prompt glance - openstack role add --project service --user glance admin - openstack service create --name glance --description "OpenStack Image" image - ``` - - Create the API endpoints for the image service: - - ```shell - openstack endpoint create --region RegionOne image public http://controller:9292 - openstack endpoint create --region RegionOne image internal http://controller:9292 - openstack endpoint create --region RegionOne image admin http://controller:9292 - ``` - -2. Install the software package: - - ```shell - yum install openstack-glance - ``` - -3. 
Configure Glance: - - ```shell - vim /etc/glance/glance-api.conf - - [database] - connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000 - auth_url = http://controller:5000 - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = glance - password = GLANCE_PASS - - [paste_deploy] - flavor = keystone - - [glance_store] - stores = file,http - default_store = file - filesystem_store_datadir = /var/lib/glance/images/ - ``` - - ***Description:*** - - In the **[database]** section, configure the database entry. - - In the **[keystone_authtoken]** and **[paste_deploy]** sections, configure the identity authentication service entry. - - In the **[glance_store]** section, configure the local file system storage and the location of image files. - - ***Note*** - - **Replace *GLANCE_DBPASS* with the password of the glance database.** - - **Replace *GLANCE_PASS* with the password of user glance.** - -4. Synchronize the database: - - ```shell - su -s /bin/sh -c "glance-manage db_sync" glance - ``` - -5. Start the service: - - ```shell - systemctl enable openstack-glance-api.service - systemctl start openstack-glance-api.service - ``` - -6. Perform the verification. - - Download the image: - - ```shell - source ~/.admin-openrc - - wget http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img - ``` - - ***Note*** - - **If the Kunpeng architecture is used in your environment, download the image of the AArch64 version. 
The cirros-0.5.2-aarch64-disk.img image file has been tested.** - - Upload the image to the image service: - - ```shell - openstack image create --disk-format qcow2 --container-format bare \ - --file cirros-0.4.0-x86_64-disk.img --public cirros - ``` - - Confirm the image upload and verify the attributes: - - ```shell - openstack image list - ``` - -### Installing Placement - -1. Create a database, service credentials, and API endpoints. - - Create a database. - - Access the database as the **root** user. Create the **placement** database, and grant permissions. - - ```sql - mysql -u root -p - MariaDB [(none)]> CREATE DATABASE placement; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'localhost' \ - IDENTIFIED BY 'PLACEMENT_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'%' \ - IDENTIFIED BY 'PLACEMENT_DBPASS'; - MariaDB [(none)]> exit - ``` - - **Note**: - - **Replace *PLACEMENT_DBPASS* to set the password for the placement database.** - - ```shell - source ~/.admin-openrc - ``` - - Run the following commands to create the Placement service credentials: create the **placement** user, add the **admin** role to the **placement** user, and create the Placement API service. - - ```shell - openstack user create --domain default --password-prompt placement - openstack role add --project service --user placement admin - openstack service create --name placement --description "Placement API" placement - ``` - - Create the API endpoints of the **placement** service. - - ```shell - openstack endpoint create --region RegionOne placement public http://controller:8778 - openstack endpoint create --region RegionOne placement internal http://controller:8778 - openstack endpoint create --region RegionOne placement admin http://controller:8778 - ``` - -2. Perform the installation and configuration. 
- - Install the software package: - - ```shell - yum install openstack-placement-api - ``` - - Configure Placement: - - Edit the **/etc/placement/placement.conf** file: - - In the **[placement_database]** section, configure the database entry. - - In the **[api]** and **[keystone_authtoken]** sections, configure the identity authentication service entry. - - ```shell - vim /etc/placement/placement.conf - [placement_database] - # ... - connection = mysql+pymysql://placement:PLACEMENT_DBPASS@controller/placement - [api] - # ... - auth_strategy = keystone - [keystone_authtoken] - # ... - auth_url = http://controller:5000/v3 - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = placement - password = PLACEMENT_PASS - ``` - - Replace **PLACEMENT_DBPASS** with the password of the **placement** database, and replace **PLACEMENT_PASS** with the password of the **placement** user. - - Synchronize the database: - - ```shell - su -s /bin/sh -c "placement-manage db sync" placement - ``` - - Restart the httpd service. - - ```shell - systemctl restart httpd - ``` - -3. Perform the verification. - - Run the following command to check the status: - - ```shell - source ~/.admin-openrc - placement-status upgrade check - ``` - - Run the following command to install osc-placement and list the available resource types and features: - - ```shell - yum install python3-osc-placement - openstack --os-placement-api-version 1.2 resource class list --sort-column name - openstack --os-placement-api-version 1.6 trait list --sort-column name - ``` - -### Installing Nova - -1. Create a database, service credentials, and API endpoints. - - Create a database. 
- - ```sql - mysql -u root -p (CTL) - - MariaDB [(none)]> CREATE DATABASE nova_api; - MariaDB [(none)]> CREATE DATABASE nova; - MariaDB [(none)]> CREATE DATABASE nova_cell0; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'localhost' \ - IDENTIFIED BY 'NOVA_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'%' \ - IDENTIFIED BY 'NOVA_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost' \ - IDENTIFIED BY 'NOVA_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%' \ - IDENTIFIED BY 'NOVA_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'localhost' \ - IDENTIFIED BY 'NOVA_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'%' \ - IDENTIFIED BY 'NOVA_DBPASS'; - MariaDB [(none)]> exit - ``` - - **Note**: - - **Replace *NOVA_DBPASS* to set the password for the nova database.** - - ```shell - source ~/.admin-openrc (CTL) - ``` - - Run the following commands to create the Nova service credentials: - - ```shell - openstack user create --domain default --password-prompt nova (CTL) - openstack role add --project service --user nova admin (CTL) - openstack service create --name nova --description "OpenStack Compute" compute (CTL) - ``` - - Create the Nova API endpoints. - - ```shell - openstack endpoint create --region RegionOne compute public http://controller:8774/v2.1 (CTL) - openstack endpoint create --region RegionOne compute internal http://controller:8774/v2.1 (CTL) - openstack endpoint create --region RegionOne compute admin http://controller:8774/v2.1 (CTL) - ``` - -2. 
Install the software packages: - - ```shell - yum install openstack-nova-api openstack-nova-conductor \ (CTL) - openstack-nova-novncproxy openstack-nova-scheduler - - yum install openstack-nova-compute (CPT) - ``` - - **Note**: - - **If the ARM64 architecture is used, you also need to run the following command:** - - ```shell - yum install edk2-aarch64 (CPT) - ``` - -3. Configure Nova: - - ```shell - vim /etc/nova/nova.conf - - [DEFAULT] - enabled_apis = osapi_compute,metadata - transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/ - my_ip = 10.0.0.11 - use_neutron = true - firewall_driver = nova.virt.firewall.NoopFirewallDriver - compute_driver=libvirt.LibvirtDriver (CPT) - instances_path = /var/lib/nova/instances/ (CPT) - lock_path = /var/lib/nova/tmp (CPT) - - [api_database] - connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api (CTL) - - [database] - connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova (CTL) - - [api] - auth_strategy = keystone - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000/ - auth_url = http://controller:5000/ - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = nova - password = NOVA_PASS - - [vnc] - enabled = true - server_listen = $my_ip - server_proxyclient_address = $my_ip - novncproxy_base_url = http://controller:6080/vnc_auto.html (CPT) - - [glance] - api_servers = http://controller:9292 - - [oslo_concurrency] - lock_path = /var/lib/nova/tmp (CTL) - - [placement] - region_name = RegionOne - project_domain_name = Default - project_name = service - auth_type = password - user_domain_name = Default - auth_url = http://controller:5000/v3 - username = placement - password = PLACEMENT_PASS - - [neutron] - auth_url = http://controller:5000 - auth_type = password - project_domain_name = default - user_domain_name = default - region_name = RegionOne - project_name = service - 
username = neutron - password = NEUTRON_PASS - service_metadata_proxy = true (CTL) - metadata_proxy_shared_secret = METADATA_SECRET (CTL) - ``` - - Description - - In the **[default]** section, enable the compute and metadata APIs, configure the RabbitMQ message queue entry, configure **my_ip**, and enable the network service **neutron**. - - In the **[api_database]** and **[database]** sections, configure the database entry. - - In the **[api]** and **[keystone_authtoken]** sections, configure the identity service entry. - - In the **[vnc]** section, enable and configure the entry for the remote console. - - In the **[glance]** section, configure the API address for the image service. - - In the **[oslo_concurrency]** section, configure the lock path. - - In the **[placement]** section, configure the entry of the Placement service. - - **Note**: - - **Replace *RABBIT_PASS* with the password of the openstack user in RabbitMQ.** - - **Set *my_ip* to the management IP address of the controller node.** - - **Replace *NOVA_DBPASS* with the password of the nova database.** - - **Replace *NOVA_PASS* with the password of the nova user.** - - **Replace *PLACEMENT_PASS* with the password of the placement user.** - - **Replace *NEUTRON_PASS* with the password of the neutron user.** - - **Replace *METADATA_SECRET* with a proper metadata agent secret.** - - Others - - Check whether VM hardware acceleration (x86 architecture) is supported: - - ```shell - egrep -c '(vmx|svm)' /proc/cpuinfo (CPT) - ``` - - If the returned value is **0**, hardware acceleration is not supported. You need to configure libvirt to use QEMU instead of KVM. - - ```shell - vim /etc/nova/nova.conf (CPT) - - [libvirt] - virt_type = qemu - ``` - - If the returned value is **1** or a larger value, hardware acceleration is supported. You can set the value of **virt_type** to **kvm**. 
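When provisioning several compute nodes, the acceleration check above can be scripted. The sketch below derives the `virt_type` value from the CPU flags; the `CPUINFO` override is a hypothetical knob added here only so the logic can be exercised off-node, and on a real compute node it simply defaults to **/proc/cpuinfo**:

```shell
# Sketch: pick the libvirt virt_type from the CPU flags (x86 hosts).
# CPUINFO is a hypothetical override for illustration/testing; on a
# real compute node leave the default of /proc/cpuinfo.
CPUINFO="${CPUINFO:-/proc/cpuinfo}"
if grep -Eq '(vmx|svm)' "$CPUINFO" 2>/dev/null; then
    VIRT_TYPE=kvm    # hardware acceleration is available
else
    VIRT_TYPE=qemu   # fall back to full emulation
fi
echo "virt_type = $VIRT_TYPE"
```

The emitted value is what you would place under the **[libvirt]** section of **/etc/nova/nova.conf** on that node.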
- - **Note**: - - **If the ARM64 architecture is used, you also need to run the following commands on the compute node:** - - ```shell - mkdir -p /usr/share/AAVMF - chown nova:nova /usr/share/AAVMF - - ln -s /usr/share/edk2/aarch64/QEMU_EFI-pflash.raw \ - /usr/share/AAVMF/AAVMF_CODE.fd - ln -s /usr/share/edk2/aarch64/vars-template-pflash.raw \ - /usr/share/AAVMF/AAVMF_VARS.fd - - vim /etc/libvirt/qemu.conf - - nvram = ["/usr/share/AAVMF/AAVMF_CODE.fd: \ - /usr/share/AAVMF/AAVMF_VARS.fd", \ - "/usr/share/edk2/aarch64/QEMU_EFI-pflash.raw: \ - /usr/share/edk2/aarch64/vars-template-pflash.raw"] - ``` - - In addition, when the ARM-architecture deployment environment uses nested virtualization, configure the **[libvirt]** section as follows: - - ```shell - [libvirt] - virt_type = qemu - cpu_mode = custom - cpu_model = cortex-a72 - ``` - -4. Synchronize the database. - - Run the following command to synchronize the **nova-api** database: - - ```shell - su -s /bin/sh -c "nova-manage api_db sync" nova (CTL) - ``` - - Run the following command to register the **cell0** database: - - ```shell - su -s /bin/sh -c "nova-manage cell_v2 map_cell0" nova (CTL) - ``` - - Create the **cell1** cell: - - ```shell - su -s /bin/sh -c "nova-manage cell_v2 create_cell --name=cell1 --verbose" nova (CTL) - ``` - - Synchronize the **nova** database: - - ```shell - su -s /bin/sh -c "nova-manage db sync" nova (CTL) - ``` - - Verify whether **cell0** and **cell1** are correctly registered: - - ```shell - su -s /bin/sh -c "nova-manage cell_v2 list_cells" nova (CTL) - ``` - - Add the compute node to the OpenStack cluster: - - ```shell - su -s /bin/sh -c "nova-manage cell_v2 discover_hosts --verbose" nova (CPT) - ``` - -5. 
Start the services: - - ```shell - systemctl enable \ (CTL) - openstack-nova-api.service \ - openstack-nova-scheduler.service \ - openstack-nova-conductor.service \ - openstack-nova-novncproxy.service - - systemctl start \ (CTL) - openstack-nova-api.service \ - openstack-nova-scheduler.service \ - openstack-nova-conductor.service \ - openstack-nova-novncproxy.service - ``` - - ```shell - systemctl enable libvirtd.service openstack-nova-compute.service (CPT) - systemctl start libvirtd.service openstack-nova-compute.service (CPT) - ``` - -6. Perform the verification. - - ```shell - source ~/.admin-openrc (CTL) - ``` - - List the service components to verify that each process is successfully started and registered: - - ```shell - openstack compute service list (CTL) - ``` - - List the API endpoints in the identity service to verify the connection to the identity service: - - ```shell - openstack catalog list (CTL) - ``` - - List the images in the image service to verify the connections: - - ```shell - openstack image list (CTL) - ``` - - Check whether the cells are running properly and whether other prerequisites are met. - - ```shell - nova-status upgrade check (CTL) - ``` - -### Installing Neutron - -1. Create the database, service credentials, and API endpoints. 
- - Create the database: - - ```sql - mysql -u root -p (CTL) - - MariaDB [(none)]> CREATE DATABASE neutron; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'localhost' \ - IDENTIFIED BY 'NEUTRON_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'%' \ - IDENTIFIED BY 'NEUTRON_DBPASS'; - MariaDB [(none)]> exit - ``` - - ***Note*** - - **Replace *NEUTRON_DBPASS* to set the password for the neutron database.** - - ```shell - source ~/.admin-openrc (CTL) - ``` - - Create the **neutron** service credential: - - ```shell - openstack user create --domain default --password-prompt neutron (CTL) - openstack role add --project service --user neutron admin (CTL) - openstack service create --name neutron --description "OpenStack Networking" network (CTL) - ``` - - Create the API endpoints of the Neutron service: - - ```shell - openstack endpoint create --region RegionOne network public http://controller:9696 (CTL) - openstack endpoint create --region RegionOne network internal http://controller:9696 (CTL) - openstack endpoint create --region RegionOne network admin http://controller:9696 (CTL) - ``` - -2. Install the software packages: - - ```shell - yum install openstack-neutron openstack-neutron-linuxbridge ebtables ipset \ (CTL) - openstack-neutron-ml2 - ``` - - ```shell - yum install openstack-neutron-linuxbridge ebtables ipset (CPT) - ``` - -3. Configure Neutron. 
- - Set the main configuration items: - - ```shell - vim /etc/neutron/neutron.conf - - [database] - connection = mysql+pymysql://neutron:NEUTRON_DBPASS@controller/neutron (CTL) - - [DEFAULT] - core_plugin = ml2 (CTL) - service_plugins = router (CTL) - allow_overlapping_ips = true (CTL) - transport_url = rabbit://openstack:RABBIT_PASS@controller - auth_strategy = keystone - notify_nova_on_port_status_changes = true (CTL) - notify_nova_on_port_data_changes = true (CTL) - api_workers = 3 (CTL) - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000 - auth_url = http://controller:5000 - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = neutron - password = NEUTRON_PASS - - [nova] - auth_url = http://controller:5000 (CTL) - auth_type = password (CTL) - project_domain_name = Default (CTL) - user_domain_name = Default (CTL) - region_name = RegionOne (CTL) - project_name = service (CTL) - username = nova (CTL) - password = NOVA_PASS (CTL) - - [oslo_concurrency] - lock_path = /var/lib/neutron/tmp - ``` - - ***Description*** - - Configure the database entry in the **[database]** section. - - Enable the ML2 and router plugins, allow IP address overlapping, and configure the RabbitMQ message queue entry in the **[default]** section. - - Configure the identity authentication service entry in the **[default]** and **[keystone_authtoken]** sections. - - Enable the network service to notify the compute service of network topology changes in the **[default]** and **[nova]** sections. - - Configure the lock path in the **[oslo_concurrency]** section. 
- - ***Note*** - - **Replace *NEUTRON_DBPASS* with the password of the neutron database.** - - **Replace *RABBIT_PASS* with the password of the openstack user in RabbitMQ.** - - **Replace *NEUTRON_PASS* with the password of the neutron user.** - - **Replace *NOVA_PASS* with the password of the nova user.** - - Configure the ML2 plugin: - - ```shell - vim /etc/neutron/plugins/ml2/ml2_conf.ini - - [ml2] - type_drivers = flat,vlan,vxlan - tenant_network_types = vxlan - mechanism_drivers = linuxbridge,l2population - extension_drivers = port_security - - [ml2_type_flat] - flat_networks = provider - - [ml2_type_vxlan] - vni_ranges = 1:1000 - - [securitygroup] - enable_ipset = true - ``` - - Create the symbolic link for /etc/neutron/plugin.ini. - - ```shell - ln -s /etc/neutron/plugins/ml2/ml2_conf.ini /etc/neutron/plugin.ini - ``` - - **Note** - - **Enable flat, vlan, and vxlan networks, enable the linuxbridge and l2population mechanisms, and enable the port security extension driver in the [ml2] section.** - - **Configure the flat network as the provider virtual network in the [ml2_type_flat] section.** - - **Configure the range of the VXLAN network identifier in the [ml2_type_vxlan] section.** - - **Enable ipset in the [securitygroup] section.** - - **Remarks** - - **The actual layer-2 configuration can be modified as required. In this example, the provider network with linuxbridge is used.** - - Configure the Linux bridge agent: - - ```shell - vim /etc/neutron/plugins/ml2/linuxbridge_agent.ini - - [linux_bridge] - physical_interface_mappings = provider:PROVIDER_INTERFACE_NAME - - [vxlan] - enable_vxlan = true - local_ip = OVERLAY_INTERFACE_IP_ADDRESS - l2_population = true - - [securitygroup] - enable_security_group = true - firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver - ``` - - ***Description*** - - Map the provider virtual network to the physical network interface in the **[linux_bridge]** section. 
- - Enable the VXLAN overlay network, configure the IP address of the physical network interface that processes the overlay network, and enable layer-2 population in the **[vxlan]** section. - - Enable the security group and configure the linux bridge iptables firewall driver in the **[securitygroup]** section. - - ***Note*** - - **Replace *PROVIDER_INTERFACE_NAME* with the physical network interface.** - - **Replace *OVERLAY_INTERFACE_IP_ADDRESS* with the management IP address of the controller node.** - - Configure the Layer-3 agent: - - ```shell - vim /etc/neutron/l3_agent.ini (CTL) - - [DEFAULT] - interface_driver = linuxbridge - ``` - - ***Description*** - - Set the interface driver to linuxbridge in the **[default]** section. - - Configure the DHCP agent: - - ```shell - vim /etc/neutron/dhcp_agent.ini (CTL) - - [DEFAULT] - interface_driver = linuxbridge - dhcp_driver = neutron.agent.linux.dhcp.Dnsmasq - enable_isolated_metadata = true - ``` - - ***Description*** - - In the **[default]** section, configure the linuxbridge interface driver and Dnsmasq DHCP driver, and enable the isolated metadata. - - Configure the metadata agent: - - ```shell - vim /etc/neutron/metadata_agent.ini (CTL) - - [DEFAULT] - nova_metadata_host = controller - metadata_proxy_shared_secret = METADATA_SECRET - ``` - - ***Description*** - - In the **[default]** section, configure the metadata host and the shared secret. - - ***Note*** - - **Replace *METADATA_SECRET* with a proper metadata agent secret.** - -4. 
Configure Nova: - - ```shell - vim /etc/nova/nova.conf - - [neutron] - auth_url = http://controller:5000 - auth_type = password - project_domain_name = Default - user_domain_name = Default - region_name = RegionOne - project_name = service - username = neutron - password = NEUTRON_PASS - service_metadata_proxy = true (CTL) - metadata_proxy_shared_secret = METADATA_SECRET (CTL) - ``` - - ***Description*** - - In the **[neutron]** section, configure the access parameters, enable the metadata agent, and configure the secret. - - ***Note*** - - **Replace *NEUTRON_PASS* with the password of the neutron user.** - - **Replace *METADATA_SECRET* with a proper metadata agent secret.** - -5. Synchronize the database: - - ```shell - su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf \ - --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron - ``` - -6. Run the following command to restart the compute API service: - - ```shell - systemctl restart openstack-nova-api.service - ``` - -7. Start the network service: - - ```shell - systemctl enable neutron-server.service neutron-linuxbridge-agent.service \ (CTL) - neutron-dhcp-agent.service neutron-metadata-agent.service \ - neutron-l3-agent.service - - systemctl restart neutron-server.service neutron-linuxbridge-agent.service \ (CTL) - neutron-dhcp-agent.service neutron-metadata-agent.service \ - neutron-l3-agent.service - - systemctl enable neutron-linuxbridge-agent.service (CPT) - systemctl restart neutron-linuxbridge-agent.service openstack-nova-compute.service (CPT) - ``` - -8. Perform the verification. - - Run the following command to verify whether the Neutron agent is started successfully: - - ```shell - openstack network agent list - ``` - -### Installing Cinder - -1. Create the database, service credentials, and API endpoints. 
- - Create the database: - - ```sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE cinder; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON cinder.* TO 'cinder'@'localhost' \ - IDENTIFIED BY 'CINDER_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON cinder.* TO 'cinder'@'%' \ - IDENTIFIED BY 'CINDER_DBPASS'; - MariaDB [(none)]> exit - ``` - - ***Note*** - - **Replace *CINDER_DBPASS* to set the password for the cinder database.** - - ```shell - source ~/.admin-openrc - ``` - - Create the Cinder service credentials: - - ```shell - openstack user create --domain default --password-prompt cinder - openstack role add --project service --user cinder admin - openstack service create --name cinderv2 --description "OpenStack Block Storage" volumev2 - openstack service create --name cinderv3 --description "OpenStack Block Storage" volumev3 - ``` - - Create the API endpoints for the block storage service: - - ```shell - openstack endpoint create --region RegionOne volumev2 public http://controller:8776/v2/%\(project_id\)s - openstack endpoint create --region RegionOne volumev2 internal http://controller:8776/v2/%\(project_id\)s - openstack endpoint create --region RegionOne volumev2 admin http://controller:8776/v2/%\(project_id\)s - openstack endpoint create --region RegionOne volumev3 public http://controller:8776/v3/%\(project_id\)s - openstack endpoint create --region RegionOne volumev3 internal http://controller:8776/v3/%\(project_id\)s - openstack endpoint create --region RegionOne volumev3 admin http://controller:8776/v3/%\(project_id\)s - ``` - -2. Install the software packages: - - ```shell - yum install openstack-cinder-api openstack-cinder-scheduler (CTL) - ``` - - ```shell - yum install lvm2 device-mapper-persistent-data scsi-target-utils rpcbind nfs-utils \ (STG) - openstack-cinder-volume openstack-cinder-backup - ``` - -3. Prepare the storage devices. 
The following is an example: - - ```shell - pvcreate /dev/vdb - vgcreate cinder-volumes /dev/vdb - - vim /etc/lvm/lvm.conf - - devices { - ... - filter = [ "a/vdb/", "r/.*/"] - } - ``` - - ***Description*** - - In the **devices** section, add a filter to allow the **/dev/vdb** device and reject other devices. - -4. Prepare the NFS share: - - ```shell - mkdir -p /root/cinder/backup - - cat << EOF >> /etc/exports - /root/cinder/backup 192.168.1.0/24(rw,sync,no_root_squash,no_all_squash) - EOF - ``` - -5. Configure Cinder: - - ```shell - vim /etc/cinder/cinder.conf - - [DEFAULT] - transport_url = rabbit://openstack:RABBIT_PASS@controller - auth_strategy = keystone - my_ip = 10.0.0.11 - enabled_backends = lvm (STG) - backup_driver=cinder.backup.drivers.nfs.NFSBackupDriver (STG) - backup_share=HOST:PATH (STG) - - [database] - connection = mysql+pymysql://cinder:CINDER_DBPASS@controller/cinder - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000 - auth_url = http://controller:5000 - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = cinder - password = CINDER_PASS - - [oslo_concurrency] - lock_path = /var/lib/cinder/tmp - - [lvm] - volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver (STG) - volume_group = cinder-volumes (STG) - iscsi_protocol = iscsi (STG) - iscsi_helper = tgtadm (STG) - ``` - - ***Description*** - - In the **[database]** section, configure the database entry. - - In the **[DEFAULT]** section, configure the RabbitMQ message queue entry and **my_ip**. - - In the **[DEFAULT]** and **[keystone_authtoken]** sections, configure the identity authentication service entry. - - In the **[oslo_concurrency]** section, configure the lock path. 
- - ***Note*** - - **Replace *CINDER_DBPASS* with the password of the cinder database.** - - **Replace *RABBIT_PASS* with the password of the openstack user in RabbitMQ.** - - **Set *my_ip* to the management IP address of the controller node.** - - **Replace *CINDER_PASS* with the password of the cinder user.** - - **Replace *HOST:PATH* with the host IP address and the shared path of the NFS.** - -6. Synchronize the database: - - ```shell - su -s /bin/sh -c "cinder-manage db sync" cinder (CTL) - ``` - -7. Configure Nova: - - ```shell - vim /etc/nova/nova.conf (CTL) - - [cinder] - os_region_name = RegionOne - ``` - -8. Restart the compute API service: - - ```shell - systemctl restart openstack-nova-api.service - ``` - -9. Start the Cinder service: - - ```shell - systemctl enable openstack-cinder-api.service openstack-cinder-scheduler.service (CTL) - systemctl start openstack-cinder-api.service openstack-cinder-scheduler.service (CTL) - ``` - - ```shell - systemctl enable rpcbind.service nfs-server.service tgtd.service iscsid.service \ (STG) - openstack-cinder-volume.service \ - openstack-cinder-backup.service - systemctl start rpcbind.service nfs-server.service tgtd.service iscsid.service \ (STG) - openstack-cinder-volume.service \ - openstack-cinder-backup.service - ``` - - ***Note*** - - If the Cinder volumes are mounted using tgtadm, modify the **/etc/tgt/tgtd.conf** file as follows to ensure that tgtd can discover the iscsi target of cinder-volume. - - ```shell - include /var/lib/cinder/volumes/* - ``` - -10. Perform the verification: - - ```shell - source ~/.admin-openrc - openstack volume service list - ``` - -### Installing Horizon - -1. Install the software package: - - ```shell - yum install openstack-dashboard - ``` - -2. Modify the file. 
   Modify the variables:

   ```text
   vim /etc/openstack-dashboard/local_settings

   OPENSTACK_HOST = "controller"
   ALLOWED_HOSTS = ['*', ]

   SESSION_ENGINE = 'django.contrib.sessions.backends.cache'

   CACHES = {
       'default': {
           'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
           'LOCATION': 'controller:11211',
       }
   }

   OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST
   OPENSTACK_KEYSTONE_MULTIDOMAIN_SUPPORT = True
   OPENSTACK_KEYSTONE_DEFAULT_DOMAIN = "Default"
   OPENSTACK_KEYSTONE_DEFAULT_ROLE = "user"

   OPENSTACK_API_VERSIONS = {
       "identity": 3,
       "image": 2,
       "volume": 3,
   }
   ```

3. Restart the httpd service:

   ```shell
   systemctl restart httpd.service memcached.service
   ```

4. Perform the verification.
   Open the browser, enter **http://HOSTIP/dashboard** in the address bar, and log in to Horizon.

   ***Note***

   **Replace *HOSTIP* with the management plane IP address of the controller node.**

### Installing Tempest

Tempest is the integration test service of OpenStack. If you need to run a fully automatic functional test of the installed OpenStack environment, you are advised to use Tempest. Otherwise, you can choose not to install it.

1. Install Tempest:

   ```shell
   yum install openstack-tempest
   ```

2. Initialize the directory:

   ```shell
   tempest init mytest
   ```

3. Modify the configuration file:

   ```shell
   cd mytest
   vi etc/tempest.conf
   ```

   Configure the current OpenStack environment information in **tempest.conf**. For details, see the [official example](https://docs.openstack.org/tempest/latest/sampleconf.html).

4. Perform the test:

   ```shell
   tempest run
   ```

5. (Optional) Install the tempest extensions.
   The OpenStack services provide some tempest test packages. You can install these packages to enrich the tempest test content. In Train, extension tests for Cinder, Glance, Keystone, Ironic, and Trove are provided.
You can run the following command to install and use them: - ``` - yum install python3-cinder-tempest-plugin python3-glance-tempest-plugin python3-ironic-tempest-plugin python3-keystone-tempest-plugin python3-trove-tempest-plugin - ``` - -### Installing Ironic - -Ironic is the bare metal service of OpenStack. If you need to deploy bare metal machines, Ironic is recommended. Otherwise, you can choose not to install it. - -1. Set the database. - - The bare metal service stores information in the database. Create a **ironic** database that can be accessed by the **ironic** user and replace **IRONIC_DBPASSWORD** with a proper password. - - ```sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE ironic CHARACTER SET utf8; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'localhost' \ - IDENTIFIED BY 'IRONIC_DBPASSWORD'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'%' \ - IDENTIFIED BY 'IRONIC_DBPASSWORD'; - ``` - -2. Install the software packages. - - ```shell - yum install openstack-ironic-api openstack-ironic-conductor python3-ironicclient - ``` - - Start the services. - - ```shell - systemctl enable openstack-ironic-api openstack-ironic-conductor - systemctl start openstack-ironic-api openstack-ironic-conductor - ``` - -3. Create service user authentication. - - 1. Create the bare metal service user: - - ```shell - openstack user create --password IRONIC_PASSWORD \ - --email ironic@example.com ironic - openstack role add --project service --user ironic admin - openstack service create --name ironic \ - --description "Ironic baremetal provisioning service" baremetal - ``` - - 1. Create the bare metal service access entries: - - ```shell - openstack endpoint create --region RegionOne baremetal admin http://$IRONIC_NODE:6385 - openstack endpoint create --region RegionOne baremetal public http://$IRONIC_NODE:6385 - openstack endpoint create --region RegionOne baremetal internal http://$IRONIC_NODE:6385 - ``` - -4. 
Configure the ironic-api service. - - Configuration file path: **/etc/ironic/ironic.conf** - - 1. Use **connection** to configure the location of the database as follows. Replace **IRONIC_DBPASSWORD** with the password of user **ironic** and replace **DB_IP** with the IP address of the database server. - - ```shell - [database] - - # The SQLAlchemy connection string used to connect to the - # database (string value) - - connection = mysql+pymysql://ironic:IRONIC_DBPASSWORD@DB_IP/ironic - ``` - - 1. Configure the ironic-api service to use the RabbitMQ message broker. Replace **RPC_\*** with the detailed address and the credential of RabbitMQ. - - ```shell - [DEFAULT] - - # A URL representing the messaging driver to use and its full - # configuration. (string value) - - transport_url = rabbit://RPC_USER:RPC_PASSWORD@RPC_HOST:RPC_PORT/ - ``` - - You can also use json-rpc instead of RabbitMQ. - - 1. Configure the ironic-api service to use the credential of the identity authentication service. Replace **PUBLIC_IDENTITY_IP** with the public IP address of the identity authentication server and **PRIVATE_IDENTITY_IP** with the private IP address of the identity authentication server, replace **IRONIC_PASSWORD** with the password of the **ironic** user in the identity authentication service. - - ```shell - [DEFAULT] - - # Authentication strategy used by ironic-api: one of - # "keystone" or "noauth". "noauth" should not be used in a - # production environment because all authentication will be - # disabled. (string value) - - auth_strategy=keystone - - [keystone_authtoken] - # Authentication type to load (string value) - auth_type=password - # Complete public Identity API endpoint (string value) - www_authenticate_uri=http://PUBLIC_IDENTITY_IP:5000 - # Complete admin Identity API endpoint. (string value) - auth_url=http://PRIVATE_IDENTITY_IP:5000 - # Service username. (string value) - username=ironic - # Service account password. 
(string value) - password=IRONIC_PASSWORD - # Service tenant name. (string value) - project_name=service - # Domain name containing project (string value) - project_domain_name=Default - # User's domain name (string value) - user_domain_name=Default - - ``` - - 1. Create the bare metal service database table: - - ```shell - ironic-dbsync --config-file /etc/ironic/ironic.conf create_schema - ``` - - 1. Restart the ironic-api service: - - ```shell - sudo systemctl restart openstack-ironic-api - ``` - -5. Configure the ironic-conductor service. - - 1. Replace **HOST_IP** with the IP address of the conductor host. - - ```shell - [DEFAULT] - - # IP address of this host. If unset, will determine the IP - # programmatically. If unable to do so, will use "127.0.0.1". - # (string value) - - my_ip=HOST_IP - ``` - - 1. Specifies the location of the database. ironic-conductor must use the same configuration as ironic-api. Replace **IRONIC_DBPASSWORD** with the password of user **ironic** and replace **DB_IP** with the IP address of the database server. - - ```shell - [database] - - # The SQLAlchemy connection string to use to connect to the - # database. (string value) - - connection = mysql+pymysql://ironic:IRONIC_DBPASSWORD@DB_IP/ironic - ``` - - 1. Configure the ironic-api service to use the RabbitMQ message broker. ironic-conductor must use the same configuration as ironic-api. Replace **RPC_\*** with the detailed address and the credential of RabbitMQ. - - ```shell - [DEFAULT] - - # A URL representing the messaging driver to use and its full - # configuration. (string value) - - transport_url = rabbit://RPC_USER:RPC_PASSWORD@RPC_HOST:RPC_PORT/ - ``` - - You can also use json-rpc instead of RabbitMQ. - - 1. Configure the credentials to access other OpenStack services. - - To communicate with other OpenStack services, the bare metal service needs to use the service users to get authenticated by the OpenStack Identity service when requesting other services. 
The credentials of these users must be configured in each configuration file associated to the corresponding service. - - ```shell - [neutron] - Accessing the OpenStack network services. - [glance] - Accessing the OpenStack image service. - [swift] - Accessing the OpenStack object storage service. - [cinder] - Accessing the OpenStack block storage service. - [inspector] Accessing the OpenStack bare metal introspection service. - [service_catalog] - A special item to store the credential used by the bare metal service. The credential is used to discover the API URL endpoint registered in the OpenStack identity authentication service catalog by the bare metal service. - ``` - - For simplicity, you can use one service user for all services. For backward compatibility, the user name must be the same as that configured in [keystone_authtoken] of the ironic-api service. However, this is not mandatory. You can also create and configure a different service user for each service. - - In the following example, the authentication information for the user to access the OpenStack network service is configured as follows: - - ```shell - The network service is deployed in the identity authentication service domain named RegionOne. Only the public endpoint interface is registered in the service catalog. - - A specific CA SSL certificate is used for HTTPS connection when sending a request. - - The same service user as that configured for ironic-api. - - The dynamic password authentication plugin discovers a proper identity authentication service API version based on other options. 
- ``` - - ```shell - [neutron] - - # Authentication type to load (string value) - auth_type = password - # Authentication URL (string value) - auth_url=https://IDENTITY_IP:5000/ - # Username (string value) - username=ironic - # User's password (string value) - password=IRONIC_PASSWORD - # Project name to scope to (string value) - project_name=service - # Domain ID containing project (string value) - project_domain_id=default - # User's domain id (string value) - user_domain_id=default - # PEM encoded Certificate Authority to use when verifying - # HTTPs connections. (string value) - cafile=/opt/stack/data/ca-bundle.pem - # The default region_name for endpoint URL discovery. (string - # value) - region_name = RegionOne - # List of interfaces, in order of preference, for endpoint - # URL. (list value) - valid_interfaces=public - ``` - - By default, to communicate with other services, the bare metal service attempts to discover a proper endpoint of the service through the service catalog of the identity authentication service. If you want to use a different endpoint for a specific service, specify the endpoint_override option in the bare metal service configuration file. - - ```shell - [neutron] ... endpoint_override = - ``` - - 1. Configure the allowed drivers and hardware types. - - Set enabled_hardware_types to specify the hardware types that can be used by ironic-conductor: - - ```shell - [DEFAULT] enabled_hardware_types = ipmi - ``` - - Configure hardware interfaces: - - ```shell - enabled_boot_interfaces = pxe enabled_deploy_interfaces = direct,iscsi enabled_inspect_interfaces = inspector enabled_management_interfaces = ipmitool enabled_power_interfaces = ipmitool - ``` - - Configure the default value of the interface: - - ```shell - [DEFAULT] default_deploy_interface = direct default_network_interface = neutron - ``` - - If any driver that uses Direct Deploy is enabled, you must install and configure the Swift backend of the image service. 
      The Ceph object gateway (RADOS gateway) can also be used as the backend of the image service.

   1. Restart the ironic-conductor service:

      ```shell
      sudo systemctl restart openstack-ironic-conductor
      ```

6. Configure the httpd service.

   1. Create the root directory of the httpd used by Ironic, and set the owner and owner group. The directory path must be the same as the path specified by the **http_root** configuration item in the **[deploy]** group in **/etc/ironic/ironic.conf**.

      ```
      mkdir -p /var/lib/ironic/httproot
      chown ironic.ironic /var/lib/ironic/httproot
      ```

   2. Install and configure the httpd service.

      1. Install the httpd service. If the httpd service is already installed, skip this step.

         ```
         yum install httpd -y
         ```

      2. Create the **/etc/httpd/conf.d/openstack-ironic-httpd.conf** file. The file content is as follows:

         ```
         Listen 8080

         <VirtualHost *:8080>
             ServerName ironic.openeuler.com

             ErrorLog "/var/log/httpd/openstack-ironic-httpd-error_log"
             CustomLog "/var/log/httpd/openstack-ironic-httpd-access_log" "%h %l %u %t \"%r\" %>s %b"

             DocumentRoot "/var/lib/ironic/httproot"
             <Directory "/var/lib/ironic/httproot">
                 Options Indexes FollowSymLinks
                 Require all granted
             </Directory>
             LogLevel warn
             AddDefaultCharset UTF-8
             EnableSendfile on
         </VirtualHost>
         ```

         The listening port must be the same as the port specified by **http_url** in the **[deploy]** section of **/etc/ironic/ironic.conf**.

      3. Restart the httpd service:

         ```
         systemctl restart httpd
         ```

7. Create the deploy ramdisk image.

   The ramdisk image of Train can be created using the ironic-python-agent service or the disk-image-builder tool. You can also use the latest ironic-python-agent-builder provided by the community, or other tools.
   To use the Train native tools, you need to install the corresponding software package.
   ```shell
   yum install openstack-ironic-python-agent
   or
   yum install diskimage-builder
   ```

   For details, see the [official document](https://docs.openstack.org/ironic/queens/install/deploy-ramdisk.html).

   The following describes how to use the ironic-python-agent-builder to build the deploy image used by ironic.

   1. Install ironic-python-agent-builder.

      1. Install the tool:

         ```shell
         pip install ironic-python-agent-builder
         ```

      2. Modify the python interpreter in the following files:

         ```shell
         /usr/bin/yum
         /usr/libexec/urlgrabber-ext-down
         ```

      3. Install the other necessary tools:

         ```shell
         yum install git
         ```

         **DIB** depends on the `semanage` command. Therefore, check whether the `semanage --help` command is available before creating an image. If the system displays a message indicating that the command is unavailable, install the command:

         ```shell
         # Check which package needs to be installed.
         [root@localhost ~]# yum provides /usr/sbin/semanage
         Loaded plug-in: fastestmirror
         Loading mirror speeds from cached hostfile
          * base: mirror.vcu.edu
          * extras: mirror.vcu.edu
          * updates: mirror.math.princeton.edu
         policycoreutils-python-2.5-34.el7.aarch64 : SELinux policy core python utilities
         Source: base
         Matching source:
         File name: /usr/sbin/semanage
         # Install.
         [root@localhost ~]# yum install policycoreutils-python
         ```

   2. Create the image.
- - For Arm architecture, add the following information: - ```shell - export ARCH=aarch64 - ``` - - Basic usage: - - ```shell - usage: ironic-python-agent-builder [-h] [-r RELEASE] [-o OUTPUT] [-e ELEMENT] - [-b BRANCH] [-v] [--extra-args EXTRA_ARGS] - distribution - - positional arguments: - distribution Distribution to use - - optional arguments: - -h, --help show this help message and exit - -r RELEASE, --release RELEASE - Distribution release to use - -o OUTPUT, --output OUTPUT - Output base file name - -e ELEMENT, --element ELEMENT - Additional DIB element to use - -b BRANCH, --branch BRANCH - If set, override the branch that is used for ironic- - python-agent and requirements - -v, --verbose Enable verbose logging in diskimage-builder - --extra-args EXTRA_ARGS - Extra arguments to pass to diskimage-builder - ``` - - Example: - - ```shell - ironic-python-agent-builder centos -o /mnt/ironic-agent-ssh -b origin/stable/rocky - ``` - - 3. Allow SSH login. - - Initialize the environment variables and create the image: - - ```shell - export DIB_DEV_USER_USERNAME=ipa \ - export DIB_DEV_USER_PWDLESS_SUDO=yes \ - export DIB_DEV_USER_PASSWORD='123' - ironic-python-agent-builder centos -o /mnt/ironic-agent-ssh -b origin/stable/rocky -e selinux-permissive -e devuser - ``` - - 4. Specify the code repository. - - Initialize the corresponding environment variables and create the image: - - ```shell - # Specify the address and version of the repository. - DIB_REPOLOCATION_ironic_python_agent=git@172.20.2.149:liuzz/ironic-python-agent.git - DIB_REPOREF_ironic_python_agent=origin/develop - - # Clone code from Gerrit. - DIB_REPOLOCATION_ironic_python_agent=https://review.opendev.org/openstack/ironic-python-agent - DIB_REPOREF_ironic_python_agent=refs/changes/43/701043/1 - ``` - - Reference: [source-repositories](https://docs.openstack.org/diskimage-builder/latest/elements/source-repositories/README.html). 
      The specified repository address and version have been verified and are available.

   5. Note

      The template of the PXE configuration file of the native OpenStack does not support the ARM64 architecture. You need to modify the native OpenStack code.

      In Train, Ironic provided by the community does not support booting from ARM 64-bit UEFI PXE. As a result, the format of the generated grub.cfg file (generally in /tftpboot/) is incorrect, causing the PXE boot to fail. You need to modify the code logic for generating the grub.cfg file.

      A TLS error is reported when Ironic sends a request to IPA to query the command execution status, because by default, both IPA and Ironic of Train have TLS authentication enabled for requests to each other. Disable TLS authentication according to the description on the official website.

      1. Add **ipa-insecure=1** to the following configuration in the Ironic configuration file (**/etc/ironic/ironic.conf**):

         ```
         [agent]
         verify_ca = False

         [pxe]
         pxe_append_params = nofb nomodeset vga=normal coreos.autologin ipa-insecure=1
         ```

      2. Add the IPA configuration file **/etc/ironic_python_agent/ironic_python_agent.conf** to the ramdisk image and configure the TLS as follows (the **/etc/ironic_python_agent** directory must be created in advance):

         ```
         [DEFAULT]
         enable_auto_tls = False
         ```

         Set the permission:

         ```
         chown -R ipa.ipa /etc/ironic_python_agent/
         ```

      3. Modify the startup file of the IPA service and add the configuration file option.
         vim /usr/lib/systemd/system/ironic-python-agent.service

         ```
         [Unit]
         Description=Ironic Python Agent
         After=network-online.target

         [Service]
         ExecStartPre=/sbin/modprobe vfat
         ExecStart=/usr/local/bin/ironic-python-agent --config-file /etc/ironic_python_agent/ironic_python_agent.conf
         Restart=always
         RestartSec=30s

         [Install]
         WantedBy=multi-user.target
         ```

Other services such as ironic-inspector are also provided for OpenStack Train. Install the services based on site requirements.

### Installing Kolla

Kolla provides the OpenStack service with the container-based deployment function that is ready for the production environment.

The installation of Kolla is simple. You only need to install the corresponding RPM packages:

```
yum install openstack-kolla openstack-kolla-ansible
```

After the installation is complete, you can run commands such as `kolla-ansible`, `kolla-build`, `kolla-genpwd`, and `kolla-mergepwd` to create an image or deploy a container environment.

### Installing Trove

Trove is the database service of OpenStack. If you need to use the database service provided by OpenStack, Trove is recommended. Otherwise, you can choose not to install it.

1. Set the database.

   The database service stores information in the database. Create a **trove** database that can be accessed by the **trove** user and replace **TROVE_DBPASSWORD** with a proper password.

   ```sql
   mysql -u root -p

   MariaDB [(none)]> CREATE DATABASE trove CHARACTER SET utf8;
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON trove.* TO 'trove'@'localhost' \
   IDENTIFIED BY 'TROVE_DBPASSWORD';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON trove.* TO 'trove'@'%' \
   IDENTIFIED BY 'TROVE_DBPASSWORD';
   ```

2. Create service user authentication.

   1. Create the **Trove** service user.
- - ```shell - openstack user create --domain default --password-prompt trove - openstack role add --project service --user trove admin - openstack service create --name trove --description "Database" database - ``` - **Description:** Replace *TROVE_PASSWORD* with the password of the **trove** user. - - 1. Create the **Database** service access entry - - ```shell - openstack endpoint create --region RegionOne database public http://controller:8779/v1.0/%\(tenant_id\)s - openstack endpoint create --region RegionOne database internal http://controller:8779/v1.0/%\(tenant_id\)s - openstack endpoint create --region RegionOne database admin http://controller:8779/v1.0/%\(tenant_id\)s - ``` - -3. Install and configure the **Trove** components. - - 1. Install the **Trove** package: - ```shell script - yum install openstack-trove python3-troveclient - ``` - - 2. Configure **trove.conf**: - ```shell script - vim /etc/trove/trove.conf - - [DEFAULT] - log_dir = /var/log/trove - trove_auth_url = http://controller:5000/ - nova_compute_url = http://controller:8774/v2 - cinder_url = http://controller:8776/v1 - swift_url = http://controller:8080/v1/AUTH_ - rpc_backend = rabbit - transport_url = rabbit://openstack:RABBIT_PASS@controller:5672 - auth_strategy = keystone - add_addresses = True - api_paste_config = /etc/trove/api-paste.ini - nova_proxy_admin_user = admin - nova_proxy_admin_pass = ADMIN_PASSWORD - nova_proxy_admin_tenant_name = service - taskmanager_manager = trove.taskmanager.manager.Manager - use_nova_server_config_drive = True - # Set these if using Neutron Networking - network_driver = trove.network.neutron.NeutronDriver - network_label_regex = .* - - [database] - connection = mysql+pymysql://trove:TROVE_DBPASSWORD@controller/trove - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000/ - auth_url = http://controller:5000/ - auth_type = password - project_domain_name = default - user_domain_name = default - project_name = service - username = trove 
- password = TROVE_PASSWORD - ``` - **Description:** - - In the **[Default]** section, **nova_compute_url** and **cinder_url** are endpoints created by Nova and Cinder in Keystone. - - **nova_proxy_XXX** is a user who can access the Nova service. In the preceding example, the **admin** user is used. - - **transport_url** is the **RabbitMQ** connection information, and **RABBIT_PASS** is the RabbitMQ password. - - In the **[database]** section, **connection** is the information of the database created for Trove in MySQL. - - Replace **TROVE_PASSWORD** in the Trove user information with the password of the **trove** user. - - 3. Configure **trove-guestagent.conf**: - ```shell script - vim /etc/trove/trove-guestagent.conf - - rabbit_host = controller - rabbit_password = RABBIT_PASS - trove_auth_url = http://controller:5000/ - ``` - **Description:** **guestagent** is an independent component in Trove and needs to be pre-built into the virtual machine image created by Trove using Nova. - After the database instance is created, the guestagent process is started to report heartbeat messages to the Trove through the message queue (RabbitMQ). - Therefore, you need to configure the user name and password of the RabbitMQ. - **Since Victoria, Trove uses a unified image to run different types of databases. The database service runs in the Docker container of the Guest VM.** - - Replace **RABBIT_PASS** with the RabbitMQ password. - - 4. Generate the **Trove** database table. - ```shell script - su -s /bin/sh -c "trove-manage db_sync" trove - ``` - -4. Complete the installation and configuration. - 1. Configure the **Trove** service to automatically start: - ```shell script - systemctl enable openstack-trove-api.service \ - openstack-trove-taskmanager.service \ - openstack-trove-conductor.service - ``` - 2. 
Start the services: - ```shell script - systemctl start openstack-trove-api.service \ - openstack-trove-taskmanager.service \ - openstack-trove-conductor.service - ``` -### Installing Swift - -Swift provides a scalable and highly available distributed object storage service, which is suitable for storing unstructured data in large scale. - -1. Create the service credentials and API endpoints. - - Create the service credential: - - ``` shell - # Create the swift user. - openstack user create --domain default --password-prompt swift - # Add the admin role for the swift user. - openstack role add --project service --user swift admin - # Create the swift service entity. - openstack service create --name swift --description "OpenStack Object Storage" object-store - ``` - - Create the Swift API endpoints. - - ```shell - openstack endpoint create --region RegionOne object-store public http://controller:8080/v1/AUTH_%\(project_id\)s - openstack endpoint create --region RegionOne object-store internal http://controller:8080/v1/AUTH_%\(project_id\)s - openstack endpoint create --region RegionOne object-store admin http://controller:8080/v1 - ``` - - -2. Install the software packages: - - ```shell - yum install openstack-swift-proxy python3-swiftclient python3-keystoneclient python3-keystonemiddleware memcached (CTL) - ``` - -3. Configure the proxy-server. - - The Swift RPM package contains a **proxy-server.conf** file which is basically ready to use. You only need to change the values of **ip** and swift **password** in the file. - - ***Note*** - - **Replace password with the password you set for the swift user in the identity service.** - -4. Install and configure the storage node. 
(STG) - - Install the supported program packages: - ```shell - yum install xfsprogs rsync - ``` - - Format the /dev/vdb and /dev/vdc devices into XFS: - - ```shell - mkfs.xfs /dev/vdb - mkfs.xfs /dev/vdc - ``` - - Create the mount point directory structure: - - ```shell - mkdir -p /srv/node/vdb - mkdir -p /srv/node/vdc - ``` - - Find the UUID of the new partition: - - ```shell - blkid - ``` - - Add the following to the **/etc/fstab** file: - - ```shell - UUID="" /srv/node/vdb xfs noatime 0 2 - UUID="" /srv/node/vdc xfs noatime 0 2 - ``` - - Mount the devices: - - ```shell - mount /srv/node/vdb - mount /srv/node/vdc - ``` - ***Note*** - - **If the disaster recovery function is not required, you only need to create one device and skip the following rsync configuration.** - - (Optional) Create or edit the **/etc/rsyncd.conf** file to include the following content: - - ```shell - [DEFAULT] - uid = swift - gid = swift - log file = /var/log/rsyncd.log - pid file = /var/run/rsyncd.pid - address = MANAGEMENT_INTERFACE_IP_ADDRESS - - [account] - max connections = 2 - path = /srv/node/ - read only = False - lock file = /var/lock/account.lock - - [container] - max connections = 2 - path = /srv/node/ - read only = False - lock file = /var/lock/container.lock - - [object] - max connections = 2 - path = /srv/node/ - read only = False - lock file = /var/lock/object.lock - ``` - **Replace *MANAGEMENT_INTERFACE_IP_ADDRESS* with the management network IP address of the storage node.** - - Start the rsyncd service and configure it to start upon system startup. - - ```shell - systemctl enable rsyncd.service - systemctl start rsyncd.service - ``` - -5. Install and configure the components on storage nodes. 
(STG) - - Install the software packages: - - ```shell - yum install openstack-swift-account openstack-swift-container openstack-swift-object - ``` - - Edit **account-server.conf**, **container-server.conf**, and **object-server.conf** in the **/etc/swift directory** and replace **bind_ip** with the management network IP address of the storage node. - - Ensure the proper ownership of the mount point directory structure. - - ```shell - chown -R swift:swift /srv/node - ``` - - Create the recon directory and ensure that it has the correct ownership. - - ```shell - mkdir -p /var/cache/swift - chown -R root:swift /var/cache/swift - chmod -R 775 /var/cache/swift - ``` - -6. Create the account ring. (CTL) - - Switch to the **/etc/swift** directory: - - ```shell - cd /etc/swift - ``` - - Create the basic **account.builder** file: - - ```shell - swift-ring-builder account.builder create 10 1 1 - ``` - - Add each storage node to the ring: - - ```shell - swift-ring-builder account.builder add --region 1 --zone 1 --ip STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS --port 6202 --device DEVICE_NAME --weight DEVICE_WEIGHT - ``` - - **Replace *STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS* with the management network IP address of the storage node. Replace *DEVICE_NAME* with the name of the storage device on the same storage node.** - - ***Note*** - **Repeat this command to each storage device on each storage node.** - - Verify the ring contents: - - ```shell - swift-ring-builder account.builder - ``` - - Rebalance the ring: - - ```shell - swift-ring-builder account.builder rebalance - ``` - -7. Create the container ring. 
(CTL)

   Switch to the **/etc/swift** directory.

   Create the basic **container.builder** file:

   ```shell
   swift-ring-builder container.builder create 10 1 1
   ```

   Add each storage node to the ring:

   ```shell
   swift-ring-builder container.builder \
     add --region 1 --zone 1 --ip STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS --port 6201 \
     --device DEVICE_NAME --weight 100
   ```

   **Replace *STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS* with the management network IP address of the storage node. Replace *DEVICE_NAME* with the name of the storage device on the same storage node.**

   ***Note***
   **Repeat this command for every storage device on every storage node.**

   Verify the ring contents:

   ```shell
   swift-ring-builder container.builder
   ```

   Rebalance the ring:

   ```shell
   swift-ring-builder container.builder rebalance
   ```

8. Create the object ring. (CTL)

   Switch to the **/etc/swift** directory.

   Create the basic **object.builder** file:

   ```shell
   swift-ring-builder object.builder create 10 1 1
   ```

   Add each storage node to the ring:

   ```shell
   swift-ring-builder object.builder \
     add --region 1 --zone 1 --ip STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS --port 6200 \
     --device DEVICE_NAME --weight 100
   ```

   **Replace *STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS* with the management network IP address of the storage node. Replace *DEVICE_NAME* with the name of the storage device on the same storage node.**

   ***Note***
   **Repeat this command for every storage device on every storage node.**

   Verify the ring contents:

   ```shell
   swift-ring-builder object.builder
   ```

   Rebalance the ring:

   ```shell
   swift-ring-builder object.builder rebalance
   ```

   Distribute the ring configuration files:

   Copy **account.ring.gz**, **container.ring.gz**, and **object.ring.gz** to the **/etc/swift** directory on each storage node and any additional nodes running the proxy service.

9.
Complete the installation. - - Edit the **/etc/swift/swift.conf** file: - - ``` shell - [swift-hash] - swift_hash_path_suffix = test-hash - swift_hash_path_prefix = test-hash - - [storage-policy:0] - name = Policy-0 - default = yes - ``` - - **Replace test-hash with a unique value.** - - Copy the **swift.conf** file to the **/etc/swift** directory on each storage node and any additional nodes running the proxy service. - - Ensure correct ownership of the configuration directory on all nodes: - - ```shell - chown -R root:swift /etc/swift - ``` - - On the controller node and any additional nodes running the proxy service, start the object storage proxy service and its dependencies, and configure them to start upon system startup. - - ```shell - systemctl enable openstack-swift-proxy.service memcached.service - systemctl start openstack-swift-proxy.service memcached.service - ``` - - On the storage node, start the object storage services and configure them to start upon system startup. - - ```shell - systemctl enable openstack-swift-account.service openstack-swift-account-auditor.service openstack-swift-account-reaper.service openstack-swift-account-replicator.service - - systemctl start openstack-swift-account.service openstack-swift-account-auditor.service openstack-swift-account-reaper.service openstack-swift-account-replicator.service - - systemctl enable openstack-swift-container.service openstack-swift-container-auditor.service openstack-swift-container-replicator.service openstack-swift-container-updater.service - - systemctl start openstack-swift-container.service openstack-swift-container-auditor.service openstack-swift-container-replicator.service openstack-swift-container-updater.service - - systemctl enable openstack-swift-object.service openstack-swift-object-auditor.service openstack-swift-object-replicator.service openstack-swift-object-updater.service - - systemctl start openstack-swift-object.service openstack-swift-object-auditor.service 
openstack-swift-object-replicator.service openstack-swift-object-updater.service - ``` -### Installing Cyborg - -Cyborg provides acceleration device support for OpenStack, for example, GPUs, FPGAs, ASICs, NPs, SoCs, NVMe/NOF SSDs, ODPs, DPDKs, and SPDKs. - -1. Initialize the databases. - -``` -CREATE DATABASE cyborg; -GRANT ALL PRIVILEGES ON cyborg.* TO 'cyborg'@'localhost' IDENTIFIED BY 'CYBORG_DBPASS'; -GRANT ALL PRIVILEGES ON cyborg.* TO 'cyborg'@'%' IDENTIFIED BY 'CYBORG_DBPASS'; -``` - -2. Create Keystone resource objects. - -``` -$ openstack user create --domain default --password-prompt cyborg -$ openstack role add --project service --user cyborg admin -$ openstack service create --name cyborg --description "Acceleration Service" accelerator - -$ openstack endpoint create --region RegionOne \ - accelerator public http://:6666/v1 -$ openstack endpoint create --region RegionOne \ - accelerator internal http://:6666/v1 -$ openstack endpoint create --region RegionOne \ - accelerator admin http://:6666/v1 -``` - -3. Install Cyborg. - -``` -yum install openstack-cyborg -``` - -4. Configure Cyborg. - -Modify **/etc/cyborg/cyborg.conf**.
- -``` -[DEFAULT] -transport_url = rabbit://%RABBITMQ_USER%:%RABBITMQ_PASSWORD%@%OPENSTACK_HOST_IP%:5672/ -use_syslog = False -state_path = /var/lib/cyborg -debug = True - -[database] -connection = mysql+pymysql://%DATABASE_USER%:%DATABASE_PASSWORD%@%OPENSTACK_HOST_IP%/cyborg - -[service_catalog] -project_domain_id = default -user_domain_id = default -project_name = service -password = PASSWORD -username = cyborg -auth_url = http://%OPENSTACK_HOST_IP%/identity -auth_type = password - -[placement] -project_domain_name = Default -project_name = service -user_domain_name = Default -password = PASSWORD -username = placement -auth_url = http://%OPENSTACK_HOST_IP%/identity -auth_type = password - -[keystone_authtoken] -memcached_servers = localhost:11211 -project_domain_name = Default -project_name = service -user_domain_name = Default -password = PASSWORD -username = cyborg -auth_url = http://%OPENSTACK_HOST_IP%/identity -auth_type = password -``` - -Set the user names, passwords, and IP addresses as required. - -5. Synchronize the database table. - -``` -cyborg-dbsync --config-file /etc/cyborg/cyborg.conf upgrade -``` - -6. Start the Cyborg services. - -``` -systemctl enable openstack-cyborg-api openstack-cyborg-conductor openstack-cyborg-agent -systemctl start openstack-cyborg-api openstack-cyborg-conductor openstack-cyborg-agent -``` - -### Installing Aodh - -1. Create the database. - -``` -CREATE DATABASE aodh; - -GRANT ALL PRIVILEGES ON aodh.* TO 'aodh'@'localhost' IDENTIFIED BY 'AODH_DBPASS'; - -GRANT ALL PRIVILEGES ON aodh.* TO 'aodh'@'%' IDENTIFIED BY 'AODH_DBPASS'; -``` - -2. Create Keystone resource objects.
- -``` -openstack user create --domain default --password-prompt aodh - -openstack role add --project service --user aodh admin - -openstack service create --name aodh --description "Telemetry" alarming - -openstack endpoint create --region RegionOne alarming public http://controller:8042 - -openstack endpoint create --region RegionOne alarming internal http://controller:8042 - -openstack endpoint create --region RegionOne alarming admin http://controller:8042 -``` - -3. Install Aodh. - -``` -yum install openstack-aodh-api openstack-aodh-evaluator openstack-aodh-notifier openstack-aodh-listener openstack-aodh-expirer python3-aodhclient -``` - -4. Modify the configuration file. - -``` -[database] -connection = mysql+pymysql://aodh:AODH_DBPASS@controller/aodh - -[DEFAULT] -transport_url = rabbit://openstack:RABBIT_PASS@controller -auth_strategy = keystone - -[keystone_authtoken] -www_authenticate_uri = http://controller:5000 -auth_url = http://controller:5000 -memcached_servers = controller:11211 -auth_type = password -project_domain_id = default -user_domain_id = default -project_name = service -username = aodh -password = AODH_PASS - -[service_credentials] -auth_type = password -auth_url = http://controller:5000/v3 -project_domain_id = default -user_domain_id = default -project_name = service -username = aodh -password = AODH_PASS -interface = internalURL -region_name = RegionOne -``` - -5. Initialize the database. - -``` -aodh-dbsync -``` - -6. Start the Aodh services. - -``` -systemctl enable openstack-aodh-api.service openstack-aodh-evaluator.service openstack-aodh-notifier.service openstack-aodh-listener.service - -systemctl start openstack-aodh-api.service openstack-aodh-evaluator.service openstack-aodh-notifier.service openstack-aodh-listener.service -``` - -### Installing Gnocchi - -1. Create the database. 
- -``` -CREATE DATABASE gnocchi; - -GRANT ALL PRIVILEGES ON gnocchi.* TO 'gnocchi'@'localhost' IDENTIFIED BY 'GNOCCHI_DBPASS'; - -GRANT ALL PRIVILEGES ON gnocchi.* TO 'gnocchi'@'%' IDENTIFIED BY 'GNOCCHI_DBPASS'; -``` - -2. Create Keystone resource objects. - -``` -openstack user create --domain default --password-prompt gnocchi - -openstack role add --project service --user gnocchi admin - -openstack service create --name gnocchi --description "Metric Service" metric - -openstack endpoint create --region RegionOne metric public http://controller:8041 - -openstack endpoint create --region RegionOne metric internal http://controller:8041 - -openstack endpoint create --region RegionOne metric admin http://controller:8041 -``` - -3. Install Gnocchi. - -``` -yum install openstack-gnocchi-api openstack-gnocchi-metricd python3-gnocchiclient -``` - -4. Modify the **/etc/gnocchi/gnocchi.conf** configuration file. - -``` -[api] -auth_mode = keystone -port = 8041 -uwsgi_mode = http-socket - -[keystone_authtoken] -auth_type = password -auth_url = http://controller:5000/v3 -project_domain_name = Default -user_domain_name = Default -project_name = service -username = gnocchi -password = GNOCCHI_PASS -interface = internalURL -region_name = RegionOne - -[indexer] -url = mysql+pymysql://gnocchi:GNOCCHI_DBPASS@controller/gnocchi - -[storage] -# coordination_url is not required but specifying one will improve -# performance with better workload division across workers. -coordination_url = redis://controller:6379 -file_basepath = /var/lib/gnocchi -driver = file -``` - -5. Initialize the database. - -``` -gnocchi-upgrade -``` - -6. Start the Gnocchi services. - -``` -systemctl enable openstack-gnocchi-api.service openstack-gnocchi-metricd.service - -systemctl start openstack-gnocchi-api.service openstack-gnocchi-metricd.service -``` - -### Installing Ceilometer - -1. Create Keystone resource objects.
- -``` -openstack user create --domain default --password-prompt ceilometer - -openstack role add --project service --user ceilometer admin - -openstack service create --name ceilometer --description "Telemetry" metering -``` - -2. Install Ceilometer. - -``` -yum install openstack-ceilometer-notification openstack-ceilometer-central -``` - -3. Modify the **/etc/ceilometer/pipeline.yaml** configuration file. - -``` -publishers: - # set address of Gnocchi - # + filter out Gnocchi-related activity meters (Swift driver) - # + set default archive policy - - gnocchi://?filter_project=service&archive_policy=low -``` - -4. Modify the **/etc/ceilometer/ceilometer.conf** configuration file. - -``` -[DEFAULT] -transport_url = rabbit://openstack:RABBIT_PASS@controller - -[service_credentials] -auth_type = password -auth_url = http://controller:5000/v3 -project_domain_id = default -user_domain_id = default -project_name = service -username = ceilometer -password = CEILOMETER_PASS -interface = internalURL -region_name = RegionOne -``` - -5. Initialize the database. - -``` -ceilometer-upgrade -``` - -6. Start the Ceilometer services. - -``` -systemctl enable openstack-ceilometer-notification.service openstack-ceilometer-central.service - -systemctl start openstack-ceilometer-notification.service openstack-ceilometer-central.service -``` - -### Installing Heat - -1. Create the **heat** database and grant proper privileges to it. Replace **HEAT_DBPASS** with a proper password. - -``` -CREATE DATABASE heat; -GRANT ALL PRIVILEGES ON heat.* TO 'heat'@'localhost' IDENTIFIED BY 'HEAT_DBPASS'; -GRANT ALL PRIVILEGES ON heat.* TO 'heat'@'%' IDENTIFIED BY 'HEAT_DBPASS'; -``` - -2. Create a service credential. Create the **heat** user and add the **admin** role to it. - -``` -openstack user create --domain default --password-prompt heat -openstack role add --project service --user heat admin -``` - -3. Create the **heat** and **heat-cfn** services and their API endpoints.
- -``` -openstack service create --name heat --description "Orchestration" orchestration -openstack service create --name heat-cfn --description "Orchestration" cloudformation -openstack endpoint create --region RegionOne orchestration public http://controller:8004/v1/%\(tenant_id\)s -openstack endpoint create --region RegionOne orchestration internal http://controller:8004/v1/%\(tenant_id\)s -openstack endpoint create --region RegionOne orchestration admin http://controller:8004/v1/%\(tenant_id\)s -openstack endpoint create --region RegionOne cloudformation public http://controller:8000/v1 -openstack endpoint create --region RegionOne cloudformation internal http://controller:8000/v1 -openstack endpoint create --region RegionOne cloudformation admin http://controller:8000/v1 -``` - -4. Create additional OpenStack management information, including the **heat** domain and its administrator **heat_domain_admin**, the **heat_stack_owner** role, and the **heat_stack_user** role. - -``` -openstack user create --domain heat --password-prompt heat_domain_admin -openstack role add --domain heat --user-domain heat --user heat_domain_admin admin -openstack role create heat_stack_owner -openstack role create heat_stack_user -``` - -5. Install the software packages. - -``` -yum install openstack-heat-api openstack-heat-api-cfn openstack-heat-engine -``` - -6. Modify the configuration file **/etc/heat/heat.conf**. 
- -``` -[DEFAULT] -transport_url = rabbit://openstack:RABBIT_PASS@controller -heat_metadata_server_url = http://controller:8000 -heat_waitcondition_server_url = http://controller:8000/v1/waitcondition -stack_domain_admin = heat_domain_admin -stack_domain_admin_password = HEAT_DOMAIN_PASS -stack_user_domain_name = heat - -[database] -connection = mysql+pymysql://heat:HEAT_DBPASS@controller/heat - -[keystone_authtoken] -www_authenticate_uri = http://controller:5000 -auth_url = http://controller:5000 -memcached_servers = controller:11211 -auth_type = password -project_domain_name = default -user_domain_name = default -project_name = service -username = heat -password = HEAT_PASS - -[trustee] -auth_type = password -auth_url = http://controller:5000 -username = heat -password = HEAT_PASS -user_domain_name = default - -[clients_keystone] -auth_uri = http://controller:5000 -``` - -7. Initialize the **heat** database table. - -``` -su -s /bin/sh -c "heat-manage db_sync" heat -``` - -8. Start the services. - -``` -systemctl enable openstack-heat-api.service openstack-heat-api-cfn.service openstack-heat-engine.service -systemctl start openstack-heat-api.service openstack-heat-api-cfn.service openstack-heat-engine.service -``` - -## OpenStack Quick Installation - -The OpenStack SIG provides the Ansible script for one-click deployment of OpenStack in All in One or Distributed modes. Users can use the script to quickly deploy an OpenStack environment based on openEuler RPM packages. The following uses the All in One mode installation as an example. - -1. Install the OpenStack SIG Tool. - - ```shell - pip install openstack-sig-tool - ``` - -2. Configure the OpenStack Yum source. - - ```shell - yum install openstack-release-train - ``` - - **Note**: Enable the EPOL repository for the Yum source if it is not enabled already. 
- - ```shell - vi /etc/yum.repos.d/openEuler.repo - - [EPOL] - name=EPOL - baseurl=http://repo.openeuler.org/openEuler-22.03-LTS/EPOL/main/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.03-LTS/OS/$basearch/RPM-GPG-KEY-openEuler - ``` - -3. Update the Ansible configurations. - - Open the **/usr/local/etc/inventory/all_in_one.yaml** file and modify the configuration based on the environment and requirements. Modify the file as follows: - - ```yaml - all: - hosts: - controller: - ansible_host: - ansible_ssh_private_key_file: - ansible_ssh_user: root - vars: - mysql_root_password: root - mysql_project_password: root - rabbitmq_password: root - project_identity_password: root - enabled_service: - - keystone - - neutron - - cinder - - placement - - nova - - glance - - horizon - - aodh - - ceilometer - - cyborg - - gnocchi - - kolla - - heat - - swift - - trove - - tempest - neutron_provider_interface_name: br-ex - default_ext_subnet_range: 10.100.100.0/24 - default_ext_subnet_gateway: 10.100.100.1 - neutron_dataplane_interface_name: eth1 - cinder_block_device: vdb - swift_storage_devices: - - vdc - swift_hash_path_suffix: ash - swift_hash_path_prefix: has - children: - compute: - hosts: controller - storage: - hosts: controller - network: - hosts: controller - vars: - test-key: test-value - dashboard: - hosts: controller - vars: - allowed_host: '*' - kolla: - hosts: controller - vars: - # We add openEuler OS support for kolla in OpenStack Queens/Rocky release - # Set this var to true if you want to use it in Q/R - openeuler_plugin: false - ``` - - Key Configurations - - | Item | Description| - |---|---| - | ansible_host | IP address of the all-in-one node.| - | ansible_ssh_private_key_file | Key used by the Ansible script for logging in to the all-in-one node.| - | ansible_ssh_user | User used by the Ansible script for logging in to the all-in-one node.| - | enabled_service | List of services to be installed.
You can delete services as required.| - | neutron_provider_interface_name | Neutron L3 bridge name. | - | default_ext_subnet_range | Neutron private network IP address range. | - | default_ext_subnet_gateway | Neutron private network gateway. | - | neutron_dataplane_interface_name | NIC used by Neutron. You are advised to use a new NIC to avoid conflicts with existing NICs causing disconnection of the all-in-one node. | - | cinder_block_device | Name of the block device used by Cinder.| - | swift_storage_devices | Name of the block device used by Swift. | - -4. Run the installation command. - - ```shell - oos env setup all_in_one - ``` - - After the command is executed, the OpenStack environment of the All in One mode is successfully deployed. - - The environment variable file **.admin-openrc** is stored in the home directory of the current user. - -5. Initialize the Tempest environment. - - If you want to perform the Tempest test in the environment, run the `oos env init all_in_one` command to create the OpenStack resources required by Tempest. - - After the command is executed successfully, a **mytest** directory is generated in the home directory of the user. You can run the `tempest run` command in the directory. 
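Before running `oos env init all_in_one` and the Tempest test, a quick pre-flight check can save a failed run. The sketch below is an assumption-laden example, not part of the official tooling: it only checks that the `oos`, `openstack`, and `tempest` CLIs are on `PATH` and that the **~/.admin-openrc** file produced by the deployment exists.

```shell
# Pre-flight sketch for the All in One deployment above (hypothetical helper,
# not part of the OpenStack SIG tool). Adjust paths for your environment.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

check_tool oos
check_tool openstack
check_tool tempest

if [ -f "$HOME/.admin-openrc" ]; then
    echo "admin-openrc: present"
else
    echo "admin-openrc: absent"
fi
```

If any tool is reported missing, revisit the corresponding installation step before continuing.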
diff --git a/docs/en/docs/thirdparty_migration/OpenStack-wallaby.md b/docs/en/docs/thirdparty_migration/OpenStack-wallaby.md deleted file mode 100644 index 486d1856d5d70faa55066435483d203939059cf4..0000000000000000000000000000000000000000 --- a/docs/en/docs/thirdparty_migration/OpenStack-wallaby.md +++ /dev/null @@ -1,3208 +0,0 @@ -# OpenStack-Wallaby Deployment Guide - - - -- [OpenStack-Wallaby Deployment Guide](#openstack-wallaby-deployment-guide) - - [OpenStack](#openstack) - - [Conventions](#conventions) - - [Preparing the Environment](#preparing-the-environment) - - [Environment Configuration](#environment-configuration) - - [Installing the SQL Database](#installing-the-sql-database) - - [Installing RabbitMQ](#installing-rabbitmq) - - [Installing Memcached](#installing-memcached) - - [Installing OpenStack](#installing-openstack) - - [Installing Keystone](#installing-keystone) - - [Installing Glance](#installing-glance) - - [Installing Placement](#installing-placement) - - [Installing Nova](#installing-nova) - - [Installing Neutron](#installing-neutron) - - [Installing Cinder](#installing-cinder) - - [Installing Horizon](#installing-horizon) - - [Installing Tempest](#installing-tempest) - - [Installing Ironic](#installing-ironic) - - [Installing Kolla](#installing-kolla) - - [Installing Trove](#installing-trove) - - [Installing Swift](#installing-swift) - - -## OpenStack - -OpenStack is an open source cloud computing infrastructure software project developed by the community. It provides an operating platform or tool set for deploying the cloud, offering scalable and flexible cloud computing for organizations. - -As an open source cloud computing management platform, OpenStack consists of several major components, such as Nova, Cinder, Neutron, Glance, Keystone, and Horizon. OpenStack supports almost all cloud environments. The project aims to provide a cloud computing management platform that is easy-to-use, scalable, unified, and standardized. 
OpenStack provides an infrastructure as a service (IaaS) solution that combines complementary services, each of which provides an API for integration. - -The official source of openEuler 22.03 LTS now supports OpenStack Wallaby. You can configure the Yum source and then deploy OpenStack by following the instructions of this document. - -## Conventions - -OpenStack supports multiple deployment modes. This document includes two deployment modes: `All in One` and `Distributed`. The conventions are as follows: - -`All in One` mode: - -```text -Ignore all possible suffixes. -``` - -`Distributed` mode: - -```text -A suffix of `(CTL)` indicates that the configuration or command applies only to the `control node`. -A suffix of `(CPT)` indicates that the configuration or command applies only to the `compute node`. -A suffix of `(STG)` indicates that the configuration or command applies only to the `storage node`. -In other cases, the configuration or command applies to both the `control node` and `compute node`. -``` - -***Note*** - -The services involved in the preceding conventions are as follows: - -- Cinder -- Nova -- Neutron - -## Preparing the Environment - -### Environment Configuration - -1. Configure the openEuler 22.03 LTS official Yum source. Enable the EPOL software repository to support OpenStack. - - ```shell - yum update - yum install openstack-release-wallaby - yum clean all && yum makecache - ``` - - **Note**: Enable the EPOL repository for the Yum source if it is not enabled already. - - ```shell - vi /etc/yum.repos.d/openEuler.repo - - [EPOL] - name=EPOL - baseurl=http://repo.openeuler.org/openEuler-22.03-LTS/EPOL/main/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.03-LTS/OS/$basearch/RPM-GPG-KEY-openEuler - ``` - -2. Change the host name and mapping.
- - Set the host name of each node: - - ```shell - hostnamectl set-hostname controller (CTL) - hostnamectl set-hostname compute (CPT) - ``` - - Assuming the IP address of the controller node is **10.0.0.11** and the IP address of the compute node (if any) is **10.0.0.12**, add the following information to the **/etc/hosts** file: - - ```shell - 10.0.0.11 controller - 10.0.0.12 compute - ``` - -### Installing the SQL Database - -1. Run the following command to install the software package: - - ```shell - yum install mariadb mariadb-server python3-PyMySQL - ``` - -2. Run the following command to create and edit the `/etc/my.cnf.d/openstack.cnf` file: - - ```shell - vim /etc/my.cnf.d/openstack.cnf - - [mysqld] - bind-address = 10.0.0.11 - default-storage-engine = innodb - innodb_file_per_table = on - max_connections = 4096 - collation-server = utf8_general_ci - character-set-server = utf8 - ``` - - ***Note*** - - **`bind-address` is set to the management IP address of the controller node.** - -3. Run the following commands to start the database service and configure it to automatically start upon system boot: - - ```shell - systemctl enable mariadb.service - systemctl start mariadb.service - ``` - -4. (Optional) Configure the default database password: - - ```shell - mysql_secure_installation - ``` - - ***Note*** - - **Perform operations as prompted.** - -### Installing RabbitMQ - -1. Run the following command to install the software package: - - ```shell - yum install rabbitmq-server - ``` - -2. Start the RabbitMQ service and configure it to automatically start upon system boot: - - ```shell - systemctl enable rabbitmq-server.service - systemctl start rabbitmq-server.service - ``` - -3. Add the OpenStack user: - - ```shell - rabbitmqctl add_user openstack RABBIT_PASS - ``` - - ***Note*** - - **Replace `RABBIT_PASS` to set the password for the openstack user.** - -4. 
Run the following command to set the permission of the openstack user to allow the user to perform configuration, write, and read operations: - - ```shell - rabbitmqctl set_permissions openstack ".*" ".*" ".*" - ``` - -### Installing Memcached - -1. Run the following command to install the dependency package: - - ```shell - yum install memcached python3-memcached - ``` - -2. Open the `/etc/sysconfig/memcached` file in insert mode. - - ```shell - vim /etc/sysconfig/memcached - - OPTIONS="-l 127.0.0.1,::1,controller" - ``` - -3. Run the following command to start the Memcached service and configure it to automatically start upon system boot: - - ```shell - systemctl enable memcached.service - systemctl start memcached.service - ``` - - ***Note*** - - **After the service is started, you can run `memcached-tool controller stats` to ensure that the service is started properly and available. You can replace `controller` with the management IP address of the controller node.** - -## Installing OpenStack - -### Installing Keystone - -1. Create the **keyston** database and grant permissions: - - ``` sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE keystone; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'localhost' \ - IDENTIFIED BY 'KEYSTONE_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' \ - IDENTIFIED BY 'KEYSTONE_DBPASS'; - MariaDB [(none)]> exit - ``` - - ***Note*** - - **Replace `KEYSTONE_DBPASS` to set the password for the keystone database.** - -2. Install the software package: - - ```shell - yum install openstack-keystone httpd mod_wsgi - ``` - -3. Configure Keystone: - - ```shell - vim /etc/keystone/keystone.conf - - [database] - connection = mysql+pymysql://keystone:KEYSTONE_DBPASS@controller/keystone - - [token] - provider = fernet - ``` - - ***Description*** - - In the **[database]** section, configure the database entry . - - In the **[token]** section, configure the token provider . 
- - ***Note:*** - - **Replace `KEYSTONE_DBPASS` with the password of the keystone database.** - -4. Synchronize the database: - - ```shell - su -s /bin/sh -c "keystone-manage db_sync" keystone - ``` - -5. Initialize the Fernet keystore: - - ```shell - keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone - keystone-manage credential_setup --keystone-user keystone --keystone-group keystone - ``` - -6. Start the service: - - ```shell - keystone-manage bootstrap --bootstrap-password ADMIN_PASS \ - --bootstrap-admin-url http://controller:5000/v3/ \ - --bootstrap-internal-url http://controller:5000/v3/ \ - --bootstrap-public-url http://controller:5000/v3/ \ - --bootstrap-region-id RegionOne - ``` - - ***Note*** - - **Replace `ADMIN_PASS` to set the password for the admin user.** - -7. Configure the Apache HTTP server: - - ```shell - vim /etc/httpd/conf/httpd.conf - - ServerName controller - ``` - - ```shell - ln -s /usr/share/keystone/wsgi-keystone.conf /etc/httpd/conf.d/ - ``` - - ***Description*** - - Configure `ServerName` to use the control node. - - ***Note*** - **If the `ServerName` item does not exist, create it.** - -8. Start the Apache HTTP service: - - ```shell - systemctl enable httpd.service - systemctl start httpd.service - ``` - -9. Create environment variables: - - ```shell - cat << EOF >> ~/.admin-openrc - export OS_PROJECT_DOMAIN_NAME=Default - export OS_USER_DOMAIN_NAME=Default - export OS_PROJECT_NAME=admin - export OS_USERNAME=admin - export OS_PASSWORD=ADMIN_PASS - export OS_AUTH_URL=http://controller:5000/v3 - export OS_IDENTITY_API_VERSION=3 - export OS_IMAGE_API_VERSION=2 - EOF - ``` - - ***Note*** - - **Replace `ADMIN_PASS` with the password of the admin user.** - -10.
Create domains, projects, users, and roles in sequence. The python3-openstackclient package must be installed first: - - ```shell - yum install python3-openstackclient - ``` - - Import the environment variables: - - ```shell - source ~/.admin-openrc - ``` - - Create the project `service`. The domain `default` has been created during `keystone-manage bootstrap`. - - ```shell - openstack domain create --description "An Example Domain" example - ``` - - ```shell - openstack project create --domain default --description "Service Project" service - ``` - - Create the (non-admin) project `myproject`, user `myuser`, and role `myrole`, and add the role `myrole` to `myproject` and `myuser`. - - ```shell - openstack project create --domain default --description "Demo Project" myproject - openstack user create --domain default --password-prompt myuser - openstack role create myrole - openstack role add --project myproject --user myuser myrole - ``` - -11. Perform the verification. - - Cancel the temporary environment variables `OS_AUTH_URL` and `OS_PASSWORD`. - - ```shell - source ~/.admin-openrc - unset OS_AUTH_URL OS_PASSWORD - ``` - - Request a token for the **admin** user: - - ```shell - openstack --os-auth-url http://controller:5000/v3 \ - --os-project-domain-name Default --os-user-domain-name Default \ - --os-project-name admin --os-username admin token issue - ``` - - Request a token for user **myuser**: - - ```shell - openstack --os-auth-url http://controller:5000/v3 \ - --os-project-domain-name Default --os-user-domain-name Default \ - --os-project-name myproject --os-username myuser token issue - ``` - -### Installing Glance - -1. Create the database, service credentials, and the API endpoints.
- - Create the database: - - ```sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE glance; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'localhost' \ - IDENTIFIED BY 'GLANCE_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'%' \ - IDENTIFIED BY 'GLANCE_DBPASS'; - MariaDB [(none)]> exit - ``` - - ***Note:*** - - **Replace `GLANCE_DBPASS` to set the password for the glance database.** - - Create the service credential: - - ```shell - source ~/.admin-openrc - - openstack user create --domain default --password-prompt glance - openstack role add --project service --user glance admin - openstack service create --name glance --description "OpenStack Image" image - ``` - - Create the API endpoints for the image service: - - ```shell - openstack endpoint create --region RegionOne image public http://controller:9292 - openstack endpoint create --region RegionOne image internal http://controller:9292 - openstack endpoint create --region RegionOne image admin http://controller:9292 - ``` - -2. Install the software package: - - ```shell - yum install openstack-glance - ``` - -3. Configure Glance: - - ```shell - vim /etc/glance/glance-api.conf - - [database] - connection = mysql+pymysql://glance:GLANCE_DBPASS@controller/glance - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000 - auth_url = http://controller:5000 - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = glance - password = GLANCE_PASS - - [paste_deploy] - flavor = keystone - - [glance_store] - stores = file,http - default_store = file - filesystem_store_datadir = /var/lib/glance/images/ - ``` - - ***Description:*** - - In the **[database]** section, configure the database entry. - - In the **[keystone_authtoken]** and **[paste_deploy]** sections, configure the identity authentication service entry. 
- - In the **[glance_store]** section, configure the local file system storage and the location of image files. - - ***Note*** - - **Replace `GLANCE_DBPASS` with the password of the glance database.** - - **Replace `GLANCE_PASS` with the password of user glance.** - -4. Synchronize the database: - - ```shell - su -s /bin/sh -c "glance-manage db_sync" glance - ``` - -5. Start the service: - - ```shell - systemctl enable openstack-glance-api.service - systemctl start openstack-glance-api.service - ``` - -6. Perform the verification. - - Download the image: - - ```shell - source ~/.admin-openrc - - wget http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img - ``` - - ***Note*** - - **If the Kunpeng architecture is used in your environment, download the image of the AArch64 version. The image cirros-0.5.2-aarch64-disk.img has been tested.** - - Upload the image to the image service: - - ```shell - openstack image create --disk-format qcow2 --container-format bare \ - --file cirros-0.4.0-x86_64-disk.img --public cirros - ``` - - Confirm the image upload and verify the attributes: - - ```shell - openstack image list - ``` - -### Installing Placement - -1. Create a database, service credentials, and API endpoints. - - Create a database. - - Access the database as the **root** user. Create the **placement** database, and grant permissions.
- - ```shell - mysql -u root -p - MariaDB [(none)]> CREATE DATABASE placement; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'localhost' \ - IDENTIFIED BY 'PLACEMENT_DBPASS'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON placement.* TO 'placement'@'%' \ - IDENTIFIED BY 'PLACEMENT_DBPASS'; - MariaDB [(none)]> exit - ``` - - **Note**: - - **Replace `PLACEMENT_DBPASS` to set the password for the placement database.** - - ```shell - source ~/.admin-openrc - ``` - - Run the following commands to create the Placement service credentials, create the **placement** user, and add the **admin** role to the **placement** user: - - Create the Placement API Service. - - ```shell - openstack user create --domain default --password-prompt placement - openstack role add --project service --user placement admin - openstack service create --name placement --description "Placement API" placement - ``` - - Create API endpoints of the Placement service. - - ```shell - openstack endpoint create --region RegionOne placement public http://controller:8778 - openstack endpoint create --region RegionOne placement internal http://controller:8778 - openstack endpoint create --region RegionOne placement admin http://controller:8778 - ``` - -2. Perform the installation and configuration. - - Install the software package: - - ```shell - yum install openstack-placement-api - ``` - - Configure Placement: - - Edit the **/etc/placement/placement.conf** file: - - In the **[placement_database]** section, configure the database entry. - - In **[api]** and **[keystone_authtoken]** sections, configure the identity authentication service entry. - - ```shell - # vim /etc/placement/placement.conf - [placement_database] - # ... - connection = mysql+pymysql://placement:PLACEMENT_DBPASS@controller/placement - [api] - # ... - auth_strategy = keystone - [keystone_authtoken] - # ...
- auth_url = http://controller:5000/v3 - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = placement - password = PLACEMENT_PASS - ``` - - Replace **PLACEMENT_DBPASS** with the password of the **placement** database, and replace **PLACEMENT_PASS** with the password of the **placement** user. - - Synchronize the database: - - ```shell - su -s /bin/sh -c "placement-manage db sync" placement - ``` - - Start the httpd service. - - ```shell - systemctl restart httpd - ``` - -3. Perform the verification. - - Run the following command to check the status: - - ```shell - . admin-openrc - placement-status upgrade check - ``` - - Run the following command to install osc-placement and list the available resource types and features: - - ```shell - yum install python3-osc-placement - openstack --os-placement-api-version 1.2 resource class list --sort-column name - openstack --os-placement-api-version 1.6 trait list --sort-column name - ``` - -### Installing Nova - -1. Create a database, service credentials, and API endpoints. - - Create a database. 
   ```sql
   mysql -u root -p (CTL)

   MariaDB [(none)]> CREATE DATABASE nova_api;
   MariaDB [(none)]> CREATE DATABASE nova;
   MariaDB [(none)]> CREATE DATABASE nova_cell0;
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'localhost' \
   IDENTIFIED BY 'NOVA_DBPASS';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'%' \
   IDENTIFIED BY 'NOVA_DBPASS';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost' \
   IDENTIFIED BY 'NOVA_DBPASS';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%' \
   IDENTIFIED BY 'NOVA_DBPASS';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'localhost' \
   IDENTIFIED BY 'NOVA_DBPASS';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'%' \
   IDENTIFIED BY 'NOVA_DBPASS';
   MariaDB [(none)]> exit
   ```

   **Note**:

   **Replace `NOVA_DBPASS` with a suitable password for the nova database.**

   ```shell
   source ~/.admin-openrc (CTL)
   ```

   Run the following commands to create the Nova service credentials:

   ```shell
   openstack user create --domain default --password-prompt nova (CTL)
   openstack role add --project service --user nova admin (CTL)
   openstack service create --name nova --description "OpenStack Compute" compute (CTL)
   ```

   Create the Nova API endpoints:

   ```shell
   openstack endpoint create --region RegionOne compute public http://controller:8774/v2.1 (CTL)
   openstack endpoint create --region RegionOne compute internal http://controller:8774/v2.1 (CTL)
   openstack endpoint create --region RegionOne compute admin http://controller:8774/v2.1 (CTL)
   ```

2.
Install the software packages: - - ```shell - yum install openstack-nova-api openstack-nova-conductor \ (CTL) - openstack-nova-novncproxy openstack-nova-scheduler - - yum install openstack-nova-compute (CPT) - ``` - - **Note**: - - **If the ARM64 architecture is used, you also need to run the following command:** - - ```shell - yum install edk2-aarch64 (CPT) - ``` - -3. Configure Nova: - - ```shell - vim /etc/nova/nova.conf - - [DEFAULT] - enabled_apis = osapi_compute,metadata - transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/ - my_ip = 10.0.0.1 - use_neutron = true - firewall_driver = nova.virt.firewall.NoopFirewallDriver - compute_driver=libvirt.LibvirtDriver (CPT) - instances_path = /var/lib/nova/instances/ (CPT) - lock_path = /var/lib/nova/tmp (CPT) - - [api_database] - connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api (CTL) - - [database] - connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova (CTL) - - [api] - auth_strategy = keystone - - [keystone_authtoken] - www_authenticate_uri = http://controller:5000/ - auth_url = http://controller:5000/ - memcached_servers = controller:11211 - auth_type = password - project_domain_name = Default - user_domain_name = Default - project_name = service - username = nova - password = NOVA_PASS - - [vnc] - enabled = true - server_listen = $my_ip - server_proxyclient_address = $my_ip - novncproxy_base_url = http://controller:6080/vnc_auto.html (CPT) - - [libvirt] - virt_type = qemu (CPT) - cpu_mode = custom (CPT) - cpu_model = cortex-a72 (CPT) - - [glance] - api_servers = http://controller:9292 - - [oslo_concurrency] - lock_path = /var/lib/nova/tmp (CTL) - - [placement] - region_name = RegionOne - project_domain_name = Default - project_name = service - auth_type = password - user_domain_name = Default - auth_url = http://controller:5000/v3 - username = placement - password = PLACEMENT_PASS - - [neutron] - auth_url = http://controller:5000 - auth_type = password - project_domain_name = 
default - user_domain_name = default - region_name = RegionOne - project_name = service - username = neutron - password = NEUTRON_PASS - service_metadata_proxy = true (CTL) - metadata_proxy_shared_secret = METADATA_SECRET (CTL) - ``` - - Description - - In the **[default]** section, enable the compute and metadata APIs, configure the RabbitMQ message queue entry, configure **my_ip**, and enable the network service **neutron**. - - In the **[api_database]** and **[database]** sections, configure the database entry. - - In the **[api]** and **[keystone_authtoken]** sections, configure the identity service entry. - - In the **[vnc]** section, enable and configure the entry for the remote console. - - In the **[glance]** section, configure the API address for the image service. - - In the **[oslo_concurrency]** section, configure the lock path. - - In the **[placement]** section, configure the entry of the Placement service. - - **Note**: - - **Replace `RABBIT_PASS` with the password of the openstack user in RabbitMQ.** - - **Set `my_ip` to the management IP address of the controller node.** - - **Replace `NOVA_DBPASS` with the password of the nova database.** - - **Replace `NOVA_PASS` with the password of the nova user.** - - **Replace `PLACEMENT_PASS` with the password of the placement user.** - - **Replace `NEUTRON_PASS` with the password of the neutron user.** - - **Replace `METADATA_SECRET` with a proper metadata agent secret.** - - Others - - Check whether VM hardware acceleration (x86 architecture) is supported: - - ```shell - egrep -c '(vmx|svm)' /proc/cpuinfo (CPT) - ``` - - If the returned value is **0**, hardware acceleration is not supported. You need to configure libvirt to use QEMU instead of KVM. - - ```shell - vim /etc/nova/nova.conf (CPT) - - [libvirt] - virt_type = qemu - ``` - - If the returned value is **1** or a larger value, hardware acceleration is supported, and no extra configuration is required. 
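   The decision above can be scripted. The following is a minimal sketch; the `sample_flags` line is an illustrative stand-in for the contents of `/proc/cpuinfo` and is not part of the official procedure:

   ```shell
   # Minimal sketch: derive virt_type from the CPU flags.
   # sample_flags is a stand-in; on a real compute node, read /proc/cpuinfo instead.
   sample_flags="fpu vme de pse tsc msr pae mce svm lm"
   count=$(grep -c -E '(vmx|svm)' <<< "$sample_flags")
   if [ "$count" -eq 0 ]; then
       virt_type=qemu   # no hardware acceleration: fall back to plain QEMU
   else
       virt_type=kvm    # VT-x (vmx) or AMD-V (svm) present
   fi
   echo "virt_type = $virt_type"
   ```

   The resulting value is what you would place under `[libvirt]` in **/etc/nova/nova.conf** on the compute node.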
   **Note**:

   **If the ARM64 architecture is used, you also need to perform the following configuration:**

   ```shell
   vim /etc/libvirt/qemu.conf

   nvram = ["/usr/share/AAVMF/AAVMF_CODE.fd: \
   /usr/share/AAVMF/AAVMF_VARS.fd", \
   "/usr/share/edk2/aarch64/QEMU_EFI-pflash.raw: \
   /usr/share/edk2/aarch64/vars-template-pflash.raw"]

   vim /etc/qemu/firmware/edk2-aarch64.json

   {
       "description": "UEFI firmware for ARM64 virtual machines",
       "interface-types": [
           "uefi"
       ],
       "mapping": {
           "device": "flash",
           "executable": {
               "filename": "/usr/share/edk2/aarch64/QEMU_EFI-pflash.raw",
               "format": "raw"
           },
           "nvram-template": {
               "filename": "/usr/share/edk2/aarch64/vars-template-pflash.raw",
               "format": "raw"
           }
       },
       "targets": [
           {
               "architecture": "aarch64",
               "machines": [
                   "virt-*"
               ]
           }
       ],
       "features": [

       ],
       "tags": [

       ]
   }

   (CPT)
   ```

4. Synchronize the database.

   Run the following command to synchronize the **nova-api** database:

   ```shell
   su -s /bin/sh -c "nova-manage api_db sync" nova (CTL)
   ```

   Run the following command to register the **cell0** database:

   ```shell
   su -s /bin/sh -c "nova-manage cell_v2 map_cell0" nova (CTL)
   ```

   Create the **cell1** cell:

   ```shell
   su -s /bin/sh -c "nova-manage cell_v2 create_cell --name=cell1 --verbose" nova (CTL)
   ```

   Synchronize the **nova** database:

   ```shell
   su -s /bin/sh -c "nova-manage db sync" nova (CTL)
   ```

   Verify whether **cell0** and **cell1** are correctly registered:

   ```shell
   su -s /bin/sh -c "nova-manage cell_v2 list_cells" nova (CTL)
   ```

   Add the compute node to the OpenStack cluster:

   ```shell
   su -s /bin/sh -c "nova-manage cell_v2 discover_hosts --verbose" nova (CPT)
   ```

5.
Start the services: - - ```shell - systemctl enable \ (CTL) - openstack-nova-api.service \ - openstack-nova-scheduler.service \ - openstack-nova-conductor.service \ - openstack-nova-novncproxy.service - - systemctl start \ (CTL) - openstack-nova-api.service \ - openstack-nova-scheduler.service \ - openstack-nova-conductor.service \ - openstack-nova-novncproxy.service - ``` - - ```shell - systemctl enable libvirtd.service openstack-nova-compute.service (CPT) - systemctl start libvirtd.service openstack-nova-compute.service (CPT) - ``` - -6. Perform the verification. - - ```shell - source ~/.admin-openrc (CTL) - ``` - - List the service components to verify that each process is successfully started and registered: - - ```shell - openstack compute service list (CTL) - ``` - - List the API endpoints in the identity service to verify the connection to the identity service: - - ```shell - openstack catalog list (CTL) - ``` - - List the images in the image service to verify the connections: - - ```shell - openstack image list (CTL) - ``` - - Check whether the cells are running properly and whether other prerequisites are met. - - ```shell - nova-status upgrade check (CTL) - ``` - -### Installing Neutron - -1. Create the database, service credentials, and API endpoints. 
   - Create the database:

     ```sql
     mysql -u root -p (CTL)

     MariaDB [(none)]> CREATE DATABASE neutron;
     MariaDB [(none)]> GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'localhost' \
     IDENTIFIED BY 'NEUTRON_DBPASS';
     MariaDB [(none)]> GRANT ALL PRIVILEGES ON neutron.* TO 'neutron'@'%' \
     IDENTIFIED BY 'NEUTRON_DBPASS';
     MariaDB [(none)]> exit
     ```

     ***Note***

     **Replace `NEUTRON_DBPASS` with a suitable password for the neutron database.**

     ```shell
     source ~/.admin-openrc (CTL)
     ```

   - Create the **neutron** service credentials:

     ```shell
     openstack user create --domain default --password-prompt neutron (CTL)
     openstack role add --project service --user neutron admin (CTL)
     openstack service create --name neutron --description "OpenStack Networking" network (CTL)
     ```

   - Create the API endpoints of the Neutron service:

     ```shell
     openstack endpoint create --region RegionOne network public http://controller:9696 (CTL)
     openstack endpoint create --region RegionOne network internal http://controller:9696 (CTL)
     openstack endpoint create --region RegionOne network admin http://controller:9696 (CTL)
     ```

2. Install the software packages:

   ```shell
   yum install openstack-neutron openstack-neutron-linuxbridge ebtables ipset \ (CTL)
       openstack-neutron-ml2
   ```

   ```shell
   yum install openstack-neutron-linuxbridge ebtables ipset (CPT)
   ```

3. Configure Neutron.
   - Set the main configuration items:

     ```shell
     vim /etc/neutron/neutron.conf

     [database]
     connection = mysql+pymysql://neutron:NEUTRON_DBPASS@controller/neutron (CTL)

     [DEFAULT]
     core_plugin = ml2 (CTL)
     service_plugins = router (CTL)
     allow_overlapping_ips = true (CTL)
     transport_url = rabbit://openstack:RABBIT_PASS@controller
     auth_strategy = keystone
     notify_nova_on_port_status_changes = true (CTL)
     notify_nova_on_port_data_changes = true (CTL)
     api_workers = 3 (CTL)

     [keystone_authtoken]
     www_authenticate_uri = http://controller:5000
     auth_url = http://controller:5000
     memcached_servers = controller:11211
     auth_type = password
     project_domain_name = Default
     user_domain_name = Default
     project_name = service
     username = neutron
     password = NEUTRON_PASS

     [nova]
     auth_url = http://controller:5000 (CTL)
     auth_type = password (CTL)
     project_domain_name = Default (CTL)
     user_domain_name = Default (CTL)
     region_name = RegionOne (CTL)
     project_name = service (CTL)
     username = nova (CTL)
     password = NOVA_PASS (CTL)

     [oslo_concurrency]
     lock_path = /var/lib/neutron/tmp
     ```

     ***Description***

     - Configure the database entry in the **[database]** section.

     - Enable the ML2 and router plugins, allow IP address overlapping, and configure the RabbitMQ message queue entry in the **[default]** section.

     - Configure the identity authentication service entry in the **[default]** and **[keystone_authtoken]** sections.

     - Enable the network service to notify the compute service of network topology changes in the **[default]** and **[nova]** sections.

     - Configure the lock path in the **[oslo_concurrency]** section.
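     The placeholder values can also be rendered from shell variables before editing the file. The following is a minimal sketch; the passwords and the temporary file are hypothetical, not part of the official procedure:

     ```shell
     # Minimal sketch (hypothetical values): render NEUTRON_DBPASS and RABBIT_PASS
     # into a config fragment before merging it into /etc/neutron/neutron.conf.
     NEUTRON_DBPASS=changeme1
     RABBIT_PASS=changeme2
     fragment=$(mktemp)
     cat > "$fragment" <<EOF
     [database]
     connection = mysql+pymysql://neutron:${NEUTRON_DBPASS}@controller/neutron

     [DEFAULT]
     transport_url = rabbit://openstack:${RABBIT_PASS}@controller
     EOF
     conn_line=$(grep 'connection' "$fragment")
     echo "$conn_line"
     rm -f "$fragment"
     ```

     This keeps the real passwords out of your shell history for the interactive edits that follow.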
   ***Note***

   **Replace `NEUTRON_DBPASS` with the password of the neutron database.**

   **Replace `RABBIT_PASS` with the password of the openstack user in RabbitMQ.**

   **Replace `NEUTRON_PASS` with the password of the neutron user.**

   **Replace `NOVA_PASS` with the password of the nova user.**

   - Configure the ML2 plugin:

     ```shell
     vim /etc/neutron/plugins/ml2/ml2_conf.ini

     [ml2]
     type_drivers = flat,vlan,vxlan
     tenant_network_types = vxlan
     mechanism_drivers = linuxbridge,l2population
     extension_drivers = port_security

     [ml2_type_flat]
     flat_networks = provider

     [ml2_type_vxlan]
     vni_ranges = 1:1000

     [securitygroup]
     enable_ipset = true
     ```

     Create the symbolic link for /etc/neutron/plugin.ini:

     ```shell
     ln -s /etc/neutron/plugins/ml2/ml2_conf.ini /etc/neutron/plugin.ini
     ```

     **Note**

     **Enable flat, vlan, and vxlan networks, enable the linuxbridge and l2population mechanisms, and enable the port security extension driver in the [ml2] section.**

     **Configure the flat network as the provider virtual network in the [ml2_type_flat] section.**

     **Configure the range of the VXLAN network identifier in the [ml2_type_vxlan] section.**

     **Enable ipset in the [securitygroup] section.**

     **Remarks**

     **The actual L2 configuration can be modified as required. In this example, the provider network + linuxbridge is used.**

   - Configure the Linux bridge agent:

     ```shell
     vim /etc/neutron/plugins/ml2/linuxbridge_agent.ini

     [linux_bridge]
     physical_interface_mappings = provider:PROVIDER_INTERFACE_NAME

     [vxlan]
     enable_vxlan = true
     local_ip = OVERLAY_INTERFACE_IP_ADDRESS
     l2_population = true

     [securitygroup]
     enable_security_group = true
     firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver
     ```

     ***Description***

     - Map the provider virtual network to the physical network interface in the **[linux_bridge]** section.
- - Enable the VXLAN overlay network, configure the IP address of the physical network interface that processes the overlay network, and enable layer-2 population in the **[vxlan]** section. - - Enable the security group and configure the linux bridge iptables firewall driver in the **[securitygroup]** section. - - ***Note*** - - **Replace `PROVIDER_INTERFACE_NAME` with the physical network interface.** - - **Replace `OVERLAY_INTERFACE_IP_ADDRESS` with the management IP address of the controller node.** - - Configure the Layer-3 agent: - - ```shell - vim /etc/neutron/l3_agent.ini (CTL) - - [DEFAULT] - interface_driver = linuxbridge - ``` - - ***Description*** - - Set the interface driver to linuxbridge in the **[default]** section. - - Configure the DHCP agent: - - ```shell - vim /etc/neutron/dhcp_agent.ini (CTL) - - [DEFAULT] - interface_driver = linuxbridge - dhcp_driver = neutron.agent.linux.dhcp.Dnsmasq - enable_isolated_metadata = true - ``` - - ***Description*** - - In the **[default]** section, configure the linuxbridge interface driver and Dnsmasq DHCP driver, and enable the isolated metadata. - - Configure the metadata agent: - - ```shell - vim /etc/neutron/metadata_agent.ini (CTL) - - [DEFAULT] - nova_metadata_host = controller - metadata_proxy_shared_secret = METADATA_SECRET - ``` - - ***Description*** - - In the **[default]**, configure the metadata host and the shared secret. - - ***Note*** - - **Replace `METADATA_SECRET` with a proper metadata agent secret.** - -4. 
Configure Nova:

   ```shell
   vim /etc/nova/nova.conf

   [neutron]
   auth_url = http://controller:5000
   auth_type = password
   project_domain_name = Default
   user_domain_name = Default
   region_name = RegionOne
   project_name = service
   username = neutron
   password = NEUTRON_PASS
   service_metadata_proxy = true (CTL)
   metadata_proxy_shared_secret = METADATA_SECRET (CTL)
   ```

   ***Description***

   In the **[neutron]** section, configure the access parameters, enable the metadata agent, and configure the secret.

   ***Note***

   **Replace `NEUTRON_PASS` with the password of the neutron user.**

   **Replace `METADATA_SECRET` with a proper metadata agent secret.**

5. Synchronize the database:

   ```shell
   su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf \
   --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron
   ```

6. Run the following command to restart the compute API service:

   ```shell
   systemctl restart openstack-nova-api.service
   ```

7. Start the network services:

   ```shell
   systemctl enable neutron-server.service neutron-linuxbridge-agent.service \ (CTL)
       neutron-dhcp-agent.service neutron-metadata-agent.service \
       neutron-l3-agent.service
   systemctl restart openstack-nova-api.service neutron-server.service \ (CTL)
       neutron-linuxbridge-agent.service neutron-dhcp-agent.service \
       neutron-metadata-agent.service neutron-l3-agent.service

   systemctl enable neutron-linuxbridge-agent.service (CPT)
   systemctl restart neutron-linuxbridge-agent.service openstack-nova-compute.service (CPT)
   ```

8. Perform the verification.

   Run the following command to verify whether the Neutron agents are started successfully:

   ```shell
   openstack network agent list
   ```

### Installing Cinder

1. Create the database, service credentials, and API endpoints.
   - Create the database:

     ```sql
     mysql -u root -p

     MariaDB [(none)]> CREATE DATABASE cinder;
     MariaDB [(none)]> GRANT ALL PRIVILEGES ON cinder.* TO 'cinder'@'localhost' \
     IDENTIFIED BY 'CINDER_DBPASS';
     MariaDB [(none)]> GRANT ALL PRIVILEGES ON cinder.* TO 'cinder'@'%' \
     IDENTIFIED BY 'CINDER_DBPASS';
     MariaDB [(none)]> exit
     ```

     ***Note***

     **Replace `CINDER_DBPASS` with a suitable password for the cinder database.**

     ```shell
     source ~/.admin-openrc
     ```

   - Create the Cinder service credentials:

     ```shell
     openstack user create --domain default --password-prompt cinder
     openstack role add --project service --user cinder admin
     openstack service create --name cinderv2 --description "OpenStack Block Storage" volumev2
     openstack service create --name cinderv3 --description "OpenStack Block Storage" volumev3
     ```

   - Create the API endpoints for the block storage service:

     ```shell
     openstack endpoint create --region RegionOne volumev2 public http://controller:8776/v2/%\(project_id\)s
     openstack endpoint create --region RegionOne volumev2 internal http://controller:8776/v2/%\(project_id\)s
     openstack endpoint create --region RegionOne volumev2 admin http://controller:8776/v2/%\(project_id\)s
     openstack endpoint create --region RegionOne volumev3 public http://controller:8776/v3/%\(project_id\)s
     openstack endpoint create --region RegionOne volumev3 internal http://controller:8776/v3/%\(project_id\)s
     openstack endpoint create --region RegionOne volumev3 admin http://controller:8776/v3/%\(project_id\)s
     ```

2. Install the software packages:

   ```shell
   yum install openstack-cinder-api openstack-cinder-scheduler (CTL)
   ```

   ```shell
   yum install lvm2 device-mapper-persistent-data scsi-target-utils rpcbind nfs-utils \ (STG)
       openstack-cinder-volume openstack-cinder-backup
   ```

3. Prepare the storage devices.
The following is an example:

   ```shell
   pvcreate /dev/vdb
   vgcreate cinder-volumes /dev/vdb

   vim /etc/lvm/lvm.conf

   devices {
       ...
       filter = [ "a/vdb/", "r/.*/" ]
   }
   ```

   ***Description***

   In the **devices** section, add a filter to allow the **/dev/vdb** device and reject all other devices.

4. Prepare the NFS:

   ```shell
   mkdir -p /root/cinder/backup

   cat << EOF >> /etc/exports
   /root/cinder/backup 192.168.1.0/24(rw,sync,no_root_squash,no_all_squash)
   EOF
   ```

5. Configure Cinder:

   ```shell
   vim /etc/cinder/cinder.conf

   [DEFAULT]
   transport_url = rabbit://openstack:RABBIT_PASS@controller
   auth_strategy = keystone
   my_ip = 10.0.0.11
   enabled_backends = lvm (STG)
   backup_driver = cinder.backup.drivers.nfs.NFSBackupDriver (STG)
   backup_share = HOST:PATH (STG)

   [database]
   connection = mysql+pymysql://cinder:CINDER_DBPASS@controller/cinder

   [keystone_authtoken]
   www_authenticate_uri = http://controller:5000
   auth_url = http://controller:5000
   memcached_servers = controller:11211
   auth_type = password
   project_domain_name = Default
   user_domain_name = Default
   project_name = service
   username = cinder
   password = CINDER_PASS

   [oslo_concurrency]
   lock_path = /var/lib/cinder/tmp

   [lvm]
   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver (STG)
   volume_group = cinder-volumes (STG)
   iscsi_protocol = iscsi (STG)
   iscsi_helper = tgtadm (STG)
   ```

   ***Description***

   - In the **[database]** section, configure the database entry.

   - In the **[DEFAULT]** section, configure the RabbitMQ message queue entry and **my_ip**.

   - In the **[DEFAULT]** and **[keystone_authtoken]** sections, configure the identity authentication service entry.

   - In the **[oslo_concurrency]** section, configure the lock path.
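   The link between `enabled_backends` and its backend section (here `[lvm]`) can be sanity-checked offline. The following is a minimal sketch; the temporary file is a stand-in for /etc/cinder/cinder.conf and is not part of the official procedure:

   ```shell
   # Minimal sanity-check sketch (illustration only): confirm that every backend
   # listed in enabled_backends has a matching section in a cinder.conf-style file.
   conf=$(mktemp)
   cat > "$conf" <<'EOF'
   [DEFAULT]
   enabled_backends = lvm

   [lvm]
   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
   EOF
   backends=$(awk -F' *= *' '/enabled_backends/ {print $2}' "$conf")
   status=ok
   for b in ${backends//,/ }; do
       grep -q "\[$b\]" "$conf" || status="missing [$b]"
   done
   echo "backend check: $status"
   rm -f "$conf"
   ```

   A backend named in `enabled_backends` but lacking its own section is a common cause of cinder-volume failing to start.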
- - ***Note*** - - **Replace `CINDER_DBPASS` with the password of the cinder database.** - - **Replace `RABBIT_PASS` with the password of the openstack user in RabbitMQ.** - - **Set `my_ip` to the management IP address of the controller node.** - - **Replace `CINDER_PASS` with the password of the cinder user.** - - **Replace `HOST:PATH` with the host IP address and the shared path of the NFS.** - -6. Synchronize the database: - - ```shell - su -s /bin/sh -c "cinder-manage db sync" cinder (CTL) - ``` - -7. Configure Nova: - - ```shell - vim /etc/nova/nova.conf (CTL) - - [cinder] - os_region_name = RegionOne - ``` - -8. Restart the compute API service: - - ```shell - systemctl restart openstack-nova-api.service - ``` - -9. Start the Cinder service: - - ```shell - systemctl enable openstack-cinder-api.service openstack-cinder-scheduler.service (CTL) - systemctl start openstack-cinder-api.service openstack-cinder-scheduler.service (CTL) - ``` - - ```shell - systemctl enable rpcbind.service nfs-server.service tgtd.service iscsid.service \ (STG) - openstack-cinder-volume.service \ - openstack-cinder-backup.service - systemctl start rpcbind.service nfs-server.service tgtd.service iscsid.service \ (STG) - openstack-cinder-volume.service \ - openstack-cinder-backup.service - ``` - - ***Note*** - - If the Cinder volumes are mounted using tgtadm, modify the /etc/tgt/tgtd.conf file as follows to ensure that tgtd can discover the iscsi target of cinder-volume. - - ```shell - include /var/lib/cinder/volumes/* - ``` - -10. Perform the verification: - - ```shell - source ~/.admin-openrc - openstack volume service list - ``` - -### Installing Horizon - -1. Install the software package: - - ```shell - yum install openstack-dashboard - ``` - -2. Modify the file. 
   - Modify the variables:

     ```text
     vim /etc/openstack-dashboard/local_settings

     OPENSTACK_HOST = "controller"
     ALLOWED_HOSTS = ['*', ]

     SESSION_ENGINE = 'django.contrib.sessions.backends.cache'

     CACHES = {
         'default': {
              'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
              'LOCATION': 'controller:11211',
         }
     }

     OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST
     OPENSTACK_KEYSTONE_MULTIDOMAIN_SUPPORT = True
     OPENSTACK_KEYSTONE_DEFAULT_DOMAIN = "Default"
     OPENSTACK_KEYSTONE_DEFAULT_ROLE = "user"

     OPENSTACK_API_VERSIONS = {
         "identity": 3,
         "image": 2,
         "volume": 3,
     }
     ```

3. Restart the httpd and memcached services:

   ```shell
   systemctl restart httpd.service memcached.service
   ```

4. Perform the verification.

   Open the browser, enter **http://HOSTIP/dashboard** in the address bar, and log in to Horizon.

   ***Note***

   **Replace `HOSTIP` with the management plane IP address of the controller node.**

### Installing Tempest

Tempest is the integration test service of OpenStack. If you need to run a fully automatic test of the functions of the installed OpenStack environment, you are advised to use Tempest. Otherwise, you can choose not to install it.

1. Install Tempest:

   ```shell
   yum install openstack-tempest
   ```

2. Initialize the directory:

   ```shell
   tempest init mytest
   ```

3. Modify the configuration file:

   ```shell
   cd mytest
   vi etc/tempest.conf
   ```

   Configure the current OpenStack environment information in **tempest.conf**. For details, see the [official example](https://docs.openstack.org/tempest/latest/sampleconf.html).

4. Perform the test:

   ```shell
   tempest run
   ```

5. (Optional) Install the tempest extensions.

   The OpenStack services provide some tempest test packages. You can install these packages to enrich the tempest test content. In Wallaby, extension tests for Cinder, Glance, Keystone, Ironic, and Trove are provided.
You can run the following command to install and use them:

   ```shell
   yum install python3-cinder-tempest-plugin python3-glance-tempest-plugin python3-ironic-tempest-plugin python3-keystone-tempest-plugin python3-trove-tempest-plugin
   ```

### Installing Ironic

Ironic is the bare metal service of OpenStack. If you need to deploy bare metal machines, Ironic is recommended. Otherwise, you can choose not to install it.

1. Set the database.

   The bare metal service stores information in the database. Create an **ironic** database that can be accessed by the **ironic** user and replace **IRONIC_DBPASSWORD** with a proper password.

   ```sql
   mysql -u root -p

   MariaDB [(none)]> CREATE DATABASE ironic CHARACTER SET utf8;
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'localhost' \
   IDENTIFIED BY 'IRONIC_DBPASSWORD';
   MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'%' \
   IDENTIFIED BY 'IRONIC_DBPASSWORD';
   ```

2. Create service user authentication.

   1. Create the bare metal service users:

      ```shell
      openstack user create --password IRONIC_PASSWORD \
          --email ironic@example.com ironic
      openstack role add --project service --user ironic admin
      openstack service create --name ironic \
          --description "Ironic baremetal provisioning service" baremetal

      openstack service create --name ironic-inspector --description "Ironic inspector baremetal provisioning service" baremetal-introspection
      openstack user create --password IRONIC_INSPECTOR_PASSWORD --email ironic_inspector@example.com ironic_inspector
      openstack role add --project service --user ironic_inspector admin
      ```

   2.
Create the bare metal service access entries: - - ```shell - openstack endpoint create --region RegionOne baremetal admin http://$IRONIC_NODE:6385 - openstack endpoint create --region RegionOne baremetal public http://$IRONIC_NODE:6385 - openstack endpoint create --region RegionOne baremetal internal http://$IRONIC_NODE:6385 - openstack endpoint create --region RegionOne baremetal-introspection internal http://172.20.19.13:5050/v1 - openstack endpoint create --region RegionOne baremetal-introspection public http://172.20.19.13:5050/v1 - openstack endpoint create --region RegionOne baremetal-introspection admin http://172.20.19.13:5050/v1 - ``` - -3. Configure the ironic-api service. - - Configuration file path: **/etc/ironic/ironic.conf** - - 1. Use **connection** to configure the location of the database as follows. Replace **IRONIC_DBPASSWORD** with the password of user **ironic** and replace **DB_IP** with the IP address of the database server. - - ```shell - [database] - - # The SQLAlchemy connection string used to connect to the - # database (string value) - - connection = mysql+pymysql://ironic:IRONIC_DBPASSWORD@DB_IP/ironic - ``` - - 2. Configure the ironic-api service to use the RabbitMQ message broker. Replace **RPC_\*** with the detailed address and the credential of RabbitMQ. - - ```shell - [DEFAULT] - - # A URL representing the messaging driver to use and its full - # configuration. (string value) - - transport_url = rabbit://RPC_USER:RPC_PASSWORD@RPC_HOST:RPC_PORT/ - ``` - - You can also use json-rpc instead of RabbitMQ. - - 3. Configure the ironic-api service to use the credential of the identity authentication service. Replace **PUBLIC_IDENTITY_IP** with the public IP address of the identity authentication server and **PRIVATE_IDENTITY_IP** with the private IP address of the identity authentication server, replace **IRONIC_PASSWORD** with the password of the **ironic** user in the identity authentication service. 
- - ```shell - [DEFAULT] - - # Authentication strategy used by ironic-api: one of - # "keystone" or "noauth". "noauth" should not be used in a - # production environment because all authentication will be - # disabled. (string value) - - auth_strategy=keystone - host = controller - memcache_servers = controller:11211 - enabled_network_interfaces = flat,noop,neutron - default_network_interface = noop - transport_url = rabbit://openstack:RABBITPASSWD@controller:5672/ - enabled_hardware_types = ipmi - enabled_boot_interfaces = pxe - enabled_deploy_interfaces = direct - default_deploy_interface = direct - enabled_inspect_interfaces = inspector - enabled_management_interfaces = ipmitool - enabled_power_interfaces = ipmitool - enabled_rescue_interfaces = no-rescue,agent - isolinux_bin = /usr/share/syslinux/isolinux.bin - logging_context_format_string = %(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [%(global_request_id)s %(request_id)s %(user_identity)s] %(instance)s%(message)s - - [keystone_authtoken] - # Authentication type to load (string value) - auth_type=password - # Complete public Identity API endpoint (string value) - www_authenticate_uri=http://PUBLIC_IDENTITY_IP:5000 - # Complete admin Identity API endpoint. (string value) - auth_url=http://PRIVATE_IDENTITY_IP:5000 - # Service username. (string value) - username=ironic - # Service account password. (string value) - password=IRONIC_PASSWORD - # Service tenant name. 
(string value)
      project_name=service
      # Domain name containing project (string value)
      project_domain_name=Default
      # User's domain name (string value)
      user_domain_name=Default

      [agent]
      deploy_logs_collect = always
      deploy_logs_local_path = /var/log/ironic/deploy
      deploy_logs_storage_backend = local
      image_download_source = http
      stream_raw_images = false
      force_raw_images = false
      verify_ca = False

      [oslo_concurrency]

      [oslo_messaging_notifications]
      transport_url = rabbit://openstack:123456@172.20.19.25:5672/
      topics = notifications
      driver = messagingv2

      [oslo_messaging_rabbit]
      amqp_durable_queues = True
      rabbit_ha_queues = True

      [pxe]
      ipxe_enabled = false
      pxe_append_params = nofb nomodeset vga=normal coreos.autologin ipa-insecure=1
      image_cache_size = 204800
      tftp_root=/var/lib/tftpboot/cephfs/
      tftp_master_path=/var/lib/tftpboot/cephfs/master_images

      [dhcp]
      dhcp_provider = none
      ```

   4. Create the bare metal service database table:

      ```shell
      ironic-dbsync --config-file /etc/ironic/ironic.conf create_schema
      ```

   5. Restart the ironic-api service:

      ```shell
      sudo systemctl restart openstack-ironic-api
      ```

4. Configure the ironic-conductor service.

   1. Replace **HOST_IP** with the IP address of the conductor host.

      ```shell
      [DEFAULT]

      # IP address of this host. If unset, will determine the IP
      # programmatically. If unable to do so, will use "127.0.0.1".
      # (string value)

      my_ip=HOST_IP
      ```

   2. Specify the location of the database. ironic-conductor must use the same configuration as ironic-api. Replace **IRONIC_DBPASSWORD** with the password of user **ironic** and replace **DB_IP** with the IP address of the database server.

      ```shell
      [database]

      # The SQLAlchemy connection string to use to connect to the
      # database. (string value)

      connection = mysql+pymysql://ironic:IRONIC_DBPASSWORD@DB_IP/ironic
      ```

   3.
Configure the ironic-api service to use the RabbitMQ message broker. ironic-conductor must use the same configuration as ironic-api. Replace **RPC_\*** with the detailed address and the credential of RabbitMQ. - - ```shell - [DEFAULT] - - # A URL representing the messaging driver to use and its full - # configuration. (string value) - - transport_url = rabbit://RPC_USER:RPC_PASSWORD@RPC_HOST:RPC_PORT/ - ``` - - You can also use json-rpc instead of RabbitMQ. - - 4. Configure the credentials to access other OpenStack services. - - To communicate with other OpenStack services, the bare metal service needs to use the service users to get authenticated by the OpenStack Identity service when requesting other services. The credentials of these users must be configured in each configuration file associated to the corresponding service. - - ```shell - [neutron] - Accessing the OpenStack network services. - [glance] - Accessing the OpenStack image service. - [swift] - Accessing the OpenStack object storage service. - [cinder] - Accessing the OpenStack block storage service. - [inspector] Accessing the OpenStack bare metal introspection service. - [service_catalog] - A special item to store the credential used by the bare metal service. The credential is used to discover the API URL endpoint registered in the OpenStack identity authentication service catalog by the bare metal service. - ``` - - For simplicity, you can use one service user for all services. For backward compatibility, the user name must be the same as that configured in [keystone_authtoken] of the ironic-api service. However, this is not mandatory. You can also create and configure a different service user for each service. - - In the following example, the authentication information for the user to access the OpenStack network service is configured as follows: - - ```shell - The network service is deployed in the identity authentication service domain named RegionOne. 
Only the public endpoint interface is registered in the service catalog. - - A specific CA SSL certificate is used for HTTPS connection when sending a request. - - The same service user as that configured for ironic-api. - - The dynamic password authentication plugin discovers a proper identity authentication service API version based on other options. - ``` - - ```shell - [neutron] - - # Authentication type to load (string value) - auth_type = password - # Authentication URL (string value) - auth_url=https://IDENTITY_IP:5000/ - # Username (string value) - username=ironic - # User's password (string value) - password=IRONIC_PASSWORD - # Project name to scope to (string value) - project_name=service - # Domain ID containing project (string value) - project_domain_id=default - # User's domain id (string value) - user_domain_id=default - # PEM encoded Certificate Authority to use when verifying - # HTTPs connections. (string value) - cafile=/opt/stack/data/ca-bundle.pem - # The default region_name for endpoint URL discovery. (string - # value) - region_name = RegionOne - # List of interfaces, in order of preference, for endpoint - # URL. (list value) - valid_interfaces=public - ``` - - By default, to communicate with other services, the bare metal service attempts to discover a proper endpoint of the service through the service catalog of the identity authentication service. If you want to use a different endpoint for a specific service, specify the endpoint_override option in the bare metal service configuration file. - - ```shell - [neutron] ... endpoint_override = - ``` - - 5. Configure the allowed drivers and hardware types. 
- - Set enabled_hardware_types to specify the hardware types that can be used by ironic-conductor: - - ```shell - [DEFAULT] enabled_hardware_types = ipmi - ``` - - Configure hardware interfaces: - - ```shell - enabled_boot_interfaces = pxe enabled_deploy_interfaces = direct,iscsi enabled_inspect_interfaces = inspector enabled_management_interfaces = ipmitool enabled_power_interfaces = ipmitool - ``` - - Configure the default value of the interface: - - ```shell - [DEFAULT] default_deploy_interface = direct default_network_interface = neutron - ``` - - If any driver that uses Direct Deploy is enabled, you must install and configure the Swift backend of the image service. The Ceph object gateway (RADOS gateway) can also be used as the backend of the image service. - - 6. Restart the ironic-conductor service: - - ```shell - sudo systemctl restart openstack-ironic-conductor - ``` - -5. Configure the ironic-inspector service. - - Configuration file path: **/etc/ironic-inspector/inspector.conf**. - - 1. Create the database: - - ```shell - # mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE ironic_inspector CHARACTER SET utf8; - - MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic_inspector.* TO 'ironic_inspector'@'localhost' \ IDENTIFIED BY 'IRONIC_INSPECTOR_DBPASSWORD'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic_inspector.* TO 'ironic_inspector'@'%' \ - IDENTIFIED BY 'IRONIC_INSPECTOR_DBPASSWORD'; - ``` - - 2. Use **connection** to configure the location of the database as follows. 
Replace **IRONIC_INSPECTOR_DBPASSWORD** with the password of user **ironic_inspector** and replace **DB_IP** with the IP address of the database server: - - ```shell - [database] - backend = sqlalchemy - connection = mysql+pymysql://ironic_inspector:IRONIC_INSPECTOR_DBPASSWORD@DB_IP/ironic_inspector - min_pool_size = 100 - max_pool_size = 500 - pool_timeout = 30 - max_retries = 5 - max_overflow = 200 - db_retry_interval = 2 - db_inc_retry_interval = True - db_max_retry_interval = 2 - db_max_retries = 5 - ``` - - 3. Configure the communication address of the message queue: - - ```shell - [DEFAULT] - transport_url = rabbit://RPC_USER:RPC_PASSWORD@RPC_HOST:RPC_PORT/ - ``` - - 4. Configure the Keystone authentication: - - ```shell - [DEFAULT] - - auth_strategy = keystone - timeout = 900 - rootwrap_config = /etc/ironic-inspector/rootwrap.conf - logging_context_format_string = %(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [%(global_request_id)s %(request_id)s %(user_identity)s] %(instance)s%(message)s - log_dir = /var/log/ironic-inspector - state_path = /var/lib/ironic-inspector - use_stderr = False - - [ironic] - api_endpoint = http://IRONIC_API_HOST_ADDRESS:6385 - auth_type = password - auth_url = http://PUBLIC_IDENTITY_IP:5000 - auth_strategy = keystone - ironic_url = http://IRONIC_API_HOST_ADDRESS:6385 - os_region = RegionOne - project_name = service - project_domain_name = Default - user_domain_name = Default - username = IRONIC_SERVICE_USER_NAME - password = IRONIC_SERVICE_USER_PASSWORD - - [keystone_authtoken] - auth_type = password - auth_url = http://control:5000 - www_authenticate_uri = http://control:5000 - project_domain_name = default - user_domain_name = default - project_name = service - username = ironic_inspector - password = IRONICPASSWD - region_name = RegionOne - memcache_servers = control:11211 - token_cache_time = 300 - - [processing] - add_ports = active - processing_hooks =
$default_processing_hooks,local_link_connection,lldp_basic - ramdisk_logs_dir = /var/log/ironic-inspector/ramdisk - always_store_ramdisk_logs = true - store_data = none - power_off = false - - [pxe_filter] - driver = iptables - - [capabilities] - boot_mode=True - ``` - - 5. Configure the ironic-inspector dnsmasq service: - - ```shell - #Configuration file path: /etc/ironic-inspector/dnsmasq.conf - port=0 - interface=enp3s0 #Replace with the actual listening network interface. - dhcp-range=172.20.19.100,172.20.19.110 #Replace with the actual DHCP IP address range. - bind-interfaces - enable-tftp - - dhcp-match=set:efi,option:client-arch,7 - dhcp-match=set:efi,option:client-arch,9 - dhcp-match=set:aarch64,option:client-arch,11 - dhcp-boot=tag:aarch64,grubaa64.efi - dhcp-boot=tag:!aarch64,tag:efi,grubx64.efi - dhcp-boot=tag:!aarch64,tag:!efi,pxelinux.0 - - tftp-root=/tftpboot #Replace with the actual tftpboot directory. - log-facility=/var/log/dnsmasq.log - ``` - - 6. Disable DHCP for the subnet of the ironic provision network. - - ``` - openstack subnet set --no-dhcp 72426e89-f552-4dc4-9ac7-c4e131ce7f3c - ``` - - 7. Initialize the database of the ironic-inspector service. - - Run the following command on the controller node: - - ``` - ironic-inspector-dbsync --config-file /etc/ironic-inspector/inspector.conf upgrade - ``` - - 8. Start the services: - - ```shell - systemctl enable --now openstack-ironic-inspector.service - systemctl enable --now openstack-ironic-inspector-dnsmasq.service - ``` - -6. Configure the httpd service. - - 1. Create the root directory of the httpd used by Ironic, and set the owner and owner group. The directory path must be the same as the path specified by the **http_root** configuration item in the **[deploy]** group in **/etc/ironic/ironic.conf**. - - ``` - mkdir -p /var/lib/ironic/httproot - chown ironic:ironic /var/lib/ironic/httproot - ``` - - 2. Install and configure the httpd service. - - 1. Install the httpd service.
If the httpd service is already installed, skip this step. - - ``` - yum install httpd -y - ``` - - - - 2. Create the **/etc/httpd/conf.d/openstack-ironic-httpd.conf** file. The file content is as follows: - - ``` - Listen 8080 - - - ServerName ironic.openeuler.com - - ErrorLog "/var/log/httpd/openstack-ironic-httpd-error_log" - CustomLog "/var/log/httpd/openstack-ironic-httpd-access_log" "%h %l %u %t \"%r\" %>s %b" - - DocumentRoot "/var/lib/ironic/httproot" - - Options Indexes FollowSymLinks - Require all granted - - LogLevel warn - AddDefaultCharset UTF-8 - EnableSendfile on - - - ``` - - The listening port must be the same as the port specified by **http_url** in the **[deploy]** section of **/etc/ironic/ironic.conf**. - - 3. Restart the httpd service: - - ``` - systemctl restart httpd - ``` - - - -7. Create the deploy ramdisk image. - - The ramdisk image of Wallaby can be created using the ironic-python-agent service or disk-image-builder tool. You can also use the latest ironic-python-agent-builder provided by the community. You can also use other tools. - To use the Wallaby native tool, you need to install the corresponding software package. - - ```shell - yum install openstack-ironic-python-agent - or - yum install diskimage-builder - ``` - - For details, see the [official document](https://docs.openstack.org/ironic/queens/install/deploy-ramdisk.html). - - The following describes how to use the ironic-python-agent-builder to build the deploy image used by ironic. - - 1. Install ironic-python-agent-builder. - - - 1. Install the tool: - - ```shell - pip install ironic-python-agent-builder - ``` - - 2. Modify the python interpreter in the following files: - - ```shell - /usr/bin/yum /usr/libexec/urlgrabber-ext-down - ``` - - 3. Install the other necessary tools: - - ```shell - yum install git - ``` - - `DIB` depends on the `semanage` command. Therefore, check whether the `semanage --help` command is available before creating an image. 
If the system displays a message indicating that the command is unavailable, install the command: - - ```shell - # Check which package needs to be installed. - [root@localhost ~]# yum provides /usr/sbin/semanage - Loaded plug-in: fastestmirror - Loading mirror speeds from cached hostfile - * base: mirror.vcu.edu - * extras: mirror.vcu.edu - * updates: mirror.math.princeton.edu - policycoreutils-python-2.5-34.el7.aarch64 : SELinux policy core python utilities - Source: base - Matching source: - File name: /usr/sbin/semanage - # Install. - [root@localhost ~]# yum install policycoreutils-python - ``` - - 2. Create the image. - - For `arm` architecture, add the following information: - ```shell - export ARCH=aarch64 - ``` - - Basic usage: - - ```shell - usage: ironic-python-agent-builder [-h] [-r RELEASE] [-o OUTPUT] [-e ELEMENT] - [-b BRANCH] [-v] [--extra-args EXTRA_ARGS] - distribution - - positional arguments: - distribution Distribution to use - - optional arguments: - -h, --help show this help message and exit - -r RELEASE, --release RELEASE - Distribution release to use - -o OUTPUT, --output OUTPUT - Output base file name - -e ELEMENT, --element ELEMENT - Additional DIB element to use - -b BRANCH, --branch BRANCH - If set, override the branch that is used for ironic- - python-agent and requirements - -v, --verbose Enable verbose logging in diskimage-builder - --extra-args EXTRA_ARGS - Extra arguments to pass to diskimage-builder - ``` - - Example: - - ```shell - ironic-python-agent-builder centos -o /mnt/ironic-agent-ssh -b origin/stable/rocky - ``` - - 3. Allow SSH login. - - Initialize the environment variables and create the image: - - ```shell - export DIB_DEV_USER_USERNAME=ipa \ - export DIB_DEV_USER_PWDLESS_SUDO=yes \ - export DIB_DEV_USER_PASSWORD='123' - ironic-python-agent-builder centos -o /mnt/ironic-agent-ssh -b origin/stable/rocky -e selinux-permissive -e devuser - ``` - - 4. Specify the code repository. 
- - Initialize the corresponding environment variables and create the image: - - ```shell - # Specify the address and version of the repository. - DIB_REPOLOCATION_ironic_python_agent=git@172.20.2.149:liuzz/ironic-python-agent.git - DIB_REPOREF_ironic_python_agent=origin/develop - - # Clone code from Gerrit. - DIB_REPOLOCATION_ironic_python_agent=https://review.opendev.org/openstack/ironic-python-agent - DIB_REPOREF_ironic_python_agent=refs/changes/43/701043/1 - ``` - - Reference: [source-repositories](https://docs.openstack.org/diskimage-builder/latest/elements/source-repositories/README.html). - - Verify that the specified repository address and version are correct before building. - - 5. Note - -The PXE configuration file template of native OpenStack does not support the ARM64 architecture. You need to modify the native OpenStack code. - -In Wallaby, Ironic provided by the community does not support booting from ARM 64-bit UEFI PXE. As a result, the format of the generated grub.cfg file (generally in /tftpboot/) is incorrect, causing the PXE boot to fail. - -In the ARM architecture, the commands for loading the vmlinux and ramdisk images are **linux** and **initrd**, respectively, but the incorrectly generated grub.cfg contains the x86 UEFI PXE startup commands instead. - -You need to modify the code logic for generating the grub.cfg file. - -A TLS error is reported when Ironic sends a request to IPA to query the command execution status. By default, both IPA and Ironic of Wallaby have TLS authentication enabled when sending requests to each other. Disable TLS authentication according to the description on the official website: - -1.
Add **ipa-insecure=1** to the following configuration in the Ironic configuration file (**/etc/ironic/ironic.conf**): - -``` -[agent] -verify_ca = False - -[pxe] -pxe_append_params = nofb nomodeset vga=normal coreos.autologin ipa-insecure=1 -``` - -2. Add the IPA configuration file **/etc/ironic_python_agent/ironic_python_agent.conf** to the ramdisk image and configure TLS as follows: - -**/etc/ironic_python_agent/ironic_python_agent.conf** (The **/etc/ironic_python_agent** directory must be created in advance.) - -``` -[DEFAULT] -enable_auto_tls = False -``` - -Set the permission: - -``` -chown -R ipa:ipa /etc/ironic_python_agent/ -``` - -3. Modify the startup file of the IPA service to add the configuration file option: - - vim /usr/lib/systemd/system/ironic-python-agent.service - - ``` - [Unit] - Description=Ironic Python Agent - After=network-online.target - - [Service] - ExecStartPre=/sbin/modprobe vfat - ExecStart=/usr/local/bin/ironic-python-agent --config-file /etc/ironic_python_agent/ironic_python_agent.conf - Restart=always - RestartSec=30s - - [Install] - WantedBy=multi-user.target - ``` - -### Installing Kolla - -Kolla provides production-ready, container-based deployment of OpenStack services. Kolla and kolla-ansible were introduced in openEuler 22.03 LTS. - -Installing Kolla is simple. You only need to install the corresponding RPM packages: - -``` -yum install openstack-kolla openstack-kolla-ansible -``` - -After the installation is complete, you can run commands such as `kolla-ansible`, `kolla-build`, `kolla-genpwd`, and `kolla-mergepwd`. - -### Installing Trove - -Trove is the database service of OpenStack. If you need the database service provided by OpenStack, Trove is recommended. Otherwise, you can choose not to install it. - -1. Set the database. - - The database service stores information in the database.
Create a **trove** database that can be accessed by the **trove** user, replacing **TROVE_DBPASSWORD** with a proper password. - - ```sql - mysql -u root -p - - MariaDB [(none)]> CREATE DATABASE trove CHARACTER SET utf8; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON trove.* TO 'trove'@'localhost' \ - IDENTIFIED BY 'TROVE_DBPASSWORD'; - MariaDB [(none)]> GRANT ALL PRIVILEGES ON trove.* TO 'trove'@'%' \ - IDENTIFIED BY 'TROVE_DBPASSWORD'; - ``` - -2. Create service user authentication. - - 1. Create the **Trove** service user. - - ```shell - openstack user create --password TROVE_PASSWORD \ - --email trove@example.com trove - openstack role add --project service --user trove admin - openstack service create --name trove \ - --description "Database service" database - ``` - **Description:** Replace `TROVE_PASSWORD` with the password of the `trove` user. - - 2. Create the **Database** service access entries. - - ```shell - openstack endpoint create --region RegionOne database public http://controller:8779/v1.0/%\(tenant_id\)s - openstack endpoint create --region RegionOne database internal http://controller:8779/v1.0/%\(tenant_id\)s - openstack endpoint create --region RegionOne database admin http://controller:8779/v1.0/%\(tenant_id\)s - ``` - -3. Install and configure the **Trove** components. - 1. Install the **Trove** package: - ```shell script - yum install openstack-trove python-troveclient - ``` - 2.
Configure `trove.conf`: - ```shell script - vim /etc/trove/trove.conf - - [DEFAULT] - bind_host=TROVE_NODE_IP - log_dir = /var/log/trove - network_driver = trove.network.neutron.NeutronDriver - management_security_groups = - nova_keypair = trove-mgmt - default_datastore = mysql - taskmanager_manager = trove.taskmanager.manager.Manager - trove_api_workers = 5 - transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/ - reboot_time_out = 300 - usage_timeout = 900 - agent_call_high_timeout = 1200 - use_syslog = False - debug = True - - # Set these if using Neutron Networking - network_driver=trove.network.neutron.NeutronDriver - network_label_regex=.* - - - transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/ - - [database] - connection = mysql+pymysql://trove:TROVE_DBPASS@controller/trove - - [keystone_authtoken] - project_domain_name = Default - project_name = service - user_domain_name = Default - password = trove - username = trove - auth_url = http://controller:5000/v3/ - auth_type = password - - [service_credentials] - auth_url = http://controller:5000/v3/ - region_name = RegionOne - project_name = service - password = trove - project_domain_name = Default - user_domain_name = Default - username = trove - - [mariadb] - tcp_ports = 3306,4444,4567,4568 - - [mysql] - tcp_ports = 3306 - - [postgresql] - tcp_ports = 5432 - ``` - **Description:** - - In the `[Default]` section, set `bind_host` to the IP address of the node where Trove is deployed. - - `nova_compute_url` and `cinder_url` are endpoints created by Nova and Cinder in Keystone. - - `nova_proxy_XXX` is a user who can access the Nova service. In the preceding example, the `admin` user is used. - - `transport_url` is the `RabbitMQ` connection information, and `RABBIT_PASS` is the RabbitMQ password. - - In the `[database]` section, `connection` is the information of the database created for Trove in MySQL. 
- - Replace `TROVE_PASS` in the Trove user information with the password of the **trove** user. - - 3. Configure `trove-guestagent.conf`: - ```shell script - vim /etc/trove/trove-guestagent.conf - - [DEFAULT] - log_file = trove-guestagent.log - log_dir = /var/log/trove/ - ignore_users = os_admin - control_exchange = trove - transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/ - rpc_backend = rabbit - command_process_timeout = 60 - use_syslog = False - debug = True - - [service_credentials] - auth_url = http://controller:5000/v3/ - region_name = RegionOne - project_name = service - password = TROVE_PASS - project_domain_name = Default - user_domain_name = Default - username = trove - - [mysql] - docker_image = your-registry/your-repo/mysql - backup_docker_image = your-registry/your-repo/db-backup-mysql:1.1.0 - ``` - **Description:** `guestagent` is an independent component in Trove and needs to be pre-built into the virtual machine image created by Trove using Nova. - After the database instance is created, the guestagent process is started to report heartbeat messages to Trove through the message queue (RabbitMQ). - Therefore, you need to configure the user name and password of RabbitMQ. - **Since Victoria, Trove uses a unified image to run different types of databases. The database service runs in the Docker container of the guest VM.** - - `transport_url` is the `RabbitMQ` connection information, and `RABBIT_PASS` is the RabbitMQ password. - - Replace `TROVE_PASS` in the Trove user information with the password of the **trove** user. - - 4. Generate the `Trove` database table. - ```shell script - su -s /bin/sh -c "trove-manage db_sync" trove - ``` -4. Complete the installation and configuration. - 1. Configure the **Trove** service to automatically start: - ```shell script - systemctl enable openstack-trove-api.service \ - openstack-trove-taskmanager.service \ - openstack-trove-conductor.service - ``` - 2.
Start the service: - ```shell script - systemctl start openstack-trove-api.service \ - openstack-trove-taskmanager.service \ - openstack-trove-conductor.service - ``` -### Installing Swift - -Swift provides a scalable and highly available distributed object storage service, which is suitable for storing unstructured data in large scale. - -1. Create the service credentials and API endpoints. - - Create the service credential: - - ``` shell - #Create the swift user. - openstack user create --domain default --password-prompt swift - #Add the admin role for the swift user. - openstack role add --project service --user swift admin - #Create the swift service entity. - openstack service create --name swift --description "OpenStack Object Storage" object-store - ``` - - Create the Swift API endpoints. - - ```shell - openstack endpoint create --region RegionOne object-store public http://controller:8080/v1/AUTH_%\(project_id\)s - openstack endpoint create --region RegionOne object-store internal http://controller:8080/v1/AUTH_%\(project_id\)s - openstack endpoint create --region RegionOne object-store admin http://controller:8080/v1 - ``` - - -2. Install the software packages: - - ```shell - yum install openstack-swift-proxy python3-swiftclient python3-keystoneclient python3-keystonemiddleware memcached (CTL) - ``` - -3. Configure the proxy-server. - - The Swift RPM package contains a **proxy-server.conf** file which is basically ready to use. You only need to change the values of **ip** and swift **password** in the file. - - ***Note*** - - **Replace password with the password you set for the swift user in the identity service.** - -4. Install and configure the storage node. 
(STG) - - Install the supported program packages: - ```shell - yum install xfsprogs rsync - ``` - - Format the /dev/vdb and /dev/vdc devices into XFS: - - ```shell - mkfs.xfs /dev/vdb - mkfs.xfs /dev/vdc - ``` - - Create the mount point directory structure: - - ```shell - mkdir -p /srv/node/vdb - mkdir -p /srv/node/vdc - ``` - - Find the UUID of the new partition: - - ```shell - blkid - ``` - - Add the following to the **/etc/fstab** file: - - ```shell - UUID="" /srv/node/vdb xfs noatime 0 2 - UUID="" /srv/node/vdc xfs noatime 0 2 - ``` - - Mount the devices: - - ```shell - mount /srv/node/vdb - mount /srv/node/vdc - ``` - ***Note*** - - **If the disaster recovery function is not required, you only need to create one device and skip the following rsync configuration.** - - (Optional) Create or edit the **/etc/rsyncd.conf** file to include the following content: - - ```shell - [DEFAULT] - uid = swift - gid = swift - log file = /var/log/rsyncd.log - pid file = /var/run/rsyncd.pid - address = MANAGEMENT_INTERFACE_IP_ADDRESS - - [account] - max connections = 2 - path = /srv/node/ - read only = False - lock file = /var/lock/account.lock - - [container] - max connections = 2 - path = /srv/node/ - read only = False - lock file = /var/lock/container.lock - - [object] - max connections = 2 - path = /srv/node/ - read only = False - lock file = /var/lock/object.lock - ``` - **Replace `MANAGEMENT_INTERFACE_IP_ADDRESS` with the management network IP address of the storage node.** - - Start the rsyncd service and configure it to start upon system startup. - - ```shell - systemctl enable rsyncd.service - systemctl start rsyncd.service - ``` - -5. Install and configure the components on storage nodes. 
(STG) - - Install the software packages: - - ```shell - yum install openstack-swift-account openstack-swift-container openstack-swift-object - ``` - - Edit **account-server.conf**, **container-server.conf**, and **object-server.conf** in the **/etc/swift directory** and replace **bind_ip** with the management network IP address of the storage node. - - Ensure the proper ownership of the mount point directory structure. - - ```shell - chown -R swift:swift /srv/node - ``` - - Create the recon directory and ensure that it has the correct ownership. - - ```shell - mkdir -p /var/cache/swift - chown -R root:swift /var/cache/swift - chmod -R 775 /var/cache/swift - ``` - -6. Create the account ring. (CTL) - - Switch to the `/etc/swift` directory: - - ```shell - cd /etc/swift - ``` - - Create the basic `account.builder` file: - - ```shell - swift-ring-builder account.builder create 10 1 1 - ``` - - Add each storage node to the ring: - - ```shell - swift-ring-builder account.builder add --region 1 --zone 1 --ip STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS --port 6202 --device DEVICE_NAME --weight DEVICE_WEIGHT - ``` - - **Replace `STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS` with the management network IP address of the storage node. Replace `DEVICE_NAME` with the name of the storage device on the same storage node.** - - ***Note*** - **Repeat this command to each storage device on each storage node.** - - Verify the ring contents: - - ```shell - swift-ring-builder account.builder - ``` - - Rebalance the ring: - - ```shell - swift-ring-builder account.builder rebalance - ``` - -7. Create the container ring. 
(CTL) - - Switch to the `/etc/swift` directory: - - Create the basic `container.builder` file: - - ```shell - swift-ring-builder container.builder create 10 1 1 - ``` - - Add each storage node to the ring: - - ```shell - swift-ring-builder container.builder \ - add --region 1 --zone 1 --ip STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS --port 6201 \ - --device DEVICE_NAME --weight 100 - - ``` - - **Replace `STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS` with the management network IP address of the storage node. Replace `DEVICE_NAME` with the name of the storage device on the same storage node.** - - ***Note*** - **Repeat this command for every storage device on every storage node.** - - Verify the ring contents: - - ```shell - swift-ring-builder container.builder - ``` - - Rebalance the ring: - - ```shell - swift-ring-builder container.builder rebalance - ``` - -8. Create the object ring. (CTL) - - Switch to the `/etc/swift` directory: - - Create the basic `object.builder` file: - - ```shell - swift-ring-builder object.builder create 10 1 1 - ``` - - Add each storage node to the ring: - - ```shell - swift-ring-builder object.builder \ - add --region 1 --zone 1 --ip STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS --port 6200 \ - --device DEVICE_NAME --weight 100 - ``` - - **Replace `STORAGE_NODE_MANAGEMENT_INTERFACE_IP_ADDRESS` with the management network IP address of the storage node. Replace `DEVICE_NAME` with the name of the storage device on the same storage node.** - - ***Note*** - **Repeat this command for every storage device on every storage node.** - - Verify the ring contents: - - ```shell - swift-ring-builder object.builder - ``` - - Rebalance the ring: - - ```shell - swift-ring-builder object.builder rebalance - ``` - - Distribute ring configuration files: - - Copy `account.ring.gz`, `container.ring.gz`, and `object.ring.gz` to the `/etc/swift` directory on each storage node and any additional nodes running the proxy service. - - - -9.
Complete the installation. - - Edit the `/etc/swift/swift.conf` file: - - ``` shell - [swift-hash] - swift_hash_path_suffix = test-hash - swift_hash_path_prefix = test-hash - - [storage-policy:0] - name = Policy-0 - default = yes - ``` - - **Replace test-hash with a unique value.** - - Copy the `swift.conf` file to the `/etc/swift` directory on each storage node and any additional nodes running the proxy service. - - Ensure correct ownership of the configuration directory on all nodes: - - ```shell - chown -R root:swift /etc/swift - ``` - - On the controller node and any additional nodes running the proxy service, start the object storage proxy service and its dependencies, and configure them to start upon system startup. - - ```shell - systemctl enable openstack-swift-proxy.service memcached.service - systemctl start openstack-swift-proxy.service memcached.service - ``` - - On the storage node, start the object storage services and configure them to start upon system startup. - - ```shell - systemctl enable openstack-swift-account.service openstack-swift-account-auditor.service openstack-swift-account-reaper.service openstack-swift-account-replicator.service - - systemctl start openstack-swift-account.service openstack-swift-account-auditor.service openstack-swift-account-reaper.service openstack-swift-account-replicator.service - - systemctl enable openstack-swift-container.service openstack-swift-container-auditor.service openstack-swift-container-replicator.service openstack-swift-container-updater.service - - systemctl start openstack-swift-container.service openstack-swift-container-auditor.service openstack-swift-container-replicator.service openstack-swift-container-updater.service - - systemctl enable openstack-swift-object.service openstack-swift-object-auditor.service openstack-swift-object-replicator.service openstack-swift-object-updater.service - - systemctl start openstack-swift-object.service openstack-swift-object-auditor.service 
openstack-swift-object-replicator.service openstack-swift-object-updater.service - ``` - -### Installing Cyborg - -Cyborg provides acceleration device support for OpenStack, for example, GPUs, FPGAs, ASICs, NPs, SoCs, NVMe/NOF SSDs, ODPs, DPDKs, and SPDKs. - -1. Initialize the databases. - -``` -CREATE DATABASE cyborg; -GRANT ALL PRIVILEGES ON cyborg.* TO 'cyborg'@'localhost' IDENTIFIED BY 'CYBORG_DBPASS'; -GRANT ALL PRIVILEGES ON cyborg.* TO 'cyborg'@'%' IDENTIFIED BY 'CYBORG_DBPASS'; -``` - -2. Create Keystone resource objects. - -``` -$ openstack user create --domain default --password-prompt cyborg -$ openstack role add --project service --user cyborg admin -$ openstack service create --name cyborg --description "Acceleration Service" accelerator - -$ openstack endpoint create --region RegionOne \ - accelerator public http://:6666/v1 -$ openstack endpoint create --region RegionOne \ - accelerator internal http://:6666/v1 -$ openstack endpoint create --region RegionOne \ - accelerator admin http://:6666/v1 -``` - -3. Install Cyborg - -``` -yum install openstack-cyborg -``` - -4. Configure Cyborg - -Modify **/etc/cyborg/cyborg.conf**. 
- -``` -[DEFAULT] -transport_url = rabbit://%RABBITMQ_USER%:%RABBITMQ_PASSWORD%@%OPENSTACK_HOST_IP%:5672/ -use_syslog = False -state_path = /var/lib/cyborg -debug = True - -[database] -connection = mysql+pymysql://%DATABASE_USER%:%DATABASE_PASSWORD%@%OPENSTACK_HOST_IP%/cyborg - -[service_catalog] -project_domain_id = default -user_domain_id = default -project_name = service -password = PASSWORD -username = cyborg -auth_url = http://%OPENSTACK_HOST_IP%/identity -auth_type = password - -[placement] -project_domain_name = Default -project_name = service -user_domain_name = Default -password = PASSWORD -username = placement -auth_url = http://%OPENSTACK_HOST_IP%/identity -auth_type = password - -[keystone_authtoken] -memcached_servers = localhost:11211 -project_domain_name = Default -project_name = service -user_domain_name = Default -password = PASSWORD -username = cyborg -auth_url = http://%OPENSTACK_HOST_IP%/identity -auth_type = password -``` - -Set the user names, passwords, and IP addresses as required. - -5. Synchronize the database table. - -``` -cyborg-dbsync --config-file /etc/cyborg/cyborg.conf upgrade -``` - -6. Start the Cyborg services. - -``` -systemctl enable openstack-cyborg-api openstack-cyborg-conductor openstack-cyborg-agent -systemctl start openstack-cyborg-api openstack-cyborg-conductor openstack-cyborg-agent -``` - -### Installing Aodh - -1. Create the database. - -``` -CREATE DATABASE aodh; - -GRANT ALL PRIVILEGES ON aodh.* TO 'aodh'@'localhost' IDENTIFIED BY 'AODH_DBPASS'; - -GRANT ALL PRIVILEGES ON aodh.* TO 'aodh'@'%' IDENTIFIED BY 'AODH_DBPASS'; -``` - -2. Create Keystone resource objects.
- -``` -openstack user create --domain default --password-prompt aodh - -openstack role add --project service --user aodh admin - -openstack service create --name aodh --description "Telemetry" alarming - -openstack endpoint create --region RegionOne alarming public http://controller:8042 - -openstack endpoint create --region RegionOne alarming internal http://controller:8042 - -openstack endpoint create --region RegionOne alarming admin http://controller:8042 -``` - -3. Install Aodh. - -``` -yum install openstack-aodh-api openstack-aodh-evaluator openstack-aodh-notifier openstack-aodh-listener openstack-aodh-expirer python3-aodhclient -``` - -4. Modify the configuration file. - -``` -[database] -connection = mysql+pymysql://aodh:AODH_DBPASS@controller/aodh - -[DEFAULT] -transport_url = rabbit://openstack:RABBIT_PASS@controller -auth_strategy = keystone - -[keystone_authtoken] -www_authenticate_uri = http://controller:5000 -auth_url = http://controller:5000 -memcached_servers = controller:11211 -auth_type = password -project_domain_id = default -user_domain_id = default -project_name = service -username = aodh -password = AODH_PASS - -[service_credentials] -auth_type = password -auth_url = http://controller:5000/v3 -project_domain_id = default -user_domain_id = default -project_name = service -username = aodh -password = AODH_PASS -interface = internalURL -region_name = RegionOne -``` - -5. Initialize the database. - -``` -aodh-dbsync -``` - -6. Start the Aodh services. - -``` -systemctl enable openstack-aodh-api.service openstack-aodh-evaluator.service openstack-aodh-notifier.service openstack-aodh-listener.service - -systemctl start openstack-aodh-api.service openstack-aodh-evaluator.service openstack-aodh-notifier.service openstack-aodh-listener.service -``` - -### Installing Gnocchi - -1. Create the database. 
- -``` -CREATE DATABASE gnocchi; - -GRANT ALL PRIVILEGES ON gnocchi.* TO 'gnocchi'@'localhost' IDENTIFIED BY 'GNOCCHI_DBPASS'; - -GRANT ALL PRIVILEGES ON gnocchi.* TO 'gnocchi'@'%' IDENTIFIED BY 'GNOCCHI_DBPASS'; -``` - -2. Create Keystone resource objects. - -``` -openstack user create --domain default --password-prompt gnocchi - -openstack role add --project service --user gnocchi admin - -openstack service create --name gnocchi --description "Metric Service" metric - -openstack endpoint create --region RegionOne metric public http://controller:8041 - -openstack endpoint create --region RegionOne metric internal http://controller:8041 - -openstack endpoint create --region RegionOne metric admin http://controller:8041 -``` - -3. Install Gnocchi. - -``` -yum install openstack-gnocchi-api openstack-gnocchi-metricd python3-gnocchiclient -``` - -4. Modify the **/etc/gnocchi/gnocchi.conf** configuration file. - -``` -[api] -auth_mode = keystone -port = 8041 -uwsgi_mode = http-socket - -[keystone_authtoken] -auth_type = password -auth_url = http://controller:5000/v3 -project_domain_name = Default -user_domain_name = Default -project_name = service -username = gnocchi -password = GNOCCHI_PASS -interface = internalURL -region_name = RegionOne - -[indexer] -url = mysql+pymysql://gnocchi:GNOCCHI_DBPASS@controller/gnocchi - -[storage] -# coordination_url is not required but specifying one will improve -# performance with better workload division across workers. -coordination_url = redis://controller:6379 -file_basepath = /var/lib/gnocchi -driver = file -``` - -5. Initialize the database. - -``` -gnocchi-upgrade -``` - -6. Start the Gnocchi services. - -``` -systemctl enable openstack-gnocchi-api.service openstack-gnocchi-metricd.service - -systemctl start openstack-gnocchi-api.service openstack-gnocchi-metricd.service -``` - -### Installing Ceilometer - -1. Create Keystone resource objects.
- -``` -openstack user create --domain default --password-prompt ceilometer - -openstack role add --project service --user ceilometer admin - -openstack service create --name ceilometer --description "Telemetry" metering -``` - -2. Install Ceilometer. - -``` -yum install openstack-ceilometer-notification openstack-ceilometer-central -``` - -3. Modify the **/etc/ceilometer/pipeline.yaml** configuration file. - -``` -publishers: - # set address of Gnocchi - # + filter out Gnocchi-related activity meters (Swift driver) - # + set default archive policy - - gnocchi://?filter_project=service&archive_policy=low -``` - -4. Modify the **/etc/ceilometer/ceilometer.conf** configuration file. - -``` -[DEFAULT] -transport_url = rabbit://openstack:RABBIT_PASS@controller - -[service_credentials] -auth_type = password -auth_url = http://controller:5000/v3 -project_domain_id = default -user_domain_id = default -project_name = service -username = ceilometer -password = CEILOMETER_PASS -interface = internalURL -region_name = RegionOne -``` - -5. Initialize the database. - -``` -ceilometer-upgrade -``` - -6. Start the Ceilometer services. - -``` -systemctl enable openstack-ceilometer-notification.service openstack-ceilometer-central.service - -systemctl start openstack-ceilometer-notification.service openstack-ceilometer-central.service -``` - -### Installing Heat - -1. Create the **heat** database and grant proper privileges to it. Replace **HEAT_DBPASS** with a proper password. - -``` -CREATE DATABASE heat; -GRANT ALL PRIVILEGES ON heat.* TO 'heat'@'localhost' IDENTIFIED BY 'HEAT_DBPASS'; -GRANT ALL PRIVILEGES ON heat.* TO 'heat'@'%' IDENTIFIED BY 'HEAT_DBPASS'; -``` - -2. Create a service credential. Create the **heat** user and add the **admin** role to it. - -``` -openstack user create --domain default --password-prompt heat -openstack role add --project service --user heat admin -``` - -3. Create the **heat** and **heat-cfn** services and their API endpoints.
- -``` -openstack service create --name heat --description "Orchestration" orchestration -openstack service create --name heat-cfn --description "Orchestration" cloudformation -openstack endpoint create --region RegionOne orchestration public http://controller:8004/v1/%\(tenant_id\)s -openstack endpoint create --region RegionOne orchestration internal http://controller:8004/v1/%\(tenant_id\)s -openstack endpoint create --region RegionOne orchestration admin http://controller:8004/v1/%\(tenant_id\)s -openstack endpoint create --region RegionOne cloudformation public http://controller:8000/v1 -openstack endpoint create --region RegionOne cloudformation internal http://controller:8000/v1 -openstack endpoint create --region RegionOne cloudformation admin http://controller:8000/v1 -``` - -4. Create additional OpenStack management information, including the **heat** domain and its administrator **heat_domain_admin**, the **heat_stack_owner** role, and the **heat_stack_user** role. - -``` -openstack user create --domain heat --password-prompt heat_domain_admin -openstack role add --domain heat --user-domain heat --user heat_domain_admin admin -openstack role create heat_stack_owner -openstack role create heat_stack_user -``` - -5. Install the software packages. - -``` -yum install openstack-heat-api openstack-heat-api-cfn openstack-heat-engine -``` - -6. Modify the configuration file **/etc/heat/heat.conf**. 
- -``` -[DEFAULT] -transport_url = rabbit://openstack:RABBIT_PASS@controller -heat_metadata_server_url = http://controller:8000 -heat_waitcondition_server_url = http://controller:8000/v1/waitcondition -stack_domain_admin = heat_domain_admin -stack_domain_admin_password = HEAT_DOMAIN_PASS -stack_user_domain_name = heat - -[database] -connection = mysql+pymysql://heat:HEAT_DBPASS@controller/heat - -[keystone_authtoken] -www_authenticate_uri = http://controller:5000 -auth_url = http://controller:5000 -memcached_servers = controller:11211 -auth_type = password -project_domain_name = default -user_domain_name = default -project_name = service -username = heat -password = HEAT_PASS - -[trustee] -auth_type = password -auth_url = http://controller:5000 -username = heat -password = HEAT_PASS -user_domain_name = default - -[clients_keystone] -auth_uri = http://controller:5000 -``` - -7. Initialize the **heat** database table. - -``` -su -s /bin/sh -c "heat-manage db_sync" heat -``` - -8. Start the services. - -``` -systemctl enable openstack-heat-api.service openstack-heat-api-cfn.service openstack-heat-engine.service -systemctl start openstack-heat-api.service openstack-heat-api-cfn.service openstack-heat-engine.service -``` - -## OpenStack Quick Installation - -The OpenStack SIG provides the Ansible script for one-click deployment of OpenStack in All in One or Distributed modes. Users can use the script to quickly deploy an OpenStack environment based on openEuler RPM packages. The following uses the All in One mode installation as an example. - -1. Install the OpenStack SIG Tool. - - ```shell - pip install openstack-sig-tool - ``` - -2. Configure the OpenStack Yum source. - - ```shell - yum install openstack-release-wallaby - ``` - - **Note**: Enable the EPOL repository for the Yum source if it is not enabled already. 
- - ```shell - cat >> /etc/yum.repos.d/openEuler.repo << EOF - - [EPOL] - name=EPOL - baseurl=http://repo.openeuler.org/openEuler-22.03-LTS/EPOL/main/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.03-LTS/OS/$basearch/RPM-GPG-KEY-openEuler - EOF - ``` - -3. Update the Ansible configurations. - - Open the **/usr/local/etc/inventory/all_in_one.yaml** file and modify the configuration based on the environment and requirements. Modify the file as follows: - - ```yaml - all: - hosts: - controller: - ansible_host: - ansible_ssh_private_key_file: - ansible_ssh_user: root - vars: - mysql_root_password: root - mysql_project_password: root - rabbitmq_password: root - project_identity_password: root - enabled_service: - - keystone - - neutron - - cinder - - placement - - nova - - glance - - horizon - - aodh - - ceilometer - - cyborg - - gnocchi - - kolla - - heat - - swift - - trove - - tempest - neutron_provider_interface_name: br-ex - default_ext_subnet_range: 10.100.100.0/24 - default_ext_subnet_gateway: 10.100.100.1 - neutron_dataplane_interface_name: eth1 - cinder_block_device: vdb - swift_storage_devices: - - vdc - swift_hash_path_suffix: ash - swift_hash_path_prefix: has - children: - compute: - hosts: controller - storage: - hosts: controller - network: - hosts: controller - vars: - test-key: test-value - dashboard: - hosts: controller - vars: - allowed_host: '*' - kolla: - hosts: controller - vars: - # We add openEuler OS support for kolla in OpenStack Queens/Rocky release - # Set this var to true if you want to use it in Q/R - openeuler_plugin: false - ``` - - Key Configurations - - | Item | Description| - |---|---| - | ansible_host | IP address of the all-in-one node.| - | ansible_ssh_private_key_file | Key used by the Ansible script for logging in to the all-in-one node.| - | ansible_ssh_user | User used by the Ansible script for logging in to the all-in-one node.| - | enabled_service | List of services to be installed.
You can delete services as required.| - | neutron_provider_interface_name | Neutron L3 bridge name. | - | default_ext_subnet_range | Neutron private network IP address range. | - | default_ext_subnet_gateway | Neutron private network gateway. | - | neutron_dataplane_interface_name | NIC used by Neutron. You are advised to use a new NIC to avoid conflicts with existing NICs causing disconnection of the all-in-one node. | - | cinder_block_device | Name of the block device used by Cinder.| - | swift_storage_devices | Name of the block device used by Swift. | - -4. Run the installation command. - - ```shell - oos env setup all_in_one - ``` - - After the command is executed, the OpenStack environment of the All in One mode is successfully deployed. - - The environment variable file **.admin-openrc** is stored in the home directory of the current user. - -5. Initialize the Tempest environment. - - If you want to perform the Tempest test in the environment, run the `oos env init all_in_one` command to create the OpenStack resources required by Tempest. - - After the command is executed successfully, a **mytest** directory is generated in the home directory of the user. You can run the `tempest run` command in the directory. \ No newline at end of file diff --git a/docs/en/docs/thirdparty_migration/installha.md b/docs/en/docs/thirdparty_migration/installha.md deleted file mode 100644 index bfa6283411cdde3800bafc9228da1a5618655b3c..0000000000000000000000000000000000000000 --- a/docs/en/docs/thirdparty_migration/installha.md +++ /dev/null @@ -1,201 +0,0 @@ -# Installing and Deploying an HA Cluster - -This section describes how to install and deploy an HA cluster. - -## Installation and Deployment - -### Preparing the Environment - -At least two physical machines or virtual machines (VMs) installed with openEuler 21.03 are required. This section uses two physical machines or VMs as an example. 
For details about how to install openEuler 21.03, see the [_openEuler Installation Guide_](../Installation/Installation.md). - -### Modifying the Host Name and the /etc/hosts File - -**Note**: You need to perform the following operations on both hosts. The following uses one host as an example. The IP address used in this section is for reference only. - -Before using the HA software, ensure that the host name has been changed and all host names have been written into the **/etc/hosts** file. - -1. Run the following command to change the host name: - - ```shell - hostnamectl set-hostname ha1 - ``` - -2. Edit the `/etc/hosts` file and add the following entries: - - ```text - 172.30.30.65 ha1 - 172.30.30.66 ha2 - ``` - -### Configuring the Yum Source - -After the system is successfully installed, the Yum source is configured by default. The configuration file is `/etc/yum.repos.d/openEuler.repo`. The HA software package uses the following sources: - -```conf -[OS] -name=OS -baseurl=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler - -[everything] -name=everything -baseurl=http://repo.openeuler.org/openEuler-23.09/everything/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/everything/$basearch/RPM-GPG-KEY-openEuler - -[EPOL] -name=EPOL -baseurl=http://repo.openeuler.org/openEuler-23.09/EPOL/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler -``` - -### Installing the Components of the HA Software Package - -```shell -yum install -y corosync pacemaker pcs fence-agents fence-virt corosync-qdevice sbd drbd drbd-utils -``` - -### Setting the **hacluster** User Password - -```shell -passwd hacluster -``` - -### Modifying the `/etc/corosync/corosync.conf` File - -```conf -totem { - version: 2 - cluster_name: hacluster - crypto_cipher: none -
crypto_hash: none -} -logging { - fileline: off - to_stderr: yes - to_logfile: yes - logfile: /var/log/cluster/corosync.log - to_syslog: yes - debug: on - logger_subsys { - subsys: QUORUM - debug: on - } -} -quorum { - provider: corosync_votequorum - expected_votes: 2 - two_node: 1 - } -nodelist { - node { - name: ha1 - nodeid: 1 - ring0_addr: 172.30.30.65 - } - node { - name: ha2 - nodeid: 2 - ring0_addr: 172.30.30.66 - } - } -``` - -### Managing Services - -#### Disabling the Firewall - -1. Run the following command to disable the firewall: - - ```shell - systemctl stop firewalld - ``` - -2. Change **SELinux** to **disabled** in the **`/etc/selinux/config`** file. - - ```text - SELINUX=disabled - ``` - -#### Managing the pcs Service - -1. Run the following command to start the pcs service: - - ```shell - systemctl start pcsd - ``` - -2. Run the following command to query the pcs service status: - - ```shell - systemctl status pcsd - ``` - - The service is started successfully if the following information is displayed: - - ![](./figures/HA-pcs.png) - -#### Managing the Pacemaker Service - -1. Run the following command to start the Pacemaker service: - - ```shell - systemctl start pacemaker - ``` - -2. Run the following command to query the Pacemaker service status: - - ```shell - systemctl status pacemaker - ``` - - The service is started successfully if the following information is displayed: - - ![](./figures/HA-pacemaker.png) - -#### Managing the Corosync Service - -1. Run the following command to start the Corosync service: - - ```shell - systemctl start corosync - ``` - -2. Run the following command to query the Corosync service status: - - ```shell - systemctl status corosync - ``` - - The service is started successfully if the following information is displayed: - - ![](./figures/HA-corosync.png) - -### Performing Node Authentication - -**Note**: Perform this operation on either node. 
- -```shell -pcs host auth ha1 ha2 -``` - -### Accessing the Front-End Management Platform - -After the preceding services are started, open a browser (Chrome or Firefox is recommended) and enter `https://localhost:2224` in the address bar. - -- The following figure shows the native management platform: - -![](./figures/HA-login.png) - -For details about how to install the management platform newly developed by the community, see `https://gitee.com/openeuler/ha-api/blob/master/docs/build.md`. - -- The following figure shows the management platform newly developed by the community: - -![](./figures/HA-api.png) - -For details about how to use the HA cluster and how to add an instance, see [HA Usage Example](../desktop/HA_use_cases.md). diff --git a/docs/en/docs/thirdparty_migration/thidrparty.md b/docs/en/docs/thirdparty_migration/thidrparty.md deleted file mode 100644 index 66f59126694b37d126c81238ab201744905d6b21..0000000000000000000000000000000000000000 --- a/docs/en/docs/thirdparty_migration/thidrparty.md +++ /dev/null @@ -1,3 +0,0 @@ -# Third-Party Software Porting Guide - -This document is intended for community developers, open source enthusiasts, and partners who use the openEuler OS and intend to learn more about third-party software. Basic knowledge about the Linux OS is required for reading this document. \ No newline at end of file diff --git a/docs/en/docs/userguide/overview.md b/docs/en/docs/userguide/overview.md deleted file mode 100644 index e3b656290f017e8688b1f831d00dd9ebeb86c576..0000000000000000000000000000000000000000 --- a/docs/en/docs/userguide/overview.md +++ /dev/null @@ -1,3 +0,0 @@ -# Toolset User Guide - -This document describes the toolkit used for the openEuler release, including the overview, installation, and usage of tools.
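The two-node `corosync.conf` shown in the HA deployment section above relies on `two_node: 1` in the `quorum` block, which is only valid for exactly two nodes. If a third node were added later, the quorum and nodelist blocks would need adjusting; a minimal sketch, where the `ha3` name and `172.30.30.67` address are placeholder values:

```conf
quorum {
    provider: corosync_votequorum
    # two_node: 1 removed: it applies only to two-node clusters
}
nodelist {
    node {
        name: ha1
        nodeid: 1
        ring0_addr: 172.30.30.65
    }
    node {
        name: ha2
        nodeid: 2
        ring0_addr: 172.30.30.66
    }
    node {
        name: ha3
        nodeid: 3
        ring0_addr: 172.30.30.67
    }
}
```

With a `nodelist` present, votequorum derives the expected vote count from the node entries, so `expected_votes` can be omitted. The new node also needs the same packages installed and an entry in `/etc/hosts` on all nodes.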
diff --git a/docs/en/menu/index.md b/docs/en/menu/index.md index 760c380d0c7d5eb71679ab360926cb859e4e68dd..b9274c4b19a2a273914421cbdb2c0e0501f5b56d 100644 --- a/docs/en/menu/index.md +++ b/docs/en/menu/index.md @@ -10,7 +10,7 @@ headless: true - [Key Features]({{< relref "./docs/Releasenotes/key-features.md" >}}) - [Known Issues]({{< relref "./docs/Releasenotes/known-issues.md" >}}) - [Resolved Issues]({{< relref "./docs/Releasenotes/resolved-issues.md" >}}) - - [Common Vulnerabilities and Exposures (CVE)]({{< relref "./docs/Releasenotes/common-vulnerabilities-and-exposures-(cve).md" >}}) + - [Common Vulnerabilities and Exposures (CVEs)]({{< relref "./docs/Releasenotes/common-vulnerabilities-and-exposures-(cve).md" >}}) - [Source Code]({{< relref "./docs/Releasenotes/source-code.md" >}}) - [Contribution]({{< relref "./docs/Releasenotes/contribution.md" >}}) - [Acknowledgment]({{< relref "./docs/Releasenotes/acknowledgment.md" >}}) @@ -19,20 +19,21 @@ headless: true - [Installation Guide]({{< relref "./docs/Installation/Installation.md" >}}) - [Installation on Servers]({{< relref "./docs/Installation/install-server.md" >}}) - [Installation Preparations]({{< relref "./docs/Installation/installation-preparations.md" >}}) - - [Installation Mode]({{< relref "./docs/Installation/installation-modes.md" >}}) + - [Installation Modes]({{< relref "./docs/Installation/installation-modes.md" >}}) - [Installation Guideline]({{< relref "./docs/Installation/installation-guideline.md" >}}) - [Using Kickstart for Automatic Installation]({{< relref "./docs/Installation/using-kickstart-for-automatic-installation.md" >}}) - [FAQs]({{< relref "./docs/Installation/faqs.md" >}}) - [Installation on Raspberry Pi]({{< relref "./docs/Installation/install-pi.md" >}}) - [Installation Preparations]({{< relref "./docs/Installation/Installation-Preparations1.md" >}}) - [Installation Mode]({{< relref "./docs/Installation/Installation-Modes1.md" >}}) - - [Installation Guideline]({{< relref 
"./docs/Installation/Installation-Guide1" >}}) + - [Installation Guideline]({{< relref "./docs/Installation/Installation-Guide1.md" >}}) - [FAQs]({{< relref "./docs/Installation/FAQ1.md" >}}) - [More Resources]({{< relref "./docs/Installation/More-Resources.md" >}}) - - [RISC-V Installation Guide]({{< relref "./docs/Installation/riscv.md" >}}) - - [Virtual Machine Installation]({{< relref "./docs/Installation/riscv_qemu.md" >}}) - - [More Resources]({{< relref "./docs/Installation/riscv_more.md" >}}) - - [Upgrade and Downgrade Guide]({{< relref "./docs/os_upgrade_and_downgrade/openEuler_22.03_LTS_upgrade_and_downgrade.md" >}}) + - [Installation on RISC-V]({{< relref "./docs/Installation/riscv.md" >}}) + - [Installing on QEMU]({{< relref "./docs/Installation/RISC-V-QEMU.md" >}}) + - [Installing on Pioneer Box]({{< relref "./docs/Installation/RISC-V-Pioneer1.3.md" >}}) + - [Installing on Licheepi4A]({{< relref "./docs/Installation/RISC-V-LicheePi4A.md" >}}) + - [RISCV-OLK6.6 Source-Compatible Version Guide]({{< relref "./docs/Installation/RISCV-OLK6.6.md" >}}) - [OS Management](#) - [Administrator Guide]({{< relref "./docs/Administration/administration.md" >}}) - [Viewing System Information]({{< relref "./docs/Administration/viewing-system-information.md" >}}) @@ -80,6 +81,13 @@ headless: true - [Installation and Deployment]({{< relref "./docs/KernelLiveUpgrade/installation-and-deployment.md" >}}) - [How to Run]({{< relref "./docs/KernelLiveUpgrade/how-to-run.md" >}}) - [Common Problems and Solutions]({{< relref "./docs/KernelLiveUpgrade/common-problems-and-solutions.md" >}}) + - [SysCare User Guide]({{< relref "./docs/SysCare/SysCare_user_guide.md" >}}) + - [Introduction to SysCare]({{< relref "./docs/SysCare/SysCare_introduction.md" >}}) + - [Installing SysCare]({{< relref "./docs/SysCare/installing_SysCare.md" >}}) + - [Using SysCare]({{< relref "./docs/SysCare/using_SysCare.md" >}}) + - [Constraints]({{< relref "./docs/SysCare/constraints.md" >}}) + - [FAQs]({{< 
relref "./docs/SysCare/faqs.md" >}}) + - [sysmonitor]({{< relref "./docs/sysmonitor/sysmonitor-usage.md" >}}) - [HA User Guide]({{< relref "./docs/thirdparty_migration/ha.md" >}}) - [Deploying an HA Cluster]({{< relref "./docs/thirdparty_migration/installing-and-deploying-HA.md" >}}) - [HA Usage Example]({{< relref "./docs/thirdparty_migration/usecase.md" >}}) @@ -101,6 +109,9 @@ headless: true - [Installing secGear]({{< relref "./docs/secGear/secGear-installation.md" >}}) - [API Reference]({{< relref "./docs/secGear/api-reference.md" >}}) - [secGear Application Development]({{< relref "./docs/secGear/developer-guide.md" >}}) + - [Certificate Signature]({{< relref "./docs/CertSignature/overview_of_certificates_and_signatures.md" >}}) + - [Introduction to Signature Certificates]({{< relref "./docs/CertSignature/introduction_to_signature_certificates.md" >}}) + - [Secure Boot]({{< relref "./docs/CertSignature/secure_boot.md" >}}) - [Performance](#) - [A-Tune User Guide]({{< relref "./docs/A-Tune/A-Tune.md" >}}) - [Getting to Know A-Tune]({{< relref "./docs/A-Tune/getting-to-know-a-tune.md" >}}) @@ -113,6 +124,7 @@ headless: true - [Getting to Know sysBoost]({{< relref "./docs/sysBoost/getting-to-know-sysBoost.md" >}}) - [Installation and Deployment]({{< relref "./docs/sysBoost/installation-and-deployment.md" >}}) - [Usage Instructions]({{< relref "./docs/sysBoost/usage-instructions.md" >}}) + - [oeAware User Guide]({{< relref "./docs/oeAware/oeAware_user_guide.md" >}}) - [Desktop](#) - [UKUI]({{< relref "./docs/desktop/ukui.md" >}}) - [UKUI Installation]({{< relref "./docs/desktop/installing-UKUI.md" >}}) @@ -122,7 +134,7 @@ headless: true - [DDE User Guide]({{< relref "./docs/desktop/DDE-user-guide.md" >}}) - [Xfce]({{< relref "./docs/desktop/xfce.md" >}}) - [Xfce Installation]({{< relref "./docs/desktop/installing-Xfce.md" >}}) - - [Xfce User Guide]({{< relref "./docs/desktop/Xfce_userguide.md" >}}) + - [Xfce User Guide]({{< relref 
"./docs/desktop/Xfce_userguide.md" >}}) - [GNOME]({{< relref "./docs/desktop/gnome.md" >}}) - [GNOME Installation]({{< relref "./docs/desktop/installing-GNOME.md" >}}) - [GNOME User Guide]({{< relref "./docs/desktop/GNOME_userguide.md" >}}) @@ -130,7 +142,7 @@ headless: true - [Kiran Installation]({{< relref "./docs/desktop/install-kiran.md" >}}) - [Kiran User Guide]({{< relref "./docs/desktop/Kiran_userguide.md" >}}) - [Embedded](#) - - [openEuler Embedded User Guide](https://openeuler.gitee.io/yocto-meta-openeuler/master/index.html) + - [openEuler Embedded User Guide](https://embedded.pages.openeuler.org/openEuler-24.03-LTS/index.html) - [Virtualization](#) - [Virtualization User Guide]({{< relref "./docs/Virtualization/virtualization.md" >}}) - [Introduction to Virtualization]({{< relref "./docs/Virtualization/introduction-to-virtualization.md" >}}) @@ -153,7 +165,7 @@ headless: true - [Installing StratoVirt]({{< relref "./docs/StratoVirt/Install_StratoVirt.md" >}}) - [Preparing the Environment]({{< relref "./docs/StratoVirt/Prepare_env.md" >}}) - [Configuring a VM]({{< relref "./docs/StratoVirt/VM_configuration.md" >}}) - - [Managing VMs]({{< relref "./docs/StratoVirt/VM_management.md" >}}) + - [Managing VMs]({{< relref "./docs/StratoVirt/VM_management.md" >}}) - [Connecting to the iSula Secure Container]({{< relref "./docs/StratoVirt/interconnect_isula.md" >}}) - [Interconnecting with libvirt]({{< relref "./docs/StratoVirt/Interconnect_libvirt.md" >}}) - [StratoVirt VFIO Instructions]({{< relref "./docs/StratoVirt/StratoVirt_VFIO_instructions.md" >}}) @@ -175,7 +187,8 @@ headless: true - [Interconnection with the CNI Network]({{< relref "./docs/Container/interconnection-with-the-cni-network.md" >}}) - [Container Resource Management]({{< relref "./docs/Container/container-resource-management.md" >}}) - [Privileged Container]({{< relref "./docs/Container/privileged-container.md" >}}) - - [CRI]({{< relref "./docs/Container/cri.md" >}}) + - [CRI API v1alpha2]({{< 
relref "./docs/Container/CRI_API_v1alpha2.md" >}}) + - [CRI API v1]({{< relref "./docs/Container/CRI_API_v1.md" >}}) - [Image Management]({{< relref "./docs/Container/image-management.md" >}}) - [Checking the Container Health Status]({{< relref "./docs/Container/checking-the-container-health-status.md" >}}) - [Querying Information]({{< relref "./docs/Container/querying-information.md" >}}) @@ -183,6 +196,8 @@ headless: true - [Supporting OCI hooks]({{< relref "./docs/Container/supporting-oci-hooks.md" >}}) - [Local Volume Management]({{< relref "./docs/Container/local-volume-management.md" >}}) - [Interconnecting iSulad shim v2 with StratoVirt]({{< relref "./docs/Container/interconnecting-isula-shim-v2-with-stratovirt.md" >}}) + - [iSulad Support for Cgroup v2]({{< relref "./docs/Container/iSulad_support_for_cgroup_v2.md" >}}) + - [iSulad Support for CDI]({{< relref "./docs/Container/iSulad_support_for_CDI.md" >}}) - [Appendix]({{< relref "./docs/Container/appendix.md" >}}) - [System Container]({{< relref "./docs/Container/system-container.md" >}}) - [Installation Guideline]({{< relref "./docs/Container/installation-guideline.md" >}}) @@ -215,18 +230,18 @@ headless: true - [Container Management]({{< relref "./docs/Container/container-management-2.md" >}}) - [Image Management]({{< relref "./docs/Container/image-management-2.md" >}}) - [Statistics]({{< relref "./docs/Container/statistics.md" >}}) - - [Image Building]({{< relref "./docs/Container/isula-build.md" >}}) + - [Container Image Building]({{< relref "./docs/Container/isula-build.md" >}}) + - [isula-build User Guide]({{< relref "./docs/Container/isula-build_user_guide.md" >}}) - [Kuasar Multi-Sandbox Container Runtime]({{< relref "./docs/Container/kuasar.md" >}}) - [Installation and Configuration]({{< relref "./docs/Container/kuasar-install-config.md" >}}) - [Usage Instructions]({{< relref "./docs/Container/kuasar-usage.md" >}}) - - [Appendix]({{< relref "./docs/Container/kuasar-install-config.md" >}}) - 
[KubeOS User Guide]({{< relref "./docs/KubeOS/kubeos-user-guide.md" >}}) - [About KubeOS]({{< relref "./docs/KubeOS/about-kubeos.md" >}}) - [Installation and Deployment]({{< relref "./docs/KubeOS/installation-and-deployment.md" >}}) - [Usage Instructions]({{< relref "./docs/KubeOS/usage-instructions.md" >}}) - [KubeOS Image Creation]({{< relref "./docs/KubeOS/kubeos-image-creation.md" >}}) - [Kubernetes Cluster Deployment Guide]({{< relref "./docs/Kubernetes/Kubernetes.md" >}}) - - [Preparing VMs]( {{< relref "./docs/Kubernetes/preparing-VMs.md">}}) + - [Preparing VMs]( {{< relref "./docs/Kubernetes/preparing-VMs.md" >}}) - [Manual Cluster Deployment]({{< relref "./docs/Kubernetes/deploying-a-Kubernetes-cluster-manually.md" >}}) - [Installing the Kubernetes Software Package]( {{< relref "./docs/Kubernetes/installing-the-Kubernetes-software-package.md" >}}) - [Preparing Certificates]({{< relref "./docs/Kubernetes/preparing-certificates.md" >}}) @@ -277,6 +292,7 @@ headless: true - [Application Development Guide]({{< relref "./docs/ApplicationDev/application-development.md" >}}) - [Preparing the Development Environment]({{< relref "./docs/ApplicationDev/preparations-for-development-environment.md" >}}) - [Using GCC for Compilation]({{< relref "./docs/ApplicationDev/using-gcc-for-compilation.md" >}}) + - [Using LLVM/Clang for Compilation]({{< relref "./docs/ApplicationDev/using-clang-for-compilation.md" >}}) - [Using Make for Compilation]({{< relref "./docs/ApplicationDev/using-make-for-compilation.md" >}}) - [Using JDK for Compilation]({{< relref "./docs/ApplicationDev/using-jdk-for-compilation.md" >}}) - [Building an RPM Package]({{< relref "./docs/ApplicationDev/building-an-rpm-package.md" >}}) diff --git "a/docs/zh/docs/A-Ops/AOps\344\270\200\351\224\256\345\214\226\351\203\250\347\275\262\346\214\207\345\215\227.md" "b/docs/zh/docs/A-Ops/AOps\344\270\200\351\224\256\345\214\226\351\203\250\347\275\262\346\214\207\345\215\227.md" new file mode 100644 index 
0000000000000000000000000000000000000000..53f9699aea1f480d7f75463a248ae4312b9a4970 --- /dev/null +++ "b/docs/zh/docs/A-Ops/AOps\344\270\200\351\224\256\345\214\226\351\203\250\347\275\262\346\214\207\345\215\227.md" @@ -0,0 +1,102 @@ +## 一、一键化部署介绍 + +Aops服务一键化部署采用docker容器技术,搭配docker-compose容器编排,简化部署难度,实现一键启动和暂停。 + +## 二、环境要求 + +建议使用2台openEuler 24.03-LTS及以上机器完成部署(单台机器内存8G+),具体用途及部署方案如下: + +- 机器A用于部署mysql、elasticsearch、kafka、redis、prometheus等,主要提供数据服务支持; +- 机器B用于部署A-Ops服务端,提供业务功能支持。部署A-Ops前端服务,提供展示、操作; + +| 机器编号 | 配置IP | 部署服务 | +| -------- | ----------- | -------------------------------------------- | +| 机器A | 192.168.1.1 | mysql elasticsearch redis kafka prometheus | +| 机器B | 192.168.1.2 | aops-zeus aops-diana aops-apollo aops-hermes | + +## 三、配置环境部署 + +### 1. 关闭机器A防火墙 + +```shell +systemctl stop firewalld +systemctl disable firewalld +systemctl status firewalld +``` + +### 2. 安装docker docker-compose + +```shell +dnf install docker docker-compose +# 设置docker开机启动 +systemctl enable docker +``` + +### 3. 安装aops-vulcanus aops-tools + +```shell +dnf install aops-vulcanus aops-tools +``` + +### 4. 执行一键化部署 + +- 执行部署脚本 + +```shell +cd /opt/aops/scripts/deploy/container +# 执行run.sh部署脚本 +bash run.sh +``` + +> 进入交互式命令行 +> +> ```shell +> 1. Build the docker container (build). +> 2. Start the container orchestration service (start-service/start-env). +> 3. Stop all container services (stop-service/stop-env). 
+> run.sh: line 74: read: `Enter to exit the operation (Q/q).': not a valid identifier +> Select an operation procedure to continue: +> +> ``` +> +> **build**: 部署基础服务(mysql、kafka等)不需要执行build操作 +> +> **start-service**: 启动A-Ops服务及前端应用 +> +> **start-env**: 启动基础服务,包括mysql、redis、kafka等 +> +> **stop-service**: 停止A-Ops服务及前端应用 +> +> **stop-env**: 停止基础服务(数据会依然保留) +> +> **Q/q**: 退出命令交互模式 + +- 部署A-Ops服务端 + +```shell +# 切换在机器B上执行部署脚本 +cd /opt/aops/scripts/deploy/container +bash run.sh +# 交互式命令中执行start-service +``` + +- 更改服务配置文件 + +> **注意:当A-Ops服务和基础服务在同一台机器上部署时,则无需调整配置文件即可使用。若部署方案与本文档中类似(机器A、B),则需要将所有的配置文件中连接基础服务的配置项更改为机器A的ip** +> +> **默认的mysql连接字符串中使用无密码模式,基础服务的mysql配置了默认密码“123456”,视具体情况调整** + +```shell +# 调整 apollo.ini diana.ini zeus.ini配置文件中连接mysql、elasticsearch、kafka、redis的ip地址 +cd /etc/aops/ +``` + +- **FAQ** + +​ **1. elasticsearch基础服务无法正常启动** + + 查看/opt/es文件夹的权限,默认权限需要调整为777,可执行 "chmod -R 777 /opt/es" 。 + +​ **2. prometheus 基础服务无法正常启动** + + 查看/etc/prometheus目录下是否存在prometheus.yml配置文件,如果不存在,请添加配置文件。 diff --git "a/docs/zh/docs/A-Ops/AOps\346\231\272\350\203\275\345\256\232\344\275\215\346\241\206\346\236\266\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/AOps\346\231\272\350\203\275\345\256\232\344\275\215\346\241\206\346\236\266\344\275\277\347\224\250\346\211\213\345\206\214.md" index 1959066e9c965b47c56a1e1aa7dba5cc14586db5..9fc9e7abf8278bbc80db10d2b6e419c7b8ab03ce 100644 --- "a/docs/zh/docs/A-Ops/AOps\346\231\272\350\203\275\345\256\232\344\275\215\346\241\206\346\236\266\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/A-Ops/AOps\346\231\272\350\203\275\345\256\232\344\275\215\346\241\206\346\236\266\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -1,115 +1,41 @@ # AOps 智能定位框架使用手册 -参照[AOps部署指南](AOps部署指南.md)部署AOps前后端服务后,即可使用AOps智能定位框架。 +参照[AOps部署指南](AOps部署指南.md)部署AOps前后端服务,并参照[AOps资产管理使用手册](AOps资产管理使用手册.md)纳管了主机后,即可使用AOps智能定位框架。 -下文会从页面的维度进行AOps智能定位框架功能的介绍。 - -## 1. 
工作台 - - 该页面为数据看板页面,用户登录后,仍在该页面。 - - ![4911661916984_.pic](./figures/工作台.jpg) - -支持操作: - -- 当前纳管的主机数量 -- 当前所有未确认的告警数量 - -- 每个主机组告警情况的统计 - -- 用户帐户操作 - - - 修改密码 - - 退出登录 -- 业务域和CVE信息暂不支持 - -## 2. 资产管理 - -资产管理分为对主机组进行管理以及对主机进行管理。每个主机在agent侧注册时需指定一个已存在的主机组进行注册,注册完毕后会在前端进行显示。 - -(1)主机组页面: - -![4761661915951_.pic](./figures/主机组.jpg) - -支持如下操作: - -- 主机组添加 -- 主机组删除 -- 查看当前所有主机组 -- 查看每个主机组下的主机信息 - -添加主机组时,需指定主机组的名称和描述。注意:请勿重复名称。 - -![添加主机组](./figures/添加主机组.jpg) - -(2)主机管理页面: - -![主机管理](./figures/主机管理.jpg) - -支持如下操作: - -- 查看主机列表(可根据主机组、管理节点进行筛选,可根据主机名称进行排序) -- 删除主机 -- 点击主机可跳转到主机详情界面 - -(3)主机详细信息界面: - -![主机详情](./figures/主机详情.jpg) - -详情页的上半部分展示了该主机的操作系统及CPU等的基础信息。 - -![插件管理](./figures/插件管理.jpg) - -详情页的下半部分,用户可以看到该主机当前运行的采集插件信息(目前agent只支持gala-gopher插件)。 - -支持如下操作: - -- 查看主机基础信息及插件信息 -- 插件的管理(gala-gopher) - - 插件资源查看 - - 插件的开启和管理 - - gala-gopher的采集探针的开启和关闭 -- 主机场景的识别 - -点击场景识别后,系统会生成该主机的场景,并推荐检测该场景所需开启的插件以及采集项,用户可以根据推荐结果进行插件/探针的调整。 - -注意:修改插件信息如关闭插件或开关探针后,需要点击保存才能生效。 - -![修改插件](./figures/修改插件.png) - -## 3. 智能定位 +智能定位框架包含了**智能定位**和**配置溯源**两部分,下文会从页面的维度进行AOps智能定位框架功能的介绍。 +## 1. 
智能定位 AOps项目的智能定位策略采用内置网络诊断应用作为模板,生成个性化工作流的策略进行检测和诊断。 -“应用”作为工作流的模板,描述了检测中各步骤的串联情况,内置各步骤中使用的检测模型的推荐逻辑。用户在生成工作流时,可根据各主机的采集项、场景等信息,定制出工作流的详细信息。 +“应用”作为工作流的模板,描述了检测中各步骤的串联情况,内置各步骤中使用的检测模型的推荐逻辑。用户在生成工作流时,可根据各主机的采集项、场景等信息,定制出工作流的详细信息。 -(1)工作流列表页面: +### 1.1工作流列表页面: -![工作流](./figures/工作流.jpg) +![工作流](./figures/故障诊断/工作流.jpg) 支持操作: - 查看当前工作流列表,支持按照主机组、应用和状态进行筛选,并支持分页操作 - 查看当前应用列表 -(2)工作流详情页面: +### 1.2 工作流详情页面: -![工作流详情](./figures/工作流详情.jpg) +![工作流详情](./figures/故障诊断/工作流详情.jpg) 支持操作: -- 查看工作流所属主机组,主机数量、状态等基础信息 +- 查看工作流所属主机组、主机数量、状态等基础信息 - 查看单指标检测、多指标检测、集群故障诊断各步骤的详细算法模型信息 - 修改检测各步骤应用的模型 - 执行、暂停和删除工作流 修改某检测步骤的模型时,用户可根据模型名或标签搜索系统内置的模型库,选中模型后点击应用进行更改。 -![修改模型](./figures/修改模型.png) +![修改模型](./figures/故障诊断/修改模型.png) -(3)应用详情页面 +### 1.3 应用详情页面 -![app详情](./figures/应用.png) +![app详情](./figures/故障诊断/应用.png) 支持操作: @@ -118,15 +44,15 @@ AOps项目的智能定位策略采用内置网络诊断应用作为模板,生 创建工作流时,点击右上角的创建工作流按钮,并在右侧弹出的窗口中输入工作流的名称和描述,选择要检测的主机组。选中主机组后,下方会列出该主机组的所有主机,用户可选中部分主机后移到右侧的列表,最后点击创建,即可在工作流列表中看到新创建的工作流。 -![app详情](./figures/app详情.jpg) +![app详情](./figures/故障诊断/app详情.jpg) -![创建工作流](./figures/创建工作流.jpg) +![创建工作流](./figures/故障诊断/创建工作流.jpg) -(4)告警 +### 1.4 告警 -启动工作流后,会根据工作流的执行周期定时触发诊断,每次诊断若结果为异常,则会作为一条告警存入数据库,同时也会反应在前端告警页面中。 +启动工作流后,会根据工作流的执行周期定时触发诊断,每次诊断若结果为异常,则会作为一条告警存入数据库,同时也会反映在前端告警页面中。 -![告警](./figures/告警.jpg) +![告警](./figures/故障诊断/告警.jpg) 支持操作: @@ -139,44 +65,52 @@ AOps项目的智能定位策略采用内置网络诊断应用作为模板,生 告警确认后,将不在列表中显示 -![告警确认](./figures/告警确认.jpg) +![告警确认](./figures/故障诊断/告警确认.jpg) 点击异常详情后,可以根据主机维度查看告警详情,包括异常数据项的展示以及根因节点、根因异常的判断等。 -![告警详情](./figures/告警详情.jpg) +![告警详情](./figures/故障诊断/告警详情.jpg) -## 4. 配置溯源 +## 2. 
配置溯源 AOps项目的配置溯源用于对目标主机配置文件内容的变动进行检测记录,对于文件配置错误类引发的故障起到很好的支撑作用。 -### 创建配置域 +### 2.1 创建配置域 + ![](./figures/chuangjianyewuyu.png) -### 添加配置域纳管node +### 2.2 添加配置域纳管node ![](./figures/tianjianode.png) -### 添加配置域配置 +### 2.3 添加配置域配置 + ![](./figures/xinzengpeizhi.png) -### 查询预期配置 +### 2.4 查询预期配置 + ![](./figures/chakanyuqi.png) -### 删除配置 +### 2.5 删除配置 ![](./figures/shanchupeizhi.png) -### 查询实际配置 +### 2.6 查询实际配置 ![](./figures/chaxunshijipeizhi.png) -### 配置校验 + + +### 2.7 配置校验 + ![](./figures/zhuangtaichaxun.png) -### 配置同步 + + +### 2.8 配置同步 暂未提供 diff --git "a/docs/zh/docs/A-Ops/AOps\346\274\217\346\264\236\347\256\241\347\220\206\346\250\241\345\235\227\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/AOps\346\274\217\346\264\236\347\256\241\347\220\206\346\250\241\345\235\227\344\275\277\347\224\250\346\211\213\345\206\214.md" new file mode 100644 index 0000000000000000000000000000000000000000..a2b845ba14ff788d97fa36cbe0bd3252a3e367b7 --- /dev/null +++ "b/docs/zh/docs/A-Ops/AOps\346\274\217\346\264\236\347\256\241\347\220\206\346\250\241\345\235\227\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -0,0 +1,287 @@ +# AOps漏洞管理模块使用手册 + +参照[AOps部署指南](AOps部署指南.md)部署AOps前后端服务,并参照[AOps资产管理使用手册](AOps资产管理使用手册.md)纳管了主机后,即可使用AOps漏洞管理模块。 + +A-Ops智能运维工具的智能补丁管理模块(**apollo**)主要集成了**漏洞扫描、CVE修复、任务回退**、**热补丁移除**等核心功能: + +- 支持对openEuler已修复并发布的漏洞进行手动/定时扫描。漏洞的详细信息通过**在线/离线**同步社区发布的安全公告进行获取。当前聚焦于内核漏洞的处理,后续支持用户态软件包漏洞。 + +- 支持漏洞批量修复。修复过程中,客户端会命令行调用基于dnf原生框架的dnf hotpatch插件,实现**冷补丁(需重启)/热补丁(免重启)**的修复。此插件将底层冷、热补丁的管理封装成统一的入口,方便单机用户的使用和集群的调用。 + +- 支持通过任务粒度回退或移除热补丁的形式,将系统恢复至原状态。 + +下文将按照漏洞修复的工作流来进行A-Ops智能补丁管理功能的介绍。 + +## 1. 
配置repo源 + +openEuler的漏洞信息通过安全公告对外发布,同时在update源中发布修复所用的软件包及相应元数据。配置了update源后即可在命令行通过dnf updateinfo list cves命令或dnf hot-updateinfo list cves(需安装A-Ops的dnf热补丁插件)进行漏洞的扫描。 + +默认的openEuler系统安装后自带对应OS版本的冷补丁update源。对于自定义或离线场景,用户可以通过设置repo来自行配置冷/热补丁的update源。 + +### 1.1 Repo源添加 + +漏洞管理界面用于对目标主机存在的CVE进行监控与修复。 + +当前漏洞管理模块分为以下三个界面: + ++ 主机列表界面 ++ CVEs界面 ++ 任务列表界面 + +进入漏洞管理的主机管理子页面,可以从主机粒度看到当前纳管的所有主机的**已修复和未修复漏洞**情况: + +![主机列表界面](./figures/漏洞管理/主机列表界面.png) + +点击下方CVE REPO的加号框,即可进行repo源的添加: + +![添加REPO源](./figures/漏洞管理/添加repo源.png) + +若不清楚格式,可以点击下载模板按钮查看。注意baseurl和gpgkey要配置为客户端OS版本的对应地址。用户也可以直接上传编辑好的repo文件。 + +新建repo完毕后,即可在CVE REPO列表中进行查看或删除。 + +### 1.2 Repo设置 + +新建repo源后,点击右上角“设置repo”的按钮,可以创建一个任务,为勾选的主机进行批量的repo设置。 + +![设置repo](./figures/漏洞管理/设置repo源.png) + +点击“创建”或“立即执行”后会生成一个repo设置任务,执行完毕后即可在主机列表界面看到已设置好该repo源。 + +## 2. 漏洞扫描 + +确认好主机上已配置好repo源(或使用默认安装时自带的repo源)后,我们就可以为主机进行批量扫描了。直接点击右侧的漏洞扫描,默认扫描全部主机。用户也可以勾选部分主机进行扫描。 + +除了手动扫描,用户也可以配置后台定时任务,进行每日定时扫描。 + +![漏洞扫描](./figures/漏洞管理/漏洞扫描.png) + +扫描完毕后,若用户在创建用户时配置了邮箱信息,apollo会将漏洞情况邮件发送给用户。 + +![邮件通知](./figures/漏洞管理/邮件通知.png) + +## 3. 
漏洞查看 + +### 3.1 主机详情信息界面 + +扫描完毕后,除了上文的主机列表可以看到每个主机的已修复和未修复的CVE数量,还可以点击某台主机查看详细的CVE信息: + +![主机详情](./figures/漏洞管理/主机详情.png) + +支持如下操作: + +- 查看主机基本信息与CVE个数 +- 查看该主机未修复CVE和已修复CVE列表,支持导出。未修复CVE展开后可以看到受影响的RPM包及支持的修复方法,已修复CVE展开后可以看到修复使用的RPM。 +- 生成CVE修复任务(切换至“未修复”时,才可支持CVE修复任务创建),可支持CVE粒度的任务创建,同时也可具体到特定的rpm包修复 +- 生成热补丁移除任务(切换至“已修复”时,才可支持生成热补丁移除任务) +- 单机漏洞扫描 + +### 3.2 CVE列表界面 + +上文介绍了从主机维度查看漏洞情况,我们也可以从**漏洞维度**去查看我们重点关注的漏洞。 + +点击CVEs子页面,可以看到未修复和已修复两个页签,下方详细介绍了每个CVE的发布时间、影响软件包、严重性等信息,展开后则能看到描述信息及受影响的rpm包和支持的修复方式。 + +![CVEs列表](./figures/漏洞管理/cve列表.png) + +支持如下操作: + ++ 查看所有CVE信息(CVE严重性进行筛选、CVE ID、发布时间、CVSS分数、主机数量进行排序,可根据CVE ID或软件包名称进行检索) ++ 切换至“未修复”列表 + + 展开某CVE可以看到受影响的RPM包和支持的修复方法,以及该组合对应的主机 + + 右侧按钮为“生成修复任务”,支持生成**CVE修复任务** + ++ 切换至“已修复”列表 + + 展开某CVE可以看到修复使用的RPM及对应的主机 + + 右侧按钮为“热补丁移除”,针对热补丁修复的CVE,可生成**热补丁移除任务** + + +- 上传安全公告 + +这里对安全公告的上传简单说明: + +apollo支持定时从openEuler官网下载安全公告信息,针对无法连接外网的环境,提供了安全公告的手动上传功能。当前社区仅对社区软件包受影响的CVE发布了安全公告,用户可以从以下地址下载安全公告并上传压缩包:[https://repo.openeuler.org/security/data/cvrf/](https://gitee.com/link?target=https%3A%2F%2Frepo.openeuler.org%2Fsecurity%2Fdata%2Fcvrf%2F) + +![上传安全公告](./figures/漏洞管理/上传安全公告.png) + +社区也提供了安全公告订阅,订阅后会收到邮件通知:[https://mailweb.openeuler.org/postorius/lists/sa-announce.openeuler.org/](https://mailweb.openeuler.org/postorius/lists/sa-announce.openeuler.org/) + +### 3.3 CVE详情信息 + +和主机详情界面类似,在CVE列表界面点击某一个CVE即可进入CVE详情界面。可以看到此漏洞影响的所有主机和已修复这个漏洞的主机。 + +![CVE详情](./figures/漏洞管理/CVE详情界面.png) + +支持如下操作: + ++ 查看CVE基本信息 ++ 查看关联CVE数量,即影响同样源码包(如kernel)的CVE ++ 查看受此CVE影响的主机列表以及单个主机上受此CVE影响的rpm包列表 ++ 支持点击主机名称跳转至**主机详情页** ++ 选择“未修复”列表,右侧按钮为“生成修复任务,支持**生成CVE修复任务** ++ 选择“已修复”列表,右侧按钮为“热补丁移除任务”,支持**生成热补丁移除任务** + +## 4. 
漏洞修复 + +### 4.1 生成修复任务 + +在CVE列表、CVE详情、主机详情界面均可进行漏洞的批量修复。这里以CVE列表界面为示例,选中CVE点击“生成修复任务”按钮,右侧会出现弹窗。不选中CVE则默认修复全部CVE。 + +其中针对热补丁,有2个按钮: + +- 是否accept:勾选后会在重启后自动激活此次修复使用的热补丁 + +- 冷补丁收编:勾选后,会同步生成热补丁对应的冷补丁的修复任务 + +![生成修复任务](./figures/漏洞管理/生成修复任务.png) + +需额外注意: + +- 为了方便执行以及后续的任务回滚,生成任务时会自动将冷、热补丁的修复动作拆分成两个任务,可以通过任务名进行分辨。 + +### 4.2 执行修复任务 + +生成任务后可以点击立即跳转到该任务详情,或点击左侧的任务子页面,进入任务列表界面: + +![任务列表](./figures/漏洞管理/任务列表.png) + +点击刚才生成的修复任务,可以看到此任务的基础信息,以及下方的主机以及该主机要修复的软件包信息。点击右侧的执行按钮即可执行。 + +![任务详情界面](./figures/漏洞管理/任务详情.png) + +**注意**:针对同一台主机,**热补丁任务应优先与冷补丁任务执行**。由于内核热补丁只能应用在指定版本内核,若先安装冷补丁再安装热补丁,aops客户端会报错,以防重启后内核切换、热补丁失效导致的漏洞重新暴露。而先安装热补丁再安装冷补丁时,客户端调用的dnf upgrade-en 命令会确保冷补丁包含了当前热补丁修复的漏洞。 + +### 4.3 任务报告查看 + +执行完毕后,可以看到任务的“上次执行时间”发生更新,并出现“查看报告”按钮。点击查看报告,即可查看各主机的执行情况,如执行结果、执行失败的原因等: + +![修复任务报告](./figures/漏洞管理/修复任务报告.png) + +## 5. 修复任务回滚 + +进入修复任务详情,点击生成回滚任务,即可对该修复任务进行回退: + +![生成回滚任务](./figures/漏洞管理/生成回滚任务.png) + +进入回滚任务详情,与修复任务相反,可以看见当前已安装的软件包(修复时安装的rpm),以及回退后的目标软件包(修复前的rpm)。执行时点击“执行”按钮即可。 + +![回滚任务详情](./figures/漏洞管理/回滚任务详情.png) + +## 6. 热补丁移除任务 + +若对已安装的热补丁不满意,可以在任意“已修复”的列表,勾选使用热补丁修复的CVE或主机,生成热补丁移除任务。 + +与回滚任务相比,热补丁移除任务只针对热补丁,且不支持对热补丁的升降级处理,只通过dnf操作将热补丁rpm进行移除。 + +![生成热补丁移除任务](./figures/漏洞管理/生成热补丁移除任务.png) + +## 7.定时任务配置 + +主体的漏洞处理过程在前台完成之后,用户还可以在apollo服务端针对后台的定时任务进行编辑,修改后`systemctl restart aops-apollo`重启服务生效。 + +定时任务主要包含3种类型任务,定时任务配置文件位于 /etc/aops/apollo_crontab.ini,内容如下: + +```ini +[cve_scan] +# timed task name +id = cve scan +# value between 0-6, for example, 0 means Monday, 0-6means everyday. +day_of_week = 0-6 +# value between 0-23, for example, 2 means 2:00 in a day. +hour = 2 +# value is true or false, for example, true means with service start. 
+auto_start = true + +[download_sa] +id = download sa +day_of_week = 0-6 +hour = 3 +auto_start = true +cvrf_url = https://repo.openeuler.org/security/data/cvrf + +[correct_data] +id = correct data +day_of_week = 0-6 +hour = 4 +auto_start = true +service_timeout_threshold_min = 15 +``` + +### 7.1 定时巡检,执行漏洞扫描 + +**定时扫描cve任务的参数** + ++ id + + > 定时任务的名称,不能与其他定时任务名称重复,不建议修改。 + ++ day_of_week + + > 定时任务在一周中的第几天启动,取值范围0-6,0-6表示每天,0表示周一,以此类推。 + ++ hour + + > 任务启动的时间,取值范围0-23,与24小时制时间格式一致。 + ++ auto_start + + > 任务是否跟随服务启动,true表示同时启动,false表示不同时启动。 + ++ 其他 + + > 如果要精确到分钟,秒,需要添加minute(取值范围0-59)和second(取值范围0-59)。 + > + > **示例** + > + > ```ini + > minute = 0 + > second = 0 + > ``` + +**修改配置文件示例** + +> 打开配置文件 + +```shell +vim /etc/aops/apollo_crontab.ini +``` + +> 修改定时任务执行时机 + +```ini +[cve_scan] +id = cve scan +day_of_week = 5 +hour = 2 +auto_start = true +``` + +### 7.2 定时下载安全公告 + +相同字段含义和使用与[cve_scan]一样。 + ++ cvrf_url + + > 获取安全公告详细信息的基础url,**暂不支持修改。** + +### 7.3 定时校正异常数据 + +相同字段含义和使用与[cve_scan]一样。 + ++ service_timeout_threshold_min + + > 判断异常数据的阈值,取值为正整数,建议最小值为15。 + +**修改配置文件示例** + +> 打开配置文件 + +```shell +vim /etc/aops/apollo_crontab.ini +``` + +> 设置异常数据阈值 + +```ini +service_timeout_threshold_min = 15 +``` diff --git "a/docs/zh/docs/A-Ops/AOps\350\265\204\344\272\247\347\256\241\347\220\206\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/AOps\350\265\204\344\272\247\347\256\241\347\220\206\344\275\277\347\224\250\346\211\213\345\206\214.md" new file mode 100644 index 0000000000000000000000000000000000000000..5d3018d25a14067b6003a307525f353bf1fb3767 --- /dev/null +++ "b/docs/zh/docs/A-Ops/AOps\350\265\204\344\272\247\347\256\241\347\220\206\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -0,0 +1,111 @@ +# AOps资产管理使用手册 + +参照[AOps部署指南](AOps部署指南.md)部署AOps前后端服务后,即可使用AOps资产管理功能,纳管集群主机。 + +主机纳管是使用AOps进行智能运维的第一步,后续用户按需部署的漏洞管理、配置溯源及故障诊断服务均面向纳管的主机进行操作。 + +下文将为大家介绍如何使用资产管理功能,逐步纳管集群主机。 + +## 1. 
登录 + +首先使用默认的admin账号进行登录,密码为changeme + +![登陆界面](./figures/资产管理/登陆界面.png) + +这里也支持用户注册新账号,或是用gitee账号第三方登录。 + +登录后会直接切入数据看板界面,后续纳管主机并漏洞扫描后,可以在右侧页面主窗口右上角内看到CVE整体数量的分布统计。 + +![工作台详情](./figures/资产管理/工作台.png) + +## 2. 纳管集群主机 + +资产管理界面用于对集群中服务器的纳管(添加、编辑和删除),支持单台主机和批量主机的添加操作。 + +当前资产管理模块分为以下两个界面: + +- 主机管理 +- 主机组管理 + +纳管时需先创建主机组,再将主机添加至对应主机组中,便于后续对主机分组查看和管理。 + +### 2.1 添加主机组 + +进入资产管理的主机组管理子页面,点击右侧添加主机组按钮,即可添加主机组的名称和描述。 + +![主机组管理](./figures/资产管理/主机组管理列表.png) + +![添加主机组](./figures/资产管理/添加主机组.png) + +后续添加主机后,可以查看该主机组的主机列表: + +![主机组内主机查看](./figures/资产管理/主机组内主机查看.png) + +### 2.2 添加主机 + +进入主机管理页面,可以看到当前所有纳管的主机。其中也可以看到各主机的在线状态。 + +![主机列表查看](./figures/资产管理/主机列表.png) + +此页面支持如下操作: + +- 添加单台主机 +- 批量添加主机 +- 主机的批量删除 +- 支持使用所属主机组、管理节点的主机过滤,同时可满足对主机名称的排序 + +#### 2.2.1 添加单台主机 + +点击“添加主机”按钮,即可对单台主机进行添加。 + +![主机管理-添加](./figures/资产管理/主机管理-添加.png) + +支持如下操作: + +- 快捷添加主机组(同主机组管理中的添加功能) +- 登录认证方式支持主机密码模式(需提供账号和密码)和主机密钥模式(需提供可登录的密钥,**注意此处的密钥为私钥**) + +注:管理节点/监控节点暂无本质区别,用户可按需指定。 + +#### 2.2.2 批量添加主机 + +针对大集群的场景,一个个添加主机过于麻烦。这里我们提供了上传excel的方式,将主机批量添加。 + +![主机管理-批量添加](./figures/资产管理/批量添加主机.png) + +下载模板并按照格式填写主机注册所需信息后,选择文件进行上传。 + +![主机管理-批量添加](./figures/资产管理/批量添加-文件解析.png) + +格式解析无误后,点击提交,即可看到添加结果。若添加失败会有相应提示。 + +![主机管理-批量添加](./figures/资产管理/批量添加-添加结果.png) + +支持如下操作: + +- 在线下载批量导入主机模板,支持的类型有xls、xlsx、csv三种格式的文件 +- 通过新增上传解析文件内容,数据回显至前端展示 +- 支持单主机的数据调整或删除 +- 点击提交后,可查看主机添加的结果 + +## 3 编辑主机 + +添加主机完毕后,若密码或密钥不对导致连接失败,或其他信息需要变更,可以点击主机列表中的编辑按钮进行编辑: + +![主机管理-批量添加](./figures/资产管理/主机编辑界面.png) + +## 4 查看主机详情 + +客户端部署了aops-ceres命令行工具后,在主机列表点击主机可以查看该主机的一些基础信息。 + +![主机详情](./figures/资产管理/主机详情.png) + +若部署了prometheus服务以及客户端的采集器(node-exporter或gala-gopher等),可以在下方选择并展示主机的指标波形。 + +![指标波形](./figures/资产管理/指标波形.png) + +点击插件页签,可以看到node-exporter插件的各采集探针,按需开启或关闭。 + +点击场景识别后,系统会根据客户端的应用生成该主机的场景,并推荐检测该场景所需开启的插件以及采集项。 + +![插件开关](./figures/资产管理/插件开关.png) \ No newline at end of file diff --git "a/docs/zh/docs/A-Ops/AOps\351\203\250\347\275\262\346\214\207\345\215\227.md" 
"b/docs/zh/docs/A-Ops/AOps\351\203\250\347\275\262\346\214\207\345\215\227.md" index 56c9f263e3aedae8221bc46ecef9f589496061ee..af21e55d0bcc90c93748af73e60a3d5277da9e8e 100644 --- "a/docs/zh/docs/A-Ops/AOps\351\203\250\347\275\262\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/A-Ops/AOps\351\203\250\347\275\262\346\214\207\345\215\227.md" @@ -1,437 +1,713 @@ -# A-Ops部署指南 +# 一、A-Ops服务介绍 -## 一、环境要求 +A-Ops是用于提升主机整体安全性的服务,通过资产管理、漏洞管理、配置溯源等功能,识别并管理主机中的信息资产,监测主机中的软件漏洞、排查主机中遇到的系统故障,使得目标主机能够更加稳定和安全的运行。 -- 2台openEuler 23.09机器 +下表是A-Ops服务涉及模块的说明: - 分别用于部署check模块的两种模式:调度器,执行器。其他服务如mysql、elasticsearch、aops-manager等可在任意一台机器独立部署,为便于操作,将这些服务部署在机器A。 +| 模块 | 说明 | +| ---------- | ---------------------------------------------------- | +| aops-ceres | A-Ops服务的客户端。
提供采集主机数据与管理其他数据采集器(如gala-gopher)的功能。
响应管理中心下发的命令,处理管理中心的需求与操作。 | +| aops-zeus | A-Ops基础应用管理中心,主要负责与其他模块的中转站,默认端口:11111
对外提供基本主机管理服务,主机与主机组的添加、删除等功能依赖此模块实现。 | +| aops-hermes | A-Ops可视化操作界面,展示数据信息,提升服务易用性。 | +| aops-apollo | A-Ops漏洞管理模块相关功能依赖此服务实现,默认端口:11116
周期性获取openEuler社区发布的安全公告并更新到漏洞库中,用于识别客户机漏洞。
通过与漏洞库比对,检测出系统和软件存在的漏洞。 | +| aops-vulcanus | A-Ops工具库,**除aops-ceres与aops-hermes模块外,其余模块须与此模块共同安装使用**。 | +| aops-tools | 提供基础环境一键部署脚本、数据库表初始化,安装后在/opt/aops/scripts目录下可见。
| +| gala-ragdoll | A-Ops配置溯源模块,通过git监测并记录配置文件的改动,默认端口:11114 | +| dnf-hotpatch-plugin | dnf插件,使得dnf工具可识别热补丁信息,提供热补丁扫描及热补丁修复功能。 | -- 内存尽量为8G+ +# 二、部署环境要求 -## 二、配置部署环境 +建议采用4台 openEuler 24.03-LTS 机器部署,其中3台用于配置服务端,1台用于纳管(aops服务纳管的主机),**且repo中需要配置update源**([FAQ:配置update源](#Q6、配置update源)),具体用途以及部署方案如下: -### 机器A ++ 机器A:部署mysql、redis、elasticsearch等,主要提供数据服务支持,建议内存8G+。 ++ 机器B:部署A-Ops的资产管理zeus服务+前端展示服务,提供完整的业务功能支持,建议内存6G+。 ++ 机器C:部署A-Ops的漏洞管理配置溯源(gala-ragdoll),提供漏洞管理服务,建议内存4G+。 ++ 机器D:部署A-Ops的客户端,用作一个被AOps服务纳管监控的主机(需要监管的机器中都可以安装aops-ceres)。 -机器A需部署的aops服务有:aops-tools、aops-manager、aops-check、aops-hermes、aops-agent、gala-gopher。 +| 机器编号 | 配置IP | 部署模块 | +| -------- | ----------- | ------------------------------------- | +| 机器A | 192.168.1.1 | mysql,elasticsearch, redis | +| 机器B | 192.168.1.2 | aops-zeus,aops-hermes,aops-diana | +| 机器C | 192.168.1.3 | aops-apollo,gala-ragdoll,aops-diana | +| 机器D | 192.168.1.4 | aops-ceres,dnf-hotpatch-plugin | -需部署的第三方服务有:mysql、elasticsearch、zookeeper、kafka、prometheus。 +> 每台机器在部署前,请先**关闭防火墙和SELinux**。 -具体部署步骤如下: - -#### 2.1 关闭防火墙 - -关闭本节点防火墙 +- 关闭防火墙 ```shell systemctl stop firewalld systemctl disable firewalld systemctl status firewalld -``` +setenforce 0 -#### 2.2 部署aops-tools +``` -安装aops-tools: +- 禁用SELinux ```shell -yum install aops-tools +# 修改/etc/selinux/config文件中SELINUX状态为disabled + +vi /etc/selinux/config +SELINUX=disabled + +# 更改之后,按下ESC键,键盘中输入 :wq 保存修改的内容 ``` +注:此SELINUX状态配置在系统重启后生效。 + +# 三、服务端部署 + +## 3.1、 资产管理 + +使用资产管理功能需部署aops-zeus、aops-hermes、mysql、redis服务。 + +### 3.1.1、节点信息 -#### 2.3 部署数据库[mysql、elasticsearch] +| 机器编号 | 配置IP|部署模块| +| -------- | -------- | -------- | +| 机器A | 192.168.1.1 |mysql,redis| +| 机器B | 192.168.1.2 |aops-zeus,aops-hermes| -##### 2.3.1 部署mysql +### 3.1.2、部署步骤 -使用安装aops-tools时安装的aops-basedatabase脚本进行安装 +#### 3.1.2.1、 部署mysql + +- 安装mysql ```shell -cd /opt/aops/aops_tools -./aops-basedatabase mysql +yum install mysql-server -y ``` -修改mysql配置文件 +- 修改mysql配置文件 -```shell +```bash vim /etc/my.cnf ``` 
-新增bind-address, 值为本机ip +- 在mysqld配置节下新增bind-address,值为本机ip -![1662346986112](./figures/修改mysql配置文件.png) +```ini +[mysqld] +bind-address=192.168.1.1 +``` -重启mysql服务 +- 重启mysql服务 -```shell +```bash systemctl restart mysqld ``` -连接数据库,设置权限: +- 设置mysql数据库的root用户访问权限 -```shell -mysql -show databases; -use mysql; -select user,host from user;//出现user为root,host为localhost时,说明mysql只允许本机连接,外网和本地软件客户端则无法连接。 -update user set host = '%' where user='root'; -flush privileges;//刷新权限 -exit -``` +```mysql +[root@localhost ~] mysql -##### 2.3.2 部署elasticsearch +mysql> show databases; +mysql> use mysql; +mysql> select user,host from user; -- 此处出现host为localhost时,说明mysql只允许本机连接,外网和本地软件客户端则无法连接。 -使用安装aops-tools时安装的aops-basedatabase脚本进行安装 ++---------------+-----------+ +| user | host | ++---------------+-----------+ +| root | localhost | +| mysql.session | localhost | +| mysql.sys | localhost | ++---------------+-----------+ +3 rows in set (0.00 sec) +``` -```shell -cd /opt/aops/aops_tools -./aops-basedatabase elasticsearch +```mysql +mysql> update user set host = '%' where user='root'; -- 设置允许root用户任意IP访问。 +mysql> flush privileges; -- 刷新权限 +mysql> exit ``` -修改配置文件: +#### 3.1.2.2、 部署redis -修改elasticsearch配置文件: +- 安装redis ```shell -vim /etc/elasticsearch/elasticsearch.yml +yum install redis -y ``` -![1662370718890](./figures/elasticsearch配置2.png) +- 修改配置文件 -![1662370575036](./figures/elasticsearch配置1.png) +```shell +vim /etc/redis.conf +``` -![1662370776219](./figures/elasticsearch3.png) +- 绑定IP -重启elasticsearch服务: +```ini +# It is possible to listen to just one or multiple selected interfaces using +# the "bind" configuration directive, followed by one or more IP addresses. +# +# Examples: +# +# bind 192.168.1.100 10.0.0.1 +# bind 127.0.0.1 ::1 +# +# ~~~ WARNING ~~~ If the computer running Redis is directly exposed to the +# internet, binding to all the interfaces is dangerous and will expose the +# instance to everybody on the internet. 
So by default we uncomment the +# following bind directive, that will force Redis to listen only into +# the IPv4 lookback interface address (this means Redis will be able to +# accept connections only from clients running into the same computer it +# is running). +# +# IF YOU ARE SURE YOU WANT YOUR INSTANCE TO LISTEN TO ALL THE INTERFACES +# JUST COMMENT THE FOLLOWING LINE. +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +bind 127.0.0.1 192.168.1.1 # 此处添加机器A的真实IP +``` + +- 启动redis服务 ```shell -systemctl restart elasticsearch +systemctl start redis ``` -#### 2.4 部署aops-manager +#### 3.1.2.3、 部署prometheus -安装aops-manager +- 安装prometheus ```shell -yum install aops-manager +yum install prometheus2 -y ``` -修改配置文件: +- 修改配置文件 ```shell -vim /etc/aops/manager.ini +vim /etc/prometheus/prometheus.yml +``` + +- 被纳管的客户端**gala-gopher**地址添加至prometheus监控节点 + + > 本指南中机器D用于部署客户端,故添加机器D的gala-gopher地址 + > + > 修改**targets**配置项 + +```yml +# A scrape configuration containing exactly one endpoint to scrape: +# Here it's Prometheus itself. +scrape_configs: + # The job name is added as a label `job=` to any timeseries scraped from this config. + - job_name: 'prometheus' + + # metrics_path defaults to '/metrics' + # scheme defaults to 'http'. 
+ + static_configs: + - targets: ['localhost:9090', '192.168.1.4:8888'] ``` -将配置文件中各服务的地址修改为真实地址,由于将所有服务都部署在机器A,故需把IP地址配为机器A的地址。 +- 启动prometheus服务 ```shell -[manager] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip +systemctl start prometheus +``` + +#### 3.1.2.4、 部署aops-zeus + +- 安装aops-zeus + +``` +yum install aops-zeus -y +``` + +- 修改配置文件 + +``` +vim /etc/aops/zeus.ini +``` + +- 将配置文件中各服务的地址修改为真实地址,本指南中aops-zeus部署于机器B,故需把IP地址配为机器B的ip地址 + +```ini +[zeus] +ip=192.168.1.2 // 此处ip修改为机器B真实ip port=11111 -host_vault_dir=/opt/aops -host_vars=/opt/aops/host_vars [uwsgi] wsgi-file=manage.py -daemonize=/var/log/aops/uwsgi/manager.log +daemonize=/var/log/aops/uwsgi/zeus.log http-timeout=600 harakiri=600 - -[elasticsearch] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip -port=9200 -max_es_query_num=10000000 +processes=2 // 生成指定数目的worker/进程 +gevent=100 // gevent异步核数 [mysql] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip +ip=192.168.1.1 // 此处ip修改为机器A真实ip port=3306 database_name=aops engine_format=mysql+pymysql://@%s:%s/%s -pool_size=10000 +pool_size=100 pool_recycle=7200 -[aops_check] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip -port=11112 +[agent] +default_instance_port=8888 + +[redis] +ip=192.168.1.1 // 此处ip修改为机器A真实ip +port=6379 + +[apollo] +ip=192.168.1.3 // 此处ip修改为部署apollo服务的真实ip(建议apollo与zeus分开部署)。若不使用apollo的漏洞管理功能则可以不配置 +port=11116 ``` -启动aops-manager服务: +> **mysql数据库设置为密码模式**,请参阅[FAQ:密码模式下mysql服务配置链接字符串](#Q5、mysql设置为密码模式) + +- 启动aops-zeus服务 ```shell -systemctl start aops-manager +systemctl start aops-zeus ``` -#### 2.5 部署aops-hermes +**注意:服务启动前请确保已 [初始化aops-zeus数据库](#3125-初始化aops-zeus数据库)** + +> zeus服务启动失败,且报错内容包含mysql数据库连接失败,请排查是否设置mysql密码,如果是请参阅[FAQ:密码模式下mysql服务启动失败](#Q5、mysql设置为密码模式) + +#### 3.1.2.5、 初始化aops-zeus数据库 + +- 执行数据库初始化 + +```shell +cd /opt/aops/scripts/deploy +bash aops-basedatabase.sh init zeus +``` + +**注意:在未安装aops-tools工具包时,也可获取sql脚本通过mysql加载的方式初始化(sql脚本路径:/opt/aops/database/zeus.sql)** + +[FAQ:密码模式下mysql数据库初始化](#Q5、mysql设置为密码模式) -安装aops-hermes 
+[FAQ:/opt/aops/scripts/deploy目录不存在](#Q7、/opt/aops/scripts/deploy目录不存在) + +#### 3.1.2.6、 部署aops-hermes + +- 安装aops-hermes ```shell -yum install aops-hermes +yum install aops-hermes -y ``` -修改配置文件,由于将所有服务都部署在机器A,故需将web访问的各服务地址配置成机器A的真实ip。 +- 修改配置文件 ```shell vim /etc/nginx/aops-nginx.conf ``` -部分服务配置截图: +- 服务配置展示 -![1662378186528](./figures/配置web.png) + > 服务都部署在机器B,需将ngxin代理访问的各服务地址配置为机器B的真实ip -开启aops-hermesb服务: +```ini + # 保证前端路由变动时nginx仍以index.html作为入口 + location / { + try_files $uri $uri/ /index.html; + if (!-e $request_filename){ + rewrite ^(.*)$ /index.html last; + } + } + # 此处修改为aops-zeus部署机器真实IP + location /api/ { + proxy_pass http://192.168.1.2:11111/; + } + # 此处IP对应gala-ragdoll的IP地址,涉及到端口为11114的IP地址都需要进行调整 + location /api/domain { + proxy_pass http://192.168.1.3:11114/; + rewrite ^/api/(.*) /$1 break; + } + # 此处IP对应aops-apollo的IP地址 + location /api/vulnerability { + proxy_pass http://192.168.1.3:11116/; + rewrite ^/api/(.*) /$1 break; + } +``` + +- 开启aops-hermes服务 ```shell systemctl start aops-hermes ``` -#### 2.6 部署kafka +## 3.2、 漏洞管理 -##### 2.6.1 部署zookeeper +CVE管理模块在[资产管理](#31-资产管理)模块的基础上实现,在部署CVE管理模块前须完成[资产管理](#31-资产管理)模块的部署,然后再部署aops-apollo。 -安装: +数据服务部分aops-apollo服务的运行需要**mysql、elasticsearch、redis**数据库的支持。 -```shell -yum install zookeeper -``` +### 3.2.1、 节点信息 -启动服务: +| 机器编号 | 配置IP | 部署模块 | +| -------- | ----------- | ------------- | +| 机器A | 192.168.1.1 | elasticsearch | +| 机器C | 192.168.1.3 | aops-apollo | -```shell -systemctl start zookeeper -``` +### 3.2.2、 部署步骤 -##### 2.6.2 部署kafka +[部署步骤](#312部署步骤) -安装: +#### 3.2.2.1、 部署elasticsearch + +- 生成elasticsearch的repo源 ```shell -yum install kafka +echo "[aops_elasticsearch] +name=Elasticsearch repository for 7.x packages +baseurl=https://artifacts.elastic.co/packages/7.x/yum +gpgcheck=1 +gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch +enabled=1 +autorefresh=1 +type=rpm-md" > "/etc/yum.repos.d/aops_elascticsearch.repo" ``` -修改配置文件: +- 安装elasticsearch ```shell -vim 
/opt/kafka/config/server.properties +yum install elasticsearch-7.14.0-1 -y ``` -将listener 改为本机ip +- 修改elasticsearch配置文件 -![1662381371927](./figures/kafka配置.png) +```shell +vim /etc/elasticsearch/elasticsearch.yml +``` -启动kafka服务: +```yml +# ------------------------------------ Node ------------------------------------ +# +# Use a descriptive name for the node: +# +node.name: node-1 +``` + +```yml +# ---------------------------------- Network ----------------------------------- +# +# By default Elasticsearch is only accessible on localhost. Set a different +# address here to expose this node on the network: +# +# 此处修改为机器A真实ip +network.host: 192.168.1.1 +# +# By default Elasticsearch listens for HTTP traffic on the first free port it +# finds starting at 9200. Set a specific HTTP port here: +# +http.port: 9200 +# +# For more information, consult the network module documentation. +# +``` + +```yml +# --------------------------------- Discovery ---------------------------------- +# +# Pass an initial list of hosts to perform discovery when this node is started: +# The default list of hosts is ["127.0.0.1", "[::1]"] +# +#discovery.seed_hosts: ["host1", "host2"] +# +# Bootstrap the cluster using an initial set of master-eligible nodes: +# +cluster.initial_master_nodes: ["node-1"] +# 跨域配置 +http.cors.enabled: true +http.cors.allow-origin: "*" +``` + +- 重启elasticsearch服务 ```shell -cd /opt/kafka/bin -nohup ./kafka-server-start.sh ../config/server.properties & -tail -f ./nohup.out # 查看nohup所有的输出出现A本机ip 以及 kafka启动成功INFO; +systemctl restart elasticsearch ``` -#### 2.7 部署aops-check +#### 3.2.2.2、 部署aops-apollo -安装aops-check: +- 安装aops-apollo ```shell -yum install aops-check +yum install aops-apollo -y ``` -修改配置文件: +- 修改配置文件 -```shell -vim /etc/aops/check.ini +``` +vim /etc/aops/apollo.ini ``` -将配置文件中各服务的地址修改为真实地址,由于将所有服务都部署在机器A,故需把IP地址配为机器A的地址。 +- 将apollo.ini配置文件中各服务的地址修改为真实地址 -```shell -[check] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip -port=11112 -mode=configurable // 
该模式为configurable模式,用于常规诊断模式下的调度器 -timing_check=on +```ini +[apollo] +ip=192.168.1.3//此处修改为机器C的真实IP +port=11116 +host_vault_dir=/opt/aops +host_vars=/opt/aops/host_vars -[default_mode] -period=30 -step=30 +[zeus] +ip=192.168.1.2 //此处修改为机器B的真实IP +port=11111 -[elasticsearch] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip -port=9200 +# hermes info is used to send mail. +[hermes] +ip=192.168.1.2 //此处修改为部署aops-hermes的真实IP,以机器B的IP地址为例 +port=80 //此处改为hermes服务实际使用端口 + +[cve] +cve_fix_function=yum +# value between 0-23, for example, 2 means 2:00 in a day. +cve_scan_time=2 [mysql] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip +ip=192.168.1.1 //此处修改为机器A的真实IP port=3306 database_name=aops engine_format=mysql+pymysql://@%s:%s/%s -pool_size=10000 +pool_size=100 pool_recycle=7200 -[prometheus] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip -port=9090 -query_range_step=15s +[elasticsearch] +ip=192.168.1.1 //此处修改为机器A的真实IP +port=9200 +max_es_query_num=10000000 -[agent] -default_instance_port=8888 +[redis] +ip=192.168.1.1 //此处修改为机器A的真实IP +port=6379 -[manager] -ip=192.168.1.1 // 此处及后续服务ip修改为机器A真实ip -port=11111 +[uwsgi] +wsgi-file=manage.py +daemonize=/var/log/aops/uwsgi/apollo.log +http-timeout=600 +harakiri=600 +processes=2 +gevent=100 -[consumer] -kafka_server_list=192.168.1.1:9092 // 此处及后续服务ip修改为机器A真实ip -enable_auto_commit=False -auto_offset_reset=earliest -timeout_ms=5 -max_records=3 -task_name=CHECK_TASK -task_group_id=CHECK_TASK_GROUP_ID -result_name=CHECK_RESULT -[producer] -kafka_server_list = 192.168.1.1:9092 // 此处及后续服务ip修改为机器A真实ip -api_version = 0.11.5 -acks = 1 -retries = 3 -retry_backoff_ms = 100 -task_name=CHECK_TASK -task_group_id=CHECK_TASK_GROUP_ID ``` +> **mysql数据库设置为密码模式**,请参阅[密码模式下mysql服务配置链接字符串](#Q5、mysql设置为密码模式) -启动aops-check服务(configurable模式): +- 启动aops-apollo服务 ```shell -systemctl start aops-check +systemctl start aops-apollo ``` -#### 2.8 部署客户端服务 - -客户端机器的服务需要部署aops-agent及gala-gopher,具体可参考[aops-agent部署指南](aops-agent部署指南.md)。 +**注意:服务启动前请确保已 [初始化aops-apollo数据库](#3223初始化aops-apollo数据库)** 
-注意:主机注册时需要先在前端添加主机组操作,确保该主机所属的主机组存在。此处只对机器A做部署、纳管。 +> apollo服务启动失败,且报错内容包含mysql数据库连接失败,请排查是否设置mysql密码,如果是请参阅[密码模式下mysql服务启动失败](#Q5、mysql设置为密码模式) -#### 2.9 部署prometheus +#### 3.2.2.3、初始化aops-apollo数据库 -安装prometheus: +- apollo数据库初始化 ```shell -yum install prometheus2 +cd /opt/aops/scripts/deploy +bash aops-basedatabase.sh init apollo ``` -修改配置文件: +**注意:在未安装aops-tools工具包时,也可获取sql脚本通过mysql加载的方式初始化(sql脚本路径:/opt/aops/database/apollo.sql)** -```shell -vim /etc/prometheus/prometheus.yml -``` +[FAQ:密码模式下mysql数据库初始化](#Q5、mysql设置为密码模式) -将所有客户端的gala-gopher地址新增到prometheus的监控节点中。 +[FAQ:/opt/aops/scripts/deploy目录不存在](#Q7、/opt/aops/scripts/deploy目录不存在) -![1662377261742](./figures/prometheus配置.png) +## 3.3、 配置溯源 -启动服务: +A-Ops配置溯源在机器管理的基础上依赖gala-ragdoll实现,同样在部署gala-ragdoll服务之前,须完成[资产管理](#31-资产管理)部分的部署。 -```shell -systemctl start prometheus -``` +### 3.3.1、 节点信息 -#### 2.10 部署gala-ragdoll +| 机器编号 | 配置IP | 部署模块 | +| -------- | ----------- | ------------ | +| 机器C | 192.168.1.3 | gala-ragdoll | -A-Ops配置溯源功能依赖gala-ragdoll实现,通过Git实现配置文件的变动监测。 +### 3.3.2、 部署步骤 -安装gala-ragdoll: +[部署步骤](#312部署步骤) + +#### 3.3.2.1、 部署gala-ragdoll + +- 安装gala-ragdoll ```shell -yum install gala-ragdoll # A-Ops 配置溯源 +yum install gala-ragdoll python3-gala-ragdoll -y ``` -修改配置文件: +- 修改配置文件 ```shell vim /etc/ragdoll/gala-ragdoll.conf ``` -将collect节点collect_address中IP地址修改为机器A的地址,collect_api与collect_port修改为实际接口地址。 +> **将collect节点collect_address中IP地址修改为机器B的地址,collect_api与collect_port修改为实际接口地址** -```text +```ini [git] git_dir = "/home/confTraceTest" user_name = "user_name" user_email = "user_email" [collect] -collect_address = "http://192.168.1.1" //此处修改为机器A的真实IP -collect_api = "/manage/config/collect" //此处修改为配置文件采集的实际接口 -collect_port = 11111 //此处修改为服务的实际端口 +collect_address = "http://192.168.1.2" //此处修改为机器B的真实IP +collect_api = "/manage/config/collect" //此处接口原为示例值,需修改为实际接口值/manage/config/collect +collect_port = 11111 //此处修改为aops-zeus服务的实际端口 [sync] -sync_address = "http://0.0.0.0" -sync_api = "/demo/syncConf" -sync_port = 
11114 +sync_address = "http://192.168.1.2" +sync_api = "/manage/config/sync"    //此处接口原为示例值,需修改为实际接口值/manage/config/sync +sync_port = 11111 +[objectFile] +object_file_address = "http://192.168.1.2" +object_file_api = "/manage/config/objectfile"   //此处接口原为示例值,需修改为实际接口值/manage/config/objectfile +object_file_port = 11111 [ragdoll] port = 11114 - ``` -启动gala-ragdoll服务 +- 启动gala-ragdoll服务 ```shell systemctl start gala-ragdoll ``` -### 机器B +## 3.4、 异常检测 + +异常检测模块依赖[资产管理](#31-资产管理)服务,故在部署异常检测功能前须完成[资产管理](#31-资产管理)模块部署,然后再部署aops-diana。 + +基于分布式部署考虑,aops-diana服务需在机器B与机器C同时部署,分别扮演消息队列中的生产者与消费者角色。 + +数据服务方面,aops-diana服务的运行需要**mysql、elasticsearch、kafka、redis**以及**prometheus**的支持。 + +### 3.4.1、 节点信息 + +| 机器编号 | 配置IP      | 部署模块   | +| -------- | ----------- | ---------- | +| 机器A    | 192.168.1.1 | kafka      | +| 机器B    | 192.168.1.2 | aops-diana | +| 机器C    | 192.168.1.3 | aops-diana | + +### 3.4.2、 部署步骤 -机器B只需部署aops-check作为执行器。 +[部署步骤](#312部署步骤) -#### 2.11 部署aops-check +[部署elasticsearch](#3221-部署elasticsearch) -安装aops-check: +#### 3.4.2.1、 部署kafka + +kafka使用ZooKeeper管理、协调代理(broker),在应用**kafka**服务时需要同步部署**zookeeper**服务。 + +- 安装zookeeper ```shell -yum install aops-check +yum install zookeeper -y ``` -修改配置文件: +- 启动zookeeper服务 ```shell -vim /etc/aops/check.ini +systemctl start zookeeper ``` -将配置文件中各服务的地址修改为真实地址,除check服务为机器B的地址外,其他服务都部署在机器A,故需把IP地址配置为机器A的地址即可。 +- 安装kafka ```shell -[check] -ip=192.168.1.2   // 此处ip改为机器B真实ip +yum install kafka -y +``` + +- 修改kafka配置文件 + +```shell +vim /opt/kafka/config/server.properties +``` + +- 修改**listeners**为本机ip(即部署kafka的机器A的ip) + +```properties +############################# Socket Server Settings ############################# + +# The address the socket server listens on. It will get the value returned from +# java.net.InetAddress.getCanonicalHostName() if not configured. 
+# FORMAT: +#   listeners = listener_name://host_name:port +# EXAMPLE: +#   listeners = PLAINTEXT://your.host.name:9092 +listeners=PLAINTEXT://192.168.1.1:9092 +``` + +- 后台运行kafka服务 + +```shell +cd /opt/kafka/bin +nohup ./kafka-server-start.sh ../config/server.properties & + +# 查看nohup的全部输出,确认其中出现机器A本机ip以及kafka启动成功的INFO日志 +tail -f ./nohup.out +``` + +#### 3.4.2.2、 部署diana + +- 安装aops-diana + +```shell +yum install aops-diana -y +``` + +- 修改配置文件 + +  > 机器B与机器C中aops-diana分别扮演不同的角色,通过**配置文件的差异来区分两者扮演角色的不同** + +```shell +vim /etc/aops/diana.ini +``` + +(1)机器C中aops-diana以**executor**模式启动,**扮演kafka消息队列中的消费者角色**,配置文件需修改部分如下所示 + +```ini +[diana] +ip=192.168.1.3  // 此处ip修改为机器C真实ip port=11112 -mode=executor  // executor,用于常规诊断模式下的执行器 +mode=executor  // 该模式为executor模式,用于常规诊断模式下的执行器,扮演kafka中消费者角色。 timing_check=on [default_mode] -period=30 -step=30 +period=60 +step=60 [elasticsearch] -ip=192.168.1.1  // 此处及后续服务ip修改为机器A真实ip +ip=192.168.1.1  // 此处ip修改为机器A真实ip port=9200 +max_es_query_num=10000000 [mysql] -ip=192.168.1.1  // 此处及后续服务ip修改为机器A真实ip +ip=192.168.1.1  // 此处ip修改为机器A真实ip port=3306 database_name=aops engine_format=mysql+pymysql://@%s:%s/%s -pool_size=10000 +pool_size=100 pool_recycle=7200 +[redis] +ip=192.168.1.1  // 此处ip修改为机器A真实ip +port=6379 + [prometheus] -ip=192.168.1.1  // 此处及后续服务ip修改为机器A真实ip +ip=192.168.1.1  // 此处ip修改为机器A真实ip port=9090 query_range_step=15s [agent] default_instance_port=8888 -[manager] -ip=192.168.1.1  // 此处及后续服务ip修改为机器A真实ip +[zeus] +ip=192.168.1.2  // 此处ip修改为机器B真实ip port=11111 [consumer] -kafka_server_list=192.168.1.1:9092  // 此处及后续服务ip修改为机器A真实ip +kafka_server_list=192.168.1.1:9092  // 此处ip修改为机器A真实ip enable_auto_commit=False auto_offset_reset=earliest timeout_ms=5 @@ -439,20 +715,224 @@ max_records=3 task_name=CHECK_TASK task_group_id=CHECK_TASK_GROUP_ID result_name=CHECK_RESULT + [producer] -kafka_server_list = 192.168.1.1:9092  // 此处及后续服务ip修改为机器A真实ip +kafka_server_list = 192.168.1.1:9092  // 此处ip修改为机器A真实ip api_version = 0.11.5 acks = 1 retries = 3 retry_backoff_ms = 100 
task_name=CHECK_TASK task_group_id=CHECK_TASK_GROUP_ID + +[uwsgi] +wsgi-file=manage.py +daemonize=/var/log/aops/uwsgi/diana.log +http-timeout=600 +harakiri=600 +processes=2 +threads=2 ``` -启动aops-check服务(executor模式): +> **mysql数据库设置为密码模式**,请参阅[FAQ:密码模式下mysql服务配置链接字符串](#Q5、mysql设置为密码模式) + +(2)机器B中diana以**configurable**模式启动,**扮演kafka消息队列中的生产者角色**,aops-hermes中关于aops-diana的端口配置以该机器ip与端口为准,配置文件需修改部分如下所示 + +```ini +[diana] +ip=192.168.1.2 // 此处ip修改为机器B真实ip +port=11112 +mode=configurable // 该模式为configurable模式,用于常规诊断模式下的调度器,充当生产者角色。 +timing_check=on + +[default_mode] +period=60 +step=60 + +[elasticsearch] +ip=192.168.1.1 // 此处ip修改为机器A真实ip +port=9200 +max_es_query_num=10000000 + +[mysql] +ip=192.168.1.1 // 此处ip修改为机器A真实ip +port=3306 +database_name=aops +engine_format=mysql+pymysql://@%s:%s/%s +pool_size=100 +pool_recycle=7200 + +[redis] +ip=192.168.1.1 // 此处ip修改为机器A真实ip +port=6379 + +[prometheus] +ip=192.168.1.1 // 此处ip修改为机器A真实ip +port=9090 +query_range_step=15s + +[agent] +default_instance_port=8888 + +[zeus] +ip=192.168.1.2 // 此处ip修改为机器B真实ip +port=11111 + +[consumer] +kafka_server_list=192.168.1.1:9092 // 此处ip修改为机器A真实ip +enable_auto_commit=False +auto_offset_reset=earliest +timeout_ms=5 +max_records=3 +task_name=CHECK_TASK +task_group_id=CHECK_TASK_GROUP_ID +result_name=CHECK_RESULT + +[producer] +kafka_server_list = 192.168.1.1:9092 // 此处ip修改为机器A真实ip +api_version = 0.11.5 +acks = 1 +retries = 3 +retry_backoff_ms = 100 +task_name=CHECK_TASK +task_group_id=CHECK_TASK_GROUP_ID + +[uwsgi] +wsgi-file=manage.py +daemonize=/var/log/aops/uwsgi/diana.log +http-timeout=600 +harakiri=600 +processes=2 +threads=2 +``` + +> **mysql数据库设置为密码模式**,请参阅[FAQ:密码模式下mysql服务配置链接字符串](#Q5、mysql设置为密码模式) + +- 启动aops-diana服务 + +```shell +systemctl start aops-diana +``` + +**注意:服务启动前请确保已 [初始化aops-diana数据库](#3423初始化aops-diana数据库)** + +> diana服务启动失败,且报错内容包含mysql数据库连接失败,请排查是否设置mysql密码,如果是请参阅[FAQ:密码模式下mysql服务启动失败](#Q5、mysql设置为密码模式) + +#### 3.4.2.3、初始化aops-diana数据库 + +- diana数据库初始化 + +```shell +cd 
/opt/aops/scripts/deploy +bash aops-basedatabase.sh init diana +``` + +**注意:在未安装aops-tools工具包时,也可获取sql脚本通过mysql加载的方式初始化(sql脚本路径:/opt/aops/database/diana.sql)** + +[FAQ:密码模式下mysql数据库初始化](#Q5、mysql设置为密码模式) + +[FAQ:/opt/aops/scripts/deploy目录不存在](#Q7、/opt/aops/scripts/deploy目录不存在) + +## 3.5、客户端安装 + +aops-ceres作为A-Ops模块的客户端,通过ssh协议与A-Ops管理中心进行数据交互,提供采集主机信息、响应并处理中心命令等功能。 + +### 3.5.1、 节点信息 + +| 机器编号 | 配置IP      | 部署模块   | +| -------- | ----------- | ---------- | +| 机器D    | 192.168.1.4 | aops-ceres | + +### 3.5.2、 部署客户端 + +```shell +yum install aops-ceres dnf-hotpatch-plugin -y +``` + +## FAQ + +#### Q1、最大连接数(MaxStartups) + +批量添加主机时,接口服务的执行会受到aops-zeus安装所在主机sshd服务配置中最大连接数(MaxStartups)的限制,可能出现部分主机不能连接的情况;如有大量添加主机的需求,可考虑临时调增该数值。关于该配置项的修改可参考[ssh文档](https://www.man7.org/linux/man-pages/man5/sshd_config.5.html)。 + +#### Q2、504网关超时 + +部分http访问接口执行时间较长,web端可能返回504错误,可向nginx配置中添加proxy_read_timeout配置项,并适当调大该数值,以降低504问题出现的概率。 + +#### Q3、防火墙 + +若防火墙不方便关闭,请设置放行服务部署过程涉及的所有接口,否则会造成服务不可访问,影响A-Ops的正常使用。 + +#### Q4、elasticsearch访问拒绝 + +elasticsearch分布式部署多节点时,需调整配置跨域部分,允许各节点访问。 + +#### Q5、mysql设置为密码模式 + +- **服务配置mysql连接字符串** + +mysql数据库连接设置为密码模式(例如用户名为**root**,密码为**123456**)时,需要调整[mysql]配置节下engine_format配置项(apollo、zeus同步调整),数据格式如下: + +```ini +[mysql] +engine_format=mysql+pymysql://root:123456@%s:%s/%s +``` + +- **初始化脚本aops-basedatabase.sh修改** + +aops-basedatabase.sh脚本需要调整第145行代码 + +> aops-basedatabase.sh调整前内容如下: + +```shell +database = pymysql.connect(host='$mysql_ip', port=$port, database='mysql', autocommit=True,client_flag=CLIENT.MULTI_STATEMENTS) +``` + +> aops-basedatabase.sh调整后内容如下: + +```shell +database = pymysql.connect(host='$mysql_ip', port=$port, database='mysql', password='密码', user='用户名', autocommit=True, client_flag=CLIENT.MULTI_STATEMENTS) +``` + +- **服务启动时数据库连接错误** + +**/usr/bin/aops-vulcanus**脚本需要调整第178行代码 + +> /usr/bin/aops-vulcanus调整前内容如下: + +```shell +connect = pymysql.connect(host='$mysql_ip', port=$port, database='$aops_database') +``` + +> /usr/bin/aops-vulcanus调整后内容如下: + 
+```shell +connect = pymysql.connect(host='$mysql_ip', port=$port, database='$aops_database', password='密码', user='用户名') +``` + +**注意:当服务器不是以root用户登录时,需添加user="root"或mysql允许连接的用户名** + +#### Q6、配置update源 ```shell -systemctl start aops-check +echo "[update] +name=update +baseurl=http://repo.openeuler.org/openEuler-24.03-LTS/update/$basearch/ +enabled=1 +gpgcheck=0 +[update-epol] +name=update-epol +baseurl=http://repo.openeuler.org/openEuler-24.03-LTS/EPOL/update/main/$basearch/ +enabled=1 +gpgcheck=0" > /etc/yum.repos.d/openEuler-update.repo ``` -至此,两台机器的服务部署完成。 \ No newline at end of file +> 注意: 其中**openEuler-24.03-LTS** 根据部署的系统版本具体调整,或可直接参考openEuler官网中针对repo源配置的介绍 + +#### Q7、/opt/aops/scripts/deploy目录不存在 + +在执行数据库初始化时,提示不存在`/opt/aops/scripts/deploy`文件目录,执行安装aops-tools工具包 + +```shell +yum install aops-tools -y +``` diff --git "a/docs/zh/docs/A-Ops/aops-agent\351\203\250\347\275\262\346\214\207\345\215\227.md" "b/docs/zh/docs/A-Ops/aops-agent\351\203\250\347\275\262\346\214\207\345\215\227.md" deleted file mode 100644 index da0184bd611a216be79c106127936885d17e2b59..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/A-Ops/aops-agent\351\203\250\347\275\262\346\214\207\345\215\227.md" +++ /dev/null @@ -1,655 +0,0 @@ -# aops-agent部署指南 - -## 一、环境要求 - -1台openEuler机器,建议openEuler-20.03及以上版本运行。 - -## 二、配置环境部署 - -#### 1. 关闭防火墙 - -```shell -systemctl stop firewalld -systemctl disable firewalld -systemctl status firewalld -``` - -#### 2. aops-agent部署 - -1. 基于yum源安装:yum install aops-agent - -2. 修改配置文件:将agent节点下IP标签值修改为本机IP, - -   vim /etc/aops/agent.conf,以IP地址为192.168.1.47为例 - -   ```ini -   [agent] -   ;启动aops-agent时,绑定的IP与端口 -   ip=192.168.1.47 -   port=12000 - -   [gopher] -   ;gala-gopher默认配置文件路径,如需修改请确保文件路径的准确性 -   config_path=/opt/gala-gopher/gala-gopher.conf - -   ;aops-agent采集日志配置 -   [log] -   ;采集日志级别,可设置为DEBUG,INFO,WARNING,ERROR,CRITICAL -   log_level=INFO -   ;采集日志存放位置 -   log_dir=/var/log/aops -   ;日志文件最大值 -   max_bytes=31457280 -   ;备份日志的数量 -   backup_count=40 -   ``` - -3. 
启动服务:systemctl start aops-agent - -#### 3. 向aops-manager注册 - -为了辨别使用者的身份,避免接口被随意调用,aops-agent采用token验证身份,以减轻所部署主机的压力。 - -基于安全性考虑,项目采用主动注册的方式去获取token。注册前,须在agent侧准备好需要注册的信息,调用register命令向aops-manager注册。由于agent未配置数据库,注册成功后,自动将token保存到指定文件内,并在前台展示注册结果。同时将本机相关信息存入到aops-manager侧的数据库中,以便后续管理。 - -1. 准备register.json 文件 - - 在aops-agent侧准备好注册所需信息以json格式存入文件中,数据结构如下: - -```JSON -{ - // 前端登录用户名 - "web_username":"admin", - // 用户密码 - "web_password": "changeme", - // 主机名称 - "host_name": "host1", - // 主机所在组名称 - "host_group_name": "group1", - // aops-manager运行主机IP地址 - "manager_ip":"192.168.1.23", - // 是否注册为管理机器 - "management":false, - // aops-manager运行对外端口 - "manager_port":"11111", - // agent运行端口 - "agent_port":"12000" -} -``` - -`注意:确保aops-manager已在目标主机运行,如192.168.1.23,且注册的主机组要存在。` - -2. 执行:aops_agent register -f register.json, -3. 前台展示注册结果,注册成功时,保存token字符串至指定文件;注册失败时,根据提示以及日志内容了解具体原因。(/var/log/aops/aops.log) - -注册结果示例: - -`注册成功` - -```shell -[root@localhost ~]# aops_agent register -f register.json -Agent Register Success -``` - -`注册失败,以aops-manager未启动为示例` - -```shell -[root@localhost ~]# aops_agent register -f register.json -Agent Register Fail -[root@localhost ~]# -``` - -`对应日志内容` - -```shell -2022-09-05 16:11:52,576 ERROR command_manage/register/331: HTTPConnectionPool(host='192.168.1.23', port=11111): Max retries exceeded with url: /manage/host/add (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused')) -[root@localhost ~]# -``` - -## 三、插件支持 - -#### 3.1 gala-gopher - -##### 3.1.1 介绍 - -gala-gopher是基于eBPF的低负载探针框架,可用于对主机的CPU,内存,网络等状态的监控以及数据采集服务。可根据实际业务需求对已有采集探针采集状态进行配置。 - -##### 3.1.2 部署 - -1. 基于yum源安装:yum install gala-gopher -2. 基于实际的业务需求,选择需要探针进行开启,探针信息可在/opt/gala-gopher/gala-gopher.conf下查看。 -3. 
启动服务:systemctl start gala-gopher - -##### 3.1.3 其他 - -gala-gopher更多信息可参考文档https://gitee.com/openeuler/gala-gopher/blob/master/README.md - -## 四、接口支持 - -### 4.1 对外接口清单 - -| 序号 | 接口名称 | 类型 | 说明 | -| ---- | ------------------------------ | ---- | ----------------------| -| 1 | /v1/agent/plugin/start | POST | 启动插件 | -| 2 | /v1/agent/plugin/stop | POST | 停止插件 | -| 3 | /v1/agent/application/info | GET | 采集目标应用集内正在运行的应用 | -| 4 | /v1/agent/host/info | GET | 获取主机信息 | -| 5 | /v1/agent/plugin/info | GET | 获取agent中插件运行信息 | -| 6 | /v1/agent/file/collect | POST | 采集配置文件内容 | -| 7 | /v1/agent/collect/items/change | POST | 改变插件采集项的运行状态 | - -#### 4.1.1、/v1/agent/plugin/start - -+ 描述:启动已安装但未运行的插件,目前仅支持gala-gopher插件。 - -+ HTTP请求方式:POST - -+ 数据提交方式:query - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | ----------- | ---- | ---- | ------ | - | plugin_name | True | str | 插件名 | - -+ 请求参数示例 - - | 参数名 | 参数值 | - | ----------- | ----------- | - | plugin_name | gala-gopher | - -+ 返回体参数 - - | 参数名 | 类型 | 说明 | - | ------ | ---- | ---------------- | - | code | int/ | 返回码 | - | msg | str | 状态码对应的信息 | - -+ 返回示例 - - ```json - { - "code": 200, - "msg": "xxxx" - } - ``` - - -#### 4.1.2、/v1/agent/plugin/stop - -+ 描述:使正在运行的插件停止,目前仅支持gala-gopher插件。 - -+ HTTP请求方式:POST - -+ 数据提交方式:query - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | ----------- | ---- | ---- | ------ | - | plugin_name | True | str | 插件名 | - -+ 请求参数示例: - - | 参数名 | 参数值 | - | ----------- | ----------- | - | plugin_name | gala-gopher | - -+ 返回体参数: - - | 参数名 | 类型 | 说明 | - | ------ | ---- | ---------------- | - | code | int | 返回码 | - | msg | str | 状态码对应的信息 | - -+ 返回示例: - - ```json - { - "code": 200, - "msg": "xxxx" - } - ``` - - -#### 4.1.3、/v1/agent/application/info - -+ 描述:采集目标应用集内正在运行的应用,当前目标应用集包含mysql, kubernetes, hadoop, nginx, docker, gala-gopher。 - -+ HTTP请求方式:GET - -+ 数据提交方式:query - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | ------ | ---- | ---- | ---- | - | | | | | - -+ 请求参数示例: - - | 参数名 | 参数值 | - | ------ | ------ | - | | | - -+ 返回体参数: - - | 参数名 | 类型 
| 说明 | - | ------ | ---- | ---------------- | - | code | int | 返回码 | - | msg | str | 状态码对应的信息 | - | resp | dict | 响应数据主体 | - - + resp - - | 参数名 | 类型 | 说明 | - | ------- | --------- | -------------------------- | - | running | List[str] | 包含正在运行应用名称的系列 | - -+ 返回示例: - - ```json - { - "code": 200, - "msg": "xxxx", - "resp": { - "running": [ - "mysql", - "docker" - ] - } - } - ``` - - -#### 4.1.4、/v1/agent/host/info - -+ 描述:获取安装agent主机的信息,包含系统版本,BIOS版本,内核版本,CPU信息以及内存信息。 - -+ HTTP请求方式:POST - -+ 数据提交方式:application/json - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | --------- | ---- | --------- | ------------------------------------------------ | - | info_type | True | List[str] | 需采集信息的名称,目前仅支持cpu、disk、memory、os | - -+ 请求参数示例: - - ```json - ["os", "cpu","memory", "disk"] - ``` - -+ 返回体参数: - - | 参数名 | 类型 | 说明 | - | ------ | ---- | ---------------- | - | code | int | 返回码 | - | msg | str | 状态码对应的信息 | - | resp | dict | 响应数据主体 | - - resp - - | 参数名 | 类型 | 说明 | - | ------ | ---------- | -------- | - | cpu | dict | cpu信息 | - | memory | dict | 内存信息 | - | os | dict | OS信息 | - | disk | List[dict] | 硬盘信息 | - - cpu - - | 参数名 | 类型 | 说明 | - | ------------ | ---- | --------------- | - | architecture | str | CPU架构 | - | core_count | int | 核心数 | - | l1d_cache | str | 1级数据缓存大小 | - | l1i_cache | str | 1级指令缓存大小 | - | l2_cache | str | 2级缓存大小 | - | l3_cache | str | 3级缓存大小 | - | model_name | str | 模式名称 | - | vendor_id | str | 厂商ID | - - memory - - | 参数名 | 类型 | 说明 | - | ------ | ---------- | -------------- | - | size | str | 总内存大小 | - | total | int | 内存条数量 | - | info | List[dict] | 所有内存条信息 | - - info - - | 参数名 | 类型 | 说明 | - | ------------ | ---- | -------- | - | size | str | 内存大小 | - | type | str | 类型 | - | speed | str | 速度 | - | manufacturer | str | 厂商 | - - os - - | 参数名 | 类型 | 说明 | - | ------------ | ---- | -------- | - | bios_version | str | bios版本 | - | os_version | str | 系统名称 | - | kernel | str | 内核版本 | - -+ 返回示例: - - ```json - { - "code": 200, - "msg": "operate success", - "resp": { - "cpu": { - 
"architecture": "aarch64", - "core_count": "128", - "l1d_cache": "8 MiB (128 instances)", - "l1i_cache": "8 MiB (128 instances)", - "l2_cache": "64 MiB (128 instances)", - "l3_cache": "128 MiB (4 instances)", - "model_name": "Kunpeng-920", - "vendor_id": "HiSilicon" - }, - "memory": { - "info": [ - { - "manufacturer": "Hynix", - "size": "16 GB", - "speed": "2933 MT/s", - "type": "DDR4" - }, - { - "manufacturer": "Hynix", - "size": "16 GB", - "speed": "2933 MT/s", - "type": "DDR4" - } - ], - "size": "32G", - "total": 2 - }, - "os": { - "bios_version": "1.82", - "kernel": "5.10.0-60.18.0.50", - "os_version": "openEuler 22.03 LTS" - }, - "disk": [ - { - "capacity": "xxGB", - "model": "xxxxxx" - } - ] - } - } - ``` - -#### 4.1.5、/v1/agent/plugin/info - -+ 描述:获取主机的插件运行情况,目前仅支持gala-gopher插件。 - -+ HTTP请求方式:GET - -+ 数据提交方式:query - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | ------ | ---- | ---- | ---- | - | | | | | - -+ 请求参数示例: - - | 参数名 | 参数值 | - | ------ | ------ | - | | | - -+ 返回体参数: - - | 参数名 | 类型 | 说明 | - | ------ | ---------- | ---------------- | - | code | int | 返回码 | - | msg | str | 状态码对应的信息 | - | resp | List[dict] | 响应数据主体 | - - resp - - | 参数名 | 类型 | 说明 | - | ------------- | ---------- | ------------------ | - | plugin_name | str | 插件名称 | - | collect_items | list | 插件采集项运行情况 | - | is_installed | str | 状态码对应的信息 | - | resource | List[dict] | 插件资源使用情况 | - | status | str | 插件运行状态 | - - resource - - | 参数名 | 类型 | 说明 | - | ------------- | ---- | ---------- | - | name | str | 资源名称 | - | current_value | str | 资源使用值 | - | limit_value | str | 资源限制值 | - -+ 返回示例: - - ```json - { - "code": 200, - "msg": "operate success", - "resp": [ - { - "collect_items": [ - { - "probe_name": "system_tcp", - "probe_status": "off", - "support_auto": false - }, - { - "probe_name": "haproxy", - "probe_status": "auto", - "support_auto": true - }, - { - "probe_name": "nginx", - "probe_status": "auto", - "support_auto": true - }, - ], - "is_installed": true, - "plugin_name": "gala-gopher", - 
"resource": [ - { - "current_value": "0.0%", - "limit_value": null, - "name": "cpu" - }, - { - "current_value": "13 MB", - "limit_value": null, - "name": "memory" - } - ], - "status": "active" - } - ] - } - ``` - -#### 4.1.6、/v1/agent/file/collect - -+ 描述:采集目标配置文件内容、文件权限、文件所属用户等信息。当前仅支持读取小于1M,无执行权限,且支持UTF8编码的文本文件。 - -+ HTTP请求方式:POST - -+ 数据提交方式:application/json - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | --------------- | ---- | --------- | ------------------------ | - | configfile_path | True | List[str] | 需采集文件完整路径的序列 | - -+ 请求参数示例: - - ```json - [ "/home/test.conf", "/home/test.ini", "/home/test.json"] - ``` - -+ 返回体参数: - - | 参数名 | 类型 | 说明 | - | ------------- | ---------- | ---------------- | - | infos | List[dict] | 文件采集信息 | - | success_files | List[str] | 采集成功文件列表 | - | fail_files | List[str] | 采集失败文件列表 | - - infos - - | 参数名 | 类型 | 说明 | - | --------- | ---- | -------- | - | path | str | 文件路径 | - | content | str | 文件内容 | - | file_attr | dict | 文件属性 | - - file_attr - - | 参数名 | 类型 | 说明 | - | ------ | ---- | ------------ | - | mode | str | 文件类型权限 | - | owner | str | 文件所属用户 | - | group | str | 文件所属群组 | - -+ 返回示例: - - ```json - { - "infos": [ - { - "content": "this is a test file", - "file_attr": { - "group": "root", - "mode": "0644", - "owner": "root" - }, - "path": "/home/test.txt" - } - ], - "success_files": [ - "/home/test.txt" - ], - "fail_files": [ - "/home/test.txt" - ] - } - ``` - -#### 4.1.7、/v1/agent/collect/items/change - -+ 描述:更改插件采集项的采集状态,当前仅支持对gala-gopher采集项的更改,gala-gopher采集项可在配置文件中查看`/opt/gala-gopher/gala-gopher.conf`。 - -+ HTTP请求方式:POST - -+ 数据提交方式:application/json - -+ 请求参数: - - | 参数名 | 必选 | 类型 | 说明 | - | ----------- | ---- | ---- | -------------------------- | - | plugin_name | True | dict | 插件采集项预期修改结果数据 | - - plugin_name - - | 参数名 | 必选 | 类型 | 说明 | - | ------------ | ---- | ------ | ------------------ | - | collect_item | True | string | 采集项预期修改结果 | - -+ 请求参数示例: - - ```json - { - "gala-gopher":{ - "redis":"auto", - "system_inode":"on", - "tcp":"on", 
- "haproxy":"auto" - } - } - ``` - -+ 返回体参数: - - | 参数名 | 类型 | 说明 | - | ------ | ---------- | ---------------- | - | code | int | 返回码 | - | msg | str | 状态码对应的信息 | - | resp | List[dict] | 响应数据主体 | - - + resp - - | 参数名 | 类型 | 说明 | - | ----------- | ---- | ------------------ | - | plugin_name | dict | 对应采集项修改结果 | - - + plugin_name - - | 参数名 | 类型 | 说明 | - | ------- | --------- | ---------------- | - | success | List[str] | 修改成功的采集项 | - | failure | List[str] | 修改失败的采集项 | - -+ 返回示例: - - ```json - { - "code": 200, - "msg": "operate success", - "resp": { - "gala-gopher": { - "failure": [ - "redis" - ], - "success": [ - "system_inode", - "tcp", - "haproxy" - ] - } - } - } - ``` - -## FAQ - -1. 若有报错,请查看日志/var/log/aops/aops.log,根据日志中相关报错提示解决问题,并重启服务。 - -2. 建议项目在Python3.7以上环境运行,安装Python依赖库时需要注意其版本。 - -3. access_token值可在注册完成后,从`/etc/aops/agent.conf`文件中获取。 - -4. 对于插件CPU,以及内存的资源限制目前通过在插件对应service文件中的Service节点下添加MemoryHigh和CPUQuota标签实现。 - - 如对gala-gopher内存限制为40M,CPU限制为20%。 - - ```ini - [Unit] - Description=a-ops gala gopher service - After=network.target - - [Service] - Type=exec - ExecStart=/usr/bin/gala-gopher - Restart=on-failure - RestartSec=1 - RemainAfterExit=yes - ;尽可能限制该单元中的进程最多可以使用多少内存,该限制允许突破,但突破限制后,进程运行速度会收到限制,并且系统会尽可能回收超出的内存 - ;选项值可以是以字节为单位的 绝对内存大小(可以使用以1024为基数的 K, M, G, T 后缀), 也可以是以百分比表示的相对内存大小 - MemoryHigh=40M - ;为此单元的进程设置CPU时间限额,必须设为一个以"%"结尾的百分数, 表示该单元最多可使用单颗CPU总时间的百分之多少 - CPUQuota=20% - - [Install] - WantedBy=multi-user.target - ``` diff --git "a/docs/zh/docs/A-Ops/dnf\346\217\222\344\273\266\345\221\275\344\273\244\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/dnf\346\217\222\344\273\266\345\221\275\344\273\244\344\275\277\347\224\250\346\211\213\345\206\214.md" new file mode 100644 index 0000000000000000000000000000000000000000..16232a6d68f729b46a9c33a086283fa30bfe7646 --- /dev/null +++ "b/docs/zh/docs/A-Ops/dnf\346\217\222\344\273\266\345\221\275\344\273\244\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -0,0 +1,739 @@ +# 
dnf插件命令使用手册 +dnf-hotpatch-plugin安装部署完成后,可使用dnf命令调用A-Ops客户端aops-ceres中的冷/热补丁操作,命令包含热补丁扫描(dnf hot-updateinfo)、热补丁状态设置及查询(dnf hotpatch)、热补丁应用(dnf hotupgrade)、内核升级前kabi检查(dnf upgrade-en)。本文将介绍上述命令的具体使用方法。 + +>热补丁包括ACC/SGL(accumulate/single)类型 +> +>- ACC:增量补丁。高版本热补丁包含低版本热补丁所修复的问题。 +>- SGL_xxx:单独补丁,xxx为issue id,如果有多个issue id,用多个'_'拼接。用于修复对应issue id的相关问题。 + +## 热补丁扫描 +`dnf hot-updateinfo`命令支持扫描热补丁,并可指定cve查询相关热补丁,命令使用方式如下: +```shell +dnf hot-updateinfo list cves [--available(default) | --installed] [--cve [cve_id]] + +General DNF options: +    -h, --help, --help-cmd +                          show command help +    --cve CVES, --cves CVES +                          Include packages needed to fix the given CVE, in updates +Hot-updateinfo command-specific options: +    --available +                          cves about newer versions of installed packages +                          (default) +    --installed +                          cves about equal and older versions of installed packages +``` + +- `list cves` + +1、查询主机所有可修复的cve和对应的冷/热补丁。 + +```shell +[root@localhost ~]# dnf hot-updateinfo list cves +# cve-id   level    cold-patch   hot-patch +Last metadata expiration check: 2:39:04 ago on 2023年12月29日 星期五 07时45分02秒. +CVE-2022-30594  Important/Sec.  kernel-4.19.90-2206.1.0.0153.oe1.x86_64  patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 +CVE-2023-1111   Important/Sec.  redis-6.2.5-2.x86_64   patch-redis-6.2.5-1-ACC-1-1.x86_64 +CVE-2023-1112   Important/Sec.  redis-6.2.5-2.x86_64   patch-redis-6.2.5-1-ACC-1-1.x86_64 +CVE-2023-1111   Important/Sec.  redis-6.2.5-2.x86_64   patch-redis-6.2.5-1-SGL_CVE_2023_1111_CVE_2023_1112-1-1.x86_64 +``` + +2、查询主机所有已修复的cve和对应的冷/热补丁。 + +```shell +[root@localhost ~]# dnf hot-updateinfo list cves --installed +# cve-id   level    cold-patch   hot-patch +Last metadata expiration check: 2:39:04 ago on 2023年12月29日 星期五 07时45分02秒. +CVE-2022-36298  Important/Sec. 
- patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_36298-1-1.x86_64 +``` + +3、指定cve查询对应的可修复冷/热补丁。 + +```shell +[root@localhost ~]# dnf hot-updateinfo list cves --cve CVE-2022-30594 +# cve-id   level    cold-patch   hot-patch +Last metadata expiration check: 2:39:04 ago on 2023年12月29日 星期五 07时45分02秒. +CVE-2022-30594  Important/Sec.  kernel-4.19.90-2206.1.0.0153.oe1.x86_64  patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 +``` + +4、cve不存在时列表为空。 +```shell +[root@localhost ~]# dnf hot-updateinfo list cves --cve CVE-2022-3089 +# cve-id   level    cold-patch   hot-patch +Last metadata expiration check: 2:39:04 ago on 2023年12月29日 星期五 07时45分02秒. +``` + +## 热补丁状态及转换图 + +- 热补丁状态图 + +  NOT-APPLIED: 热补丁尚未应用。 + +  DEACTIVED: 热补丁已被停用。 + +  ACTIVED: 热补丁已被激活。 + +  ACCEPTED: 热补丁已被激活,后续重启后会被自动应用激活。 + +  ![热补丁状态转换图](./figures/syscare热补丁状态图.png) + + +## 热补丁状态查询和切换 +`dnf hotpatch`命令支持查询、切换热补丁的状态,命令使用方式如下: + +```shell +dnf hotpatch + +General DNF options: +    -h, --help, --help-cmd +                          show command help +    --cve CVES, --cves CVES +                          Include packages needed to fix the given CVE, in updates + +Hotpatch command-specific options: +  --list [{cve, cves}]  show list of hotpatch +  --apply APPLY_NAME    apply hotpatch +  --remove REMOVE_NAME  remove hotpatch +  --active ACTIVE_NAME  active hotpatch +  --deactive DEACTIVE_NAME +                        deactive hotpatch +  --accept ACCEPT_NAME  accept hotpatch +``` +- 使用`dnf hotpatch`命令查询热补丁状态 + +  - 使用`dnf hotpatch --list`命令查询当前系统中可使用的热补丁状态并展示。 + +    ```shell +    [root@localhost ~]# dnf hotpatch --list +    Last metadata expiration check: 0:09:25 ago on 2023年12月29日 星期五 10时26分45秒. +    base-pkg/hotpatch                                               status +    kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux NOT-APPLIED +    ``` + +  - 使用`dnf hotpatch --list cves`查询漏洞(CVE-id)对应热补丁及其状态并展示。 + +    ```shell +    [root@openEuler ~]# dnf hotpatch --list cves +    Last metadata expiration check: 0:11:05 ago on 2023年12月29日 星期五 10时26分45秒. 
+ CVE-id base-pkg/hotpatch status + CVE-2022-30594 kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux NOT-APPLIED + ``` + + - `dnf hotpatch --list cves --cve `筛选指定CVE对应的热补丁及其状态并展示。 + + ```shell + [root@openEuler ~]# dnf hotpatch --list cves --cve CVE-2022-30594 + Last metadata expiration check: 0:12:25 ago on 2023年12月29日 星期五 10时26分45秒. + CVE-id base-pkg/hotpatch status + CVE-2022-30594 kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux NOT-APPLIED + ``` + + - 使用`dnf hotpatch --list cves --cve `查询无结果时展示为空。 + + ```shell + [root@openEuler ~]# dnf hotpatch --list cves --cve CVE-2023-1 + Last metadata expiration check: 0:13:11 ago on 2023年12月29日 星期五 10时26分45秒. + ``` + +- 使用`dnf hotpatch --apply `命令应用热补丁,可使用 `dnf hotpatch --list`查询应用后的状态变化,变化逻辑见上文的热补丁状态转换图。 + +```shell +[root@openEuler ~]# dnf hotpatch --list +Last metadata expiration check: 0:13:55 ago on 2023年12月29日 星期五 10时26分45秒. +base-pkg/hotpatch status +kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux NOT-APPLIED +[root@openEuler ~]# dnf hotpatch --apply kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +Last metadata expiration check: 0:15:37 ago on 2023年12月29日 星期五 10时26分45秒. +Gonna apply this hot patch: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +apply hot patch 'kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1' succeed +[root@openEuler ~]# dnf hotpatch --list +Last metadata expiration check: 0:16:20 ago on 2023年12月29日 星期五 10时26分45秒. +base-pkg/hotpatch status +kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux ACTIVED +``` +- 使用`dnf hotpatch --deactive `停用热补丁,可使用`dnf hotpatch --list`查询停用后的状态变化,变化逻辑见上文的热补丁状态转换图。 + +```shell +[root@openEuler ~]# dnf hotpatch --deactive kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +Last metadata expiration check: 0:19:00 ago on 2023年12月29日 星期五 10时26分45秒. 
+Gonna deactive this hot patch: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +deactive hot patch 'kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1' succeed +[root@openEuler ~]# dnf hotpatch --list +Last metadata expiration check: 0:19:12 ago on 2023年12月29日 星期五 10时26分45秒. +base-pkg/hotpatch status +kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux DEACTIVED +``` +- 使用`dnf hotpatch --remove `删除热补丁,可使用`dnf hotpatch --list`查询删除后的状态变化,变化逻辑见上文的热补丁状态转换图。 + +```shell +[root@openEuler ~]# dnf hotpatch --remove kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +Last metadata expiration check: 0:20:12 ago on 2023年12月29日 星期五 10时26分45秒. +Gonna remove this hot patch: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +remove hot patch 'kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1' succeed +[root@openEuler ~]# dnf hotpatch --list +Last metadata expiration check: 0:20:23 ago on 2023年12月29日 星期五 10时26分45秒. +base-pkg/hotpatch status +kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux NOT-APPLIED +``` +- 使用`dnf hotpatch --active `激活热补丁,可使用`dnf hotpatch --list`查询激活后的状态变化,变化逻辑见上文的热补丁状态转换图。 + +```shell +[root@openEuler ~]# dnf hotpatch --active kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +Last metadata expiration check: 0:15:37 ago on 2023年12月29日 星期五 10时26分45秒. +Gonna active this hot patch: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +active hot patch 'kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1' succeed +[root@openEuler ~]# dnf hotpatch --list +Last metadata expiration check: 0:16:20 ago on 2023年12月29日 星期五 10时26分45秒. 
+base-pkg/hotpatch                                               status +kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux ACTIVED +``` +- 使用`dnf hotpatch --accept `接收热补丁,可使用`dnf hotpatch --list`查询接收后的状态变化,变化逻辑见上文的热补丁状态转换图。 + +```shell +[root@openEuler ~]# dnf hotpatch --accept kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +Last metadata expiration check: 0:14:19 ago on 2023年12月29日 星期五 10时47分38秒. +Gonna accept this hot patch: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1 +accept hot patch 'kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1' succeed +[root@openEuler ~]# dnf hotpatch --list +Last metadata expiration check: 0:14:34 ago on 2023年12月29日 星期五 10时47分38秒. +base-pkg/hotpatch                                               status +kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1/vmlinux ACCEPTED +``` + + +## 热补丁应用 +`dnf hotupgrade`命令支持根据cve id和热补丁名称进行热补丁修复,同时也支持全量修复。命令使用方式如下: +```shell +dnf hotupgrade [--cve [cve_id]] [PACKAGE ...] [--takeover] [-f] + +General DNF options: +    -h, --help, --help-cmd +                          show command help +    --cve CVES, --cves CVES +                          Include packages needed to fix the given CVE, in updates + +command-specific options: +    --takeover +                          kernel cold patch takeover operation +    -f +                          force retain kernel rpm package if kernel kabi check fails +    PACKAGE +                          Package to upgrade +``` + +- 使用`dnf hotupgrade PACKAGE`安装目标热补丁。 + +  - 使用`dnf hotupgrade PACKAGE`安装目标热补丁 + +    ```shell +    [root@openEuler ~]# dnf hotupgrade patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 +    Last metadata expiration check: 0:26:25 ago on 2023年12月29日 星期五 10时47分38秒. +    Dependencies resolved. +    xxxx(Install messages) +    Is this ok [y/N]: y +    Downloading Packages: +    xxxx(Install process) +    Complete! +    Apply hot patch succeed: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1. +    ``` + +  - 当目标热补丁已经应用并激活时,使用`dnf hotupgrade PACKAGE`安装目标热补丁 + +    ```shell +    [root@openEuler ~]# dnf hotupgrade patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 +    Last metadata expiration check: 0:28:35 ago on 2023年12月29日 星期五 10时47分38秒. 
+ The hotpatch 'kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1' already has a 'ACTIVED' sub hotpatch of binary file 'vmlinux' +    Package patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 is already installed. +    Dependencies resolved. +    Nothing to do. +    Complete! +    ``` + +  - 使用`dnf hotupgrade PACKAGE`安装目标热补丁,自动卸载激活失败的热补丁。 + +    ```shell +    [root@openEuler ~]# dnf hotupgrade patch-redis-6.2.5-1-ACC-1-1.x86_64 +    Last metadata expiration check: 0:30:30 ago on 2023年12月29日 星期五 10时47分38秒. +    Dependencies resolved. +    xxxx(Install messages) +    Is this ok [y/N]: y +    Downloading Packages: +    xxxx(Install process) +    Complete! +    Apply hot patch failed: redis-6.2.5-1/ACC-1-1. +    Error: Operation failed + +    Caused by: +        0. Transaction "Apply patch 'redis-6.2.5-1/ACC-1-1'" failed + +    Caused by: +        Cannot match any patch named "redis-6.2.5-1/ACC-1-1" + +    Gonna remove unsuccessfully activated hotpatch rpm. +    Remove package succeed: patch-redis-6.2.5-1-ACC-1-1.x86_64. +    ``` + +- 使用`--cve `指定cve_id安装CVE对应的热补丁 + +  - 使用`dnf hotupgrade --cve CVE-2022-30594`安装CVE对应的热补丁 + +    ```shell +    [root@openEuler ~]# dnf hotupgrade --cve CVE-2022-30594 +    Last metadata expiration check: 0:26:25 ago on 2023年12月29日 星期五 10时47分38秒. +    Dependencies resolved. +    xxxx(Install messages) +    Is this ok [y/N]: y +    Downloading Packages: +    xxxx(Install process) +    Complete! +    Apply hot patch succeed: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1. +    ``` + +  - 使用`dnf hotupgrade --cve CVE-2022-2021`安装CVE对应的热补丁,但对应的CVE不存在。 + +    ```shell +    [root@openEuler ~]# dnf hotupgrade --cve CVE-2022-2021 +    Last metadata expiration check: 1:37:44 ago on 2023年12月29日 星期五 13时49分39秒. +    The cve doesn't exist or cannot be fixed by hotpatch: CVE-2022-2021 +    No hot patches marked for install. +    Dependencies resolved. +    Nothing to do. +    Complete! 
+    ```
+
+  - 使用`dnf hotupgrade --cve <cve_id>`指定cve_id安装时,若该CVE对应的低版本ACC热补丁已安装,则删除低版本热补丁,安装高版本ACC热补丁包
+
+    ```shell
+    [root@openEuler ~]# dnf hotupgrade --cve CVE-2023-1070
+    Last metadata expiration check: 0:00:48 ago on 2024年01月02日 星期二 11时21分55秒.
+    Dependencies resolved.
+    xxxx(Install messages)
+    Is this ok [y/N]: y
+    Downloading Packages:
+    xxxx(Install messages and process upgrade)
+    Complete!
+    Apply hot patch succeed: kernel-5.10.0-153.12.0.92.oe2203sp2/ACC-1-3.
+    [root@openEuler tmp]#
+    ```
+
+  - 指定cve_id安装时,该CVE对应的最高版本热补丁包已安装
+
+    ```shell
+    [root@openEuler ~]# dnf hotupgrade --cve CVE-2023-1070
+    Last metadata expiration check: 1:37:44 ago on 2023年12月29日 星期五 13时49分39秒.
+    The cve doesn't exist or cannot be fixed by hotpatch: CVE-2023-1070
+    No hot patches marked for install.
+    Dependencies resolved.
+    Nothing to do.
+    Complete!
+    ```
+
+- 使用`dnf hotupgrade`进行热补丁全量修复
+
+  - 热补丁未安装时,使用`dnf hotupgrade`命令安装所有可安装的热补丁。
+
+  - 当部分热补丁已经安装时,使用`dnf hotupgrade`命令进行全量修复,将保留已安装的热补丁,然后安装其余热补丁。
+
+- 使用`--takeover`进行内核热补丁收编
+
+  - 使用`dnf hotupgrade PACKAGE --takeover`安装热补丁并收编相应内核冷补丁;若目标内核冷补丁kabi检查失败,则自动卸载该冷补丁;同时accept热补丁,使热补丁重启后仍然生效,并恢复内核默认引导启动项。
+
+    ```shell
+    [root@openEuler ~]# dnf hotupgrade patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 --takeover
+    Last metadata expiration check: 2:23:22 ago on 2023年12月29日 星期五 13时49分39秒.
+    Gonna takeover kernel cold patch: ['kernel-4.19.90-2206.1.0.0153.oe1.x86_64']
+    Dependencies resolved.
+    xxxx(Install messages)
+    Is this ok [y/N]: y
+    xxxx(Install process)
+    Complete!
+    Apply hot patch succeed: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1.
+    Kabi check for kernel-4.19.90-2206.1.0.0153.oe1.x86_64:
+    [Fail] Here are 81 loaded kernel modules in this system, 78 pass, 3 fail.
+    Failed modules are as follows:
+    No.
Module                Difference
+    1    nf_nat_ipv6   secure_ipv6_port_ephemeral : 0xe1a4f16a != 0x0209f3a7
+    2    nf_nat_ipv4   secure_ipv4_port_ephemeral : 0x57f70547 != 0xe3840e18
+    3    kvm_intel     kvm_lapic_hv_timer_in_use : 0x54981db4 != 0xf58e6f1f
+    Gonna remove kernel-4.19.90-2206.1.0.0153.oe1.x86_64 due to Kabi check failed.
+    Rebuild rpm database succeed.
+    Remove package succeed: kernel-4.19.90-2206.1.0.0153.oe1.x86_64.
+    Restore the default boot kernel succeed: kernel-4.19.90-2112.8.0.0131.oe1.x86_64.
+    No available kernel cold patch for takeover, gonna accept available kernel hot patch.
+    Accept hot patch succeed: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1.
+    ```
+
+  - 使用`dnf hotupgrade PACKAGE --takeover -f`安装热补丁,如果内核冷补丁kabi检查未通过,`-f`选项强制保留内核冷补丁
+
+    ```shell
+    [root@openEuler ~]# dnf hotupgrade patch-kernel-4.19.90-2112.8.0.0131.oe1-SGL_CVE_2022_30594-1-1.x86_64 --takeover -f
+    Last metadata expiration check: 2:23:22 ago on 2023年12月29日 星期五 13时49分39秒.
+    Gonna takeover kernel cold patch: ['kernel-4.19.90-2206.1.0.0153.oe1.x86_64']
+    Dependencies resolved.
+    xxxx(Install messages)
+    Is this ok [y/N]: y
+    xxxx(Install process)
+    Complete!
+    Apply hot patch succeed: kernel-4.19.90-2112.8.0.0131.oe1/SGL_CVE_2022_30594-1-1.
+    Kabi check for kernel-4.19.90-2206.1.0.0153.oe1.x86_64:
+    [Fail] Here are 81 loaded kernel modules in this system, 78 pass, 3 fail.
+    Failed modules are as follows:
+    No.  Module        Difference
+    1    nf_nat_ipv6   secure_ipv6_port_ephemeral : 0xe1a4f16a != 0x0209f3a7
+    2    nf_nat_ipv4   secure_ipv4_port_ephemeral : 0x57f70547 != 0xe3840e18
+    3    kvm_intel     kvm_lapic_hv_timer_in_use : 0x54981db4 != 0xf58e6f1f
+    ```
+
+## 内核升级前kabi检查
+
+`dnf upgrade-en`命令支持内核冷补丁升级前kabi检查,命令使用方式如下:
+
+```shell
+dnf upgrade-en [PACKAGE] [--cve [cve_id]]
+
+upgrade with KABI(Kernel Application Binary Interface) check.
If the loaded
+kernel modules have KABI compatibility with the new version kernel rpm, the
+kernel modules can be installed and used in the new version kernel without
+recompiling.
+
+General DNF options:
+  -h, --help, --help-cmd
+                show command help
+  --cve CVES, --cves CVES
+                Include packages needed to fix the given CVE, in updates
+
+Upgrade-en command-specific options:
+  PACKAGE       Package to upgrade
+```
+
+- 使用`dnf upgrade-en PACKAGE`安装目标冷补丁
+
+  - 使用`dnf upgrade-en`安装目标冷补丁,kabi检查未通过时,输出kabi差异性报告,并自动卸载目标升级kernel包。
+
+    ```shell
+    [root@openEuler ~]# dnf upgrade-en kernel-4.19.90-2206.1.0.0153.oe1.x86_64
+    Last metadata expiration check: 1:51:54 ago on 2023年12月29日 星期五 13时49分39秒.
+    Dependencies resolved.
+    xxxx(Install messages)
+    Is this ok [y/N]: y
+    Downloading Packages:
+    xxxx(Install process)
+    Complete!
+    Kabi check for kernel-4.19.90-2206.1.0.0153.oe1.x86_64:
+    [Fail] Here are 81 loaded kernel modules in this system, 78 pass, 3 fail.
+    Failed modules are as follows:
+    No.  Module        Difference
+    1    nf_nat_ipv6   secure_ipv6_port_ephemeral : 0xe1a4f16a != 0x0209f3a7
+    2    nf_nat_ipv4   secure_ipv4_port_ephemeral : 0x57f70547 != 0xe3840e18
+    3    kvm_intel     kvm_lapic_hv_timer_in_use : 0x54981db4 != 0xf58e6f1f
+                       kvm_apic_write_nodecode : 0x56c989a1 != 0x24c9db31
+                       kvm_complete_insn_gp : 0x99c2d256 != 0xcd8014bd
+    Gonna remove kernel-4.19.90-2206.1.0.0153.oe1.x86_64 due to kabi check failed.
+    Rebuild rpm database succeed.
+    Remove package succeed: kernel-4.19.90-2206.1.0.0153.oe1.x86_64.
+    Restore the default boot kernel succeed: kernel-4.19.90-2112.8.0.0131.oe1.x86_64.
+    ```
+
+  - 使用`dnf upgrade-en`安装目标冷补丁,kabi检查通过
+
+    ```shell
+    [root@openEuler ~]# dnf upgrade-en kernel-4.19.90-2201.1.0.0132.oe1.x86_64
+    Last metadata expiration check: 2:02:10 ago on 2023年12月29日 星期五 13时49分39秒.
+    Dependencies resolved.
+    xxxx(Install messages)
+    Is this ok [y/N]: y
+    Downloading Packages:
+    xxxx(Install process)
+    Complete!
+    Kabi check for kernel-4.19.90-2201.1.0.0132.oe1.x86_64:
+    [Success] Here are 81 loaded kernel modules in this system, 81 pass, 0 fail.
+    ```
+
+- 使用`dnf upgrade-en`进行全量修复
+
+  全量修复如果包含目标kernel的升级,输出根据不同的kabi检查情况与`dnf upgrade-en PACKAGE`命令相同。
+
+## 使用场景说明
+
+本节介绍上述命令的使用场景及使用顺序,需要提前确认本机的热补丁repo源和相应冷补丁repo源已开启。
+
+- 热补丁修复。
+
+使用热补丁扫描命令查看本机待修复cve。
+
+```shell
+[root@openEuler ~]# dnf hot-updateinfo list cves
+Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒.
+CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2023-26545  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
+CVE-2021-1      Important/Sec.  redis-6.2.5-2.x86_64     patch-redis-6.2.5-1-ACC-1-1.x86_64
+CVE-2021-11     Important/Sec.  redis-6.2.5-2.x86_64     patch-redis-6.2.5-1-ACC-1-1.x86_64
+CVE-2021-2      Important/Sec.  redis-6.2.5-3.x86_64     patch-redis-6.2.5-1-ACC-1-2.x86_64
+CVE-2021-22     Important/Sec.  redis-6.2.5-3.x86_64     patch-redis-6.2.5-1-ACC-1-2.x86_64
+CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64     -
+CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64     -
+CVE-2022-38023  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+CVE-2022-37966  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+```
+
+找到提供热补丁的相应cve,发现CVE-2021-1、CVE-2021-11、CVE-2021-2和CVE-2021-22可用热补丁修复。
+
+在安装补丁前测试功能,基于redis.conf配置文件启动redis服务。
+
+```shell
+[root@openEuler ~]# sudo redis-server ./redis.conf &
+[1] 285075
+[root@openEuler ~]# 285076:C 25 Mar 2023 12:09:51.503 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
+285076:C 25 Mar 2023 12:09:51.503 # Redis version=255.255.255, bits=64, commit=00000000, modified=0, pid=285076, just started
+285076:C 25 Mar 2023 12:09:51.503 # Configuration loaded
+285076:M 25 Mar 2023 12:09:51.504 * Increased maximum number of open files to 10032 (it was originally set to 1024).
+285076:M 25 Mar 2023 12:09:51.504 * monotonic clock: POSIX clock_gettime + _._ + _.-``__ ''-._ + _.-`` `. `_. ''-._ Redis 255.255.255 (00000000/0) 64 bit + .-`` .-```. ```\/ _.,_ ''-._ + ( ' , .-` | `, ) Running in standalone mode + |`-._`-...-` __...-.``-._|'` _.-'| Port: 6380 + | `-._ `._ / _.-' | PID: 285076 + `-._ `-._ `-./ _.-' _.-' + |`-._`-._ `-.__.-' _.-'_.-'| + | `-._`-._ _.-'_.-' | https://redis.io + `-._ `-._`-.__.-'_.-' _.-' + |`-._`-._ `-.__.-' _.-'_.-'| + | `-._`-._ _.-'_.-' | + `-._ `-._`-.__.-'_.-' _.-' + `-._ `-.__.-' _.-' + `-._ _.-' + `-.__.-' + +285076:M 25 Mar 2023 12:09:51.505 # Server initialized +285076:M 25 Mar 2023 12:09:51.505 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect. +285076:M 25 Mar 2023 12:09:51.506 * Ready to accept connections + +``` + +安装前测试功能。 + +```shell +[root@openEuler ~]# telnet 127.0.0.1 6380 +Trying 127.0.0.1... +Connected to 127.0.0.1. +Escape character is '^]'. + +*100 + +-ERR Protocol error: expected '$', got ' ' +Connection closed by foreign host. +``` + +指定修复CVE-2021-1,确认关联到对应的热补丁包,显示安装成功。 +```shell +[root@openEuler ~]# dnf hotupgrade patch-redis-6.2.5-1-ACC-1-1.x86_64 +Last metadata expiration check: 0:01:39 ago on 2024年01月02日 星期二 20时16分45秒. +The hotpatch 'redis-6.2.5-1/ACC-1-1' already has a 'ACTIVED' sub hotpatch of binary file 'redis-benchmark' +The hotpatch 'redis-6.2.5-1/ACC-1-1' already has a 'ACTIVED' sub hotpatch of binary file 'redis-cli' +The hotpatch 'redis-6.2.5-1/ACC-1-1' already has a 'ACTIVED' sub hotpatch of binary file 'redis-server' +Package patch-redis-6.2.5-1-ACC-1-1.x86_64 is already installed. +Dependencies resolved. +Nothing to do. +Complete! 
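安装后可以通过解析`dnf hotpatch --list`的回显确认某个热补丁的状态。下面是一个示意脚本(假设示例,非插件自带功能),以上文的示例回显为输入,提取指定条目的状态字段:

```shell
#!/bin/sh
# 示意:从 dnf hotpatch --list 的回显中提取指定条目的状态
# (此处用文档中的示例回显代替真实命令输出)
output='base-pkg/hotpatch status
redis-6.2.5-1/ACC-1-1/redis-benchmark ACTIVED
redis-6.2.5-1/ACC-1-1/redis-cli ACTIVED
redis-6.2.5-1/ACC-1-1/redis-server ACTIVED'

patch_status() {
    # $1:base-pkg/热补丁名/二进制文件 标识
    printf '%s\n' "$output" | awk -v p="$1" '$1 == p { print $2; exit }'
}

patch_status redis-6.2.5-1/ACC-1-1/redis-server
# 输出:ACTIVED
```

实际使用时可把here字符串替换为`dnf hotpatch --list`的真实输出,再根据状态(NOT-APPLIED/DEACTIVED/ACTIVED/ACCEPTED)决定后续操作。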
+```
+
+使用`dnf hotpatch --list`确认该热补丁是否安装成功,确认状态为ACTIVED。
+
+```shell
+[root@openEuler ~]# dnf hotpatch --list
+Last metadata expiration check: 0:04:43 ago on 2024年01月02日 星期二 20时16分45秒.
+base-pkg/hotpatch status
+redis-6.2.5-1/ACC-1-1/redis-benchmark ACTIVED
+redis-6.2.5-1/ACC-1-1/redis-cli ACTIVED
+redis-6.2.5-1/ACC-1-1/redis-server ACTIVED
+```
+
+确认该cve是否已被修复:由于CVE-2021-1所使用的热补丁包patch-redis-6.2.5-1-ACC-1-1.x86_64同样修复CVE-2021-11,因此CVE-2021-1和CVE-2021-11都不再显示。
+
+```shell
+[root@openEuler ~]# dnf hot-updateinfo list cves
+Last metadata expiration check: 0:08:48 ago on 2023年03月25日 星期六 11时53分46秒.
+CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2023-1076   Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2023-26607  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
+CVE-2021-22     Important/Sec.  redis-6.2.5-3.x86_64     patch-redis-6.2.5-1-ACC-1-2.x86_64
+CVE-2021-2      Important/Sec.  redis-6.2.5-3.x86_64     patch-redis-6.2.5-1-ACC-1-2.x86_64
+CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64     -
+CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64     -
+CVE-2022-38023  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+CVE-2022-37966  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+```
+
+激活后测试功能,对比激活前回显内容。
+
+```shell
+[root@openEuler ~]# telnet 127.0.0.1 6380
+Trying 127.0.0.1...
+Connected to 127.0.0.1.
+Escape character is '^]'.
+
+*100
+
+-ERR Protocol error: unauthenticated multibulk length
+Connection closed by foreign host.
+```
+
+使用`dnf hotpatch --remove`指定热补丁手动卸载。
+
+```shell
+[root@openEuler ~]# dnf hotpatch --remove redis-6.2.5-1
+Last metadata expiration check: 0:11:52 ago on 2024年01月02日 星期二 20时16分45秒.
+Gonna remove this hot patch: redis-6.2.5-1
+remove hot patch 'redis-6.2.5-1' succeed
+[root@openEuler ~]# dnf hotpatch --list
+Last metadata expiration check: 0:12:00 ago on 2024年01月02日 星期二 20时16分45秒.
+base-pkg/hotpatch status +redis-6.2.5-1/ACC-1-1/redis-benchmark NOT-APPLIED +redis-6.2.5-1/ACC-1-1/redis-cli NOT-APPLIED +redis-6.2.5-1/ACC-1-1/redis-server NOT-APPLIED +``` + +使用热补丁扫描命令查看本机待修复cve,确认CVE-2021-1和CVE-2021-11正常显示。 +```shell +[root@openEuler ~]# dnf hot-updateinfo list cves +Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒. +CVE-2023-22995 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - +CVE-2023-26545 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - +CVE-2022-40897 Important/Sec. python3-setuptools-59.4.0-5.oe2203sp1.noarch - +CVE-2021-1 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-ACC-1-1.x86_64 +CVE-2021-11 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-ACC-1-1.x86_64 +CVE-2021-2 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-ACC-1-2.x86_64 +CVE-2021-22 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-ACC-1-2.x86_64 +CVE-2021-33 Important/Sec. redis-6.2.5-4.x86_64 - +CVE-2021-3 Important/Sec. redis-6.2.5-4.x86_64 - +CVE-2022-38023 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - +CVE-2022-37966 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - +``` + +- 安装高版本ACC热补丁 + +指定安装热补丁包patch-redis-6.2.5-1-ACC-1-2.x86_64。 +```shell +[root@openEuler ~]# dnf hotupgrade patch-redis-6.2.5-1-ACC-1-2.x86_64 +Last metadata expiration check: 0:36:12 ago on 2024年01月02日 星期二 20时16分45秒. +The hotpatch 'redis-6.2.5-1/ACC-1-2' already has a 'ACTIVED' sub hotpatch of binary file 'redis-benchmark' +The hotpatch 'redis-6.2.5-1/ACC-1-2' already has a 'ACTIVED' sub hotpatch of binary file 'redis-cli' +The hotpatch 'redis-6.2.5-1/ACC-1-2' already has a 'ACTIVED' sub hotpatch of binary file 'redis-server' +Package patch-redis-6.2.5-1-ACC-1-2.x86_64 is already installed. +Dependencies resolved. +Nothing to do. +Complete! 
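`dnf hot-updateinfo list cves`回显的第4列为对应的热补丁包,'-'表示暂无可用热补丁。下面是一个示意脚本(假设示例,非插件自带功能),演示如何从回显中筛选仍无热补丁可用的CVE:

```shell
#!/bin/sh
# 示意:筛选扫描回显中第4列为 '-'(无可用热补丁)的CVE
# 示例数据摘自上文回显,实际使用时可替换为真实命令输出
scan='CVE-2023-22995 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 -
CVE-2021-1 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-ACC-1-1.x86_64
CVE-2021-33 Important/Sec. redis-6.2.5-4.x86_64 -'

printf '%s\n' "$scan" | awk '$4 == "-" { print $1 }'
# 输出:
# CVE-2023-22995
# CVE-2021-33
```

第4列非'-'的CVE即可通过`dnf hotupgrade`热修复,其余CVE需要等待热补丁发布或走冷补丁升级。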
+```
+
+使用热补丁扫描命令查看本机待修复cve:由于patch-redis-6.2.5-1-ACC-1-2.x86_64比patch-redis-6.2.5-1-ACC-1-1.x86_64的热补丁版本高,低版本热补丁对应的CVE-2021-1和CVE-2021-11,以及高版本热补丁对应的CVE-2021-2和CVE-2021-22都被修复。
+
+```shell
+[root@openEuler ~]# dnf hot-updateinfo list cves
+Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒.
+CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2023-26545  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
+CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64                              -
+CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64                              -
+CVE-2022-38023  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+CVE-2022-37966  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+```
+
+- 热补丁目标软件包版本大于本机安装版本。
+
+查看热补丁repo源中repodata目录下的xxx-updateinfo.xml.gz,确认文件中的CVE-2021-33、CVE-2021-3相关信息(以下为示例片段,按标准updateinfo格式补全,部分字段省略):
+
+```xml
+<update from="openeuler.org" type="security" status="stable">
+  <id>openEuler-HotPatchSA-2023-3</id>
+  <title>An update for mariadb is now available for openEuler-22.03-LTS</title>
+  <severity>Important</severity>
+  <release>openEuler</release>
+  <description>patch-redis-6.2.5-2-ACC.(CVE-2021-3, CVE-2021-33)</description>
+  <pkglist>
+    <hot_patch_collection>
+      <name>openEuler</name>
+      <package arch="aarch64" name="patch-redis-6.2.5-2-ACC" release="1" version="1">
+        <filename>patch-redis-6.2.5-2-ACC-1-1.aarch64.rpm</filename>
+      </package>
+      <package arch="x86_64" name="patch-redis-6.2.5-2-ACC" release="1" version="1">
+        <filename>patch-redis-6.2.5-2-ACC-1-1.x86_64.rpm</filename>
+      </package>
+    </hot_patch_collection>
+  </pkglist>
+</update>
+```
+
+package中的name字段"patch-redis-6.2.5-2-ACC"的组成为:patch-源码包名-源码包version-源码包release-热补丁patch名。该热补丁包要求本机安装redis-6.2.5-2源码版本,检查本机redis安装版本。
+
+```shell
+[root@openEuler ~]# rpm -qa | grep redis
+redis-6.2.5-1.x86_64
+```
+
+由于热补丁目标版本大于本机安装版本,版本不匹配,该热补丁包名不显示,以'-'代替。
+
+```shell
+[root@openEuler ~]# dnf hot-updateinfo list cves
+Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒.
+CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2023-26545  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
+CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64                              -
+CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64                              -
+CVE-2022-38023  Important/Sec.
samba-client-4.17.2-5.oe2203sp1.x86_64            -
+CVE-2022-37966  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+```
+
+- 热补丁目标软件包版本小于本机安装版本。
+
+查看热补丁repo源中repodata目录下的xxx-updateinfo.xml.gz,确认文件中的CVE-2021-44、CVE-2021-4相关信息(以下为示例片段,按标准updateinfo格式补全,部分字段省略):
+
+```xml
+<update from="openeuler.org" type="security" status="stable">
+  <id>openEuler-HotPatchSA-2023-4</id>
+  <title>An update for mariadb is now available for openEuler-22.03-LTS</title>
+  <severity>Important</severity>
+  <release>openEuler</release>
+  <description>patch-redis-6.2.4-1-ACC.(CVE-2021-44, CVE-2021-4)</description>
+  <pkglist>
+    <hot_patch_collection>
+      <name>openEuler</name>
+      <package arch="aarch64" name="patch-redis-6.2.4-1-ACC" release="1" version="1">
+        <filename>patch-redis-6.2.4-1-ACC-1-1.aarch64.rpm</filename>
+      </package>
+      <package arch="x86_64" name="patch-redis-6.2.4-1-ACC" release="1" version="1">
+        <filename>patch-redis-6.2.4-1-ACC-1-1.x86_64.rpm</filename>
+      </package>
+    </hot_patch_collection>
+  </pkglist>
+</update>
+```
+
+package中的name字段"patch-redis-6.2.4-1-ACC"的组成为:patch-源码包名-源码包version-源码包release-热补丁patch名。该热补丁包要求本机安装redis-6.2.4-1源码版本,检查本机redis安装版本。
+
+```shell
+[root@openEuler ~]# rpm -qa | grep redis
+redis-6.2.5-1.x86_64
+```
+
+由于热补丁目标版本小于本机安装版本,版本不匹配,该CVE不予显示。
+
+```shell
+[root@openEuler ~]# dnf hot-updateinfo list cves
+Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒.
+CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2023-26545  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
+CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
+CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64                              -
+CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64                              -
+CVE-2022-38023  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64            -
+CVE-2022-37966  Important/Sec.
samba-client-4.17.2-5.oe2203sp1.x86_64 - +``` + diff --git "a/docs/zh/docs/A-Ops/dnf\346\217\222\344\273\266\345\221\275\344\273\244\346\214\207\345\257\274\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/dnf\346\217\222\344\273\266\345\221\275\344\273\244\346\214\207\345\257\274\346\211\213\345\206\214.md" deleted file mode 100644 index a6782b7ac2ecf52c276baec8f94acc8a0fb1fbcd..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/A-Ops/dnf\346\217\222\344\273\266\345\221\275\344\273\244\346\214\207\345\257\274\346\211\213\345\206\214.md" +++ /dev/null @@ -1,550 +0,0 @@ -# dnf插件命令使用手册 - -首先需要安装dnf插件: - -```shell -dnf install dnf-hotpatch-plugin -``` - -将dnf热补丁插件安装完成后,可使用dnf命令调用热补丁操作,命令包含热补丁扫描(dnf hot-updateinfo),热补丁状态设置及查询(dnf hotpatch ),热补丁应用(dnf hotupgrade),本文将介绍上述命令的具体使用方法。 - -## 热补丁扫描 - -`hot-updateinfo`命令支持扫描热补丁并指定cve查询相关热补丁,命令使用方式如下: - -```shell -dnf hot-updateinfo list cves [--cve [cve_id]] - -General DNF options: - -h, --help, --help-cmd - show command help - --cve CVES, --cves CVES - Include packages needed to fix the given CVE, in updates - -``` - -- `--list` - -1. 查询主机所有可修复的cve和对应的冷/热补丁。 - -```shell -[root@localhost dnf]# dnf hot-updateinfo list cves -# cve-id level cold-patch hot-patch -Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. -CVE-2022-3080 Important/Sec. bind-libs-9.16.23-10.oe2203.aarch64 patch-bind-libs-9.16.23-09-name-1-111.aarch64 -CVE-2021-25220 Moderate/Sec. bind-9.16.23-10.oe2203.aarch64 - -CVE-2022-1886 Critical/Sec. vim-common-8.2-39.oe2203.aarch64 patch-vim-common-8.2-38-name-1-233.aarch64 -CVE-2022-1725 Low/Sec. vim-minimal-8.2-58.oe2203.aarch64 patch-vim-minimal-8.2-57-name-2-11.aarch64 -``` - -2. 指定cve查询对应的冷/热补丁。 - -```shell -[root@localhost dnf]# dnf hot-updateinfo list cves --cve CVE-2022-3080 -# cve-id level cold-patch hot-patch -Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. -CVE-2022-3080 Important/Sec. 
bind-libs-9.16.23-10.oe2203.aarch64 patch-bind-libs-9.16.23-09-name-1-111.aarch64 -``` - -3. cve不存在时列表为空。 - -```shell -[root@localhost dnf]# dnf hot-updateinfo list cves --cve CVE-2022-3089 -# cve-id level cold-patch hot-patch -Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. -``` - -## 热补丁状态及转换图 - -- 热补丁状态图 - - NOT-APPLIED: 热补丁尚未安装。 - - DEACTIVED: 热补丁已被安装。 - - ACTIVED: 热补丁已被激活。 - - ACCEPT: 热补丁已被接受,后续重启后会被自动应用。 - - ![热补丁状态转换图](./figures/热补丁状态图.png) - -## 热补丁状态查询和切换 - -`hotpatch`命令支持查询、切换热补丁的状态,命令使用方式如下: - -```shell -dnf hotpatch - -General DNF options: - -h, --help, --help-cmd - show command help - --cve CVES, --cves CVES - Include packages needed to fix the given CVE, in updates - -Hotpatch command-specific opetions: - --list [{cve, cves}] show list of hotpatch - --apply APPLY_NAME apply hotpatch - --remove REMOVE_NAME remove hotpatch - --active ACTIVE_NAME active hotpatch - --deactive DEACTIVE_NAME - deactive hotpatch - --accept ACCEPT_NAME accept hotpatch -``` - -1. 使用`dnf hotpatch --list`命令查询当前系统中可使用的热补丁状态并展示。 - - ```shell - [root@localhost dnf]# dnf hotpatch --list - Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. - base-pkg/hotpatch status - redis-6.2.5-1/HP001 NOT-APPLIED - redis-6.2.5-1/HP001 NOT-APPLIED - redis-6.2.5-1/HP002 ACTIVED - redis-6.2.5-1/HP002 ACTIVED - ``` - -2. 使用`dnf hotpatch --list cves`查询漏洞(CVE-id)对应热补丁及其状态并展示。 - - ```shell - [root@localhost dnf]# dnf hotpatch --list cves - Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. - CVE-id base-pkg/hotpatch status - CVE-2023-1111 redis-6.2.5-1/HP001 NOT-APPLIED - CVE-2023-1112 redis-6.2.5-1/HP001 NOT-APPLIED - CVE-2023-2221 redis-6.2.5-1/HP002 ACTIVED - CVE-2023-2222 redis-6.2.5-1/HP002 ACTIVED - ``` - -3. 使用`dnf hotpatch --list cves --cve `筛选指定CVE对应的热补丁及其状态并展示。 - - ```shell - [root@localhost dnf]# dnf hotpatch --list cves --cve CVE-2023-1111 - Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. 
- CVE-id base-pkg/hotpatch status - CVE-2023-1111 redis-6.2.5-1/HP001 NOT-APPLIED - ``` - -4. 使用`dnf hotpatch --list cves --cve `查询无结果时展示为空。 - - ```shell - [root@localhost dnf]# dnf hotpatch --list cves --cve CVE-2023-1 - Last metadata expiration check: 0:54:46 ago on 2023年03月16日 星期四 09时40分27秒. - ``` - -5. 使用`dnf hotpatch --apply `命令应用热补丁,可使用`syscare list`查询应用后的状态变化,变化逻辑见上文的热补丁状态转换图。 - - ```shell - [root@openEuler dnf-plugins]# dnf hotpatch --apply redis-6.2.5-1/HP2 - Last metadata expiration check: 2:38:51 ago on 2023年05月25日 星期四 13时49分28秒. - Gonna apply this hot patch: redis-6.2.5-1/HP2 - apply hot patch 'redis-6.2.5-1/HP2' succeed - [root@openEuler dnf-plugins]# syscare list - Uuid Name Status - 25209ddc-b1e4-48e0-b715-e759ec8db401 redis-6.2.5-1/HP2 ACTIVED - ``` - -6. 使用`dnf hotpatch --deactive `停用热补丁,可使用`syscare list`查询停用后的状态变化,变化逻辑见上文的热补丁状态转换图。 - - ```shell - [root@openEuler dnf-plugins]# dnf hotpatch --deactive redis-6.2.5-1/HP2 - Last metadata expiration check: 2:39:10 ago on 2023年05月25日 星期四 13时49分28秒. - Gonna deactive this hot patch: redis-6.2.5-1/HP2 - deactive hot patch 'redis-6.2.5-1/HP2' succeed - [root@openEuler dnf-plugins]# syscare list - Uuid Name Status - 25209ddc-b1e4-48e0-b715-e759ec8db401 redis-6.2.5-1/HP2 DEACTIVED - ``` - -7. 使用`dnf hotpatch --remove `删除热补丁,可使用`syscare list`查询删除后的状态变化,变化逻辑见上文的热补丁状态转换图。 - - ```shell - [root@openEuler dnf-plugins]# dnf hotpatch --remove redis-6.2.5-1/HP2 - Last metadata expiration check: 2:53:25 ago on 2023年05月25日 星期四 13时49分28秒. - Gonna remove this hot patch: redis-6.2.5-1/HP2 - remove hot patch 'redis-6.2.5-1/HP2' succeed - [root@openEuler dnf-plugins]# syscare list - Uuid Name Status - 25209ddc-b1e4-48e0-b715-e759ec8db401 redis-6.2.5-1/HP2 NOT-APPLIED - ``` - -8. 使用`dnf hotpatch --active `激活热补丁,可使用`syscare list`查询激活后的状态变化,变化逻辑见上文的热补丁状态转换图。 - - ```shell - [root@openEuler dnf-plugins]# dnf hotpatch --active redis-6.2.5-1/HP2 - Last metadata expiration check: 2:53:37 ago on 2023年05月25日 星期四 13时49分28秒. 
- Gonna active this hot patch: redis-6.2.5-1/HP2 - active hot patch 'redis-6.2.5-1/HP2' failed, remain original status. - [root@openEuler dnf-plugins]# syscare list - Uuid Name Status - 25209ddc-b1e4-48e0-b715-e759ec8db401 redis-6.2.5-1/HP2 ACTIVED - ``` - -9. 使用`dnf hotpatch --accept `接收热补丁,可使用`syscare list`查询接收后的状态变化,变化逻辑见上文的热补丁状态转换图。 - - ```shell - [root@openEuler dnf-plugins]# dnf hotpatch --accept redis-6.2.5-1/HP2 - Last metadata expiration check: 2:53:25 ago on 2023年05月25日 星期四 13时49分28秒. - Gonna accept this hot patch: redis-6.2.5-1/HP2 - remove hot patch 'redis-6.2.5-1/HP2' succeed - [root@openEuler dnf-plugins]# syscare list - Uuid Name Status - 25209ddc-b1e4-48e0-b715-e759ec8db401 redis-6.2.5-1/HP2 ACCEPTED - ``` - -## 热补丁应用 - -`hotupgrade`命令根据cve id和热补丁名称进行热补丁修复,同时也支持全量修复。命令使用方式如下: - -```shell -dnf hotupgrade [--cve [cve_id]] [SPEC ...] - -General DNF options: - -h, --help, --help-cmd - show command help - --cve CVES, --cves CVES - Include packages needed to fix the given CVE, in updates - -command-specific options: - SPEC Hotpatch specification -``` - -- Case1:当热补丁已经安装时,使用`dnf hotupgrade`安装所有存在的热补丁。这时dnf hotupgrade会返回形如"Package xx is already installed."提示信息,告诉用户该软件包已安装。 - - ```shell - [root@openEuler aops-ceres]# dnf hotupgrade - Last metadata expiration check: 4:04:34 ago on 2023年06月02日 星期五 06时33分41秒. - Gonna apply these hot patches:['patch-redis-6.2.5-1-HP001-1-1.x86_64', 'patch-redis-6.2.5-1-HP002-1-1.x86_64'] - The target package 'redis-6.2.5-1' has a hotpatch 'HP001' applied - Gonna remove these hot patches: ['redis-6.2.5-1/HP001'] - Remove hot patch redis-6.2.5-1/HP001. - Package patch-redis-6.2.5-1-HP001-1-1.x86_64 is already installed. - Package patch-redis-6.2.5-1-HP002-1-1.x86_64 is already installed. - Dependencies resolved. - Nothing to do. - Complete! - Applying hot patch - Apply hot patch succeed: redis-6.2.5-1/HP001. - Apply hot patch failed: redis-6.2.5-1/HP002. 
- ``` - -- Case2: 热补丁未安装时,使用`dnf hotupgrade`命令安装存在的所有热补丁,将显示安装信息。(补充:使用hotupgrade命令时,如果热补丁已安装,会提示case1中的返回信息,如果未安装,则会返回次case中的信息。) - - ```shell - [root@openEuler A-ops]# dnf hotupgrade - Last metadata expiration check: 4:13:16 ago on 2023年06月02日 星期五 06时33分41秒. - Gonna apply these hot patches:['patch-redis-6.2.5-1-HP002-1-1.x86_64', 'patch-redis-6.2.5-1-HP001-1-1.x86_64'] - Package patch-redis-6.2.5-1-HP002-1-1.x86_64 is already installed. - Dependencies resolved. - xxxx(Install messgaes) - Is this ok [y/N]: y - Downloading Packages: - xxxx(Install process) - Complete! - - Applying hot patch - Apply hot patch succeed: redis-6.2.5-1/HP001. - ``` - -- Case3: 使用`dnf hotupgrade `升级指定热补丁包。 - - ```shell - [root@openEuler ~]# dnf hotupgrade patch-redis-6.2.5-1-HP001-1-1.x86_64 - Last metadata expiration check: 0:07:49 ago on 2023年06月08日 星期四 12时03分46秒. - Package patch-redis-6.2.5-1-HP001-1-1.x86_64 is already installed. - Dependencies resolved. - Nothing to do. - Complete! - Applying hot patch - Apply hot patch succeed: redis-6.2.5-1/HP001. - ``` - -- `--cve` - - - Case1:使用`dnf hotupgrade --cve `指定cve_id安装指定CVE对应的热补丁。 - - ```shell - [root@localhost dnf]# dnf hotupgrade --cve CVE-2021-11 - Last metadata expiration check: xxx - Dependencies resolved. - xxxx(Install messgaes) - Is this ok [y/N]: y - Downloading Packages: - xxxx(Install process) - Complete! - Applying hot patch - Apply hot patch succeed: redis-6.2.5-1/HP001 - ``` - - - Case2:使用`dnf hotupgrade --cve `指定cve_id安装时cve不存在。 - - ```shell - [root@localhost dnf]# dnf hotupgrade --cve CVE-2021-11 - Last metadata expiration check: xxx - The cve doesnt exist: CVE-2021-11 - Error: No hot patches marked for install. 
- ``` - - - Case3:使用`dnf hotupgrade --cve `指定cve_id安装时,该CVE对应的低版本热补丁已安装时,删除低版本热补丁包,安装高版本热补丁包。 - - ```shell - [root@localhost dnf]# dnf hotupgrade --cve CVE-2021-22 - Last metadata expiration check: xxx - The target package 'redis-6.2.5-1' has a hotpatch 'HP001' applied - Gonna remove these hot patches: ['redis-6.2.5-1/HP001'] - Is this ok [y/N]: y - Remove hot patch redis-6.2.5-1/HP001 - xxxx (install messages and process install) - Apply hot patch - apply hot patch succeed: redis-6.2.5-1/HP002 - ``` - - - Case4:使用`dnf hotupgrade --cve `指定cve_id安装时,该CVE对应的最高版本热补丁包已存在。 - - ```shell - [root@localhost dnf]# dnf hotupgrade --cve CVE-2021-22 - Package patch -redis-6.2.5-1-HP002-1-1.x86_64 is already installed. - Dependencies resolved. - Nothing to do. - Complete! - Applying hot patch - Apply hot patch succeed: redis-6.2.5-1/HP002 - ``` - -- `SPEC` - - ```shell - [root@localhost dnf]# dnf hotupgrade bind-libs-hotpatch - ``` - -子命令的输出根据不同的情况与"--cve"命令相同。 - -## 使用场景说明 - -本段落介绍上述命令的使用场景及顺序介绍,需要提前确认本机的热补丁repo源和相应冷补丁repo源已开启。 - -使用热补丁扫描命令查看本机待修复cve。 - -```shell -[root@openEuler aops-apollo_src]# dnf hot-updateinfo list cves -Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒. -CVE-2023-22995 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2023-26545 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2022-40897 Important/Sec. python3-setuptools-59.4.0-5.oe2203sp1.noarch - -CVE-2021-1 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-HP001-1-1.x86_64 -CVE-2021-11 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-HP001-1-1.x86_64 -CVE-2021-2 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-HP002-1-1.x86_64 -CVE-2021-22 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-HP002-1-1.x86_64 -CVE-2021-33 Important/Sec. redis-6.2.5-4.x86_64 - -CVE-2021-3 Important/Sec. redis-6.2.5-4.x86_64 - -CVE-2022-38023 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - -CVE-2022-37966 Important/Sec. 
samba-client-4.17.2-5.oe2203sp1.x86_64 - -``` - -找到提供热补丁的相应cve,发现CVE-2021-1、CVE-2021-11、CVE-2021-2和CVE-2021-22可用热补丁修复。 - -在安装补丁前测试功能,基于redis.conf配置文件启动redis服务。 - -```shell -[root@openEuler redis_patch]# sudo redis-server ./redis.conf & -[1] 285075 -[root@openEuler redis_patch]# 285076:C 25 Mar 2023 12:09:51.503 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo -285076:C 25 Mar 2023 12:09:51.503 # Redis version=255.255.255, bits=64, commit=00000000, modified=0, pid=285076, just started -285076:C 25 Mar 2023 12:09:51.503 # Configuration loaded -285076:M 25 Mar 2023 12:09:51.504 * Increased maximum number of open files to 10032 (it was originally set to 1024). -285076:M 25 Mar 2023 12:09:51.504 * monotonic clock: POSIX clock_gettime - _._ - _.-``__ ''-._ - _.-`` `. `_. ''-._ Redis 255.255.255 (00000000/0) 64 bit - .-`` .-```. ```\/ _.,_ ''-._ - ( ' , .-` | `, ) Running in standalone mode - |`-._`-...-` __...-.``-._|'` _.-'| Port: 6380 - | `-._ `._ / _.-' | PID: 285076 - `-._ `-._ `-./ _.-' _.-' - |`-._`-._ `-.__.-' _.-'_.-'| - | `-._`-._ _.-'_.-' | https://redis.io - `-._ `-._`-.__.-'_.-' _.-' - |`-._`-._ `-.__.-' _.-'_.-'| - | `-._`-._ _.-'_.-' | - `-._ `-._`-.__.-'_.-' _.-' - `-._ `-.__.-' _.-' - `-._ _.-' - `-.__.-' - -285076:M 25 Mar 2023 12:09:51.505 # Server initialized -285076:M 25 Mar 2023 12:09:51.505 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect. -285076:M 25 Mar 2023 12:09:51.506 * Ready to accept connections - -``` - -安装前测试功能。 - -```shell -[root@openEuler ~]# telnet 127.0.0.1 6380 -Trying 127.0.0.1... -Connected to 127.0.0.1. -Escape character is '^]'. - -*100 - --ERR Protocol error: expected '$', got ' ' -Connection closed by foreign host. 
-``` - -指定修复CVE-2021-1,确认关联到对应的热补丁包,显示安装成功。 - -```shell -[root@openEuler aops-apollo_src]# dnf hotupgrade --cve CVE-2021-1 -Last metadata expiration check: 0:05:19 ago on 2023年03月25日 星期六 11时53分46秒. -Package patch-redis-6.2.5-1-HP001-1-1.x86_64 is already installed. -Dependencies resolved. -Nothing to do. -Complete! -Applying hot patch -Apply hot patch succeed: redis-6.2.5-1/HP001. -``` - -使用syscare确认该热补丁是否安装成功,确认Status为ACTIVED。 - -```shell -[root@openEuler ~]# syscare list -Uuid Name Status -cf47649c-b370-4f5a-a914-d2ca4d8f1f3a redis-6.2.5-1/HP001 ACTIVED -``` - -确认该cve是否已被修复,由于CVE-2021-1所使用的热补丁包patch-redis-6.2.5-1-HP001-1-1.x86_64同样修复CVE-2021-11,CVE-2021-1和CVE-2021-11都不予显示。 - -```shell -[root@openEuler dnf-plugins]# dnf hot-updateinfo list cves -Last metadata expiration check: 0:08:48 ago on 2023年03月25日 星期六 11时53分46秒. -CVE-2023-22995 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2023-1076 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2023-26607 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2022-40897 Important/Sec. python3-setuptools-59.4.0-5.oe2203sp1.noarch - -CVE-2021-22 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-HP002-1-1.x86_64 -CVE-2021-2 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-HP002-1-1.x86_64 -CVE-2021-33 Important/Sec. redis-6.2.5-4.x86_64 - -CVE-2021-3 Important/Sec. redis-6.2.5-4.x86_64 - -CVE-2022-38023 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - -CVE-2022-37966 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - -``` - -激活后测试功能,对比激活前回显内容。 - -```shell -[root@openEuler ~]# telnet 127.0.0.1 6380 -Trying 127.0.0.1... -Connected to 127.0.0.1. -Escape character is '^]'. - -*100 - --ERR Protocol error: unauthenticated multibulk length -Connection closed by foreign host. 
-``` - -由于热补丁还未开发完卸载功能,使用syscare指定Name手动卸载。 - -```shell -[root@openEuler ~]# syscare remove redis-6.2.5-1/HP001 -[root@openEuler ~]# syscare list -Uuid Name Status -cf47649c-b370-4f5a-a914-d2ca4d8f1f3a redis-6.2.5-1/HP001 NOT-APPLIED -``` - -使用热补丁扫描命令查看本机待修复cve,确认CVE-2021-1和CVE-2021-11正常显示。 - -```shell -[root@openEuler aops-apollo_src]# dnf hot-updateinfo list cves -Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒. -CVE-2023-22995 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2023-26545 Important/Sec. python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64 - -CVE-2022-40897 Important/Sec. python3-setuptools-59.4.0-5.oe2203sp1.noarch - -CVE-2021-1 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-HP001-1-1.x86_64 -CVE-2021-11 Important/Sec. redis-6.2.5-2.x86_64 patch-redis-6.2.5-1-HP001-1-1.x86_64 -CVE-2021-2 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-HP002-1-1.x86_64 -CVE-2021-22 Important/Sec. redis-6.2.5-3.x86_64 patch-redis-6.2.5-1-HP002-1-1.x86_64 -CVE-2021-33 Important/Sec. redis-6.2.5-4.x86_64 - -CVE-2021-3 Important/Sec. redis-6.2.5-4.x86_64 - -CVE-2022-38023 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - -CVE-2022-37966 Important/Sec. samba-client-4.17.2-5.oe2203sp1.x86_64 - -``` - -- case 1 - -指定安装热补丁包patch-redis-6.2.5-1-HP002-1-1.x86_64。 - -```shell -[root@openEuler aops-apollo_src]# dnf hotupgrade patch-redis-6.2.5-1-HP002-1-1.x86_64 -Last metadata expiration check: 0:05:19 ago on 2023年03月25日 星期六 11时53分46秒. -Package patch-redis-6.2.5-1-HP002-1-1.x86_64 is already installed. -Dependencies resolved. -Nothing to do. -Complete! -Applying hot patch -Apply hot patch succeed: redis-6.2.5-1/HP002. 
-```
-
-Use the hot patch scan command to view the CVEs to be fixed on the local host. Because the cold patch redis-6.2.5-3.x86_64 corresponding to patch-redis-6.2.5-1-HP002-1-1.x86_64 has a higher version than redis-6.2.5-2.x86_64, the CVEs of redis-6.2.5-2.x86_64 (CVE-2021-1 and CVE-2021-11), as well as CVE-2021-2 and CVE-2021-22, are all fixed.
-
-```shell
-[root@openEuler aops-apollo_src]# dnf hot-updateinfo list cves
-Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒.
-CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
-CVE-2023-26545  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
-CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
-CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64  -
-CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64  -
-CVE-2022-38023  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64  -
-CVE-2022-37966  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64  -
-```
-
-- case 2
-
-View xxx-updateinfo.xml.gz in the repodata directory of the hot patch repo source, and confirm the information related to CVE-2021-33 and CVE-2021-3 in the file.
-
-```xml
-<update>
-  <id>openEuler-SA-2022-3</id>
-  <title>An update for mariadb is now available for openEuler-22.03-LTS</title>
-  <severity>Important</severity>
-  <release>openEuler</release>
-  <description>patch-redis-6.2.5-2-HP001.(CVE-2022-24048)</description>
-  <pkglist>
-    <collection>
-      <name>openEuler</name>
-      <package name="patch-redis-6.2.5-2-HP001" arch="aarch64">
-        <filename>patch-redis-6.2.5-2-HP001-1-1.aarch64.rpm</filename>
-      </package>
-      <package name="patch-redis-6.2.5-2-HP001" arch="x86_64">
-        <filename>patch-redis-6.2.5-2-HP001-1-1.x86_64.rpm</filename>
-      </package>
-    </collection>
-  </pkglist>
-</update>
-```
-
-The name field "patch-redis-6.2.5-2-HP001" in package is composed as patch-[source package name]-[source package version]-[source package release]-[hot patch name]. This hot patch package requires the source version redis-6.2.5-2 to be installed on the local host; check the installed redis version.
-
-```shell
-[root@openEuler ~]# rpm -qa | grep redis
-redis-6.2.5-1.x86_64
-```
-
-Because the installed version does not match, the hot patch package name is not displayed and '-' is shown instead.
-
-```shell
-[root@openEuler aops-apollo_src]# dnf hot-updateinfo list cves
-Last metadata expiration check: 0:00:38 ago on 2023年03月25日 星期六 11时53分46秒.
-CVE-2023-22995  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
-CVE-2023-26545  Important/Sec.  python3-perf-5.10.0-136.22.0.98.oe2203sp1.x86_64  -
-CVE-2022-40897  Important/Sec.  python3-setuptools-59.4.0-5.oe2203sp1.noarch      -
-CVE-2021-33     Important/Sec.  redis-6.2.5-4.x86_64  -
-CVE-2021-3      Important/Sec.  redis-6.2.5-4.x86_64  -
-CVE-2022-38023  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64  -
-CVE-2022-37966  Important/Sec.  samba-client-4.17.2-5.oe2203sp1.x86_64  -
-```
diff --git a/docs/zh/docs/A-Ops/figures/029B66B9-5A3E-447E-B33C-98B894FC4833.png b/docs/zh/docs/A-Ops/figures/029B66B9-5A3E-447E-B33C-98B894FC4833.png deleted file mode 100644 index 230489c21dba54311356bbf2df56e817c0975f91..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/029B66B9-5A3E-447E-B33C-98B894FC4833.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/0BFA7C40-D404-4772-9C47-76EAD7D24E69.png b/docs/zh/docs/A-Ops/figures/0BFA7C40-D404-4772-9C47-76EAD7D24E69.png deleted file mode 100644 index 528bf4e30dc6221c496dd9a6d637359f592856db..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/0BFA7C40-D404-4772-9C47-76EAD7D24E69.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/1631073636579.png b/docs/zh/docs/A-Ops/figures/1631073636579.png deleted file mode 100644 index 5aacc487264ac63fbe5322b4f89fca3ebf9c7cd9..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/1631073636579.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/1631073840656.png b/docs/zh/docs/A-Ops/figures/1631073840656.png deleted file mode 100644 index 122e391eafe7c0d8d081030a240df90aea260150..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/1631073840656.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/1631101736624.png b/docs/zh/docs/A-Ops/figures/1631101736624.png deleted file mode 100644 index 74e2f2ded2ea254c66b221e8ac27a0d8bed9362a..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/1631101736624.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/1631101865366.png b/docs/zh/docs/A-Ops/figures/1631101865366.png deleted file mode 100644 index abfbc280a368b93af1e1165385af3a9cac89391d..0000000000000000000000000000000000000000 Binary files
a/docs/zh/docs/A-Ops/figures/1631101865366.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/1631101982829.png b/docs/zh/docs/A-Ops/figures/1631101982829.png deleted file mode 100644 index 0b1c9c7c3676b804dbdf19afbe4f3ec9dbe0627f..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/1631101982829.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/1631102019026.png b/docs/zh/docs/A-Ops/figures/1631102019026.png deleted file mode 100644 index 54e8e7d1cffbb28711074e511b08c73f66c1fb75..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/1631102019026.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/20210908212726.png b/docs/zh/docs/A-Ops/figures/20210908212726.png deleted file mode 100644 index f7d399aecd46605c09fe2d1f50a1a8670cd80432..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/20210908212726.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/D466AC8C-2FAF-4797-9A48-F6C346A1EC77.png b/docs/zh/docs/A-Ops/figures/D466AC8C-2FAF-4797-9A48-F6C346A1EC77.png deleted file mode 100644 index d87c5e04fa8cf4f2af0884226be66ddfb5f481e1..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/D466AC8C-2FAF-4797-9A48-F6C346A1EC77.png and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/attach\346\265\201\347\250\213.png" "b/docs/zh/docs/A-Ops/figures/attach\346\265\201\347\250\213.png" deleted file mode 100644 index 73b548cc332212f3ae2eec4dcec34c8af6e0e55a..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/attach\346\265\201\347\250\213.png" and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/check.PNG b/docs/zh/docs/A-Ops/figures/check.PNG deleted file mode 100644 index 2dce821dd43eec6f0d13cd6b2dc1e30653f35489..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/check.PNG and /dev/null differ diff --git 
a/docs/zh/docs/A-Ops/figures/dashboard.PNG b/docs/zh/docs/A-Ops/figures/dashboard.PNG deleted file mode 100644 index 2a4a827191367309aad28a8a6c1835df602bdf72..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/dashboard.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/deploy.PNG b/docs/zh/docs/A-Ops/figures/deploy.PNG deleted file mode 100644 index e30dcb0eb05eb4f41202c736863f3e0ff216398d..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/deploy.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/diag.PNG b/docs/zh/docs/A-Ops/figures/diag.PNG deleted file mode 100644 index a67e8515b8313a50b06cb985611ef9c166851811..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/diag.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/domain.PNG b/docs/zh/docs/A-Ops/figures/domain.PNG deleted file mode 100644 index bad499f96df5934565d36edf2308cec5e4147719..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/domain.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/domain_config.PNG b/docs/zh/docs/A-Ops/figures/domain_config.PNG deleted file mode 100644 index 8995424b35cda75f08881037446b7816a0ca09dc..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/domain_config.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/elasticsearch3.png b/docs/zh/docs/A-Ops/figures/elasticsearch3.png deleted file mode 100644 index 893aae242aa9117c64f323374d4728d230894973..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/elasticsearch3.png and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/elasticsearch\351\205\215\347\275\2561.png" "b/docs/zh/docs/A-Ops/figures/elasticsearch\351\205\215\347\275\2561.png" deleted file mode 100644 index 1b7e0eab093b2f0455b8f3972884e5f757fbec3d..0000000000000000000000000000000000000000 Binary files 
"a/docs/zh/docs/A-Ops/figures/elasticsearch\351\205\215\347\275\2561.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/elasticsearch\351\205\215\347\275\2562.png" "b/docs/zh/docs/A-Ops/figures/elasticsearch\351\205\215\347\275\2562.png" deleted file mode 100644 index 620dbbda71259e3b6ee6a2efb646a9692adf2456..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/elasticsearch\351\205\215\347\275\2562.png" and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/host.PNG b/docs/zh/docs/A-Ops/figures/host.PNG deleted file mode 100644 index 3c00681a567cf8f1e1baddfb6fdb7b6cf7df43de..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/host.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/icon-note.gif b/docs/zh/docs/A-Ops/figures/icon-note.gif new file mode 100644 index 0000000000000000000000000000000000000000..6314297e45c1de184204098efd4814d6dc8b1cda Binary files /dev/null and b/docs/zh/docs/A-Ops/figures/icon-note.gif differ diff --git a/docs/zh/docs/A-Ops/figures/jiemi.png b/docs/zh/docs/A-Ops/figures/jiemi.png deleted file mode 100644 index da07cfdf9296e201a82cceb210e651261fe7ecee..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/jiemi.png and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/kafka\351\205\215\347\275\256.png" "b/docs/zh/docs/A-Ops/figures/kafka\351\205\215\347\275\256.png" deleted file mode 100644 index 57eb17ccbd2fa63d97f700c29847fac7f08042ff..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/kafka\351\205\215\347\275\256.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/prometheus\351\205\215\347\275\256.png" "b/docs/zh/docs/A-Ops/figures/prometheus\351\205\215\347\275\256.png" deleted file mode 100644 index 7c8d0328967e8eb9bc4aa7465a273b9ef5a30b58..0000000000000000000000000000000000000000 Binary files 
"a/docs/zh/docs/A-Ops/figures/prometheus\351\205\215\347\275\256.png" and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/shanchuzhuji.png b/docs/zh/docs/A-Ops/figures/shanchuzhuji.png deleted file mode 100644 index b3da935739369dad1318fe135146755ede13c694..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/shanchuzhuji.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/shanchuzhujizu.png b/docs/zh/docs/A-Ops/figures/shanchuzhujizu.png deleted file mode 100644 index e4d85f6e3f1a269a483943f5115f54daa3de51de..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/shanchuzhujizu.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/spider.PNG b/docs/zh/docs/A-Ops/figures/spider.PNG deleted file mode 100644 index 53bad6dd38e36db9cadfdbeda21cbc3ef59eddf7..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/spider.PNG and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/spider_detail.jpg b/docs/zh/docs/A-Ops/figures/spider_detail.jpg deleted file mode 100644 index 6d4d2e2b9e79c53dbd359faa03e1c90f07c0ade6..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/spider_detail.jpg and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/syscare\347\203\255\350\241\245\344\270\201\347\212\266\346\200\201\345\233\276.png" "b/docs/zh/docs/A-Ops/figures/syscare\347\203\255\350\241\245\344\270\201\347\212\266\346\200\201\345\233\276.png" new file mode 100644 index 0000000000000000000000000000000000000000..bbd0600fc5c913198dfe1e1bf2aba9c652576a98 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/syscare\347\203\255\350\241\245\344\270\201\347\212\266\346\200\201\345\233\276.png" differ diff --git a/docs/zh/docs/A-Ops/figures/tianjiazhujizu.png b/docs/zh/docs/A-Ops/figures/tianjiazhujizu.png deleted file mode 100644 index ed4ab3616d418ecf33a006fee3985b8b6d2d965d..0000000000000000000000000000000000000000 Binary 
files a/docs/zh/docs/A-Ops/figures/tianjiazhujizu.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/tprofiling-run-arch.png b/docs/zh/docs/A-Ops/figures/tprofiling-run-arch.png deleted file mode 100644 index 0ad835125a5e7b7f66938543de1e1c9d53706ce4..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/tprofiling-run-arch.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/zhuji.png b/docs/zh/docs/A-Ops/figures/zhuji.png deleted file mode 100644 index f4c7b9103baab7748c83392f6120c8f00880860f..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/zhuji.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/figures/zuneizhuji.png b/docs/zh/docs/A-Ops/figures/zuneizhuji.png deleted file mode 100644 index 9f188d207162fa1418a61a10f83ef9c51a512e65..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/figures/zuneizhuji.png and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\347\256\241\347\220\206.jpg" "b/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\347\256\241\347\220\206.jpg" deleted file mode 100644 index 9f6d8858468c0cc72c1bd395403f064cc63f82bd..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\347\256\241\347\220\206.jpg" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\347\273\204.jpg" "b/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\347\273\204.jpg" deleted file mode 100644 index fb5472de6b3d30abf6af73e286f70ac8e1d58c15..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\347\273\204.jpg" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\350\257\246\346\203\205.jpg" "b/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\350\257\246\346\203\205.jpg" deleted file mode 100644 index 
effd8c29aba14c2e8f301f9f60d8f25ce8c533f0..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\344\270\273\346\234\272\350\257\246\346\203\205.jpg" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271mysql\351\205\215\347\275\256\346\226\207\344\273\266.png" "b/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271mysql\351\205\215\347\275\256\346\226\207\344\273\266.png" deleted file mode 100644 index d83425ee0622be329782620318818662b292e88b..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271mysql\351\205\215\347\275\256\346\226\207\344\273\266.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271\346\217\222\344\273\266.png" "b/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271\346\217\222\344\273\266.png" deleted file mode 100644 index ba4a8d4d9aadb7f712bdcb4b193f05f956d38841..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271\346\217\222\344\273\266.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\345\217\260.jpg" "b/docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\345\217\260.jpg" deleted file mode 100644 index 998b81e3b88d888d0915dcff48dc8cc5df30d91c..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\345\217\260.jpg" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\211\247\350\241\214\350\257\212\346\226\255.png" "b/docs/zh/docs/A-Ops/figures/\346\211\247\350\241\214\350\257\212\346\226\255.png" deleted file mode 100644 index afb5f7e9fbfb1d1ce46d096a61729766b4940cd3..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\211\247\350\241\214\350\257\212\346\226\255.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\212\245\345\221\212\345\206\205\345\256\271.png" 
"b/docs/zh/docs/A-Ops/figures/\346\212\245\345\221\212\345\206\205\345\256\271.png" deleted file mode 100644 index 2029141179302ecef45d34cb0c9dc916b9142e7b..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\212\245\345\221\212\345\206\205\345\256\271.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\217\222\344\273\266\347\256\241\347\220\206.jpg" "b/docs/zh/docs/A-Ops/figures/\346\217\222\344\273\266\347\256\241\347\220\206.jpg" deleted file mode 100644 index 2258d03976902052aaf39d36b6374fa680b9f8aa..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\217\222\344\273\266\347\256\241\347\220\206.jpg" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/app\350\257\246\346\203\205.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/app\350\257\246\346\203\205.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/app\350\257\246\346\203\205.jpg" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/app\350\257\246\346\203\205.jpg" diff --git "a/docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271\346\250\241\345\236\213.png" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\344\277\256\346\224\271\346\250\241\345\236\213.png" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\344\277\256\346\224\271\346\250\241\345\236\213.png" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\344\277\256\346\224\271\346\250\241\345\236\213.png" diff --git "a/docs/zh/docs/A-Ops/figures/\345\210\233\345\273\272\345\267\245\344\275\234\346\265\201.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\210\233\345\273\272\345\267\245\344\275\234\346\265\201.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\210\233\345\273\272\345\267\245\344\275\234\346\265\201.jpg" rename 
to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\210\233\345\273\272\345\267\245\344\275\234\346\265\201.jpg" diff --git "a/docs/zh/docs/A-Ops/figures/\345\221\212\350\255\246.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\221\212\350\255\246.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\221\212\350\255\246.jpg" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\221\212\350\255\246.jpg" diff --git "a/docs/zh/docs/A-Ops/figures/\345\221\212\350\255\246\347\241\256\350\256\244.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\221\212\350\255\246\347\241\256\350\256\244.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\221\212\350\255\246\347\241\256\350\256\244.jpg" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\221\212\350\255\246\347\241\256\350\256\244.jpg" diff --git "a/docs/zh/docs/A-Ops/figures/\345\221\212\350\255\246\350\257\246\346\203\205.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\221\212\350\255\246\350\257\246\346\203\205.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\221\212\350\255\246\350\257\246\346\203\205.jpg" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\221\212\350\255\246\350\257\246\346\203\205.jpg" diff --git "a/docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\346\265\201.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\267\245\344\275\234\346\265\201.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\346\265\201.jpg" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\267\245\344\275\234\346\265\201.jpg" diff --git 
"a/docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\346\265\201\350\257\246\346\203\205.jpg" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\267\245\344\275\234\346\265\201\350\257\246\346\203\205.jpg" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\267\245\344\275\234\346\265\201\350\257\246\346\203\205.jpg" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\267\245\344\275\234\346\265\201\350\257\246\346\203\205.jpg" diff --git "a/docs/zh/docs/A-Ops/figures/\345\272\224\347\224\250.png" "b/docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\272\224\347\224\250.png" similarity index 100% rename from "docs/zh/docs/A-Ops/figures/\345\272\224\347\224\250.png" rename to "docs/zh/docs/A-Ops/figures/\346\225\205\351\232\234\350\257\212\346\226\255/\345\272\224\347\224\250.png" diff --git "a/docs/zh/docs/A-Ops/figures/\346\226\260\345\242\236\346\225\205\351\232\234\346\240\221.png" "b/docs/zh/docs/A-Ops/figures/\346\226\260\345\242\236\346\225\205\351\232\234\346\240\221.png" deleted file mode 100644 index 664efd5150fcb96f009ce0eddc3d9ac91b9e622f..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\226\260\345\242\236\346\225\205\351\232\234\346\240\221.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\237\245\347\234\213\346\212\245\345\221\212\345\210\227\350\241\250.png" "b/docs/zh/docs/A-Ops/figures/\346\237\245\347\234\213\346\212\245\345\221\212\345\210\227\350\241\250.png" deleted file mode 100644 index 58307ec6ef4c73b6b0f039b1052e5870629ac2e8..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\237\245\347\234\213\346\212\245\345\221\212\345\210\227\350\241\250.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\237\245\347\234\213\346\225\205\351\232\234\346\240\221.png" 
"b/docs/zh/docs/A-Ops/figures/\346\237\245\347\234\213\346\225\205\351\232\234\346\240\221.png" deleted file mode 100644 index a566417b18e8bcf19153730904893fc8d827d885..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\237\245\347\234\213\346\225\205\351\232\234\346\240\221.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\267\273\345\212\240\344\270\273\346\234\272\347\273\204.jpg" "b/docs/zh/docs/A-Ops/figures/\346\267\273\345\212\240\344\270\273\346\234\272\347\273\204.jpg" deleted file mode 100644 index 9fcd24d949e500323e7a466be7cbfaf48d257ad0..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\346\267\273\345\212\240\344\270\273\346\234\272\347\273\204.jpg" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/CVE\350\257\246\346\203\205\347\225\214\351\235\242.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/CVE\350\257\246\346\203\205\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..05859540cb88e11bd8dedaeb8e03253254574c40 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/CVE\350\257\246\346\203\205\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/cve\345\210\227\350\241\250.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/cve\345\210\227\350\241\250.png" new file mode 100644 index 0000000000000000000000000000000000000000..f556e0e7e3c4096a89597cb08ba29133375aab07 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/cve\345\210\227\350\241\250.png" differ diff --git 
"a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\212\344\274\240\345\256\211\345\205\250\345\205\254\345\221\212.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\212\344\274\240\345\256\211\345\205\250\345\205\254\345\221\212.png" new file mode 100644 index 0000000000000000000000000000000000000000..801c7f917d717499c86708b419101be3773348ac Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\212\344\274\240\345\256\211\345\205\250\345\205\254\345\221\212.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\273\346\234\272\345\210\227\350\241\250\347\225\214\351\235\242.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\273\346\234\272\345\210\227\350\241\250\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..0719bb8c0b71d0503d5d3a7d8e9e83da71169c64 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\273\346\234\272\345\210\227\350\241\250\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\273\346\234\272\350\257\246\346\203\205.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\273\346\234\272\350\257\246\346\203\205.png" new file mode 100644 index 0000000000000000000000000000000000000000..21c9468ce4378bcadf537e543c756cf7a1347499 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\270\273\346\234\272\350\257\246\346\203\205.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\273\273\345\212\241\345\210\227\350\241\250.png" 
"b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\273\273\345\212\241\345\210\227\350\241\250.png" new file mode 100644 index 0000000000000000000000000000000000000000..9cfd080d1a658544c559e83429a14b35dc931fc6 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\273\273\345\212\241\345\210\227\350\241\250.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\273\273\345\212\241\350\257\246\346\203\205.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\273\273\345\212\241\350\257\246\346\203\205.png" new file mode 100644 index 0000000000000000000000000000000000000000..7ca43b0a82b7c4dd3e43a5e46cf3b4a79d55d033 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\273\273\345\212\241\350\257\246\346\203\205.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\277\256\345\244\215\344\273\273\345\212\241\346\212\245\345\221\212.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\277\256\345\244\215\344\273\273\345\212\241\346\212\245\345\221\212.png" new file mode 100644 index 0000000000000000000000000000000000000000..b9acfbcd7d8e3b2b551c8bb9700142dfba681afe Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\344\277\256\345\244\215\344\273\273\345\212\241\346\212\245\345\221\212.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\345\233\236\346\273\232\344\273\273\345\212\241\350\257\246\346\203\205.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\345\233\236\346\273\232\344\273\273\345\212\241\350\257\246\346\203\205.png" new file mode 100644 index 
0000000000000000000000000000000000000000..6bc8cc31e05d06dbd5ee4c0f62f281683db048da Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\345\233\236\346\273\232\344\273\273\345\212\241\350\257\246\346\203\205.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\346\267\273\345\212\240repo\346\272\220.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\346\267\273\345\212\240repo\346\272\220.png" new file mode 100644 index 0000000000000000000000000000000000000000..3bf992f586f7fb4d87bc01cc29f961755a315c9d Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\346\267\273\345\212\240repo\346\272\220.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\346\274\217\346\264\236\346\211\253\346\217\217.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\346\274\217\346\264\236\346\211\253\346\217\217.png" new file mode 100644 index 0000000000000000000000000000000000000000..f73ccaf984e8ab55f8b78f7da5a570ce43685221 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\346\274\217\346\264\236\346\211\253\346\217\217.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\344\277\256\345\244\215\344\273\273\345\212\241.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\344\277\256\345\244\215\344\273\273\345\212\241.png" new file mode 100644 index 0000000000000000000000000000000000000000..b183298d96b8ced8954852540c891310aeda05be Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\344\277\256\345\244\215\344\273\273\345\212\241.png" differ diff --git 
"a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\345\233\236\346\273\232\344\273\273\345\212\241.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\345\233\236\346\273\232\344\273\273\345\212\241.png" new file mode 100644 index 0000000000000000000000000000000000000000..c8aa813bc228326b3e8db19e303e03507873a893 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\345\233\236\346\273\232\344\273\273\345\212\241.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\347\203\255\350\241\245\344\270\201\347\247\273\351\231\244\344\273\273\345\212\241.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\347\203\255\350\241\245\344\270\201\347\247\273\351\231\244\344\273\273\345\212\241.png" new file mode 100644 index 0000000000000000000000000000000000000000..8ccebe84f60b21737414b2cb3f972472114a40c5 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\347\224\237\346\210\220\347\203\255\350\241\245\344\270\201\347\247\273\351\231\244\344\273\273\345\212\241.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\350\256\276\347\275\256repo\346\272\220.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\350\256\276\347\275\256repo\346\272\220.png" new file mode 100644 index 0000000000000000000000000000000000000000..619cc6d42b646df3d9c4e601f40a6ec452712668 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\350\256\276\347\275\256repo\346\272\220.png" differ diff --git 
"a/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\351\202\256\344\273\266\351\200\232\347\237\245.png" "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\351\202\256\344\273\266\351\200\232\347\237\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..34b1d4095b8c017f3c66ebfb3c44d114bc8d6ca7 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\346\274\217\346\264\236\347\256\241\347\220\206/\351\202\256\344\273\266\351\200\232\347\237\245.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255error1.png" "b/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255error1.png" deleted file mode 100644 index 9e5b1139febe9f00156b37f3268269ac30a78737..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255error1.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255\344\270\273\347\225\214\351\235\242.png" "b/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255\344\270\273\347\225\214\351\235\242.png" deleted file mode 100644 index b536af938250004bac3053b234bf20bcbf075c9b..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255\344\270\273\347\225\214\351\235\242.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255\345\233\276\347\211\207.png" "b/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255\345\233\276\347\211\207.png" deleted file mode 100644 index 6cef6216522407997d705d29131287f3a30b0f8f..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\350\257\212\346\226\255\345\233\276\347\211\207.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\345\210\227\350\241\250.png" 
"b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\345\210\227\350\241\250.png" new file mode 100644 index 0000000000000000000000000000000000000000..b8f0a87e00d73961907167fcbe43d82b60caf445 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\345\210\227\350\241\250.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\256\241\347\220\206-\346\267\273\345\212\240.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\256\241\347\220\206-\346\267\273\345\212\240.png" new file mode 100644 index 0000000000000000000000000000000000000000..ce25657a0627e9dfc3dc9ebf323e086103c2ecdf Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\256\241\347\220\206-\346\267\273\345\212\240.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\273\204\345\206\205\344\270\273\346\234\272\346\237\245\347\234\213.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\273\204\345\206\205\344\270\273\346\234\272\346\237\245\347\234\213.png" new file mode 100644 index 0000000000000000000000000000000000000000..2f2e2e67a98a16e1ad464c794a8ef45ebb229d7f Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\273\204\345\206\205\344\270\273\346\234\272\346\237\245\347\234\213.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\273\204\347\256\241\347\220\206\345\210\227\350\241\250.png" 
"b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\273\204\347\256\241\347\220\206\345\210\227\350\241\250.png" new file mode 100644 index 0000000000000000000000000000000000000000..94c9b65719050b79d2cdb9d1e8f67c459925cda7 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\273\204\347\256\241\347\220\206\345\210\227\350\241\250.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\274\226\350\276\221\347\225\214\351\235\242.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\274\226\350\276\221\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..7e4f0da4e88da6f18495a4fb23bd400d0da0a8da Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\347\274\226\350\276\221\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\350\257\246\346\203\205.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\350\257\246\346\203\205.png" new file mode 100644 index 0000000000000000000000000000000000000000..1ee8f7bb2456efe6318074f46f5008da355a2cb1 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\344\270\273\346\234\272\350\257\246\346\203\205.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\345\267\245\344\275\234\345\217\260.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\345\267\245\344\275\234\345\217\260.png" new file mode 100644 index 
0000000000000000000000000000000000000000..a916eebf306cca9ffa54f733143a0ac2c44313a4 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\345\267\245\344\275\234\345\217\260.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240-\346\226\207\344\273\266\350\247\243\346\236\220.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240-\346\226\207\344\273\266\350\247\243\346\236\220.png" new file mode 100644 index 0000000000000000000000000000000000000000..31684136510cfe6248adf9b8cd086140ab5b26ef Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240-\346\226\207\344\273\266\350\247\243\346\236\220.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240-\346\267\273\345\212\240\347\273\223\346\236\234.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240-\346\267\273\345\212\240\347\273\223\346\236\234.png" new file mode 100644 index 0000000000000000000000000000000000000000..df3991eb16d32d9f2296fbb36873ff26bc82fa18 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240-\346\267\273\345\212\240\347\273\223\346\236\234.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240\344\270\273\346\234\272.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240\344\270\273\346\234\272.png" new file mode 100644 index 
0000000000000000000000000000000000000000..c83daeeb5f8a4d9ab4f40e3debbe7a96f427ce74 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\211\271\351\207\217\346\267\273\345\212\240\344\270\273\346\234\272.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\214\207\346\240\207\346\263\242\345\275\242.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\214\207\346\240\207\346\263\242\345\275\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..5ab697c8f9c292097356a26140750f7f615c5d81 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\214\207\346\240\207\346\263\242\345\275\242.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\217\222\344\273\266\345\274\200\345\205\263.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\217\222\344\273\266\345\274\200\345\205\263.png" new file mode 100644 index 0000000000000000000000000000000000000000..4bde1fd7330491fda6f4ed73a2be2e8c0bfabc8d Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\217\222\344\273\266\345\274\200\345\205\263.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\267\273\345\212\240\344\270\273\346\234\272\347\273\204.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\267\273\345\212\240\344\270\273\346\234\272\347\273\204.png" new file mode 100644 index 0000000000000000000000000000000000000000..2890e4934ba903324ea134d3ebee85307665270e Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\346\267\273\345\212\240\344\270\273\346\234\272\347\273\204.png" differ diff --git 
"a/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\347\231\273\351\231\206\347\225\214\351\235\242.png" "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\347\231\273\351\231\206\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..24f94c0a9ff05897b01786aa4bc8adfe4bc8db09 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\350\265\204\344\272\247\347\256\241\347\220\206/\347\231\273\351\231\206\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256web.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256web.png" deleted file mode 100644 index 721335115922e03f255e67e6b775c1ac0cfbbc50..0000000000000000000000000000000000000000 Binary files "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256web.png" and /dev/null differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chakanyuqi.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chakanyuqi.png" new file mode 100644 index 0000000000000000000000000000000000000000..bbead6a91468d5dee570cfdc66faf9a4ab155d7c Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chakanyuqi.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chaxunshijipeizhi.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chaxunshijipeizhi.png" new file mode 100644 index 0000000000000000000000000000000000000000..d5f6e450fc0e1e246492ca71a6fcd8db572eb469 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chaxunshijipeizhi.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chuangjianyewuyu.png" 
"b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chuangjianyewuyu.png" new file mode 100644 index 0000000000000000000000000000000000000000..8849a2fc81dbd14328c6c66c53033164a0b67b52 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/chuangjianyewuyu.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/conf_file_trace.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/conf_file_trace.png" new file mode 100644 index 0000000000000000000000000000000000000000..e1e518157f8def332adfa5516b37fdb89768499c Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/conf_file_trace.png" differ diff --git a/docs/zh/docs/A-Ops/figures/peizhitongbu.png "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/peizhitongbu.png" similarity index 100% rename from docs/zh/docs/A-Ops/figures/peizhitongbu.png rename to "docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/peizhitongbu.png" diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/shanchupeizhi.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/shanchupeizhi.png" new file mode 100644 index 0000000000000000000000000000000000000000..cfea2eb44f7b8aa809404b8b49b4bd2e24172568 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/shanchupeizhi.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/tianjianode.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/tianjianode.png" new file mode 100644 index 0000000000000000000000000000000000000000..d68f5e12a62548f2ec59374bda9ab07f43b8b5cb Binary files /dev/null and 
"b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/tianjianode.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/xinzengpeizhi.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/xinzengpeizhi.png" new file mode 100644 index 0000000000000000000000000000000000000000..18d71c2e099c19b5d28848eec6a8d11f29ccee27 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/xinzengpeizhi.png" differ diff --git "a/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/zhuangtaichaxun.png" "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/zhuangtaichaxun.png" new file mode 100644 index 0000000000000000000000000000000000000000..a3d0b3294bf6e0eeec50a2c2f8c5059bdc256376 Binary files /dev/null and "b/docs/zh/docs/A-Ops/figures/\351\205\215\347\275\256\346\272\257\346\272\220/zhuangtaichaxun.png" differ diff --git "a/docs/zh/docs/A-Ops/gala-anteater\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/gala-anteater\344\275\277\347\224\250\346\211\213\345\206\214.md" index cb22de01deaa10ae68de83da577831fe45337647..5800f32170628bfa1fbaf9886dd2eb9d43a3563b 100644 --- "a/docs/zh/docs/A-Ops/gala-anteater\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/A-Ops/gala-anteater\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -1,153 +1,220 @@ # gala-anteater使用手册 -gala-anteater是一款基于AI的操作系统异常检测平台。主要提供时序数据预处理、异常点发现、异常上报等功能。基于线下预训练、线上模型的增量学习与模型更新,能够很好地适应于多维多模态数据故障诊断。 +gala-anteater是一款基于AI的操作系统异常检测平台。主要提供时序数据预处理、异常点发现、异常上报等功能。基于线下预训练、线上模型的增量学习与模型更新,能够很好地适用于多维多模态数据故障诊断。 -本文主要介绍如何部署和使用gala-anteater服务。 +本文主要介绍如何部署和使用gala-anteater服务,检测训练集群中的慢节点/慢卡。 ## 安装 挂载repo源: ```basic -[oe-2309] # openEuler 2309 官方发布源 -name=oe2309 -baseurl=http://119.3.219.20:82/openEuler:/23.09/standard_x86_64 +[everything] +name=everything 
+baseurl=http://121.36.84.172/dailybuild/EBS-openEuler-24.03-LTS-SP1/rc4_openeuler-2024-12-05-15-40-49/everything/$basearch/ enabled=1 gpgcheck=0 priority=1 -[oe-2309:Epol] # openEuler 2309:Epol 官方发布源 -name=oe2309_epol -baseurl=http://119.3.219.20:82/openEuler:/23.09:/Epol/standard_x86_64/ +[EPOL] +name=EPOL +baseurl=http://repo.openeuler.org/EBS-openEuler-24.03-LTS-SP1/EPOL/main/$basearch/ enabled=1 gpgcheck=0 priority=1 + ``` 安装gala-anteater: ```bash -# yum install gala-anteater +yum install gala-anteater ``` ## 配置 -> 说明:gala-anteater不包含额外需要配置的config文件,其参数通过命令行的启动参数传递。 - -##### 启动参数介绍 +>![](./figures/icon-note.gif)**说明:** +> +>gala-anteater采用配置的config文件设置参数启动,配置文件位置: /etc/gala-anteater/config/gala-anteater.yaml。 + +### 配置文件默认参数 + +```yaml +Global: + data_source: "prometheus" + +Arangodb: + url: "http://localhost:8529" + db_name: "spider" + +Kafka: + server: "192.168.122.100" + port: "9092" + model_topic: "gala_anteater_hybrid_model" + rca_topic: "gala_cause_inference" + meta_topic: "gala_gopher_metadata" + group_id: "gala_anteater_kafka" + # auth_type: plaintext/sasl_plaintext, please set "" for no auth + auth_type: "" + username: "" + password: "" + +Prometheus: + server: "localhost" + port: "9090" + steps: "5" + +Aom: + base_url: "" + project_id: "" + auth_type: "token" + auth_info: + iam_server: "" + iam_domain: "" + iam_user_name: "" + iam_password: "" + ssl_verify: 0 + +Schedule: + duration: 1 + +Suppression: + interval: 10 +``` -| 参数项 | 参数详细名 | 类型 | 是否必须 | 默认值 | 名称 | 含义 | -|---|---|---|---|---|---|---| -| -ks | --kafka_server | string | True | | KAFKA_SERVER | Kafka Server的ip地址,如:localhost / xxx.xxx.xxx.xxx | -| -kp | --kafka_port | string | True | | KAFKA_PORT | Kafka Server的port,如:9092 | -| -ps | --prometheus_server | string | True | | PROMETHEUS_SERVER | Prometheus Server的ip地址,如:localhost / xxx.xxx.xxx.xxx | -| -pp | --prometheus_port | string | True | | PROMETHEUS_PORT | Prometheus Server的port,如:9090 | -| -m | --model | string | False | vae | MODEL | 
异常检测模型,目前支持两种异常检测模型,可选(random_forest,vae)
random_forest:随机森林模型,不支持在线学习
vae:Variational Autoencoder,无监督模型,支持首次启动时,利用历史数据,进行模型更新迭代 | -| -d | --duration | int | False | 1 | DURATION | 异常检测模型执行频率(单位:分),每x分钟,检测一次 | -| -r | --retrain | bool | False | False | RETRAIN | 是否在启动时,利用历史数据,进行模型更新迭代,目前仅支持vae模型 | -| -l | --look_back | int | False | 4 | LOOK_BACK | 利用过去x天的历史数据,更新模型 | -| -t | --threshold | float | False | 0.8 | THRESHOLD | 异常检测模型的阈值:(0,1),较大的值,能够减少模型的误报率,推荐大于等于0.5 | -| -sli | --sli_time | int | False | 400 | SLI_TIME | 表示应用性能指标(单位:毫秒),较大的值,能够减少模型的误报率,推荐大于等于200
对于误报率较高的场景,推荐1000以上 | +| 参数 | 含义 | 默认值 | +| ----------- | ------------------------------------------------------------ | ---------------------------- | +| Global | 全局配置 | 字典类型 | +| data_source | 设置数据来源 | "prometheus" | +| Arangodb | Arangodb图数据库配置信息 | 字典类型 | +| url | 图数据库Arangodb的ip地址 | "http://localhost:8529" | +| db_name | 图数据库名 | "spider" | +| Kafka | kafka配置信息 | 字典类型 | +| server | Kafka Server的ip地址,根据安装节点ip配置 | "192.168.122.100" | +| port | Kafka Server的port,如:9092 | "9092" | +| model_topic | 故障检测结果上报topic | "gala_anteater_hybrid_model" | +| rca_topic | 根因定位结果上报topic | "gala_cause_inference" | +| meta_topic | gopher采集指标数据topic | "gala_gopher_metadata" | +| group_id | kafka设置组名 | "gala_anteater_kafka" | +| Prometheus | 数据源prometheus配置信息 | 字典类型 | +| server | Prometheus Server的ip地址,根据安装节点ip配置 | "localhost" | +| port | Prometheus Server的port,如:9090 | "9090" | +| steps | 指标采样间隔 | "5" | +| Schedule | 循环调度配置信息 | 字典类型 | +| duration | 异常检测模型执行频率(单位:分),每x分钟,检测一次 | 1 | +| Suppression | 告警抑制配置信息 | 字典类型 | +| interval | 告警抑制间隔(单位: 分),表示距离上一次告警x分钟内相同告警过滤 | 10 | ## 启动 -执行如下命令启动gala-anteater。 +执行如下命令启动gala-anteater -> 说明:gala-anteter支持命令行方式启动运行,不支持systemd方式。 - -### 在线训练方式运行(推荐) - -```bash -gala-anteater -ks {ip} -kp {port} -ps {ip} -pp {port} -m vae -r True -l 7 -t 0.6 -sli 400 ``` - -### 普通方式运行 - -```bash -gala-anteater -ks {ip} -kp {port} -ps {ip} -pp {port} -m vae -t 0.6 -sli 400 +systemctl start gala-anteater ``` -### 查询gala-anteater服务状态 +>![](./figures/icon-note.gif)**说明:** +> +>gala-anteater支持启动一个进程实例,启动多个会导致内存占用过大,日志混乱。 -若日志显示如下内容,说明服务启动成功,启动日志也会保存到当前运行目录下`logs/anteater.log`文件中。 +### 查询gala-anteater服务慢节点检测执行状态 + +若日志显示如下内容,说明慢节点正常运行,启动日志也会保存到当前运行目录下`/var/log/gala-anteater/gala-anteater.log`文件中。 ```log -2022-09-01 17:52:54,435 - root - INFO - Run gala_anteater main function... -2022-09-01 17:52:54,436 - root - INFO - Start to try updating global configurations by querying data from Kafka! 
-2022-09-01 17:52:54,994 - root - INFO - Loads metric and operators from file: xxx\metrics.csv -2022-09-01 17:52:54,997 - root - INFO - Loads metric and operators from file: xxx\metrics.csv -2022-09-01 17:52:54,998 - root - INFO - Start to re-train the model based on last day metrics dataset! -2022-09-01 17:52:54,998 - root - INFO - Get training data during 2022-08-31 17:52:00+08:00 to 2022-09-01 17:52:00+08:00! -2022-09-01 17:53:06,994 - root - INFO - Spends: 11.995422840118408 seconds to get unique machine_ids! -2022-09-01 17:53:06,995 - root - INFO - The number of unique machine ids is: 1! -2022-09-01 17:53:06,996 - root - INFO - Fetch metric values from machine: xxxx. -2022-09-01 17:53:38,385 - root - INFO - Spends: 31.3896164894104 seconds to get get all metric values! -2022-09-01 17:53:38,392 - root - INFO - The shape of training data: (17281, 136) -2022-09-01 17:53:38,444 - root - INFO - Start to execute vae model training... -2022-09-01 17:53:38,456 - root - INFO - Using cpu device -2022-09-01 17:53:38,658 - root - INFO - Epoch(s): 0 train Loss: 136.68 validate Loss: 117.00 -2022-09-01 17:53:38,852 - root - INFO - Epoch(s): 1 train Loss: 113.73 validate Loss: 110.05 -2022-09-01 17:53:39,044 - root - INFO - Epoch(s): 2 train Loss: 110.60 validate Loss: 108.76 -2022-09-01 17:53:39,235 - root - INFO - Epoch(s): 3 train Loss: 109.39 validate Loss: 106.93 -2022-09-01 17:53:39,419 - root - INFO - Epoch(s): 4 train Loss: 106.48 validate Loss: 103.37 -... -2022-09-01 17:53:57,744 - root - INFO - Epoch(s): 98 train Loss: 97.63 validate Loss: 96.76 -2022-09-01 17:53:57,945 - root - INFO - Epoch(s): 99 train Loss: 97.75 validate Loss: 96.58 -2022-09-01 17:53:57,969 - root - INFO - Schedule recurrent job with time interval 1 minute(s). 
-2022-09-01 17:53:57,973 - apscheduler.scheduler - INFO - Adding job tentatively -- it will be properly scheduled when the scheduler starts -2022-09-01 17:53:57,974 - apscheduler.scheduler - INFO - Added job "partial" to job store "default" -2022-09-01 17:53:57,974 - apscheduler.scheduler - INFO - Scheduler started -2022-09-01 17:53:57,975 - apscheduler.scheduler - DEBUG - Looking for jobs to run -2022-09-01 17:53:57,975 - apscheduler.scheduler - DEBUG - Next wakeup is due at 2022-09-01 17:54:57.973533+08:00 (in 59.998006 seconds) +2024-12-02 16:25:20,727 - INFO - anteater - Groups-0, metric: npu_chip_info_hbm_used_memory, start detection. +2024-12-02 16:25:20,735 - INFO - anteater - Metric-npu_chip_info_hbm_used_memory single group has data 8. ranks: [0, 1, 2, 3, 4, 5, 6, 7] +2024-12-02 16:25:20,739 - INFO - anteater - work on npu_chip_info_hbm_used_memory, slow_node_detection start. +2024-12-02 16:25:21,128 - INFO - anteater - time_node_compare result: []. +2024-12-02 16:25:21,137 - INFO - anteater - dnscan labels: [-1 0 0 0 -1 0 -1 -1] +2024-12-02 16:25:21,139 - INFO - anteater - dnscan labels: [-1 0 0 0 -1 0 -1 -1] +2024-12-02 16:25:21,141 - INFO - anteater - dnscan labels: [-1 0 0 0 -1 0 -1 -1] +2024-12-02 16:25:21,142 - INFO - anteater - space_nodes_compare result: []. +2024-12-02 16:25:21,142 - INFO - anteater - Time and space aggregated result: []. +2024-12-02 16:25:21,144 - INFO - anteater - work on npu_chip_info_hbm_used_memory, slow_node_detection end. + +2024-12-02 16:25:21,144 - INFO - anteater - Groups-0, metric: npu_chip_info_aicore_current_freq, start detection. +2024-12-02 16:25:21,153 - INFO - anteater - Metric-npu_chip_info_aicore_current_freq single group has data 8. ranks: [0, 1, 2, 3, 4, 5, 6, 7] +2024-12-02 16:25:21,157 - INFO - anteater - work on npu_chip_info_aicore_current_freq, slow_node_detection start. +2024-12-02 16:25:21,584 - INFO - anteater - time_node_compare result: []. 
+2024-12-02 16:25:21,592 - INFO - anteater - dnscan labels: [0 0 0 0 0 0 0 0] +2024-12-02 16:25:21,594 - INFO - anteater - dnscan labels: [0 0 0 0 0 0 0 0] +2024-12-02 16:25:21,597 - INFO - anteater - dnscan labels: [0 0 0 0 0 0 0 0] +2024-12-02 16:25:21,598 - INFO - anteater - space_nodes_compare result: []. +2024-12-02 16:25:21,598 - INFO - anteater - Time and space aggregated result: []. +2024-12-02 16:25:21,598 - INFO - anteater - work on npu_chip_info_aicore_current_freq, slow_node_detection end. + +2024-12-02 16:25:21,598 - INFO - anteater - Groups-0, metric: npu_chip_roce_tx_err_pkt_num, start detection. +2024-12-02 16:25:21,607 - INFO - anteater - Metric-npu_chip_roce_tx_err_pkt_num single group has data 8. ranks: [0, 1, 2, 3, 4, 5, 6, 7] +2024-12-02 16:25:21,611 - INFO - anteater - work on npu_chip_roce_tx_err_pkt_num, slow_node_detection start. +2024-12-02 16:25:22,040 - INFO - anteater - time_node_compare result: []. +2024-12-02 16:25:22,040 - INFO - anteater - Skip space nodes compare. +2024-12-02 16:25:22,040 - INFO - anteater - Time and space aggregated result: []. +2024-12-02 16:25:22,040 - INFO - anteater - work on npu_chip_roce_tx_err_pkt_num, slow_node_detection end. + +2024-12-02 16:25:22,041 - INFO - anteater - accomplishment: 1/9 +2024-12-02 16:25:22,041 - INFO - anteater - accomplishment: 2/9 +2024-12-02 16:25:22,041 - INFO - anteater - accomplishment: 3/9 +2024-12-02 16:25:22,041 - INFO - anteater - accomplishment: 4/9 +2024-12-02 16:25:22,042 - INFO - anteater - accomplishment: 5/9 +2024-12-02 16:25:22,042 - INFO - anteater - accomplishment: 6/9 +2024-12-02 16:25:22,042 - INFO - anteater - accomplishment: 7/9 +2024-12-02 16:25:22,042 - INFO - anteater - accomplishment: 8/9 +2024-12-02 16:25:22,042 - INFO - anteater - accomplishment: 9/9 +2024-12-02 16:25:22,043 - INFO - anteater - SlowNodeDetector._execute costs 1.83 seconds! +2024-12-02 16:25:22,043 - INFO - anteater - END! 
``` -## 输出数据 +## 异常检测输出数据 -gala-anteater如果检测到的异常点,会将结果输出至kafka。输出数据格式如下: +gala-anteater如果检测到异常点,会将结果输出至kafka的model_topic,输出数据格式如下: ```json { - "Timestamp":1659075600000, - "Attributes":{ - "entity_id":"xxxxxx_sli_1513_18", - "event_id":"1659075600000_1fd37742xxxx_sli_1513_18", - "event_type":"app" - }, - "Resource":{ - "anomaly_score":1.0, - "anomaly_count":13, - "total_count":13, - "duration":60, - "anomaly_ratio":1.0, - "metric_label":{ - "machine_id":"1fd37742xxxx", - "tgid":"1513", - "conn_fd":"18" - }, - "recommend_metrics":{ - "gala_gopher_tcp_link_notack_bytes":{ - "label":{ - "__name__":"gala_gopher_tcp_link_notack_bytes", - "client_ip":"x.x.x.165", - "client_port":"51352", - "hostname":"localhost.localdomain", - "instance":"x.x.x.172:8888", - "job":"prometheus-x.x.x.172", - "machine_id":"xxxxxx", - "protocol":"2", - "role":"0", - "server_ip":"x.x.x.172", - "server_port":"8888", - "tgid":"3381701" - }, - "score":0.24421279500639545 - }, - ... - }, - "metrics":"gala_gopher_ksliprobe_recent_rtt_nsec" - }, - "SeverityText":"WARN", - "SeverityNumber":14, - "Body":"TimeStamp, WARN, APP may be impacting sli performance issues." 
+ "Timestamp": 1730732076935, + "Attributes": { + "resultCode": 201, + "compute": false, + "network": false, + "storage": true, + "abnormalDetail": [{ + "objectId": "-1", + "serverIp": "96.13.19.31", + "deviceInfo": "96.13.19.31:8888*-1", + "kpiId": "gala_gopher_disk_wspeed_kB", + "methodType": "TIME", + "kpiData": [], + "relaIds": [], + "omittedDevices": [] + }], + "normalDetail": [], + "errorMsg": "" + }, + "SeverityText": "WARN", + "SeverityNumber": 13, + "is_anomaly": true } ``` + +## 输出字段说明 + +| 输出字段 | 单位 | 含义 | +| -------------- | ------ | ----------------------------------------------------- | +| Timestamp | ms | 检测到故障上报的时刻 | +| resultCode | int | 故障码,201表示故障,200表示无故障 | +| compute | bool | 故障类型是否为计算类型 | +| network | bool | 故障类型是否为网络类型 | +| storage | bool | 故障类型是否为存储类型 | +| abnormalDetail | list | 表示故障的细节 | +| objectId | int | 故障对象id,-1表示节点故障,0-7表示具体的故障卡号 | +| serverIp | string | 故障对象ip | +| deviceInfo | string | 详细的故障信息 | +| kpiId | string | 检测到故障的算法类型,"TIME", "SPACE" | +| kpiData | list | 故障时序数据,需开关打开,默认关闭 | +| relaIds | list | 故障卡关联的正常卡,表示在”SPACE“算法下对比的正常卡号 | +| omittedDevices | list | 忽略显示的卡号 | +| normalDetail | list | 正常卡的时序数据 | +| errorMsg | string | 错误信息 | +| SeverityText | string | 错误类型,表示"WARN", "ERROR" | +| SeverityNumber | int | 错误等级 | +| is_anomaly | bool | 表示是否故障 | \ No newline at end of file diff --git "a/docs/zh/docs/A-Ops/gala-gopher\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/gala-gopher\344\275\277\347\224\250\346\211\213\345\206\214.md" index df94faa69bae053c932b0ae757d9a114d5a1c7ac..6608d9f1884353f6bbbe59a32266840425dfd7ba 100644 --- "a/docs/zh/docs/A-Ops/gala-gopher\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/A-Ops/gala-gopher\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -1,1088 +1,236 @@ -# gala-gopher使用手册 - -gala-gopher作为数据采集模块提供OS级的监控能力,支持动态加 /卸载探针,可无侵入式地集成第三方探针,快速扩展监控范围。 - -本文介绍如何部署和使用gala-gopher服务。 - -## 安装 - -挂载repo源: - -```basic -[oe-2309] # openEuler 2309 官方发布源 
-name=oe2309 -baseurl=http://119.3.219.20:82/openEuler:/23.09/standard_x86_64 -enabled=1 -gpgcheck=0 -priority=1 - -[oe-2309:Epol] # openEuler 2309:Epol 官方发布源 -name=oe2309_epol -baseurl=http://119.3.219.20:82/openEuler:/23.09:/Epol/standard_x86_64/ -enabled=1 -gpgcheck=0 -priority=1 -``` - -安装gala-gopher: - -```bash -# yum install gala-gopher -``` - -## 配置 - -### 配置介绍 - -gala-gopher配置文件为`/opt/gala-gopher/gala-gopher.conf`,该文件配置项说明如下(省略无需用户配置的部分)。 - -如下配置可以根据需要进行修改: - -- global:gala-gopher全局配置信息 - - log_file_name:gala-gopher日志文件名 - - log_level:gala-gopher日志级别(暂未开放此功能) - - pin_path:ebpf探针共享map存放路径(建议维持默认配置) -- metric:指标数据metrics输出方式配置 - - out_channel:metrics输出通道,支持配置web_server|logs|kafka,配置为空则输出通道关闭 - - kafka_topic:若输出通道为kafka,此为topic配置信息 -- event:异常事件event输出方式配置 - - out_channel:event输出通道,支持配置logs|kafka,配置为空则输出通道关闭 - - kafka_topic:若输出通道为kafka,此为topic配置信息 - - timeout:同一异常事件上报间隔设置 - - desc_language:异常事件描述信息语言选择,当前支持配置zh_CN|en_US -- meta:元数据metadata输出方式配置 - - out_channel:metadata输出通道,支持logs|kafka,配置为空则输出通道关闭 - - kafka_topic:若输出通道为kafka,此为topic配置信息 -- ingress:探针数据上报相关配置 - - interval:暂未使用 -- egress:上报数据库相关配置 - - interval:暂未使用 - - time_range:暂未使用 -- imdb:cache缓存规格配置 - - max_tables_num:最大的cache表个数,/opt/gala-gopher/meta目录下每个meta对应一个表 - - max_records_num:每张cache表最大记录数,通常每个探针在一个观测周期内产生至少1条观测记录 - - max_metrics_num:每条观测记录包含的最大的metric指标个数 - - record_timeout:cache表老化时间,若cache表中某条记录超过该时间未刷新则删除记录,单位为秒 -- web_server:输出通道web_server配置 - - port:监听端口 -- rest_api_server - - port:RestFul API监听端口 - - ssl_auth:设置RestFul API开启https加密以及鉴权,on为开启,off为不开启,建议用户在实际生产环境开启 - - private_key:用于RestFul API https加密的服务端私钥文件绝对路径,当ssl_auth为“on”必配 - - cert_file:用于RestFul API https加密的服务端证书绝对路径,当ssl_auth为“on”必配 - - ca_file:用于RestFul API对客户端进行鉴权的CA中心证书绝对路径,当ssl_auth为“on”必配 -- kafka:输出通道kafka配置 - - kafka_broker:kafka服务器的IP和port - - batch_num_messages:每个批次发送的消息数量 - - compression_codec:消息压缩类型 - - queue_buffering_max_messages:生产者缓冲区中允许的最大消息数 - - queue_buffering_max_kbytes:生产者缓冲区中允许的最大字节数 - - 
queue_buffering_max_ms:生产者在发送批次之前等待更多消息加入的最大时间 -- logs:输出通道logs配置 - - metric_dir:metrics指标数据日志路径 - - event_dir:异常事件数据日志路径 - - meta_dir:metadata元数据日志路径 - - debug_dir:gala-gopher运行日志路径 - - - -#### 配置文件示例 - -- 配置选择数据输出通道: - - ```yaml - metric = - { - out_channel = "web_server"; - kafka_topic = "gala_gopher"; - }; - - event = - { - out_channel = "kafka"; - kafka_topic = "gala_gopher_event"; - }; - - meta = - { - out_channel = "kafka"; - kafka_topic = "gala_gopher_metadata"; - }; - ``` - -- 配置kafka和webServer: - - ```yaml - web_server = - { - port = 8888; - }; - - kafka = - { - kafka_broker = ":9092"; - }; - ``` -### 启动 - -配置完成后,执行如下命令启动gala-gopher。 - -```bash -# systemctl start gala-gopher.service -``` - -查询gala-gopher服务状态。 - -```bash -# systemctl status gala-gopher.service -``` - -若显示结果如下,说明服务启动成功。需要关注开启的探针是否已启动,如果探针线程不存在,请检查配置文件及gala-gopher运行日志文件。 - -![gala-gopher成功启动状态](./figures/gala-gopher成功启动状态.png) - -> 说明:gala-gopher部署和运行均需要root权限。 - -### REST 动态配置接口 - -WEB server端口可配置(缺省9999),URL组织方式 http://[gala-gopher所在节点ip]:[端口号]/[function(采集特性)],比如火焰图的URL:http://localhost:9999/flamegraph(以下文档均以火焰图举例)。 - - - -#### 配置探针监控范围 - -探针默认关闭,可以通过API动态开启、设置监控范围。以火焰图为例,通过REST分别开启oncpu/offcpu/mem火焰图能力。并且监控范围支持进程ID、进程名、容器ID、POD四个维度来设置。 - -下面是火焰图同时开启oncpu, offcpu采集特性的API举例: - -``` -curl -X PUT http://localhost:9999/flamegraph --data-urlencode json=' -{ - "cmd": { - "bin": "/opt/gala-gopher/extend_probes/stackprobe", - "check_cmd": "", - "probe": [ - "oncpu", - "offcpu" - ] - }, - "snoopers": { - "proc_id": [ - 101, - 102 - ], - "proc_name": [ - { - "comm": "app1", - "cmdline": "", - "debugging_dir": "" - }, - { - "comm": "app2", - "cmdline": "", - "debugging_dir": "" - } - ], - "pod_id": [ - "pod1", - "pod2" - ], - "container_id": [ - "container1", - "container2" - ] - } -}' - -``` - -全量采集特性说明如下: - -| 采集特性 | 采集特性说明 | 采集子项范围 | 监控对象 | 启动文件 | 启动条件 | -| ------------- | ------------------------------------- | ------------------------------------------------------------ | 
---------------------------------------- | ---------------------------------- | ------------------------- | -| flamegraph | 在线性能火焰图观测能力 | oncpu, offcpu, mem | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/stackprobe | NA | -| l7 | 应用7层协议观测能力 | l7_bytes_metrics、l7_rpc_metrics、l7_rpc_trace | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/l7probe | NA | -| tcp | TCP异常、状态观测能力 | tcp_abnormal, tcp_rtt, tcp_windows, tcp_rate, tcp_srtt, tcp_sockbuf, tcp_stats,tcp_delay | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/tcpprobe | NA | -| socket | Socket(TCP/UDP)异常观测能力 | tcp_socket, udp_socket | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/endpoint | NA | -| io | Block层I/O观测能力 | io_trace, io_err, io_count, page_cache | NA | $gala-gopher-dir/ioprobe | NA | -| proc | 进程系统调用、I/O、DNS、VFS等观测能力 | base_metrics, proc_syscall, proc_fs, proc_io, proc_dns,proc_pagecache | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/taskprobe | NA | -| jvm | JVM层GC, 线程, 内存, 缓存等观测能力 | NA | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/jvmprobe | NA | -| ksli | Redis性能SLI(访问时延)观测能力 | NA | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/ksliprobe | NA | -| postgre_sli | PG DB性能SLI(访问时延)观测能力 | NA | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/pgsliprobe | NA | -| opengauss_sli | openGauss访问吞吐量观测能力 | NA | [ip, port, dbname, user,password] | $gala-gopher-dir/pg_stat_probe.py | NA | -| dnsmasq | DNS会话观测能力 | NA | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/rabbitmq_probe.sh | NA | -| lvs | lvs会话观测能力 | NA | NA | $gala-gopher-dir/trace_lvs | lsmod\|grep ip_vs\| wc -l | -| nginx | Nginx L4/L7层会话观测能力 | NA | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/nginx_probe | NA | -| haproxy | Haproxy L4/7层会话观测能力 | NA | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/trace_haproxy | NA | -| kafka | kafka 生产者/消费者topic观测能力 | NA | dev, port | $gala-gopher-dir/kafkaprobe | NA | 
-| baseinfo | 系统基础信息 | cpu, mem, nic, disk, net, fs, proc,host | proc_id, proc_name, pod_id, container_id | system_infos | NA | -| virt | 虚拟化管理信息 | NA | NA | virtualized_infos | NA | -| tprofiling | 线程级性能profiling观测能力 | oncpu, syscall_file, syscall_net, syscall_lock, syscall_sched | proc_id, proc_name, pod_id, container_id | $gala-gopher-dir/tprofiling | NA | -| container | 容器信息 | NA | proc_id, proc_name, container_id | $gala-gopher-dir/cadvisor_probe.py | NA | - -#### 配置探针运行参数 - -探针在运行期间还需要设置一些参数设置,例如:设置火焰图的采样周期、上报周期。 - -``` -curl -X PUT http://localhost:9999/flamegraph --data-urlencode json=' -{ - "params": { - "report_period": 180, - "sample_period": 180, - "metrics_type": [ - "raw", - "telemetry" - ] - } -}' -``` - -详细参数运行参数如下: - -| 参数 | 含义 | 缺省值&范围 | 单位 | 支持的监控范围 | gala-gopher是否支持 | -| ------------------ | ------------------------------------------------------ | ------------------------------------------------------------ | ------- | ------------------------ | ------------------- | -| sample_period | 采样周期 | 5000, [100~10000] | ms | io, tcp | Y | -| report_period | 上报周期 | 60, [5~600] | s | ALL | Y | -| latency_thr | 时延上报门限 | 0, [10~100000] | ms | tcp, io, proc, ksli | Y | -| offline_thr | 进程离线上报门限 | 0, [10~100000] | ms | proc | Y | -| drops_thr | 丢包上送门限 | 0, [10~100000] | package | tcp, nic | Y | -| res_lower_thr | 资源百分比下限 | 0%, [0%~100%] | percent | ALL | Y | -| res_upper_thr | 资源百分比上限 | 0%, [0%~100%] | percent | ALL | Y | -| report_event | 上报异常事件 | 0, [0, 1] | NA | ALL | Y | -| metrics_type | 上报telemetry metrics | raw, [raw, telemetry] | NA | ALL | N | -| env | 工作环境类型 | node, [node, container, kubenet] | NA | ALL | N | -| report_source_port | 是否上报源端口 | 0, [0, 1] | NA | tcp | Y | -| l7_protocol | L7层协议范围 | http, [http, pgsql, mysql, redis, kafka, mongo, rocketmq, dns] | NA | l7 | Y | -| support_ssl | 支持SSL加密协议观测 | 0, [0, 1] | NA | l7 | Y | -| multi_instance | 是否每个进程输出独立火焰图 | 0, [0, 1] | NA | flamegraph | Y | -| native_stack | 是否显示本地语言堆栈(针对JAVA进程) | 0, [0, 1] | 
NA | flamegraph | Y | -| cluster_ip_backend | 执行Cluster IP backend转换 | 0, [0, 1] | NA | tcp,l7 | Y | -| pyroscope_server | 设置火焰图UI服务端地址 | localhost:4040 | NA | flamegraph | Y | -| svg_period | 火焰图svg文件生成周期 | 180, [30, 600] | s | flamegraph | Y | -| perf_sample_period | oncpu火焰图采集堆栈信息的周期 | 10, [10, 1000] | ms | flamegraph | Y | -| svg_dir | 火焰图svg文件存储目录 | "/var/log/gala-gopher/stacktrace" | NA | flamegraph | Y | -| flame_dir | 火焰图原始堆栈信息存储目录 | "/var/log/gala-gopher/flamegraph" | NA | flamegraph | Y | -| dev_name | 观测的网卡/磁盘设备名 | "" | NA | io, kafka, ksli, postgre_sli,baseinfo, tcp | Y | -| continuous_sampling | 是否持续采样 | 0, [0, 1] | NA | ksli | Y | -| elf_path | 要观测的可执行文件的路径 | "" | NA | nginx, haproxy, dnsmasq | Y | -| kafka_port | 要观测的kafka端口号 | 9092, [1, 65535] | NA | kafka | Y | -| cadvisor_port | 启动的cadvisor端口号 | 8080, [1, 65535] | NA | cadvisor | Y | - - - -#### 启动、停止探针 - -``` -curl -X PUT http://localhost:9999/flamegraph --data-urlencode json=' -{ - "state": "running" // optional: running,stopped -}' -``` - - - -#### 约束与限制说明 - -1. 接口为无状态形式,每次上传的设置为该探针的最终运行结果,包括状态、参数、监控范围。 -2. 监控对象可以任意组合,监控范围取合集。 -3. 启动文件必须真实有效。 -4. 采集特性可以按需开启全部/部分能力,关闭时只能整体关闭某个采集特性。 -5. opengauss监控对象是DB实例(IP/Port/dbname/user/password)。 -6. 
接口每次最多接收2048长度的数据。 - - - -#### 获取探针配置与运行状态 - -``` -curl -X GET http://localhost:9999/flamegraph -{ - "cmd": { - "bin": "/opt/gala-gopher/extend_probes/stackprobe", - "check_cmd": "" - "probe": [ - "oncpu", - "offcpu" - ] - }, - "snoopers": { - "proc_id": [ - 101, - 102 - ], - "proc_name": [ - { - "comm": "app1", - "cmdline": "", - "debugging_dir": "" - }, - { - "comm": "app2", - "cmdline": "", - "debugging_dir": "" - } - ], - "pod_id": [ - "pod1", - "pod2" - ], - "container_id": [ - "container1", - "container2" - ] - }, - "params": { - "report_period": 180, - "sample_period": 180, - "metrics_type": [ - "raw", - "telemetry" - ] - }, - "state": "running" -} -``` - -## stackprobe 介绍 - -适用于云原生环境的性能火焰图。 - -### 特性 - -- 支持观测C/C++、Go、Rust、Java语言应用。 - -- 调用栈支持容器、进程粒度:对于容器内进程,在调用栈底部分别以[Pod]和[Con]前缀标记工作负载Pod名称、容器Container名称。进程名以[]前缀标识,线程及函数(方法)无前缀。 - -- 支持本地生成svg格式火焰图或上传调用栈数据到中间件。 - -- 支持依照进程粒度多实例生成/上传火焰图。 - -- 对于Java进程的火焰图,支持同时显示本地方法和Java方法。 - -- 支持oncpu/offcpu/mem等多类型火焰图。 - -- 支持自定义采样周期。 - -### 使用说明 - -启动命令示例(基本):使用默认参数启动性能火焰图。 - -```shell -curl -X PUT http://localhost:9999/flamegraph -d json='{ "cmd": {"probe": ["oncpu"] }, "snoopers": {"proc_name": [{ "comm": "cadvisor"}] }, "state": "running"}' -``` - -启动命令示例(进阶):使用自定义参数启动性能火焰图。完整可配置参数列表参见[配置探针运行参数](#配置探针运行参数)。 - -```shell -curl -X PUT http://localhost:9999/flamegraph -d json='{ "cmd": { "check_cmd": "", "probe": ["oncpu", "offcpu", "mem"] }, "snoopers": { "proc_name": [{ "comm": "cadvisor", "cmdline": "", "debugging_dir": "" }, { "comm": "java", "cmdline": "", "debugging_dir": "" }] }, "params": { "perf_sample_period": 100, "svg_period": 300, "svg_dir": "/var/log/gala-gopher/stacktrace", "flame_dir": "/var/log/gala-gopher/flamegraph", "pyroscope_server": "localhost:4040", "multi_instance": 1, "native_stack": 0 }, "state": "running"}' -``` - -下面说明主要配置项: - -- 设置开启的火焰图类型 - - 通过probe参数设置,参数值为`oncpu`,`offcpu`,`mem`,分别代表进程cpu占用时间,进程被阻塞时间,进程申请内存大小的统计。 - - 示例: - - ` "probe": ["oncpu", "offcpu", "mem"]` - -- 设置生成本地火焰图svg文件的周期 - 
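上述探针运行参数也可以先在脚本中拼装、校验后再通过REST接口下发。下面是一个Python示意片段(端口9999与参数取值沿用本文示例;范围校验仅覆盖report_period与sample_period两项,属演示性假设,并非完整的参数校验实现):

```python
import json
import urllib.parse

# 按本文示例拼装 flamegraph 探针的 params 配置(示意)
def build_flamegraph_config(report_period=180, sample_period=180,
                            metrics_type=("raw", "telemetry")):
    # 参数表约定:report_period 范围 [5, 600](单位s),
    # sample_period 范围 [100, 10000](单位ms)
    if not 5 <= report_period <= 600:
        raise ValueError("report_period out of range [5, 600]")
    if not 100 <= sample_period <= 10000:
        raise ValueError("sample_period out of range [100, 10000]")
    payload = {
        "params": {
            "report_period": report_period,
            "sample_period": sample_period,
            "metrics_type": list(metrics_type),
        }
    }
    # 与 curl --data-urlencode json='...' 大致等价的请求体(name= 加 URL 编码后的内容)
    return "json=" + urllib.parse.quote(json.dumps(payload))

body = build_flamegraph_config()
```

生成的body可配合`curl -X PUT http://localhost:9999/flamegraph`下发;注意接口每次最多接收2048长度的数据。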
- 通过svg_period参数设置,单位为秒,默认值180,可选设置范围为[30, 600]的整数。 - - 示例: - - `"svg_period": 300` - -- 开启/关闭堆栈信息上传到pyroscope - - 通过pyroscope_server参数设置,参数值需要包含addr和port,参数为空或格式错误则探针不会尝试上传堆栈信息。 - - 上传周期30s。 - - 示例: - - `"pyroscope_server": "localhost:4040"` - -- 设置调用栈采样周期 - - 通过perf_sample_period设置,单位为毫秒,默认值10,可选设置范围为[10, 1000]的整数,此参数仅对oncpu类型的火焰图有效。 - - 示例: - - `"perf_sample_period": 100` - -- 开启/关闭多实例生成火焰图 - - 通过multi_instance设置,参数值为0或1,默认值为0。值为0表示所有进程的火焰图会合并在一起,值为1表示分开生成每个进程的火焰图。 - - 示例: - - `"multi_instance": 1` - -- 开启/关闭本地调用栈采集 - - 通过native_stack设置,参数值为0或1,默认值为0。此参数仅对JAVA进程有效。值为0表示不采集JVM自身的调用栈,值为1表示采集JVM自身的调用栈。 - - 示例: - - `"native_stack": 1` - - 显示效果:(左"native_stack": 1,右"native_stack": 0) - - ![image-20230804172905729](./figures/flame_muti_ins.png) - - - - -### 实现方案 - -#### 1. 用户态程序逻辑 - -周期性地(30s)根据符号表将内核态上报的堆栈信息从地址转换为符号。然后使用flamegraph插件或pyroscope将符号化的调用栈转换为火焰图。 - -其中,对于代码段类型获取符号表的方法不同。 - -- 内核符号表获取:读取/proc/kallsyms。 - -- 本地语言符号表获取:查询进程的虚拟内存映射文件(/proc/{pid}/maps),获取进程内存中各个代码段的地址映射,然后利用libelf库加载每个代码段对应模块的符号表。 - -- Java语言符号表获取: - - 由于 Java 方法没有静态映射到进程的虚拟地址空间,因此我们采用其他方式获取符号化的Java调用栈。 - - ##### 方式一:perf观测 - - 通过往Java进程加载JVM agent动态库来跟踪JVM的方法编译加载事件,获取并记录内存地址到Java符号的映射,从而实时生成Java进程的符号表。这种方法需要Java进程开启-XX:+PreserveFramePointer启动参数。本方式的优点是火焰图中可显示JVM自身的调用栈,而且这种方式生成的Java火焰图可以和其他进程的火焰图合并显示。 - - ##### 方式二:JFR观测 - - 通过动态开启JVM内置分析器JFR来跟踪Java应用程序的各种事件和指标。开启JFR的方式为往Java进程加载Java agent,Java agent中会调用JFR API。本方式的优点是对Java方法调用栈的采集会更加准确详尽。 - - 上述两种针对Java进程的性能分析方法都可以实时加载(不需要重启Java进程)且具有低底噪的优点。当stackprobe的启动参数为"multi_instance": 1且"native_stack": 0时,stackprobe会使用方法二生成Java进程火焰图,否则会使用方法一。 - -#### 2. 内核态程序逻辑 - -内核态基于eBPF实现。不同火焰图类型对应不同的eBPF程序。eBPF程序会周期性地或通过事件触发的方式遍历当前用户态和内核态的调用栈,并上报用户态。 - -##### 2.1 oncpu火焰图: - -在perf SW事件PERF_COUNT_SW_CPU_CLOCK上挂载采样eBPF程序,周期性采样调用栈。 - -##### 2.2 offcpu火焰图: - -在进程调度的tracepoint(sched_switch)上挂载采样eBPF程序,采样eBPF程序中记录进程被调度出去时间和进程id,在进程被调度回来时采样调用栈。 - -#### 2.3 mem火焰图: - -在缺页异常的tracepoint(page_fault_user)上挂载采样eBPF程序,事件触发时采样调用栈。 - -#### 3. Java语言支持: - -- stackprobe主进程: - - 1. 
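上文“根据符号表将地址转换为符号”的过程,核心是对按起始地址排序的符号表做二分查找。下面用一个纯Python片段示意该查找逻辑(符号表内容为演示用的假设数据;真实实现中符号表来自/proc/kallsyms或libelf解析结果):

```python
import bisect

# 把栈地址解析为“符号+偏移”(示意)
def resolve(addr, sym_table):
    # sym_table: 按起始地址升序排列的 [(start_addr, name), ...]
    starts = [s for s, _ in sym_table]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return "[unknown]"
    start, name = sym_table[i]
    return "%s+0x%x" % (name, addr - start)

# 演示用假设符号表
table = [(0x1000, "main"), (0x1800, "do_work"), (0x2400, "helper")]
```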
接收到ipc消息获取要观测的Java进程。
-  2. 使用Java代理加载模块向待观测的Java进程加载JVM代理程序:jvm_agent.so(对应[方式一](#方式一perf观测))或JstackProbeAgent.jar(对应[方式二](#方式二jfr观测))。
-  3. 方式一中,主进程会加载对应java进程的java-symbols.bin文件,供地址转换符号时查询。方式二中,主进程会加载对应java进程的stacks-{flame_type}.txt文件,可直接供火焰图生成。
-
-- Java代理加载模块
-
-  1. 发现新增java进程,则将JVM代理程序复制到该进程空间下/proc/<pid>/root/tmp(因为attach时容器内JVM需要可见此代理程序)。
-  2. 将上述目录和JVM代理程序的owner设置为与被观测java进程一致。
-  3. 启动jvm_attach子进程,并传入被观测java进程相关参数。
-
-- JVM代理程序
-
-  - jvm_agent.so:注册JVMTI回调函数
-
-    当JVM加载一个Java方法或者动态编译一个本地方法时,JVM会调用回调函数,回调函数会将java类名和方法名以及对应的内存地址写入被观测java进程空间下(/proc/<pid>/root/tmp/java-data-<pid>/java-symbols.bin)。
-
-  - JstackProbeAgent.jar:调用JFR API
-
-    开启持续30s的JFR功能,并将JFR统计结果转换为火焰图可用的堆栈格式,结果输出到被观测java进程空间下(/proc/<pid>/root/tmp/java-data-<pid>/stacks-{flame_type}.txt)。详见[JstackProbe简介](https://gitee.com/openeuler/gala-gopher/blob/dev/src/probes/extends/java.probe/jstack.probe/readme.md)。
-
-- jvm_attach:用于实时加载JVM代理程序到被观测进程的JVM上(参考jdk源码中sun.tools.attach.LinuxVirtualMachine和jattach工具)。
-
-  1. 设置自身的namespace(JVM加载agent时要求加载进程和被观测进程的namespace一致)。
-  2. 检查JVM attach listener是否启动(是否存在UNIX socket文件:/proc/<pid>/root/tmp/.java_pid<pid>)。
-  3. 未启动则创建/proc/<pid>/cwd/.attach_pid<pid>,并发送SIGQUIT信号给JVM。
-  4. 连接UNIX socket。
-  5.
读取响应为0表示attach成功。 - - attach agent流程图示: - - ![attach流程](./figures/attach流程.png) - - - - -### 注意事项 - -- 对于Java应用的观测,为获取最佳观测效果,请设置stackprobe启动选项为"multi_instance": 1, "native_stack": 0来使能JFR观测(JDK8u262+)。否则stackprobe会以perf方式来生成Java火焰图。perf方式下,请开启JVM选项XX:+PreserveFramePointer(JDK8以上)。 - -### 约束条件 - -- 支持基于hotspot JVM的Java应用观测。 - - -## tprofiling 介绍 - -tprofiling 是 gala-gopher 提供的一个基于 ebpf 的线程级应用性能诊断工具,它使用 ebpf 技术观测线程的关键系统性能事件,并关联丰富的事件内容,从而实时地记录线程的运行状态和关键行为,帮助用户快速识别应用性能问题。 - -### 功能特性 - -从操作系统的视角来看,一个运行的应用程序是由多个进程组成,每个进程是由多个运行的线程组成。tprofiling 通过观测这些线程运行过程中执行的一些关键行为(后面称之为**事件**)并记录下来,然后在前端界面以时间线的方式进行展示,进而就可以很直观地分析这些线程在某段时间内正在做什么,是在 CPU 上执行还是阻塞在某个文件、网络操作上。当应用程序出现性能问题时,通过分析对应线程的关键性能事件的执行序列,快速地进行定界定位。 - -基于当前已实现的事件观测范围, tprofiling 能够定位的应用性能问题场景主要包括: - -- 文件 I/O 耗时、阻塞问题 -- 网络 I/O 耗时、阻塞问题 -- 锁竞争问题 -- 死锁问题 - -随着更多类型的事件不断地补充和完善,tprofiling 将能够覆盖更多类型的应用性能问题场景。 - -### 事件观测范围 - -tprofiling 当前支持的系统性能事件包括两大类:系统调用事件和 oncpu 事件。 - -**系统调用事件** - -应用性能问题通常是由于系统资源出现瓶颈导致,比如 CPU 资源占用过高、I/O 资源等待。应用程序往往通过系统调用访问这些系统资源,因此可以对关键的系统调用事件进行观测来识别耗时、阻塞的资源访问操作。 - -tprofiling 当前已观测的系统调用事件参见章节: [支持的系统调用事件](#支持的系统调用事件) ,大致分为几个类型:文件操作(file)、网络操作(net)、锁操作(lock)和调度操作(sched)。下面列出部分已观测的系统调用事件: - -- 文件操作(file) - - read/write:读写磁盘文件或网络,可能会耗时、阻塞。 - - sync/fsync:对文件进行同步刷盘操作,完成前线程会阻塞。 -- 网络操作(net) - - send/recv:读写网络,可能会耗时、阻塞。 -- 锁操作(lock) - - futex:用户态锁实现相关的系统调用,触发 futex 往往意味出现锁竞争,线程可能进入阻塞状态。 -- 调度操作(sched):这里泛指那些可能会引起线程状态变化的系统调用事件,如线程让出 cpu 、睡眠、或等待其他线程等。 - - nanosleep:线程进入睡眠状态。 - - epoll_wait:等待 I/O 事件到达,事件到达之前线程会阻塞。 - -**oncpu 事件** - -此外,根据线程是否在 CPU 上运行可以将线程的运行状态分为两种:oncpu 和 offcpu ,前者表示线程正在 CPU 上运行,后者表示线程不在 CPU 上运行。通过观测线程的 oncpu 事件,可以识别线程是否正在执行耗时的 cpu 操作。 - -### 事件内容 - -线程 profiling 事件主要包括以下几部分内容。 - -- 事件来源信息:包括事件所属的线程ID、线程名、进程ID、进程名、容器ID、容器名、主机ID、主机名等信息。 - - - `thread.pid`:事件所属的线程ID。 - - `thread.comm`:事件所属的线程名。 - - `thread.tgid`:事件所属的进程ID。 - - `proc.name`:事件所属的进程名。 - - `container.id`:事件所属的容器ID。 - - `container.name`:事件所属的容器名。 - - `host.id`:事件所属的主机ID。 - - `host.name`:事件所属的主机名。 - -- 事件属性信息:包括公共的事件属性和扩展的事件属性。 - - - 
公共的事件属性:包括事件名、事件类型、事件开始时间、事件结束时间、事件执行时间等。 - - - `event.name`:事件名。 - - `event.type`:事件类型,目前支持 oncpu、file、net、lock、sched 五种。 - - `start_time`:事件开始时间,聚合事件中第一个事件的开始时间,关于聚合事件的说明参见章节:[聚合事件](#聚合事件) 。 - - `end_time`:事件结束时间,聚合事件中最后一个事件的结束时间。 - - `duration`:事件执行时间,值为(end_time - start_time)。 - - `count`:事件聚合数量。 - - - 扩展的事件属性:针对不同的系统调用事件,补充更加丰富的事件内容。如 read/write 文件或网络时,提供文件路径、网络连接以及函数调用栈等信息。 - - - `func.stack`:事件的函数调用栈信息。 - - `file.path`:文件类事件的文件路径信息。 - - `sock.conn`:网络类事件的tcp连接信息。 - - `futex.op`:futex系统调用事件的操作类型,取值为 wait 或 wake 。 - - 不同事件类型支持的扩展事件属性的详细情况参见章节:[支持的系统调用事件](#支持的系统调用事件) 。 - -### 事件输出 - -tprofiling 作为 gala-gopher 提供的一个扩展的 ebpf 探针程序,产生的系统事件会发送至 gala-gopher 处理,并由 gala-gopher 按照开源的 openTelemetry 事件格式对外输出,并通过 json 格式发送到 kafka 消息队列中。前端可以通过对接 kafka 消费 tprofiling 事件。 - -下面是线程 profiling 事件的一个输出示例: - -```json -{ - "Timestamp": 1661088145000, - "SeverityText": "INFO", - "SeverityNumber": 9, - "Body": "", - "Resource": { - "host.id": "", - "host.name": "", - "thread.pid": 10, - "thread.tgid": 10, - "thread.comm": "java", - "proc.name": "xxx.jar", - "container.id": "", - "container.name": "", - }, - "Attributes": { - values: [ - { - // common info - "event.name": "read", - "event.type": "file", - "start_time": 1661088145000, - "end_time": 1661088146000, - "duration": 0.1, - "count": 1, - // extend info - "func.stack": "read;", - "file.path": "/test.txt" - }, - { - "event.name": "oncpu", - "event.type": "oncpu", - "start_time": 1661088146000, - "end_time": 1661088147000, - "duration": 0.1, - "count": 1, - } - ] - } -} -``` - -部分字段说明: - -- `Timestamp`:事件上报的事件点。 -- `Resource`:包括事件来源信息。 -- `Attributes`:包括事件属性信息,它包含一个 `values` 列表字段,列表中的每一项表示一个属于相同来源的 tprofiling 事件,其中包含该事件的属性信息。 - -### 快速开始 - -#### 安装部署 - -tprofiling 是 gala-gopher 提供的一个扩展的 ebpf 探针程序,因此,需要先安装部署好 gala-gopher 软件,然后再开启 tprofiling 功能。 - -另外,为了能够在前端用户界面使用 tprofiling 的能力,[gala-ops](https://gitee.com/openeuler/gala-docs) 基于开源的 `kafka + logstash + elasticsearch + grafana` 可观测软件搭建了用于演示的 tprofiling 功能的用户界面,用户可以使用 gala-ops 
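基于上面的事件输出格式,消费端可以按事件类型汇总各线程的执行时间,快速看出线程时间花在哪一类操作上。下面是一个不依赖kafka的Python示意片段(输入为已反序列化的单条tprofiling消息,字段名沿用上文示例,消息内容为演示用假设数据):

```python
from collections import defaultdict

# 对一条 tprofiling 输出(已反序列化为 dict)按 event.type 汇总 duration(示意)
def summarize_durations(tprofiling_msg):
    totals = defaultdict(float)
    for event in tprofiling_msg["Attributes"]["values"]:
        totals[event["event.type"]] += event["duration"]
    return dict(totals)

# 演示用假设消息,字段结构沿用本文“事件输出”示例
msg = {
    "Resource": {"thread.pid": 10, "thread.comm": "java"},
    "Attributes": {"values": [
        {"event.name": "read", "event.type": "file", "duration": 0.1, "count": 1},
        {"event.name": "oncpu", "event.type": "oncpu", "duration": 0.1, "count": 1},
        {"event.name": "write", "event.type": "file", "duration": 0.3, "count": 2},
    ]},
}
summary = summarize_durations(msg)
```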
提供的部署工具进行快速部署。
-
-#### 运行架构
-
-![](./figures/tprofiling-run-arch.png)
-
-前端软件说明:
-
-- kafka:一个开源的消息队列中间件,用于接收并存储 gala-gopher 采集的 tprofiling 事件。
-- logstash:一个实时的开源日志收集引擎,用于从 kafka 消费 tprofiling 事件,经过过滤、转换等处理后发送至 elasticsearch 。
-- elasticsearch:一个开放的分布式搜索和分析引擎,用于储存经过处理后的 tprofiling 事件,供 grafana 查询和可视化展示。
-- grafana:一个开源的可视化工具,用于查询并可视化展示采集的 tprofiling 事件。用户最终通过 grafana 提供的用户界面来使用 tprofiling 的功能,分析应用性能问题。
-
-#### 部署 tprofiling 探针
-
-用户需要先安装好 gala-gopher,具体的安装部署说明可参考 [gala-gopher文档](https://gitee.com/openeuler/gala-gopher#快速开始) 。由于 tprofiling 事件会发送到 kafka 中,因此部署时需要配置好 kafka 的服务地址。
-
-安装并运行 gala-gopher 后,使用 gala-gopher 提供的基于 HTTP 的动态配置接口启动 tprofiling 探针。
-
-```sh
-curl -X PUT http://<gala-gopher节点IP>:9999/tprofiling -d json='{"cmd": {"probe": ["oncpu", "syscall_file", "syscall_net", "syscall_sched", "syscall_lock"]}, "snoopers": {"proc_name": [{"comm": "java"}]}, "state": "running"}'
-```
-
-配置参数说明:
-
-- `<gala-gopher节点IP>`:部署 gala-gopher 的节点 IP。
-- `probe`:`cmd` 下的 `probe` 配置项指定了 tprofiling 探针观测的系统事件范围。其中,oncpu、syscall_file、syscall_net、syscall_sched、syscall_lock 分别对应 oncpu 事件,以及 file、net、sched、lock 四类系统调用事件。用户可根据需要只开启部分 tprofiling 事件类型的观测。
-- `proc_name`:`snoopers` 下的 `proc_name` 配置项用于过滤要观测的进程名。另外也可以通过 `proc_id` 配置项来过滤要观测的进程ID,详情参考:[REST 动态配置接口](#rest-动态配置接口)。
-
-要关闭 tprofiling 探针,执行如下命令:
-
-```sh
-curl -X PUT http://<gala-gopher节点IP>:9999/tprofiling -d json='{"state": "stopped"}'
-```
-
-#### 部署前端软件
-
-使用 tprofiling 功能的用户界面需要用到的软件包括:kafka、logstash、elasticsearch、grafana。这些软件安装在管理节点,用户可以使用 gala-ops 提供的部署工具进行快速安装部署,参考:[在线部署文档](https://gitee.com/openeuler/gala-docs#%E5%9C%A8%E7%BA%BF%E9%83%A8%E7%BD%B2)。
-
-在管理节点上,通过 [在线部署文档](https://gitee.com/openeuler/gala-docs#%E5%9C%A8%E7%BA%BF%E9%83%A8%E7%BD%B2) 获取部署脚本后,执行如下命令一键安装中间件:kafka、logstash、elasticsearch。
-
-```sh
-sh deploy.sh middleware -K <部署节点管理IP> -E <部署节点管理IP> -A -p
-```
-
-执行如下命令一键安装 grafana 。
-
-```sh
-sh deploy.sh grafana -P -E
-```
-
-#### 使用
-
-完成上述部署动作后,即可通过浏览器访问 `http://[部署节点管理IP]:3000` 登录 grafana 来使用 A-Ops,登录用户名、密码默认均为 admin。
-
-登录 grafana 界面后,找到名为
`ThreadProfiling` 的 dashboard。 - -![image-20230628155002410](./figures/tprofiling-dashboard.png) - -点击进入 tprofiling 功能的前端界面,接下来就可以探索 tprofiling 的功能了。 - -![image-20230628155249009](./figures/tprofiling-dashboard-detail.png) - -### 使用案例 - -#### 案例1:死锁问题定位 - -![image-20230628095802499](./figures/deadlock.png) - -上图是一个死锁 Demo 进程的线程 profiling 运行结果,从饼图中进程事件执行时间的统计结果可以看到,这段时间内 lock 类型事件(灰色部分)占比比较高。下半部分是整个进程的线程 profiling 展示结果,纵轴展示了进程内不同线程的 profiling 事件的执行序列。其中,线程 `java` 为主线程一直处于阻塞状态,业务线程 `LockThd1` 和 `LockThd2` 在执行一些 oncpu 事件和 file 类事件后会间歇性的同时执行一段长时间的 lock 类事件。将光标悬浮到 lock 类型事件上可以查看事件内容,(如下图所示)它触发了 futex 系统调用事件,执行时间为 60 秒。 - -![image-20230628101056732](./figures/deadlock2.png) - -基于上述观测,我们可以发现业务线程 `LockThd1` 和 `LockThd2` 可能存在异常行为。接下来,我们可以进入线程视图,查看这两个业务线程 `LockThd1` 和 `LockThd2` 的线程 profiling 结果。 - -![image-20230628102138540](./figures/deadlock3.png) - -上图是每个线程的 profiling 结果展示,纵轴展示线程内不同事件类型的执行序列。从图中可以看到,线程 `LockThd1` 和 `LockThd2` 正常情况下会定期执行 oncpu 事件,其中包括执行一些 file 类事件和 lock 类事件。但是在某个时间点(10:17:00附近)它们会同时执行一个长时间的 lock 类型的 futex 事件,而且这段时间内都没有 oncpu 事件发生,说明它们都进入了阻塞状态。futex 是用户态锁实现相关的系统调用,触发 futex 往往意味出现锁竞争,线程可能进入阻塞状态。 - -基于上述分析,线程 `LockThd1` 和 `LockThd2` 很可能是出现了死锁问题。 - -#### 案例2:锁竞争问题定位 - -![image-20230628111119499](./figures/lockcompete1.png) - -上图是一个锁竞争 Demo 进程的线程 profiling 运行结果。从图中可以看到,该进程在这段时间内主要执行了 lock、net、oncpu 三类事件,该进程包括 3 个运行的业务线程。在11:05:45 - 11:06:45 这段时间内,我们发现这 3 个业务线程的事件执行时间都变得很长了,这里面可能存在性能问题。同样,我们进入线程视图,查看每个线程的线程 profiling 结果,同时我们将时间范围缩小到可能有异常的时间点附近。 - -![image-20230628112709827](./figures/lockcompete2.png) - -通过查看每个线程的事件执行序列,可以大致了解每个线程这段时间在执行什么功能。 - -- 线程 CompeteThd1:每隔一段时间触发短时的 oncpu 事件,执行一次计算任务;但是在 11:05:45 时间点附近开始触发长时的 oncpu 事件,说明正在执行耗时的计算任务。 - - ![image-20230628113336435](./figures/lockcompete3.png) - -- 线程 CompeteThd2:每隔一段时间触发短时的 net 类事件,点击事件内容可以看到,该线程正在通过 write 系统调用发送网络消息,且可以看到对应的 tcp 连接信息;同样在 11:05:45 时间点附近开始执行长时的 futex 事件并进入阻塞状态,此时 write 网络事件的执行间隔变长了。 - - ![image-20230628113759887](./figures/lockcompete4.png) - - 
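上述案例分析的共同思路,是寻找多个线程同时阻塞在 lock 类(futex)事件上的时间段。下面的Python示意片段在给定各线程事件区间后,自动找出至少两个线程同时处于 futex 阻塞的时间窗(线程名与时间取值均为演示用假设数据):

```python
# 找出至少 min_threads 个线程同时处于 futex 阻塞的时间窗(示意)
def concurrent_futex_windows(thread_events, min_threads=2):
    # thread_events: {线程名: [(start, end, event_name), ...]}
    boundaries = []
    for _tname, events in thread_events.items():
        for start, end, name in events:
            if name == "futex":
                boundaries.append((start, 1))   # 进入阻塞
                boundaries.append((end, -1))    # 退出阻塞
    boundaries.sort()
    windows, active, win_start = [], 0, None
    for ts, delta in boundaries:
        active += delta
        if active >= min_threads and win_start is None:
            win_start = ts
        elif active < min_threads and win_start is not None:
            windows.append((win_start, ts))
            win_start = None
    return windows

# 演示数据:两个线程在 [12, 70] 内同时阻塞于 futex
demo = {
    "LockThd1": [(0, 5, "oncpu"), (10, 70, "futex")],
    "LockThd2": [(2, 6, "file"), (12, 70, "futex")],
}
overlaps = concurrent_futex_windows(demo)
```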
![image-20230628114340386](./figures/lockcompete5.png) - -- 线程 tcp-server:tcp 服务器,不断通过 read 系统调用读取客户端发送的请求;同样在 11:05:45 时间点附近开始,read 事件执行时间变长,说明此时正在等待接收网络请求。 - - ![image-20230628114659071](./figures/lockcompete6.png) - -基于上述分析,我们可以发现,每当线程 CompeteThd1 在执行耗时较长的 oncpu 操作时,线程 CompeteThd2 都会调用 futex 系统调用进入阻塞状态,一旦线程 CompeteThd1 完成 oncpu 操作时,线程 CompeteThd2 将获取 cpu 并执行网络 write 操作。因此,大概率是因为线程 CompeteThd1 和线程 CompeteThd2 之间存在锁竞争的问题。而线程 tcp-server 与线程 CompeteThd2 之间存在 tcp 网络通信,由于线程 CompeteThd2 等待锁资源无法发送网络请求,从而导致线程 tcp-server 大部分时间都在等待 read 网络请求。 - - -### topics - -#### 支持的系统调用事件 - -选择需要加入观测的系统调用事件的基本原则为: - -1. 选择可能会比较耗时、阻塞的事件(如文件操作、网络操作、锁操作等),这类事件通常涉及对系统资源的访问。 -2. 选择影响线程运行状态的事件。 - -| 事件名/系统调用名 | 描述 | 默认的事件类型 | 扩展的事件内容 | -| ----------------- | ----------------------------------------------------- | -------------- | -------------------------------- | -| read | 读写磁盘文件或网络,线程可能会耗时、阻塞 | file | file.path, sock.conn, func.stack | -| write | 读写磁盘文件或网络,线程可能会耗时、阻塞 | file | file.path, sock.conn, func.stack | -| readv | 读写磁盘文件或网络,线程可能会耗时、阻塞 | file | file.path, sock.conn, func.stack | -| writev | 读写磁盘文件或网络,线程可能会耗时、阻塞 | file | file.path, sock.conn, func.stack | -| preadv | 读写磁盘文件或网络,线程可能会耗时、阻塞 | file | file.path, sock.conn, func.stack | -| pwritev | 读写磁盘文件或网络,线程可能会耗时、阻塞 | file | file.path, sock.conn, func.stack | -| sync | 对文件进行同步刷盘操作,完成前线程会阻塞 | file | func.stack | -| fsync | 对文件进行同步刷盘操作,完成前线程会阻塞 | file | file.path, sock.conn, func.stack | -| fdatasync | 对文件进行同步刷盘操作,完成前线程会阻塞 | file | file.path, sock.conn, func.stack | -| sched_yield | 线程主动让出 CPU 重新进行调度 | sched | func.stack | -| nanosleep | 线程进入睡眠状态 | sched | func.stack | -| clock_nanosleep | 线程进入睡眠状态 | sched | func.stack | -| wait4 | 线程阻塞 | sched | func.stack | -| waitpid | 线程阻塞 | sched | func.stack | -| select | 无事件到达时线程会阻塞等待 | sched | func.stack | -| pselect6 | 无事件到达时线程会阻塞等待 | sched | func.stack | -| poll | 无事件到达时线程会阻塞等待 | sched | func.stack | -| ppoll | 无事件到达时线程会阻塞等待 | sched | func.stack | -| epoll_wait | 无事件到达时线程会阻塞等待 | sched | 
func.stack | -| sendto | 读写网络时,线程可能会耗时、阻塞 | net | sock.conn, func.stack | -| recvfrom | 读写网络时,线程可能会耗时、阻塞 | net | sock.conn, func.stack | -| sendmsg | 读写网络时,线程可能会耗时、阻塞 | net | sock.conn, func.stack | -| recvmsg | 读写网络时,线程可能会耗时、阻塞 | net | sock.conn, func.stack | -| sendmmsg | 读写网络时,线程可能会耗时、阻塞 | net | sock.conn, func.stack | -| recvmmsg | 读写网络时,线程可能会耗时、阻塞 | net | sock.conn, func.stack | -| futex | 触发 futex 往往意味着出现锁等待,线程可能进入阻塞状态 | lock | futex.op, func.stack | - -#### 聚合事件 - -tprofiling 当前支持的系统性能事件包括两大类:系统调用事件和 oncpu 事件。其中,oncpu 事件以及部分系统调用事件(比如read/write)在特定的应用场景下可能会频繁触发,从而产生大量的系统事件,这会对观测的应用程序性能以及 tprofiling 探针本身的性能造成较大的影响。 - -为了优化性能,tprofiling 将一段时间内(1s)属于同一个线程的具有相同事件名的多个系统事件聚合为一个事件进行上报。因此,一个 tprofiling 事件实际上指的是一个聚合事件,它包含一个或多个相同的系统事件。相比于一个真实的系统事件,一个聚合事件的部分属性的含义有如下变化, - -- `start_time`:事件开始时间,在聚合事件中是指第一个系统事件的开始时间。 -- `end_time`:事件结束时间,在聚合事件中是指(`start_time + duration`)。 -- `duration`:事件执行时间,在聚合事件中是指所有系统事件实际执行时间的累加值。 -- `count`:聚合事件中系统事件的数量,当值为 1 时,聚合事件就等价于一个系统事件。 -- 扩展的事件属性:在聚合事件中是指第一个系统事件的扩展属性。 - -## L7Probe 介绍 - -定位:L7流量观测,覆盖常见的HTTP1.X、PG、MySQL、Redis、Kafka、HTTP2.0、MongoDB、RocketMQ协议,支持加密流观测。 - -场景:覆盖Node、Container、Pod(K8S)三类场景。 - - - -### 代码框架设计 -``` -L7Probe - | --- included // 公共头文件 - -​ | --- connect.h // L7 connect对象定义 - -​ | --- pod.h // pod/container对象定义 - -​ | --- conn_tracker.h // L7协议跟踪对象定义 - - | --- protocol // L7协议解析 - -​ | --- http // HTTP1.X L7 message结构定义及解析 - -​ | --- mysql // mysql L7 message结构定义及解析 - -​ | --- pgsql // pgsql L7 message结构定义及解析 - - | --- bpf // 内核bpf代码 - -​ | --- L7.h // BPF程序解析L7层协议类型 - -​ | --- kern_sock.bpf.c // 内核socket层观测 - -​ | --- libssl.bpf.c // openSSL层观测 - -​ | --- gossl.bpf.c // GO SSL层观测 - -​ | --- cgroup.bpf.c // pod 生命周期观测 - - | --- pod_mng.c // pod/container实例管理(感知pod/container生命周期) - - | --- conn_mng.c // L7 Connect实例管理(处理BPF观测事件,比如Open/Close事件、Stats统计) - - | --- conn_tracker.c // L7 流量跟踪(跟踪BPF观测数据,比如send/write、read/recv等系统事件产生的数据) - - | --- bpf_mng.c // BPF程序生命周期管理(按需、实时open、load、attach、unload BPF程序,包括uprobe BPF程序) - - | 
--- session_conn.c // 管理jsse Session(记录jsse Session和sock连接的对应关系,上报jsse连接信息) - - | --- L7Probe.c // 探针主程序 - -``` - - -### 探针输出 - -| metrics_name | table_name | metrics_type | unit | metrics description | -| --------------- | ---------- | ------------ | ---- | ------------------------------------------------------------ | -| tgid | NA | key | NA | Process ID of l7 session. | -| client_ip | NA | key | NA | Client IP address of l7 session. | -| server_ip | NA | key | NA | Server IP address of l7 session.
备注:K8S场景支持Cluster IP转换成Backend IP | -| server_port | NA | key | NA | Server Port of l7 session.
备注:K8S场景支持Cluster Port转换成Backend Port | -| l4_role | NA | key | NA | Role of l4 protocol(TCP Client/Server or UDP) | -| l7_role | NA | key | NA | Role of l7 protocol(Client or Server) | -| protocol | NA | key | NA | Name of l7 protocol(http/http2/mysql...) | -| ssl | NA | label | NA | Indicates whether an SSL-encrypted l7 session is used. | -| bytes_sent | l7_link | gauge | NA | Number of bytes sent by a l7 session. | -| bytes_recv | l7_link | gauge | NA | Number of bytes recv by a l7 session. | -| segs_sent | l7_link | gauge | NA | Number of segs sent by a l7 session. | -| segs_recv | l7_link | gauge | NA | Number of segs recv by a l7 session. | -| throughput_req | l7_rpc | gauge | qps | Request throughput of l7 session. | -| throughput_resp | l7_rpc | gauge | qps | Response throughput of l7 session. | -| req_count | l7_rpc | gauge | NA | Request num of l7 session. | -| resp_count | l7_rpc | gauge | NA | Response num of l7 session. | -| latency_avg | l7_rpc | gauge | ns | L7 session averaged latency. | -| latency | l7_rpc | histogram | ns | L7 session histogram latency. | -| latency_sum | l7_rpc | gauge | ns | L7 session sum latency. | -| err_ratio | l7_rpc | gauge | % | L7 session error rate. | -| err_count | l7_rpc | gauge | NA | L7 session error count. | -### 动态控制 - -#### 控制观测Pod范围 - -1. REST->gala-gopher。 -1. gala-gopher->L7Probe。 -1. L7Probe 基于Pod获取相关Container。 -2. L7Probe 基于Container获取其 CGroup id(cpuacct_cgrp_id),并写入object模块(API: cgrp_add)。 -2. Socket系统事件上下文中,获取进程所属CGroup(cpuacct_cgrp_id),参考Linux代码(task_cgroup)。 -2. 观测过程中,通过object模块过滤(API: is_cgrp_exist)。 - -#### 控制观测能力 - -1. REST->gala-gopher。 -2. gala-gopher->L7Probe。 -3. 
L7Probe根据输入参数动态的开启、关闭BPF观测能力(包括吞吐量、时延、Trace、协议类型)。 - -### 观测点 - -#### 内核Socket系统调用 - -TCP相关系统调用 - -// int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen); - -// int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen); - -// int accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags); - -// ssize_t write(int fd, const void *buf, size_t count); - -// ssize_t send(int sockfd, const void *buf, size_t len, int flags); - -// ssize_t read(int fd, void *buf, size_t count); - -// ssize_t recv(int sockfd, void *buf, size_t len, int flags); - -// ssize_t writev(int fd, const struct iovec *iov, int iovcnt); - -// ssize_t readv(int fd, const struct iovec *iov, int iovcnt); - - - -TCP&UDP相关系统调用 - -// ssize_t sendto(int sockfd, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen); - -// ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen); - -// ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags); - -// ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags); - -// int close(int fd); - - - -注意点: - -1. read/write、readv/writev 与普通的文件I/O操作会混淆,通过观测内核security_socket_sendmsg函数区分FD是否属于socket操作。 -2. sendto/recvfrom、sendmsg/recvmsg TCP/UDP均会使用,参考下面手册的介绍。 -3. sendmmsg/recvmmsg、sendfile 暂不支持。 - -[sendto manual](https://man7.org/linux/man-pages/man2/send.2.html) :If sendto() is used on a connection-mode (SOCK_STREAM, SOCK_SEQPACKET) socket, the arguments dest_addr and addrlen are ignored (and the error EISCONN may be returned when they are not NULL and 0), and the error ENOTCONN is returned when the socket was not actually connected. otherwise, the address of the target is given by dest_addr with addrlen specifying its size. 
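结合上面手册的描述,探针可以依据地址参数(dest_addr/src_addr 或 msghdr->msg_name)是否为NULL,区分同一系统调用走的是TCP还是UDP。下面用一个Python片段示意这一判定逻辑(仅为演示,真实判定发生在内核态eBPF程序中):

```python
# 依据手册语义示意 sendto/recvfrom/sendmsg/recvmsg 的 TCP/UDP 判定(示意)
def classify_l4(syscall, addr_is_null):
    if syscall in ("sendto", "recvfrom"):
        # dest_addr/src_addr 为 NULL:连接态(connection-mode)socket,按 TCP 处理
        return "tcp" if addr_is_null else "udp"
    if syscall in ("sendmsg", "recvmsg"):
        # msghdr->msg_name 为 NULL:连接态 socket,按 TCP 处理
        return "tcp" if addr_is_null else "udp"
    raise ValueError("unsupported syscall: " + syscall)
```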
- -sendto 判断dest_addr参数为NULL则为TCP,否则为UDP。 - - - -[recvfrom manual](https://linux.die.net/man/2/recvfrom):The recvfrom() and recvmsg() calls are used to receive messages from a socket, and may be used to receive data on a socket whether or not it is connection-oriented. - -recvfrom判断src_addr参数为NULL则为TCP,否则为UDP。 - - - -[sendmsg manual](https://man7.org/linux/man-pages/man3/sendmsg.3p.html):The sendmsg() function shall send a message through a connection-mode or connectionless-mode socket. If the socket is a connectionless-mode socket, the message shall be sent to the address specified by msghdr if no pre-specified peer address has been set. If a peer address has been pre-specified, either themessage shall be sent to the address specified in msghdr (overriding the pre-specified peer address), or the function shall return -1 and set errno to [EISCONN]. If the socket is connection-mode, the destination address in msghdr shall be ignored. - -sendmsg判断msghdr->msg_name参数为NULL则为TCP,否则为UDP。 - - - -[recvmsg manual](https://man7.org/linux/man-pages/man3/recvmsg.3p.html): The recvmsg() function shall receive a message from a connection-mode or connectionless-mode socket. It is normally used with connectionless-mode sockets because it permits the application to retrieve the source address of received data. - -recvmsg判断msghdr->msg_name参数为NULL则为TCP,否则为UDP。 - -#### libSSL API - -SSL_write - -SSL_read - -#### Go SSL API - -#### JSSE API - -sun/security/ssl/SSLSocketImpl$AppInputStream - -sun/security/ssl/SSLSocketImpl$AppOutputStream - -### JSSE观测方案 - -#### 加载JSSEProbe探针 - -main函数中通过l7_load_jsse_agent加载JSSEProbe探针。 - -轮询观测白名单(g_proc_obj_map_fd)中的进程,若为java进程,则通过jvm_attach将JSSEProbeAgent.jar加载到此观测进程上。加载成功后,该java进程会在指定观测点(参见[JSSE API](#jsse-api))将观测信息输出到jsse-metrics输出文件(/tmp/java-data-/jsse-metrics.txt)中。 - -#### 处理JSSEProbe消息 - -l7_jsse_msg_handler线程中处理JSSEProbe消息。 - -轮询观测白名单(g_proc_obj_map_fd)中的进程,若该进程有对应的jsse-metrics输出文件,则按行读取此文件并解析、转换、上报jsse读写信息。 - -##### 1. 
解析jsse读写信息 - -jsse-metrics.txt的输出格式如下,从中解析出一次jsse请求的pid, sessionId, time, read/write操作, IP, port, payload信息: -```|jsse_msg|662220|Session(1688648699909|TLS_AES_256_GCM_SHA384)|1688648699989|Write|127.0.0.1|58302|This is test message|``` - -解析出的原始信息存储于session_data_args_s中。 - -##### 2. 转换jsse读写信息 - -将session_data_args_s中的信息转换为sock_conn和conn_data。 - -转化时需要查询如下两个hash map: - -session_head:记录jsse连接的session Id和sock connection Id的对应关系。若进程id和四元组信息一致,则认为session和sock connection对应。 - -file_conn_head:记录java进程的最后一个sessionId,以备L7probe读jsseProbe输出时,没有从请求开头开始读取,找不到sessionId信息。 - -##### 3. 上报jsse读写信息 - -将sock_conn和conn_data上报到map中。 - - -## 使用方法 - -### 外部依赖软件部署 - -![gopher软件架构图](./figures/gopher软件架构图.png) - -如上图所示,绿色部分为gala-gopher的外部依赖组件。gala-gopher会将指标数据metrics输出到promethous,将元数据metadata、异常事件event输出到kafka,灰色部分的gala-anteater和gala-spider会从promethous和kafka获取数据。 - -> 说明:安装kafka、promethous软件包时,需要从官网获取安装包进行部署。 - -### 输出数据 - -- **指标数据metrics** - - Promethous Server内置了Express Browser UI,用户可以通过PromQL查询语句查询指标数据内容。详细教程参见官方文档:[Using the expression browser](https://prometheus.io/docs/prometheus/latest/getting_started/#using-the-expression-browser)。示例如下: - - 指定指标名称为`gala_gopher_tcp_link_rcv_rtt`,UI显示的指标数据为: - - ```basic - gala_gopher_tcp_link_rcv_rtt{client_ip="x.x.x.165",client_port="1234",hostname="openEuler",instance="x.x.x.172:8888",job="prometheus",machine_id="1fd3774xx",protocol="2",role="0",server_ip="x.x.x.172",server_port="3742",tgid="1516"} 1 - ``` - -- **元数据metadata** - - 可以直接从kafka消费topic为`gala_gopher_metadata`的数据来看。示例如下: - - ```bash - # 输入请求 - ./bin/kafka-console-consumer.sh --bootstrap-server x.x.x.165:9092 --topic gala_gopher_metadata - # 输出数据 - {"timestamp": 1655888408000, "meta_name": "thread", "entity_name": "thread", "version": "1.0.0", "keys": ["machine_id", "pid"], "labels": ["hostname", "tgid", "comm", "major", "minor"], "metrics": ["fork_count", "task_io_wait_time_us", "task_io_count", "task_io_time_us", "task_hang_count"]} - ``` - -- **异常事件event** - - 
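前文给出的jsse-metrics.txt行格式可以用如下Python示意片段解析。注意Session(...)字段内部也含有“|”,需按位置取字段;若payload本身包含“|”,该简化解析会出错,真实实现需结合协议内容处理:

```python
# 按前文 jsse-metrics.txt 的行格式解析一条 jsse 读写记录(示意)
def parse_jsse_line(line):
    fields = line.strip().strip("|").split("|")
    # Session(sessionId|cipher) 内部含 "|",切分后固定为 9 个字段
    if len(fields) != 9 or fields[0] != "jsse_msg":
        return None
    return {
        "pid": int(fields[1]),
        "session_id": fields[2][len("Session("):],
        "cipher": fields[3].rstrip(")"),
        "time": int(fields[4]),
        "op": fields[5],          # Read / Write
        "ip": fields[6],
        "port": int(fields[7]),
        "payload": fields[8],
    }

# 沿用前文示例行
sample = "|jsse_msg|662220|Session(1688648699909|TLS_AES_256_GCM_SHA384)|1688648699989|Write|127.0.0.1|58302|This is test message|"
rec = parse_jsse_line(sample)
```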
可以直接从kafka消费topic为`gala_gopher_event`的数据来看。示例如下: - - ```bash - # 输入请求 - ./bin/kafka-console-consumer.sh --bootstrap-server x.x.x.165:9092 --topic gala_gopher_event - # 输出数据 - {"timestamp": 1655888408000, "meta_name": "thread", "entity_name": "thread", "version": "1.0.0", "keys": ["machine_id", "pid"], "labels": ["hostname", "tgid", "comm", "major", "minor"], "metrics": ["fork_count", "task_io_wait_time_us", "task_io_count", "task_io_time_us", "task_hang_count"]} - ``` +# **gala-gopher使用手册** + +gala-gopher作为数据采集模块提供OS级的监控能力,支持动态加载 /卸载探针,可无侵入式地集成第三方探针,快速扩展监控范围。 + +本章介绍如何部署和使用gala-gopher服务。 + +#### 安装 + +挂载repo源: + +```basic +[oe-22.03-lts-sp3-everything] # openEuler 22.03-LTS-SP3 官方发布源 +name=oe-2203-lts-sp3-everything +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/everything/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 + +[oe-22.03-lts-sp3-epol-update] # openEuler 22.03-LTS-SP3 Update 官方发布源 +name=oe-22.03-lts-sp3-epol-update +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/update/main/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 + +[oe-22.03-lts-sp3-epol-main] # openEuler 22.03-LTS-SP3 EPOL 官方发布源 +name=oe-22.03-lts-sp3-epol-main +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/main/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 +``` + +安装gala-gopher: + +```bash +# yum install gala-gopher +``` + + + +#### 配置 + +##### 配置介绍 + +gala-gopher配置文件为`/opt/gala-gopher/gala-gopher.conf`,该文件配置项说明如下(省略无需用户配置的部分)。 + +如下配置可以根据需要进行修改: + +- global:gala-gopher全局配置信息。 + - log_file_name:gala-gopher日志文件名。 + - log_level: gala-gopher日志级别(暂未开放此功能)。 + - pin_path:ebpf探针共享map存放路径(建议维持默认配置)。 +- metric:指标数据metrics输出方式配置。 + - out_channel:metrics输出通道,支持配置web_server|kafka,配置为空则输出通道关闭。 + - kafka_topic:若输出通道为kafka,此为topic配置信息。 +- event:异常事件event输出方式配置。 + - out_channel:event输出通道,支持配置logs|kafka,配置为空则输出通道关闭。 + - kafka_topic:若输出通道为kafka,此为topic配置信息。 +- meta:元数据metadata输出方式配置。 + - out_channel:metadata输出通道,支持logs|kafka,配置为空则输出通道关闭。 + - kafka_topic:若输出通道为kafka,此为topic配置信息。 
+- imdb:cache缓存规格配置。 + - max_tables_num:最大的cache表个数,/opt/gala-gopher/meta目录下每个meta对应一个表。 + - max_records_num:每张cache表最大记录数,通常每个探针在一个观测周期内产生至少1条观测记录。 + - max_metrics_num:每条观测记录包含的最大的metric指标个数。 + - record_timeout:cache表老化时间,若cache表中某条记录超过该时间未刷新则删除记录,单位为秒。 +- web_server:输出通道web_server配置。 + - port:监听端口。 +- kafka:输出通道kafka配置。 + - kafka_broker:kafka服务器的IP和port。 +- logs:输出通道logs配置。 + - metric_dir:metrics指标数据日志路径。 + - event_dir:异常事件数据日志路径。 + - meta_dir:metadata元数据日志路径。 + - debug_dir:gala-gopher运行日志路径。 +- probes:native探针配置。 + - name:探针名称,要求与native探针名一致,如example.probe 探针名为example。 + - param :探针启动参数,支持的参数详见[启动参数介绍表](#启动参数介绍)。 + - switch:探针是否启动,支持配置 on | off。 +- extend_probes :第三方探针配置。 + - name:探针名称。 + - command:探针启动命令。 + - param:探针启动参数,支持的参数详见[启动参数介绍表](#启动参数介绍)。 + - start_check:switch为auto时,需要根据start_check执行结果判定探针是否需要启动。 + - switch:探针是否启动,支持配置on | off | auto,auto会根据start_check判定结果决定是否启动探针。 + +##### 启动参数介绍 + +| 参数项 | 含义 | +| ------ | ------------------------------------------------------------ | +| -l | 是否开启异常事件上报 | +| -t | 采样周期,单位为秒,默认配置为探针5s上报一次数据 | +| -T | 延迟时间阈值,单位为ms,默认配置为0ms | +| -J | 抖动时间阈值,单位为ms,默认配置为0ms | +| -O | 离线时间阈值,单位为ms,默认配置为0ms | +| -D | 丢包阈值,默认配置为0(个) | +| -F | 配置为`task`表示按照`task_whitelist.conf`过滤,配置为具体进程的pid表示仅监控此进程 | +| -P | 指定每个探针加载的探测程序范围,目前tcpprobe、taskprobe探针涉及 | +| -U | 资源利用率阈值(上限),默认为0% | +| -L | 资源利用率阈值(下限),默认为0% | +| -c | 指示探针(tcp)是否标识client_port,默认配置为0(否) | +| -N | 指定探针(ksliprobe)的观测进程名,默认配置为NULL | +| -p | 指定待观测进程的二进制文件路径,比如nginx_probe,通过 -p /user/local/sbin/nginx指定nginx文件路径,默认配置为NULL | +| -w | 筛选应用程序监控范围,如-w /opt/gala-gopher/task_whitelist.conf,用户可将需要监控的程序名写入task_whitelist.conf中,默认配置为NULL表示不筛选 | +| -n | 指定某个网卡挂载tc ebpf,默认配置为NULL表示所有网卡均挂载,示例: -n eth0 | + +##### 配置文件示例 + +- 配置选择数据输出通道: + + ```yaml + metric = + { + out_channel = "web_server"; + kafka_topic = "gala_gopher"; + }; + + event = + { + out_channel = "kafka"; + kafka_topic = "gala_gopher_event"; + }; + + meta = + { + out_channel = "kafka"; + kafka_topic = "gala_gopher_metadata"; + }; + ``` + 
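上面的示例中,out_channel 决定各类数据的输出通道。下面的Python示意片段按前文“配置介绍”中列出的合法取值做一个简单校验(仅为演示;配置为空字符串表示关闭该输出通道,视为合法):

```python
# 依据前文“配置介绍”校验各数据类型 out_channel 的取值(示意)
ALLOWED_CHANNELS = {
    "metric": {"web_server", "kafka"},
    "event": {"logs", "kafka"},
    "meta": {"logs", "kafka"},
}

def check_out_channel(section, value):
    # 空字符串表示关闭该输出通道
    if value == "":
        return True
    return value in ALLOWED_CHANNELS[section]
```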
+- 配置kafka和webServer: + + ```yaml + web_server = + { + port = 8888; + }; + + kafka = + { + kafka_broker = ":9092"; + }; + ``` + +- 选择开启的探针,示例如下: + + ```yaml + probes = + ( + { + name = "system_infos"; + param = "-t 5 -w /opt/gala-gopher/task_whitelist.conf -l warn -U 80"; + switch = "on"; + }, + ); + extend_probes = + ( + { + name = "tcp"; + command = "/opt/gala-gopher/extend_probes/tcpprobe"; + param = "-l warn -c 1 -P 7"; + switch = "on"; + } + ); + ``` + + + +#### 启动 + +配置完成后,执行如下命令启动gala-gopher。 + +```bash +# systemctl start gala-gopher.service +``` + +查询gala-gopher服务状态。 + +```bash +# systemctl status gala-gopher.service +``` + +若显示结果如下,说明服务启动成功。需要关注开启的探针是否已启动,如果探针线程不存在,请检查配置文件及gala-gopher运行日志文件。 + +![gala-gopher成功启动状态](./figures/gala-gopher成功启动状态.png) + +> 说明:gala-gopher部署和运行均需要root权限。 + + + +#### 使用方法 + +##### 外部依赖软件部署 + +![gopher软件架构图](./figures/gopher软件架构图.png) + +如上图所示,绿色部分为gala-gopher的外部依赖组件。gala-gopher会将指标数据metrics输出到prometheus,将元数据metadata、异常事件event输出到kafka,灰色部分的gala-anteater和gala-spider会从prometheus和kafka获取数据。 + +> 说明:安装kafka、prometheus软件包时,需要从官网获取安装包进行部署。 + + + +##### 输出数据 + +- **指标数据metrics** + + Prometheus Server内置了Express Browser UI,用户可以通过PromQL查询语句查询指标数据内容。详细教程参见官方文档:[Using the expression browser](https://prometheus.io/docs/prometheus/latest/getting_started/#using-the-expression-browser)。示例如下: + + 指定指标名称为`gala_gopher_tcp_link_rcv_rtt`,UI显示的指标数据为: + + ```basic + gala_gopher_tcp_link_rcv_rtt{client_ip="x.x.x.165",client_port="1234",hostname="openEuler",instance="x.x.x.172:8888",job="prometheus",machine_id="1fd3774xx",protocol="2",role="0",server_ip="x.x.x.172",server_port="3742",tgid="1516"} 1 + ``` + +- **元数据metadata** + + 可以直接从kafka消费topic为`gala_gopher_metadata`的数据来看。示例如下: + + ```bash + # 输入请求 + ./bin/kafka-console-consumer.sh --bootstrap-server x.x.x.165:9092 --topic gala_gopher_metadata + # 输出数据 + {"timestamp": 1655888408000, "meta_name": "thread", "entity_name": "thread", "version": "1.0.0", "keys": ["machine_id", "pid"], "labels": ["hostname", 
"tgid", "comm", "major", "minor"], "metrics": ["fork_count", "task_io_wait_time_us", "task_io_count", "task_io_time_us", "task_hang_count"]} + ``` + +- **异常事件event** + + 可以直接从kafka消费topic为`gala_gopher_event`的数据来看。示例如下: + + ```bash + # 输入请求 + ./bin/kafka-console-consumer.sh --bootstrap-server x.x.x.165:9092 --topic gala_gopher_event + # 输出数据 + {"timestamp": 1655888408000, "meta_name": "thread", "entity_name": "thread", "version": "1.0.0", "keys": ["machine_id", "pid"], "labels": ["hostname", "tgid", "comm", "major", "minor"], "metrics": ["fork_count", "task_io_wait_time_us", "task_io_count", "task_io_time_us", "task_hang_count"]} + ``` \ No newline at end of file diff --git "a/docs/zh/docs/A-Ops/gala-spider\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/gala-spider\344\275\277\347\224\250\346\211\213\345\206\214.md" index 656ab5dfbccb95b5505b8bf76b16affb402d96d3..51c674f2c30d368b8ee6ddc717cae226215f3617 100644 --- "a/docs/zh/docs/A-Ops/gala-spider\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/A-Ops/gala-spider\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -1,527 +1,555 @@ -# gala-spider使用手册 - -本文档主要介绍如何部署和使用gala-spider和gala-inference。 - -## gala-spider - -gala-spider 提供 OS 级别的拓扑图绘制功能,它将定期获取 gala-gopher (一个 OS 层面的数据采集软件)在某个时间点采集的所有观测对象的数据,并计算它们之间的拓扑关系,最终将生成的拓扑图保存到图数据库 arangodb 中。 - -### 安装 - -挂载 yum 源: - -```basic -[oe-2309] # openEuler 2309 官方发布源 -name=oe2309 -baseurl=http://119.3.219.20:82/openEuler:/23.09/standard_x86_64 -enabled=1 -gpgcheck=0 -priority=1 - -[oe-2309:Epol] # openEuler 2309:Epol 官方发布源 -name=oe2309_epol -baseurl=http://119.3.219.20:82/openEuler:/23.09:/Epol/standard_x86_64/ -enabled=1 -gpgcheck=0 -priority=1 -``` - -安装 gala-spider: - -```sh -# yum install gala-spider -``` - -### 配置 - -#### 配置文件说明 - -gala-spider 配置文件为 `/etc/gala-spider/gala-spider.yaml` ,该文件配置项说明如下。 - -- global:全局配置信息。 - - data_source:指定观测指标采集的数据库,当前只支持 prometheus。 - - data_agent:指定观测指标采集代理,当前只支持 gala_gopher。 -- 
spider:spider配置信息。 - - log_conf:日志配置信息。 - - log_path:日志文件路径。 - - log_level:日志打印级别,值包括 DEBUG/INFO/WARNING/ERROR/CRITICAL 。 - - max_size:日志文件大小,单位为兆字节(MB)。 - - backup_count:日志备份文件数量。 -- storage:拓扑图存储服务的配置信息。 - - period:存储周期,单位为秒,表示每隔多少秒存储一次拓扑图。 - - database:存储的图数据库,当前只支持 arangodb。 - - db_conf:图数据库的配置信息。 - - url:图数据库的服务器地址。 - - db_name:拓扑图存储的数据库名称。 -- kafka:kafka配置信息。 - - server:kafka服务器地址。 - - metadata_topic:观测对象元数据消息的topic名称。 - - metadata_group_id:观测对象元数据消息的消费者组ID。 -- prometheus:prometheus数据库配置信息。 - - base_url:prometheus服务器地址。 - - instant_api:单个时间点采集API。 - - range_api:区间采集API。 - - step:采集时间步长,用于区间采集API。 - -#### 配置文件示例 - -```yaml -global: - data_source: "prometheus" - data_agent: "gala_gopher" - -prometheus: - base_url: "http://localhost:9090/" - instant_api: "/api/v1/query" - range_api: "/api/v1/query_range" - step: 1 - -spider: - log_conf: - log_path: "/var/log/gala-spider/spider.log" - # log level: DEBUG/INFO/WARNING/ERROR/CRITICAL - log_level: INFO - # unit: MB - max_size: 10 - backup_count: 10 - -storage: - # unit: second - period: 60 - database: arangodb - db_conf: - url: "http://localhost:8529" - db_name: "spider" - -kafka: - server: "localhost:9092" - metadata_topic: "gala_gopher_metadata" - metadata_group_id: "metadata-spider" -``` - -### 启动 - -1. 通过命令启动。 - - ```sh - # spider-storage - ``` - -2. 通过 systemd 服务启动。 - - ```sh - # systemctl start gala-spider - ``` - -### 使用方法 - -#### 外部依赖软件部署 - -gala-spider 运行时需要依赖多个外部软件进行交互。因此,在启动 gala-spider 之前,用户需要将gala-spider依赖的软件部署完成。下图为 gala-spider 项目的软件依赖图。 - -![gala-spider软件架构图](./figures/gala-spider软件架构图.png) - -其中,右侧虚线框内为 gala-spider 项目的 2 个功能组件,绿色部分为 gala-spider 项目直接依赖的外部组件,灰色部分为 gala-spider 项目间接依赖的外部组件。 - -- **spider-storage**:gala-spider 核心组件,提供拓扑图存储功能。 - 1. 从 kafka 中获取观测对象的元数据信息。 - 2. 从 Prometheus 中获取所有的观测实例信息。 - 3. 
将生成的拓扑图存储到图数据库 arangodb 中。 -- **gala-inference**:gala-spider 核心组件,提供根因定位功能。它通过订阅 kafka 的异常 KPI 事件触发异常 KPI 的根因定位流程,并基于 arangodb 获取的拓扑图来构建故障传播图,最终将根因定位的结果输出到 kafka 中。 -- **prometheus**:时序数据库,gala-gopher 组件采集的观测指标数据会上报到 prometheus,再由 gala-spider 做进一步处理。 -- **kafka**:消息中间件,用于存储 gala-gopher 上报的观测对象元数据信息,异常检测组件上报的异常事件,以及 cause-inference 组件上报的根因定位结果。 -- **arangodb**:图数据库,用于存储 spider-storage 生成的拓扑图。 -- **gala-gopher**:数据采集组件,请提前部署gala-gopher。 -- **arangodb-ui**:arangodb 提供的 UI 界面,可用于查询拓扑图。 - -gala-spider 项目中的 2 个功能组件会作为独立的软件包分别发布。 - -​**spider-storage** 组件对应本节中的 gala-spider 软件包。 - -​**gala-inference** 组件对应 gala-inference 软件包。 - -gala-gopher软件的部署参见[gala-gopher使用手册](gala-gopher使用手册.md),此处只介绍 arangodb 的部署。 - -当前使用的 arangodb 版本是 3.8.7 ,该版本对运行环境有如下要求: - -- 只支持 x86 系统 -- gcc10 以上 - -arangodb 官方部署文档参见:[arangodb部署](https://www.arangodb.com/docs/3.9/deployment.html) 。 - -arangodb 基于 rpm 的部署流程如下: - -1. 配置 yum 源。 - - ```basic - [oe-2309] # openEuler 2309 官方发布源 - name=oe2309 - baseurl=http://119.3.219.20:82/openEuler:/23.09/standard_x86_64 - enabled=1 - gpgcheck=0 - priority=1 - - [oe-2309:Epol] # openEuler 2309:Epol 官方发布源 - name=oe2309_epol - baseurl=http://119.3.219.20:82/openEuler:/23.09:/Epol/standard_x86_64/ - enabled=1 - gpgcheck=0 - priority=1 - ``` - -2. 安装 arangodb3。 - - ```sh - # yum install arangodb3 - ``` - -3. 配置修改。 - - arangodb3 服务器的配置文件路径为 `/etc/arangodb3/arangod.conf` ,需要修改如下的配置信息: - - - endpoint:配置 arangodb3 的服务器地址 - - authentication:访问 arangodb3 服务器是否需要进行身份认证,当前 gala-spider 还不支持身份认证,故此处将authentication设置为 false。 - - 示例配置如下: - - ```yaml - [server] - endpoint = tcp://0.0.0.0:8529 - authentication = false - ``` - -4. 
启动 arangodb3。 - - ```sh - # systemctl start arangodb3 - ``` - -#### gala-spider配置项修改 - -依赖软件启动后,需要修改 gala-spider 配置文件的部分配置项内容。示例如下: - -配置 kafka 服务器地址: - -```yaml -kafka: - server: "localhost:9092" -``` - -配置 prometheus 服务器地址: - -```yaml -prometheus: - base_url: "http://localhost:9090/" -``` - -配置 arangodb 服务器地址: - -```yaml -storage: - db_conf: - url: "http://localhost:8529" -``` - -#### 启动服务 - -运行 `systemctl start gala-spider` 。查看启动状态可执行 `systemctl status gala-spider` ,输出如下信息说明启动成功。 - -```sh -[root@openEuler ~]# systemctl status gala-spider -● gala-spider.service - a-ops gala spider service - Loaded: loaded (/usr/lib/systemd/system/gala-spider.service; enabled; vendor preset: disabled) - Active: active (running) since Tue 2022-08-30 17:28:38 CST; 1 day 22h ago - Main PID: 2263793 (spider-storage) - Tasks: 3 (limit: 98900) - Memory: 44.2M - CGroup: /system.slice/gala-spider.service - └─2263793 /usr/bin/python3 /usr/bin/spider-storage -``` - -#### 输出示例 - -用户可以通过 arangodb 提供的 UI 界面来查询 gala-spider 输出的拓扑图。使用流程如下: - -1. 在浏览器输入 arangodb 服务器地址,如:http://localhost:8529 ,进入 arangodb 的 UI 界面。 - -2. 界面右上角切换至 `spider` 数据库。 - -3. 在 `Collections` 面板可以看到在不同时间段存储的观测对象实例的集合、拓扑关系的集合,如下图所示: - - ![spider拓扑关系图](./figures/spider拓扑关系图.png) - -4. 
可进一步根据 arangodb 提供的 AQL 查询语句查询存储的拓扑关系图,详细教程参见官方文档: [aql文档](https://www.arangodb.com/docs/3.8/aql/)。 - -## gala-inference - -gala-inference 提供异常 KPI 根因定位能力,它将基于异常检测的结果和拓扑图作为输入,根因定位的结果作为输出,输出到 kafka 中。gala-inference 组件在 gala-spider 项目下进行归档。 - -### 安装 - -挂载 yum 源: - -```basic -[oe-2309] # openEuler 2309 官方发布源 -name=oe2309 -baseurl=http://119.3.219.20:82/openEuler:/23.09/standard_x86_64 -enabled=1 -gpgcheck=0 -priority=1 - -[oe-2309:Epol] # openEuler 2309:Epol 官方发布源 -name=oe2309_epol -baseurl=http://119.3.219.20:82/openEuler:/23.09:/Epol/standard_x86_64/ -enabled=1 -gpgcheck=0 -priority=1 -``` - -安装 gala-inference: - -```sh -# yum install gala-inference -``` - -### 配置 - -#### 配置文件说明 - -gala-inference 配置文件 `/etc/gala-inference/gala-inference.yaml` 配置项说明如下。 - -- inference:根因定位算法的配置信息。 - - tolerated_bias:异常时间点的拓扑图查询所容忍的时间偏移,单位为秒。 - - topo_depth:拓扑图查询的最大深度。 - - root_topk:根因定位结果输出前 K 个根因指标。 - - infer_policy:根因推导策略,包括 dfs 和 rw 。 - - sample_duration:指标的历史数据的采样周期,单位为秒。 - - evt_valid_duration:根因定位时,有效的系统异常指标事件周期,单位为秒。 - - evt_aging_duration:根因定位时,系统异常指标事件的老化周期,单位为秒。 -- kafka:kafka配置信息。 - - server:kafka服务器地址。 - - metadata_topic:观测对象元数据消息的配置信息。 - - topic_id:观测对象元数据消息的topic名称。 - - group_id:观测对象元数据消息的消费者组ID。 - - abnormal_kpi_topic:异常 KPI 事件消息的配置信息。 - - topic_id:异常 KPI 事件消息的topic名称。 - - group_id:异常 KPI 事件消息的消费者组ID。 - - abnormal_metric_topic:系统异常指标事件消息的配置信息。 - - topic_id:系统异常指标事件消息的topic名称。 - - group_id:系统异常指标事件消息的消费者组ID。 - - consumer_to:消费系统异常指标事件消息的超时时间,单位为秒。 - - inference_topic:根因定位结果输出事件消息的配置信息。 - - topic_id:根因定位结果输出事件消息的topic名称。 -- arangodb:arangodb图数据库的配置信息,用于查询根因定位所需要的拓扑子图。 - - url:图数据库的服务器地址。 - - db_name:拓扑图存储的数据库名称。 -- log_conf:日志配置信息。 - - log_path:日志文件路径。 - - log_level:日志打印级别,值包括 DEBUG/INFO/WARNING/ERROR/CRITICAL。 - - max_size:日志文件大小,单位为兆字节(MB)。 - - backup_count:日志备份文件数量。 -- prometheus:prometheus数据库配置信息,用于获取指标的历史时序数据。 - - base_url:prometheus服务器地址。 - - range_api:区间采集API。 - - step:采集时间步长,用于区间采集API。 - -#### 配置文件示例 - -```yaml -inference: - # 异常时间点的拓扑图查询所容忍的时间偏移,单位:秒 - 
tolerated_bias: 120 - topo_depth: 10 - root_topk: 3 - infer_policy: "dfs" - # 单位: 秒 - sample_duration: 600 - # 根因定位时,有效的异常指标事件周期,单位:秒 - evt_valid_duration: 120 - # 异常指标事件的老化周期,单位:秒 - evt_aging_duration: 600 - -kafka: - server: "localhost:9092" - metadata_topic: - topic_id: "gala_gopher_metadata" - group_id: "metadata-inference" - abnormal_kpi_topic: - topic_id: "gala_anteater_hybrid_model" - group_id: "abn-kpi-inference" - abnormal_metric_topic: - topic_id: "gala_anteater_metric" - group_id: "abn-metric-inference" - consumer_to: 1 - inference_topic: - topic_id: "gala_cause_inference" - -arangodb: - url: "http://localhost:8529" - db_name: "spider" - -log: - log_path: "/var/log/gala-inference/inference.log" - # log level: DEBUG/INFO/WARNING/ERROR/CRITICAL - log_level: INFO - # unit: MB - max_size: 10 - backup_count: 10 - -prometheus: - base_url: "http://localhost:9090/" - range_api: "/api/v1/query_range" - step: 5 -``` - -### 启动 - -1. 通过命令启动。 - - ```sh - # gala-inference - ``` - -2. 通过 systemd 服务启动。 - - ```sh - # systemctl start gala-inference - ``` - -### 使用方法 - -#### 依赖软件部署 - -gala-inference 的运行依赖和 gala-spider一样,请参见[外部依赖软件部署](#外部依赖软件部署)。此外,gala-inference 还间接依赖 [gala-spider](#gala-spider) 和 [gala-anteater](gala-anteater使用手册.md) 软件的运行,请提前部署gala-spider和gala-anteater软件。 - -#### 配置项修改 - -修改 gala-inference 的配置文件中部分配置项。示例如下: - -配置 kafka 服务器地址: - -```yaml -kafka: - server: "localhost:9092" -``` - -配置 prometheus 服务器地址: - -```yaml -prometheus: - base_url: "http://localhost:9090/" -``` - -配置 arangodb 服务器地址: - -```yaml -arangodb: - url: "http://localhost:8529" -``` - -#### 启动服务 - -直接运行 `systemctl start gala-inference` 即可。可通过执行 `systemctl status gala-inference` 查看启动状态,如下打印表示启动成功。 - -```sh -[root@openEuler ~]# systemctl status gala-inference -● gala-inference.service - a-ops gala inference service - Loaded: loaded (/usr/lib/systemd/system/gala-inference.service; enabled; vendor preset: disabled) - Active: active (running) since Tue 2022-08-30 17:55:33 CST; 1 day 22h ago - Main 
PID: 2445875 (gala-inference) - Tasks: 10 (limit: 98900) - Memory: 48.7M - CGroup: /system.slice/gala-inference.service - └─2445875 /usr/bin/python3 /usr/bin/gala-inference -``` - -#### 输出示例 - -当异常检测模块 gala-anteater 检测到 KPI 异常后,会将对应的异常 KPI 事件输出到 kafka 中,gala-inference 会一直监测该异常 KPI 事件的消息,如果收到异常 KPI 事件的消息,就会触发根因定位。根因定位会将定位结果输出到 kafka 中,用户可以在 kafka 服务器中查看根因定位的输出结果,基本步骤如下: - -1. 若通过源码安装 kafka ,需要进入 kafka 的安装目录下。 - - ```sh - cd /root/kafka_2.13-2.8.0 - ``` - -2. 执行消费 topic 的命令获取根因定位的输出结果。 - - ```sh - ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic gala_cause_inference - ``` - - 输出示例如下: - - ```json - { - "Timestamp": 1661853360000, - "event_id": "1661853360000_1fd37742xxxx_sli_12154_19", - "Atrributes": { - "event_id": "1661853360000_1fd37742xxxx_sli_12154_19" - }, - "Resource": { - "abnormal_kpi": { - "metric_id": "gala_gopher_sli_rtt_nsec", - "entity_id": "1fd37742xxxx_sli_12154_19", - "timestamp": 1661853360000, - "metric_labels": { - "machine_id": "1fd37742xxxx", - "tgid": "12154", - "conn_fd": "19" - } - }, - "cause_metrics": [ - { - "metric_id": "gala_gopher_proc_write_bytes", - "entity_id": "1fd37742xxxx_proc_12154", - "metric_labels": { - "__name__": "gala_gopher_proc_write_bytes", - "cmdline": "/opt/redis/redis-server x.x.x.172:3742", - "comm": "redis-server", - "container_id": "5a10635e2c43", - "hostname": "openEuler", - "instance": "x.x.x.172:8888", - "job": "prometheus", - "machine_id": "1fd37742xxxx", - "pgid": "12154", - "ppid": "12126", - "tgid": "12154" - }, - "timestamp": 1661853360000, - "path": [ - { - "metric_id": "gala_gopher_proc_write_bytes", - "entity_id": "1fd37742xxxx_proc_12154", - "metric_labels": { - "__name__": "gala_gopher_proc_write_bytes", - "cmdline": "/opt/redis/redis-server x.x.x.172:3742", - "comm": "redis-server", - "container_id": "5a10635e2c43", - "hostname": "openEuler", - "instance": "x.x.x.172:8888", - "job": "prometheus", - "machine_id": "1fd37742xxxx", - "pgid": "12154", - "ppid": "12126", - "tgid": 
"12154" - }, - "timestamp": 1661853360000 - }, - { - "metric_id": "gala_gopher_sli_rtt_nsec", - "entity_id": "1fd37742xxxx_sli_12154_19", - "metric_labels": { - "machine_id": "1fd37742xxxx", - "tgid": "12154", - "conn_fd": "19" - }, - "timestamp": 1661853360000 - } - ] - } - ] - }, - "SeverityText": "WARN", - "SeverityNumber": 13, - "Body": "A cause inferring event for an abnormal event" - } - ``` +# gala-spider使用手册 + +本章主要介绍如何部署和使用gala-spider和gala-inference。 + +## gala-spider + +gala-spider 提供 OS 级别的拓扑图绘制功能,它将定期获取 gala-gopher (一个 OS 层面的数据采集软件)在某个时间点采集的所有观测对象的数据,并计算它们之间的拓扑关系,最终将生成的拓扑图保存到图数据库 arangodb 中。 + +### 安装 + +挂载 yum 源: + +```basic +[oe-22.03-lts-sp3-everything] # openEuler 22.03-LTS-SP3 官方发布源 +name=oe-2203-lts-sp3-everything +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/everything/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 + +[oe-22.03-lts-sp3-epol-update] # openEuler 22.03-LTS-SP3 Update 官方发布源 +name=oe-22.03-lts-sp3-epol-update +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/update/main/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 + +[oe-22.03-lts-sp3-epol-main] # openEuler 22.03-LTS-SP3 EPOL 官方发布源 +name=oe-22.03-lts-sp3-epol-main +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/main/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 +``` + +安装 gala-spider: + +```sh +# yum install gala-spider +``` + + + +### 配置 + +#### 配置文件说明 + +gala-spider 配置文件为 `/etc/gala-spider/gala-spider.yaml` ,该文件配置项说明如下。 + +- global:全局配置信息。 + - data_source:指定观测指标采集的数据库,当前只支持 prometheus。 + - data_agent:指定观测指标采集代理,当前只支持 gala_gopher。 +- spider:spider配置信息。 + - log_conf:日志配置信息。 + - log_path:日志文件路径。 + - log_level:日志打印级别,值包括 DEBUG/INFO/WARNING/ERROR/CRITICAL 。 + - max_size:日志文件大小,单位为兆字节(MB)。 + - backup_count:日志备份文件数量。 +- storage:拓扑图存储服务的配置信息。 + - period:存储周期,单位为秒,表示每隔多少秒存储一次拓扑图。 + - database:存储的图数据库,当前只支持 arangodb。 + - db_conf:图数据库的配置信息。 + - url:图数据库的服务器地址。 + - db_name:拓扑图存储的数据库名称。 +- kafka:kafka配置信息。 + - server:kafka服务器地址。 + - 
metadata_topic:观测对象元数据消息的topic名称。 + - metadata_group_id:观测对象元数据消息的消费者组ID。 +- prometheus:prometheus数据库配置信息。 + - base_url:prometheus服务器地址。 + - instant_api:单个时间点采集API。 + - range_api:区间采集API。 + - step:采集时间步长,用于区间采集API。 + +#### 配置文件示例 + +```yaml +global: + data_source: "prometheus" + data_agent: "gala_gopher" + +prometheus: + base_url: "http://localhost:9090/" + instant_api: "/api/v1/query" + range_api: "/api/v1/query_range" + step: 1 + +spider: + log_conf: + log_path: "/var/log/gala-spider/spider.log" + # log level: DEBUG/INFO/WARNING/ERROR/CRITICAL + log_level: INFO + # unit: MB + max_size: 10 + backup_count: 10 + +storage: + # unit: second + period: 60 + database: arangodb + db_conf: + url: "http://localhost:8529" + db_name: "spider" + +kafka: + server: "localhost:9092" + metadata_topic: "gala_gopher_metadata" + metadata_group_id: "metadata-spider" +``` + + + +### 启动 + +1. 通过命令启动。 + + ```sh + # spider-storage + ``` + +2. 通过 systemd 服务启动。 + + ```sh + # systemctl start gala-spider + ``` + + + +### 使用方法 + +##### 外部依赖软件部署 + +gala-spider 运行时需要依赖多个外部软件进行交互。因此,在启动 gala-spider 之前,用户需要将gala-spider依赖的软件部署完成。下图为 gala-spider 项目的软件依赖图。 + +![gala-spider软件架构图](./figures/gala-spider软件架构图.png) + +其中,右侧虚线框内为 gala-spider 项目的 2 个功能组件,绿色部分为 gala-spider 项目直接依赖的外部组件,灰色部分为 gala-spider 项目间接依赖的外部组件。 + +- **spider-storage**:gala-spider 核心组件,提供拓扑图存储功能。 + 1. 从 kafka 中获取观测对象的元数据信息。 + 2. 从 Prometheus 中获取所有的观测实例信息。 + 3. 
将生成的拓扑图存储到图数据库 arangodb 中。 +- **gala-inference**:gala-spider 核心组件,提供根因定位功能。它通过订阅 kafka 的异常 KPI 事件触发异常 KPI 的根因定位流程,并基于 arangodb 获取的拓扑图来构建故障传播图,最终将根因定位的结果输出到 kafka 中。 +- **prometheus**:时序数据库,gala-gopher 组件采集的观测指标数据会上报到 prometheus,再由 gala-spider 做进一步处理。 +- **kafka**:消息中间件,用于存储 gala-gopher 上报的观测对象元数据信息,异常检测组件上报的异常事件,以及 cause-inference 组件上报的根因定位结果。 +- **arangodb**:图数据库,用于存储 spider-storage 生成的拓扑图。 +- **gala-gopher**:数据采集组件,请提前部署gala-gopher。 +- **arangodb-ui**:arangodb 提供的 UI 界面,可用于查询拓扑图。 + +gala-spider 项目中的 2 个功能组件会作为独立的软件包分别发布。 + +​ **spider-storage** 组件对应本节中的 gala-spider 软件包。 + +​ **gala-inference** 组件对应 gala-inference 软件包。 + +gala-gopher软件的部署参见[gala-gopher使用手册](gala-gopher使用手册.md),此处只介绍 arangodb 的部署。 + +当前使用的 arangodb 版本是 3.8.7 ,该版本对运行环境有如下要求: + +- 只支持 x86 系统 +- gcc10 以上 + +arangodb 官方部署文档参见:[arangodb部署](https://www.arangodb.com/docs/3.9/deployment.html) 。 + +arangodb 基于 rpm 的部署流程如下: + +1. 配置 yum 源。 + + ```basic + [oe-22.03-lts-sp3-everything] # openEuler 22.03-LTS-SP3 官方发布源 + name=oe-2203-lts-sp3-everything + baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/everything/x86_64/ + enabled=1 + gpgcheck=0 + priority=1 + + [oe-22.03-lts-sp3-epol-main] # openEuler 22.03-LTS-SP3 EPOL 官方发布源 + name=oe-22.03-lts-sp3-epol-main + baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/main/x86_64/ + enabled=1 + gpgcheck=0 + priority=1 + ``` + +2. 安装 arangodb3。 + + ```sh + # yum install arangodb3 + ``` + +3. 配置修改。 + + arangodb3 服务器的配置文件路径为 `/etc/arangodb3/arangod.conf` ,需要修改如下的配置信息: + + - endpoint:配置 arangodb3 的服务器地址。 + - authentication:访问 arangodb3 服务器是否需要进行身份认证,当前 gala-spider 还不支持身份认证,故此处将authentication设置为 false。 + + 示例配置如下: + + ```yaml + [server] + endpoint = tcp://0.0.0.0:8529 + authentication = false + ``` + +4. 
启动 arangodb3。 + + ```sh + # systemctl start arangodb3 + ``` + +##### gala-spider配置项修改 + +依赖软件启动后,需要修改 gala-spider 配置文件的部分配置项内容。示例如下: + +配置 kafka 服务器地址: + +```yaml +kafka: + server: "localhost:9092" +``` + +配置 prometheus 服务器地址: + +```yaml +prometheus: + base_url: "http://localhost:9090/" +``` + +配置 arangodb 服务器地址: + +```yaml +storage: + db_conf: + url: "http://localhost:8529" +``` + +##### 启动服务 + +运行 `systemctl start gala-spider` 。查看启动状态可执行 `systemctl status gala-spider` ,输出如下信息说明启动成功。 + +```sh +[root@openEuler ~]# systemctl status gala-spider +● gala-spider.service - a-ops gala spider service + Loaded: loaded (/usr/lib/systemd/system/gala-spider.service; enabled; vendor preset: disabled) + Active: active (running) since Tue 2022-08-30 17:28:38 CST; 1 day 22h ago + Main PID: 2263793 (spider-storage) + Tasks: 3 (limit: 98900) + Memory: 44.2M + CGroup: /system.slice/gala-spider.service + └─2263793 /usr/bin/python3 /usr/bin/spider-storage +``` + +##### 输出示例 + +用户可以通过 arangodb 提供的 UI 界面来查询 gala-spider 输出的拓扑图。使用流程如下: + +1. 在浏览器输入 arangodb 服务器地址,如:http://localhost:8529 ,进入 arangodb 的 UI 界面。 + +2. 界面右上角切换至 `spider` 数据库。 + +3. 在 `Collections` 面板可以看到在不同时间段存储的观测对象实例的集合、拓扑关系的集合,如下图所示: + + ![spider拓扑关系图](./figures/spider拓扑关系图.png) + +4. 
可进一步根据 arangodb 提供的 AQL 查询语句查询存储的拓扑关系图,详细教程参见官方文档: [aql文档](https://www.arangodb.com/docs/3.8/aql/)。 + + + +## gala-inference + +gala-inference 提供异常 KPI 根因定位能力,它将基于异常检测的结果和拓扑图作为输入,根因定位的结果作为输出,输出到 kafka 中。gala-inference 组件在 gala-spider 项目下进行归档。 + +### 安装 + +挂载 yum 源: + +```basic +[oe-22.03-lts-sp3-everything] # openEuler 22.03-LTS-SP3 官方发布源 +name=oe-2203-lts-sp3-everything +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/everything/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 + +[oe-22.03-lts-sp3-epol-update] # openEuler 22.03-LTS-SP3 Update 官方发布源 +name=oe-22.03-lts-sp3-epol-update +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/update/main/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 + +[oe-22.03-lts-sp3-epol-main] # openEuler 22.03-LTS-SP3 EPOL 官方发布源 +name=oe-22.03-lts-sp3-epol-main +baseurl=http://repo.openeuler.org/openEuler-22.03-LTS-SP3/EPOL/main/x86_64/ +enabled=1 +gpgcheck=0 +priority=1 +``` + +安装 gala-inference: + +```sh +# yum install gala-inference +``` + + + +### 配置 + +#### 配置文件说明 + +gala-inference 配置文件 `/etc/gala-inference/gala-inference.yaml` 配置项说明如下。 + +- inference:根因定位算法的配置信息。 + - tolerated_bias:异常时间点的拓扑图查询所容忍的时间偏移,单位为秒。 + - topo_depth:拓扑图查询的最大深度。 + - root_topk:根因定位结果输出前 K 个根因指标。 + - infer_policy:根因推导策略,包括 dfs 和 rw 。 + - sample_duration:指标的历史数据的采样周期,单位为秒。 + - evt_valid_duration:根因定位时,有效的系统异常指标事件周期,单位为秒。 + - evt_aging_duration:根因定位时,系统异常指标事件的老化周期,单位为秒。 +- kafka:kafka配置信息。 + - server:kafka服务器地址。 + - metadata_topic:观测对象元数据消息的配置信息。 + - topic_id:观测对象元数据消息的topic名称。 + - group_id:观测对象元数据消息的消费者组ID。 + - abnormal_kpi_topic:异常 KPI 事件消息的配置信息。 + - topic_id:异常 KPI 事件消息的topic名称。 + - group_id:异常 KPI 事件消息的消费者组ID。 + - abnormal_metric_topic:系统异常指标事件消息的配置信息。 + - topic_id:系统异常指标事件消息的topic名称。 + - group_id:系统异常指标事件消息的消费者组ID。 + - consumer_to:消费系统异常指标事件消息的超时时间,单位为秒。 + - inference_topic:根因定位结果输出事件消息的配置信息。 + - topic_id:根因定位结果输出事件消息的topic名称。 +- arangodb:arangodb图数据库的配置信息,用于查询根因定位所需要的拓扑子图。 + - url:图数据库的服务器地址。 + - db_name:拓扑图存储的数据库名称。 +- log_conf:日志配置信息。 + - 
log_path:日志文件路径。 + - log_level:日志打印级别,值包括 DEBUG/INFO/WARNING/ERROR/CRITICAL。 + - max_size:日志文件大小,单位为兆字节(MB)。 + - backup_count:日志备份文件数量。 +- prometheus:prometheus数据库配置信息,用于获取指标的历史时序数据。 + - base_url:prometheus服务器地址。 + - range_api:区间采集API。 + - step:采集时间步长,用于区间采集API。 + +#### 配置文件示例 + +```yaml +inference: + # 异常时间点的拓扑图查询所容忍的时间偏移,单位:秒 + tolerated_bias: 120 + topo_depth: 10 + root_topk: 3 + infer_policy: "dfs" + # 单位: 秒 + sample_duration: 600 + # 根因定位时,有效的异常指标事件周期,单位:秒 + evt_valid_duration: 120 + # 异常指标事件的老化周期,单位:秒 + evt_aging_duration: 600 + +kafka: + server: "localhost:9092" + metadata_topic: + topic_id: "gala_gopher_metadata" + group_id: "metadata-inference" + abnormal_kpi_topic: + topic_id: "gala_anteater_hybrid_model" + group_id: "abn-kpi-inference" + abnormal_metric_topic: + topic_id: "gala_anteater_metric" + group_id: "abn-metric-inference" + consumer_to: 1 + inference_topic: + topic_id: "gala_cause_inference" + +arangodb: + url: "http://localhost:8529" + db_name: "spider" + +log: + log_path: "/var/log/gala-inference/inference.log" + # log level: DEBUG/INFO/WARNING/ERROR/CRITICAL + log_level: INFO + # unit: MB + max_size: 10 + backup_count: 10 + +prometheus: + base_url: "http://localhost:9090/" + range_api: "/api/v1/query_range" + step: 5 +``` + + + +### 启动 + +1. 通过命令启动。 + + ```sh + # gala-inference + ``` + +2. 
通过 systemd 服务启动。 + + ```sh + # systemctl start gala-inference + ``` + + + +### 使用方法 + +##### 依赖软件部署 + +gala-inference 的运行依赖和 gala-spider一样,请参见[外部依赖软件部署](#外部依赖软件部署)。此外,gala-inference 还间接依赖 [gala-spider](#gala-spider) 和 [gala-anteater](gala-anteater使用手册.md) 软件的运行,请提前部署gala-spider和gala-anteater软件。 + +##### 配置项修改 + +修改 gala-inference 的配置文件中部分配置项。示例如下: + +配置 kafka 服务器地址: + +```yaml +kafka: + server: "localhost:9092" +``` + +配置 prometheus 服务器地址: + +```yaml +prometheus: + base_url: "http://localhost:9090/" +``` + +配置 arangodb 服务器地址: + +```yaml +arangodb: + url: "http://localhost:8529" +``` + +##### 启动服务 + +直接运行 `systemctl start gala-inference` 即可。可通过执行 `systemctl status gala-inference` 查看启动状态,如下打印表示启动成功。 + +```sh +[root@openEuler ~]# systemctl status gala-inference +● gala-inference.service - a-ops gala inference service + Loaded: loaded (/usr/lib/systemd/system/gala-inference.service; enabled; vendor preset: disabled) + Active: active (running) since Tue 2022-08-30 17:55:33 CST; 1 day 22h ago + Main PID: 2445875 (gala-inference) + Tasks: 10 (limit: 98900) + Memory: 48.7M + CGroup: /system.slice/gala-inference.service + └─2445875 /usr/bin/python3 /usr/bin/gala-inference +``` + +##### 输出示例 + +当异常检测模块 gala-anteater 检测到 KPI 异常后,会将对应的异常 KPI 事件输出到 kafka 中,gala-inference 会一直监测该异常 KPI 事件的消息,如果收到异常 KPI 事件的消息,就会触发根因定位。根因定位会将定位结果输出到 kafka 中,用户可以在 kafka 服务器中查看根因定位的输出结果,基本步骤如下: + +1. 若通过源码安装 kafka ,需要进入 kafka 的安装目录下。 + + ```sh + cd /root/kafka_2.13-2.8.0 + ``` + +2. 
执行消费 topic 的命令获取根因定位的输出结果。 + + ```sh + ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic gala_cause_inference + ``` + + 输出示例如下: + + ```json + { + "Timestamp": 1661853360000, + "event_id": "1661853360000_1fd37742xxxx_sli_12154_19", + "Atrributes": { + "event_id": "1661853360000_1fd37742xxxx_sli_12154_19" + }, + "Resource": { + "abnormal_kpi": { + "metric_id": "gala_gopher_sli_rtt_nsec", + "entity_id": "1fd37742xxxx_sli_12154_19", + "timestamp": 1661853360000, + "metric_labels": { + "machine_id": "1fd37742xxxx", + "tgid": "12154", + "conn_fd": "19" + } + }, + "cause_metrics": [ + { + "metric_id": "gala_gopher_proc_write_bytes", + "entity_id": "1fd37742xxxx_proc_12154", + "metric_labels": { + "__name__": "gala_gopher_proc_write_bytes", + "cmdline": "/opt/redis/redis-server x.x.x.172:3742", + "comm": "redis-server", + "container_id": "5a10635e2c43", + "hostname": "openEuler", + "instance": "x.x.x.172:8888", + "job": "prometheus", + "machine_id": "1fd37742xxxx", + "pgid": "12154", + "ppid": "12126", + "tgid": "12154" + }, + "timestamp": 1661853360000, + "path": [ + { + "metric_id": "gala_gopher_proc_write_bytes", + "entity_id": "1fd37742xxxx_proc_12154", + "metric_labels": { + "__name__": "gala_gopher_proc_write_bytes", + "cmdline": "/opt/redis/redis-server x.x.x.172:3742", + "comm": "redis-server", + "container_id": "5a10635e2c43", + "hostname": "openEuler", + "instance": "x.x.x.172:8888", + "job": "prometheus", + "machine_id": "1fd37742xxxx", + "pgid": "12154", + "ppid": "12126", + "tgid": "12154" + }, + "timestamp": 1661853360000 + }, + { + "metric_id": "gala_gopher_sli_rtt_nsec", + "entity_id": "1fd37742xxxx_sli_12154_19", + "metric_labels": { + "machine_id": "1fd37742xxxx", + "tgid": "12154", + "conn_fd": "19" + }, + "timestamp": 1661853360000 + } + ] + } + ] + }, + "SeverityText": "WARN", + "SeverityNumber": 13, + "Body": "A cause inferring event for an abnormal event" + } + ``` \ No newline at end of file diff --git 
"a/docs/zh/docs/A-Ops/image/ACC\347\232\204hotpatchmetadata\346\226\207\344\273\266\347\244\272\344\276\213.png" "b/docs/zh/docs/A-Ops/image/ACC\347\232\204hotpatchmetadata\346\226\207\344\273\266\347\244\272\344\276\213.png" new file mode 100644 index 0000000000000000000000000000000000000000..790df6fd5781ca008124cff14635165a71abf126 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/ACC\347\232\204hotpatchmetadata\346\226\207\344\273\266\347\244\272\344\276\213.png" differ diff --git a/docs/zh/docs/A-Ops/image/hotpatch-fix-pr.png b/docs/zh/docs/A-Ops/image/hotpatch-fix-pr.png index 209c73f7b4522819c52662a9038bdf19a88eacfd..d10fd1ec44416f6b59cfd21cca8721d001f7ed19 100644 Binary files a/docs/zh/docs/A-Ops/image/hotpatch-fix-pr.png and b/docs/zh/docs/A-Ops/image/hotpatch-fix-pr.png differ diff --git a/docs/zh/docs/A-Ops/image/image-20230607161545732.png b/docs/zh/docs/A-Ops/image/image-20230607161545732.png deleted file mode 100644 index ba6992bea8d2a1d7ca4769ebfdd850b98d1a372f..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/A-Ops/image/image-20230607161545732.png and /dev/null differ diff --git a/docs/zh/docs/A-Ops/image/image-20230908163402743.png b/docs/zh/docs/A-Ops/image/image-20230908163402743.png new file mode 100644 index 0000000000000000000000000000000000000000..c17667178689c6384a039bf0f8025ea7eb360236 Binary files /dev/null and b/docs/zh/docs/A-Ops/image/image-20230908163402743.png differ diff --git a/docs/zh/docs/A-Ops/image/image-20230908163914778.png b/docs/zh/docs/A-Ops/image/image-20230908163914778.png new file mode 100644 index 0000000000000000000000000000000000000000..a06c7e49b32286ceec9ff0e9a08f73a76c179daf Binary files /dev/null and b/docs/zh/docs/A-Ops/image/image-20230908163914778.png differ diff --git a/docs/zh/docs/A-Ops/image/image-20230908164216528.png b/docs/zh/docs/A-Ops/image/image-20230908164216528.png new file mode 100644 index 0000000000000000000000000000000000000000..15fbc694603837095244451d4f5d7e7af70789be 
Binary files /dev/null and b/docs/zh/docs/A-Ops/image/image-20230908164216528.png differ diff --git "a/docs/zh/docs/A-Ops/image/src-openEuler\344\273\223\350\257\204\350\256\272.png" "b/docs/zh/docs/A-Ops/image/src-openEuler\344\273\223\350\257\204\350\256\272.png" index 3f8fbd534e8f8a48fdd60a5c3f13b33531a4112a..ba3a44433117f0a23fc6048cd3b093fe6af7250c 100644 Binary files "a/docs/zh/docs/A-Ops/image/src-openEuler\344\273\223\350\257\204\350\256\272.png" and "b/docs/zh/docs/A-Ops/image/src-openEuler\344\273\223\350\257\204\350\256\272.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\345\220\214\346\204\217\345\220\210\345\205\245pr.png" "b/docs/zh/docs/A-Ops/image/\345\220\214\346\204\217\345\220\210\345\205\245pr.png" new file mode 100644 index 0000000000000000000000000000000000000000..2c2e2dd78242f538c21809614e917bef769256ba Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\345\220\214\346\204\217\345\220\210\345\205\245pr.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\345\220\257\345\212\250\347\203\255\350\241\245\344\270\201\345\267\245\347\250\213\346\265\201\347\250\213.png" "b/docs/zh/docs/A-Ops/image/\345\220\257\345\212\250\347\203\255\350\241\245\344\270\201\345\267\245\347\250\213\346\265\201\347\250\213.png" index 1405eced0a14e3956191e111b7c1d588e5b3d27b..2914c3eef44bb3d3528686b44157a5f9276da9c6 100644 Binary files "a/docs/zh/docs/A-Ops/image/\345\220\257\345\212\250\347\203\255\350\241\245\344\270\201\345\267\245\347\250\213\346\265\201\347\250\213.png" and "b/docs/zh/docs/A-Ops/image/\345\220\257\345\212\250\347\203\255\350\241\245\344\270\201\345\267\245\347\250\213\346\265\201\347\250\213.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\345\210\235\345\247\213\345\206\205\345\256\271.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\345\210\235\345\247\213\345\206\205\345\256\271.png" new file mode 100644 index 
0000000000000000000000000000000000000000..044be7ccd001ddc2bb69ba53b34f3c2a72511f39 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\345\210\235\345\247\213\345\206\205\345\256\271.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\345\233\236\345\241\253.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\345\233\236\345\241\253.png" new file mode 100644 index 0000000000000000000000000000000000000000..779c2fddcb02968358492e70f6aa9261be26fe48 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\345\233\236\345\241\253.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\351\223\276\346\216\245\345\222\214pr\351\223\276\346\216\245.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\351\223\276\346\216\245\345\222\214pr\351\223\276\346\216\245.png" index c9f6dc0a0f1a1758bb936b61ec939f8f5eeee633..d97fbd1fbb5a20b97ec88989f3c7a0776bb9cdc0 100644 Binary files "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\351\223\276\346\216\245\345\222\214pr\351\223\276\346\216\245.png" and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201issue\351\223\276\346\216\245\345\222\214pr\351\223\276\346\216\245.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\345\210\266\344\275\234\345\244\261\350\264\245.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\345\210\266\344\275\234\345\244\261\350\264\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..3acf2e93550e4962d0a5f927fd6fd0460a64b889 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\345\210\266\344\275\234\345\244\261\350\264\245.png" differ diff --git 
"a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\345\210\266\344\275\234\347\273\223\346\236\234.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\345\210\266\344\275\234\347\273\223\346\236\234.png" new file mode 100644 index 0000000000000000000000000000000000000000..5b167be8a40762823223ccdd700d5b62f7e1aa38 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\345\210\266\344\275\234\347\273\223\346\236\234.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\347\232\204chroot\347\216\257\345\242\203.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\347\232\204chroot\347\216\257\345\242\203.png" new file mode 100644 index 0000000000000000000000000000000000000000..a96a4d229b54b301bbf4e7f7a2c41ea1e9faf43d Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\347\232\204chroot\347\216\257\345\242\203.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\350\247\246\345\217\221\346\265\201\347\250\213.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\350\247\246\345\217\221\346\265\201\347\250\213.png" new file mode 100644 index 0000000000000000000000000000000000000000..d77335d0097f7504f0c37dd8aca1691d9f1f0a23 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201pr\350\247\246\345\217\221\346\265\201\347\250\213.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\344\273\223\346\217\220pr\350\257\264\346\230\216.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\344\273\223\346\217\220pr\350\257\264\346\230\216.png" new file mode 100644 index 0000000000000000000000000000000000000000..aa74c2859588ff2a49d6341dd2a2ac6fe2049eac Binary files /dev/null and 
"b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\344\273\223\346\217\220pr\350\257\264\346\230\216.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\350\207\252\351\252\214\344\270\213\350\275\275\351\223\276\346\216\245.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\350\207\252\351\252\214\344\270\213\350\275\275\351\223\276\346\216\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..404ac733fae66bda9ceac2d6c2fa18897c58dc70 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\350\207\252\351\252\214\344\270\213\350\275\275\351\223\276\346\216\245.png" differ diff --git "a/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\350\207\252\351\252\214\345\214\205\344\270\213\350\275\275\351\223\276\346\216\245.png" "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\350\207\252\351\252\214\345\214\205\344\270\213\350\275\275\351\223\276\346\216\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..6d32e8874e8e5e7f7fb5c350fca0063da9a77176 Binary files /dev/null and "b/docs/zh/docs/A-Ops/image/\347\203\255\350\241\245\344\270\201\350\207\252\351\252\214\345\214\205\344\270\213\350\275\275\351\223\276\346\216\245.png" differ diff --git "a/docs/zh/docs/A-Ops/\346\236\266\346\236\204\346\204\237\347\237\245\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/\346\236\266\346\236\204\346\204\237\347\237\245\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" deleted file mode 100644 index 1a1439dc4e777c3bd14ea503c86cc427b7e7b4c6..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/A-Ops/\346\236\266\346\236\204\346\204\237\347\237\245\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ /dev/null @@ -1,74 +0,0 @@ -# 架构感知服务使用手册 - -## 安装 - -### 手动安装 - -- 通过yum挂载repo源实现 - - 
配置yum源:openEuler23.09 和 openEuler23.09:Epol,repo源路径:/etc/yum.repos.d/openEuler.repo。 - - ```ini - [everything] # openEuler 23.09 官方发布源 - name=openEuler23.09 - baseurl=https://repo.openeuler.org/openEuler-23.09/everything/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=https://repo.openeuler.org/openEuler-23.09/everything/$basearch/RPM-GPG-KEY-openEuler - - [Epol] # openEuler 23.09:Epol 官方发布源 - name=Epol - baseurl=https://repo.openeuler.org/openEuler-23.09/EPOL/main/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=https://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler - ``` - - 然后执行如下指令下载以及安装gala-spider及其依赖。 - - ```shell - # A-Ops 架构感知,通常安装在主节点上 - yum install gala-spider - yum install python3-gala-spider - - # A-Ops 架构感知探针,通常安装在主节点上 - yum install gala-gopher - ``` - -- 通过安装rpm包实现。先下载gala-spider-vx.x.x-x.oe1.aarch64.rpm,然后执行如下命令进行安装(其中x.x-x表示版本号,请用实际情况替代)。 - - ```shell - rpm -ivh gala-spider-vx.x.x-x.oe1.aarch64.rpm - rpm -ivh gala-gopher-vx.x.x-x.oe1.aarch64.rpm - ``` - -### 使用Aops部署服务安装 - -#### 编辑任务列表 - -修改部署任务列表,打开gala_spider步骤开关: - -```yaml ---- -step_list: - ... - gala_gopher: - enable: false - continue: false - gala_spider: - enable: false - continue: false - ... 
-``` - -#### 编辑主机清单 - -具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节2.2.2.11章节gala-spider与gala-gopher模块主机配置。 - -#### 编辑变量列表 - -具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节2.2.2.11章节gala-spider与gala-gopher模块变量配置。 - -#### 执行部署任务 - -具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节3执行部署任务。 diff --git "a/docs/zh/docs/A-Ops/\347\244\276\345\214\272\347\203\255\350\241\245\344\270\201\345\210\266\344\275\234\345\217\221\345\270\203\346\265\201\347\250\213.md" "b/docs/zh/docs/A-Ops/\347\244\276\345\214\272\347\203\255\350\241\245\344\270\201\345\210\266\344\275\234\345\217\221\345\270\203\346\265\201\347\250\213.md" index 1c2d4ab7fabed91e8dce2ed214ec26afd1dea637..067eb5ed716a793c24bd494baf9f4ba88c7f70a9 100644 --- "a/docs/zh/docs/A-Ops/\347\244\276\345\214\272\347\203\255\350\241\245\344\270\201\345\210\266\344\275\234\345\217\221\345\270\203\346\265\201\347\250\213.md" +++ "b/docs/zh/docs/A-Ops/\347\244\276\345\214\272\347\203\255\350\241\245\344\270\201\345\210\266\344\275\234\345\217\221\345\270\203\346\265\201\347\250\213.md" @@ -1,19 +1,22 @@ + + # 社区热补丁制作发布流程 ## 制作内核态/用户态热补丁 +> 热补丁仓库: + ### 场景1. 在src-openEuler/openEuler仓下评论pr制作新版本热补丁 > 制作内核态热补丁需在**openEuler/kernel**仓评论pr。 > > 制作用户态热补丁需在src-openEuler仓评论pr,现在支持**src-openEuler/openssl,src-openEuler/glibc,src-openEuler/systemd**。 -#### 1. 在已合入pr下评论制作热补丁 +##### 1. 
在已合入pr下评论制作热补丁 - 从src-openeuler仓【支持openssl, glibc, systemd】评论已合入pr制作新版本热补丁。 - -```shell -/makehotpatch [软件包版本号] [patch list] [cve/bug] [issue id] [os_branch] +``` +/makehotpatch [软件包版本号] [ACC/SGL] [patch list] [cve/bugfix/feature] [issue id] [os_branch] ``` 命令说明:使用多个patch用','分隔,需注意patch的先后顺序。 @@ -22,239 +25,275 @@ - 从openeuler仓【支持kernel】评论已合入pr制作新版本热补丁。 -```shell -/makehotpatch [软件包版本号] [cve/bug] [issue id] [os_branch] ``` +/makehotpatch [软件包版本号] [ACC/SGL] [cve/bugfix/feature] [issue id] [os_branch] +``` + +![image-20230816105443658](./image/src-openEuler仓评论.png) -![image-20230629142933917](./image/openEuler仓评论.png) +评论后,门禁触发hotpatch_meta仓创建热补丁issue以及同步该pr。 -评论后,门禁触发hotpatch_metadata仓创建热补丁issue以及同步该pr。 -#### 2. hotpatch_metadata仓自动创建热补丁issue、同步该pr + +##### 2. hotpatch_meta仓自动创建热补丁issue、同步该pr pr评论区提示启动热补丁制作流程。 -![image-20230629143426498](./image/启动热补丁工程流程.png) +![image-20230816105627657](./image/启动热补丁工程流程.png) -随后,hotpatch_metadata仓自动创建热补丁issue,并在hotpatch_metadata仓同步该pr。 +随后,hotpatch_meta仓自动创建热补丁issue,并在hotpatch_meta仓同步该pr。 > 热补丁issue用于跟踪热补丁制作流程。 > -> hotpatch_metadata仓用于触发制作热补丁。 +> hotpatch_meta仓用于触发制作热补丁。 -![image-20230629144503840](./image/热补丁issue链接和pr链接.png) +![image-20230816105850831](./image/热补丁issue链接和pr链接.png) 点击查看热补丁issue链接内容。 -- 热补丁Issue类别为hotpatch。 -![image-20230607161545732](./image/image-20230607161545732.png) +![image-20230816110430216](./image/热补丁issue初始内容.png) + -点击查看在hotpatch_metadata仓自动创建的pr。 -![hotpatch-fix-pr](./image/hotpatch-fix-pr.png) +点击查看在hotpatch_meta仓自动创建的pr。 -#### 3. 触发制作热补丁 +![image-20230816110637492](./image/hotpatch-fix-pr.png) -打开hotpatch_metadata仓自动创建的pr,评论区可以查看热补丁制作信息。 +##### 3. 
触发制作热补丁 -![img](./image/45515A7F-0EC2-45AA-9B58-AB92DE9B0979.png) +打开hotpatch_meta仓自动创建的pr,评论区可以查看热补丁制作信息。 + +![image-20230816110919823](./image/热补丁pr触发流程.png) 查看热补丁制作结果。 -![img](./image/E574E637-0BF3-4F3B-BAE6-04ECBD09D151.png) +如果热补丁制作失败,可以根据相关日志信息、下载chroot环境自行修改patch进行调试,重新修改pr提交后或者评论 /retest直到热补丁可以被成功制作。 + +![image-20230816111330743](./image/热补丁pr制作失败.png) -如果热补丁制作失败,可以根据相关日志信息修改pr、评论 /retest直到热补丁可以被成功制作。 +![image-20230816111452301](./image/热补丁pr的chroot环境.png) 如果热补丁制作成功,可以通过Download link下载热补丁进行自验。 -![image-20230608151244425](./image/hotpatch-pr-success.png) +![image-20230816111007667](./image/热补丁pr制作结果.png) + +打开Download link链接。 + +![image-20230816112118423](./image/热补丁自验下载链接.png) + +进入Packages目录,可以下载制作成功的热补丁。 + +![image-20230816112420115](./image/热补丁自验包下载链接.png) **若热补丁制作成功,可以对热补丁进行审阅**。 -### 场景2、从hotpatch_metadata仓提pr修改热补丁 -> 从hotpatch_metadata仓提pr只能修改还未正式发布的热补丁。 -> -#### 1. 提pr +### 场景2、从hotpatch_meta仓提pr制作新版本热补丁 -用户需要手动创建热补丁issue。 +> hotpatch_meta仓地址:https://gitee.com/openeuler/hotpatch_meta -(1)阅读readme,根据热补丁issue模版创建热补丁。 +##### 1. 提pr -![image-20230612113428096](./image/image-20230612113428096.png) +在hotpatch_meta仓提pr。 -> 用户不允许修改热补丁元数据文件中已被正式发布的热补丁的相关内容。 -> +(1)阅读readme,根据热补丁issue模版和元数据文件hotmetadata_ACC.xml/hotmetadata_SGL.xml模板创建热补丁。 + +![image-20230817095228204](./image/热补丁仓提pr说明.png) pr内容: - patch文件。 -- 修改热补丁元数据hotmetadata.xml文件。 +- 如果没有相应热补丁元数据hotmetadata_ACC.xml/hotmetadata_SGL.xml文件,则手动创建;否则修改热补丁元数据hotmetadata_ACC.xml/hotmetadata_SGL.xml文件。 + + +##### 2. 触发制作热补丁 **若热补丁制作成功,可以对热补丁进行审阅**。 -### 场景3、从hotpatch_metadata仓提pr制作新版本热补丁 -#### 1. 提pr -在hotpatch_metadata仓提pr。 +### 场景3、从hotpatch_meta仓提pr修改热补丁 + +> hotpatch_meta仓地址:https://gitee.com/openeuler/hotpatch_meta +> +> 从hotpatch_meta仓提pr只能修改还未正式发布的热补丁。 + +##### 1. 
提pr + +在hotpatch_meta仓提pr。 + +(1)如果修改过程涉及元数据文件hotmetadata_ACC.xml/hotmetadata_SGL.xml文件内容变动,请阅读readme,按照元数据文件hotmetadata_ACC.xml/hotmetadata_SGL.xml模板进行修改。 + +![image-20230817095228204](./image/热补丁仓提pr说明.png) + +> 如果需要修改元数据文件中的热补丁issue字段内容,请确保添加的热补丁Issue已经存在。 +> 用户不允许修改热补丁元数据文件中已被正式发布的热补丁的相关内容。 -(1)阅读readme,根据热补丁issue模版创建热补丁。 -![image-20230612113428096](./image/image-20230612113428096.png) pr内容: - patch文件。 -- 如果没有相应热补丁元数据hotmetadata.xml文件,则手动创建;否则修改热补丁元数据hotmetadata.xml文件。 +- 修改热补丁元数据hotmetadata_ACC.xml/hotmetadata_SGL.xml文件。 - - -#### 2. 触发制作热补丁 + + +##### 2. 触发制作热补丁 **若热补丁制作成功,可以对热补丁进行审阅**。 + + + ## 审阅热补丁 -### 1. 审阅热补丁pr +##### 1. 审阅热补丁pr 确认可发布,合入pr。 -### 2. pr合入,回填热补丁issue +![image-20230816112957179](./image/同意合入pr.png) -在热补丁issue页面补充热补丁路径,包含src.rpm/arm架构/x86架构的rpm包,以及对应hotpatch.xml,用于展示热补丁信息。 +##### 2. pr合入,回填热补丁issue -> 如果一个架构失败,强行合入,也可只发布单架构的包。 +自动在热补丁issue页面补充热补丁路径,包含src.rpm/arm架构/x86架构的rpm包,以及对应hotpatch.xml,用于展示热补丁信息。 -![img](./image/EF5E0132-6E5C-4DD1-8CB5-73035278E233.png) +> 如果一个架构失败,强行合入,也可只发布单架构的包。 -- 热补丁Issue标签为hotpatch。 +![image-20230816115813395](./image/热补丁issue回填.png) - 查看热补丁元数据内容。 -热补丁元数据模版: - > 热补丁元数据用于管理查看热补丁相关历史制作信息。 +hotmetadata_ACC.xml格式示例: + ```xml - Managing Hot Patch Metadata - - - - src.rpm归档地址 - x86架构debuginfo二进制包归档地址 - arm架构debuginfo二进制包归档地址 - patch文件 - - https://gitee.com/wanghuan158/hot-patch_metadata/issues/I7AE5F - - - + Managing Hot Patch Metadata + + + + 源码包下载路径(需要release正式路径) + x86_64架构debuginfo包下载路径(需要release正式路径) + aarch64架构debuginfo包下载路径(需要release正式路径) + 本次需要制作热补丁的patch包名1 + 本次需要制作热补丁的patch包名2 + ... 
+ + + + 热补丁issue链接 + + + ``` +hotmetadata_SGL.xml格式示例: + ```xml - Managing Hot Patch Metadata - - - - download_link - download_link - download_link - 0001-PEM-read-bio-ret-failure.patch - - https://gitee.com/wanghuan158/hot-patch_metadata/issues/I7AE5F - - - download_link - download_link - download_link - 0001-PEM-read-bio-ret-failure.patch - - https://gitee.com/wanghuan158/hot-patch_metadata/issues/I7AE5P - - - + Managing Hot Patch Metadata + + + + 源码包下载路径(需要release正式路径) + x86_64架构debuginfo包下载路径(需要release正式路径) + aarch64架构debuginfo包下载路径(需要release正式路径) + 本次需要制作热补丁的patch包名1 + 本次需要制作热补丁的patch包名2 + ... + + + + 热补丁issue链接 + + + ``` -> 注意:download_link均为repo仓正式的归档链接。 -> -> 热补丁当前只考虑演进,version 2基于version 1的src继续构建。 +> 注意:src_rpm的download_link均来自openeuler的repo仓下正式发布的rpm包。 + +![image-20230817100308392](./image/ACC的hotpatchmetadata文件示例.png) + +##### 3. 修改热补丁Issue + +- 将热补丁issue状态修改为“已完成”。 +- 为热补丁issue添加hotpatch标签。 -![image-20230607163358749](./image/image-20230607163358749.png) -### 3. 关闭相应热补丁Issue ## 发布热补丁 -### 1、收集热补丁发布需求 +##### 1、收集热补丁发布需求 -在release-management仓库每周update需求收集的issue下方,手动评论start-update命令,此时会收集待发布的热补丁和待发布的修复cve的冷补丁。后台会在hotpatch_meta仓库根据hotpatch标签查找已关闭的热补丁issue。 +在release-management仓库每周update需求收集的issue下方,手动评论start-update命令,此时会收集待发布的热补丁和待发布的修复cve的冷补丁。后台会在hotpatch_meta仓库根据hotpatch标签查找已完成的热补丁issue。 -### 2、生成安全公告热补丁信息 +##### 2、生成热补丁安全公告 -社区根据收集到的热补丁issue信息,在生成安全公告的同时生成hotpatch字段补丁,过滤已经发布的漏洞。 +社区根据收集到的热补丁issue信息,生成热补丁安全公告xml文件。 -- 在安全公告文件新增HotPatchTree字段,记录和公告相关漏洞的热补丁,每个补丁按架构和CVE字段区分(Type=ProductName 记录分支,Type=ProductArch 记录补丁具体的rpm包)。 +> 热补丁安全公告地址: -![](./image/patch-file.PNG) +- 在热补丁安全公告文件新增HotPatchTree字段,记录和公告相关漏洞的热补丁,每个补丁按架构和CVE字段区分(Type=ProductName 记录分支,Type=ProductArch 记录补丁具体的rpm包)。 -### 3、Majun平台上传文件到openEuler官网,同步生成updateinfo.xml文件 +![image-20230908163914778](./image/image-20230908163914778.png) + +##### 3、Majun平台上传文件到openEuler官网,同步生成updateinfo.xml文件 社区将生成的安全公告上传到openEuler官网,同时基于所收集的热补丁信息生成updateinfo.xml文件。 -![](./image/hotpatch-xml.PNG) 
+![image-20230908164216528](./image/image-20230908164216528.png) updateinfo.xml文件样例: ```xml - - - - openEuler-SA-2022-1 - An update for mariadb is now available for openEuler-22.03-LTS - Important - openEuler - - - - - - patch-redis-6.2.5-1-HP001.(CVE-2022-24048) - - + + + + openEuler-HotPatchSA-2023-1001 + An update for kernel is now available for openEuler-22.03-LTS-SP3 + Important + openEuler + + + + + A use-after-free vulnerability in the Linux Kernel io_uring subsystem can be exploited to achieve local privilege escalation.Racing a io_uring cancel poll request with a linked timeout can cause a UAF in a hrtimer.We recommend upgrading past commit ef7dfac51d8ed961b742218f526bd589f3900a59 (4716c73b188566865bdd79c3a6709696a224ac04 for 5.10 stable and 0e388fce7aec40992eadee654193cad345d62663 for 5.15 stable).(CVE-2023-3389) + + openEuler - - patch-redis-6.2.5-1-HP001-1-1.aarch64.rpm - - - patch-redis-6.2.5-1-HP001-1-1.x86_64.rpm - - - patch-redis-6.2.5-1-HP002-1-1.aarch64.rpm + + patch-kernel-5.10.0-153.12.0.92.oe2203sp3-ACC-1-1.aarch64.rpm - - patch-redis-6.2.5-1-HP002-1-1.x86_64.rpm + + patch-kernel-5.10.0-153.12.0.92.oe2203sp3-ACC-1-1.x86_64.rpm - - - - ... 
- + + + + ``` -### 4、openEuler官网可以查看更新的热补丁信息,以cve编号划分 -![image-20230612113626330](./image/image-20230612113626330.png) -### 5、获取热补丁相关文件 +##### 4、openEuler官网可以查看更新的热补丁信息 + +> openEuler官网安全公告: + +以”HotpatchSA“关键词搜索热补丁安全公告,打开安全公告查看发布热补丁详细信息。 + +![image-20230908163402743](./image/image-20230908163402743.png) + + + +##### 5、获取热补丁相关文件 -社区将热补丁相关文件同步至openEuler的repo源下,可以在各个分支的hotpatch目录下获取相应文件。 +社区将热补丁相关文件同步至openEuler的repo源下,可以在各个分支的hotpatch_update目录下获取相应文件。 > openEuler的repo地址: diff --git "a/docs/zh/docs/A-Ops/\351\205\215\347\275\256\346\272\257\346\272\220\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/A-Ops/\351\205\215\347\275\256\346\272\257\346\272\220\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" index 2d65810c8165b028cea96792b1db600a84c2dd9e..4f9db114389b5f1567f3d965091f0fd5b51972c9 100644 --- "a/docs/zh/docs/A-Ops/\351\205\215\347\275\256\346\272\257\346\272\220\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/A-Ops/\351\205\215\347\275\256\346\272\257\346\272\220\346\234\215\345\212\241\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -1,151 +1,161 @@ -# gala-ragdoll的使用指导 - -============================ - -## 安装 - -### 手动安装 - -- 通过yum挂载repo源实现 - - 配置yum源:openEuler23.09 和 openEuler23.09:Epol,repo源路径:/etc/yum.repos.d/openEuler.repo。 - - ```ini - [everything] # openEuler 23.09 官方发布源 - name=openEuler23.09 - baseurl=https://repo.openeuler.org/openEuler-23.09/everything/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=https://repo.openeuler.org/openEuler-23.09/everything/$basearch/RPM-GPG-KEY-openEuler - - [Epol] # openEuler 23.09:Epol 官方发布源 - name=Epol - baseurl=https://repo.openeuler.org/openEuler-23.09/EPOL/main/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=https://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler - ``` - - 然后执行如下指令下载以及安装gala-ragdoll及其依赖。 - - ```shell - yum install gala-ragdoll # A-Ops 配置溯源 - yum install 
python3-gala-ragdoll - - yum install gala-spider # A-Ops 架构感知 - yum install python3-gala-spider - ``` - -- 通过安装rpm包实现。先下载gala-ragdoll-vx.x.x-x.oe1.aarch64.rpm,然后执行如下命令进行安装(其中x.x-x表示版本号,请用实际情况替代) - - ```shell - rpm -ivh gala-ragdoll-vx.x.x-x.oe1.aarch64.rpm - ``` - -### 使用Aops部署服务安装 - -#### 编辑任务列表 - -修改部署任务列表,打开gala_ragdoll步骤开关: - -```yaml ---- -step_list: - ... - gala_ragdoll: - enable: false - continue: false - ... -``` - -#### 编辑主机清单 - -具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节2.2.2.10章节gala-ragdoll模块主机配置 - -#### 编辑变量列表 - -具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节2.2.2.10章节gala-ragdoll模块变量配置 - -#### 执行部署任务 - -具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节3执行部署任务 - -### 配置文件介绍 - -```/etc/yum.repos.d/openEuler.repo```是用来规定yum源地址的配置文件,该配置文件内容为: - -```shell -[OS] -name=OS -baseurl=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/ -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler -``` - -### yang模型介绍 - -`/etc/yum.repos.d/openEuler.repo`采用yang语言进行表示,参见`gala-ragdoll/yang_modules/openEuler-logos-openEuler.repo.yang`; -其中增加了三个拓展字段: - -| 拓展字段名称 | 拓展字段格式 | 样例 | -| ------------ | ---------------------- | ----------------------------------------- | -| path | OS类型:配置文件的路径 | openEuler:/etc/yum.repos.d/openEuler.repo | -| type | 配置文件类型 | ini、key-value、json、text等 | -| spacer | 配置项和配置值的中间键 | “ ”、“=”、“:”等 | - -附:yang语言的学习地址: - -### 通过配置溯源创建域 - -#### 查看配置文件 - -gala-ragdoll中存在配置溯源的配置文件 - -```shell -[root@openeuler-development-1-1drnd ~]# cat /etc/ragdoll/gala-ragdoll.conf -[git] // 定义当前的git信息:包括git仓的目录和用户信息 -git_dir = "/home/confTraceTestConf" -user_name = "user" -user_email = "email" - -[collect] // A-OPS 对外提供的collect接口 -collect_address = "http://192.168.0.0:11111" -collect_api = "/manage/config/collect" - -[ragdoll] -port = 11114 - -``` - -#### 创建配置域 - -![](./figures/chuangjianyewuyu.png) - -#### 添加配置域纳管node - -![](./figures/tianjianode.png) - -#### 添加配置域配置 - -![](./figures/xinzengpeizhi.png) - -#### 查询预期配置 - -![](./figures/chakanyuqi.png) - -#### 删除配置 
- -![](./figures/shanchupeizhi.png) - -#### 查询实际配置 - -![](./figures/chaxunshijipeizhi.png) - -#### 配置校验 - -![](./figures/zhuangtaichaxun.png) - -#### 配置同步 - -![](./figures/peizhitongbu.png) +# gala-ragdoll的使用指导 + +============================ + +## 安装 + +### 手动安装 + +- 通过yum挂载repo源实现 + + 配置yum源:openEuler-24.03-LTS 和 openEuler-24.03-LTS:Epol,repo源路径:/etc/yum.repos.d/openEuler.repo。 + + ```ini + [everything] # openEuler-24.03-LTS 官方发布源 + name=openEuler-24.03-LTS + baseurl=https://repo.openeuler.org/openEuler-24.03-LTS/everything/$basearch/ + enabled=1 + gpgcheck=1 + gpgkey=https://repo.openeuler.org/openEuler-24.03-LTS/everything/$basearch/RPM-GPG-KEY-openEuler + + [Epol] # openEuler-24.03-LTS:Epol 官方发布源 + name=Epol + baseurl=https://repo.openeuler.org/openEuler-24.03-LTS/EPOL/main/$basearch/ + enabled=1 + gpgcheck=1 + gpgkey=https://repo.openeuler.org/openEuler-24.03-LTS/OS/$basearch/RPM-GPG-KEY-openEuler + ``` + + 然后执行如下指令下载以及安装gala-ragdoll及其依赖。 + + ```shell + yum install gala-ragdoll # A-Ops 配置溯源 + yum install python3-gala-ragdoll + + yum install gala-spider # A-Ops 架构感知 + yum install python3-gala-spider + ``` + +- 通过安装rpm包实现。先下载gala-ragdoll-vx.x.x-x.oe1.aarch64.rpm,然后执行如下命令进行安装(其中x.x-x表示版本号,请用实际情况替代) + + ```shell + rpm -ivh gala-ragdoll-vx.x.x-x.oe1.aarch64.rpm + ``` + +### 使用Aops部署服务安装 + +#### 编辑任务列表 + +修改部署任务列表,打开gala_ragdoll步骤开关: + +```yaml +--- +step_list: + ... + gala_ragdoll: + enable: false + continue: false + ... 
+``` + +#### 编辑主机清单 + +具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节2.2.2.10章节gala-ragdoll模块主机配置 + +#### 编辑变量列表 + +具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节2.2.2.10章节gala-ragdoll模块变量配置 + +#### 执行部署任务 + +具体步骤参见[部署管理使用手册](部署管理使用手册.md)章节3执行部署任务 + +### 配置文件介绍 + +```/etc/yum.repos.d/openEuler.repo```是用来规定yum源地址的配置文件,该配置文件内容为: + +```shell +[OS] +name=OS +baseurl=http://repo.openeuler.org/openEuler-24.03-LTS/OS/$basearch/ +enabled=1 +gpgcheck=1 +gpgkey=http://repo.openeuler.org/openEuler-24.03-LTS/OS/$basearch/RPM-GPG-KEY-openEuler +``` + +### yang模型介绍 + +`/etc/yum.repos.d/openEuler.repo`采用yang语言进行表示,参见`gala-ragdoll/yang_modules/openEuler-logos-openEuler.repo.yang`; +其中增加了三个拓展字段: + +| 拓展字段名称 | 拓展字段格式 | 样例 | +| ------------ | ---------------------- | ----------------------------------------- | +| path | OS类型:配置文件的路径 | openEuler:/etc/yum.repos.d/openEuler.repo | +| type | 配置文件类型 | ini、key-value、json、text等 | +| spacer | 配置项和配置值的中间键 | “ ”、“=”、“:”等 | + +附:yang语言的学习地址: + +### 通过配置溯源创建域 + +#### 查看配置文件 + +gala-ragdoll中存在配置溯源的配置文件 + +```shell +[root@openeuler-development-1-1drnd ~]# cat /etc/ragdoll/gala-ragdoll.conf +[git] // 定义当前的git信息:包括git仓的目录和用户信息 +git_dir = "/home/confTraceTestConf" +user_name = "user" +user_email = "email" + +[collect] // A-OPS 对外提供的collect接口 +collect_address = "http://192.168.0.0:11111" +collect_api = "/manage/config/collect" + +[ragdoll] +port = 11114 + +``` + +#### 创建配置域 + +![](./figures/配置溯源/chuangjianyewuyu.png) + +#### 添加配置域纳管node + +![](./figures/配置溯源/tianjianode.png) + +#### 添加配置域配置 + +![](./figures/配置溯源/xinzengpeizhi.png) + +#### 查询预期配置 + +![](./figures/配置溯源/chakanyuqi.png) + +#### 删除配置 + +![](./figures/配置溯源/shanchupeizhi.png) + +#### 查询实际配置 + +![](./figures/配置溯源/chaxunshijipeizhi.png) + +#### 配置校验 + +![](./figures/配置溯源/zhuangtaichaxun.png) + +#### 配置同步 + +![](./figures/配置溯源/peizhitongbu.png) + +#### 配置文件追溯 + +##### 打开监控开关 + +![](./figures/配置溯源/chuangjianyewuyu.png) + +##### 配置文件修改记录追溯 + +![](./figures/配置溯源/conf_file_trace.png) diff --git 
"a/docs/zh/docs/A-Tune/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/A-Tune/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index 3bf95e916c14fd7d963f947fac97da32d02aa652..2b9bb426ebb658d6c6f714cb8f47eff78c6274ac 100644 --- "a/docs/zh/docs/A-Tune/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/A-Tune/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -1,38 +1,23 @@ # 安装与部署 -本章介绍如何安装和部署A-Tune。 - -- [安装与部署](#安装与部署) - - [软硬件要求](#软硬件要求) - - [环境准备](#环境准备) - - [安装A-Tune](#安装a-tune) - - [安装模式介绍](#安装模式介绍) - - [安装操作](#安装操作) - - [部署A-Tune](#部署a-tune) - - [配置介绍](#配置介绍) - - [启动A-Tune](#启动a-tune) - - [启动A-Tune engine](#启动a-tune-engine) - +本章介绍如何安装和部署A-Tune。 ## 软硬件要求 ### 硬件要求 -- 鲲鹏920处理器 - -### 软件要求 - -- 操作系统:openEuler 21.03 +- 鲲鹏920处理器 ## 环境准备 -- 安装openEuler系统,安装方法参考 《[安装指南](../Installation/installation.md)》。 +- 安装openEuler系统,安装方法参考 《[安装指南](../Installation/installation.md)》。 -- 安装A-Tune需要使用root权限。 +- 安装A-Tune需要使用root权限。 ## 安装A-Tune 本节介绍A-Tune的安装模式和安装方法。 + ### 安装模式介绍 A-Tune支持单机模式、分布式模式安装和集群模式安装: @@ -61,8 +46,9 @@ A-Tune支持单机模式、分布式模式安装和集群模式安装: 1. 挂载openEuler的iso文件。 ``` - # mount openEuler-22.03-LTS-everything-x86_64-dvd.iso /mnt + # mount openEuler-{version}-everything-x86_64-dvd.iso /mnt ``` + 请安装everything的iso。 2. 配置本地yum源。 @@ -87,10 +73,9 @@ A-Tune支持单机模式、分布式模式安装和集群模式安装: # rpm --import /mnt/RPM-GPG-KEY-openEuler ``` - 4. 
安装A-Tune服务端。 - >![](./public_sys-resources/icon-note.gif) **说明:** + >![](./public_sys-resources/icon-note.gif) **说明:** >本步骤会同时安装服务端和客户端软件包,对于单机部署模式,请跳过**步骤5**。 ``` @@ -114,52 +99,52 @@ A-Tune支持单机模式、分布式模式安装和集群模式安装: atune-engine-xxx ``` - ## 部署A-Tune 本节介绍A-Tune的配置部署。 + ### 配置介绍 A-Tune配置文件/etc/atuned/atuned.cnf的配置项说明如下: - A-Tune服务启动配置(可根据需要进行修改)。 - - protocol:系统gRPC服务使用的协议,unix或tcp,unix为本地socket通信方式,tcp为socket监听端口方式。默认为unix。 - - address:系统gRPC服务的侦听地址,默认为unix socket,若为分布式部署,需修改为侦听的ip地址。 - - port:系统gRPC服务的侦听端口,范围为0\~65535未使用的端口。如果protocol配置是unix,则不需要配置。 - - connect:若为集群部署时,A-Tune所在节点的ip列表,ip地址以逗号分隔。 - - rest_host:系统rest service的侦听地址,默认为localhost。 - - rest_port:系统rest service的侦听端口,范围为0~65535未使用的端口,默认为8383。 - - engine_host:与系统atune engine service链接的地址。 - - engine_port:与系统atune engine service链接的端口。 - - sample_num:系统执行analysis流程时采集样本的数量,默认为20。 - - interval:系统执行analysis流程时采集样本的间隔时间,默认为5s。 - - grpc_tls:系统gRPC的SSL/TLS证书校验开关,默认不开启。开启grpc_tls后,atune-adm命令在使用前需要设置以下环境变量方可与服务端进行通讯: - - export ATUNE_TLS=yes - - export ATUNED_CACERT=<客户端CA证书路径> - - export ATUNED_CLIENTCERT=<客户端证书路径> - - export ATUNED_CLIENTKEY=<客户端密钥路径> - - export ATUNED_SERVERCN=server - - tlsservercafile:gRPC服务端CA证书路径。 - - tlsservercertfile:gRPC服务端证书路径。 - - tlsserverkeyfile:gRPC服务端密钥路径。 - - rest_tls:系统rest service的SSL/TLS证书校验开关,默认开启。 - - tlsrestcacertfile:系统rest service的服务端CA证书路径。 - - tlsrestservercertfile:系统rest service的服务端证书路径 - - tlsrestserverkeyfile:系统rest service的服务端密钥路径。 - - engine_tls:系统atune engine service的SSL/TLS证书校验开关,默认开启。 - - tlsenginecacertfile:系统atune engine service的客户端CA证书路径。 - - tlsengineclientcertfile:系统atune engine service的客户端证书路径 - - tlsengineclientkeyfile:系统atune engine service的客户端密钥路径 + - protocol:系统gRPC服务使用的协议,unix或tcp,unix为本地socket通信方式,tcp为socket监听端口方式。默认为unix。 + - address:系统gRPC服务的侦听地址,默认为unix socket,若为分布式部署,需修改为侦听的ip地址。 + - port:系统gRPC服务的侦听端口,范围为0\~65535未使用的端口。如果protocol配置是unix,则不需要配置。 + - connect:若为集群部署时,A-Tune所在节点的ip列表,ip地址以逗号分隔。 + - rest_host:系统rest service的侦听地址,默认为localhost。 + - 
rest_port:系统rest service的侦听端口,范围为0~65535未使用的端口,默认为8383。 + - engine_host:与系统atune engine service链接的地址。 + - engine_port:与系统atune engine service链接的端口。 + - sample_num:系统执行analysis流程时采集样本的数量,默认为20。 + - interval:系统执行analysis流程时采集样本的间隔时间,默认为5s。 + - grpc_tls:系统gRPC的SSL/TLS证书校验开关,默认不开启。开启grpc_tls后,atune-adm命令在使用前需要设置以下环境变量方可与服务端进行通讯: + - export ATUNE_TLS=yes + - export ATUNED_CACERT=<客户端CA证书路径> + - export ATUNED_CLIENTCERT=<客户端证书路径> + - export ATUNED_CLIENTKEY=<客户端密钥路径> + - export ATUNED_SERVERCN=server + - tlsservercafile:gRPC服务端CA证书路径。 + - tlsservercertfile:gRPC服务端证书路径。 + - tlsserverkeyfile:gRPC服务端密钥路径。 + - rest_tls:系统rest service的SSL/TLS证书校验开关,默认开启。 + - tlsrestcacertfile:系统rest service的服务端CA证书路径。 + - tlsrestservercertfile:系统rest service的服务端证书路径 + - tlsrestserverkeyfile:系统rest service的服务端密钥路径。 + - engine_tls:系统atune engine service的SSL/TLS证书校验开关,默认开启。 + - tlsenginecacertfile:系统atune engine service的客户端CA证书路径。 + - tlsengineclientcertfile:系统atune engine service的客户端证书路径 + - tlsengineclientkeyfile:系统atune engine service的客户端密钥路径 - system信息 system为系统执行相关的优化需要用到的参数信息,必须根据系统实际情况进行修改。 - - disk:执行analysis流程时需要采集的对应磁盘的信息或执行磁盘相关优化时需要指定的磁盘。 - - network:执行analysis时需要采集的对应的网卡的信息或执行网卡相关优化时需要指定的网卡。 + - disk:执行analysis流程时需要采集的对应磁盘的信息或执行磁盘相关优化时需要指定的磁盘。 + - network:执行analysis时需要采集的对应的网卡的信息或执行网卡相关优化时需要指定的网卡。 - - user:执行ulimit相关优化时用到的用户名。目前只支持root用户。 + - user:执行ulimit相关优化时用到的用户名。目前只支持root用户。 - 日志信息 @@ -173,9 +158,8 @@ A-Tune配置文件/etc/atuned/atuned.cnf的配置项说明如下: tuning为系统进行离线调优时需要用到的参数信息。 - - noise:高斯噪声的评估值。 - - sel_feature:控制离线调优参数重要性排名输出的开关,默认关闭。 - + - noise:高斯噪声的评估值。 + - sel_feature:控制离线调优参数重要性排名输出的开关,默认关闭。 ### 配置示例 @@ -275,12 +259,12 @@ A-Tune engine配置文件/etc/atuned/engine.cnf的配置项说明如下: - A-Tune engine服务启动配置(可根据需要进行修改)。 - - engine_host:系统atune engine service的侦听地址,默认为localhost。 - - engine_port:系统atune engine service的侦听端口,范围为0~65535未使用的端口,默认为3838。 - - engine_tls:系统atune engine service的SSL/TLS证书校验开关,默认开启。 - - tlsenginecacertfile:系统atune engine service的服务端CA证书路径。 - - tlsengineservercertfile:系统atune 
engine service的服务端证书路径 - - tlsengineserverkeyfile:系统atune engine service的服务端密钥路径。 + - engine_host:系统atune engine service的侦听地址,默认为localhost。 + - engine_port:系统atune engine service的侦听端口,范围为0~65535未使用的端口,默认为3838。 + - engine_tls:系统atune engine service的SSL/TLS证书校验开关,默认开启。 + - tlsenginecacertfile:系统atune engine service的服务端CA证书路径。 + - tlsengineservercertfile:系统atune engine service的服务端证书路径 + - tlsengineserverkeyfile:系统atune engine service的服务端密钥路径。 - 日志信息 diff --git "a/docs/zh/docs/AI/AI\345\244\247\346\250\241\345\236\213\346\234\215\345\212\241\351\225\234\345\203\217\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/AI\345\244\247\346\250\241\345\236\213\346\234\215\345\212\241\351\225\234\345\203\217\344\275\277\347\224\250\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..c7c492b104b74e25ac87980c2d8580885a43df0e --- /dev/null +++ "b/docs/zh/docs/AI/AI\345\244\247\346\250\241\345\236\213\346\234\215\345\212\241\351\225\234\345\203\217\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -0,0 +1,94 @@ +# 支持百川、chatglm、星火等AI大模型的容器化封装 + +已配好相关依赖,分为CPU和GPU版本,降低使用门槛,开箱即用。 + +## 拉取镜像(CPU版本) + +```bash +docker pull openeuler/llm-server:1.0.0-oe2203sp3 +``` + +## 拉取镜像(GPU版本) + +```bash +docker pull icewangds/llm-server:1.0.0 +``` + +## 下载模型, 并转换为gguf格式 + +```bash +# 安装huggingface +pip install huggingface-hub + +# 下载你想要部署的模型 +export HF_ENDPOINT=https://hf-mirror.com +huggingface-cli download --resume-download baichuan-inc/Baichuan2-13B-Chat --local-dir /root/models/Baichuan2-13B-Chat --local-dir-use-symlinks False + +# gguf格式转换 +cd /root/models/ +git clone https://github.com/ggerganov/llama.cpp.git +python llama.cpp/convert-hf-to-gguf.py ./Baichuan2-13B-Chat +# 生成的gguf格式的模型路径 /root/models/Baichuan2-13B-Chat/ggml-model-f16.gguf +``` + +## 启动方式 + +需要Docker v25.0.0及以上版本。 + +若使用GPU镜像,需要OS上安装nvidia-container-toolkit,安装方式见。 + +docker-compose.yaml: + +```yaml +version: '3' +services: + model: + image: : #镜像名称与tag + 
restart: on-failure:5 + ports: + - 8001:8000 #监听端口号,修改“8001”以更换端口 + volumes: + - /root/models:/models # 大模型挂载目录 + environment: + - MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf # 容器内的模型文件路径 + - MODEL_NAME=baichuan13b # 自定义模型名称 + - KEY=sk-12345678 # 自定义API Key + - CONTEXT=8192 # 上下文大小 + - THREADS=8 # CPU线程数,仅CPU部署时需要 + deploy: # 指定GPU资源, 仅GPU部署时需要 + resources: + reservations: + devices: + - driver: nvidia + count: all + capabilities: [gpu] +``` + +```bash +docker-compose -f docker-compose.yaml up +``` + +docker run: + +```text +cpu部署: docker run -d --restart on-failure:5 -p 8001:8000 -v /root/models:/models -e MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf -e MODEL_NAME=baichuan13b -e KEY=sk-12345678 openeuler/llm-server:1.0.0-oe2203sp3 + +gpu部署: docker run -d --gpus all --restart on-failure:5 -p 8001:8000 -v /root/models:/models -e MODEL=/models/Baichuan2-13B-Chat/ggml-model-f16.gguf -e MODEL_NAME=baichuan13b -e KEY=sk-12345678 icewangds/llm-server:1.0.0 +``` + +## 调用大模型接口测试,成功返回则表示大模型服务已部署成功 + +```bash +curl -X POST http://127.0.0.1:8001/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer sk-12345678" \ + -d '{ + "model": "baichuan13b", + "messages": [ + {"role": "system", "content": "你是一个社区助手,请回答以下问题。"}, + {"role": "user", "content": "你是谁?"} + ], + "stream": false, + "max_tokens": 1024 + }' +``` diff --git "a/docs/zh/docs/AI/AI\345\256\271\345\231\250\351\225\234\345\203\217\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/AI\345\256\271\345\231\250\351\225\234\345\203\217\347\224\250\346\210\267\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..53a19a7a261a6bd79e9664204b1cddcabc601cb2 --- /dev/null +++ "b/docs/zh/docs/AI/AI\345\256\271\345\231\250\351\225\234\345\203\217\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -0,0 +1,110 @@ +# openEuler AI 容器镜像用户指南 + +## 简介 + +openEuler AI 容器镜像封装了不同硬件算力的 SDK 以及 AI 
框架、大模型应用等软件,用户只需要在目标环境中加载镜像并启动容器,即可进行 AI 应用开发或使用,大大减少了应用部署和环境配置的时间,提升效率。 + +## 获取镜像 + +目前,openEuler 已发布支持 Ascend 和 NVIDIA 平台的容器镜像,获取路径如下: + +- [openeuler/cann](https://hub.docker.com/r/openeuler/cann) +存放 SDK 类镜像,在 openEuler 基础镜像之上安装 CANN 系列软件,适用于 Ascend 环境。 + +- [openeuler/cuda](https://hub.docker.com/r/openeuler/cuda) +存放 SDK 类镜像,在 openEuler 基础镜像之上安装 CUDA 系列软件,适用于 NVIDIA 环境。 + +- [openeuler/pytorch](https://hub.docker.com/r/openeuler/pytorch) +存放 AI 框架类镜像,在 SDK 镜像基础之上安装 PyTorch,根据安装的 SDK 软件内容区分适用平台。 + +- [openeuler/tensorflow](https://hub.docker.com/r/openeuler/tensorflow) +存放 AI 框架类镜像,在 SDK 镜像基础之上安装 TensorFlow,根据安装的 SDK 软件内容区分适用平台。 + +- [openeuler/llm](https://hub.docker.com/r/openeuler/llm) +存放模型应用类镜像,在 AI 框架镜像之上包含特定大模型及工具链,根据安装的 SDK 软件内容区分适用平台。 + +详细的 AI 容器镜像分类和镜像 tag 的规范说明见[oEEP-0014](https://gitee.com/openeuler/TC/blob/master/oEEP/oEEP-0014%20openEuler%20AI容器镜像软件栈规范.md)。 + +由于 AI 容器镜像的体积一般较大,推荐用户在启动容器前先通过如下命令将镜像拉取到开发环境中。 + +```sh +docker pull image:tag +``` + +其中,`image`为仓库名,如`openeuler/cann`,`tag`为目标镜像的 TAG,待镜像拉取完成后即可启动容器。注意,使用`docker pull`命令需按照下文方法安装`docker`软件。 + +## 启动容器 + +1. 在环境中安装`docker`,官方安装方法见[Install Docker Engine](https://docs.docker.com/engine/install/),也可直接通过如下命令进行安装。 + + ```sh + yum install -y docker + ``` + + 或 + + ```sh + apt-get install -y docker + ``` + +2. 
NVIDIA环境安装`nvidia-container` + + 1)配置yum或apt repo + - 使用yum安装时,执行: + + ```sh + curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \ + sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo + ``` + + - 使用apt安装时,执行: + + ```sh + curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg + ``` + + ```sh + curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \ + sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \ + sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list + ``` + + 2)安装`nvidia-container-toolkit`,`nvidia-container-runtime`,执行: + + ```sh + # yum安装 + yum install -y nvidia-container-toolkit nvidia-container-runtime + ``` + + ```sh + # apt安装 + apt-get install -y nvidia-container-toolkit nvidia-container-runtime + ``` + + 3)配置docker + + ```sh + nvidia-ctk runtime configure --runtime=docker + systemctl restart docker + ``` + + 非NVIDIA环境不执行此步骤。 + +3. 确保环境中安装`driver`及`firmware`,用户可从[NVIDIA](https://www.nvidia.com/)或[Ascend](https://www.hiascend.com/)官网获取正确版本进行安装。安装完成后 Ascend 平台使用`npu-smi`命令、NVIDIA 平台使用`nvidia-smi`进行测试,正确显示硬件信息则说明安装正常。 + +4. 
完成上述操作后,即可使用`docker run`命令启动容器。 + +```sh +# Ascend环境启动容器 +docker run --rm --network host \ + --device /dev/davinci0:/dev/davinci0 \ + --device /dev/davinci_manager --device /dev/devmm_svm --device /dev/hisi_hdc \ + -v /usr/local/dcmi:/usr/local/dcmi -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \ + -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \ + -ti image:tag +``` + +```sh +# NVIDIA环境启动容器 +docker run --gpus all -d -ti image:tag +``` diff --git a/docs/zh/docs/AI/openEuler_Copilot_System/README.md b/docs/zh/docs/AI/openEuler_Copilot_System/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3bccd2fec515b2d0d405c7f6dbb4c6300e7f8347 --- /dev/null +++ b/docs/zh/docs/AI/openEuler_Copilot_System/README.md @@ -0,0 +1,44 @@ +# openEuler Copilot System + +## 功能描述 + +openEuler Copilot System 智能问答平台目前支持 Web 和智能 Shell 两个入口。 + +- Web 入口:操作简单,可咨询操作系统相关基础知识、openEuler 动态数据、openEuler 运维问题解决方案、openEuler 项目介绍与使用指导等。 +- 智能 Shell 入口:通过自然语言与 openEuler 交互,实现启发式运维。 + +## 应用场景 + +- 面向 openEuler 普通用户:深入了解 openEuler 相关知识和动态数据,比如咨询如何迁移到 openEuler。 +- 面向 openEuler 开发者:熟悉 openEuler 开发贡献流程、关键特性、相关项目的开发等知识。 +- 面向 openEuler 运维人员:熟悉 openEuler 常见或疑难问题的解决思路和方案、openEuler 系统管理知识和相关命令。 + +## 用户手册目录 + +### 部署手册 + +- [Web 端部署指南](./部署指南) + - [网络环境下部署指南](./部署指南/网络环境下部署指南.md) + - [无网络环境下部署指南](./部署指南/无网络环境下部署指南.md) + +- [插件部署指南](./部署指南/插件部署指南) + - [智能调优](./部署指南/插件部署指南/智能调优/插件—智能调优部署指南.md) + - [智能诊断](./部署指南/插件部署指南/智能诊断/插件—智能诊断部署指南.md) + - [AI容器栈](./部署指南/插件部署指南/AI容器栈/插件—AI容器栈部署指南.md) + +- [本地资产库构建指南](./部署指南/本地资产库构建指南.md) + +### 使用手册 + +- [管理员:知识库管理](./使用指南/知识库管理/witChainD使用指南.md) + +- [Web 端使用手册](./使用指南/线上服务/前言.md) + - [注册与登录](./使用指南/线上服务/注册与登录.md) + - [智能问答](./使用指南/线上服务/智能问答使用指南.md) + - [智能插件](./使用指南/线上服务/智能插件简介.md) + +- [智能 Shell 使用手册](./使用指南/命令行客户端/命令行助手使用指南.md) + - [准备工作:获取 API Key](./使用指南/命令行客户端/获取%20API%20Key.md) + - [智能插件](./使用指南/命令行客户端/命令行助手使用指南.md#智能插件) + - [智能调优](./使用指南/命令行客户端/智能调优.md) + - [智能诊断](./使用指南/命令行客户端/智能诊断.md) diff --git
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-ask.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-ask.png" new file mode 100644 index 0000000000000000000000000000000000000000..00d5cf5ecf894dd62366ec086bf96eae532f0b5d Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-ask.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-continue-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-continue-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..f30f9fe7a015e775742bc184b8ac75790dc482fa Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-continue-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-continue.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-continue.png" new file mode 100644 index 
0000000000000000000000000000000000000000..7e4801504fd53fab989574416e6220c4fa3f1d38 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-continue.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-exit.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-exit.png" new file mode 100644 index 0000000000000000000000000000000000000000..0bb81190a3039f6c5a311b365376ec230c1ad4b5 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-chat-exit.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-edit-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-edit-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..c5e6f8245e7d66cdbe5370f18d15a791a33a517a Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-edit-result.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-edit.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-edit.png" new file mode 100644 index 0000000000000000000000000000000000000000..bb6209373a6d2a1881728bee352e7c3b46cc91d7 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-edit.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-exec-multi-select.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-exec-multi-select.png" new file mode 100644 index 0000000000000000000000000000000000000000..2dda108a39af54fc15a4ff8c0dca107de38b9cf0 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-exec-multi-select.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-exec-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-exec-result.png" new file mode 100644 index 
0000000000000000000000000000000000000000..f4fff6a62b8b4220b52fdf55b133f2ba37850569 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-exec-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-explain-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-explain-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..707dd36aa7c7eadae4f29254cf5fc18ce877f597 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-explain-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-explain-select.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-explain-select.png" new file mode 100644 index 0000000000000000000000000000000000000000..bf58b69e241ea11a6945f21e3fc69d22a401be2e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-explain-select.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-interact.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-interact.png" new file mode 100644 index 0000000000000000000000000000000000000000..00bb3a288fbd2fb962b08f34fbe90c733afe0343 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd-interact.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd.png" new file mode 100644 index 0000000000000000000000000000000000000000..619172c8ed60a7b536364944a306fbf76fcbfb1f Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-cmd.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-help.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-help.png" new file mode 100644 index 0000000000000000000000000000000000000000..97d0dedd3f7b1c749bc5fded471744923d766b8b Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-help.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-init.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-init.png" new file mode 100644 index 0000000000000000000000000000000000000000..bbb2257eb1ff2bfec36110409fc6c55a26386c9e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-init.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-detail.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-detail.png" new file mode 100644 index 0000000000000000000000000000000000000000..7bd624e025eaae4b77c603d88bf1b9ad5e235fe7 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-detail.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-detect.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-detect.png" new file mode 100644 index 0000000000000000000000000000000000000000..2b38259ff0c1c7045dbff9abf64f36a109a3377b Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-detect.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-profiling.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-profiling.png" new file mode 100644 index 0000000000000000000000000000000000000000..0e63c01f35dbc291f805b56de749eac09e0a079d Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-profiling.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-report.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-report.png" new file mode 100644 index 0000000000000000000000000000000000000000..c16f0184a2ad3d2468466b33d0e861d2a31bc4e2 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-report.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-switch-mode.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-switch-mode.png" new file mode 100644 index 0000000000000000000000000000000000000000..165c6c453353b70c3e1e2cb07d7f43d5ee3525e3 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-diagnose-switch-mode.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..3e3f45a974a0700d209f7d30af89eb2050a392d6 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-select.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-select.png" new file mode 100644 index 0000000000000000000000000000000000000000..13959203c77eaa9f41051897cf9e847ff3642a8a Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-select.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-metrics-collect.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-metrics-collect.png" new file mode 100644 index 0000000000000000000000000000000000000000..4d5678b7f77b05d48552fcb9656f4a4372dbbe61 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-metrics-collect.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-report.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-report.png" new file mode 100644 index 0000000000000000000000000000000000000000..01daaa9a84c13158a95afddffeb8a7e3303f1e76 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-report.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-exec.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-exec.png" new file mode 100644 index 0000000000000000000000000000000000000000..0b694c3fba6918ef39cca977b2072b2913d12b95 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-exec.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-gen.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-gen.png" new file mode 100644 index 0000000000000000000000000000000000000000..6e95551767e213f59669d03fd4cceba05801a983 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-gen.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-view.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-view.png" new file mode 100644 index 0000000000000000000000000000000000000000..c82c77bf6f4e4e19f400395aaadc9f99dc8d373c Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-script-view.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-switch-mode.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-switch-mode.png" new file mode 100644 index 0000000000000000000000000000000000000000..0f06c803ea3621a0f4fb83bbbe731e2bb4bba788 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin-tuning-switch-mode.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin.png" new file mode 100644 index 
0000000000000000000000000000000000000000..4c1afd306a6aee029f5bda38aa7b1fce57227e31 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/pictures/shell-plugin.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\345\221\275\344\273\244\350\241\214\345\212\251\346\211\213\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\345\221\275\344\273\244\350\241\214\345\212\251\346\211\213\344\275\277\347\224\250\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..751a5f48d6fcdefaa5b2ed13b56915b4459d600c --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\345\221\275\344\273\244\350\241\214\345\212\251\346\211\213\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -0,0 +1,169 @@ +# 命令行助手使用指南 + +## 简介 + +openEuler Copilot System 命令行助手是一个命令行(Shell)AI 助手,您可以通过它来快速生成 Shell 命令并执行,从而提高您的工作效率。除此之外,基于 Gitee AI 在线服务的标准版本还内置了 openEuler 的相关知识,可以助力您学习与使用 openEuler 操作系统。 + +## 环境要求 + +- 操作系统:openEuler 22.03 LTS SP3,或者 openEuler 24.03 LTS 及以上版本 +- 命令行软件: + - Linux 桌面环境:支持 GNOME、KDE、DDE 等桌面环境的内置终端 + - 远程 SSH 连接:支持兼容 xterm-256color 与 UTF-8 字符集的终端 + +## 安装 + +openEuler Copilot System 命令行助手支持通过 OEPKGS 仓库进行安装。 + +### 配置 OEPKGS 仓库 + +```bash +sudo dnf config-manager --add-repo https://repo.oepkgs.net/openeuler/rpm/`sed 's/release //;s/[()]//g;s/ /-/g' /etc/openEuler-release`/extras/`uname -m` +``` + +```bash +sudo dnf clean all +``` + +```bash +sudo dnf makecache +``` + +### 安装命令行助手
+```bash +sudo dnf install eulercopilot-cli +``` + +若遇到 `Error: GPG check FAILED` 错误,使用 `--nogpgcheck` 跳过检查。 + +```bash +sudo dnf install --nogpgcheck eulercopilot-cli +``` + +## 初始化 + +```bash +copilot --init +``` + +然后根据提示输入 API Key 完成配置。 + +![shell-init](./pictures/shell-init.png) + +初次使用前请先退出终端或重新连接 SSH 会话使配置生效。 + +- **查看助手帮助页面** + + ```bash + copilot --help + ``` + + ![shell-help](./pictures/shell-help.png) + +## 使用 + +在终端中输入问题,按下 `Ctrl + O` 提问。 + +### 快捷键 + +- 输入自然语言问题后,按下 `Ctrl + O` 可以直接向 AI 提问。 +- 直接按下 `Ctrl + O` 可以自动填充命令前缀 `copilot`,输入参数后按下 `Enter` 即可执行。 + +### 智能问答 + +命令行助手初始化完成后,默认处于智能问答模式。 +命令提示符**左上角**会显示当前模式。 +若当前模式不是“智能问答”,执行 `copilot -c` (`copilot --chat`) 切换到智能问答模式。 + +![chat-ask](./pictures/shell-chat-ask.png) + +AI 回答完毕后,会根据历史问答生成推荐问题,您可以复制、粘贴到命令行中进行追问。输入追问的问题后,按下 `Enter` 提问。 + +![chat-next](./pictures/shell-chat-continue.png) + +![chat-next-result](./pictures/shell-chat-continue-result.png) + +智能问答模式下支持连续追问,每次追问最多可以关联3条历史问答的上下文。 + +输入 `exit` 可以退出智能问答模式,回到 Linux 命令行。 + +![chat-exit](./pictures/shell-chat-exit.png) + +- 若问答过程中遇到程序错误,可以按下 `Ctrl + C` 立即退出当前问答,再尝试重新提问。 + +### Shell 命令 + +AI 会根据您的问题返回 Shell 命令,openEuler Copilot System 命令行助手可以解释、编辑或执行这些命令,并显示命令执行结果。 + +![shell-cmd](./pictures/shell-cmd.png) + +命令行助手会自动提取 AI 回答中的命令,并显示相关操作。您可以通过键盘上下键选择操作,按下 `Enter` 确认。 + +![shell-cmd-interact](./pictures/shell-cmd-interact.png) + +#### 解释 + +如果 AI 仅返回了一条命令,选择解释后会直接请求 AI 解释命令,并显示回答。 +若 AI 回答了多条命令,选择后会显示命令列表,您每次可以选择**一条**请求 AI 解释。 + +![shell-cmd-explain-select](./pictures/shell-cmd-explain-select.png) + +完成解释后,您可以继续选择其他操作。 + +![shell-cmd-explain-result](./pictures/shell-cmd-explain-result.png) + +#### 编辑 + +![shell-cmd-edit](./pictures/shell-cmd-edit.png) + +选择一条命令进行编辑,编辑完成后按下 `Enter` 确认。 + +![shell-cmd-edit-result](./pictures/shell-cmd-edit-result.png) + +完成编辑后,您可以继续编辑其他命令或选择其他操作。 + +#### 执行 + +如果 AI 仅返回了一条命令,选择执行后会直接执行命令,并显示执行结果。 +若 AI 回答了多条命令,选择后会显示命令列表,您每次可以选择**多条**命令来执行。 + +您可以通过键盘上下键移动光标,按下 `空格键` 选择命令,按下 `Enter` 执行所选命令。 +被选中的命令会显示**蓝色高亮**,如图所示。 + 
+![shell-cmd-exec-multi-select](./pictures/shell-cmd-exec-multi-select.png) + +若不选择任何命令,直接按下 `Enter`,则会跳过执行命令,直接进入下一轮问答。 + +按下 `Enter` 后,被选中的命令会从上到下依次执行。 + +![shell-cmd-exec-result](./pictures/shell-cmd-exec-result.png) + +若执行过程中遇到错误,命令行助手会显示错误信息,并**终止执行命令**,进入下一轮问答。 +您可以在下一轮问答中提示 AI 更正命令,或要求 AI 重新生成命令。 + +### 智能插件 + +在 Linux 命令行中执行 `copilot -p` (`copilot --plugin`) 切换到智能插件模式。 + +![shell-plugin](./pictures/shell-plugin.png) + +输入问题并按下 `Ctrl + O` 提问后,从列表中选择插件,按下 `Enter` 调用插件回答问题。 + +![shell-plugin-select](./pictures/shell-plugin-select.png) + +![shell-plugin-result](./pictures/shell-plugin-result.png) + +## 卸载 + +```bash +sudo dnf remove eulercopilot-cli +``` + +然后使用以下命令删除配置文件。 + +```bash +rm ~/.config/eulercopilot/config.json +``` + +卸载完成后请重启终端或重新连接 SSH 会话使配置还原。 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\346\231\272\350\203\275\350\257\212\346\226\255.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\346\231\272\350\203\275\350\257\212\346\226\255.md" new file mode 100644 index 0000000000000000000000000000000000000000..eb999cb5483620450b2e2aea77a818382aeca2a4 --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\346\231\272\350\203\275\350\257\212\346\226\255.md" @@ -0,0 +1,50 @@ +# 智能插件:智能诊断 + +部署智能诊断工具后,可以通过 openEuler Copilot System 智能体框架实现对本机进行诊断。 +在智能诊断模式提问,智能体框架服务可以调用本机的诊断工具诊断异常状况、分析并生成报告。 + +## 操作步骤 + +**步骤1** 切换到“智能插件”模式 + +```bash +copilot -p +``` + +![切换到智能插件模式](./pictures/shell-plugin-diagnose-switch-mode.png) + +**步骤2** 异常事件检测 + +```bash +帮我进行异常事件检测 +``` + +按下 `Ctrl + O` 键提问,然后在插件列表中选择“智能诊断”。 + +![异常事件检测](./pictures/shell-plugin-diagnose-detect.png) + +**步骤3** 查看异常事件详情 + 
+```bash +查看 XXX 容器的异常事件详情 +``` + +![查看异常事件详情](./pictures/shell-plugin-diagnose-detail.png) + +**步骤4** 执行异常事件分析 + +```bash +请对 XXX 容器的 XXX 指标执行 profiling 分析 +``` + +![异常事件分析](./pictures/shell-plugin-diagnose-profiling.png) + +**步骤5** 查看异常事件分析报告 + +等待 5 至 10 分钟,然后查看分析报告。 + +```bash +查看对应的 profiling 报告 +``` + +![查看分析报告](./pictures/shell-plugin-diagnose-report.png) diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\346\231\272\350\203\275\350\260\203\344\274\230.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\346\231\272\350\203\275\350\260\203\344\274\230.md" new file mode 100644 index 0000000000000000000000000000000000000000..b5c40581668ae4f6074043e62a93b2c4b240e5b3 --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\346\231\272\350\203\275\350\260\203\344\274\230.md" @@ -0,0 +1,53 @@ +# 智能插件:智能调优 + +部署智能调优工具后,可以通过 openEuler Copilot System 智能体框架实现对本机进行调优。 +在智能调优模式提问,智能体框架服务可以调用本机的调优工具采集性能指标,并生成性能分析报告和性能优化建议。 + +## 操作步骤 + +**步骤1** 切换到“智能调优”模式 + +```bash +copilot -t +``` + +![切换到智能调优模式](./pictures/shell-plugin-tuning-switch-mode.png) + +**步骤2** 采集性能指标 + +```bash +帮我进行性能指标采集 +``` + +![性能指标采集](./pictures/shell-plugin-tuning-metrics-collect.png) + +**步骤3** 生成性能分析报告 + +```bash +帮我生成性能分析报告 +``` + +![性能分析报告](./pictures/shell-plugin-tuning-report.png) + +**步骤4** 生成性能优化脚本 + +```bash +请生成性能优化脚本 +``` + +![性能优化脚本](./pictures/shell-plugin-tuning-script-gen.png) + +**步骤5** 选择“执行命令”,运行优化脚本 + +![执行优化脚本](./pictures/shell-plugin-tuning-script-exec.png) + +- 脚本内容如图: + ![优化脚本内容](./pictures/shell-plugin-tuning-script-view.png) + +## 远程调优 + +如果需要对其他机器进行远程调优,请在上文示例的问题前面加上对应机器的 IP 地址。 + +例如:`请对 192.168.1.100 
这台机器进行性能指标采集。` + +进行远程调优前请确保目标机器已部署智能调优工具,同时请确保 openEuler Copilot System 智能体框架能够访问目标机器。 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\350\216\267\345\217\226 API Key.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\350\216\267\345\217\226 API Key.md" new file mode 100644 index 0000000000000000000000000000000000000000..d6f9f21eafcaaaedb938426cbe18d9346e5e1617 --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\345\221\275\344\273\244\350\241\214\345\256\242\346\210\267\347\253\257/\350\216\267\345\217\226 API Key.md" @@ -0,0 +1,28 @@ +# 获取 API Key + +## 前言 + +openEuler Copilot System 命令行助手使用 API Key 来验证用户身份,并获取 API 访问权限。 +因此,开始使用前,您需要先获取 API Key。 + +## 注意事项 + +- 请妥善保管您的 API Key,不要泄露给他人。 +- API Key 仅用于命令行助手与 DevStation 桌面端,不用于其他用途。 +- 每位用户仅可拥有一个 API Key,重复创建 API Key 将导致旧密钥失效。 +- API Key 仅在创建时显示一次,请务必及时保存。若密钥丢失,您需要重新创建。 +- 若您在使用过程中遇到“请求过于频繁”的错误,您的 API Key 可能已被他人使用,请及时前往官网刷新或撤销 API Key。 + +## 获取方法 + +1. 登录 openEuler Copilot System 网页端。 +2. 点击右上角头像,选择“API KEY”。 +3. 点击“新建”按钮。 +4. **请立即保存 API Key,它仅在创建时显示一次,请勿泄露给他人。** + +## 管理 API Key + +1. 登录 openEuler Copilot System 网页端。 +2. 点击右上角头像,选择“API KEY”。 +3. 
点击“刷新”按钮,刷新 API Key;点击“撤销”按钮,撤销 API Key。 + - 刷新 API Key 后,旧密钥失效,请立即保存新生成的 API Key。 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\345\257\274\345\205\245\346\226\207\346\241\243.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\345\257\274\345\205\245\346\226\207\346\241\243.png" new file mode 100644 index 0000000000000000000000000000000000000000..3d6818a10a728cd8bf7bd15b6f4f1a8e7817e9c4 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\345\257\274\345\205\245\346\226\207\346\241\243.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\345\257\274\345\207\272\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\345\257\274\345\207\272\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..73f3d3b92800e51bf00c9b71c82d76cabd5352de Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\345\257\274\345\207\272\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\211\271\351\207\217\345\220\257\347\224\250.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\211\271\351\207\217\345\220\257\347\224\250.png" new file mode 100644 index 0000000000000000000000000000000000000000..3cf960c771ae2ce533f311a55584734c7853f07c Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\211\271\351\207\217\345\220\257\347\224\250.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\211\271\351\207\217\345\257\274\345\205\245\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\211\271\351\207\217\345\257\274\345\205\245\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..e08bc79f363a862e2a0f3780487c5614c6415b64 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\211\271\351\207\217\345\257\274\345\205\245\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\220\234\347\264\242\346\226\207\346\241\243.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\220\234\347\264\242\346\226\207\346\241\243.png" new file mode 100644 index 
0000000000000000000000000000000000000000..7f71660723fcc451152b73e12a0c630604efa390 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\220\234\347\264\242\346\226\207\346\241\243.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\234\254\345\235\227\347\273\223\346\236\234\351\242\204\350\247\210.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\234\254\345\235\227\347\273\223\346\236\234\351\242\204\350\247\210.png" new file mode 100644 index 0000000000000000000000000000000000000000..05e003a48f4fb0a452448b0dc8bf74b598e6936e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\234\254\345\235\227\347\273\223\346\236\234\351\242\204\350\247\210.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\347\256\241\347\220\206\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\347\256\241\347\220\206\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..c17ea11b55489c10fa52eae2e9d8915313e3d39e Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\347\256\241\347\220\206\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\350\247\243\346\236\220.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\350\247\243\346\236\220.png" new file mode 100644 index 0000000000000000000000000000000000000000..2524ce76edb826092b5dc9611d64537bed08b3ec Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\350\247\243\346\236\220.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\350\247\243\346\236\2202.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\350\247\243\346\236\2202.png" new file mode 100644 index 0000000000000000000000000000000000000000..30dd2f5bef9b23c3dceb92b63817898076096a49 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\207\346\241\243\350\247\243\346\236\2202.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\260\345\242\236\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\260\345\242\236\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..d728d99741a03ff2f82e2c59bd424b848614aebe Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\226\260\345\242\236\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\250\241\345\236\213\351\205\215\347\275\256.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\250\241\345\236\213\351\205\215\347\275\256.png" new file mode 100644 index 0000000000000000000000000000000000000000..97a489cc7637416306a88394a3faa7fa47cf9b95 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\346\250\241\345\236\213\351\205\215\347\275\256.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\346\226\207\346\241\243\351\205\215\347\275\256.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\346\226\207\346\241\243\351\205\215\347\275\256.png" new file mode 100644 index 0000000000000000000000000000000000000000..bd0ed29ba5d6a4eb4dca5851b8469bd161f70300 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\346\226\207\346\241\243\351\205\215\347\275\256.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..3488720160efd58d2fd1f46046f04296f552b4d6 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\2230.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\2230.png" new file mode 100644 index 
0000000000000000000000000000000000000000..64d0cc3f8637592007503972267751f2bbe87b96 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\2230.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\223\351\205\215\347\275\256.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\223\351\205\215\347\275\256.png" new file mode 100644 index 0000000000000000000000000000000000000000..e91dd94c7dc0a71e3f3ddee47c3d21926c27e619 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\347\274\226\350\276\221\350\265\204\344\272\247\345\272\223\351\205\215\347\275\256.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\350\247\243\346\236\220\345\256\214\346\210\220.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\350\247\243\346\236\220\345\256\214\346\210\220.png" new file mode 100644 index 0000000000000000000000000000000000000000..9e9968fc2e71ace3a58ec454e19b25bcd961f0c0 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\350\247\243\346\236\220\345\256\214\346\210\220.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\350\265\204\344\272\247\345\272\223\347\256\241\347\220\206\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\350\265\204\344\272\247\345\272\223\347\256\241\347\220\206\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..33b9a3e0852f8e5ae1e95da572dcfc13f6d59da2 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/pictures/\350\265\204\344\272\247\345\272\223\347\256\241\347\220\206\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/witChainD\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/witChainD\344\275\277\347\224\250\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..4759a57baa4e35ee529e9f4da70e1d1405612e6e --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\237\245\350\257\206\345\272\223\347\256\241\347\220\206/witChainD\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -0,0 +1,87 @@ +# witChainD 使用指南——知识库管理 + +完成 witChainD 部署之后,即可使用 
witChainD 进行知识库管理。 + +下文按页面维度介绍 witChainD 的各项功能。 + +## 1. 资产库管理界面 + +该页面为资产库管理界面,用户登录后将会进入该界面。 + +![资产库管理界面](./pictures/资产库管理界面.png) + +**支持操作:** + +- 配置模型:点击右上角的设置按钮,可以修改模型相关的配置。 + + ![模型配置](./pictures/模型配置.png) + +- 新增资产库:点击新建资产库按钮新建,支持自定义名称、描述、语言、嵌入模型、解析方法、文件分块大小、文档类别。注意:名称重复时会自动将名称修改为资产库 ID。 + + ![新增资产库](./pictures/新增资产库.png) + +- 编辑资产库:点击资产库的编辑按钮,支持修改名称、描述、语言、解析方法、文件分块大小、文档类别。注意:不能修改为重复名称。 + + ![编辑资产库](./pictures/编辑资产库0.png) + + ![编辑资产库](./pictures/编辑资产库.png) + +- 导出资产库:点击资产库的导出按钮导出,导出完成后,可通过任务列表中的下载任务将对应资产库下载到本地。 + + ![导出资产库](./pictures/导出资产库.png) + +- 批量导入资产库:点击批量导入,上传本地文件后选中即可导入。 + + ![批量导入资产库](./pictures/批量导入资产库.png) + +- 搜索资产库:在搜索栏中键入文本,可以搜索得到名称包含对应文本的资产库。 + +## 2. 文档管理界面 + +在资产库管理界面点击对应资产库,可以进入文档管理界面。 + +![文档管理界面](./pictures/文档管理界面.png) + +**支持操作:** + +- 导入文档:点击导入文档,从本地上传文件导入,导入后会自动以该资产库默认配置开始解析。 + + ![导入文档](./pictures/导入文档.png) + +- 解析文档:点击操作中的解析,对文档进行解析。也可以选中多个文档批量解析。 + + ![文档解析](./pictures/文档解析.png) + + ![文档解析2](./pictures/文档解析2.png) + + ![解析完成](./pictures/解析完成.png) + +- 编辑文档配置:点击编辑对文档配置进行编辑,支持编辑文档名称、解析方法、类别、文件分块大小。 + + ![编辑文档配置](./pictures/编辑文档配置.png) + +- 下载文档:点击下载即可将文档下载至本地,也可以选中多个文档批量下载。 + +- 删除文档:点击删除即可将文档从资产库中删除,也可以选中多个文档批量删除。 + +- 搜索文档:点击文档名称旁的搜索键,在弹出的搜索框中键入搜索的文本,可以搜索得到名称包含这些文本的文档。 + + ![搜索文档](./pictures/搜索文档.png) + +- 编辑资产库配置:支持编辑资产库名称、描述、语言、默认解析方法、文件分块大小、文档信息类别。 + + ![编辑资产库配置](./pictures/编辑资产库配置.png) + +## 3. 
解析结果管理界面 + +点击解析完成的文档,可以进入文档的解析结果管理界面。界面中会按照顺序显示文档解析后的文本块内容预览,每个文本块会附带一个标签,表示该文本块中的信息来源于文档中的段落、列表或者是图片。右侧的开关表示该文本块是否被启用。 + +![文本块结果预览](./pictures/文本块结果预览.png) + +**支持操作**: + +- 关闭/启用文本块:点击文本块右侧的开关即可关闭/启用对应文本块,也可以选中多个文本块批量关闭/启用。 + + ![批量启用](./pictures/批量启用.png) + +- 搜索文本块:在搜索框中键入内容,可以查找包含对应内容的文本块。 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-login-click2signup.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-login-click2signup.png" new file mode 100644 index 0000000000000000000000000000000000000000..6e6f96b4a902d04c67eb2e299ad038423dcb04c7 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-login-click2signup.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-login.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-login.png" new file mode 100644 index 0000000000000000000000000000000000000000..b5ea5a7577f2ce19fad4df5274847676134d95e0 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-login.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-signup.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-signup.png" new file mode 100644 index 0000000000000000000000000000000000000000..c20a54d270988f56039a2b93eca6aa369d048884 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/authhub-signup.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete-confirmation.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete-confirmation.png" new file mode 100644 index 0000000000000000000000000000000000000000..3cc5a6a25618eff0bfa9807e1c19e4f88edc7da4 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete-confirmation.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete-multi-select.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete-multi-select.png" new file mode 100644 index 0000000000000000000000000000000000000000..772c51d903531cfe74245f08ecdca06d4677f935 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete-multi-select.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete.png" new file mode 100644 index 0000000000000000000000000000000000000000..929230cd06cc792b633ab183155225926d2c300d Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/bulk-delete.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/chat-area.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/chat-area.png" new file mode 100644 index 0000000000000000000000000000000000000000..966432e02f08a6c769e8cd87b0468bd25f257f5e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/chat-area.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/context-support.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/context-support.png" new file mode 100644 index 0000000000000000000000000000000000000000..0bd5f091d0eff34d9b5f36eec6df63b191656daa Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/context-support.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/delete-session-confirmation.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/delete-session-confirmation.png" new file mode 100644 index 0000000000000000000000000000000000000000..729096bdae14895b81e8725a8065d1f4bfcdbf6c Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/delete-session-confirmation.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/delete-session.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/delete-session.png" new file mode 100644 index 0000000000000000000000000000000000000000..596af33f7be41d456a57e6a297820530f8485f34 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/delete-session.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback-illegal.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback-illegal.png" new file mode 100644 index 0000000000000000000000000000000000000000..b6e84ba45977d911db960da97bdff714624ba18c Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback-illegal.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback-misinfo.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback-misinfo.png" new file mode 100644 index 0000000000000000000000000000000000000000..cc5505226add1e6fbde7b93ff09877038e8cfdce Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback-misinfo.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback.png" new file mode 100644 index 0000000000000000000000000000000000000000..9fe1c27acb57d4d24a26c8dde61ee4272f954e46 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/feedback.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-ask-against-file.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-ask-against-file.png" new file mode 100644 index 0000000000000000000000000000000000000000..2cf2c5e50c8c02c4c2713fde63c7e11c110c8bb2 
Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-ask-against-file.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-btn-prompt.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-btn-prompt.png" new file mode 100644 index 0000000000000000000000000000000000000000..45e38672d0c46ccc2ded83669875f7c832f2c73d Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-btn-prompt.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-btn.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-btn.png" new file mode 100644 index 0000000000000000000000000000000000000000..2f6a7cee51e2cb02b52baf6ffa7136f5601a26e1 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-btn.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-history-tag.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-history-tag.png" new file mode 100644 index 
0000000000000000000000000000000000000000..487a48e6f72cbe8f115d8ce2001808b9b4a74dec Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-history-tag.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-parsing.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-parsing.png" new file mode 100644 index 0000000000000000000000000000000000000000..812090a59ee3594b11ecfcb55cc7a8b7361ca2bb Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-parsing.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-showcase.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-showcase.png" new file mode 100644 index 0000000000000000000000000000000000000000..60234df165d16abb976ffdf74d0b1ad890387e57 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-showcase.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-uploading.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-uploading.png" new file mode 100644 index 0000000000000000000000000000000000000000..7f29ba755ce71d08098d0d5950239b69e1d7f16a Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/file-upload-uploading.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-arrow-next.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-arrow-next.png" new file mode 100644 index 0000000000000000000000000000000000000000..1a36c84e0965f9dbf1f90e9a3daadcd1a2560951 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-arrow-next.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-arrow-prev.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-arrow-prev.png" new file mode 100644 index 0000000000000000000000000000000000000000..eb667e93cc6d51aa191a0ac7607e72d4d6923cbc Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-arrow-prev.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-cancel.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-cancel.png" new file mode 100644 index 0000000000000000000000000000000000000000..34d4454b6f92ee12db6841dafe0e94a12c3b9584 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-cancel.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-confirm.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-confirm.png" new file mode 100644 index 0000000000000000000000000000000000000000..1d650f8192e04fae8f7b7c08cd527227c91b833a Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-confirm.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-edit.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-edit.png" new file mode 100644 index 0000000000000000000000000000000000000000..f7b28aa605b5e899855a261d641d27a2674703eb Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-edit.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-search.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-search.png" new file mode 100644 index 0000000000000000000000000000000000000000..7902923196c3394ae8eafaf5a2b6fdf7f19b1f40 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-search.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-thumb-down.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-thumb-down.png" new file mode 100644 index 0000000000000000000000000000000000000000..cda14d196d92898da920ed64ad37fa9dd124c775 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-thumb-down.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-thumb-up.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-thumb-up.png" new file mode 100644 index 0000000000000000000000000000000000000000..c75ce44bff456e24bc19040c18e4e644bbb77bd1 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-thumb-up.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-user.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-user.png" new file mode 100644 index 0000000000000000000000000000000000000000..e6b06878b76d9e6d268d74070539b388129fa8c4 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/icon-user.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/login-popup.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/login-popup.png" new file mode 100644 index 0000000000000000000000000000000000000000..7834248e8603aca100b8b7e33a93611777cc6ede Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/login-popup.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/logout.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/logout.png" new file mode 100644 index 0000000000000000000000000000000000000000..da51441e632cb77dfbe0f86056e333f69485c500 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/logout.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/main-page-clean-ref.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/main-page-clean-ref.png" new file mode 100644 index 0000000000000000000000000000000000000000..2e00878b62408e75d8f82c40b3a1f5e0f4f878f6 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/main-page-clean-ref.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/main-page-sections.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/main-page-sections.png" new file mode 100644 index 0000000000000000000000000000000000000000..9d8f013318c840a5b05b3010b9b08047870be822 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/main-page-sections.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/new-chat.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/new-chat.png" new file mode 100644 index 0000000000000000000000000000000000000000..784a0da650df405e1df147409b785a026109e239 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/new-chat.png" 
differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-list.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-list.png" new file mode 100644 index 0000000000000000000000000000000000000000..90270b4c9d8991463e4a4129625ab0325ac09922 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-list.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..a810a8c123f34f51c06c2dd22c9fc1e9cb4efa06 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-selected.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-selected.png" new file mode 100644 index 0000000000000000000000000000000000000000..fa5342091d0a023a545c3edab8c4368654df8a90 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-selected.png" 
differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-suggestion.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-suggestion.png" new file mode 100644 index 0000000000000000000000000000000000000000..bb416881550349000f61b0c1bd3dd540878bd6ad Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-suggestion.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-1.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-1.png" new file mode 100644 index 0000000000000000000000000000000000000000..e961ddc5b9aa6b687c69e4587ea3a59f54b6ad27 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-1.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-2-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-2-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..dfc52217a1595613a934c5860704d688a2876a37 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-2-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-2.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-2.png" new file mode 100644 index 0000000000000000000000000000000000000000..0cb59551c2695151491aa1120163ea0c1aabb317 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-case-step-2.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-fill-in-param-result.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-fill-in-param-result.png" new file mode 100644 index 0000000000000000000000000000000000000000..899ee2672ba8b5eb8518fb9b80104577159d1cb4 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-fill-in-param-result.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-fill-in-param.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-fill-in-param.png" new file mode 100644 index 0000000000000000000000000000000000000000..4c03312d72f49c51868826d62bc59d0f2f925cc7 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/plugin-workflow-fill-in-param.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/privacy-policy-entry.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/privacy-policy-entry.png" new file mode 100644 index 0000000000000000000000000000000000000000..d7efce3e6e8d477ef47a1bc8a9bba0d087cf8058 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/privacy-policy-entry.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/privacy-policy.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/privacy-policy.png" new file mode 100644 index 0000000000000000000000000000000000000000..0bc0980a7dd78e055fc920d591a77d5394b5fb84 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/privacy-policy.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/recommend-questions.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/recommend-questions.png" new file mode 100644 index 0000000000000000000000000000000000000000..076ec7092af7fe7987e5dc7c864a6b9f8b2b1160 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/recommend-questions.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/regenerate.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/regenerate.png" new file mode 100644 index 0000000000000000000000000000000000000000..655c9d5002df4a17aaf84e8780fff4a0118c6c01 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/regenerate.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/rename-session-confirmation.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/rename-session-confirmation.png" new file mode 100644 index 0000000000000000000000000000000000000000..d64708bd57d53deafdc5ddbb70d9deaeaca0d132 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/rename-session-confirmation.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/rename-session.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/rename-session.png" new file mode 100644 index 0000000000000000000000000000000000000000..73e7e19c5ac8e8035df0e4b553a9b78ff5c9a051 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/rename-session.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/report-options.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/report-options.png" new file mode 100644 index 0000000000000000000000000000000000000000..8a54fd2598d51fc40b57052f404dd830cf621f4d Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/report-options.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/report.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/report.png" new file mode 100644 index 0000000000000000000000000000000000000000..471bcbe8614fc8bab4dcc1805fa1bf4574990fc8 Binary files /dev/null 
and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/report.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/search-history.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/search-history.png" new file mode 100644 index 0000000000000000000000000000000000000000..2239d14a7aa8bc13a7b8d3ec71ba9ed71b95e850 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/pictures/search-history.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\345\211\215\350\250\200.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\345\211\215\350\250\200.md" new file mode 100644 index 0000000000000000000000000000000000000000..445130848d35b8f9eb045deee708da79c3ca824e --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\345\211\215\350\250\200.md" @@ -0,0 +1,67 @@ +# 前言 + +## 概述 + +本文档介绍了 openEuler Copilot System 的使用方法,对 openEuler Copilot System 线上服务的 Web 界面的各项功能做了详细介绍,同时提供了常见的问题解答,详情请参考对应手册。 + +## 读者对象 + +本文档主要适用于 openEuler Copilot System 的使用人员。使用人员必须具备以下经验和技能: + +- 熟悉 openEuler 操作系统的相关情况。 +- 有 AI 对话使用经验。 + +## 修改记录 + +| 文档版本 | 发布日期 | 修改说明 | |--------|------------|----------------| | 03 | 2024-09-19 | 更新新版界面。 | | 02 | 2024-05-13 | 优化智能对话操作指引。 | | 01 | 2024-01-28 | 第一次正式发布。 | + +## 介绍 + +### 免责声明 + +- 
使用过程中涉及的非工具本身验证功能所用的用户名和密码,不作他用,且不会被保存在系统环境中。 +- 在您进行对话或操作前应当确认您为应用程序的所有者或已获得所有者的充足授权同意。 +- 对话结果中可能包含您所分析应用的内部信息和相关数据,请妥善管理。 +- 除非法律法规或双方合同另有规定,openEuler 社区对分析结果不做任何明示或暗示的声明和保证,不对分析结果的适销性、满意度、非侵权性或特定用途适用性等作出任何保证或者承诺。 +- 您根据分析记录所采取的任何行为均应符合法律法规的要求,并由您自行承担风险。 +- 未经所有者授权,任何个人或组织均不得使用应用程序及相关分析记录从事任何活动。openEuler 社区不对由此造成的一切后果负责,亦不承担任何法律责任。必要时,将追究其法律责任。 + +### openEuler Copilot System 简介 + +openEuler Copilot System 是一个基于 openEuler 操作系统的人工智能助手,可以帮助用户解决各种技术问题,提供技术支持和咨询服务。它使用了最先进的自然语言处理技术和机器学习算法,能够理解用户的问题并提供相应的解决方案。 + +### 场景内容 + +1. OS 领域通用知识:可向 openEuler Copilot System 咨询 Linux 常规知识、上游信息,以及工具链的介绍和指导。 +2. openEuler 专业知识:可向 openEuler Copilot System 咨询 openEuler 社区信息、技术原理和使用指导。 +3. openEuler 扩展知识:可向 openEuler Copilot System 咨询 openEuler 周边硬件特性知识和 ISV、OSV 相关信息。 +4. openEuler 应用案例:openEuler Copilot System 可以提供 openEuler 技术案例和行业应用案例。 +5. shell 命令生成:openEuler Copilot System 可以帮助用户生成单条 shell 命令或者复杂命令。 + +总之,openEuler Copilot System 可以应用于各种场景,帮助用户提高工作效率,了解 Linux、openEuler 等的相关知识。 + +### 访问和使用 + +openEuler Copilot System 通过网址访问 Web 网页进行使用。账号注册与登录请参考[注册与登录](./注册与登录.md)。使用方法请参考[智能问答使用指南](./智能问答使用指南.md)。 + +### 界面说明 + +#### 界面分区 + +openEuler Copilot System 界面主要由如图 1 所示的区域组成,各个区域的作用如表 1 所示。 + +- 图 1 openEuler Copilot System 界面 +![Copilot 界面](./pictures/main-page-sections.png) + +- 表 1 openEuler Copilot System 首页界面分区说明 + +| 区域 | 名称 | 说明 | |-----|------------|----------------------------------------------------------------| | 1 | 设置管理区 | 提供账号登录和退出操作入口,以及明亮/黑暗模式切换功能 | | 2 | 对话管理区 | 用于新建对话、管理对话历史记录以及批量删除对话历史记录 | | 3 | 对话区 | 用于用户和 openEuler Copilot System 的对话聊天 | | 4 | 服务协议和隐私政策区 | 提供查看服务协议和隐私政策入口 | diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\231\272\350\203\275\346\217\222\344\273\266\347\256\200\344\273\213.md" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\231\272\350\203\275\346\217\222\344\273\266\347\256\200\344\273\213.md" new file mode 100644 index 0000000000000000000000000000000000000000..0ea19a2b1d0246b07c829da85533d6e43d6f734e --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\231\272\350\203\275\346\217\222\344\273\266\347\256\200\344\273\213.md" @@ -0,0 +1,38 @@ +# 智能插件 + +## 基本用法 + +1. 如图所示,在输入框左上角可以选择插件,单击即可显示插件列表。 + + ![智能插件](./pictures/plugin-list.png) + +2. 勾选一个插件,然后提问。 + + ![智能插件](./pictures/plugin-selected.png) + +3. 等待服务响应,查看返回结果。 + + 智能插件模式下,推荐问题区会置顶推荐的工作流,蓝色文字为对应插件名称,单击后可快捷追问。 + + ![智能插件](./pictures/plugin-suggestion.png) + ![智能插件](./pictures/plugin-result.png) + +## 插件工作流 + +openEuler Copilot System 支持插件工作流。插件工作流由多个步骤组成,每个步骤都会调用一次插件,每个步骤的输出将作为下一个步骤的输入。下面以使用 CVE 漏洞查询插件查看漏洞修复任务完成情况为例,介绍插件工作流的使用方法。 + +1. 查询全部 CVE 修复任务信息 + + ![插件工作流](./pictures/plugin-workflow-case-step-1.png) + +2. 
根据上一步的结果查询指定 CVE 修复任务的详细信息 + + ![插件工作流](./pictures/plugin-workflow-case-step-2.png) + ![插件工作流](./pictures/plugin-workflow-case-step-2-result.png) + +### 补全参数 + +当上下文信息不足时,插件会提示用户补充缺失的参数。 + +![补全参数](./pictures/plugin-workflow-fill-in-param.png) +![执行结果](./pictures/plugin-workflow-fill-in-param-result.png) diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\231\272\350\203\275\351\227\256\347\255\224\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\231\272\350\203\275\351\227\256\347\255\224\344\275\277\347\224\250\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..64b4a1881e5224360463b97b09dab11e7bb2f3e6 --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\231\272\350\203\275\351\227\256\347\255\224\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -0,0 +1,179 @@ +# 智能问答使用指南 + +## 开始对话 + +在对话区下方的输入框中输入想要提问的内容,按 `Shift + Enter` 可换行,按 `Enter` 或单击“发送”即可发送提问内容。 + +> **说明** +> +> 对话区位于页面的主体部分,如图 1 所示。 + +- 图 1 对话区 + ![对话区](./pictures/chat-area.png) + +### 多轮连续对话 + +openEuler Copilot System 智能问答支持多轮连续对话,只需在同一个对话中继续追问即可,如图 2 所示。 + +- 图 2 多轮对话 + ![多轮对话](./pictures/context-support.png) + +### 重新生成 + +如遇 AI 生成的内容有误或不完整,可以要求 AI 重新回答。单击对话回答左下侧的“重新生成”文字,可让 openEuler Copilot System 重新回答用户问题。重新回答后,在对话回答右下侧,会出现回答翻页的图标![向前翻页](./pictures/icon-arrow-prev.png)和![向后翻页](./pictures/icon-arrow-next.png),单击![向前翻页](./pictures/icon-arrow-prev.png)或![向后翻页](./pictures/icon-arrow-next.png)可查看不同的回答,如图 3 所示。 + +- 图 3 重新生成 + ![重新生成](./pictures/regenerate.png) + +### 推荐问题 + +在 AI 回答的下方,会展示一些推荐的问题,单击即可进行提问,如图 4 所示。 + +- 图 4 推荐问题 + 
![推荐问题](./pictures/recommend-questions.png) + +## 自定义背景知识 + +openEuler Copilot System 支持上传文件功能。上传文件后,AI 会将上传的文件内容作为背景知识,在回答问题时,会结合背景知识进行回答。上传的背景知识只会作用于当前对话,不会影响其他对话。 + +### 上传文件 + +**步骤1** 单击对话区左下角的“上传文件”按钮,如图 5 所示。 + +- 图 5 上传文件按钮 + ![上传文件](./pictures/file-upload-btn.png) + +> **说明** +> +> 鼠标悬停到“上传文件”按钮上,会显示允许上传文件的规格和格式提示,如图 6 所示。 + +- 图 6 鼠标悬停显示上传文件规格提示 + ![上传文件提示](./pictures/file-upload-btn-prompt.png) + +**步骤2** 在弹出的文件选择框中,选择需要上传的文件,单击“打开”,即可上传文件。最多可上传 10 个文件,总大小限制为 64MB,支持的文件格式包括 PDF、docx、doc、txt、md、xlsx。 + +开始上传后,对话区下方会显示上传进度,如图 7 所示。 + +- 图 7 同时上传的所有文件排列在问答输入框下方 + ![上传文件](./pictures/file-upload-uploading.png) + +文件上传完成后会自动解析,如图 8 所示。解析完成后,对话区下方会显示每个文件的大小信息。 + +- 图 8 文件上传至服务器后将显示“正在解析” + ![文件解析](./pictures/file-upload-parsing.png) + +文件上传成功后,左侧历史记录区会显示上传的文件数量,如图 9 所示。 + +- 图 9 对话历史记录标签上会展示上传文件数量 + ![历史记录标记](./pictures/file-upload-history-tag.png) + +### 针对文件提问 + +文件上传完成后,即可针对文件进行提问,提问方式同普通对话模式,如图 10 所示。 +回答结果如图 11 所示。 + +- 图 10 询问与上传的文件相关的问题 + ![针对文件提问](./pictures/file-upload-ask-against-file.png) + +- 图 11 AI 以上传的文件为背景知识进行回答 + ![根据自定义背景知识回答](./pictures/file-upload-showcase.png) + +## 管理对话 + +> **说明** +> +> 对话管理区在页面左侧。 + +### 新建对话 + +单击“新建对话”按钮即可新建对话,如图 12 所示。 + +- 图 12 “新建对话”按钮在页面左上方 + ![新建对话](./pictures/new-chat.png) + +### 对话历史记录搜索 + +在页面左侧历史记录搜索输入框输入关键词,然后单击![搜索](./pictures/icon-search.png)即可搜索对话历史记录,如图 13 所示。 + +- 图 13 对话历史记录搜索框 + ![对话历史记录搜索](./pictures/search-history.png) + +### 对话历史记录单条管理 + +历史记录的列表位于历史记录搜索栏的下方,在每条对话历史记录的右侧,单击![编辑](./pictures/icon-edit.png)即可编辑对话历史记录的名字,如图 14 所示。 + +- 图 14 单击“编辑”图标重命名历史记录 + ![重命名历史记录](./pictures/rename-session.png) + +对话历史记录名字重新填写完成后,单击右侧![确认](./pictures/icon-confirm.png)即可完成重命名,或者单击右侧![取消](./pictures/icon-cancel.png)放弃本次重命名,如图 15 所示。 + +- 图 15 完成/取消重命名历史记录 + ![完成/取消重命名历史记录](./pictures/rename-session-confirmation.png) + +另外,单击对话历史记录右侧的删除图标,如图 16 所示,会弹出删除单条对话历史记录的二次确认框,如图 17 所示,单击“确认”即可删除该条对话历史记录,单击“取消”则取消本次删除。 + +- 图 16 单击“垃圾箱”图标删除单条历史记录 + ![删除单条历史记录](./pictures/delete-session.png) + +- 图 17 二次确认后删除历史记录 + 
![删除单条历史记录二次确认](./pictures/delete-session-confirmation.png) + +### 对话历史记录批量删除 + +首先单击“批量删除”,如图 18 所示。 + +- 图 18 批量删除功能在历史记录搜索框右上方 + ![批量删除](./pictures/bulk-delete.png) + +然后选择要删除的历史记录,如图 19 所示。单击“全选”可选中所有历史记录;单击单条历史记录或其左侧的选择框,可选中该条历史记录。 + +- 图 19 在左侧勾选要批量删除的历史记录 + ![批量删除历史记录选择](./pictures/bulk-delete-multi-select.png) + +最后需要对批量删除操作进行二次确认,如图 20 所示,单击“确认”即可删除,单击“取消”则取消本次删除。 + +- 图 20 二次确认后删除所选的历史记录 + ![批量删除二次确认](./pictures/bulk-delete-confirmation.png) + +## 反馈与举报 + +在对话记录区,对话回答的右下侧可对回答进行反馈,如图 21 所示:单击![满意](./pictures/icon-thumb-up.png)可给对话回答点赞;单击![不满意](./pictures/icon-thumb-down.png)可反馈不满意的原因。 + +- 图 21 点赞和不满意反馈 + ![点赞和不满意反馈](./pictures/feedback.png) + +反馈不满意原因时,如图 22 所示,单击![不满意](./pictures/icon-thumb-down.png)之后,对话机器人会展示反馈内容填写对话框,可选择相关的不满意原因选项。 + +- 图 22 回答不满意反馈 + ![回答不满意反馈](./pictures/feedback-illegal.png) + +若选择“存在错误信息”,则需要填写参考答案链接和描述,如图 23 所示。 + +- 图 23 回答不满意反馈——存在错误信息 + ![回答不满意反馈——存在错误信息](./pictures/feedback-misinfo.png) + +### 举报 + +如果发现 AI 返回的内容中有违规信息,可以单击右下角的举报按钮进行举报,如图 24 所示。单击举报后选择举报类型并提交,若没有合适的选项,请选择“其他”并输入原因,如图 25 所示。 + +- 图 24 举报按钮在对话块的右下角 + ![举报1](./pictures/report.png) + +- 图 25 单击后可选择举报类型 + ![举报2](./pictures/report-options.png) + +## 查看服务协议和隐私政策 + +单击文字“服务协议”,即可查看服务协议,单击文字“隐私政策”,即可查看隐私政策,如图 26、图 27 所示。 + +- 图 26 服务协议和隐私政策入口在页面底部信息栏 + ![服务协议和隐私政策入口](./pictures/privacy-policy-entry.png) + +- 图 27 单击后显示服务协议或隐私政策弹窗 + ![服务协议和隐私政策](./pictures/privacy-policy.png) + +## 附录 + +### 用户信息导出说明 + +openEuler Copilot System 后台提供用户信息导出功能。如有需要,请主动通过邮箱联系我们,运维人员会将导出的用户信息通过邮箱回送给用户。 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\263\250\345\206\214\344\270\216\347\231\273\345\275\225.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\263\250\345\206\214\344\270\216\347\231\273\345\275\225.md" new file mode 100644 index 
0000000000000000000000000000000000000000..c81e923702b2a928a3f5f06aed500f9ef5a84ce9 --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\344\275\277\347\224\250\346\214\207\345\215\227/\347\272\277\344\270\212\346\234\215\345\212\241/\346\263\250\345\206\214\344\270\216\347\231\273\345\275\225.md" @@ -0,0 +1,55 @@ +# 登录 openEuler Copilot System + +本章节介绍通过 *[openEuler Copilot System 部署指南](../../部署指南)* 部署的 openEuler Copilot System 网页端的登录操作步骤。 + +## 浏览器要求 + +浏览器要求如表 1 所示。 + +- 表 1 浏览器要求 + +| 浏览器类型 | 最低版本 | 推荐版本 | | ----- | ----- | ----- | | Google Chrome | 72 | 121 或更高版本 | | Mozilla Firefox | 89 | 122 或更高版本 | | Apple Safari | 11.0 | 16.3 或更高版本 | + +## 操作步骤 + +**步骤1** 打开本地 PC 机的浏览器,在地址栏输入 *[部署指南](../../部署指南/网络环境下部署指南.md#2-安装-openeuler-copilot-system)* 中配置好的域名,按下 `Enter`。未登录状态下进入 openEuler Copilot System 时,会出现登录提示弹出框,如图 1 所示。 + +- 图 1 未登录 +![未登录](./pictures/login-popup.png) + +**步骤2** 登录 openEuler Copilot System(已注册账号)。 + +打开登录界面,如图 2 所示。 + +- 图 2 登录 openEuler Copilot System +![登录 openEuler Copilot System](./pictures/authhub-login.png) + +## 注册 openEuler Copilot System 账号 + +**步骤1** 在登录信息输入框右下角单击“立即注册”,如图 3 所示。 + +- 图 3 立即注册 +![立即注册](./pictures/authhub-login-click2signup.png) + +**步骤2** 进入账号注册页,根据页面提示填写相关内容,如图 4 所示。 + +- 图 4 账号注册 +![账号注册](./pictures/authhub-signup.png) + +**步骤3** 按页面要求填写账号信息后,单击“注册”即可完成注册。注册后即可返回登录。 + +## 退出登录 + +**步骤1** 单击![用户图标](./pictures/icon-user.png)后,会出现“退出登录”下拉框,如图 5 所示。 + +> **说明** +> 账号管理区位于页面的右上角部分,如图 5 所示。 + +- 图 5 账号管理区 +![账号管理区](./pictures/logout.png) + +**步骤2** 单击“退出登录”即可退出登录,如图 5 所示。 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/WEB\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/WEB\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..bb9be4e33ce470865fe5a07decbc056b9ee4e9bb Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/WEB\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/WEB\347\231\273\345\275\225\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/WEB\347\231\273\345\275\225\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..fddbab4df70b940d5d5ed26fb8ec688f1592b5e8 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/WEB\347\231\273\345\275\225\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/authhub\347\231\273\345\275\225\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/authhub\347\231\273\345\275\225\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..341828b1b6f728888d1dd52eec755033680155da Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/authhub\347\231\273\345\275\225\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\345\210\233\345\273\272\345\272\224\347\224\250\346\210\220\345\212\237\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\345\210\233\345\273\272\345\272\224\347\224\250\346\210\220\345\212\237\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..a871907f348317e43633cf05f5241cb978476fb4 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\345\210\233\345\273\272\345\272\224\347\224\250\346\210\220\345\212\237\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\345\210\233\345\273\272\345\272\224\347\224\250\347\225\214\351\235\242.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\345\210\233\345\273\272\345\272\224\347\224\250\347\225\214\351\235\242.png" new file mode 100644 index 0000000000000000000000000000000000000000..d82c736a94b106a30fd8d1f7b781f9e335bb441f Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\345\210\233\345\273\272\345\272\224\347\224\250\347\225\214\351\235\242.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/k8s\351\233\206\347\276\244\344\270\255postgres\346\234\215\345\212\241\347\232\204\345\220\215\347\247\260.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/k8s\351\233\206\347\276\244\344\270\255postgres\346\234\215\345\212\241\347\232\204\345\220\215\347\247\260.png" new file mode 100644 index 0000000000000000000000000000000000000000..473a0006c9710c92375e226a760c3a79989312f9 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/k8s\351\233\206\347\276\244\344\270\255postgres\346\234\215\345\212\241\347\232\204\345\220\215\347\247\260.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/postgres\346\234\215\345\212\241\347\253\257\345\217\243.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/postgres\346\234\215\345\212\241\347\253\257\345\217\243.png" new file mode 100644 index 0000000000000000000000000000000000000000..cfee6d88da56bc939886caece540f7de8cf77bbc Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/postgres\346\234\215\345\212\241\347\253\257\345\217\243.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/rag_port.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/rag_port.png" new file mode 100644 index 0000000000000000000000000000000000000000..b1d93f9c9d7587aa88a27d7e0bf185586583d438 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/rag_port.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/rag\351\205\215\347\275\256\344\277\241\346\201\257\346\210\220\345\212\237.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/rag\351\205\215\347\275\256\344\277\241\346\201\257\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..fec3cdaa2b260e50f5523477da3e58a9e14e2130 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/rag\351\205\215\347\275\256\344\277\241\346\201\257\346\210\220\345\212\237.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\233\345\273\272\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245\347\224\261\344\272\216\347\273\237\344\270\200\350\265\204\344\272\247\344\270\213\345\255\230\345\234\250\345\220\214\345\220\215\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\233\345\273\272\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245\347\224\261\344\272\216\347\273\237\344\270\200\350\265\204\344\272\247\344\270\213\345\255\230\345\234\250\345\220\214\345\220\215\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..624459821de4542b635eeffa115eeba780929a4e Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\233\345\273\272\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245\347\224\261\344\272\216\347\273\237\344\270\200\350\265\204\344\272\247\344\270\213\345\255\230\345\234\250\345\220\214\345\220\215\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\233\345\273\272\350\265\204\344\272\247\346\210\220\345\212\237.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\233\345\273\272\350\265\204\344\272\247\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..3104717bfa8f6615ad6726577a24938bc29884b2 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\233\345\273\272\350\265\204\344\272\247\346\210\220\345\212\237.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\344\270\215\345\255\230\345\234\250\347\232\204\350\265\204\344\272\247\345\244\261\350\264\245.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\344\270\215\345\255\230\345\234\250\347\232\204\350\265\204\344\272\247\345\244\261\350\264\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..454b9fdfa4b7f209dc370f78677a2f4e71ea49be Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\344\270\215\345\255\230\345\234\250\347\232\204\350\265\204\344\272\247\345\244\261\350\264\245.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\257\255\346\226\231.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\257\255\346\226\231.png" new file mode 100644 index 0000000000000000000000000000000000000000..d52d25d4778f6db2d2ec076d65018c40cd1da4d3 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\257\255\346\226\231.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245\357\274\214\350\265\204\344\272\247\344\270\213\344\270\215\345\255\230\345\234\250\345\257\271\345\272\224\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245\357\274\214\350\265\204\344\272\247\344\270\213\344\270\215\345\255\230\345\234\250\345\257\271\345\272\224\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..82ed79c0154bd8e406621440c4e4a7caaab7e06e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245\357\274\214\350\265\204\344\272\247\344\270\213\344\270\215\345\255\230\345\234\250\345\257\271\345\272\224\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\265\204\344\272\247\346\210\220\345\212\237.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\265\204\344\272\247\346\210\220\345\212\237.png" new file mode 100644 index 
0000000000000000000000000000000000000000..7dd2dea945f39ada1d7dd053d150a995b160f203 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\210\240\351\231\244\350\265\204\344\272\247\346\210\220\345\212\237.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\273\272\347\253\213\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\273\272\347\253\213\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..84737b4185ce781d7b32ab42d39b8d2452138dad Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\345\273\272\347\253\213\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\214\207\345\256\232\344\270\215\345\255\230\345\234\250\347\232\204\350\265\204\344\272\247\345\210\233\345\273\272\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\214\207\345\256\232\344\270\215\345\255\230\345\234\250\347\232\204\350\265\204\344\272\247\345\210\233\345\273\272\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..be89bdfde2518bba3941eee5d475f52ad9124343 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\214\207\345\256\232\344\270\215\345\255\230\345\234\250\347\232\204\350\265\204\344\272\247\345\210\233\345\273\272\350\265\204\344\272\247\345\272\223\345\244\261\350\264\245.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\225\260\346\215\256\345\272\223\345\210\235\345\247\213\345\214\226.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\225\260\346\215\256\345\272\223\345\210\235\345\247\213\345\214\226.png" new file mode 100644 index 0000000000000000000000000000000000000000..27530840aaa5382a226e1ed8baea883895d9d75e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\225\260\346\215\256\345\272\223\345\210\235\345\247\213\345\214\226.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\225\260\346\215\256\345\272\223\351\205\215\347\275\256\344\277\241\346\201\257\346\210\220\345\212\237.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\225\260\346\215\256\345\272\223\351\205\215\347\275\256\344\277\241\346\201\257\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..aa04e6f7f0648adfca1240c750ca5b79b88da5f9 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\225\260\346\215\256\345\272\223\351\205\215\347\275\256\344\277\241\346\201\257\346\210\220\345\212\237.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\227\240\350\265\204\344\272\247\346\227\266\346\237\245\350\257\242\350\265\204\344\272\247.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\227\240\350\265\204\344\272\247\346\227\266\346\237\245\350\257\242\350\265\204\344\272\247.png" new file mode 100644 index 0000000000000000000000000000000000000000..74905172c0c0a0acc4c4d0e35efd2493dc421c4e Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\227\240\350\265\204\344\272\247\346\227\266\346\237\245\350\257\242\350\265\204\344\272\247.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\347\234\213\346\226\207\346\241\243\344\272\247\347\224\237\347\211\207\346\256\265\346\200\273\346\225\260\345\222\214\344\270\212\344\274\240\346\210\220\345\212\237\346\200\273\346\225\260.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\347\234\213\346\226\207\346\241\243\344\272\247\347\224\237\347\211\207\346\256\265\346\200\273\346\225\260\345\222\214\344\270\212\344\274\240\346\210\220\345\212\237\346\200\273\346\225\260.png" new file mode 100644 index 0000000000000000000000000000000000000000..432fbfcd02f6d2220e7d2a8512aee893d67be24d Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\347\234\213\346\226\207\346\241\243\344\272\247\347\224\237\347\211\207\346\256\265\346\200\273\346\225\260\345\222\214\344\270\212\344\274\240\346\210\220\345\212\237\346\200\273\346\225\260.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\350\257\242\345\205\250\351\203\250\350\257\255\346\226\231.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\350\257\242\345\205\250\351\203\250\350\257\255\346\226\231.png" new file mode 100644 index 0000000000000000000000000000000000000000..a4f4ea8a3999a9ab659ccd9ea39b80b21ff46e84 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\350\257\242\345\205\250\351\203\250\350\257\255\346\226\231.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\350\257\242\350\265\204\344\272\247.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\350\257\242\350\265\204\344\272\247.png" new file mode 100644 index 0000000000000000000000000000000000000000..675b40297363664007f96948fb21b1cb90d6beea Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\346\237\245\350\257\242\350\265\204\344\272\247.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\216\267\345\217\226\346\225\260\346\215\256\345\272\223pod\345\220\215\347\247\260.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\216\267\345\217\226\346\225\260\346\215\256\345\272\223pod\345\220\215\347\247\260.png" new file mode 100644 index 0000000000000000000000000000000000000000..8fc0c988e8b3830c550c6be6e42b88ac13448d1a Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\216\267\345\217\226\346\225\260\346\215\256\345\272\223pod\345\220\215\347\247\260.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\344\270\212\344\274\240\346\210\220\345\212\237.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\344\270\212\344\274\240\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..5c897e9883e868bf5160d92cb106ea4e4e9bc356 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\344\270\212\344\274\240\346\210\220\345\212\237.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\345\210\240\351\231\244\345\244\261\350\264\245\357\274\214\346\234\252\346\237\245\350\257\242\345\210\260\347\233\270\345\205\263\350\257\255\346\226\231.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\345\210\240\351\231\244\345\244\261\350\264\245\357\274\214\346\234\252\346\237\245\350\257\242\345\210\260\347\233\270\345\205\263\350\257\255\346\226\231.png" new file mode 100644 index 0000000000000000000000000000000000000000..407e49b929b7ff4cf14703046a4ba0bfe1bb441e Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\345\210\240\351\231\244\345\244\261\350\264\245\357\274\214\346\234\252\346\237\245\350\257\242\345\210\260\347\233\270\345\205\263\350\257\255\346\226\231.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\346\237\245\350\257\242\346\210\220\345\212\237.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\346\237\245\350\257\242\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..a4f4ea8a3999a9ab659ccd9ea39b80b21ff46e84 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\257\255\346\226\231\346\237\245\350\257\242\346\210\220\345\212\237.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\344\270\213\346\234\252\346\237\245\350\257\242\345\210\260\350\265\204\344\272\247\345\272\223.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\344\270\213\346\234\252\346\237\245\350\257\242\345\210\260\350\265\204\344\272\247\345\272\223.png" new file mode 100644 index 0000000000000000000000000000000000000000..45ab521ec5f5afbd81ad54f023aae3b7a867dbf2 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\344\270\213\346\234\252\346\237\245\350\257\242\345\210\260\350\265\204\344\272\247\345\272\223.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\344\270\213\346\237\245\350\257\242\350\265\204\344\272\247\345\272\223\346\210\220\345\212\237.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\344\270\213\346\237\245\350\257\242\350\265\204\344\272\247\345\272\223\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..90ed5624ae93ff9784a750514c53293df4e961f0 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\344\270\213\346\237\245\350\257\242\350\265\204\344\272\247\345\272\223\346\210\220\345\212\237.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\233\345\273\272\346\210\220\345\212\237.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\233\345\273\272\346\210\220\345\212\237.png" new file mode 100644 index 0000000000000000000000000000000000000000..7b2cc38a931c9c236517c14c86fa93e3eb2b6dcd Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\233\345\273\272\346\210\220\345\212\237.png" differ diff --git 
"a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\240\351\231\244\345\244\261\350\264\245\357\274\214\344\270\215\345\255\230\345\234\250\350\265\204\344\272\247.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\240\351\231\244\345\244\261\350\264\245\357\274\214\344\270\215\345\255\230\345\234\250\350\265\204\344\272\247.png" new file mode 100644 index 0000000000000000000000000000000000000000..1365a8d69467dec250d3451ac63e2615a2194c18 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\240\351\231\244\345\244\261\350\264\245\357\274\214\344\270\215\345\255\230\345\234\250\350\265\204\344\272\247.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\240\351\231\244\346\210\220\345\212\237png.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\240\351\231\244\346\210\220\345\212\237png.png" new file mode 100644 index 0000000000000000000000000000000000000000..1bd944264baa9369e6f8fbfd04cabcd12730c0e9 Binary files /dev/null and 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\345\210\240\351\231\244\346\210\220\345\212\237png.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\346\237\245\350\257\242\345\244\261\350\264\245\357\274\214\344\270\215\345\255\230\345\234\250\350\265\204\344\272\247.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\346\237\245\350\257\242\345\244\261\350\264\245\357\274\214\344\270\215\345\255\230\345\234\250\350\265\204\344\272\247.png" new file mode 100644 index 0000000000000000000000000000000000000000..58bcd320e145dd29d9e5d49cb6d86964ebb83b51 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\350\265\204\344\272\247\345\272\223\346\237\245\350\257\242\345\244\261\350\264\245\357\274\214\344\270\215\345\255\230\345\234\250\350\265\204\344\272\247.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\344\270\255\351\227\264\345\261\202.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\344\270\255\351\227\264\345\261\202.png" new file mode 100644 index 0000000000000000000000000000000000000000..809b785b999b6663d9e9bd41fed953925093d6bd Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\344\270\255\351\227\264\345\261\202.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\346\272\220\347\233\256\345\275\225.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\346\272\220\347\233\256\345\275\225.png" new file mode 100644 index 0000000000000000000000000000000000000000..62ba5f6615f18deb3d5a71fd68ee8c929638d814 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\346\272\220\347\233\256\345\275\225.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\347\233\256\346\240\207\347\233\256\345\275\225.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\347\233\256\346\240\207\347\233\256\345\275\225.png" new file mode 100644 index 0000000000000000000000000000000000000000..d32c672fafcb0ef665bda0bcfdce19d2df44db01 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\205\215\347\275\256\346\230\240\345\260\204\347\233\256\346\240\207\347\233\256\345\275\225.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\207\215\345\244\215\345\210\233\345\273\272\350\265\204\344\272\247\345\244\261\350\264\245.png" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\207\215\345\244\215\345\210\233\345\273\272\350\265\204\344\272\247\345\244\261\350\264\245.png" new file mode 100644 index 0000000000000000000000000000000000000000..a5ecd6b65abc97320e7467f00d82ff1fd9bf0e44 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272/\351\207\215\345\244\215\345\210\233\345\273\272\350\265\204\344\272\247\345\244\261\350\264\245.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\351\203\250\347\275\262\350\247\206\345\233\276.png" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\351\203\250\347\275\262\350\247\206\345\233\276.png" new file mode 100644 index 0000000000000000000000000000000000000000..181bf1d2ddbe15cfd296c27df27d865bdbce8d69 Binary files /dev/null and "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/pictures/\351\203\250\347\275\262\350\247\206\345\233\276.png" differ diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/AI\345\256\271\345\231\250\346\240\210/\346\217\222\344\273\266\342\200\224AI\345\256\271\345\231\250\346\240\210\351\203\250\347\275\262\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/AI\345\256\271\345\231\250\346\240\210/\346\217\222\344\273\266\342\200\224AI\345\256\271\345\231\250\346\240\210\351\203\250\347\275\262\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..faef49e028ce1d637fd0a65b34d38aea89f8b80f --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/AI\345\256\271\345\231\250\346\240\210/\346\217\222\344\273\266\342\200\224AI\345\256\271\345\231\250\346\240\210\351\203\250\347\275\262\346\214\207\345\215\227.md" @@ -0,0 +1,35 @@ +# AI容器栈部署指南 + +## 准备工作 + ++ 提前安装 [openEuler Copilot System 命令行(智能 Shell)客户端](../../../使用指南/命令行客户端/命令行助手使用指南.md) + ++ 修改 /xxxx/xxxx/values.yaml 文件的 `euler-copilot-tune` 部分,将 `enable` 字段改为 `True` + +```yaml +enable: True +``` + ++ 更新环境 + +```bash +helm upgrade euler-copilot . 
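+# 说明:以上命令假设在 chart 目录下执行;若 chart 部署在独立命名空间
+# (如 euler-copilot,视实际部署而定),可显式指定命名空间:
+# helm -n euler-copilot upgrade euler-copilot .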
+``` + ++ 检查 Compatibility-AI-Infra 目录下的 openapi.yaml 中 `servers.url` 字段,确保AI容器服务的启动地址被正确设置 + ++ 获取 `$plugin_dir` 插件文件夹的路径,该变量位于 deploy/chart/euler_copilot/values.yaml 中的 `framework` 模块 + ++ 如果插件目录不存在,需新建该目录 + ++ 将该目录下的 Compatibility-AI-Infra 文件夹放到 `$plugin_dir` 中 + +```bash +cp -r ./Compatibility-AI-Infra $PLUGIN_DIR +``` + ++ 重建 framework pod,重载插件配置 + +```bash +kubectl delete pod framework-xxxx -n 命名空间 +``` diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/\346\231\272\350\203\275\350\257\212\346\226\255/\346\217\222\344\273\266\342\200\224\346\231\272\350\203\275\350\257\212\346\226\255\351\203\250\347\275\262\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/\346\231\272\350\203\275\350\257\212\346\226\255/\346\217\222\344\273\266\342\200\224\346\231\272\350\203\275\350\257\212\346\226\255\351\203\250\347\275\262\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..733fea049b62c54021eca335b769215edd778d8e --- /dev/null +++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/\346\231\272\350\203\275\350\257\212\346\226\255/\346\217\222\344\273\266\342\200\224\346\231\272\350\203\275\350\257\212\346\226\255\351\203\250\347\275\262\346\214\207\345\215\227.md" @@ -0,0 +1,189 @@ +# 智能诊断部署指南 + +## 准备工作 + ++ 提前安装 [openEuler Copilot System 命令行(智能 Shell)客户端](../../../使用指南/命令行客户端/命令行助手使用指南.md) + ++ 被诊断机器不能安装 crictl 和 isula,只能有 docker 一个容器管理工具 + ++ 在需要被诊断的机器上安装 gala-gopher 和 gala-anteater + +## 部署 gala-gopher + +### 1. 
准备 BTF 文件
+
+**如果 Linux 内核支持 BTF,则不需要准备 BTF 文件。**可以通过以下命令来查看 Linux 内核是否已经支持 BTF:
+
+```bash
+cat /boot/config-$(uname -r) | grep CONFIG_DEBUG_INFO_BTF
+```
+
+如果输出结果为`CONFIG_DEBUG_INFO_BTF=y`,则表示内核支持 BTF,否则表示内核不支持 BTF。
+如果内核不支持 BTF,需要手动制作 BTF 文件,步骤如下:
+
+1. 获取当前 Linux 内核版本的 vmlinux 文件
+
+   vmlinux 文件存放在 `kernel-debuginfo` 包里面,存放路径为 `/usr/lib/debug/lib/modules/$(uname -r)/vmlinux`。
+
+   例如,对于 `kernel-debuginfo-5.10.0-136.65.0.145.oe2203sp1.aarch64`,对应的 vmlinux 路径为`/usr/lib/debug/lib/modules/5.10.0-136.65.0.145.oe2203sp1.aarch64/vmlinux`。
+
+2. 制作 BTF 文件
+
+   基于获取到的 vmlinux 文件来制作 BTF 文件。这一步可以在自己的环境里操作。首先,需要安装相关的依赖包:
+
+   ```bash
+   # 说明:dwarves 包中包含 pahole 命令,llvm 包中包含 llvm-objcopy 命令
+   yum install -y llvm dwarves
+   ```
+
+   执行下面的命令行,生成 BTF 文件。
+
+   ```bash
+   kernel_version=4.19.90-2112.8.0.0131.oe1.aarch64 # 说明:这里需要替换成目标内核版本,可通过 uname -r 命令获取
+   pahole -J vmlinux
+   llvm-objcopy --only-section=.BTF --set-section-flags .BTF=alloc,readonly --strip-all vmlinux ${kernel_version}.btf
+   strip -x ${kernel_version}.btf
+   ```
+
+   生成的 BTF 文件名称为 `${kernel_version}.btf` 格式,其中 `${kernel_version}` 为目标机器的内核版本,可通过 `uname -r` 命令获取。
+
+### 2. 下载 gala-gopher 容器镜像
+
+#### 在线下载
+
+gala-gopher 容器镜像已归档到 hub.oepkgs.net 仓库中,可通过如下命令获取。
+
+```bash
+# 获取 aarch64 架构的镜像
+docker pull hub.oepkgs.net/a-ops/gala-gopher-profiling-aarch64:latest
+# 获取 x86_64 架构的镜像
+docker pull hub.oepkgs.net/a-ops/gala-gopher-profiling-x86_64:latest
+```
+
+#### 离线下载
+
+若无法通过在线下载的方式下载容器镜像,可联系我(何秀军 00465007)获取压缩包。
+
+拿到压缩包后,放到目标机器上,解压并加载容器镜像,命令行如下:
+
+```bash
+tar -zxvf gala-gopher-profiling-aarch64.tar.gz
+docker load < gala-gopher-profiling-aarch64.tar
+```
+
+### 3. 
启动 gala-gopher 容器
+
+容器启动命令:
+
+```shell
+docker run -d --name gala-gopher-profiling --privileged --pid=host --network=host -v /:/host -v /etc/localtime:/etc/localtime:ro -v /sys:/sys -v /usr/lib/debug:/usr/lib/debug -v /var/lib/docker:/var/lib/docker -v /tmp/$(uname -r).btf:/opt/gala-gopher/btf/$(uname -r).btf -e GOPHER_HOST_PATH=/host gala-gopher-profiling-aarch64:latest
+```
+
+启动配置参数说明:
+
++ `-v /tmp/$(uname -r).btf:/opt/gala-gopher/btf/$(uname -r).btf`:如果内核支持 BTF,则删除该配置即可。如果内核不支持 BTF,则需要将前面准备好的 BTF 文件拷贝到目标机器上,并将 `/tmp/$(uname -r).btf` 替换为对应的路径。
++ `gala-gopher-profiling-aarch64:latest`:gala-gopher 容器镜像及其 tag,需替换成实际下载的镜像 tag。
+
+探针启动:
+
++ `container_id` 为需要观测的容器 id
++ 分别启动 sli 和 container 探针
+
+```bash
+curl -X PUT http://localhost:9999/sli -d json='{"cmd":{"check_cmd":""},"snoopers":{"container_id":[""]},"params":{"report_period":5},"state":"running"}'
+```
+
+```bash
+curl -X PUT http://localhost:9999/container -d json='{"cmd":{"check_cmd":""},"snoopers":{"container_id":[""]},"params":{"report_period":5},"state":"running"}'
+```
+
+探针关闭:
+
+```bash
+curl -X PUT http://localhost:9999/sli -d json='{"state": "stopped"}'
+```
+
+```bash
+curl -X PUT http://localhost:9999/container -d json='{"state": "stopped"}'
+```
+
+## 部署 gala-anteater
+
+源码部署:
+
+```bash
+# 请指定分支为 930eulercopilot
+git clone https://gitee.com/GS-Stephen_Curry/gala-anteater.git
+```
+
+安装部署请参考 
+(请留意 python 版本,可能导致执行 setup.sh install 报错)
+
+镜像部署:
+
+```bash
+docker pull hub.oepkgs.net/a-ops/gala-anteater:2.0.2
+```
+
+`/etc/gala-anteater/config/gala-anteater.yaml` 中 Kafka 和 Prometheus 的 `server` 和 `port` 需要按照实际部署修改,`model_topic`、`meta_topic`、`group_id` 可自定义
+
+```yaml
+Kafka:
+  server: "xxxx"
+  port: "xxxx"
+  model_topic: "xxxx" # 自定义,与rca配置中保持一致
+  meta_topic: "xxxx" # 自定义,与rca配置中保持一致
+  group_id: "xxxx" # 自定义,与rca配置中保持一致
+  # auth_type: plaintext/sasl_plaintext, please set "" for no auth
+  auth_type: ""
+  username: ""
+  password: ""
+
+Prometheus:
+  server: "xxxx"
+  port: "xxxx"
+  steps: "5"
+```
+ 
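上面 YAML 中的 `model_topic`、`meta_topic`、`group_id` 需与 rca 侧配置保持一致。下面给出一个最小的字段提取脚本示意(假设性示例:脚本中的配置内容为演示用临时文件,实际应指向部署环境中的 `/etc/gala-anteater/config/gala-anteater.yaml`),便于在两份配置之间比对取值:

```shell
#!/usr/bin/env bash
set -euo pipefail

# 演示用:写入一份临时的 gala-anteater.yaml 片段(内容均为假设值)
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Kafka:
  server: "192.168.0.10"
  port: "9092"
  model_topic: "anteater_model"
  meta_topic: "gopher_meta"
  group_id: "anteater_group"
EOF

# 提取 model_topic 的取值(去掉引号),便于与 rca 的 config.json 中对应字段比对
model_topic=$(sed -n 's/.*model_topic: *"\([^"]*\)".*/\1/p' "$cfg")
echo "model_topic=${model_topic}"
rm -f "$cfg"
```

对 `meta_topic`、`group_id` 可用同样的方式提取比对。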
+gala-anteater 中模型的训练依赖于 gala-gopher 采集的数据,因此请保证 gala-gopher 探针正常运行至少 24 小时,再运行 gala-anteater。
+
+## 部署 gala-ops
+
+每个中间件的大致介绍:
+
+kafka:分布式消息中间件,起数据分流作用,可以部署在当前的管理节点。
+
+prometheus:性能监控,配置需要监控的生产节点 IP 列表。
+
+可直接通过 yum install 安装 kafka 和 prometheus,可参照安装脚本 
+
+只需要参照其中 kafka 和 prometheus 的安装即可
+
+## 部署 euler-copilot-rca
+
+镜像拉取:
+
+```bash
+docker pull hub.oepkgs.net/a-ops/euler-copilot-rca:0.9.1
+```
+
++ 修改 `config/config.json` 文件,配置 gala-gopher 镜像的 `container_id` 以及 `ip`,Kafka 和 Prometheus 的 `ip` 和 `port`(与上述 gala-anteater 配置保持一致)
+
+```yaml
+"gopher_container_id": "xxxx", # gala-gopher的容器id
+    "remote_host": "xxxx" # gala-gopher的部署机器ip
+  },
+  "kafka": {
+    "server": "xxxx",
+    "port": "xxxx",
+    "storage_topic": "usad_intermediate_results",
+    "anteater_result_topic": "xxxx",
+    "rca_result_topic": "xxxx",
+    "meta_topic": "xxxx"
+  },
+  "prometheus": {
+    "server": "xxxx",
+    "port": "xxxx",
+    "steps": 5
+  },
+``` 
diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/\346\231\272\350\203\275\350\260\203\344\274\230/\346\217\222\344\273\266\342\200\224\346\231\272\350\203\275\350\260\203\344\274\230\351\203\250\347\275\262\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/\346\231\272\350\203\275\350\260\203\344\274\230/\346\217\222\344\273\266\342\200\224\346\231\272\350\203\275\350\260\203\344\274\230\351\203\250\347\275\262\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..50a589da381c58012ae700031d7165301faa7361 --- /dev/null +++ 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\217\222\344\273\266\351\203\250\347\275\262\346\214\207\345\215\227/\346\231\272\350\203\275\350\260\203\344\274\230/\346\217\222\344\273\266\342\200\224\346\231\272\350\203\275\350\260\203\344\274\230\351\203\250\347\275\262\346\214\207\345\215\227.md" @@ -0,0 +1,131 @@
+# 智能调优部署指南
+
+## 准备工作
+
++ 提前安装 [openEuler Copilot System 命令行(智能 Shell)客户端](../../../使用指南/命令行客户端/命令行助手使用指南.md)
+
++ 被调优机器需要为 openEuler 22.03 LTS-SP3
+
++ 在需要被调优的机器上安装依赖
+
+```bash
+yum install -y sysstat perf
+```
+
++ 被调优机器需要开启 SSH 22 端口
+
+## 编辑配置文件
+
+修改 values.yaml 文件的 tune 部分,将 `enabled` 字段改为 `true`,并配置大模型设置、
+Embedding 模型文件地址、以及需要调优的机器和对应机器上的 mysql 的账号名以及密码
+
+```bash
+vim /home/euler-copilot-framework/deploy/chart/agents/values.yaml
+```
+
+```yaml
+tune:
+  # 【必填】是否启用智能调优Agent
+  enabled: true
+  # 镜像设置
+  image:
+    # 镜像仓库。留空则使用全局设置。
+    registry: ""
+    # 【必填】镜像名称
+    name: euler-copilot-tune
+    # 【必填】镜像标签
+    tag: "0.9.1"
+    # 拉取策略。留空则使用全局设置。
+    imagePullPolicy: ""
+  # 【必填】容器根目录只读
+  readOnly: false
+  # 性能限制设置
+  resources: {}
+  # Service设置
+  service:
+    # 【必填】Service类型,ClusterIP或NodePort
+    type: ClusterIP
+    nodePort:
+  # 大模型设置
+  llm:
+    # 【必填】模型地址(需要包含v1后缀)
+    url:
+    # 【必填】模型名称
+    name: ""
+    # 【必填】模型API Key
+    key: ""
+    # 【必填】模型最大Token数
+    max_tokens: 8096
+  # 【必填】Embedding模型文件地址
+  embedding: ""
+  # 待优化机器信息
+  machine:
+    # 【必填】IP地址
+    ip: ""
+    # 【必填】Root用户密码
+    # 注意:必须启用Root用户以密码形式SSH登录
+    password: ""
+  # 待优化应用设置
+  mysql:
+    # 【必填】数据库用户名
+    user: "root"
+    # 【必填】数据库密码
+    password: ""
+```
+
+## 安装智能调优插件
+
+```bash
+helm install -n euler-copilot agents .
+```
+
+如果之前有执行过安装,则按下面指令更新插件服务
+
+```bash
+helm upgrade -n euler-copilot agents . 
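+# 可选:更新后查看发布状态(假设 release 名为 agents)
+# helm -n euler-copilot status agents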
+``` + +如果 framework未重启,则需要重启framework配置 + +```bash +kubectl delete pod framework-deploy-service-bb5b58678-jxzqr -n eulercopilot +``` + +## 测试 + ++ 查看 tune 的 pod 状态 + + ```bash + NAME READY STATUS RESTARTS AGE + authhub-backend-deploy-authhub-64896f5cdc-m497f 2/2 Running 0 16d + authhub-web-deploy-authhub-7c48695966-h8d2p 1/1 Running 0 17d + pgsql-deploy-databases-86b4dc4899-ppltc 1/1 Running 0 17d + redis-deploy-databases-f8866b56-kj9jz 1/1 Running 0 17d + mysql-deploy-databases-57f5f94ccf-sbhzp 2/2 Running 0 17d + framework-deploy-service-bb5b58678-jxzqr 2/2 Running 0 16d + rag-deploy-service-5b7887644c-sm58z 2/2 Running 0 110m + vectorize-deploy-service-57f5f94ccf-sbhzp 2/2 Running 0 17d + web-deploy-service-74fbf7999f-r46rg 1/1 Running 0 2d + tune-deploy-agents-5d46bfdbd4-xph7b 1/1 Running 0 2d + ``` + ++ pod启动失败排查办法 + + 检查 euler-copilot-tune 目录下的 openapi.yaml 中 `servers.url` 字段,确保调优服务的启动地址被正确设置 + + 检查 `$plugin_dir` 插件文件夹的路径是否配置正确,该变量位于 `deploy/chart/euler_copilot/values.yaml` 中的 `framework`模块,如果插件目录不存在,需新建该目录,并需要将该目录下的 euler-copilot-tune 文件夹放到 `$plugin_dir` 中。 + + 检查sglang的地址和key填写是否正确,该变量位于 `vim /home/euler-copilot-framework/deploy/chart/euler_copilot/values.yaml` + + ```yaml + # 用于Function Call的模型 + scheduler: + # 推理框架类型 + backend: sglang + # 模型地址 + url: "" + # 模型 API Key + key: "" + # 数据库设置 + ``` + ++ 命令行客户端使用智能调优 + + 具体使用可参考 [openEuler Copilot System 命令行(智能插件:智能调优)](../../../使用指南/命令行客户端/智能调优.md) diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\227\240\347\275\221\347\273\234\347\216\257\345\242\203\344\270\213\351\203\250\347\275\262\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\227\240\347\275\221\347\273\234\347\216\257\345\242\203\344\270\213\351\203\250\347\275\262\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..89f11b2b5da094f278824edec21878d4f5b8ccb6 --- /dev/null 
+++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\227\240\347\275\221\347\273\234\347\216\257\345\242\203\344\270\213\351\203\250\347\275\262\346\214\207\345\215\227.md" @@ -0,0 +1,741 @@ +# 无网络环境下部署指南 + +## 介绍 + +openEuler Copilot System 是一款智能问答工具,使用 openEuler Copilot System 可以解决操作系统知识获取的便捷性,并且为OS领域模型赋能开发者及运维人员。作为获取操作系统知识,使能操作系统生产力工具 (如 A-Ops / A-Tune / x2openEuler / EulerMaker / EulerDevOps / StratoVirt / iSulad 等),颠覆传统命令交付方式,由传统命令交付方式向自然语义进化,并结合智能体任务规划能力,降低开发、使用操作系统特性的门槛。 + +### 组件介绍 + +| 组件 | 端口 | 说明 | +| ----------------------------- | --------------- | -------------------- | +| euler-copilot-framework | 8002 (内部端口) | 智能体框架服务 | +| euler-copilot-web | 8080 | 智能体前端界面 | +| euler-copilot-rag | 8005 (内部端口) | 检索增强服务 | +| euler-copilot-vectorize-agent | 8001 (内部端口) | 文本向量化服务 | +| mysql | 3306 (内部端口) | MySQL数据库 | +| redis | 6379 (内部端口) | Redis数据库 | +| postgres | 5432 (内部端口) | 向量数据库 | +| secret_inject | 无 | 配置文件安全复制工具 | + +## 环境要求 + +### 软件要求 + +| 类型 | 版本要求 | 说明 | +|------------| -------------------------------------|--------------------------------------| +| 操作系统 | openEuler 22.03 LTS 及以上版本 | 无 | +| K3s | >= v1.30.2,带有 Traefik Ingress 工具 | K3s 提供轻量级的 Kubernetes 集群,易于部署和管理 | +| Helm | >= v3.15.3 | Helm 是一个 Kubernetes 的包管理工具,其目的是快速安装、升级、卸载 openEuler Copilot System 服务 | +| python | >=3.9.9 | python3.9.9 以上版本为模型的下载和安装提供运行环境 | + +### 硬件要求 + +| 类型 | 硬件要求 | +|----------------| -----------------------------| +| 服务器 | 1台 | +| CPU | 鲲鹏或x86_64,>= 32 cores | +| RAM | >= 64GB | +| 存储 | >= 500 GB | +| GPU | Tesla V100 16GB,4张 | +| NPU | 910ProB、910B | + +注意: + +1. 若无 GPU 或 NPU 资源,建议通过调用 OpenAI 接口的方式来实现功能。(接口样例:) +2. 调用第三方 OpenAI 接口的方式不需要安装高版本的 python (>=3.9.9) +3. 
英伟达 GPU 对 Docker 的支持必需要新版本 Docker (>= v25.4.0) + +### 部署视图 + +![部署图](./pictures/部署视图.png) + +## 获取 openEuler Copilot System + +- 从 openEuler Copilot System 的官方Git仓库 [euler-copilot-framework](https://gitee.com/openeuler/euler-copilot-framework) 下载最新的部署仓库 +- 如果您正在使用 Kubernetes,则不需要安装 k3s 工具。 + + ```bash + # 下载目录以 home 为例 + cd /home + ``` + + ```bash + git clone https://gitee.com/openeuler/euler-copilot-framework.git + ``` + +## 环境准备 + +如果您的服务器、硬件、驱动等全部就绪,即可启动环境初始化流程,以下部署步骤在无公网环境执行。 + +### 1. 环境检查 + +环境检查主要是对服务器的主机名、DNS、防火墙设置、磁盘剩余空间大小、网络、检查 SELinux 的设置。 + +- 主机名设置 + 在Shell中运行如下命令: + + ```bash + cat /etc/hostname + echo "主机名" > /etc/hostname + ``` + +- 系统DNS设置:需要给当前主机设置有效的DNS +- 防火墙设置 + + ```bash + # 查看防火墙状态 + systemctl status firewalld + # 查看防火墙列表 + firewall-cmd --list-all + # 关闭防火墙 + systemctl stop firewalld + systemctl disable firewalld + ``` + +- SELinux设置 + + ```bash + # 需要关闭selinux,可以临时关闭或永久关闭 + # 永久关闭SELinux + sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config + # 临时关闭 + setenforce 0 + ``` + +### 2. 文件下载 + +- 模型文件 bge-reranker-large、bge-mixed-model 下载 [模型文件下载链接](https://repo.oepkgs.net/openEuler/rpm/openEuler-22.03-LTS/contrib/EulerCopilot/) + + ```bash + mkdir -p /home/EulerCopilot/models + cd /home/EulerCopilot/models + # 将需要下载的bge文件放置在models目录 + wget https://repo.oepkgs.net/openEuler/rpm/openEuler-22.03-LTS/contrib/EulerCopilot/bge-mixed-model.tar.gz + wget https://repo.oepkgs.net/openEuler/rpm/openEuler-22.03-LTS/contrib/EulerCopilot/bge-reranker-large.tar.gz + ``` + +- 下载分词工具 text2vec-base-chinese-paraphrase [分词工具下载链接](https://repo.oepkgs.net/openEuler/rpm/openEuler-22.03-LTS/contrib/EulerCopilot/) + + ```bash + mkdir -p /home/EulerCopilot/text2vec + cd /home/EulerCopilot/text2vec + wget https://repo.oepkgs.net/openEuler/rpm/openEuler-22.03-LTS/contrib/EulerCopilot/text2vec-base-chinese-paraphrase.tar.gz + ``` + +- 镜像包下载 + - x86或arm架构的EulerCopilot服务的各组件镜像单独提供 + +### 3. 
安装部署工具
+
+#### 3.1 安装 Docker
+
+如需要基于 GPU/NPU 部署大模型,需要检查 Docker 版本是否满足 >= v25.4.0,如不满足,请升级 Docker 版本
+
+#### 3.2 安装 K3s 并导入镜像
+
+- 安装 SELinux 配置文件
+
+  ```bash
+  yum install -y container-selinux selinux-policy-base
+  # packages里有k3s-selinux-0.1.1-rc1.el7.noarch.rpm的离线包
+  rpm -i https://rpm.rancher.io/k3s-selinux-0.1.1-rc1.el7.noarch.rpm
+  ```
+
+- x86 架构安装 k3s
+
+  ```bash
+  # 在有网络的环境上获取k3s相关包,以v1.30.3+k3s1示例
+  wget https://github.com/k3s-io/k3s/releases/download/v1.30.3%2Bk3s1/k3s
+  wget https://github.com/k3s-io/k3s/releases/download/v1.30.3%2Bk3s1/k3s-airgap-images-amd64.tar.zst
+  cp k3s /usr/local/bin/
+  cd /var/lib/rancher/k3s/agent
+  mkdir images
+  cp k3s-airgap-images-amd64.tar.zst /var/lib/rancher/k3s/agent/images
+  # packages里有k3s-install.sh的离线包
+  curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh -o k3s-install.sh && chmod +x k3s-install.sh
+  INSTALL_K3S_SKIP_DOWNLOAD=true ./k3s-install.sh
+  export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
+  ```
+
+- arm 架构安装 k3s
+
+  ```bash
+  # 在有网络的环境上获取k3s相关包,以v1.30.3+k3s1示例
+  wget https://github.com/k3s-io/k3s/releases/download/v1.30.3%2Bk3s1/k3s-arm64
+  wget https://github.com/k3s-io/k3s/releases/download/v1.30.3%2Bk3s1/k3s-airgap-images-arm64.tar.zst
+  cp k3s-arm64 /usr/local/bin/k3s
+  cd /var/lib/rancher/k3s/agent
+  mkdir images
+  cp k3s-airgap-images-arm64.tar.zst /var/lib/rancher/k3s/agent/images
+  # packages里有k3s-install.sh的离线包
+  curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh -o k3s-install.sh && chmod +x k3s-install.sh
+  INSTALL_K3S_SKIP_DOWNLOAD=true ./k3s-install.sh
+  export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
+  ```
+
+- 导入镜像
+
+  ```bash
+  # 导入已下载的镜像文件
+  k3s ctr images import <镜像文件路径>
+  ```
+
+#### 3.3 安装 Helm 工具
+
+- x86_64 架构
+
+  ```bash
+  wget https://get.helm.sh/helm-v3.15.0-linux-amd64.tar.gz
+  tar -xzf helm-v3.15.0-linux-amd64.tar.gz
+  mv linux-amd64/helm /usr/sbin
+  rm -rf linux-amd64
+  ```
+
+- arm64 架构
+
+  ```bash
+  wget https://get.helm.sh/helm-v3.15.0-linux-arm64.tar.gz
+  tar -xzf helm-v3.15.0-linux-arm64.tar.gz
+  mv linux-arm64/helm /usr/sbin
+  rm -rf linux-arm64
+  
```
+
+#### 3.4 大模型准备
+
+提供第三方 OpenAI 接口,或基于硬件本地部署大模型;本地部署大模型可参考附录部分。
+
+## 安装
+
+您的环境现已就绪,接下来即可启动 openEuler Copilot System 的安装流程。
+
+- 下载目录以 home 为例,进入 openEuler Copilot System 仓库的 Helm 配置文件目录
+
+  ```bash
+  cd /home/euler-copilot-framework && ll
+  ```
+
+  ```bash
+  total 28
+  drwxr-xr-x 3 root root 4096 Aug 28 17:45 docs/
+  drwxr-xr-x 5 root root 4096 Aug 28 17:45 deploy/
+  ```
+
+- 查看 deploy 的目录
+
+  ```bash
+  tree deploy
+  ```
+
+  ```bash
+  deploy/chart
+  ├── databases
+  │   ├── Chart.yaml
+  │   ├── configs
+  │   ├── templates
+  │   └── values.yaml
+  ├── authhub
+  │   ├── Chart.yaml
+  │   ├── configs
+  │   ├── templates
+  │   └── values.yaml
+  └── euler_copilot
+      ├── Chart.yaml
+      ├── configs
+      ├── templates
+      │   ├── NOTES.txt
+      │   ├── rag
+      │   ├── vectorize
+      │   └── web
+      └── values.yaml
+  ```
+
+### 1. 安装数据库
+
+- 编辑 values.yaml
+
+  ```bash
+  cd deploy/chart/databases
+  ```
+
+  仅需修改镜像 tag 为对应架构,其余可不进行修改
+
+  ```bash
+  vim values.yaml
+  ```
+
+- 创建命名空间
+
+  ```bash
+  kubectl create namespace euler-copilot
+  ```
+
+  设置环境变量
+
+  ```bash
+  export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
+  ```
+
+- 安装数据库
+
+  ```bash
+  helm install -n euler-copilot databases .
+  ```
+
+- 查看 pod 状态
+
+  ```bash
+  kubectl -n euler-copilot get pods
+  ```
+
+  ```bash
+  pgsql-deploy-databases-86b4dc4899-ppltc   1/1   Running   0   17d
+  redis-deploy-databases-f8866b56-kj9jz     1/1   Running   0   17d
+  mysql-deploy-databases-57f5f94ccf-sbhzp   2/2   Running   0   17d
+  ```
+
+- 若服务器之前部署过 mysql,则可先清除 pvc,再部署 databases。
+
+  ```bash
+  # 获取pvc
+  kubectl -n euler-copilot get pvc
+  ```
+
+  ```bash
+  # 删除pvc
+  kubectl -n euler-copilot delete pvc mysql-pvc
+  ```
+
+### 2. 安装鉴权平台 Authhub
+
+- 编辑 values.yaml
+
+  ```bash
+  cd deploy/chart/authhub
+  ```
+
+  请结合 YAML 注释中的[必填]项进行修改
+
+  ```bash
+  vim values.yaml
+  ```
+
+  - 注意:
+    1. authHub 需要域名,可预先申请域名或在 'C:\Windows\System32\drivers\etc\hosts' 下配置。authhub 和 euler-copilot 必须是同一个根域名的两个子域名,例如 authhub.test.com 和 eulercopilot.test.com
+    2. 
修改 tag 为对应架构的 tag;
+
+- 安装 AuthHub
+
+  ```bash
+  helm install -n euler-copilot authhub .
+  ```
+
+  AuthHub 默认账号 `administrator`,密码 `changeme`
+
+- 查看 pod 状态
+
+  ```bash
+  kubectl -n euler-copilot get pods
+  ```
+
+  ```bash
+  NAME                                              READY   STATUS    RESTARTS   AGE
+  authhub-backend-deploy-authhub-64896f5cdc-m497f   2/2     Running   0          16d
+  authhub-web-deploy-authhub-7c48695966-h8d2p       1/1     Running   0          17d
+  pgsql-deploy-databases-86b4dc4899-ppltc           1/1     Running   0          17d
+  redis-deploy-databases-f8866b56-kj9jz             1/1     Running   0          17d
+  mysql-deploy-databases-57f5f94ccf-sbhzp           2/2     Running   0          17d
+  ```
+
+- 登录 AuthHub
+
+  AuthHub 的域名以 authhub.test.com 为例,浏览器输入 `https://authhub.test.com`,登录界面如下图所示:
+
+  ![部署图](./pictures/authhub登录界面.png)
+
+- 创建应用 eulercopilot
+
+  ![部署图](./pictures/创建应用界面.png)
+  点击创建应用,输入应用名称、应用主页和应用回调地址(登录后回调地址),参考如下:
+  - 应用名称:eulercopilot
+  - 应用主页: 
+  - 应用回调地址: 
+  - 应用创建好后会生成 Client ID 和 Client Secret,以 eulercopilot 为例,将生成的 Client ID 和 Client Secret 添加到配置文件 `deploy/chart/euler_copilot/values.yaml` 中
+
+  ![部署图](./pictures/创建应用成功界面.png)
+
+### 3. 安装 openEuler Copilot System
+
+- 编辑 values.yaml
+
+  ```bash
+  cd deploy/chart/euler_copilot
+  ```
+
+  请结合 YAML 注释中的[必填]项进行修改
+
+  ```bash
+  vim values.yaml
+  ```
+
+  - 注意:
+    1. 查看系统架构,并修改 values.yaml 中的 tag;
+    2. 修改 values.yaml 中 globals 的 domain 为 EulerCopilot 域名,并配置大模型的相关信息
+    3. 手动创建 `docs_dir`、`plugin_dir`、`models` 三个文件挂载目录
+    4. 修改 values.yaml 中 framework 章节的 web_url 和 oidc 设置
+    5. 如果部署插件,则需要配置用于 Function Call 的模型,此时必须有 GPU 环境用于部署 sglang,可参考附录
+
+- 安装 openEuler Copilot System
+
+  ```bash
+  helm install -n euler-copilot service .
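+  # 可选:安装后可执行 helm -n euler-copilot list 查看各 release 状态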
+  ```
+
+- 查看 Pod 状态
+
+  ```bash
+  kubectl -n euler-copilot get pods
+  ```
+
+  镜像拉取过程可能需要大约一分钟的时间,请耐心等待。部署成功后,所有 Pod 的状态应显示为 Running。
+
+  ```bash
+  NAME                                              READY   STATUS    RESTARTS   AGE
+  authhub-backend-deploy-authhub-64896f5cdc-m497f   2/2     Running   0          16d
+  authhub-web-deploy-authhub-7c48695966-h8d2p       1/1     Running   0          17d
+  pgsql-deploy-databases-86b4dc4899-ppltc           1/1     Running   0          17d
+  redis-deploy-databases-f8866b56-kj9jz             1/1     Running   0          17d
+  mysql-deploy-databases-57f5f94ccf-sbhzp           2/2     Running   0          17d
+  framework-deploy-service-bb5b58678-jxzqr          2/2     Running   0          16d
+  rag-deploy-service-5b7887644c-sm58z               2/2     Running   0          110m
+  vectorize-deploy-service-57f5f94ccf-sbhzp         2/2     Running   0          17d
+  web-deploy-service-74fbf7999f-r46rg               1/1     Running   0          2d
+  ```
+
+  注意:如果 Pod 状态出现失败,建议按照以下步骤进行排查
+
+  1. 查看 Kubernetes 集群的事件 (Events),以获取更多关于 Pod 失败的上下文信息
+
+     ```bash
+     kubectl -n euler-copilot get events
+     ```
+
+  2. 查看镜像拉取是否成功
+
+     ```bash
+     k3s crictl images
+     ```
+
+  3. 检查 RAG 的 Pod 日志,以确定是否有错误信息或异常行为。
+
+     ```bash
+     kubectl logs rag-deploy-service-5b7887644c-sm58z -n euler-copilot
+     ```
+
+  4. 验证 Kubernetes 集群的资源状态,检查服务器资源或配额是否足够,资源不足常导致 Pod 镜像拉取失败。
+
+     ```bash
+     df -h
+     ```
+
+  5. 如果镜像未拉取成功且大小为 0,请检查 k3s 版本是否满足要求(不低于 v1.30.2)
+
+     ```bash
+     k3s -v
+     ```
+
+  6. 确认 values.yaml 中 framework 的 OIDC 设置是否正确配置,以确保身份验证和授权功能正常工作。
+
+     ```bash
+     vim /home/euler-copilot-framework/deploy/chart/euler_copilot/values.yaml
+     ```
+
+## 验证安装
+
+恭喜您,openEuler Copilot System 的部署已完成!现在,您可以开启智能问答的非凡体验之旅了。
+请在浏览器中输入 https://$(host_ip):8080 或 (其中 port 默认值为 8080,若更改则需相应调整)访问 openEuler Copilot System 网页,并尝试进行智能问答体验。
+
+首先请点击下方页面的“立即注册”按钮,完成账号的注册与登录。
+![Web登录界面](./pictures/WEB登录界面.png)
+
+![Web 界面](./pictures/WEB界面.png)
+
+## 安装插件
+
+详细信息请参考文档 [插件部署指南](./插件部署指南)
+
+## 构建专有领域智能问答
+
+### 1. 构建 openEuler 专业知识领域的智能问答
+
+ 1. 修改 values.yaml 的 pg 的镜像仓库为 `pg-data`
+ 2. 修改 values.yaml 的 rag 部分的字段 `knowledgebaseID: openEuler_2bb3029f`
+ 3. 
将 `deploy/chart/databases/templates/pgsql/pgsql-deployment.yaml` 中的 volumes 相关字段注释
+ 4. 进入 `cd deploy/chart/databases`,执行更新服务 `helm upgrade -n euler-copilot databases .`
+ 5. 进入 `cd deploy/chart/euler_copilot`,执行更新服务 `helm upgrade -n euler-copilot service .`
+ 6. 进入网页端进行 openEuler 专业知识领域的问答
+
+### 2. 构建项目专属知识领域智能问答
+
+详细信息请参考文档 [本地资产库构建指南](本地资产库构建指南.md)
+
+## 附录
+
+### 大模型准备
+
+#### GPU 环境
+
+参考以下方式进行部署
+
+1. 下载模型文件:
+
+   ```bash
+   huggingface-cli download --resume-download Qwen/Qwen1.5-14B-Chat --local-dir Qwen1.5-14B-Chat
+   ```
+
+2. 创建终端 control
+
+   ```bash
+   screen -S control
+   ```
+
+   ```bash
+   python3 -m fastchat.serve.controller
+   ```
+
+   - 按 Ctrl+A 再按 D 将会话置于后台
+
+3. 创建新终端 api
+
+   ```bash
+   screen -S api
+   ```
+
+   ```bash
+   python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 30000 --api-keys sk-123456
+   ```
+
+   - 按 Ctrl+A 再按 D 将会话置于后台
+   - 如果当前环境的 Python 版本是 3.12 或 3.9,可以创建 Python 3.10 的 conda 虚拟环境
+
+   ```bash
+   mkdir -p /root/py310
+   ```
+
+   ```bash
+   conda create --prefix=/root/py310 python==3.10.14
+   ```
+
+   ```bash
+   conda activate /root/py310
+   ```
+
+4. 创建新终端 worker
+
+   ```bash
+   screen -S worker
+   ```
+
+   ```bash
+   screen -r worker
+   ```
+
+   安装 fastchat 和 vllm
+
+   ```bash
+   pip install fschat vllm
+   ```
+
+   安装依赖:
+
+   ```bash
+   pip install "fschat[model_worker]"
+   ```
+
+   ```bash
+   python3 -m fastchat.serve.vllm_worker --model-path /root/models/Qwen1.5-14B-Chat/ --model-name qwen1.5 --num-gpus 8 --gpu-memory-utilization=0.7 --dtype=half
+   ```
+
+   - 按 Ctrl+A 再按 D 将会话置于后台
+
+5. 按照如下方式配置文件,并更新服务。
+
+   ```bash
+   vim deploy/chart/euler_copilot/values.yaml
+   ```
+
+   修改如下部分
+
+   ```yaml
+   llm:
+     # 开源大模型,OpenAI兼容接口
+     openai:
+       url: "http://$(IP):30000"
+       key: "sk-123456"
+       model: qwen1.5
+       max_tokens: 8192
+   ```
+
+#### NPU 环境
+
+NPU 环境部署可参考链接 [MindIE安装指南](https://www.hiascend.com/document/detail/zh/mindie/10RC2/whatismindie/mindie_what_0001.html)
+
+## FAQ
+
+### 1. huggingface 使用报错?
+
+```text
+File "/usr/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
+raise NewConnectionError(
+urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object>: Failed to establish a new connection: [Errno 101] Network is unreachable
+```
+
+- 解决办法
+
+```bash
+pip3 install -U huggingface_hub
+```
+
+```bash
+export HF_ENDPOINT=https://hf-mirror.com
+```
+
+### 2. 如何在 RAG 容器中调用获取问答结果的接口?
+
+- 请先进入到 RAG 对应 Pod
+
+```bash
+curl -k -X POST "http://localhost:8005/kb/get_answer" -H "Content-Type: application/json" -d '{
+  "question": "",
+  "kb_sn": "default_test",
+  "fetch_source": true }'
+```
+
+### 3. 执行 `helm upgrade` 报错?
+
+```text
+Error: INSTALLATION FAILED: Kubernetes cluster unreachable: Get "http://localhost:8080/version": dial tcp [::1]:8080: connect: connection refused
+```
+
+或者
+
+```text
+Error: UPGRADE FAILED: Kubernetes cluster unreachable: the server could not find the requested resource
+```
+
+- 解决办法
+
+```bash
+export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
+```
+
+### 4. 无法查看 Pod 日志?
+
+```text
+[root@localhost euler-copilot]# kubectl logs rag-deploy-service-65c75c48d8-44vcp -n euler-copilot
+Defaulted container "rag" out of: rag, rag-copy-secret (init)
+Error from server: Get "https://172.21.31.11:10250/containerLogs/euler-copilot/rag-deploy-service-65c75c48d8-44vcp/rag": Forbidden
+```
+
+- 解决办法
+  如果设置了代理,需要将本机的网络 IP 从代理中剔除
+
+```bash
+cat /etc/systemd/system/k3s.service.env
+```
+
+```text
+http_proxy="http://172.21.60.51:3128"
+https_proxy="http://172.21.60.51:3128"
+no_proxy=172.21.31.10 # 代理中剔除本机IP
+```
+
+### 5. GPU 环境部署大模型时出现无法流式回复?
+
+在服务器上执行 curl 调用大模型失败,但将 `"stream": true` 改为 `"stream": false` 后即可 curl 通。
+
+```bash
+curl http://localhost:30000/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer sk-123456" -d '{
+"model": "qwen1.5",
+"messages": [
+{
+"role": "system",
+"content": "你是情感分析专家,你的任务是xxxx"
+},
+{
+"role": "user",
+"content": "你好"
+}
+],
+"stream": true,
+"n": 1,
+"max_tokens": 32768
+}'
+```
+
+- 解决办法:
+
+```bash
+pip install pydantic==1.10.13
+```
+
+### 6. 如何部署 sglang?
+
+```bash
+# 1. 激活 Conda 环境, 并激活 Python 3.10 的 Conda 环境。假设你的环境名为 `myenv`:
+conda activate myenv
+
+# 2. 在激活的环境中,安装 sglang[all] 和 flashinfer
+pip install "sglang[all]==0.3.0"
+pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
+
+# 3. 启动服务器
+python -m sglang.launch_server --served-model-name Qwen2.5-32B --model-path Qwen2.5-32B-Instruct-AWQ --host 0.0.0.0 --port 8001 --api-key sk-12345 --mem-fraction-static 0.5 --tp 8
+```
+
+- 验证安装
+
+  ```bash
+  pip show sglang
+  pip show flashinfer
+  ```
+
+- 注意:
+
+  1. API Key:请确保 `--api-key` 参数中的 API 密钥是正确的
+  2. 模型路径:确保 `--model-path` 参数中的路径是正确的,并且模型文件存在于该路径下。
+  3. CUDA 版本:确保你的系统上安装了 CUDA 12.1 和 PyTorch 2.4,因为 `flashinfer` 包依赖于这些特定版本。
+  4. 张量并行度:`--tp` 用于设置张量并行度,请根据你的 GPU 数量和预期负载调整。如果你有 8 个 GPU,可以选择 `--tp 8` 来充分利用这些资源。
+
+### 7. 如何 curl embedding?
+
+```bash
+curl -k -X POST http://$IP:8001/embedding \
+  -H "Content-Type: application/json" \
+  -d '{"texts": ["sample text 1", "sample text 2"]}'
+# $IP为vectorize的Embedding的内网地址
+```
+
+### 8. 如何生成证书?
+
+```bash
+# 下载地址: https://github.com/FiloSottile/mkcert/releases
+# 1. 下载 mkcert
+# x86_64
+wget https://github.com/FiloSottile/mkcert/releases/download/v1.4.4/mkcert-v1.4.4-linux-amd64
+# arm64
+wget https://github.com/FiloSottile/mkcert/releases/download/v1.4.4/mkcert-v1.4.4-linux-arm64
+
+# 2. 执行下面的命令生成密钥
+mkcert -install
+# mkcert 可直接接域名或 IP, 生成证书和密钥
+mkcert example.com
+
+# 3. 将证书和密钥拷贝到 /home/euler-copilot-framework_openeuler/deploy/chart_ssl/traefik-secret.yaml 中, 并执行下面命令使其生效。
+kubectl apply -f traefik-secret.yaml
+```
+
+### 9. Pod 状态由 Running 变为 Pending?
+
+在 Pod 正常运行一段时间后,其状态从“Running”全部转变为“Pending”或“Completed”。
+此时可执行命令 `df -h` 查看 Pod 所在宿主机的存储空间,确保可用空间不低于 30%,以保证 Pod 的正常运行。
diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272\346\214\207\345\215\227.md" "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272\346\214\207\345\215\227.md"
new file mode 100644
index 0000000000000000000000000000000000000000..21873a3335df2e1ce11832ed0ae6f38dfa33093a
--- /dev/null
+++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\346\234\254\345\234\260\350\265\204\344\272\247\345\272\223\346\236\204\345\273\272\346\214\207\345\215\227.md"
@@ -0,0 +1,406 @@
+# 本地资产库构建指南
+
+- RAG 是一个检索增强的模块。本指南主要为 RAG 提供命令行方式的数据库管理、资产管理、资产库管理和语料资产管理:
+  数据库管理提供清空数据库、初始化数据库等功能;
+  资产管理提供资产创建、资产查询和资产删除等功能;
+  资产库管理提供资产库创建、资产库查询和资产库删除等功能;
+  语料资产管理提供语料上传、语料查询和语料删除等功能。
+- 本指南面向管理员编写。对于管理员而言,可以拥有多个资产,一个资产包含多个资产库(不同资产库使用的向量化模型可以不同),一个资产库对应一个语料资产。
+- 本地语料上传指南是用户构建项目专属语料的指导,当前支持 docx、pdf、markdown、txt 和 xlsx 文件上传,推荐使用 docx 格式上传。
+
+## 准备工作
+
+- RAG 中关于语料上传目录挂载的配置:
+
+将本地语料保存到服务器的目录,例如 /home/docs 目录,且将 /home/docs 目录权限设置为755
+
+```bash
+# 设置本地存放文档目录权限为755
+chmod -R 755 /home/docs
+```
+
+将文件存放的源目录映射至 RAG 容器目标目录,源目录的配置在 中,下面是文件中具体配置映射源目录的配置方法:
+
+![配置映射源目录](./pictures/本地资产库构建/配置映射源目录.png)
+
+中间层的配置(链接源目录和目标目录的配置)在 中,下面是文件中具体映射中间层的配置方法:
+
+![配置映射中间层](./pictures/本地资产库构建/配置映射中间层.png)
+
+目标目录的配置在 中,下面是文件中具体映射目标目录的配置方法:
+
+![配置映射目标目录](./pictures/本地资产库构建/配置映射目标目录.png)
+
+- 更新 Copilot 服务:
+
+  ```bash
+  root@openeuler:/home/EulerCopilot/deploy/chart# helm upgrade -n euler-copilot service .
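+  # 提示(假设集群行为):挂载等 values 配置变更后,upgrade 一般会触发 rag 相关 Pod 重建,使配置生效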
+ # 请注意:service是服务名,可根据实际修改 + ``` + +- 进入到 RAG 容器: + + ```bash + root@openeuler:~# kubectl -n euler-copilot get pods + NAME READY STATUS RESTARTS AGE + framework-deploy-service-bb5b58678-jxzqr 2/2 Running 0 16d + mysql-deploy-service-c7857c7c9-wz9gn 1/1 Running 0 17d + pgsql-deploy-service-86b4dc4899-ppltc 1/1 Running 0 17d + rag-deploy-service-5b7887644c-sm58z 2/2 Running 0 110m + redis-deploy-service-f8866b56-kj9jz 1/1 Running 0 17d + vectorize-deploy-service-57f5f94ccf-sbhzp 2/2 Running 0 17d + web-deploy-service-74fbf7999f-r46rg 1/1 Running 0 2d + # 进入rag pod + root@openeuler:~# kubectl -n euler-copilot exec -it rag-deploy-service-5b7887644c-sm58z -- bash + ``` + +- 设置 PYTHONPATH + + ```bash + # 设置PYTHONPATH + export PYTHONPATH=$(pwd) + ``` + +## 上传语料 + +### 查看脚本帮助信息 + +```bash +python3 scripts/rag_kb_manager.pyc --help +usage: rag_kb_manager.pyc [-h] --method + {init_database_info,init_rag_info,init_database,clear_database,create_kb,del_kb,query_kb,create_kb_asset,del_kb_asset,query_kb_asset,up_corpus,del_corpus,query_corpus,stop_corpus_uploading_job} + [--database_url DATABASE_URL] [--vector_agent_name VECTOR_AGENT_NAME] [--parser_agent_name PARSER_AGENT_NAME] + [--rag_url RAG_URL] [--kb_name KB_NAME] [--kb_asset_name KB_ASSET_NAME] [--corpus_dir CORPUS_DIR] + [--corpus_chunk CORPUS_CHUNK] [--corpus_name CORPUS_NAME] [--up_chunk UP_CHUNK] + [--embedding_model {TEXT2VEC_BASE_CHINESE_PARAPHRASE,BGE_LARGE_ZH,BGE_MIXED_MODEL}] [--vector_dim VECTOR_DIM] + [--num_cores NUM_CORES] + +optional arguments: + -h, --help show this help message and exit + --method {init_database_info,init_rag_info,init_database,clear_database,create_kb,del_kb,query_kb,create_kb_asset,del_kb_asset,query_kb_asset,up_corpus,del_corpus,query_corpus,stop_corpus_uploading_job} + 脚本使用模式,有init_database_info(初始化数据库配置)、init_database(初始化数据库)、clear_database(清除数据库)、create_kb(创建资产)、 + del_kb(删除资产)、query_kb(查询资产)、create_kb_asset(创建资产库)、del_kb_asset(删除资产库)、query_kb_asset(查询 + 
资产库)、up_corpus(上传语料,当前支持txt、html、pdf、docx和md格式)、del_corpus(删除语料)、query_corpus(查询语料)和
+                        stop_corpus_uploading_job(上传语料失败后,停止当前上传任务)
+  --database_url DATABASE_URL
+                        语料资产所在数据库的url
+  --vector_agent_name VECTOR_AGENT_NAME
+                        向量化插件名称
+  --parser_agent_name PARSER_AGENT_NAME
+                        分词插件名称
+  --rag_url RAG_URL     rag服务的url
+  --kb_name KB_NAME     资产名称
+  --kb_asset_name KB_ASSET_NAME
+                        资产库名称
+  --corpus_dir CORPUS_DIR
+                        待上传语料所在路径
+  --corpus_chunk CORPUS_CHUNK
+                        语料切割尺寸
+  --corpus_name CORPUS_NAME
+                        待查询或者待删除语料名
+  --up_chunk UP_CHUNK   语料单次上传个数
+  --embedding_model {TEXT2VEC_BASE_CHINESE_PARAPHRASE,BGE_LARGE_ZH,BGE_MIXED_MODEL}
+                        初始化资产时决定使用的嵌入模型
+  --vector_dim VECTOR_DIM
+                        向量化维度
+  --num_cores NUM_CORES
+                        语料处理使用核数
+```
+
+### 具体操作
+
+以下命令中带**初始化**字段的,需要在进行资产管理前按指南中出现的相对顺序执行;命令中带**可重复**字段的,在后续过程中可以反复执行;命令中带**注意**字段的需谨慎执行。
+
+### 步骤1:配置数据库和 RAG 信息
+
+- #### 配置数据库信息(初始化)
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method init_database_info --database_url postgresql+psycopg2://postgres:123456@{database_url}:{database_port}/postgres
+```
+
+**注意:**
+
+**{database_url}** 为 k8s 集群内访问 postgres 服务的 url,请根据具体情况修改,一般为 **{postgres_service_name}-{{ .Release.Name }}.\<namespace\>.svc.cluster.local** 格式,其中 **{postgres_service_name}** 可以从 找到:
+
+![k8s集群中postgres服务的名称](./pictures/本地资产库构建/k8s集群中postgres服务的名称.png)
+
+**{{ .Release.Name }}** 和 **\<namespace\>** 为 helm 安装应用时指定的 **my-release-name** 以及 **my-namespace**,一条 helm 安装应用的命令如下所示:
+
+```bash
+helm install my-release-name --namespace my-namespace path/to/chart
+```
+
+**{database_port}** 的信息可以在 中查看,以下为字段所在位置(一般为5432):
+
+![postgres服务端口](./pictures/本地资产库构建/postgres服务端口.png)
+
+数据库信息配置命令执行完成之后,会在 scripts/config 下出现 database_info.json 文件,文件内容如下:
+
+```bash
+{"database_url": "postgresql+psycopg2://postgres:123456@{database_url}:{database_port}/postgres"}
+```
+
+下面是命令执行成功的截图:
+
+![数据库配置信息成功](./pictures/本地资产库构建/数据库配置信息成功.png)
+
+- #### 配置 RAG 信息(初始化)
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method init_rag_info --rag_url http://{rag_url}:{rag_port}
+```
+
+**{rag_url}** 为
0.0.0.0,**{rag_port}** 可以从 中获取(一般为8005):
+
+![rag_port](./pictures/本地资产库构建/rag_port.png)
+
+RAG 信息配置命令执行完成之后,会在 scripts/config 下出现 rag_info.json 文件,文件内容如下:
+
+```bash
+{"rag_url": "http://{rag_url}:{rag_port}"}
+```
+
+下面是命令执行成功的截图:
+
+![rag配置信息成功](./pictures/本地资产库构建/rag配置信息成功.png)
+
+### 步骤2:初始化数据库
+
+- #### 初始化数据库表格
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method init_database
+# 注意:
+# 对于特殊关系型数据库可指定插件参数'--vector_agent_name VECTOR_AGENT_NAME'和'--parser_agent_name PARSER_AGENT_NAME';其中VECTOR_AGENT_NAME默认为vector,PARSER_AGENT_NAME默认为zhparser
+```
+
+命令执行完成之后,可以进入数据库容器查看表格是否创建成功。首先获取命名空间中的所有 Pod 名称:
+
+```bash
+# 获取命名空间中的所有pod
+kubectl get pods -n euler-copilot
+```
+
+结果如下:
+
+![获取数据库pod名称](./pictures/本地资产库构建/获取数据库pod名称.png)
+
+使用下面命令进入数据库容器:
+
+```bash
+kubectl exec -it pgsql-deploy-b4cc79794-qn8zd -n euler-copilot -- bash
+```
+
+进入容器后使用下面命令进入数据库:
+
+```bash
+root@pgsql-deploy-b4cc79794-qn8zd:/tmp# psql -U postgres
+```
+
+再使用 `\dt` 查看数据库初始化情况,出现下面内容表示数据库初始化成功:
+
+![数据库初始化](./pictures/本地资产库构建/数据库初始化.png)
+
+- #### 清空数据库(注意)
+
+  假设您想清空 RAG 产生的所有数据库数据,可以使用下面命令(**此命令会清空整个数据库,需谨慎操作!**)。
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method clear_database
+# 清空数据库请谨慎操作
+```
+
+### 步骤3:创建资产
+
+  下列指令若不指定 kb_name,则默认资产名为 default_test(注:Copilot 不允许存在两个同名的资产):
+
+- #### 创建资产(可重复)
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method create_kb --kb_name default_test
+```
+
+创建资产成功会有以下提示:
+
+![创建资产成功](./pictures/本地资产库构建/创建资产成功.png)
+
+创建同名资产会有以下提示:
+
+![重复创建资产失败](./pictures/本地资产库构建/重复创建资产失败.png)
+
+- #### 删除资产(可重复)
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method del_kb --kb_name default_test
+```
+
+删除资产成功会出现以下提示(会将资产下的所有资产库和语料资产全部删除):
+
+![删除资产成功](./pictures/本地资产库构建/删除资产成功.png)
+
+对不存在的资产进行删除,会出现以下提示:
+
+![删除不存在的资产失败](./pictures/本地资产库构建/删除不存在的资产失败.png)
+
+- #### 查询资产(可重复)
+
+```bash
+python3 scripts/rag_kb_manager.pyc --method query_kb
+```
+
+查询资产成功会出现下面内容:
+
+![查询资产](./pictures/本地资产库构建/查询资产.png)
+
+在无资产的情况下查询资产会出现以下内容:
+![无资产时查询资产](./pictures/本地资产库构建/无资产时查询资产.png) + +### 步骤4:创建资产库 + +下列指令若不指定资产名(kb_name)和资产库名(kb_asset_name),则默认资产名为 default_test 和资产库名 default_test_asset(ps:Copilot 同一个资产下不允许存在两个同名的资产库): + +- #### 创建资产库(可重复) + +```bash +python3 scripts/rag_kb_manager.pyc --method create_kb_asset --kb_name default_test --kb_asset_name default_test_asset +# 创建属于default_test的资产库 +``` + +对于创建资产库成功会出现以下内容: + +![资产库创建成功](./pictures/本地资产库构建/资产库创建成功.png) + +对于指定不存在的资产库创建资产会出现以下内容: + +![指定不存在的资产创建资产库失败](./pictures/本地资产库构建/指定不存在的资产创建资产库失败.png) + +对于同一个资产下重复创建同名资产库会出现以下内容: + +![创建资产库失败由于统一资产下存在同名资产库](./pictures/本地资产库构建/创建资产库失败由于统一资产下存在同名资产库.png) + +- #### 删除资产库(可重复) + +```bash +python3 scripts/rag_kb_manager.pyc --method del_kb_asset --kb_name default_test --kb_asset_name default_test_asset +``` + +对于删除资产库成功会出现以下内容: + +![资产库删除成功](./pictures/本地资产库构建/资产库删除成功png.png) + +对于删除不存在的资产库失败会出现以下内容: + +![资产下不存在对应资产库](./pictures/本地资产库构建/删除资产库失败,资产下不存在对应资产库.png) + +对于删除不存在的资产下的资产库会出现以下内容: + +![不存在资产](./pictures/本地资产库构建/资产库删除失败,不存在资产.png) + +- #### 查询资产库(可重复) + +```bash +python3 scripts/rag_kb_manager.pyc --method query_kb_asset --kb_name default_test +# 注意:资产是最上层的,资产库属于资产,且不能重名 +``` + +对于查询资产库成功会出现以下内容: + +![资产下查询资产库成功](./pictures/本地资产库构建/资产下查询资产库成功.png) + +对于资产内无资产库的情况下查询资产库会出现以下内容: + +![资产下未查询到资产库](./pictures/本地资产库构建/资产下未查询到资产库.png) + +对于查询不存在的资产下的资产库会出现以下内容: + +![不存在资产](./pictures/本地资产库构建/资产库查询失败,不存在资产.png) + +### 步骤5:上传语料 + +下列指令若不指定资产名(kb_name)和资产库名(kb_asset_name),则默认资产名为 default_test 和资产库名 default_test_asset,对于删除语料命令需要指定完整的语料名称(语料统一为 docx 格式保存在数据库中,可以通过查询语料命令查看已上传的文档名称);对于查询语料命令可以不指定语料名称(corpus_name),此时默认查询所有语料,可以指定部分或者完整的语料名,此时通过模糊搜索匹配数据库内相关的语料名称。 + +- 上传语料 + +```bash +python3 scripts/rag_kb_manager.pyc --method up_corpus --corpus_dir ./scripts/docs/ --kb_name default_test --kb_asset_name default_test_asset +# 注意: +# 1. RAG容器用于存储用户语料的目录路径是'./scripts/docs/'。在执行相关命令前,请确保该目录下已有本地上传的语料。 +# 2. 
若语料已上传但查询未果,请检查宿主机上的待向量化语料目录(位于/home/euler-copilot/docs)的权限设置。 +# 为确保无权限问题影响,您可以通过运行chmod 755 /home/euler-copilot/docs命令来赋予该目录最大访问权限。 +``` + +对于语料上传成功会出现以下内容: + +![语料上传成功](./pictures/本地资产库构建/语料上传成功.png) + +对于语料具体的分割和上传情况可以在 logs/app.log 下查看,内容如下: + +![查看文档产生片段总数和上传成功总数](./pictures/本地资产库构建/查看文档产生片段总数和上传成功总数.png) + +- 删除语料 + +```bash +python3 scripts/rag_kb_manager.pyc --method del_corpus --corpus_name abc.docx --kb_name default_test --kb_asset_name default_test_asset +# 上传的文件统一转换为docx +``` + +对于语料删除成功会出现以下内容: + +![删除语料](./pictures/本地资产库构建/删除语料.png) + +对于删除不存在的语料会出现以下内容: + +![语料删除失败](./pictures/本地资产库构建/语料删除失败,未查询到相关语料.png) + +- 查询语料 + +```bash +# 查询指定名称的语料: +python3 scripts/rag_kb_manager.pyc --method query_corpus --corpus_name 语料名.docx +# 查询所有语料: +python3 scripts/rag_kb_manager.pyc --method query_corpus +``` + +对于查询所有语料会出现以下内容: + +![查询全部语料](./pictures/本地资产库构建/查询全部语料.png) + +- 停止上传任务 + +```bash +python3 scripts/rag_kb_manager.pyc --method stop_corpus_uploading_job +``` + +对于某些极端条件下(例如内存受限),上传语料失败,需要执行上面shell命令用于清除语料上传失败的缓存。 + +## 网页端查看语料上传进度 + +您可以灵活设置端口转发规则,通过执行如下命令将容器端口映射到主机上的指定端口,并在任何设备上通过访问 http://<主机IP>:<映射端口>(例如 )来查看语料上传的详细情况。 + +```bash +kubectl port-forward rag-deploy-service-5b7887644c-sm58z 3000:8005 -n euler-copilot --address=0.0.0.0 +# 注意: 3000是主机上的端口,8005是rag的容器端口,可修改映射到主机上的端口 +``` + +## 验证上传后效果 + +上传语料成功之后你可以通过以下命令直接与 RAG 交互,来观察语料是否上传成功。 + +```bash +curl -k -X POST "http://{rag_url}:{rag_port}/kb/get_answer" -H "Content-Type: application/json" -d '{ \ + "question": "question", \ + "kb_sn": "kb_name", \ + "fetch_source": true, \ + "top_k": 3 \ +}' +``` + +- `question`:问题 + +- `kb_sn`:资产库名称 + +- `fetch_source`:是否返回关联片段以及片段来源,`false` 代表不返回,`true` 代表返回 + +- `top_k`:关联语料片段个数,需要大于等于3 diff --git "a/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\347\275\221\347\273\234\347\216\257\345\242\203\344\270\213\351\203\250\347\275\262\346\214\207\345\215\227.md" 
"b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\347\275\221\347\273\234\347\216\257\345\242\203\344\270\213\351\203\250\347\275\262\346\214\207\345\215\227.md"
new file mode 100644
index 0000000000000000000000000000000000000000..83186358b104c8d51042db780d9850923db6e99c
--- /dev/null
+++ "b/docs/zh/docs/AI/openEuler_Copilot_System/\351\203\250\347\275\262\346\214\207\345\215\227/\347\275\221\347\273\234\347\216\257\345\242\203\344\270\213\351\203\250\347\275\262\346\214\207\345\215\227.md"
@@ -0,0 +1,625 @@
+# 网络环境部署指南
+
+## 介绍
+
+openEuler Copilot System 是一款智能问答工具,可以帮助用户便捷地获取操作系统知识,并以 OS 领域大模型赋能开发者及运维人员。它既是获取操作系统知识的入口,也可使能操作系统生产力工具(如 A-Ops / A-Tune / x2openEuler / EulerMaker / EulerDevOps / StratoVirt / iSulad 等),将传统的命令交付方式向自然语义交互演进,并结合智能体任务规划能力,降低开发、使用操作系统特性的门槛。
+
+### 组件介绍
+
+| 组件 | 端口 | 说明 |
+| ----------------------------- | --------------- | -------------------- |
+| euler-copilot-framework | 8002 (内部端口) | 智能体框架服务 |
+| euler-copilot-web | 8080 | 智能体前端界面 |
+| euler-copilot-rag | 8005 (内部端口) | 检索增强服务 |
+| euler-copilot-vectorize-agent | 8001 (内部端口) | 文本向量化服务 |
+| mysql | 3306 (内部端口) | MySQL数据库 |
+| redis | 6379 (内部端口) | Redis数据库 |
+| postgres | 5432 (内部端口) | 向量数据库 |
+| secret_inject | 无 | 配置文件安全复制工具 |
+
+## 环境要求
+
+### 软件要求
+
+| 类型 | 版本要求 | 说明 |
+|------------| -------------------------------------|--------------------------------------|
+| 操作系统 | openEuler 22.03 LTS 及以上版本 | 无 |
+| K3s | >= v1.30.2,带有 Traefik Ingress 工具 | K3s 提供轻量级的 Kubernetes 集群,易于部署和管理 |
+| Helm | >= v3.15.3 | Helm 是一个 Kubernetes 的包管理工具,其目的是快速安装、升级、卸载 openEuler Copilot System 服务 |
+| python | >= 3.9.9 | python 3.9.9 以上版本为模型的下载和安装提供运行环境 |
+
+### 硬件要求
+
+| 类型 | 硬件要求 |
+|----------------| -----------------------------|
+| 服务器 | 1台 |
+| CPU | 鲲鹏或x86_64,>= 32 cores |
+| RAM | >= 64GB |
+| 存储 | >= 500 GB |
+| GPU | Tesla V100 16GB,4张 |
+| NPU | 910ProB、910B |
+
+注意:
+
+1. 
若无 GPU 或 NPU 资源,建议通过调用 OpenAI 兼容接口的方式来实现功能(参考链接:[API-KEY的获取与配置](https://help.aliyun.com/zh/dashscope/developer-reference/acquisition-and-configuration-of-api-key?spm=a2c4g.11186623.0.0.30e7694eaaxxGa))
+2. 调用第三方 OpenAI 接口的方式不需要安装高版本的 python (>=3.9.9)
+3. 英伟达 GPU 对 Docker 的支持必须使用新版本 Docker (>= v25.4.0)
+4. 如果已有 k8s 集群环境,则不需要单独安装 k3s,要求版本 >= v1.28
+
+### 部署视图
+
+![部署图](./pictures/部署视图.png)
+
+## 获取 openEuler Copilot System
+
+- 从 openEuler Copilot System 的官方Git仓库 [euler-copilot-framework](https://gitee.com/openeuler/euler-copilot-framework) 下载最新的部署仓库
+- 如果您正在使用 Kubernetes,则不需要安装 k3s 工具。
+
+```bash
+# 下载目录以 home 为例
+cd /home
+```
+
+```bash
+git clone https://gitee.com/openeuler/euler-copilot-framework.git
+```
+
+## 环境准备
+
+设备需联网并符合 openEuler Copilot System 的最低软硬件要求。确认服务器、硬件、驱动等准备就绪后,即可开始环境准备工作。为了顺利进行后续操作,请按照指引,先进入脚本部署目录,并按照提供的操作步骤和脚本路径依次执行,以确保初始化成功。
+
+```bash
+# 进入部署脚本目录
+cd /home/euler-copilot-framework/deploy/scripts && tree
+```
+
+```bash
+.
+├── check_env.sh
+├── download_file.sh
+├── get_log.sh
+├── install_tools.sh
+└── prepare_docker.sh
+```
+
+| 序号 | 步骤内容 | 相关指令 | 说明 |
+|-------------- |----------|---------------------------------------------|------------------------------------------ |
+|1| 环境检查 | `bash check_env.sh` | 主要检查服务器的主机名、DNS、防火墙设置、磁盘剩余空间大小、网络以及 SELinux 的设置 |
+|2| 文件下载 | `bash download_file.sh` | 下载模型 bge-reranker-large、bge-mixed-model |
+|3| 安装部署工具 | `bash install_tools.sh v1.30.2+k3s1 v3.15.3 cn` | 安装 helm、k3s 工具。注意:cn 参数表示使用国内镜像站,可按需去掉 |
+|4| 大模型准备 | 提供第三方 OpenAI 接口或基于硬件本地部署大模型 | 本地部署大模型可参考附录部分 |
+
+## 安装
+
+您的环境现已就绪,接下来即可启动 openEuler Copilot System 的安装流程。
+
+- 下载目录以home为例,进入 openEuler Copilot System 仓库的 Helm 配置文件目录
+
+  ```bash
+  cd /home/euler-copilot-framework && ll
+  ```
+
+  ```bash
+  total 28
+  drwxr-xr-x  3 root root 4096 Aug 28 17:45 docs/
+  drwxr-xr-x  5 root root 4096 Aug 28 17:45 deploy/
+  ```
+
+- 查看deploy的目录
+
+  ```bash
+  tree deploy
+  ```
+
+  ```bash
+  deploy/chart
+  ├── databases
+  │   ├── Chart.yaml
+  │   ├── configs
+  │   ├── 
templates + │   └── values.yaml + ├── authhub + │   ├── Chart.yaml + │   ├── configs + │   ├── templates + │   └── values.yaml + └── euler_copilot + ├── Chart.yaml + ├── configs + ├── templates + │   ├── NOTES.txt + │   ├── rag + │   ├── vectorize + │   └── web + └── values.yaml + ``` + +### 1. 安装数据库 + +- 编辑 values.yaml + + ```bash + cd deploy/chart/databases + ``` + + 仅需修改镜像tag为对应架构,其余可不进行修改 + + ```bash + vim values.yaml + ``` + +- 创建命名空间 + + ```bash + kubectl create namespace euler-copilot + ``` + + 设置环境变量 + + ```bash + export KUBECONFIG=/etc/rancher/k3s/k3s.yaml + ``` + +- 安装数据库 + + ```bash + helm install -n euler-copilot databases . + ``` + +- 查看 pod 状态 + + ```bash + kubectl -n euler-copilot get pods + ``` + + ```bash + pgsql-deploy-databases-86b4dc4899-ppltc 1/1 Running 0 17d + redis-deploy-databases-f8866b56-kj9jz 1/1 Running 0 17d + mysql-deploy-databases-57f5f94ccf-sbhzp 2/2 Running 0 17d + ``` + +- 若服务器之前部署过 mysql,则可预先清除下 pvc,再部署 databases。 + + ```bash + # 获取pvc + kubectl -n euler-copilot get pvc + ``` + + ```bash + # 删除pvc + kubectl -n euler-copilot delete pvc mysql-pvc + ``` + +### 2. 安装鉴权平台Authhub + +- 编辑 values.yaml + + ```bash + cd deploy/chart/authhub + ``` + + 请结合 YAML 中的注释中的[必填]项进行修改 + + ```bash + vim values.yaml + ``` + + - 注意: + 1. authHub 需要域名,可预先申请域名或在 'C:\Windows\System32\drivers\etc\hosts' 下配置。 + authhub和euler-copilot必须是同一个根域名的两个子域名, 例如authhub.test.com和 + eulercopilot.test.com + 2. 修改tag为对应架构的tag; + +- 安装 AuthHub + + ```bash + helm install -n euler-copilot authhub . 
+  ```
+
+  AuthHub 默认账号 `administrator`, 密码 `changeme`
+
+- 查看 pod 状态
+
+  ```bash
+  kubectl -n euler-copilot get pods
+  ```

+  ```bash
+  NAME                                              READY   STATUS    RESTARTS   AGE
+  authhub-backend-deploy-authhub-64896f5cdc-m497f   2/2     Running   0          16d
+  authhub-web-deploy-authhub-7c48695966-h8d2p       1/1     Running   0          17d
+  pgsql-deploy-databases-86b4dc4899-ppltc           1/1     Running   0          17d
+  redis-deploy-databases-f8866b56-kj9jz             1/1     Running   0          17d
+  mysql-deploy-databases-57f5f94ccf-sbhzp           2/2     Running   0          17d
+  ```
+
+- 登录 AuthHub
+
+  AuthHub 的域名以 为例,浏览器输入`https://authhub.test.com`, 登录界面如下图所示:
+
+  ![部署图](./pictures/authhub登录界面.png)
+
+- 创建应用eulercopilot
+
+  ![部署图](./pictures/创建应用界面.png)
+  点击创建应用,输入应用名称、应用主页和应用回调地址(登录后回调地址),参考如下:
+  - 应用名称:eulercopilot
+  - 应用主页:
+  - 应用回调地址:
+  - 应用创建好后会生成 Client ID 和 Client Secret,以 eulercopilot 为例,将生成的 Client ID 和 Client Secret 添加到配置文件 `deploy/chart/euler_copilot/values.yaml` 中
+
+  ![部署图](./pictures/创建应用成功界面.png)
+
+### 3. 安装 openEuler Copilot System
+
+- 编辑 values.yaml
+
+  ```bash
+  cd deploy/chart/euler_copilot
+  ```
+
+  请结合 YAML 注释中的[必填]项进行修改
+
+  ```bash
+  vim values.yaml
+  ```
+
+  - 注意:
+    1. 查看系统架构,并修改values.yaml中的tag;
+    2. 修改values.yaml中globals的domain为EulerCopilot域名,并配置大模型的相关信息
+    3. 手动创建`docs_dir`、`plugin_dir`、`models`三个文件挂载目录
+    4. 修改values.yaml中framework章节的web_url和oidc设置
+    5. 如果部署插件,则需要配置用于Function Call的模型,此时必须有GPU环境用于部署sglang,可参考附录
+
+- 安装 openEuler Copilot System
+
+  ```bash
+  helm install -n euler-copilot service .
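+  # 提示:后续若修改了 values.yaml,可执行 helm upgrade -n euler-copilot service . 更新服务(service 为安装时指定的服务名)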
+  ```
+
+- 查看 Pod 状态
+
+  ```bash
+  kubectl -n euler-copilot get pods
+  ```
+
+  镜像拉取过程可能需要大约一分钟的时间,请耐心等待。部署成功后,所有 Pod 的状态应显示为 Running。
+
+  ```bash
+  NAME                                              READY   STATUS    RESTARTS   AGE
+  authhub-backend-deploy-authhub-64896f5cdc-m497f   2/2     Running   0          16d
+  authhub-web-deploy-authhub-7c48695966-h8d2p       1/1     Running   0          17d
+  pgsql-deploy-databases-86b4dc4899-ppltc           1/1     Running   0          17d
+  redis-deploy-databases-f8866b56-kj9jz             1/1     Running   0          17d
+  mysql-deploy-databases-57f5f94ccf-sbhzp           2/2     Running   0          17d
+  framework-deploy-service-bb5b58678-jxzqr          2/2     Running   0          16d
+  rag-deploy-service-5b7887644c-sm58z               2/2     Running   0          110m
+  vectorize-deploy-service-57f5f94ccf-sbhzp         2/2     Running   0          17d
+  web-deploy-service-74fbf7999f-r46rg               1/1     Running   0          2d
+  ```
+
+注意:如果 Pod 状态出现失败,建议按照以下步骤进行排查:
+
+  1. 查看 Kubernetes 集群的事件 (Events),以获取更多关于 Pod 失败的上下文信息
+
+     ```bash
+     kubectl -n euler-copilot get events
+     ```
+
+  2. 查看镜像拉取是否成功
+
+     ```bash
+     k3s crictl images
+     ```
+
+  3. 检查 RAG 的 Pod 日志,以确定是否有错误信息或异常行为。
+
+     ```bash
+     kubectl logs rag-deploy-service-5b7887644c-sm58z -n euler-copilot
+     ```
+
+  4. 验证 Kubernetes 集群的资源状态,检查服务器资源或配额是否足够,资源不足常导致 Pod 镜像拉取失败。
+
+     ```bash
+     df -h
+     ```
+
+  5. 如果镜像未拉取成功且大小为 0,请检查 k3s 版本是否满足要求(不低于 v1.30.2)
+
+     ```bash
+     k3s -v
+     ```
+
+  6. 确认 values.yaml 中 framework 的 OIDC 设置是否正确配置,以确保身份验证和授权功能正常工作。
+
+     ```bash
+     vim /home/euler-copilot-framework/deploy/chart/euler_copilot/values.yaml
+     ```
+
+## 验证安装
+
+恭喜您,openEuler Copilot System 的部署已完成!现在,您可以开启智能问答的非凡体验之旅了。
+请在浏览器中输入 https://$(host_ip):8080 或 (其中 port 默认值为8080,若更改则需相应调整)访问 openEuler Copilot System 网页,并尝试进行智能问答体验。
+
+首先请点击下方页面的“立即注册”按钮,完成账号的注册与登录。
+![Web登录界面](./pictures/WEB登录界面.png)
+![Web 界面](./pictures/WEB界面.png)
+
+## 安装插件
+
+详细信息请参考文档 [插件部署指南](./插件部署指南)
+
+## 构建专有领域智能问答
+
+### 1. 构建 openEuler 专业知识领域的智能问答
+
+ 1. 修改 values.yaml 的 pg 的镜像仓为 `pg-data`
+ 2. 修改 values.yaml 的 rag 部分的字段 `knowledgebaseID: openEuler_2bb3029f`
+ 3. 
将 `vim deploy/chart/databases/templates/pgsql/pgsql-deployment.yaml` 的 volumes 相关字段注释
+ 4. 进入 `cd deploy/chart/databases`,执行更新服务 `helm upgrade -n euler-copilot databases .`
+ 5. 进入 `cd deploy/chart/euler_copilot`,执行更新服务 `helm upgrade -n euler-copilot service .`
+ 6. 进入网页端进行 openEuler 专业知识领域的问答
+
+### 2. 构建项目专属知识领域智能问答
+
+详细信息请参考文档 [本地资产库构建指南](本地资产库构建指南.md)
+
+## 附录
+
+### 大模型准备
+
+#### GPU 环境
+
+参考以下方式进行部署
+
+1. 下载模型文件:
+
+   ```bash
+   huggingface-cli download --resume-download Qwen/Qwen1.5-14B-Chat --local-dir Qwen1.5-14B-Chat
+   ```
+
+2. 创建终端 control
+
+   ```bash
+   screen -S control
+   ```
+
+   ```bash
+   python3 -m fastchat.serve.controller
+   ```
+
+   - 按 Ctrl+A,再按 D 将会话置于后台
+
+3. 创建新终端 api
+
+   ```bash
+   screen -S api
+   ```
+
+   ```bash
+   python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 30000 --api-keys sk-123456
+   ```
+
+   - 按 Ctrl+A,再按 D 将会话置于后台
+   - 如果当前环境的 Python 版本是 3.12 或 3.9,可以创建 Python 3.10 的 conda 虚拟环境
+
+   ```bash
+   mkdir -p /root/py310
+   ```
+
+   ```bash
+   conda create --prefix=/root/py310 python==3.10.14
+   ```
+
+   ```bash
+   conda activate /root/py310
+   ```
+
+4. 创建新终端 worker
+
+   ```bash
+   screen -S worker
+   ```
+
+   ```bash
+   screen -r worker
+   ```
+
+   安装 fastchat 和 vllm
+
+   ```bash
+   pip install fschat vllm
+   ```
+
+   安装依赖:
+
+   ```bash
+   pip install fschat[model_worker]
+   ```
+
+   ```bash
+   python3 -m fastchat.serve.vllm_worker --model-path /root/models/Qwen1.5-14B-Chat/ --model-name qwen1.5 --num-gpus 8 --gpu-memory-utilization=0.7 --dtype=half
+   ```
+
+   - 按 Ctrl+A,再按 D 将会话置于后台
+
+5. 按照如下方式配置文件,并更新服务。
+
+   ```bash
+   vim deploy/chart/euler_copilot/values.yaml
+   ```
+
+   修改如下部分
+
+   ```yaml
+   llm:
+     # 开源大模型,OpenAI兼容接口
+     openai:
+       url: "http://$(IP):30000"
+       key: "sk-123456"
+       model: qwen1.5
+       max_tokens: 8192
+   ```
+
+#### NPU 环境
+
+NPU 环境部署可参考链接 [MindIE安装指南](https://www.hiascend.com/document/detail/zh/mindie/10RC2/whatismindie/mindie_what_0001.html)
+
+## FAQ
+
+### 1. huggingface 使用报错?
+
+```text
+File "/usr/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
+raise NewConnectionError(
+urllib3.exceptions.NewConnectionError: Failed to establish a new connection: [Errno 101] Network is unreachable
+```
+
+- 解决办法
+
+```bash
+pip3 install -U huggingface_hub
+```
+
+```bash
+export HF_ENDPOINT=https://hf-mirror.com
+```
+
+### 2. 如何在 RAG 容器中调用获取问答结果的接口?
+
+- 请先进入到 RAG 对应 Pod
+
+```bash
+curl -k -X POST "http://localhost:8005/kb/get_answer" -H "Content-Type: application/json" -d '{ \
+    "question": "", \
+    "kb_sn": "default_test", \
+    "fetch_source": true }'
+```
+
+### 3. 执行 `helm upgrade` 报错?
+
+```text
+Error: INSTALLATION FAILED: Kubernetes cluster unreachable: Get "http://localhost:8080/version": dial tcp [::1]:8080: connect: connection refused
+```
+
+或者
+
+```text
+Error: UPGRADE FAILED: Kubernetes cluster unreachable: the server could not find the requested resource
+```
+
+- 解决办法
+
+```bash
+export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
+```
+
+### 4. 无法查看 Pod 日志?
+
+```text
+[root@localhost euler-copilot]# kubectl logs rag-deploy-service-65c75c48d8-44vcp -n euler-copilot
+Defaulted container "rag" out of: rag, rag-copy-secret (init)
+Error from server: Get "https://172.21.31.11:10250/containerLogs/euler-copilot/rag-deploy-service-65c75c48d8-44vcp/rag": Forbidden
+```
+
+- 解决办法
+  如果设置了代理,需要将本机的网络 IP 从代理中剔除
+
+```bash
+cat /etc/systemd/system/k3s.service.env
+```
+
+```text
+http_proxy="http://172.21.60.51:3128"
+https_proxy="http://172.21.60.51:3128"
+no_proxy=172.21.31.10 # 代理中剔除本机IP
+```
+
+### 5. GPU 环境部署大模型时无法流式回复?
+
+在服务器上执行 curl 调用大模型失败,但将 `"stream": true` 改为 `"stream": false` 后即可 curl 通。
+
+```bash
+curl http://localhost:30000/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer sk-123456" -d '{
+"model": "qwen1.5",
+"messages": [
+{
+"role": "system",
+"content": "你是情感分析专家,你的任务是xxxx"
+},
+{
+"role": "user",
+"content": "你好"
+}
+],
+"stream": true,
+"n": 1,
+"max_tokens": 32768
+}'
+```
+
+- 解决办法:
+
+```bash
+pip install pydantic==1.10.13
+```
+
+### 6. 如何部署 sglang?
+
+```bash
+# 1. 激活 Python 3.10 的 Conda 环境。假设你的环境名为 `myenv`:
+conda activate myenv
+
+# 2. 在激活的环境中,安装 sglang[all] 和 flashinfer
+pip install sglang[all]==0.3.0
+pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
+
+# 3. 启动服务器
+python -m sglang.launch_server --served-model-name Qwen2.5-32B --model-path Qwen2.5-32B-Instruct-AWQ --host 0.0.0.0 --port 8001 --api-key sk-12345 --mem-fraction-static 0.5 --tp 8
+```
+
+- 验证安装
+
+  ```bash
+  pip show sglang
+  pip show flashinfer
+  ```
+
+- 注意:
+
+  1. API Key:请确保 `--api-key` 参数中的 API 密钥是正确的。
+  2. 模型路径:确保 `--model-path` 参数中的路径是正确的,并且模型文件存在于该路径下。
+  3. CUDA 版本:确保你的系统上安装了 CUDA 12.1 和 PyTorch 2.4,因为 `flashinfer` 包依赖于这些特定版本。
+  4. 张量并行度:`--tp` 指定张量并行使用的 GPU 数量,请根据你的 GPU 资源和预期负载调整。如果你有 8 个 GPU,可以选择 `--tp 8` 来充分利用这些资源。
+
+### 7. 如何 curl embedding?
+
+```bash
+curl -k -X POST http://$IP:8001/embedding \
+  -H "Content-Type: application/json" \
+  -d '{"texts": ["sample text 1", "sample text 2"]}'
+# $IP为vectorize的Embedding的内网地址
+```
+
+### 8. 如何生成证书?
+
+```bash
+# 下载地址: https://github.com/FiloSottile/mkcert/releases
+# 1. 下载 mkcert
+# x86_64
+wget https://github.com/FiloSottile/mkcert/releases/download/v1.4.4/mkcert-v1.4.4-linux-amd64
+# arm64
+wget https://github.com/FiloSottile/mkcert/releases/download/v1.4.4/mkcert-v1.4.4-linux-arm64
+# 2. 执行下面的命令生成密钥
+mkcert -install
+# mkcert 可直接接域名或 IP, 生成证书和密钥
+mkcert example.com
+# 3. 将证书和密钥拷贝到 `/home/euler-copilot-framework_openeuler/deploy/chart_ssl/traefik-secret.yaml` 中, 并执行下面命令使其生效。
+kubectl apply -f traefik-secret.yaml
+```
+
+### 9. Pod 状态由 Running 变为 Pending?
+
+在 Pod 正常运行一段时间后,其状态从“Running”全部转变为“Pending”或“Completed”。
+此时可执行命令 `df -h` 查看 Pod 所在宿主机的存储空间,确保可用空间不低于 30%,以保证 Pod 的正常运行。
diff --git "a/docs/zh/docs/AI4C/AI4C\347\224\250\346\210\267\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/AI4C/AI4C\347\224\250\346\210\267\344\275\277\347\224\250\346\214\207\345\215\227.md"
new file mode 100644
index 0000000000000000000000000000000000000000..63d82062e3be04f90a529cd5a51122a62b9ecaaf
--- /dev/null
+++ "b/docs/zh/docs/AI4C/AI4C\347\224\250\346\210\267\344\275\277\347\224\250\346\214\207\345\215\227.md"
@@ -0,0 +1,550 @@
+# AI4C 使用手册
+
+## 1 AI4C 介绍
+
+AI4C 代表 AI 辅助编译器的套件,是一个使编译器能够集成机器学习驱动编译优化的框架。
+
+## 2 软件架构说明
+
+本框架包含以下几个模块,其中自动编译调优工具依赖 python 环境:
+
+* AI 辅助编译优化的推理引擎,驱动编译器在优化 pass 内使用 AI 模型推理所获得的结果实现编译优化。
+  * 当前 GCC 内的 AI 使能优化 pass 基本通过编译器插件的形式实现,与编译器主版本解耦。
+* 自动编译调优工具,通过编译器外部的调优工具(OpenTuner)驱动编译器执行多层粒度的自动编译调优,当前支持 GCC 和 LLVM 编译器。
+  * 选项调优工具,用于应用级的编译选项调优。
+  * 编译调优工具,基于 [BiSheng-Autotuner](https://gitee.com/openeuler/BiSheng-Autotuner) 实现,可实现细粒度和粗粒度的编译调优。
+    * 细粒度调优,调优优化 pass 内的关键优化参数,例如,循环展开的次数(unroll count)。
+    * 粗粒度调优,调优函数级的编译选项。
+
+未来规划方向:
+
+- [ ] 集成 [ACPO](https://gitee.com/src-openeuler/ACPO) 的 LLVM 编译优化模型,同时将 ACPO LLVM 侧的相关代码提取成插件,与 LLVM 主版本解耦。
+- [ ] AI4Compiler 框架支持更多的开源机器学习框架的推理(pytorch - LibTorch、tensorflow - LiteRT)。
+- [ ] 提供更多的 AI 辅助编译优化模型及相应的编译器插件。
+- [ ] 集成新的搜索算法(基于白盒信息)并优化参数搜索空间(热点函数调优)。
+- [ ] 支持 JDK 的编译参数调优。
+
+## 3 AI4C 的安装构建
+
+### 3.1 直接安装AI4C
+
+若用户使用最新的 openEuler 系统(24.03-LTS-SP1),同时只准备使用`AI4C`的现有特性,可以直接安装`AI4C`包。
+
+```shell
+yum install -y AI4C
+```
+
+若用户需使用其他版本的`AI4C`特性,或在其他 OS 版本中安装`AI4C`,则需重新构建`AI4C`,可以参考以下步骤。
+
+### 3.2 RPM包构建安装流程(推荐)
+
+1. 使用 root 权限,安装 rpmbuild、rpmdevtools,具体命令如下:
+
+   ```bash
+   # 安装 rpmbuild
+   yum install dnf-plugins-core rpm-build
+   # 安装 rpmdevtools
+   yum install rpmdevtools
+   ```
+
+2. 在主目录`/root`下生成 rpmbuild 文件夹:
+
+   ```bash
+   rpmdev-setuptree
+   # 检查自动生成的目录结构
+   ls ~/rpmbuild/
+   BUILD BUILDROOT RPMS SOURCES SPECS SRPMS
+   ```
+
+3. 
使用`git clone https://gitee.com/src-openeuler/AI4C.git`,从目标仓库的 `openEuler-24.03-LTS-SP1` 分支拉取代码,并把目标文件放入 rpmbuild 的相应文件夹下: + + ``` shell + cp AI4C/AI4C-v%{version}-alpha.tar.gz ~/rpmbuild/SOURCES/ + cp AI4C/*.patch ~/rpmbuild/SOURCES/ + cp AI4C/AI4C.spec ~/rpmbuild/SPECS/ + ``` + +4. 用户可通过以下步骤生成 `AI4C` 的 RPM 包: + + ```shell + # 安装 AI4C 所需依赖 + yum-builddep ~/rpmbuild/SPECS/AI4C.spec + # 构建 AI4C 依赖包 + # 若出现 check-rpaths 相关报错,则需要在 rpmbuild 前添加 QA_RPATHS=0x0002,例如 + # QA_RPATHS=0x0002 rpmbuild -ba ~/rpmbuild/SPECS/AI4C.spec + rpmbuild -ba ~/rpmbuild/SPECS/AI4C.spec + # 安装 RPM 包 + cd ~/rpmbuild/RPMS/ + rpm -ivh AI4C--..rpm + ``` + + 注意事项:若系统因存有旧版本的 RPM 安装包而导致文件冲突,可以通过以下方式解决: + + ```shell + # 解决方案一:强制安装新版本 + rpm -ivh AI4C--..rpm --force + # 解决方案二:更新安装包 + rpm -Uvh AI4C--..rpm + ``` + + 安装完成后,系统内会存在以下文件: + + * `/usr/bin/ai4c-*`: AI 使能的编译器以及自动调优工具的 wrapper + * `/usr/lib64/libonnxruntime.so`: ONNX Runtime 的推理框架动态库 + * `/usr/lib64/AI4C/*.onnx`: AI 辅助编译优化模型(ONNX 格式) + * `/usr/lib64/python/site-packages/ai4c/lib/*.so`: + * AI 辅助编译优化的推理引擎动态库 + * AI 辅助编译优化与编译调优的编译器插件动态库 + * `/usr/lib64/python/site-packages/ai4c/autotuner/*`: 粗、细粒度调优工具的相关文件 + * `/usr/lib64/python/site-packages/ai4c/optimizer/*`: AI 辅助编译优化的相关文件 + * `/usr/lib64/python/site-packages/ai4c/option_tuner/*`: 应用级编译选项调优的相关文件 + +### 3.3 源码构建安装流程 + +AI4C 的源码地址:https://gitee.com/openeuler/AI4C + +#### 3.3.1 安装 ONNX Runtime 依赖 + +**方案一:** + +在 GitHub 下载 1.16.3 版本,并解压相应架构的 tgz 文件,例如,aarch64 架构下,下载`onnxruntime-linux-aarch64-1.16.3.tgz`。 + +地址:https://github.com/microsoft/onnxruntime/releases/tag/v1.16.3 + +**注意事项**:`tgz` 文件解压后,`libonnxruntime.so`的动态库存在于`lib`目录下,为构建 AI4C 框架,需将`lib`目录重命名为`lib64`,否则可能会导致`-lonnxruntime`找不到路径的报错。 + +**方案二:** + +保证以下 onnxruntime 的依赖包已安装: + +```shell +yum install -y cmake make gcc gcc-c++ abseil-cpp-devel boost-devel bzip2 python3-devel python3-numpy python3-setuptools python3-pip +``` + +使用 cmake 安装 onnxruntime: + +```shell +cd path/to/your/AI4C/third_party/onnxruntime +cmake \ + 
-DCMAKE_INSTALL_PREFIX=path/to/your/onnxruntime \ + -Donnxruntime_BUILD_SHARED_LIB=ON \ + -Donnxruntime_BUILD_UNIT_TESTS=ON \ + -Donnxruntime_INSTALL_UNIT_TESTS=OFF \ + -Donnxruntime_BUILD_BENCHMARKS=OFF \ + -Donnxruntime_USE_FULL_PROTOBUF=ON \ + -DPYTHON_VERSION=%{python3_version} \ + -Donnxruntime_ENABLE_CPUINFO=ON \ + -Donnxruntime_DISABLE_ABSEIL=ON \ + -Donnxruntime_USE_NEURAL_SPEED=OFF \ + -Donnxruntime_ENABLE_PYTHON=OFF \ + -DCMAKE_BUILD_TYPE=Release \ + -S cmake +make -j %{max_jobs} && make install +``` + +#### 3.3.2 安装 AI4C 的其他构建依赖 + +保证以下依赖包已安装: + +```shell +yum install -y python3-wheel openssl openssl-devel yaml-cpp yaml-cpp-devel gcc-plugin-devel libstdc++-static +``` + +#### 3.3.3 构建 AI4C 框架 + +```shell +cd path/to/your/AI4C/python +python3 setup.py bdist_wheel \ + -Donnxruntime_ROOTDIR=path/to/your/onnxruntime \ + -DCMAKE_BUILD_TYPE=Release \ + -DCMAKE_CXX_COMPILER=path/to/your/g++ \ + -DCMAKE_C_COMPILER=path/to/your/gcc +pip3 install dist/ai4c----_.whl --force-reinstall --no-deps +``` + +安装完成后,系统内会存在以下文件: + +* `path/to/your/pythonbin/ai4c-*`: AI 使能的编译器以及自动调优工具的 wrapper +* `path/to/your/onnxruntime/lib64/libonnxruntime.so`: ONNX Runtime 的推理框架动态库 +* `path/to/your/AI4C/models/*.onnx`: AI 辅助编译优化模型(ONNX 格式) +* `path/to/your/pythonlib/ai4c/lib/*.so`: + * AI 辅助编译优化的推理引擎动态库 + * AI 辅助编译优化与编译调优的编译器插件动态库 +* `path/to/your/pythonlib/ai4c/autotuner/*`: 粗、细粒度调优工具的相关文件 +* `path/to/your/pythonlib/ai4c/optimizer/*`: AI 辅助编译优化的相关文件 +* `path/to/your/pythonlib/ai4c/option_tuner/*`: 应用级编译选项调优的相关文件 + +注意事项: + +* `path/to/your/pythonbin`:安装完成后,可通过`which ai4c-gcc`查看 bin 的路径 +* `path/to/your/pythonlib`:安装完成后,可通过`pip show ai4c`显示的 Location 查看 lib 的路径 + +## 4 使用流程 + +### 4.1 AI 辅助编译优化 + +当前的 AI 辅助编译优化模块,主要由三部分输入组成: +- ONNX 模型,训练后的辅助编译优化模型。 +- 编译器插件(**当前仅支持 GCC 编译器**),用于运行 ONNX 模型推理并获取优化参数。 +- AI4Compiler 框架,提供 ONNX 推理引擎和 GCC 优化编译命令。 + +用户事先根据开源机器学习框架训练一个 AI 模型,输出成 ONNX 格式。同时,针对该 AI 模型提供一个对应的编译器插件,插件内至少包含三个模块: + +* 提取 AI 模型所需的编译器输入特征。 +* 驱动推理引擎调用 AI 模型执行推理。 +* 标注推理结果回编译器的数据结构。 + 
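+在调用插件编译前,可用下面的脚本片段粗略自检插件、模型与推理引擎三项输入是否齐备(各路径均为示意值,请替换为实际路径):
+
+```shell
+# 编译前自检:插件、模型、推理引擎三项输入是否存在(路径均为示意值)
+plugin_path=path/to/your/plugin.so
+model_path=path/to/your/model.onnx
+# 推理引擎路径实际可通过 ai4c-gcc --inference-engine 获取
+engine_path=path/to/your/inference_engine.so
+
+for f in "$plugin_path" "$model_path" "$engine_path"; do
+  [ -f "$f" ] || echo "缺少文件: $f"
+done
+```
+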
+在下述测试例中,仅需要在每次编译目标二进制的编译命令中,增加三个与插件相关的编译选项:插件路径、插件对应的 AI 模型路径、推理引擎路径,即可在编译时使能 AI 辅助编译优化模型。 + +```shell +# 若 onnxruntime 安装在非系统的文件夹下,注意设置环境变量 +# export LD_LIBRARY_PATH=path/to/your/onnxruntime/lib64/:$LD_LIBRARY_PATH + +gcc_compiler=path/to/your/gcc +infer_engine_path=$(ai4c-gcc --inference-engine) +model_path=path/to/your/model.onnx +plugin_path=path/to/your/.so + +$gcc_compiler test.c -O2 -o test \ + -fplugin=$plugin_path \ + -fplugin-arg--model=$model_path \ + -fplugin-arg--engine=$infer_engine_path +``` + +当前已支持的插件存在于`$(ai4c-gcc --inference-engine)`的同目录下,已支持的模型存在于`path/to/your/AI4C/models`下。 + +**注意事项:** + +* 编译 AI 模型对应的编译器插件与编译目标优化应用的编译器需保证为同一个,否则会出现编译器版本不一致导致的编译报错。 +* 当前 AI4C 仅支持在 GCC 编译器 cc1 阶段实现的 AI 辅助编译优化 pass 使用插件形式。 + +详细的编译器插件开发流程与使用流程可以参照 [AI 辅助编译优化手册](https://gitee.com/openeuler/AI4C/blob/master/python/docs/gcc-opt.md) 和 [测试例](https://gitee.com/openeuler/AI4C/tree/master/python/test/optimizer/block_correction) 进行。 + +下面我们举两个位于不同编译阶段的 AI 辅助编译优化模型的使用例。**循环展开与函数内联模型**位于`cc1`编译优化阶段,使用 GCC 插件形式实现 AI 模型适配与推理;**BOLT 采样基本块精度修正模型**位于`BOLT`链接后优化阶段,模型适配层位于 [LLVM-BOLT](https://gitee.com/src-openeuler/llvm-bolt) 仓库。 + +#### 4.1.1 循环展开与函数内联模型 + +循环展开与函数内联模型对应的编译优化选项如下: + +| 选项名 | 说明 | +| ---------------------------------------------------- | ------------------------------------------------------------ | +| -fplugin | 指定循环展开与函数内联插件的**绝对路径**(`-fplugin=/path/to/.so`)。 | +| -fplugin-arg--engine | 指定函数内联 ONNX 模型的推理引擎**绝对路径**(`-fplugin-arg--inline_model=/path/to/inference_engine.so`),需要与`-fplugin`同时开启。`/path/to/inference_engine.so`的路径可通过`ai4c-gcc --inference-engine`获得。 | +| -fplugin-arg--inline_model | 指定函数内联 ONNX 模型的**绝对路径**(`-fplugin-arg--inline_model=/path/to/inline_model.onnx`),需要与`-fplugin`和`-fplugin-arg--engine`同时开启。 | +| -fplugin-arg--unroll_model | 指定循环展开 ONNX 模型的**绝对路径**(`-fplugin-arg--unroll_model=/path/to/unroll_model.onnx`),需要与`-fplugin`和`-fplugin-arg--engine`同时开启。 | + +用户可同时启用一个 GCC 插件内的多个 AI 辅助编译优化模型,例如: + +```shell +gxx_compiler=path/to/your/g++ 
+infer_engine_path=$(ai4c-gcc --inference-engine) +inline_model_path=path/to/your/inline_model.onnx +unroll_model_path=path/to/your/unroll_model.onnx +plugin_path=path/to/your/.so + +$gxx_compiler test.cc -O3 -o test -funroll-loops \ + -fplugin=$plugin_path \ + -fplugin-arg--engine=$infer_engine_path \ + -fplugin-arg--inline_model=$inline_model_path \ + -fplugin-arg--unroll_model=$unroll_model_path +``` + +#### 4.1.2 BOLT 采样基本块精度修正模型 + +BOLT 采样的基本块精度修正模型对应的 BOLT 优化选项如下: + +| 选项名 | 说明 | +| ------------------- | ------------------------------------------------------------ | +| -block-correction | 开启 AI 优化 CFG BB Count 选项,需要与 `-model-path` 选项同时开启以指定 ONNX 模型。 | +| -model-path | 指定 ONNX 模型的**绝对路径**(`-model-path=/path/to/model.onnx`),需要与`-block-correction`同时开启。 | +| -annotate-threshold | 使用模型预测结果的置信度阈值,默认是 0.95。 | + +BOLT 内自定义的优化选项可以通过 GCC 的`-fbolt-option`调用使能,例如: + +```shell +g++ -fbolt-use= -fbolt-target= -fbolt-option=\"-block-correction -model-path=path/to/your/block_correction_model.onnx\" +``` + +### 4.2 细粒度调优 + +此处我们以 GCC 内**循环展开**优化 pass 的细粒度调优为例,展开调优工具的使用流程。 + +当前的细粒度调优模块,由两部分输入组成: + +* 应用的调优配置文件(.ini):处理应用的编译流程、执行流程。 +* 搜参空间配置文件(YAML):Autotuner 阶段配置的选项调优搜参空间,可替换默认搜参空间。 + +当前细粒度调优基于 [BiSheng-Autotuner](https://gitee.com/openeuler/BiSheng-Autotuner) 实现: + +1. 在编译器的`generate`阶段,生成一组可调优的编译数据结构与可调优系数集合,保存在`opp/*.yaml`内。 +2. 根据额外提供的编译搜参空间(`search_space.yaml`)与可调优数据结构,Autotuner 通过调优算法针对每个可调优数据结构生成下一组调优系数,保存在`input.yaml`中。 +3. 
在编译器的`autotune`阶段,根据`input.yaml`内数据结构的 hash 值,将调优系数标注到对应的数据结构里,完成调优。 + +在开启细粒度调优前,需安装以下依赖包: + +```shell +yum install -y BiSheng-Autotuner bisheng-opentuner +``` + +下列测试例中,我们将调优 [CoreMark](https://github.com/eembc/coremark) 的循环展开参数。首先,我们将准备`CoreMark`的调优配置文件`coremark_sample.ini`。用户需要 + +* 提供应用路径、应用的编译与运行命令。 +* 在基础编译命令中加入细粒度调优的动态库`-fplugin=%(PluginPath)s/rtl_unroll_autotune_plugin_gcc12.so`。 + * 在`generate`和`autotune`阶段,分别加入`-fplugin-arg-rtl_unroll_autotune_plugin_gcc12-`的相应输入文件。 +* 可自定义可调优结构配置文件的路径(`./opp/*.yaml`)、Autotuner 生成的编译器输入文件路径(`input.yaml`)等。 + +```ini +[DEFAULT] # optional +# PluginPath = /path/to/gcc-plugins + +[Environment Setting] # optional +# prepend a list of paths into the PATH in order. +# PATH = /path/to/bin +# you can also set other enviroment variables here too + +[Compiling Setting] # required +# NOTE: ConfigFilePath is set to the path to the current config file automatically by default. +CompileDir = /path/to/coremark +LLVMInputFile = %(CompileDir)s/input.yaml + +# OppDir and OppCompileCommand are optional, +# do not have to specify this if not using auto_run sub-command +OppDir = autotune_datadir/opp + +CompilerCXX = /path/to/bin/gcc +BaseCommand = %(CompilerCXX)s -I. 
-I./posix -DFLAGS_STR=\"" -lrt"\" \ + -DPERFORMANCE_RUN=1 -DITERATIONS=10000 -g \ + core_list_join.c core_main.c core_matrix.c \ + core_state.c core_util.c posix/core_portme.c \ + -funroll-loops -O2 -o coremark \ + -fplugin=%(PluginPath)s/rtl_unroll_autotune_plugin_gcc12.so + +# auto-tuning +CompileCommand = %(BaseCommand)s \ + -fplugin-arg-rtl_unroll_autotune_plugin_gcc12-autotune=%(LLVMInputFile)s + +RunDir = %(CompileDir)s +RunCommand = ./coremark 0x0 0x0 0x66 100000 # run 100000 iterations for coremark + +# generate +OppCompileCommand = %(BaseCommand)s \ + -fplugin-arg-rtl_unroll_autotune_plugin_gcc12-generate=%(OppDir)s +``` + +其次,我们可以准备一份额外的参数搜索空间文件`search_space.yaml`,自定义缩小参数空间。例如,动态库默认选择循环展开系数空间为$\{0, 2^0=1, 2^1=2, ..., 2^6=64\}$,我们可以把搜索空间调整为$\{0, 2^0=1, 2^1=2, ..., 2^5=32\}$。 + +```yaml +CodeRegion: + CodeRegionType: loop + Pass: loop2_unroll + Args: + UnrollCount: + Value: [0, 1, 2, 4, 8, 16, 32] + Type: enum +``` + +最终我们将 `coremark`,`coremark_sample.ini`,和`search_space.yaml` 放在同一个文件夹下,并运行以下脚本: + +```shell +ai4c-autotune autorun coremark_sample.ini \ + -scf search_space.yaml --stage-order loop \ + --time-after-convergence=100 +``` + +其中,参数`time-after-convergence`代表历史最佳值后多少秒未发现新的最优配置时,即提早结束调优。 + +调优完成后,最佳调优配置将保存在`loop.yaml`内,并可通过重新调用`autotune`阶段编译命令,同时修改`autotune`选项的输入文件(i.e., `-fplugin-arg-rtl_unroll_autotune_plugin_gcc12-autotune=loop.yaml`),复现该调优组合的性能值。 + +用户可以通过以下方式调取历史调优配置文件(`autotune_config.csv`)与性能数据文件(`autotune_data.csv`): + +```shell +ai4c-autotune dump -c coremark/input.yaml \ + --database=opentuner.db/localhost.localdomain.db -o autotune +``` + +**注意事项:** + +* 当前默认支持程序运行时间作为性能值。 + +详细使用信息,请参考[细粒度调优使用手册](https://gitee.com/openeuler/AI4C/blob/master/python/docs/autotuner.md) 与该测试例:https://gitee.com/openeuler/AI4C/tree/master/python/test/autotuner/loop_unroll + +LLVM 编译器的细粒度调优请参考 [BiSheng-Autotuner使用指南](../BiSheng-Autotuner/BiSheng-Autotuner使用指南.md)。 + +### 4.3 函数级的粗粒度调优 + +当前的函数级粗粒度调优模块,由三部分输入组成: + +* 应用的调优配置文件(.ini):处理应用的编译流程、执行流程。 +* 
搜参空间配置文件(YAML):Autotuner 阶段配置的选项调优搜参空间,可替换默认搜参空间。 +* 编译选项全集文件(YAML):预先设置的编译选项搜索空间全集,默认文件位于`path/to/your/python/site-packages/ai4c/autotuner/yaml/coarse_options.yaml`。 + +当前函数级粗粒度调优基于 [BiSheng-Autotuner](https://gitee.com/openeuler/BiSheng-Autotuner) 实现,可以帮助各函数使用不同的编译选项组合执行编译优化,其调优原理与细粒度调优一致。由于各函数可调优的编译选项众多,可预先对选项空间做裁剪。 + +在开启函数级的粗粒度调优前,需安装以下依赖包: + +```shell +yum install -y BiSheng-Autotuner bisheng-opentuner +``` + +粗粒度调优的使用流程基本与细粒度调优一致。下列测试例中,我们将调优`test_coarse_tuning.cc`中各函数的编译选项参数。首先,我们将准备`test_coarse_tuning.cc`的调优配置文件`test_coarse_tuning.ini`。用户需要 + +* 提供应用路径、应用的编译与运行命令。 +* 在基础编译命令中加入粗粒度调优的动态库`-fplugin=%(PluginPath)s/coarse_option_tuning_plugin_gcc12.so`和编译选项全集文件`-fplugin-arg-coarse_option_tuning_plugin_gcc12-yaml=`。 + * 在`generate`和`autotune`阶段,分别加入`-fplugin-arg-coarse_option_tuning_plugin_gcc12-`的相应输入文件。 +* 可自定义可调优结构配置文件的路径(`./opp/*.yaml`)、Autotuner 生成的编译器输入文件路径(`input.yaml`)等。 + +```ini +[DEFAULT] # optional +# TuningYAMLFile = /path/to/coarse_option_tuning_yaml_config_file + +[Environment Setting] # optional + +[Compiling Setting] # required +CompileDir = ./autotune_datadir +LLVMInputFile = %(CompileDir)s/input.yaml + +OppDir = opp + +Compiler = g++ +BaseCommand = %(Compiler)s ../test_coarse_tuning.cc -O2 -o test_coarse_tuning \ + -fplugin=%(PluginPath)s/coarse_option_tuning_plugin_gcc12.so \ + -fplugin-arg-coarse_option_tuning_plugin_gcc12-yaml=%(TuningYAMLFile)s + +# auto-tuning +CompileCommand = %(BaseCommand)s \ + -fplugin-arg-coarse_option_tuning_plugin_gcc12-autotune=input.yaml + +RunDir = %(CompileDir)s +RunCommand = ./test_coarse_tuning 3 + +# generate +OppCompileCommand = %(BaseCommand)s \ + -fplugin-arg-coarse_option_tuning_plugin_gcc12-generate=%(OppDir)s +``` + +其次,我们可以准备一份额外的参数搜索空间文件`search_space.yaml`,自定义参数空间。例如,在以下文件中,我们将调优的搜索空间限制在预取相关的选项上。 + +```yaml +CodeRegion: + CodeRegionType: function + Pass: coarse_option_generate + Args: + flag_prefetch_loop_arrays: + Type: bool + param_prefetch_latency: + Min: 100 + Max: 2000 + Type: int + 
param_simultaneous_prefetches: + Min: 1 + Max: 80 + Type: int +``` + +最终我们将 `test_coarse_tuning.cc`,`test_coarse_tuning.ini`,和`search_space.yaml` 放在同一个文件夹下,并运行以下脚本: + +```shell +ai4c-autotune autorun test_coarse_tuning.ini \ + -scf search_space.yaml \ + --stage-order function \ + --time-after-convergence=10 +``` + +其中,参数`time-after-convergence`代表历史最佳值后多少秒未发现新的最优配置时,即提早结束调优。 + +调优完成后,最佳调优配置将保存在`function.yaml`内,并可通过重新调用`autotune`阶段编译命令,同时修改`autotune`选项的输入文件(i.e., `-fplugin-arg-coarse_option_tuning_plugin_gcc12-autotune=function.yaml`),复现该调优组合的性能值。 + +**注意事项:** + +* 当前默认支持程序运行时间作为性能值。 +* 粗粒度调优暂不支持 dump 数据库内保存的历史数据。 +* 当前的粗粒度调优支持与当前版本的 GCC 版本(12.3.1)配套使用,其他编译器版本会出现部分编译选项不支持的问题。可在`path/to/your/AI4C/aiframe/include/option_utils.h`中注释编译器未识别的编译选项。 + +详细使用信息,请参考该测试例:https://gitee.com/openeuler/AI4C/tree/master/python/test/autotuner/coarse_tuning + +LLVM 编译器的粗粒度调优请参考 [BiSheng-Autotuner使用指南](../BiSheng-Autotuner/BiSheng-Autotuner使用指南.md)。 + +### 4.4 应用级选项调优 + +当前的应用级选项调优模块,主要由三部分输入组成: + +* 应用的编译与运行脚本(shell):处理应用的编译流程(并将生成的下一组选项替换进编译脚本内)、执行流程、和性能数据采集流程。 +* 编译选项与动态库选项的搜参空间配置文件(YAML):配置选项调优的搜参空间,可配置开关选项(编译优化/动态库)、编译参数、枚举选项。 +* 性能值的配置文件(YAML):配置多个性能项的权重,与目标优化方向(最大/最小值),需与“性能数据采集流程”所获取的性能值数量、顺序对应。 + +应用级选项调优工具将不断收集应用的性能数据,更新性能模型,并生成一组模型预期收益较高的新编译选项组合。通过应用的编译与运行脚本将新的编译选项组合替换进编译脚本内,生成新的二进制文件并执行下一轮运行。反复调优,获取历史最优性能值。 + +在开启应用级选项调优前,需安装以下依赖包: + +```shell +pip install xgboost scikit-learn +yum install -y time +``` + +以下用例将使用不同的编译选项组合构建并调优`test.cc` 3 轮。应用的编译与运行脚本如下: + +```shell +# ---------- run_test.sh ---------- # +parent_dir=$1 # path for intermediate tuning files +config=$(cat ${parent_dir}/tuning/config.txt) # current compiler configuration file +performance_file="${parent_dir}/tuning/performance.txt" # current performance data file + +measure_raw_file="time.txt" + +compiler=g++ +compile_command="${compiler} test.cc -O2 -o test_opt_tuner" +eval "${compile_command} ${config}" # program compilation, appending tuning options + +run_command="time -p -o ${measure_raw_file} 
./test_opt_tuner 3" +eval "${run_command}" # program execution + +info_collect_command="grep real ${measure_raw_file} | awk '{printf \"1 1 %s\", \$2}' > ${performance_file}" +eval "${info_collect_command}" # program performance collection + +# ---------- run_option_tuner.sh ---------- # +ai4c-option-tune --test_limit 3 --runfile run_test.sh + # --optionfile path/to/your/python/site-packages/ai4c/option_tuner/input/options.yaml \ + # --libfile path/to/your/python/site-packages/ai4c/option_tuner/input/options_lib.yaml \ + # --measurefile path/to/your/python/site-packages/ai4c/option_tuner/input/config_measure.yaml +``` + +其中默认的选项与性能值配置文件存在于以下路径:`path/to/your/python/site-packages/ai4c/option_tuner/input/*.yaml` + +用户可根据需要修改编译选项与动态库选项配置文件,相关关键词为: + +* `required_*`:必选调优项,将一直保留在调优中 +* `bool_*`:可选的编译优化开关选项 +* `interval_*`: 可选的编译参数(值选项,数据区间) +* `enum_*`: 可选的编译参数(枚举选项) + +例如, + +```yaml +required_config: +- -O2 +bool_config: +- -funroll-loops +interval_config: +- name: --param max-inline-insns-auto + default: 15 + min: 10 + max: 190 +``` + +用户可根据需要修改性能值配置文件,相关关键词为: + +* `weight`: 性能值权重 +* `optim`: 目标优化方向(最大/最小值) + +例如, + +```yaml +config_measure: +- name: throughput + weight: 1 + optim: maximize +``` + +调优完成后,历史与最佳调优数据将保留在`${parent_dir}/tuning/train.csv`和`${parent_dir}/tuning/result.txt`中。 + +详细使用信息,请参考该测试例:https://gitee.com/openeuler/AI4C/tree/master/python/test/option_tuner \ No newline at end of file diff --git a/docs/zh/docs/Administration/figures/AT_CHECK_Process.png b/docs/zh/docs/Administration/figures/AT_CHECK_Process.png new file mode 100644 index 0000000000000000000000000000000000000000..f32d5af3a31c740febf1a4783a1dd0daafacb0df Binary files /dev/null and b/docs/zh/docs/Administration/figures/AT_CHECK_Process.png differ diff --git a/docs/zh/docs/Administration/figures/Process_Of_EXECVAT_ATCHECK.png b/docs/zh/docs/Administration/figures/Process_Of_EXECVAT_ATCHECK.png new file mode 100644 index 
0000000000000000000000000000000000000000..c8f54fe96648f0c012462073a8cd118fd552483c Binary files /dev/null and b/docs/zh/docs/Administration/figures/Process_Of_EXECVAT_ATCHECK.png differ diff --git a/docs/zh/docs/Administration/figures/root_of_trust_framework.png b/docs/zh/docs/Administration/figures/root_of_trust_framework.png new file mode 100644 index 0000000000000000000000000000000000000000..354b40fa4c4f0ed6f7312e0ce3848ed42155732e Binary files /dev/null and b/docs/zh/docs/Administration/figures/root_of_trust_framework.png differ diff --git "a/docs/zh/docs/Administration/\344\275\277\347\224\250DNF\347\256\241\347\220\206\350\275\257\344\273\266\345\214\205.md" "b/docs/zh/docs/Administration/\344\275\277\347\224\250DNF\347\256\241\347\220\206\350\275\257\344\273\266\345\214\205.md" index e704595388bf9c48f177a6d46e5b74277df9ee26..3fb3d6c0b2b2a3760fd8b066744c6b8f1aa41d0b 100644 --- "a/docs/zh/docs/Administration/\344\275\277\347\224\250DNF\347\256\241\347\220\206\350\275\257\344\273\266\345\214\205.md" +++ "b/docs/zh/docs/Administration/\344\275\277\347\224\250DNF\347\256\241\347\220\206\350\275\257\344\273\266\345\214\205.md" @@ -142,10 +142,10 @@ repository部分允许您定义定制化的openEuler软件源仓库,各个仓 [OS] name=openEuler-$releasever-OS - baseurl=https://repo.openeuler.org/openEuler-23.09/OS/$basearch/ + baseurl=https://repo.openeuler.org/openEuler-{version}/OS/$basearch/ enabled=1 gpgcheck=1 - gpgkey=https://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler + gpgkey=https://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler ``` >![](./public_sys-resources/icon-note.gif) **说明:** @@ -295,7 +295,7 @@ repository部分允许您定义定制化的openEuler软件源仓库,各个仓 ``` >![](./public_sys-resources/icon-note.gif) **说明:** ->安装RPM包过程中,若出现安装失败,可参考[安装时出现软件包冲突、文件冲突或缺少软件包导致安装失败](./FAQ-54.html#安装时出现软件包冲突文件冲突或缺少软件包导致安装失败)。 +>安装RPM包过程中,若出现安装失败,可参考[安装时出现软件包冲突、文件冲突或缺少软件包导致安装失败](./FAQ-54.md#安装时出现软件包冲突文件冲突或缺少软件包导致安装失败)。 ### 下载软件包 diff --git 
"a/docs/zh/docs/Administration/\344\275\277\347\224\250KAE\345\212\240\351\200\237\345\274\225\346\223\216.md" "b/docs/zh/docs/Administration/\344\275\277\347\224\250KAE\345\212\240\351\200\237\345\274\225\346\223\216.md" index 31fb609e8c3d1f5711016f6acddce9d96226da4b..c344b06821c45b064be80d365d0e4123198458dd 100644 --- "a/docs/zh/docs/Administration/\344\275\277\347\224\250KAE\345\212\240\351\200\237\345\274\225\346\223\216.md" +++ "b/docs/zh/docs/Administration/\344\275\277\347\224\250KAE\345\212\240\351\200\237\345\274\225\346\223\216.md" @@ -64,7 +64,7 @@ KAE加速引擎主要有以下应用场景,如[表1](#table11915824163418)所 >- 物理机场景使用加速器需要关闭SMMU,具体操作请参考《[TaiShan 200服务器BIOS参数参考](https://support.huawei.com/enterprise/zh/doc/EDOC1100088653)》。 - CPU:Kunpeng 920 -- 操作系统:openEuler-21.09-aarch64-dvd.iso +- 操作系统:openEuler ##### KAE加速引擎软件说明 diff --git "a/docs/zh/docs/Administration/\345\206\205\346\240\270\345\217\257\344\277\241\346\240\271\346\241\206\346\236\266\347\224\250\346\210\267\346\226\207\346\241\243.md" "b/docs/zh/docs/Administration/\345\206\205\346\240\270\345\217\257\344\277\241\346\240\271\346\241\206\346\236\266\347\224\250\346\210\267\346\226\207\346\241\243.md" new file mode 100644 index 0000000000000000000000000000000000000000..f2835d168db1ab2d2aacb4915ec98c67ea3d0712 --- /dev/null +++ "b/docs/zh/docs/Administration/\345\206\205\346\240\270\345\217\257\344\277\241\346\240\271\346\241\206\346\236\266\347\224\250\346\210\267\346\226\207\346\241\243.md" @@ -0,0 +1,92 @@ +# 内核可信根框架 + +## 概述 + +典型的攻击手段往往伴随着信息系统真实性、完整性的破坏,目前业界的共识是通过硬件可信根对系统关键组件进行度量/验证,一旦检测到篡改或仿冒行为,就执行告警或拦截。 + +当前业界主流是采用TPM作为信任根,结合完整性保护软件栈逐级构筑系统信任链,从而保证系统各组件的真实性和完整性。openEuler支持的完整性保护特性如下: + +- 可信启动:系统启动阶段,度量启动组件的摘要值并记录到PCR寄存器; +- IMA度量:文件访问阶段,度量文件的摘要值,并扩展到PCR寄存器; +- DIM度量:进程运行阶段,度量内存代码段的摘要值,并扩展到PCR寄存器。 + +近年来,随着可信计算、机密计算等安全技术的发展,业界出现了各种形态不一的硬件可信根,及其配套的证明体系,例如: + +- TCG Trusted Platform Module (TPM) +- Trusted Cryptography Module (TCM) +- Trusted Platform Control Module (TPCM) +- Intel Trust Domain Extensions 
(TDX) +- Arm Confidential Compute Architecture (CCA) +- Virtualized Arm Confidential Compute Architecture (virtCCA) + +因此,本特性旨在支持一套内核态的可信根框架,南向支持多种可信根驱动,北向提供统一度量接口,对接上层完整性保护软件栈,将openEuler E2E完整性保护技术的硬件支持范围从单TPM扩展为多元异构可信根。 + +root_of_trust_framework + +## 特性介绍 + +本特性目前支持哈希扩展类型的度量可信根,即采用若干个度量寄存器(对应一种或多种哈希算法)采用如下形式记录多个度量结果: + +``` +value_new = hash(value_old | measure_result) +``` + +上式中value_new/value_old代表可信根内的度量寄存器的新/旧值;measure_result代表本次的度量结果;hash代表该度量寄存器所使用的哈希算法。 + +对于每一个可信根,开发者可通过本特性提供的框架层完成该可信根实例的定义和注册。注册成功后,完整性度量特性会自动将度量结果传入可信根实例中,完成哈希扩展和寄存器更新。 + +## 特性范围 + +本特性于openEuler 24.03 LTS SP1(6.6内核)版本支持,当前南向支持TPM和virtCCA两种可信根,北向支持内核完整性度量(IMA)特性。后续将持续完善对于其他可信根和度量特性的支持工作。 + +## 接口说明 + +### 结构体接口说明 + +对于每个可信根实例,需要定义如下结构体: + +``` +struct ima_rot { + const char *name; + int nr_allocated_banks; + struct tpm_bank_info *allocated_banks; + + int (*init)(struct ima_rot *rot); + int (*extend)(struct tpm_digest *digests_arg, const void *args); + int (*calc_boot_aggregate)(struct ima_digest_data *hash); +}; +``` + +成员变量描述如下: + +| **成员** | **说明** | +| ------------------- | ------------------------------------- | +| name | 可信根设备的名称 | +| nr_allocated_banks | 可信根支持的度量寄存器数量 | +| allocated_banks | 可信根度量寄存器算法定义 | +| init | 可信根初始化函数实现 | +| extend | 可信根扩展函数实现 | +| calc_boot_aggregate | IMA特性的boot aggregate值计算函数实现 | + +接口体数组定义在内核代码的security/integrity/ima/ima_rot.c文件中的ima_rots变量,在该数组变量定义中追加可信根实例,即可实现IMA特性对不同可信根的功能扩展。 + +### 启动参数接口说明 + +本特性涉及新增如下启动参数: + +| **参数** | **取值** | **说明** | +| -------- | -------- | ------------------------------------------------------------ | +| ima_rot= | 字符串 | 指定IMA优先使用的可信根设备的名称。若指定设备不存在,则尝试使用默认设备(TPM);如指定设备或默认设备初始化失败,则无可信根。 | + +## 使用说明 + +以用户在机密虚机中配置IMA度量使用virtCCA可信根为例,用户可在启动参数中添加如下参数: + +``` +ima_rot=virtcca +``` + +**注意:** 如果环境中仅virtcca可信根可用,无其他可信根(如vTPM)可用,也可不配置该参数。 + +配置完成后,首条IMA度量日志(boot aggregate日志)即为virtCCA的RIM的中存储的度量值;每条IMA度量日志的哈希值将在virtCCA的REM[0]中进行哈希扩展。用户可以基于virtCCA提供的远程证明软件栈实现机密虚机中的应用度量结果校验。 + diff --git 
"a/docs/zh/docs/Administration/\345\206\205\346\240\270\345\256\214\346\225\264\346\200\247\345\272\246\351\207\217\357\274\210IMA\357\274\211.md" "b/docs/zh/docs/Administration/\345\206\205\346\240\270\345\256\214\346\225\264\346\200\247\345\272\246\351\207\217\357\274\210IMA\357\274\211.md" new file mode 100644 index 0000000000000000000000000000000000000000..08d12c5d40834d1a064228c3508cb4d6a4e9ed40 --- /dev/null +++ "b/docs/zh/docs/Administration/\345\206\205\346\240\270\345\256\214\346\225\264\346\200\247\345\272\246\351\207\217\357\274\210IMA\357\274\211.md" @@ -0,0 +1,1361 @@ +# 内核完整性度量(IMA) + +## 概述 + +### IMA介绍 + +IMA,全称 Integrity Measurement Architecture(完整性度量架构),是内核中的一个子系统,能够基于自定义策略对通过`execve()`、`mmap()`和`open()`等系统调用访问的文件进行度量,度量结果可被用于**本地/远程证明**,或者和已有的参考值比较以**控制对文件的访问**。 + +IMA的运行模式主要包含以下两种: + +- 度量(measure):提供了对文件的完整性状态观测功能,访问受保护文件时,会往度量日志(位于内核内存中)增加度量记录。如果系统包含TPM芯片,还可以往TPM芯片PCR寄存器中扩展度量摘要值,以保证度量信息不被篡改。度量场景并不提供对文件访问的控制,它记录的文件信息可传递给上层应用软件,进一步用于远程证明。 +- 评估(appraise):提供了对文件的完整性校验功能,从根本上杜绝了未知的/被篡改的文件的访问。通过哈希、签名、HMAC等密码学技术对文件的内容进行完整性验证,如果验证失败,则不允许任何进程对该文件进行访问。该特性为系统提供了底层韧性设计,在系统被破坏时牺牲一部分功能(被篡改的部分文件),避免攻击造成的影响进一步升级。 + +可以看到,IMA度量模式相当于一个“只记录不干涉”的观察员,IMA评估模式相当于一位严格的保安人员,它的职责是拒绝对所有“人证不一”的文件访问。 + +### EVM介绍 + +EVM,全称 Extended Verification Module(扩展验证模块),是对IMA功能的扩展,在通过IMA实现对于文件内容的完整性保护的基础上,使用EVM可以更进一步地实现对于文件扩展属性(如UID、security.ima 、security.selinux等属性)的保护。 + +### IMA摘要列表介绍 + +IMA Digest Lists(IMA摘要列表)是openEuler对内核原生完整性保护机制的增强,旨在对原生的IMA/EVM机制的以下痛点进行优化: + +**TPM扩展导致文件访问性能下降:** + +IMA度量模式下,每次触发度量都需要访问TPM芯片,TPM属于低速芯片,通常采用几十MHz时钟频率的SPI协议与CPU通信,导致系统调用性能下降: + +![](./figures/ima_tpm.png) + +**非对称运算导致文件访问性能下降:** + +IMA评估模式下,需要使用签名机制保护不可变文件,每次触发文件校验都需要进行签名验证,而非对称运算相对复杂 ,同样导致系统调用性能下降: + +![](./figures/ima_sig_verify.png) + +**复杂的部署方式导致效率和安全性下降:** + +IMA评估模式下,需要通过fix模式进行部署,即系统首先需要进入fix模式进行IMA/EVM扩展属性标记,然后再切换为校验模式启动。同时在受保护的文件升级时,需要重启进入fix模式,完成文件和扩展属性更新。一方面降低了部署效率,另一方面需要在运行环境中访问密钥,降低了安全性: + +![](./figures/ima_priv_key.png) + 
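上文多次提到"将度量摘要值扩展到 PCR 寄存器",其核心是哈希扩展运算 `value_new = hash(value_old || digest)`。下面的 shell 片段仅作原理示意(真实 TPM 对二进制值进行拼接,此处为便于演示直接拼接十六进制字符串,结果与真实 PCR 值不等价):

```shell
# 模拟 PCR 哈希扩展:new = sha256(old || digest),寄存器初始值为全零
pcr=$(printf '%064d' 0)

extend() {
  pcr=$(printf '%s%s' "$pcr" "$1" | sha256sum | awk '{print $1}')
}

d1=$(printf 'measurement-1' | sha256sum | awk '{print $1}')
d2=$(printf 'measurement-2' | sha256sum | awk '{print $1}')
extend "$d1"
extend "$d2"
echo "$pcr"   # 两次扩展后的模拟寄存器值(64 个十六进制字符)
```

由于每次扩展都把旧值纳入哈希输入,攻击者无法在不改变最终寄存器值的前提下篡改或删除中间的度量记录。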
+IMA摘要列表旨在通过一个哈希列表文件管理一系列文件的基准摘要值,即将若干文件(如一个软件包中的所有可执行文件)的基准摘要值汇总到单个文件中进行管理。基准摘要值可包含文件内容摘要(对应IMA模式)和文件扩展属性摘要(对应EVM模式),这个文件就是IMA摘要列表文件。 + +![](./figures/ima_digest_list_pkg.png) + +开启IMA摘要列表功能后,内核维护一个哈希白名单池,用于存放导入的IMA摘要列表文件中的摘要值,并通过securityfs对外提供IMA摘要列表文件的导入/删除/查询等接口。 + +在度量模式下,导入内核的摘要列表文件需要进行度量和TPM扩展才可添加至白名单池,后续如果度量的目标文件的摘要值和白名单池匹配,则不进行额外的度量日志记录以及TPM扩展;在评估模式下,导入内核的摘要列表文件需要通过签名验证才可添加至白名单池,后续将访问的目标文件的摘要值和白名单池中的摘要值进行匹配即可判断评估结果。 + +![](./figures/ima_digest_list_flow.png) + +相比Linux原生IMA/EVM机制,IMA摘要列表扩展从安全性、性能、易用性三个方面进行了改良,以实现更好的落地效果: + +- 安全性:IMA摘要列表可以随软件包一起发布,软件包安装时同步导入摘要列表,确保了基准值来自于软件发行商(如openEuler社区),避免在运行环境生成基准值的流程,实现了完整的信任链。 +- 性能:IMA摘要列表机制以摘要列表为单位进行度量/校验,降低TPM访问和非对称运算频率为1/n(n为平均单个摘要列表管理的文件哈希数量),可一定程度提升系统调用性能和系统启动性能。 +- 易用性:IMA摘要列表机制可以实现“开箱即用”,即完成系统安装后直接进入评估模式,且允许在评估模式下直接安装/升级软件包,而无需进入fix模式进行文件标记,从而实现快速部署和平滑升级。 + +需要注意的是,IMA摘要列表相比原生IMA/EVM,将度量/评估的基准值在内核内存中进行维护,也引入了一个假设,即内核内存不可被未授权篡改,这就使得IMA摘要列表也依赖于其他安全机制(如内核模块安全启动和内存动态度量等)以保护内核内存的完整性。 + +但无论社区原生IMA机制还是IMA摘要列表扩展,都只是系统安全链中的一环,无法孤立地保证系统的安全性,安全自始至终都是一个构建纵深防御的系统工程。 + +## 接口说明 + +### 内核启动参数说明 + +openEuler IMA/EVM机制提供的内核启动参数及说明如下: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| 参数名称 | 取值 | 功能 |
| -------- | ---- | ---- |
| ima_appraise | enforce-evm | IMA评估强制校验模式(EVM开启) |
| ima_appraise | log-evm | IMA评估日志模式(EVM开启) |
| ima_appraise | enforce | IMA评估强制校验模式 |
| ima_appraise | log | IMA评估日志模式 |
| ima_appraise | off | 关闭IMA评估 |
| ima_appraise_digest_list | digest | 基于摘要列表进行IMA+EVM评估(比较文件内容和扩展属性) |
| ima_appraise_digest_list | digest-nometadata | 基于摘要列表进行IMA评估(只比较文件内容) |
| evm | x509 | 直接开启基于可移植签名的EVM(无论EVM证书是否加载) |
| evm | complete | 启动后不允许通过securityfs接口修改EVM模式 |
| evm | allow_metadata_writes | 允许修改文件元数据,EVM不做拦截 |
| ima_hash | sha256/sha1/... | 声明IMA度量哈希算法 |
| ima_template | ima | 声明IMA度量模板(d\|n) |
| ima_template | ima-ng | 声明IMA度量模板(d-ng\|n-ng),默认使用该模板 |
| ima_template | ima-sig | 声明IMA度量模板(d-ng\|n-ng\|sig) |
| ima_policy | exec_tcb | 度量所有执行、映射方式访问的文件,以及加载的内核模块、固件、内核等文件 |
| ima_policy | tcb | 在exec_tcb策略的基础上,额外度量以uid=0或euid=0身份访问的文件 |
| ima_policy | secure_boot | 评估所有加载的内核模块、固件、内核等文件,并指定使用IMA签名模式 |
| ima_policy | appraise_exec_tcb | 在secure_boot策略的基础上,额外评估所有执行、映射方式访问的文件 |
| ima_policy | appraise_tcb | 评估访问的所有属主为0的文件 |
| ima_policy | appraise_exec_immutable | 与appraise_exec_tcb策略配合使用,可执行文件的扩展属性不可变 |
| ima_digest_list_pcr | 10 | 在PCR 10中扩展基于摘要列表的IMA度量结果,禁用原生IMA度量 |
| ima_digest_list_pcr | 11 | 在PCR 11中扩展基于摘要列表的IMA度量结果,禁用原生IMA度量 |
| ima_digest_list_pcr | +11 | 在PCR 11中扩展基于摘要列表的IMA度量结果,在PCR 10中扩展原生IMA度量结果 |
| ima_digest_db_size | nn[M] | 设置内核摘要列表上限(0M~64M),不做配置的情况下默认为16MB(不做配置指的是不写该参数,但注意不能将值留空,如ima_digest_db_size=) |
| ima_capacity | -1~2147483647 | 设置内核度量日志条数上限,不做配置的情况下默认为100000条,配置-1表示无上限 |
| initramtmpfs | NA | 在initrd中支持tmpfs,以携带文件扩展属性 |
根据用户实际场景诉求,建议采取如下参数组合: + +**1) 原生IMA度量:** + +``` +# 原生IMA度量+自定义策略 +无需配置,默认开启 +# 原生IMA度量+TCB默认策略 +ima_policy="tcb" +``` + +**2) 基于摘要列表的IMA度量:** + +``` +# 摘要列表IMA度量+自定义策略 +ima_digest_list_pcr=11 ima_template=ima-ng initramtmpfs +# 摘要列表IMA度量+默认策略 +ima_digest_list_pcr=11 ima_template=ima-ng ima_policy="exec_tcb" initramtmpfs +``` + +**3) 基于摘要列表的IMA评估,只保护文件内容:** + +``` +# IMA评估+日志模式 +ima_appraise=log ima_appraise_digest_list=digest-nometadata ima_policy="appraise_exec_tcb" initramtmpfs +# IMA评估+强制校验模式 +ima_appraise=enforce ima_appraise_digest_list=digest-nometadata ima_policy="appraise_exec_tcb" initramtmpfs +``` + +**4) 基于摘要列表的IMA评估,保护文件内容和扩展属性:** + +``` +# IMA评估+日志模式 +ima_appraise=log-evm ima_appraise_digest_list=digest ima_policy="appraise_exec_tcb|appraise_exec_immutable" initramtmpfs evm=x509 evm=complete +# IMA评估+强制校验模式 +ima_appraise=enforce-evm ima_appraise_digest_list=digest ima_policy="appraise_exec_tcb|appraise_exec_immutable" initramtmpfs evm=x509 evm=complete +``` + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> 以上四种参数都可以单独配置使用,但只有基于摘要列表的度量和评估模式可以组合使用,即2)和3)搭配或2)和4)搭配。 + +### securityfs接口说明 + +openEuler IMA提供的securityfs接口位于`/sys/kernel/security`目录下,接口名及说明如下: + +| 路径 | 权限 | 说明 | +| :----------------------------- | :--- | :-------------------------------------- | +| ima/policy | 600 | IMA策略查询/导入接口 | +| ima/ascii_runtime_measurements | 440 | 查询IMA度量日志,以字符串形式输出 | +| ima/binary_runtime_measurements | 440 | 查询IMA度量日志,以二进制形式输出 | +| ima/runtime_measurement_count | 440 | 查询IMA度量日志条数 | +| ima/violations | 440 | 查询异常IMA度量日志数量 | +| ima/digests_count | 440 | 显示系统哈希表中的总摘要数量(IMA+EVM) | +| ima/digest_list_data | 200 | 摘要列表新增接口 | +| ima/digest_list_data_del | 200 | 摘要列表删除接口 | +| evm | 660 | 查询/设置EVM模式 | + +其中,`/sys/kernel/security/evm` 接口的取值有以下四种: + +- 0:EVM 未初始化; +- 1:使用 HMAC(对称加密)方式校验扩展属性完整性; +- 2:使用公钥验签(非对称加密)方式校验扩展属性完整性; +- 6:关闭扩展属性完整性校验。 + +### 摘要列表管理工具说明 + +digest-list-tools软件包提供IMA摘要列表文件生成和管理的工具,主要包含如下几个命令行工具: + +#### gen_digest_lists工具 + 
+用户可通过调用gen_digest_lists命令行工具生成摘要列表。命令参数定义如下: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| 参数名称 | 取值 | 功能 |
| -------- | ---- | ---- |
| -d | `<path>` | 指定生成摘要列表文件存放的位置,需为有效目录。 |
| -f | compact | 指定生成摘要列表文件的格式,当前仅支持compact格式。 |
| -i | `<option arg>:<option value>` | 指定生成摘要列表的目标文件范围,具体参数定义如下。 |
| -i | `I:<path>` | 指定需要生成摘要列表的文件绝对路径,如指定目录,则会执行递归生成。 |
| -i | `E:<path>` | 指定需要排除的路径或目录。 |
| -i | `F:<path>` | 指定路径或目录,为该路径或目录下所有文件生成摘要列表(同时指定e:参数时,忽略e:选项的筛选效果)。 |
| -i | `e:` | 仅对可执行文件生成摘要列表。 |
| -i | `l:policy` | 从系统SELinux策略匹配文件安全上下文,而不是直接从文件扩展属性中读取安全上下文。 |
| -i | `i:` | 当生成metadata类型的摘要列表时,被计算的扩展属性信息包含文件的摘要值(必须指定)。 |
| -i | `M:` | 允许显式指定文件的扩展属性信息(需要结合rpmbuild命令使用)。 |
| -i | `u:` | 将“L:”参数所指定的列表文件名作为生成摘要列表的文件名(需要结合rpmbuild命令使用)。 |
| -i | `L:<path>` | 指定列表文件的路径,列表文件中包含需要生成摘要列表的信息数据(需要结合rpmbuild命令使用)。 |
| -o | add | 指定生成摘要列表的操作,当前仅支持add操作,即将摘要列表添加到文件中。 |
| -p | -1 | 指定将摘要列表写入文件中的位置,当前仅支持指定-1。 |
| -t | file | 只针对文件内容生成摘要列表。 |
| -t | metadata | 针对文件的内容和扩展属性分别生成摘要列表。 |
| -T | NA | 不添加该参数,则生成摘要列表文件;添加该参数,则生成TLV摘要列表文件。 |
| -A | `<path>` | 指定相对根目录,将文件路径截去指定的前缀进行路径匹配和SELinux标签匹配。 |
| -m | immutable | 指定生成摘要列表文件的modifiers属性,当前仅支持指定immutable。在enforce/enforce-evm模式下,摘要列表只能以只读模式打开。 |
| -h | NA | 打印帮助信息。 |
+ +**参考使用示例:** + +- 场景1:为单个文件生成摘要列表/TLV摘要列表。 + + ``` + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ls -d ./ -i i: + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ls -d ./ -i i: -T + ``` + +- 场景2:为单个文件生成摘要列表/TLV摘要列表,并指定相对根目录。 + + ``` + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ls -A /usr/ -d ./ -i i: + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ls -A /usr/ -d ./ -i i: -T + ``` + +- 场景3:为目录下的文件递归生成摘要列表/TLV摘要列表。 + + ``` + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ -d ./ -i i: + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ -d ./ -i i: -T + ``` + +- 场景4:为目录下的可执行文件递归生成摘要列表/TLV摘要列表。 + + ``` + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ -d ./ -i i: -i e: + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/bin/ -d ./ -i i: -i e: -T + ``` + +- 场景5:为目录下的文件递归生成摘要列表/TLV摘要列表,排除部分子目录。 + + ``` + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/ -d ./ -i i: -i E:/usr/bin/ + gen_digest_lists -t metadata -f compact -i l:policy -o add -p -1 -m immutable -i I:/usr/ -d ./ -i i: -i E:/usr/bin/ -T + ``` + +- 场景6:rpmbuild回调脚本中,通过读取rpmbuild传入的列表文件生成摘要列表。 + + ``` + gen_digest_lists -i M: -t metadata -f compact -d $DIGEST_LIST_DIR -i l:policy \ + -i i: -o add -p -1 -m immutable -i L:$BIN_PKG_FILES -i u: \ + -A $RPM_BUILD_ROOT -i e: \ + -i E:/usr/src \ + -i E:/boot/efi \ + -i F:/lib \ + -i F:/usr/lib \ + -i F:/lib64 \ + -i F:/usr/lib64 \ + -i F:/lib/modules \ + -i F:/usr/lib/modules \ + -i F:/lib/firmware \ + -i F:/usr/lib/firmware + + gen_digest_lists -i M: -t metadata -f compact -d $DIGEST_LIST_DIR.tlv \ + -i l:policy -i i: -o add -p -1 -m immutable -i L:$BIN_PKG_FILES -i u: \ + -T -A 
$RPM_BUILD_ROOT -i e: \ + -i E:/usr/src \ + -i E:/boot/efi \ + -i F:/lib \ + -i F:/usr/lib \ + -i F:/lib64 \ + -i F:/usr/lib64 \ + -i F:/lib/modules \ + -i F:/usr/lib/modules \ + -i F:/lib/firmware \ + -i F:/usr/lib/firmware + ``` + +#### manage_digest_lists工具 + +manage_digest_lists命令行工具主要用于将二进制格式的TLV摘要列表文件解析转换为可读的文本形式。命令参数定义如下: + +| 参数名称 | 取值 | 功能 | +| -------- | ---------- | ----------------------------------------------------------- | +| -d | \ | 指定TLV摘要列表文件存放的目录。 | +| -f | \ | 指定TLV摘要列表文件名。 | +| -p | dump | 指定操作类型,当前仅支持dump,表示解析打印TLV摘要列表操作。 | +| -v | NA | 打印详细信息。 | +| -h | NA | 打印帮助信息。 | + +**参考使用示例:** + +查看TLV摘要列表信息: + +``` +manage_digest_lists -p dump -d /etc/ima/digest_lists.tlv/ +``` + +## 文件格式说明 + +### IMA策略文件语法说明 + +IMA策略文件为文本文件,一个文件中可包含若干条按照换行符`\n`分隔的规则语句,每条规则语句都必须以 action 关键字代表的**动作**开头,后接**筛选条件**: + +``` + <筛选条件1> [筛选条件2] [筛选条件3]... +``` + +action表示该条策略具体的动作,一条策略只能选一个 action,具体的action见后表(实际书写时**可忽略 action 字样**,例如直接书写 dont_measure,不需要写成 action=dont_measure): + +筛选条件支持如下几种类型: + +- func:表示被度量或评估的文件类型,常和 mask 匹配使用,一条策略只能选一个 func。 + + - FILE_CHECK 只能同 MAY_EXEC、MAY_WRITE、MAY_READ 匹配使用。 + - MODULE_CHECK、MMAP_CHECK、BPRM_CHECK 只能同 MAY_EXEC 匹配使用。 + - 匹配关系以外的组合不会产生效果。 + +- mask:表示文件在做什么操作时将被度量或评估,一条策略只能选一个 mask。 + +- fsmagic:表示文件系统类型的十六进制魔数,定义在 `/usr/include/linux/magic.h` 文件中(默认情况下度量所有文件系统,除非使用 dont_measure/dont_appraise 标记不度量某文件系统)。 + +- fsuuid:表示系统设备 uuid 的 16 位的十六进制字符串。 + +- objtype:表示文件安全类型,一条策略只能选一个文件类型,objtype 相比 func 而言,划分的粒度更细,比如 obj_type=nova_log_t 表示SELinux类型为 nova_log_t 的文件。 + +- uid:表示哪个用户(用用户 id 表示)对文件进行操作,一条策略只能选一个 uid。 + +- fowner:表示文件的属主(用用户 id 表示)是谁,一条策略只能选一个 fowner。 + +关键字的具体取值及说明如下: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| 关键字 | 取值 | 说明 |
| ------ | ---- | ---- |
| action | measure | 开启IMA度量 |
| action | dont_measure | 禁用IMA度量 |
| action | appraise | 开启IMA评估 |
| action | dont_appraise | 禁用IMA评估 |
| action | audit | 开启审计 |
| func | FILE_CHECK | 将要被打开的文件 |
| func | MODULE_CHECK | 将要被装载的内核模块文件 |
| func | MMAP_CHECK | 将要被映射到进程内存空间的动态库文件 |
| func | BPRM_CHECK | 将要被执行的文件(不含通过 /bin/hash 等程序打开的脚本文件) |
| func | POLICY_CHECK | 将要被导入的IMA策略文件 |
| func | FIRMWARE_CHECK | 将要被加载到内存中的固件 |
| func | DIGEST_LIST_CHECK | 将要被加载到内核中的摘要列表文件 |
| func | KEXEC_KERNEL_CHECK | 将要切换的 kexec 内核 |
| mask | MAY_EXEC | 执行文件 |
| mask | MAY_WRITE | 写文件 |
| mask | MAY_READ | 读文件 |
| mask | MAY_APPEND | 扩展文件属性 |
| fsmagic | fsmagic=xxx | 表示文件系统类型的十六进制魔数 |
| fsuuid | fsuuid=xxx | 表示系统设备 uuid 的 16 位的十六进制字符串 |
| fowner | fowner=xxx | 文件属主的用户 id |
| uid | uid=xxx | 操作文件的用户 id |
| obj_type | obj_type=xxx_t | 表示文件的类型(基于 SELinux 标签) |
| pcr | pcr= | 选择 TPM 中用于扩展度量值的 PCR(默认为 10) |
| appraise_type | imasig | 基于签名进行IMA评估 |
| appraise_type | meta_immutable | 基于签名进行文件扩展属性的评估(支持摘要列表) |
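结合上述关键字,下面给出一份仅作示意的 IMA 策略文件片段(各规则的取值均为示例,需按实际场景调整),可写入 `/sys/kernel/security/ima/policy` 接口生效:

```
# 不度量 tmpfs 上的文件(0x01021994 为 tmpfs 的 fsmagic 魔数)
dont_measure fsmagic=0x01021994
# 度量 root(uid=0)以执行方式访问的所有文件
measure func=BPRM_CHECK mask=MAY_EXEC uid=0
# 对属主为 root 的可执行文件进行基于签名的评估
appraise func=BPRM_CHECK mask=MAY_EXEC fowner=0 appraise_type=imasig
```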
+ +## 使用说明 + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> 原生IMA/EVM为Linux开源特性,本章节仅简单介绍基本使用方式,其他资料可参考开源WIKI: +> +> + +### 原生IMA使用说明 + +#### IMA度量模式 + +用户需要配置度量策略,开启IMA度量功能,具体步骤如下: + +**步骤1:** 用户可通过配置启动参数或手动配置的方式,指定度量策略。通过启动参数配置IMA策略的示例如下: + +``` +ima_policy="tcb" +``` + +手动配置IMA策略的示例如下: + +``` +echo "measure func=BPRM_CHECK" > /sys/kernel/security/ima/policy +``` + +**步骤2:** 重启系统,用户可实时检查度量日志获取当前的度量情况: + +``` +cat /sys/kernel/security/ima/ascii_runtime_measurements +``` + +#### IMA评估模式 + +用户需要首先进入fix模式,完成文件的IMA标记后,再开启log或enforce模式。具体步骤如下: + +**步骤1:** 配置启动参数,重启后进入fix模式: + +``` +ima_appraise=fix ima_policy=appraise_tcb +``` + +**步骤2:** 为所有需要评估的文件生成IMA扩展属性: + +对于不可变文件(如二进制程序文件)可以使用签名模式,将文件摘要值的签名写入IMA扩展属性中。举例如下(其中`/path/to/ima.key`指的是和IMA证书匹配的签名私钥): + +``` +find /usr/bin -fstype ext4 -type f -executable -uid 0 -exec evmctl -a sha256 ima_sign --key /path/to/ima.key '{}' \; +``` + +对于可变文件(如数据文件)可以使用哈希模式,将文件的摘要值写入IMA扩展属性中。IMA支持自动标记机制,即在fix模式下仅需触发文件访问,即可自动生成IMA扩展属性: + +``` +find / -fstype ext4 -type f -uid 0 -exec dd if='{}' of=/dev/null count=0 status=none \; +``` + +可通过如下命令检查文件是否被成功标记了IMA扩展属性(security.ima): + +``` +getfattr -m - -d /sbin/init +``` + +**步骤3:** 配置启动参数,修改IMA评估为log或enforce模式后,重启系统: + +``` +ima_appraise=enforce ima_policy=appraise_tcb +``` + +### IMA摘要列表使用说明 + +#### 前置条件 + +IMA摘要列表特性使用前,用户需安装`ima-evm-utils`和`digest-list-tools`软件包: + +``` +yum install ima-evm-utils digest-list-tools +``` + +#### 机制介绍 + +##### 摘要列表文件 + +在安装openEuler发布的RPM包后,默认会在`/etc/ima`目录下生成摘要列表文件。根据文件名的不同,存在如下几种文件: + +**/etc/ima/digest_lists/0-metadata_list-compact-** + +为IMA摘要列表文件,通过`gen_digest_lists`命令生成(生成方法详见[gen_digest_lists工具](#gen_digest_list工具)),该文件为二进制格式,包含header信息以及一连串SHA256哈希值,分别代表合法的文件内容摘要值和文件扩展属性摘要值。该文件被度量或评估后,最终被导入内核,并以该文件中的白名单摘要值为基准进行IMA摘要列表度量或评估。 + +**/etc/ima/digest_lists/0-metadata_list-rpm-** + +为RPM摘要列表文件,**实际为RPM包的头信息**。RPM包安装后,如果IMA摘要列表文件不包含签名,则会把RPM头信息写入该文件中,并将头信息的签名写入`security.ima`扩展属性中。这样通过签名可验证RPM头信息的真实性,由于RPM头信息又包含了摘要列表文件的摘要值,则可实现摘要列表的间接验证。 + 
+**/etc/ima/digest_lists/0-parser_list-compact-libexec** + +为IMA PARSER摘要列表文件,存放`/usr/libexec/rpm_parser`文件的摘要值。该文件用于实现RPM摘要列表->IMA摘要列表的信任链,内核IMA摘要列表机制会对该文件执行后产生的进程进行特殊校验,如果确定是`rpm_parser`程序,则会信任其导入的所有摘要列表而无需校验签名。 + +**/etc/ima/digest_lists.sig/0-metadata_list-compact-.sig** + +为IMA摘要列表的签名文件,若RPM包中包含此文件,则在RPM包安装阶段,会将该文件的内容写入对应的RPM摘要列表文件的`security.ima`扩展属性,从而在IMA摘要列表导入内核阶段进行签名验证。 + +**/etc/ima/digest_lists.tlv/0-metadata_list-compact_tlv-** + +为TLV摘要列表文件,通常在对目标文件生成IMA摘要列表文件时一并生成,存放目标文件的完整性信息(文件内容摘要值、文件扩展属性等)。该文件的功能是协助用户查询/恢复目标文件的完整性信息。 + +##### 摘要列表文件签名方式 + +在IMA摘要列表评估模式下,IMA摘要列表文件需要经过签名验证才可导入内核,并用于后续的文件白名单匹配。IMA摘要列表文件支持如下几种签名方式: + +**1) IMA扩展属性签名** + +即原生的IMA签名机制,将签名信息按照一定格式,存放在`security.ima`扩展属性中。可通过`evmctl`命令生成并添加: + +``` +evmctl ima_sign --key /path/to/ima.key -a sha256 +``` + +也可添加`-f`参数,将签名信息和头信息存入独立的文件中: + +``` +evmctl ima_sign -f --key /path/to/ima.key -a sha256 +``` + +在开启IMA摘要列表评估模式下,可直接将摘要列表文件路径写入内核接口,实现摘要列表的导入/删除。该过程会自动触发评估,基于`security.ima`扩展属性完成对摘要列表文件内容的签名验证: + +``` +# 导入IMA摘要列表文件 +echo > /sys/kernel/security/ima/digest_list_data +# 删除IMA摘要列表文件 +echo > /sys/kernel/security/ima/digest_list_data_del +``` + +**2) IMA摘要列表追加签名(openEuler 24.03 LTS版本默认)** + +openEuler 24.03 LTS版本开始支持IMA专用签名密钥,并采用CMS签名。由于签名信息包含证书链,可能由于长度超出限制而无法写入文件的`security.ima`扩展属性中,因此采用类似内核模块的追加签名的方式: + + + +其签名机制为: + +1) 将CMS签名信息追加到IMA摘要列表文件末尾; + +2) 填充结构体并添加到签名信息末尾,结构体定义如下: + +``` +struct module_signature { + u8 algo; /* Public-key crypto algorithm [0] */ + u8 hash; /* Digest algorithm [0] */ + u8 id_type; /* Key identifier type [PKEY_ID_PKCS7] */ + u8 signer_len; /* Length of signer's name [0] */ + u8 key_id_len; /* Length of key identifier [0] */ + u8 __pad[3]; + __be32 sig_len; /* Length of signature data */ +}; +``` + +3) 添加魔术字符串`"~Module signature appended~\n"`。 + +上述步骤的参考脚本如下: + +``` +#!/bin/bash +DIGEST_FILE=$1 # IMA摘要列表文件路径 +SIG_FILE=$2 # IMA摘要列表签名信息保存路径 +OUT=$3 # 完成签名信息添加后的摘要列表文件输出路径 + +cat $DIGEST_FILE $SIG_FILE > $OUT +echo -n -e "\x00\x00\x02\x00\x00\x00\x00\x00" >> $OUT +echo -n 
-e $(printf "%08x" "$(ls -l $SIG_FILE | awk '{print $5}')") | xxd -r -ps >> $OUT +echo -n "~Module signature appended~" >> $OUT +echo -n -e "\x0a" >> $OUT +``` + +**3) 复用RPM签名(openEuler 22.03 LTS版本默认)** + +openEuler 22.03 LTS版本支持复用RPM签名机制实现IMA摘要列表文件的签名。旨在解决版本无专用IMA签名密钥的问题。用户无需感知该签名流程,当RPM包中含有IMA摘要列表文件,而不包含IMA摘要列表的签名文件时,会自动使用该签名机制。其核心原理是通过RPM包的头信息实现对IMA摘要列表的验证。 + +对于openEuler发布的RPM包,每一个包文件可以包含两部分: + +- **RPM头信息:** 存放RPM包属性字段,如包名、文件摘要值列表等,通过RPM头签名保证其完整性; +- **RPM文件:** 实际安装至系统中的文件,也包括构建阶段生成的IMA摘要列表文件。 + + + +在RPM包安装阶段,如果RPM进程检测到包中的摘要列表文件不包含签名,则在`/etc/ima`目录下创建一个RPM摘要列表文件,将RPM头信息写入文件内容,将RPM头签名写入文件的`security.ima`扩展属性中。后续可通过RPM摘要列表间接实现IMA摘要列表的验证和导入。 + +##### IMA摘要列表导入 + +在开启IMA度量模式下,导入IMA摘要列表文件无需经过签名验证,可直接将路径写入内核接口,实现摘要列表的导入/删除: + +``` +# 导入IMA摘要列表文件 +echo > /sys/kernel/security/ima/digest_list_data +# 删除IMA摘要列表文件 +echo > /sys/kernel/security/ima/digest_list_data_del +``` + +在开启IMA评估模式下,导入摘要列表必须通过签名验证。根据签名方式不同,可分为两种导入方式。 + +**直接导入方式** + +对于已包含签名信息的IMA摘要列表文件(IMA扩展属性签名或IMA摘要列表追加签名),可直接将路径写入内核接口,实现摘要列表的导入/删除。该过程会自动触发评估,基于`security.ima`扩展属性完成对摘要列表文件内容的签名验证: + +``` +# 导入IMA摘要列表文件 +echo > /sys/kernel/security/ima/digest_list_data +# 删除IMA摘要列表文件 +echo > /sys/kernel/security/ima/digest_list_data_del +``` + +**调用`upload_digest_lists`导入方式** + +对于复用RPM签名的IMA摘要列表文件,需要调用`upload_digest_lists`命令实现导入。具体命令如下(注意指定的路径为对应的RPM摘要列表): + +``` +# 导入IMA摘要列表文件 +upload_digest_lists add +# 删除IMA摘要列表文件 +upload_digest_lists del +``` + +该流程相对复杂,需要满足以下前置条件: + +1) 系统已导入openEuler发布的`digest_list_tools`软件包中的摘要列表(包含IMA摘要列表和IMA PARSER摘要列表); + +2) 已配置对应用程序执行的IMA评估策略(BPRM_CHECK策略)。 + +#### 操作指导 + +##### RPM构建自动生成摘要列表 + +openEuler RPM工具链支持`%__brp_digest_list`宏定义,配置格式如下: + +``` +%__brp_digest_list /usr/lib/rpm/brp-digest-list %{buildroot} +``` + +当配置了该宏定义后,当用户调用`rpmbuild`命令进行软件包构建时,在RPM打包阶段会调用`/usr/lib/rpm/brp-digest-list`脚本进行摘要列表的生成和签名等流程。openEuler默认针对可执行程序、动态库、内核模块等关键文件生成摘要列表。用户也可以通过修改脚本,自行配置生成摘要列表的范围和指定签名密钥。如下示例使用用户自定义的签名密钥`/path/to/ima.key`进行摘要列表签名。 + +``` +...... 
(line 66) +DIGEST_LIST_TLV_PATH="$DIGEST_LIST_DIR.tlv/0-metadata_list-compact_tlv-$(basename $BIN_PKG_FILES)" +[ -f $DIGEST_LIST_TLV_PATH ] || exit 0 + +chmod 644 $DIGEST_LIST_TLV_PATH +echo $DIGEST_LIST_TLV_PATH + +evmctl ima_sign -f --key /path/to/ima.key -a sha256 $DIGEST_LIST_PATH &> /dev/null +chmod 400 $DIGEST_LIST_PATH.sig +mkdir -p $DIGEST_LIST_DIR.sig +mv $DIGEST_LIST_PATH.sig $DIGEST_LIST_DIR.sig +echo $DIGEST_LIST_DIR.sig/0-metadata_list-compact-$(basename $BIN_PKG_FILES).sig +``` + +##### IMA摘要列表度量 + +用户可通过如下功能开启IMA摘要列表度量: + +**步骤1:** 用户需要配置启动参数度量策略,开启IMA度量功能,具体步骤同**原生IMA度量**,不同的是需要单独配置度量所使用的TPM PCR寄存器,启动参数示例如下: + +``` +ima_policy=exec_tcb ima_digest_list_pcr=11 +``` + +**步骤2:** 用户导入IMA摘要列表,以`bash`软件包的摘要列表为例: + +``` +echo /etc/ima/digest_lists/0-metadata_list-compact-bash-5.1.8-6.oe2203sp1.x86_64 > /sys/kernel/security/ima/digest_list_data +``` + +可查询到IMA摘要列表的度量日志: + +``` +cat /sys/kernel/security/ima/ascii_runtime_measurements +``` + +导入IMA摘要列表后,如果后续度量的文件摘要值包含在IMA摘要列表中,则不会额外记录度量日志。 + +##### IMA摘要列表评估 + +###### 默认策略启动场景 + +用户可在启动参数中配置`ima_policy`参数指定IMA默认策略,则在内核启动阶段,IMA初始化完成后立即启用默认策略进行评估。用户可通过如下功能开启IMA摘要列表评估: + +**步骤1:** 执行`dracut`命令将摘要列表文件写入initrd: + +``` +dracut -f -e xattr +``` + +**步骤2:** 配置启动参数和IMA策略,典型的配置如下: + +``` +# 基于摘要列表的IMA评估log/enforce模式,只保护文件内容,配置默认策略为appraise_exec_tcb +ima_appraise=log ima_appraise_digest_list=digest-nometadata ima_policy="appraise_exec_tcb" initramtmpfs module.sig_enforce +ima_appraise=enforce ima_appraise_digest_list=digest-nometadata ima_policy="appraise_exec_tcb" initramtmpfs module.sig_enforce +# 基于摘要列表的IMA评估log/enforce模式,保护文件内容和扩展属性,配置默认策略为appraise_exec_tcb+appraise_exec_immutable +ima_appraise=log-evm ima_appraise_digest_list=digest ima_policy="appraise_exec_tcb|appraise_exec_immutable" initramtmpfs evm=x509 evm=complete module.sig_enforce +ima_appraise=enforce-evm ima_appraise_digest_list=digest ima_policy="appraise_exec_tcb|appraise_exec_immutable" initramtmpfs evm=x509 evm=complete module.sig_enforce +``` + 
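配置完成后,可先粗略自检启动参数中是否包含IMA评估所需的关键字段。下面是一个假设性的检查脚本(`check_ima_params` 为本示例自拟的函数名,并非openEuler提供的工具),仅演示对 `/proc/cmdline` 的字段匹配:

```shell
#!/bin/bash
# 示例脚本:粗略检查内核启动参数中是否包含IMA评估相关字段
# check_ima_params 为本示例自定义的函数名,非系统自带命令
check_ima_params() {
    local cmdline="$1"
    local p
    for p in ima_appraise ima_appraise_digest_list ima_policy; do
        case "$cmdline" in
            *"$p="*) echo "$p: 已配置" ;;
            *)       echo "$p: 未配置" ;;
        esac
    done
}

# 对当前系统的启动参数执行检查
if [ -r /proc/cmdline ]; then
    check_ima_params "$(cat /proc/cmdline)"
fi
```

若输出中存在“未配置”项,应先修正grub中的启动参数再继续后续步骤。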
+重启系统即可开启IMA摘要列表评估功能,启动过程中自动完成IMA策略生效和IMA摘要列表文件导入。 + +###### 无默认策略启动场景 + +用户可在启动参数中不配置`ima_policy`参数,代表系统启动阶段无默认策略,IMA评估机制等待用户导入策略后生效启用。 + +**步骤1:** 配置启动参数,典型的配置如下: + +``` +# 基于摘要列表的IMA评估log/enforce模式,只保护文件内容,无默认策略 +ima_appraise=log ima_appraise_digest_list=digest-nometadata initramtmpfs +ima_appraise=enforce ima_appraise_digest_list=digest-nometadata initramtmpfs +# 基于摘要列表的IMA评估log/enforce模式,保护文件内容和扩展属性,无默认策略 +ima_appraise=log-evm ima_appraise_digest_list=digest initramtmpfs evm=x509 evm=complete +ima_appraise=enforce-evm ima_appraise_digest_list=digest initramtmpfs evm=x509 evm=complete +``` + +重启系统,此时由于系统无策略,IMA评估并不生效。 + +**步骤2:** 导入IMA策略,将策略文件的全路径写入内核接口: + +``` +echo /path/to/policy > /sys/kernel/security/ima/policy +``` + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> 策略中需要包含一些固定规则,用户可参考如下策略模板: +> +> openEuler 22.03 LTS版本的策略模板如下(复用RPM签名场景): +> +``` +# 不评估securityfs文件系统的访问行为 +dont_appraise fsmagic=0x73636673 +# 其他用户自定义的dont_appraise规则 +...... +# 评估导入的IMA摘要列表文件 +appraise func=DIGEST_LIST_CHECK appraise_type=imasig +# 评估/usr/libexec/rpm_parser进程打开的所有文件 +appraise parser appraise_type=imasig +# 评估执行的应用程序(触发对/usr/libexec/rpm_parser执行的评估,也可以新增其他限制条件,如SELinux标签等) +appraise func=BPRM_CHECK appraise_type=imasig +# 其他用户自定义的appraise规则 +...... +``` +> +> openEuler 24.03 LTS版本的策略模板如下(IMA扩展属性签名或追加签名场景): +> +``` +# 用户自定义的dont_appraise规则 +...... +# 评估导入的IMA摘要列表文件 +appraise func=DIGEST_LIST_CHECK appraise_type=imasig|modsig +# 其他用户自定义的appraise规则 +...... 
+``` + +**步骤3:** 导入IMA摘要列表文件,对于不同签名方式的摘要列表,需要使用不同的导入方式。 + +openEuler 22.03 LTS的摘要列表导入方式如下(复用RPM签名的IMA摘要列表): + +``` +# 导入digest_list_tools软件包的摘要列表 +echo /etc/ima/digest_lists/0-metadata_list-compact-digest-list-tools-0.3.95-13.x86_64 > /sys/kernel/security/ima/digest_list_data +echo /etc/ima/digest_lists/0-parser_list-compact-libexec > /sys/kernel/security/ima/digest_list_data +# 导入其他的RPM摘要列表 +upload_digest_lists add /etc/ima/digest_lists +# 检查导入的摘要列表条数 +cat /sys/kernel/security/ima/digests_count +``` + +openEuler 24.03 LTS的摘要列表导入方式如下(追加签名的IMA摘要列表): + +``` +find /etc/ima/digest_lists -name "0-metadata_list-compact-*" -exec echo {} > /sys/kernel/security/ima/digest_list_data \; +``` + +##### 软件升级场景 + +开启IMA摘要列表功能后,对于覆盖在IMA保护范围内的文件,在升级更新场景需要同步更新摘要列表。对于openEuler发布的RPM包,在包安装、升级、卸载的同时,将自动完成RPM包中的摘要列表的添加、更新和删除,不需要用户手动操作。对于用户维护的非RPM格式的软件包,则需要手动完成摘要列表的导入。 + +##### 用户证书导入 + +用户可以通过导入自定义证书,从而针对非openEuler发布的软件进行度量或评估。openEuler IMA评估模式支持从如下两种密钥环中获取证书进行签名校验: + +- builtin_trusted_keys密钥环:内核编译时预置的根证书; +- ima密钥环:通过initrd中的/etc/keys/x509_ima.der导入,需要为builtin_trusted_keys密钥环中任意一本证书的子证书。 + +**将根证书导入builtin_trusted_keys密钥环的步骤如下:** + +**步骤1:** 生成根证书,以openssl命令为例: + +``` +echo 'subjectKeyIdentifier=hash' > root.cfg +openssl genrsa -out root.key 4096 +openssl req -new -sha256 -key root.key -out root.csr -subj "/C=AA/ST=BB/O=CC/OU=DD/CN=openeuler test ca" +openssl x509 -req -days 3650 -extfile root.cfg -signkey root.key -in root.csr -out root.crt +openssl x509 -in root.crt -out root.der -outform DER +``` + +**步骤2:** 获取openEuler kernel源码,以最新的OLK-5.10分支为例: + +``` +git clone https://gitee.com/openeuler/kernel.git -b OLK-5.10 +``` + +**步骤3:** 进入源码目录,并将根证书拷贝至目录下: + +``` +cd kernel +cp /path/to/root.der . 
+``` + +修改config文件的CONFIG_SYSTEM_TRUSTED_KEYS选项: + +``` +CONFIG_SYSTEM_TRUSTED_KEYS="./root.crt" +``` + +**步骤4:** 编译安装内核(步骤略,注意需要为内核模块生成摘要列表)。 + +**步骤5:** 重启后检查证书导入成功: + +``` +keyctl show %:.builtin_trusted_keys +``` + +**将子证书导入ima密钥环的步骤如下,注意需要提前将根证书导入builtin_trusted_keys密钥环:** + +**步骤1:** 基于根证书生成子证书,以openssl命令为例: + +``` +echo 'subjectKeyIdentifier=hash' > ima.cfg +echo 'authorityKeyIdentifier=keyid,issuer' >> ima.cfg +echo 'keyUsage=digitalSignature' >> ima.cfg +openssl genrsa -out ima.key 4096 +openssl req -new -sha256 -key ima.key -out ima.csr -subj "/C=AA/ST=BB/O=CC/OU=DD/CN=openeuler test ima" +openssl x509 -req -sha256 -CAcreateserial -CA root.crt -CAkey root.key -extfile ima.cfg -in ima.csr -out ima.crt +openssl x509 -outform DER -in ima.crt -out x509_ima.der +``` + +**步骤2:** 将IMA证书拷贝到/etc/keys目录下: + +``` +mkdir -p /etc/keys/ +cp x509_ima.der /etc/keys/ +``` + +**步骤3:** 打包initrd,将IMA证书和摘要列表置入initrd镜像中: + +``` +echo 'install_items+=" /etc/keys/x509_ima.der "' >> /etc/dracut.conf +dracut -f -e xattr +``` + +**步骤4:** 重启后检查证书导入成功: + +``` +keyctl show %:.ima +``` + +#### 典型使用场景 + +根据运行模式的不同,IMA摘要列表可应用于可信度量场景和用户态安全启动场景。 + +##### 可信度量场景 + +可信度量场景主要基于IMA摘要列表度量模式,由内核+硬件可信根(如TPM)共同完成对关键文件的度量,再结合远程证明工具链完成对当前系统的文件可信状态的证明: + +![](./figures/ima_trusted_measurement.png) + +**运行阶段** + +- 软件包部署时同步导入摘要列表,IMA对摘要列表进行度量并记录度量日志(同步扩展TPM); + +- 应用程序执行时触发IMA度量,若文件摘要值匹配白名单则忽略,否则记录度量日志(同步扩展TPM) 。 + +**证明阶段(业界通用流程)** + +- 远程证明服务器下发证明请求,客户端回传IMA度量日志以及经过签名的TPM PCR值; + +- 远程证明服务器依次校验PCR(校验签名)、度量日志(PCR回放)、文件度量信息(比对本地基准值)的正确性,上报结果至安全中心; + +- 安全管理中心采取对应操作,如事件通知、节点隔离等。 + +##### 用户态安全启动场景 + +用户态安全启动场景主要基于IMA摘要列表评估模式,与安全启动类似,旨在对执行的应用程序或访问的关键文件执行完整性校验,如果校验失败,则拒绝访问: + +![](./figures/ima_secure_boot.png) + +**运行阶段** + +- 应用部署时导入摘要列表,内核验签通过后,加载摘要值到内核哈希表中作为白名单; +- 应用程序执行时触发IMA校验,计算文件hash值,若与基线值一致,则允许访问,否则记录日志或拒绝访问 。 + +## 附录 + +### 内核编译选项说明 + +原生IMA/EVM提供的编译选项及说明如下: + +| 编译选项 | 功能 | +| :------------------------------- | :------------------------ | +| CONFIG_INTEGRITY | IMA/EVM 总编译开关 | +| 
CONFIG_INTEGRITY_SIGNATURE | 使能IMA签名校验 | +| CONFIG_INTEGRITY_ASYMMETRIC_KEYS | 使能IMA非对称签名校验 | +| CONFIG_INTEGRITY_TRUSTED_KEYRING | 使能 IMA/EVM 密钥环 | +| CONFIG_INTEGRITY_AUDIT | 编译 IMA audit 审计模块 | +| CONFIG_IMA | IMA 总编译开关 | +| CONFIG_IMA_WRITE_POLICY | 允许在运行阶段更新IMA策略 | +| CONFIG_IMA_MEASURE_PCR_IDX | 允许指定IMA度量 PCR 序号 | +| CONFIG_IMA_LSM_RULES | 允许配置 LSM 规则 | +| CONFIG_IMA_APPRAISE | IMA 评估总编译开关 | +| IMA_APPRAISE_BOOTPARAM | 启用IMA评估启动参数 | +| CONFIG_EVM | EVM 总编译开关 | + +openEuler IMA摘要列表特性提供的编译选项及说明如下(openEuler内核编译默认开启): + +| 编译选项 | 功能 | +| :----------------- | :---------------------- | +| CONFIG_DIGEST_LIST | 开启IMA摘要列表特性开关 | + +### IMA摘要列表根证书说明 + +openEuler 22.03版本使用RPM密钥对IMA摘要列表进行签名,为保证IMA功能开箱可用,openEuler内核编译时默认将RPM根证书(PGP证书)导入内核。当前包含旧版本使用的OBS证书和openEuler 22.03 LTS SP1版本切换的openEuler证书: + +```shell +# cat /proc/keys | grep PGP +1909b4ad I------ 1 perm 1f030000 0 0 asymmetri private OBS b25e7f66: PGP.rsa b25e7f66 [] +2f10cd36 I------ 1 perm 1f030000 0 0 asymmetri openeuler fb37bc6f: PGP.rsa fb37bc6f [] +``` + +由于当前内核不支持导入PGP子公钥,而切换后的openEuler证书采用子密钥签名,因此openEuler内核编译前对证书进行了预处理,抽取子公钥并导入内核,具体处理流程可见内核软件包代码仓内的process_pgp_certs.sh脚本文件: + +openEuler 24.03及之后的版本支持IMA专用证书,详见[证书签名](../CertSignature/签名证书介绍.md)文档相关章节。 + +如果用户不使用IMA摘要列表功能或使用其他密钥实现签名/验签,则可将相关代码移除,自行实现内核根证书配置。 + +### FAQ + +#### FAQ1:开启IMA评估enforce模式并配置默认策略后,系统启动失败 + +IMA默认策略可能包含对应用程序执行、内核模块加载等关键文件访问流程的校验,如果关键文件访问失败,可能导致系统无法启动。通常原因有: + +1) IMA校验证书未导入内核,导致摘要列表无法被正确校验; +2) 摘要列表文件未正确签名,导致摘要列表校验失败; +3) 摘要列表文件未导入initrd中,导致启动过程无法导入摘要列表; +4) 摘要列表文件和应用程序不匹配,导致应用程序匹配已导入的摘要列表失败。 + +用户需要通过log模式进入系统进行问题定位和修复。重启系统,进入grub界面修改启动参数,采用log模式启动: + +``` +ima_appraise=log +``` + +系统启动后,可参考如下流程进行问题排查: + +**步骤1:** 检查keyring中的IMA证书: + +``` +keyctl show %:.builtin_trusted_keys +``` + +对于openEuler LTS版本,至少应存在以下几本内核证书(其他未列出版本可根据发布时间前推参考): + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| 版本 | 证书 |
| ---- | ---- |
| openEuler 22.03 LTS | private OBS b25e7f66 |
| openEuler 22.03 LTS SP1/2/3 | private OBS b25e7f66<br>openeuler <openeuler@compass-ci.com> b675600b |
| openEuler 22.03 LTS SP4 | private OBS b25e7f66<br>openeuler <openeuler@compass-ci.com> b675600b<br>openeuler <openeuler@compass-ci.com> fb37bc6f |
| openEuler 24.03 | openEuler kernel ICA 1: 90bb67eb4b57eb62bf6f867e4f56bd4e19e7d041 |
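针对表中列出的证书,可用如下假设性脚本对 `keyctl` 的输出做批量比对(`check_certs` 为本示例自拟的函数名,证书标识列表需按实际运行的版本调整):

```shell
#!/bin/bash
# 示例脚本:检查密钥环查询结果中是否包含给定的证书标识
# check_certs 为本示例自定义的函数名,第一个参数为keyctl输出文本,其余参数为证书标识
check_certs() {
    local keys="$1" id
    shift
    for id in "$@"; do
        case "$keys" in
            *"$id"*) echo "$id: 已导入" ;;
            *)       echo "$id: 未找到" ;;
        esac
    done
}

# 实际使用时将keyctl输出作为第一个参数传入,例如:
# check_certs "$(keyctl show %:.builtin_trusted_keys)" b25e7f66 b675600b fb37bc6f
```

任一证书显示“未找到”时,可按后续步骤继续排查证书导入流程。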
+ +如果用户导入了其他内核根证书,也同样需要通过`keyctl`命令查询确认证书是否被成功导入。openEuler默认不使用IMA密钥环,如果用户存在使用的情况,则需要通过如下命令查询IMA密钥环中是否存在用户证书: + +``` +keyctl show %:.ima +``` + +如果排查结果为证书未正确导入,则用户需要根据*用户证书导入场景*章节进行流程排查。 + +**步骤2:** 检查摘要列表携带签名信息: + +用户可通过如下命令查询当前系统中的摘要列表文件: + +``` +ls /etc/ima/digest_lists | grep '_list-compact-' +``` + +对于每个摘要列表文件,需要检查存在**以下三种之一**的签名信息: + +1) 检查该摘要列表文件存在对应的**RPM摘要列表文件**,且**RPM摘要列表文件**的ima扩展属性中包含签名值。以bash软件包的摘要列表为例,摘要列表文件路径为: + +``` +/etc/ima/digest_lists/0-metadata_list-compact-bash-5.1.8-6.oe2203sp1.x86_64 +``` + +RPM摘要列表路径为: + +``` +/etc/ima/digest_lists/0-metadata_list-rpm-bash-5.1.8-6.oe2203sp1.x86_64 +``` + +检查RPM摘要列表签名,即文件的`security.ima`扩展属性不为空: + +``` +getfattr -n security.ima /etc/ima/digest_lists/0-metadata_list-rpm-bash-5.1.8-6.oe2203sp1.x86_64 +``` + +2) 检查摘要列表文件的`security.ima`扩展属性不为空: + +``` +getfattr -n security.ima /etc/ima/digest_lists/0-metadata_list-compact-bash-5.1.8-6.oe2203sp1.x86_64 +``` + +3) 检查摘要列表文件的末尾包含了签名信息,可通过检查文件内容末尾是否包含`~Module signature appended~`魔鬼字符串进行判断(仅openEuler 24.03 LTS及之后版本支持的签名方式): + +``` +tail -c 28 /etc/ima/digest_lists/0-metadata_list-compact-kernel-6.6.0-28.0.0.34.oe2403.x86_64 +``` + +如果排查结果为摘要列表未包含签名信息,则用户需要根据*摘要列表签名机制说明*章节进行流程排查。 + +**步骤3:** 检查摘要列表的签名信息正确: + +在确保摘要列表已携带签名信息的情况下,用户还需要确保摘要列表采用正确的私钥签名,即签名私钥和内核中的证书匹配。除用户自行进行私钥检查外,还可通过dmesg日志或audit日志(默认路径为`/var/log/audit/audit.log`)判断是否有签名校验失败的情况发生。典型的日志输出如下: + +``` +type=INTEGRITY_DATA msg=audit(1722578008.756:154): pid=3358 uid=0 auid=0 ses=1 subj=unconfined_u:unconfined_r:haikang_t:s0-s0:c0.c1023 op=appraise_data cause=invalid-signature comm="bash" name="/root/0-metadata_list-compact-bash-5.1.8-6.oe2203sp1.x86_64" dev="dm-0" ino=785161 res=0 errno=0UID="root" AUID="root" +``` + +如果检查结果为签名信息错误,则用户需要根据*摘要列表签名机制说明*章节进行流程排查。 + +**步骤4:** 检查initrd中是否导入摘要列表文件: + +用户需要通过如下命令查询当前initrd中是否存在摘要列表文件: + +``` +lsinitrd | grep 'etc/ima/digest_lists' +``` + +如果未查询到摘要列表文件,则用户需要重新制作initrd,并再次检查摘要列表导入成功: + +``` +dracut -f -e xattr +``` + +**步骤5:** 检查IMA摘要列表和应用程序是否匹配: + +参考FAQ2章节。 + +#### 
FAQ2:开启IMA评估enforce模式后,部分文件执行失败 + +开启IMA评估enforce模式后,对于配置IMA策略的文件访问,如果文件的内容或扩展属性设置有误(和导入的摘要列表不匹配),则可能会导致文件访问被拒绝。通常原因有: + +1) 摘要列表未成功导入(可参考FAQ1); + +2) 文件内容或属性被篡改。 + + 对于出现文件执行失败的场景,首先需要确定摘要列表文件已经成功导入内核,用户可以检查摘要列表数量判断导入情况: + +``` +cat /sys/kernel/security/ima/digests_count +``` + +然后用户可通过audit日志(默认路径为`/var/log/audit/audit.log`)判断具体哪个文件校验失败以及原因。典型的日志输出如下: + +``` +type=INTEGRITY_DATA msg=audit(1722811960.997:2967): pid=7613 uid=0 auid=0 ses=1 subj=unconfined_u:unconfined_r:haikang_t:s0-s0:c0.c1023 op=appraise_data cause=IMA-signature-required comm="bash" name="/root/test" dev="dm-0" ino=814424 res=0 errno=0UID="root" AUID="root" +``` + +在确定校验失败的文件后,可对比TLV摘要列表确定文件被篡改的原因。对于未开启扩展属性校验的场景,仅对比文件SHA256哈希值和TLV摘要列表中的`IMA digest`项即可,对于开启扩展属性校验的场景,则还需要对比文件当前的属性和TLV摘要列表中显示扩展属性的区别。 + +在确定问题原因后,可通过还原文件的内容及属性,或对当前文件再次生成摘要列表,签名并导入内核的方式解决问题。 + +#### FAQ3:开启IMA评估模式后,跨openEuler 22.03 LTS SP版本安装软件包时出现报错信息 + +开启IMA评估模式后,当安装不同版本的openEuler 22.03 LTS的软件包时,会自动触发IMA摘要列表的导入。其中包含对摘要列表的签名验证流程,即使用内核中的证书验证摘要列表的签名。由于openEuler在演进过程中,签名证书发生变化,因此部分跨版本安装场景存在后向兼容问题(无前向兼容问题,即新版本的内核可正常校验旧版本的IMA摘要列表文件)。 + +建议用户确认当前内核中包含以下几本签名证书: + +``` +# keyctl show %:.builtin_trusted_keys +Keyring + 566488577 ---lswrv 0 0 keyring: .builtin_trusted_keys + 383580336 ---lswrv 0 0 \_ asymmetric: openeuler b675600b + 453794670 ---lswrv 0 0 \_ asymmetric: private OBS b25e7f66 + 938520011 ---lswrv 0 0 \_ asymmetric: openeuler fb37bc6f +``` + +如缺少证书,建议将内核升级至最新版本。 + +``` +yum update kernel +``` + +openEuler 24.03 LTS及之后版本已具备IMA专用证书,且支持证书链校验,证书生命周期可覆盖整个LTS版本。 + +#### FAQ4:开启IMA摘要列表评估模式后,IMA摘要列表文件签名正确,但是导入失败 + +IMA摘要列表导入存在检查机制,如果某次导入过程中,摘要列表的签名校验失败,则会关闭摘要列表导入功能,从而导致后续即使正确签名的摘要列表文件也无法被导入。用户可检查dmesg日志中是否存在如下打印确认是否为该原因导致: + +``` +# dmesg +ima: 0-metadata_list-compact-bash-5.1.8-6.oe2203sp1.x86_64 not appraised, disabling digest lists lookup for appraisal +``` + +如上述日志,则说明在开启IMA摘要列表评估模式的情况下,已经导入了一个签名错误的摘要列表文件,从而导致功能关闭。此时用户需要重启系统,并修复错误的摘要列表签名信息。 + +#### FAQ5:openEuler 24.03 LTS及之后版本导入用户自定义的IMA证书失败 + +Linux 
6.6内核新增了对导入证书的字段校验限制,对于导入IMA密钥环的证书,需要满足如下约束(遵循X.509标准格式): + +- 为数字签名证书,即设置`keyUsage=digitalSignature`字段; +- 非CA证书,即不可设置`basicConstraints=CA:TRUE`字段; +- 非中间证书,即不可设置`keyUsage=keyCertSign`字段。 + +#### FAQ6:开启IMA评估模式后kdump服务启动失败 + +开启IMA评估enforce模式后,如果IMA策略中配置了如下KEXEC_KERNEL_CHECK规则,可能导致kdump服务启动失败。 + +```shell +appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig +``` + +原因是在该场景下,所有通过KEXEC加载的文件都需要经过完整性校验,因此内核限制kdump加载内核映像文件时必须使用kexec_file_load系统调用。可通过修改/etc/sysconfig/kdump配置文件的KDUMP_FILE_LOAD开启kexec_file_load系统调用。 + +```shell +KDUMP_FILE_LOAD="on" +``` + +同时,kexec_file_load系统调用自身也会执行文件的签名校验,因此要求被加载的内核映像文件必须包含正确的安全启动签名,而且当前内核中必须包含对应的验签证书。 diff --git "a/docs/zh/docs/Administration/\345\212\250\346\200\201\345\256\214\346\225\264\346\200\247\345\272\246\351\207\217\357\274\210DIM\357\274\211.md" "b/docs/zh/docs/Administration/\345\212\250\346\200\201\345\256\214\346\225\264\346\200\247\345\272\246\351\207\217\357\274\210DIM\357\274\211.md" new file mode 100644 index 0000000000000000000000000000000000000000..69fe61f71cbe09520a193652e91c91c28d4aeb01 --- /dev/null +++ "b/docs/zh/docs/Administration/\345\212\250\346\200\201\345\256\214\346\225\264\346\200\247\345\272\246\351\207\217\357\274\210DIM\357\274\211.md" @@ -0,0 +1,732 @@ +# 动态完整性度量(DIM) + +本章节为DIM(Dynamic Integrity Measurement)动态完整性度量的特性介绍以及使用说明。 + +## 背景 + +随着信息产业的不断发展,信息系统所面临的安全风险也日益增长。信息系统中可能运行大量软件,部分软件不可避免地存在漏洞,这些漏洞一旦被攻击者利用,可能会对系统业务造成严重的影响,如数据泄露、服务不可用等。 + +绝大部分的软件攻击,都会伴随着完整性破坏,如恶意进程运行、配置文件篡改、后门植入等。因此业界提出了完整性保护技术,指的是从系统启动开始,对关键数据进行度量和校验,从而保证系统运行达到预期效果。当前业界已广泛使用的完整性保护技术(如安全启动、文件完整性度量等)都无法对进程运行时的内存数据进行保护。如果攻击者利用一些手段修改了进程的代码指令,可能导致进程被劫持或被植入后门,具有攻击性强,隐蔽性高的特点。对于这种攻击手段,业界提出了动态完整性度量技术,即对进程的运行时内存中的关键数据进行度量和保护。 + +## 术语说明 + +静态基线:针对度量目标的二进制文件进行解析所生成的度量基准数据; + +动态基线:针对度量目标执行首次度量的结果; + +度量策略:指定度量目标的配置信息; + +度量日志:存储度量结果的列表,包含度量对象、度量结果等信息。 + +## 特性简介 + +DIM特性通过在程序运行时对内存中的关键数据(如代码段、数据段)进行度量,并将度量结果和基准值进行对比,确定内存数据是否被篡改,从而检测攻击行为,并采取应对措施。 + +### 功能范围 + +- 当前DIM特性支持在ARM64/X86架构系统中运行; +- 当前DIM特性支持对以下关键内存数据执行度量: + - 
用户态进程的代码段:对应ELF文件中属性为PT_LOAD、权限为RX的段,对应进程加载后权限为RX的vma区域; + - 内核模块代码段:起始地址为内核模块对应struct module结构体中的core_layout.base,长度为core_layout.text_size; + - 内核代码段:对应\_stext符号至\_etext,跳过可能由于内核static key机制发生变化的地址。 +- 当前DIM特性支持对接以下硬件平台: + - 支持将度量结果扩展至TPM 2.0芯片的PCR寄存器,以实现远程证明服务对接。 + +### 技术限制 + +- 对于用户态进程,仅支持度量文件映射代码段,不支持度量匿名代码段; +- 不支持度量内核热补丁; +- 仅支持主动触发机制,如果两次触发过程中发生了篡改-恢复的行为,会导致无法识别攻击; +- 对于主动修改代码段的场景(如代码段重定位、自修改或热补丁),会被识别为攻击; +- 对于内核、内核模块的度量,以触发动态基线时的度量结果作为度量基准值,静态基线值仅作为一个固定标识; +- 度量目标必须在触发动态基线的时刻就已在内存中加载(如进程运行或内核模块加载),否则后续无法度量; +- 在需要使用TPM芯片的PCR寄存器验证度量日志的场景下,DIM模块不允许卸载,否则会导致度量日志清空,而无法和PCR寄存器匹配; +>![](./public_sys-resources/icon-note.gif) **说明:** +> +>特性启用后,会对系统性能存在一定影响,主要包括以下方面: +> - DIM特性自身加载以及基线数据、度量日志管理会对系统内存造成消耗,具体影响与保护策略配置相关; +> - DIM特性执行度量期间需要进行哈希运算,造成CPU消耗,具体影响与需要度量的数据大小有关; +> - DIM特性执行度量期间需要对部分资源执行上锁或获取信号量操作,可能导致其他并发进程等待。 + +### 规格约束 + +| 规格项 | 值 | +| ------------------------------------------------------------ | ---- | +| 文件大小上限(策略文件、静态基线文件、签名文件、证书文件) | 10MB | +| 同一个度量目标在一次动态基线后多次度量期间最多记录的篡改度量日志条数 | 10条 | +| /etc/dim/policy中度量策略最大可记录数|10000条| + +### 架构说明 + +DIM包含两个软件包dim_tools和dim,分别提供如下组件: + +| 软件包 | 组件 | 说明 | +| --------- | ---------------- | ------------------------------------------------------------ | +| dim_tools | dim_gen_baseline | 用户态组件,静态基线生成工具,用于生成动态度量所需要的基线数据,该基线数据在DIM特性运行时会被导入并作为度量基准值 | +| dim | dim_core | 内核模块,执行核心的动态度量逻辑,包括策略解析、静态基线解析、动态基线建立、度量执行、度量日志记录、TPM芯片扩展操作等,实现对内存关键数据的度量功能 | +| dim | dim_monitor | 内核模块,执行对dim_core的代码段和关键数据的度量保护,一定程度防止由于dim_core遭受攻击导致的DIM功能失效。 | + +整体架构如下图所示: + +![](./figures/dim_architecture.jpg) + +### 关键流程说明 + +dim_core和dim_monitor模块均提供了对内存数据的度量功能,包含两个核心流程: + +- 动态基线流程:dim_core模块读取并解析策略和静态基线文件,然后对目标进程执行代码段度量,度量结果在内存中以动态基线形式存放,最后将动态基线数据和静态基线数据进行对比,并将对比结果记录度量日志;dim_monitor模块对dim_core模块的代码段和关键数据进行度量,作为动态基线并记录度量日志; +- 动态度量流程:dim_core和dim_monitor模块对目标执行度量,并将度量结果与动态基线值进行对比,如果对比不一致,则将结果记录度量日志。 + +### 接口说明 + +#### 文件路径说明 + +| 路径 | 说明 | +| ------------------------------- | ------------------------------------------------------------ | 
+| /etc/dim/policy | 度量策略文件 | +| /etc/dim/policy.sig | 度量策略签名文件,用于存放策略文件的签名信息,在签名校验功能开启的情况下使用 | +| /etc/dim/digest_list/*.hash | 静态基线文件,用于存放度量的基准值信息 | +| /etc/dim/digest_list/*.hash.sig | 静态基线签名文件,用于存放静态基线文件的签名信息,在签名校验功能开启的情况下使用 | +| /etc/keys/x509_dim.der | 证书文件,用于校验策略文件和静态基线文件的签名信息,在签名校验功能开启的情况下使用 | +| /sys/kernel/security/dim | DIM文件系统目录,DIM内核模块加载后生成,目录下提供对DIM功能进行操作的内核接口 | + +#### 文件格式说明 + +1. 度量策略文件格式说明 + + 文本文件,以UNIX换行符进行分隔,每行代表一条度量策略,当前支持以下几种配置格式: + + 1. 用户态进程代码段度量配置: + + ``` + measure obj=BPRM_TEXT path=<度量目标进程可执行文件或动态库对应二进制文件的绝对路径> + ``` + + 2. 内核模块代码段度量配置: + + ``` + measure obj=MODULE_TEXT name=<内核模块名> + ``` + + 3. 内核度量配置: + + ``` + measure obj=KERNEL_TEXT + ``` + +**参考示例:** + +``` +# cat /etc/dim/policy +measure obj=BPRM_TEXT path=/usr/bin/bash +measure obj=BPRM_TEXT path=/usr/lib64/libc.so.6 +measure obj=MODULE_TEXT name=ext4 +measure obj=KERNEL_TEXT +``` + +2. 静态基线文件格式说明 + + 文本文件,以UNIX换行符进行分隔,每行代表一条静态基线,当前支持以下几种配置格式: + + 1. 用户态进程基线: + + ``` + dim USER sha256:6ae347be2d1ba03bf71d33c888a5c1b95262597fbc8d00ae484040408a605d2b <度量目标进程可执行文件或动态库对应二进制文件的绝对路径> + ``` + + 2. 内核模块基线: + + ``` + dim KERNEL sha256:a18bb578ff0b6043ec5c2b9b4f1c5fa6a70d05f8310a663ba40bb6e898007ac5 <内核release号>/<内核模块名> + ``` + + 3. 内核基线: + + ``` + dim KERNEL sha256:2ce2bc5d65e112ba691c6ab46d622fac1b7dbe45b77106631120dcc5441a3b9a <内核release号> + ``` + +**参考示例:** + +``` +dim USER sha256:6ae347be2d1ba03bf71d33c888a5c1b95262597fbc8d00ae484040408a605d2b /usr/bin/bash +dim USER sha256:bc937f83dee4018f56cc823f5dafd0dfedc7b9872aa4568dc6fbe404594dc4d0 /usr/lib64/libc.so.6 +dim KERNEL sha256:a18bb578ff0b6043ec5c2b9b4f1c5fa6a70d05f8310a663ba40bb6e898007ac5 6.4.0-1.0.1.4.oe2309.x86_64/dim_monitor +dim KERNEL sha256:2ce2bc5d65e112ba691c6ab46d622fac1b7dbe45b77106631120dcc5441a3b9a 6.4.0-1.0.1.4.oe2309.x86_64 +``` + +3. 度量日志格式说明 + + 文本内容,以UNIX换行符进行分隔,每行代表一条度量日志,格式为: + +``` + <度量日志哈希值> <度量算法>:<度量哈希值> <度量对象> <度量日志类型> +``` + +**参考示例:** + + 1. 
对bash进程代码段执行度量,度量结果与静态基线一致: + + ``` + 12 0f384a6d24e121daf06532f808df624d5ffc061e20166976e89a7bb24158eb87 sha256:db032449f9e20ba37e0ec4a506d664f24f496bce95f2ed972419397951a3792e /usr/bin.bash [static baseline] + ``` + + 2. 对bash进程代码段执行度量,度量结果与静态基线不一致: + + ``` + 12 0f384a6d24e121daf06532f808df624d5ffc061e20166976e89a7bb24158eb87 sha256:db032449f9e20ba37e0ec4a506d664f24f496bce95f2ed972419397951a3792e /usr/bin.bash [tampered] + ``` + + 3. 对ext4内核模块代码段执行度量,未找到静态基线: + + ``` + 12 0f384a6d24e121daf06532f808df624d5ffc061e20166976e89a7bb24158eb87 sha256:db032449f9e20ba37e0ec4a506d664f24f496bce95f2ed972419397951a3792e ext4 [no static baseline] + ``` + + 4. dim_monitor对dim_core执行度量,记录基线时的度量结果: + + ``` + 12 660d594ba050c3ec9a7cdc8cf226c5213c1e6eec50ba3ff51ff76e4273b3335a sha256:bdab94a05cc9f3ad36d29ebbd14aba8f6fd87c22ae580670d18154b684de366c dim_core.text [dynamic baseline] + 12 28a3cefc364c46caffca71e7c88d42cf3735516dec32796e4883edcf1241a7ea sha256:0dfd9656d6ecdadc8ec054a66e9ff0c746d946d67d932cd1cdb69780ccad6fb2 dim_core.data [dynamic baseline] + ``` + +4. 证书/签名文件格式说明 + +为通用格式,详见[开启签名校验](#开启签名校验)章节。 + +#### 内核模块参数说明 + +1. dim_core模块参数 + +| 参数名 | 参数内容 | 取值范围 | 默认值 | +| -------------------- | ------------------------------------------------------------ | ------------------------ | ------ | +| measure_log_capacity | 度量日志最大条数,当dim_core记录的度量日志数量达到参数设置时,停止记录度量日志 | 100-UINT_MAX(64位系统) | 100000 | +| measure_schedule | 度量完一个进程/模块后调度的时间,单位毫秒,设置为0代表不调度 | 0-1000 | 0 | +| measure_interval | 自动度量周期,单位分钟,设置为0代表不设置自动度量 | 0-525600 | 0 | +| measure_hash | 度量哈希算法 | sha256, sm3 | sha256 | +| measure_pcr | 将度量结果扩展至TPM芯片的PCR寄存器,设置为0代表不扩展(注意需要与芯片实际的PCR编号保持一致) | 0-128 | 0 | +| signature | 是否启用策略文件和签名机制,设置为0代表不启用,设置为1代表启用 | 0, 1 | 0 | + +**使用示例**: + +``` +insmod /path/to/dim_core.ko measure_log_capacity=10000 measure_schedule=10 measure_pcr=12 +modprobe dim_core measure_log_capacity=10000 measure_schedule=10 measure_pcr=12 +``` + +2. 
dim_monitor模块参数 + +| 参数名 | 参数内容 | 取值范围 | 默认值 | +| -------------------- | ------------------------------------------------------------ | ------------------------ | ------ | +| measure_log_capacity | 度量日志最大条数,当dim_monitor记录的度量日志数量达到参数设置时,停止记录度量日志 | 100-UINT_MAX(64位系统) | 100000 | +| measure_hash | 度量哈希算法 | sha256, sm3 | sha256 | +| measure_pcr | 将度量结果扩展至TPM芯片的PCR寄存器,设置为0代表不扩展 | 0-128 | 0 | + +**使用示例**: + +``` +insmod /path/to/dim_monitor.ko measure_log_capacity=10000 measure_hash=sm3 +modprobe dim_monitor measure_log_capacity=10000 measure_hash=sm3 +``` + +#### 内核接口说明 + +1. dim_core模块接口 + +| 接口名 | 属性 | 功能 | 示例 | +| -------------------------- | ---- | ------------------------------------------------------------ | --------------------------------------------------------- | +| measure | 只写 | 写入字符串1触发动态度量,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/measure | +| baseline_init | 只写 | 写入字符串1触发动态基线,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/baseline_init | +| ascii_runtime_measurements | 只读 | 读取接口查询度量日志 | cat /sys/kernel/security/dim/ascii_runtime_measurements | +| runtime_status | 只读 | 读取接口返回状态类型信息,失败返回错误码 | cat /sys/kernel/security/dim/runtime_status | +| interval | 读写 | 写入数字字符串设置自动度量周期(范围同measure_interval参数);读取接口查询当前自动度量周期,失败返回错误码 | echo 10 > /sys/kernel/security/dim/interval
cat /sys/kernel/security/dim/interval | + +**dim_core状态类型信息说明:** + +状态信息以如下字段取值: + +- DIM_NO_BASELINE:表示dim_core已加载,但未进行任何操作; +- DIM_BASELINE_RUNNING:表示正在进行动态基线建立; +- DIM_MEASURE_RUNNING:表示正在进行动态度量度量; +- DIM_PROTECTED:表示已完成动态基线建立,处于受保护状态; +- DIM_ERROR:执行动态基线建立或动态度量时发生错误,需要用户解决错误后重新触发动态基线建立或动态度量。 + +2. dim_monitor模块接口 + +| 接口名 | 属性 | 说明 | 示例 | +| ---------------------------------- | ---- | ---------------------------------------------- | ------------------------------------------------------------ | +| monitor_run | 只写 | 写入字符串1触发度量,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/monitor_run | +| monitor_baseline | 只写 | 写入字符串1触发基线,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/monitor_baseline | +| monitor_ascii_runtime_measurements | 只读 | 读取接口查询度量日志 | cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements | +| monitor_status | 只读 | 读取接口返回状态类型信息,失败返回错误码 | cat /sys/kernel/security/dim/monitor_status | + +**dim_monitor状态类型信息说明:** + +- ready:表示dim_monitior已加载,但未进行任何操作; +- running:表示正在进行动态基线建立或动态度量; +- error:执行动态基线建立或动态度量时发生错误,需要用户解决错误后重新触发动态基线建立或动态度量; +- protected:表示已完成动态基线建立,处于受保护状态。 + +#### 用户态工具接口说明 + +dim_gen_baseline命令行接口,详见: 。 + +## 如何使用 + +### 安装/卸载 + +**前置条件**: + +- OS版本:支持openEuler 23.09及以上版本; +- 内核版本:支持openEuler kernel 5.10/6.4版本。 + +安装dim_tools和dim软件包,以openEuler 23.09版本为例: + +``` +# yum install -y dim_tools dim +``` + +软件包安装完成后,DIM内核组件不会默认加载,可通过如下命令进行加载和卸载: + +``` +# modprobe dim_core 或 insmod /path/to/dim_core.ko +# modprobe dim_monitor 或 insmod /path/to/dim_monitor.ko +# rmmod dim_monitor +# rmmod dim_core +``` + +加载成功后,可以通过如下命令查询: + +``` +# lsmod | grep dim_core +dim_core 77824 1 dim_monitor +# lsmod | grep dim_monitor +dim_monitor 36864 0 +``` + +卸载前需要先卸载ko,再卸载rpm包 + +``` +# rmmod dim_monitor +# rmmod dim_core +# rpm -e dim +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> dim_monitor必须后于dim_core加载,先于dim_core卸载; +> 也可使用源码编译安装,详见 。 + +### 度量用户态进程代码段 + +**前置条件**: + +- dim_core模块加载成功; + +- 
用户需要准备一个常驻的度量目标用户态程序,本小节以程序路径/opt/dim/demo/dim_test_demo为例: + + ``` + # /opt/dim/demo/dim_test_demo & + ``` + +**步骤1**:为度量目标进程对应的二进制文件生成静态基线 + +``` +# mkdir -p /etc/dim/digest_list +# dim_gen_baseline /opt/dim/demo/dim_test_demo -o /etc/dim/digest_list/test.hash +``` + +**步骤2**:配置度量策略 + +``` +# echo "measure obj=BPRM_TEXT path=/opt/dim/demo/dim_test_demo" > /etc/dim/policy +``` + +**步骤3**:触发动态基线建立 + +``` +# echo 1 > /sys/kernel/security/dim/baseline_init +``` + +**步骤4**:查询度量日志 + +``` +# cat /sys/kernel/security/dim/ascii_runtime_measurements +0 e9a79e25f091e03a8b3972b1a0e4ae2ccaed1f5652857fe3b4dc947801a6913e sha256:02e28dff9997e1d81fb806ee5b784fd853eac8812059c4dba7c119c5e5076989 /opt/dim/demo/dim_test_demo [static baseline] +``` + +如上度量日志说明目标进程被成功度量,且度量结果与静态基线一致。 + +**步骤5**:触发动态度量 + +``` +# echo 1 > /sys/kernel/security/dim/measure +``` + +度量完成后可通过**步骤4**查询度量日志,如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。如果攻击者尝试篡改目标程序(如采用修改代码重新编译的方式,过程略)并重新启动目标程序: + +``` +# pkill dim_test_demo +# /opt/dim/demo/dim_test_demo & +``` + +再次触发度量并查询度量日志,可以发现有标识为“tampered”的度量日志: + +``` +# echo 1 > /sys/kernel/security/dim/measure +# cat /sys/kernel/security/dim/ascii_runtime_measurements +0 e9a79e25f091e03a8b3972b1a0e4ae2ccaed1f5652857fe3b4dc947801a6913e sha256:02e28dff9997e1d81fb806ee5b784fd853eac8812059c4dba7c119c5e5076989 /opt/dim/demo/dim_test_demo [static baseline] +0 08a2f6f2922ad3d1cf376ae05cf0cc507c2f5a1c605adf445506bc84826531d6 sha256:855ec9a890ff22034f7e13b78c2089e28e8d217491665b39203b50ab47b111c8 /opt/dim/demo/dim_test_demo [tampered] +``` + +### 度量内核模块代码段 + +**前置条件**: + +- dim_core模块加载成功; + +- 用户需要准备一个度量目标内核模块,本小节假设内核模块路径为/opt/dim/demo/dim_test_module.ko,模块名为dim_test_module: + + ``` + # insmod /opt/dim/demo/dim_test_module.ko + ``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +>需要保证内核模块的内核编译环境版本号和当前系统内核版本号一致,可以使用如下方法确认: +> +>``` +># modinfo dim_monitor.ko | grep vermagic | grep "$(uname -r)" +>vermagic: 6.4.0-1.0.1.4.oe2309.x86_64 SMP preempt mod_unload modversions 
+>``` + +即内核模块vermagic信息的第一个字段需要和当前内核版本号完全一致。 + +**步骤1**:为度量目标内核模块生成静态基线 + +``` +# mkdir -p /etc/dim/digest_list +# dim_gen_baseline /opt/dim/demo/dim_test_module.ko -o /etc/dim/digest_list/test.hash +``` + +**步骤2**:配置度量策略 + +``` +# echo "measure obj=MODULE_TEXT name=dim_test_module" > /etc/dim/policy +``` + +**步骤3**:触发动态基线建立 + +``` +# echo 1 > /sys/kernel/security/dim/baseline_init +``` + +**步骤4**:查询度量日志 + +``` +# cat /sys/kernel/security/dim/ascii_runtime_measurements +0 9603a9d5f87851c8eb7d2619f7abbe28cb8a91f9c83f5ea59f036794e23d1558 sha256:9da4bccc7ae1b709deab8f583b244822d52f3552c93f70534932ae21fac931c6 dim_test_module [static baseline] +``` + +如上度量日志说明dim_test_module模块被成功度量,并以当前的度量结果作为后续度量的基准值(此时度量日志中的哈希值不代表实际度量值)。 + +**步骤5**:触发动态度量 + +``` +echo 1 > /sys/kernel/security/dim/measure +``` + +度量完成后可通过**步骤4**查询度量日志,如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。如果攻击者尝试篡改内核模块(如采用修改代码重新编译的方式,过程略)并重新加载: + +``` +rmmod dim_test_module +insmod /opt/dim/demo/dim_test_module.ko +``` + +再次触发度量并查询度量日志,可以发现有标识为“tampered”的度量日志: + +``` +# cat /sys/kernel/security/dim/ascii_runtime_measurements +0 9603a9d5f87851c8eb7d2619f7abbe28cb8a91f9c83f5ea59f036794e23d1558 sha256:9da4bccc7ae1b709deab8f583b244822d52f3552c93f70534932ae21fac931c6 dim_test_module [static baseline] +0 6205915fe63a7042788c919d4f0ff04cc5170647d7053a1fe67f6c0943cd1f40 sha256:4cb77370787323140cb572a789703be1a4168359716a01bf745aa05de68a14e3 dim_test_module [tampered] +``` + +### 度量内核代码段 + +**前置条件**: + +- dim_core模块加载成功。 + +**步骤1**:为内核生成静态基线 + +``` +# mkdir -p /etc/dim/digest_list +# dim_gen_baseline -k "$(uname -r)" -o /etc/dim/digest_list/test.hash /boot/vmlinuz-6.4.0-1.0.1.4.oe2309.x86_64 +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +>/boot/vmlinuz-6.4.0-1.0.1.4.oe2309.x86_64文件名不固定。 + +**步骤2**:配置DIM策略 + +``` +# echo "measure obj=KERNEL_TEXT" > /etc/dim/policy +``` + +**步骤3**:触发动态基线建立 + +``` +# echo 1 > /sys/kernel/security/dim/baseline_init +``` + +**步骤4**:查询度量日志 + +``` +# cat 
/sys/kernel/security/dim/ascii_runtime_measurements +0 ef82c39d767dece1f5c52b31d1e8c7d55541bae68a97542dda61b0c0c01af4d2 sha256:5f1586e95b102cd9b9f7df3585fe13a1306cbd464f2ebe47a51ad34128f5d0af 6.4.0-1.0.1.4.oe2309.x86_64 [static baseline] +``` + +如上度量日志说明内核被成功度量,并以当前的基线结果作为后续度量的基准值(此时度量日志中的哈希值不代表实际度量值)。 + +**步骤5**:触发动态度量 + +``` +# echo 1 > /sys/kernel/security/dim/measure +``` + +度量完成后可通过**步骤4**查询度量日志,如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。 + +### 度量dim_core模块 + +**前置条件**: + +- dim_core和dim_monitor模块加载成功; +- 度量策略配置完成。 + +**步骤1**:触发dim_core动态基线 + +``` +# echo 1 > /sys/kernel/security/dim/baseline_init +``` + +**步骤2**:触发dim_monitor动态基线 + +``` +# echo 1 > /sys/kernel/security/dim/monitor_baseline +``` + +**步骤3**:查询dim_monitor度量日志 + +``` +# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements +0 c1b0d9909ddb00633fc6bbe7e457b46b57e165166b8422e81014bdd3e6862899 sha256:35494ed41109ebc9bf9bf7b1c190b7e890e2f7ce62ca1920397cd2c02a057796 dim_core.text [dynamic baseline] +0 9be7121cd2c215d454db2a8aead36b03d2ed94fad0fbaacfbca83d57a410674f sha256:f35d20aae19ada5e633d2fde6e93133c3b6ae9f494ef354ebe5b162398e4d7fa dim_core.data [dynamic baseline] +``` + +如上度量日志说明dim_core模块被成功度量,并以当前的基线结果作为后续度量的基准值。 +>![](./public_sys-resources/icon-note.gif) **说明:** +> +>若跳过动态基线创建,直接进行度量,日志中会显示tampered。 + +**步骤4**:触发dim_monitor动态度量 + +``` +# echo 1 > /sys/kernel/security/dim/monitor_run +``` + +如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。尝试修改策略后重新执触发dim_core动态基线,此时由于度量目标发生变化,dim_core管理的基线数据也会发生变更,从而dim_monitor的度量结果也会发生变化: + +``` +# echo "measure obj=BPRM_TEXT path=/usr/bin/bash" > /etc/dim/policy +# echo 1 > /sys/kernel/security/dim/baseline_init +``` + +再次触发dim_monitor度量并查询度量日志,可以发现有标识为“tampered”的度量日志: + +``` +# echo 1 > /sys/kernel/security/dim/monitor_run +# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements +0 c1b0d9909ddb00633fc6bbe7e457b46b57e165166b8422e81014bdd3e6862899 sha256:35494ed41109ebc9bf9bf7b1c190b7e890e2f7ce62ca1920397cd2c02a057796 dim_core.text [dynamic 
baseline] +0 9be7121cd2c215d454db2a8aead36b03d2ed94fad0fbaacfbca83d57a410674f sha256:f35d20aae19ada5e633d2fde6e93133c3b6ae9f494ef354ebe5b162398e4d7fa dim_core.data [dynamic baseline] +0 6a60d78230954aba2e6ea6a6b20a7b803d7adb405acbb49b297c003366cfec0d sha256:449ba11b0bfc6146d4479edea2b691aa37c0c025a733e167fd97e77bbb4b9dab dim_core.data [tampered] +``` + +### 扩展TPM PCR寄存器 + +**前置条件**: + +- 系统已安装TPM 2.0芯片,执行如下命令返回不为空: + + ``` + # ls /dev/tpm* + /dev/tpm0 /dev/tpmrm0 + ``` + +- 系统已安装tpm2-tools软件包,执行如下命令返回不为空: + + ``` + # rpm -qa tpm2-tools + ``` + +- 度量策略和静态基线配置完成。 + +**步骤1**:加载dim_core和dim_monitor模块,并配置扩展度量结果的PCR寄存器编号,这里为dim_core度量结果指定PCR 12,为dim_monitor指定PCR 13 + +``` +# modprobe dim_core measure_pcr=12 +# modprobe dim_monitor measure_pcr=13 +``` + +**步骤2**:触发dim_core和dim_monitor基线 + +``` +# echo 1 > /sys/kernel/security/dim/baseline_init +# echo 1 > /sys/kernel/security/dim/monitor_baseline +``` + +**步骤3**:查看度量日志,每条日志都显示了对应的TPM PCR寄存器编号 + +``` +# cat /sys/kernel/security/dim/ascii_runtime_measurements +12 2649c414d1f9fcac1c8d0df8ae7b1c18b5ea10a162b957839bdb8f8415ec6146 sha256:83110ce600e744982d3676202576d8b94cea016a088f99617767ddbd66da1164 /usr/lib/systemd/systemd [static baseline] +# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements +13 72ee3061d5a80eb8547cd80c73a80c3a8dc3b3e9f7e5baa10f709350b3058063 sha256:5562ed25fcdf557efe8077e231399bcfbcf0160d726201ac8edf7a2ca7c55ad0 dim_core.text [dynamic baseline] +13 8ba44d557a9855c03bc243a8ba2d553347a52c1a322ea9cf8d3d1e0c8f0e2656 sha256:5279eadc235d80bf66ba652b5d0a2c7afd253ebaf1d03e6e24b87b7f7e94fa02 dim_core.data [dynamic baseline] +``` + +**步骤4**:检查TPM芯片的PCR寄存器,对应的寄存器均已被写入了扩展值 + +``` +# tpm2_pcrread sha256 | grep "12:" + 12: 0xF358AC6F815BB29D53356DA2B4578B4EE26EB9274E553689094208E444D5D9A2 +# tpm2_pcrread sha256 | grep "13:" + 13: 0xBFB9FF69493DEF9C50E52E38B332BDA8DE9C53E90FB96D14CD299E756205F8EA +``` + +### 开启签名校验 + +**前置条件**: + +- 用户准备公钥证书和签名私钥,签名算法需要为RSA,哈希算法为sha256,证书格式需要为DER。也可以采用如下方式生成: + + ``` + # 
openssl genrsa -out dim.key 4096 + # openssl req -new -sha256 -key dim.key -out dim.csr -subj "/C=AA/ST=BB/O=CC/OU=DD/CN=DIM Test" + # openssl x509 -req -days 3650 -signkey dim.key -in dim.csr -out dim.crt + # openssl x509 -in dim.crt -out dim.der -outform DER + ``` + +- 度量策略配置完成。 + +**步骤1**:将DER格式的证书放置在/etc/keys/x509_dim.der路径 + +``` +# mkdir -p /etc/keys +# cp dim.der /etc/keys/x509_dim.der +``` + +**步骤2**:对策略文件和静态基线文件进行签名,签名文件必须为原文件名直接添加.sig后缀 + +``` +# openssl dgst -sha256 -out /etc/dim/policy.sig -sign dim.key /etc/dim/policy +# openssl dgst -sha256 -out /etc/dim/digest_list/test.hash.sig -sign dim.key /etc/dim/digest_list/test.hash +``` + +**步骤3**:加载dim_core模块,开启签名校验功能 + +``` +modprobe dim_core signature=1 +``` + +此时,策略文件和静态基线文件均需要通过签名校验后才能加载。 +修改策略文件触发基线,会导致基线失败: + +``` +# echo "" >> /etc/dim/policy +# echo 1 > /sys/kernel/security/dim/baseline_init +-bash: echo: write error: Key was rejected by service +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +>如果某个静态基线文件签名校验失败,dim_core会跳过该文件的解析,而不会导致基线失败。 + +### 配置度量算法 + +**前置条件**: + +- 度量策略配置完成。 + +**步骤1**:加载dim_core和dim_monitor模块,并配置度量算法,这里以sm3算法为例 + +``` +# modprobe dim_core measure_hash=sm3 +# modprobe dim_monitor measure_hash=sm3 +``` + +**步骤2**:配置策略并为度量目标程序生成sm3算法的静态基线 + +``` +# echo "measure obj=BPRM_TEXT path=/opt/dim/demo/dim_test_demo" > /etc/dim/policy +# dim_gen_baseline -a sm3 /opt/dim/demo/dim_test_demo -o /etc/dim/digest_list/test.hash +``` + +**步骤3**:触发基线 + +``` +# echo 1 > /sys/kernel/security/dim/baseline_init +# echo 1 > /sys/kernel/security/dim/monitor_baseline +``` + +**步骤4**:查看度量日志,每条日志都显示了对应的哈希算法 + +``` +# cat /sys/kernel/security/dim/ascii_runtime_measurements +0 585a64feea8dd1ec415d4e67c33633b97abb9f88e6732c8a039064351d24eed6 sm3:ca84504c02bef360ec77f3280552c006ce387ebb09b49b316d1a0b7f43039142 /opt/dim/demo/dim_test_demo [static baseline] +# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements +0 e6a40553499d4cbf0501f32cabcad8d872416ca12855a389215b2509af76e60b 
sm3:47a1dae98182e9d7fa489671f20c3542e0e154d3ce941440cdd4a1e4eee8f39f dim_core.text [dynamic baseline] +0 2c862bb477b342e9ac7d4dd03b6e6705c19e0835efc15da38aafba110b41b3d1 sm3:a4d31d5f4d5f08458717b520941c2aefa0b72dc8640a33ee30c26a9dab74eae9 dim_core.data [dynamic baseline] +``` + +### 配置自动周期度量 + +**前置条件**: + +- 度量策略配置完成; + +**方式1**:加载dim_core模块,配置定时度量间隔,此处配置为1分钟 + +``` +modprobe dim_core measure_interval=1 +``` + +在模块加载完成后,自动触发动态基线流程,后续每隔1分钟触发一次动态度量。 + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +>此时不能配置dim_core度量自身代码段的度量策略,否则会产生误报。 +>同时需要提前配置/etc/dim/policy,否则指定measure_interval=1加载模块会失败 + +**方式2**:加载dim_core模块后,也可通过内核模块接口配置定时度量间隔,此处配置为1分钟 + +``` +modprobe dim_core +echo 1 > /sys/kernel/security/dim/interval +``` + +此时不会立刻触发度量,1分钟后会触发动态基线或动态度量,后续每隔1分钟触发一次动态度量。 + +### 配置度量调度时间 + +**前置条件**: + +- 度量策略配置完成; + +加载dim_core模块,配置定时度量调度时间,此处配置为10毫秒: + +``` +modprobe dim_core measure_schedule=10 +``` + +触发动态基线或动态度量时,dim_core每度量一个进程,就会调度让出CPU 10毫秒时间。 \ No newline at end of file diff --git "a/docs/zh/docs/Administration/\345\217\257\344\277\241\345\271\263\345\217\260\346\216\247\345\210\266\346\250\241\345\235\227\357\274\210TPCM\357\274\211.md" "b/docs/zh/docs/Administration/\345\217\257\344\277\241\345\271\263\345\217\260\346\216\247\345\210\266\346\250\241\345\235\227\357\274\210TPCM\357\274\211.md" new file mode 100644 index 0000000000000000000000000000000000000000..2cbe00d46cf37245dd331e4bc19c1a2518fc37bb --- /dev/null +++ "b/docs/zh/docs/Administration/\345\217\257\344\277\241\345\271\263\345\217\260\346\216\247\345\210\266\346\250\241\345\235\227\357\274\210TPCM\357\274\211.md" @@ -0,0 +1,39 @@ +# 可信平台控制模块(TPCM) + +## 背景 + +可信计算在近40年的研究过程中,经历了不断的发展和完善,已经成为信息安全的一个重要分支。中国的可信计算技术近年发展迅猛,在可信计算2.0的基础上解决了可信体系与现有体系的融合问题、可信管理问题以及可信开发的简化问题,形成了基于主动免疫体系的可信计算技术--可信计算3.0。相对于可信计算2.0被动调用的外挂式体系结构,可信计算3.0提出了以自主密码为基础、控制芯片为支柱、双融主板为平台、可信软件为核心、可信连接为纽带、策略管控成体系、安全可信保应用的全新的可信体系框架,在网络层面解决可信问题。 + +可信平台控制模块(Trusted Platform Control 
Module,TPCM)是一种可集成在可信计算平台中,用于建立和保障信任源点的基础核心模块。它作为中国可信计算3.0中的创新点之一和主动免疫机制的核心,实现了对整个平台的主动可控。 + +TPCM可信计算3.0架构为双体系架构,分为防护部件和计算部件,以可信密码模块为基础,通过可信平台控制模块对防护部件和计算部件及组件的固件进行可信度量,可信软件基(Trusted Software Base,TSB)对系统软件及应用软件进行可信度量,同时TPCM管理平台实现对可信度量的验证及可信策略同步和管理。 + + + +## 功能描述 + +如下图所示,整体系统方案由防护部件、计算部件和可信管理中心三部分组成。 + +![](./figures/TPCM.png) + +- 可信管理中心:对可信计算节点的防护策略和基准值进行制定、下发、维护、存储等操作的集中管理平台,可信管理中心由第三方厂商提供。 +- 防护部件:独立于计算部件执行,为可信计算平台提供具有主动度量和主动控制特征的可信计算防护功能,实现运算的同时进行安全防护。防护部件包括可信平台控制模块、可信软件基,以及可信密码模块(Trusted Cryptography Module,TCM)。TPCM是可信计算节点中实现可信防护功能的关键部件,可以采用多种技术途径实现,如板卡、芯片、IP核等,其内部包含中央处理器、存储器等硬件,固件,以及操作系统与可信功能组件等软件,支撑其作为一个独立于计算部件的防护部件组件,并行于计算部件按内置防护策略工作,对计算部件的硬件、固件及软件等需防护的资源进行可信监控,是可信计算节点中的可信根。 + +- 计算部件:主要包括硬件、操作系统和应用层软件。其中操作系统分为引导阶段和运行阶段,在引导阶段openEuler的shim和grub2支持可信度量能力,可实现对shim、grub2以及操作系统内核、initramfs等启动文件的可信度量防护;在运行阶段,openEuler操作系统支持部署可信验证要素代理(由第三方厂商可信华泰提供),它负责将数据发送给TPCM模块,用以实现运行阶段的可信度量防护。 + +其中,TPCM作为可信计算节点中实现可信防护功能的关键部件,需要与TSB、TCM、可信管理中心和可信计算节点的计算部件交互,交互方式如下: + +1. TPCM的硬件、固件与软件为TSB提供运行环境,设置的可信功能组件为TSB按策略库解释要求实现度量、控制、支撑与决策等功能提供支持。 +2. TPCM通过访问TCM获取可信密码功能,完成对防护对象可信验证、度量和保密存储等计算任务,并提供TCM服务部件以支持对TCM的访问。 +3. TPCM通过管理接口连接可信管理中心,实现防护策略管理、可信报告处理等功能。 +4. TPCM通过内置的控制器和I/O端口,经由总线与计算部件的控制器交互,实现对计算部件的主动监控。 +5. 计算部件操作系统中内置的防护代理获取预设的防护对象有关代码和数据提供给TPCM,TPCM将监控信息转发给TSB,由TSB依据策略库进行分析处理。 + +## 约束限制 + +适配服务器:TaiShan 200(型号2280)VF
+适配BMC插卡型号:BC83SMMC + +## 应用场景 + +通过TPCM特性构成一个完整的信任链,保障系统启动以后进入一个可信的计算环境。 \ No newline at end of file diff --git "a/docs/zh/docs/Administration/\345\217\257\344\277\241\350\256\241\347\256\227.md" "b/docs/zh/docs/Administration/\345\217\257\344\277\241\350\256\241\347\256\227.md" index efa7a3e9b4116a7ae673abfb29351cd180dd476c..12f92cfa3939d6d1118f7a2a8e851fabd4fa47c5 100644 --- "a/docs/zh/docs/Administration/\345\217\257\344\277\241\350\256\241\347\256\227.md" +++ "b/docs/zh/docs/Administration/\345\217\257\344\277\241\350\256\241\347\256\227.md" @@ -2,8 +2,6 @@ ## 可信计算基础 -### 可信计算 - 不同国际组织对可信(Trusted)做了不同的定义。 1. 可信计算组织(TCG)的定义: @@ -23,1811 +21,3 @@ 一个可信计算系统由信任根、可信硬件平台、可信操作系统和可信应用组成,它的基本思想是首先创建一个安全信任根(TCB),然后建立从硬件平台、操作系统到应用的信任链,在这条信任链上从根开始,前一级认证后一级,实现信任的逐级扩展,从而实现一个安全可信的计算环境。 ![](./figures/trusted_chain.png) - -相比于传统安全机制的“头痛医头,脚痛医脚”,发现一个病毒消灭一个病毒,可信计算采用的是白名单机制,即只允许经过认证的内核、内核模块、应用程序等在系统上运行,如果发现程序已发生更改(或本来就是一个未知的程序),就拒绝其执行。 - -## 内核完整性度量(IMA) - -### 概述 - -#### IMA - -IMA,全称 Integrity Measurement Architecture(完整性度量架构),是内核中的一个子系统,能够基于自定义策略对通过 execve()、mmap() 和 open() 系统调用访问的文件进行度量,度量结果可被用于**本地/远程证明**,或者和已有的参考值比较以**控制对文件的访问**。 - -根据 IMA wiki 的定义,内核完整性子系统的功能可以被分为三部分: - -- 度量(measure):检测对文件的意外或恶意修改,无论远程还是本地。 -- 评估(appraise):度量文件并与一个存储在扩展属性中的参考值作比较,控制本地文件完整性。 -- 审计(audit):将度量结果写到系统日志中,用于审计。 - -可以看到,相比于 IMA 度量作为一个“只记录不干涉”的观察员,IMA 评估更像是一位严格的保安人员,它的职责是拒绝对所有“人证不一”的程序的访问。 - -#### EVM - -EVM,全称 Extended Verification Module(扩展验证模块),它的作用就是将系统当中某个文件的安全扩展属性,包括 security.ima 、security.selinux 等合起来计算一个哈希值,然后使用 TPM 中存的密钥或其他可信环境中的密钥对其进行签名,签名之后的值存在 security.evm 中,这个签名后的值是不能被篡改的,如果被篡改,再次访问的时候就会验签失败。 - -总而言之,EVM 的作用就是通过对安全扩展属性计算摘要和签名并将其存储在 security.evm 中,提供对安全扩展属性的离线保护。 - -#### IMA Digest Lists - -IMA Digest Lists(IMA 摘要列表扩展)是 openEuler 对内核原生完整性保护机制的增强,它取代了原生 IMA 机制为文件完整性提供保护。 - -“摘要列表”(digest lists)是一种特殊格式的二进制数据文件,它与 rpm 包一一对应,记录了 rpm 包中受保护文件(即可执行文件和动态库文件)的哈希值。 - -当正确配置启动参数后,内核将维护一个哈希表(对用户空间不可见),并通过 securityfs 对外提供更新哈希表的接口(digest_list_data 和 
digest_list_data_del)。摘要列表在构建阶段经过私钥签名,通过接口上传到内核时,需经过内核中的公钥验证。 - -![](./figures/ima_digest_list_update.png) - -在开启 IMA 评估的情况下,每当访问一个可执行文件或动态库文件,就会调用内核中的钩子,计算文件内容和扩展属性的哈希值,并在内核哈希表中进行搜索,如果匹配就允许文件的执行,否则就拒绝访问。 - -![1599719649188](./figures/ima_verification.png) - -相比内核社区原生 IMA 机制,openEuler 内核提供的 IMA 摘要列表扩展从安全性、性能、易用性三个方面进行了改良,助力完整性保护机制在生产环境下落地: - -- **具备完整的信任链,安全性好** - - 原生 IMA 机制要求在现网环境下预先生成并标记文件扩展属性,访问文件时将文件扩展属性作为参考值,信任链不完整。 - - IMA 摘要列表扩展将文件参考摘要值保存在内核空间中,构建阶段通过摘要列表的形式携带在发布的 rpm 包中,安装 rpm 包的同时导入摘要列表并执行验签,确保了参考值来自于软件发行商,实现了完整的信任链。 - -- **惊艳的性能** - - 由于 TPM 芯片是一种低速芯片,因此 PCR 扩展操作成为了 IMA 度量场景的性能瓶颈。摘要列表扩展在确保安全性的前提下,减少了不必要的 PCR 扩展操作,相比原生 IMA 性能提升高达 65%。 - - IMA 评估场景下,摘要列表扩展将签名验证统一移动到启动阶段进行,避免每次访问文件时都执行验签,相比原生 IMA 评估场景提升运行阶段文件访问的性能约 20%。 - -- **快速部署,平滑升级** - - 原生 IMA 机制在初次部署或每次更新软件包时,都需要切换到 fix 模式手动标记文件扩展属性后再重启进入 enforce 模式,才能正常访问安装的程序。 - - 摘要列表扩展可实现安装完成后开箱即用,且允许直接在 enforce 模式下安装或升级 rpm 包,无需重启和手动标记即可使用,实现了用户感知最小化,适合现网环境下的快速部署和平滑升级。 - -需要注意的是,IMA 摘要列表扩展将原生 IMA 的验签过程提前到启动阶段进行,也引入了一个假设,即内核空间的内存无法被篡改,这就使得 IMA 也依赖于其他安全机制(内核模块安全启动和内存动态度量)以保护内核内存的完整性。 - -但无论社区原生 IMA 机制还是 IMA 摘要列表扩展,都只是可信计算信任链中的一环,无法孤立地保证系统的安全性,安全自始至终都是一个构建纵深防御的系统工程。 - -### 约束限制 - -1. 当前 IMA 评估模式仅支持保护系统中的不可变文件(包括可执行文件和动态库文件)。 -2. IMA 提供的是应用层的完整性度量,它的安全性依赖于之前环节的可信。 -3. 当前阶段 IMA 不支持第三方应用摘要列表的导入。 -4. 启动日志中可能存在 `Unable to open file: /etc/keys/x509_ima.der` 字样,该报错来自于开源社区,不影响 IMA 摘要列表特性的使用。 -5. 
ARM 版本中 IMA 开启日志模式可能存在一些 audit 报错信息,这是由于 modprobe 在摘要列表未导入时加载内核模块所致,不影响正常功能。 - -### 使用场景 - -#### IMA measurement - -IMA 度量的目的是检测对系统文件的意外或恶意修改,度量结果可被用于本地证明或远程证明。 - -如果系统中存在 TPM 芯片,度量结果将被扩展到 TPM 芯片的指定 PCR 寄存器中,由于 PCR 扩展的单向性以及 TPM 芯片的硬件安全性,用户无法修改已被扩展的度量结果,这就确保了度量结果的真实性。 - -IMA 度量的文件范围和触发条件可以由用户通过 IMA 策略自行配置。 - -默认情况下 IMA 不启用,但系统会前往 `/etc/ima/` 路径下寻找 ima-policy 策略文件,如果找到,就会按照策略在启动时度量系统中的文件。如果不想手动编写策略文件,也可以在启动参数中配置 `ima_policy=tcb` 使用默认策略(更多策略参数请参考附录“IMA启动参数”章节)。 - -系统当前加载的 IMA 策略可以在 `/sys/kernel/security/ima/policy` 文件中查看,IMA 度量日志则位于`/sys/kernel/security/ima/ascii_runtime_measurements` 文件中,如下所示: - -```shell -# head /sys/kernel/security/ima/ascii_runtime_measurements -10 ddee6004dc3bd4ee300406cd93181c5a2187b59b ima-ng sha1:9797edf8d0eed36b1cf92547816051c8af4e45ee boot_aggregate -10 180ecafba6fadbece09b057bcd0d55d39f1a8a52 ima-ng sha1:db82919bf7d1849ae9aba01e28e9be012823cf3a /init -10 ac792e08a7cf8de7656003125c7276968d84ea65 ima-ng sha1:f778e2082b08d21bbc59898f4775a75e8f2af4db /bin/bash -10 0a0d9258c151356204aea2498bbca4be34d6bb05 ima-ng sha1:b0ab2e7ebd22c4d17d975de0d881f52dc14359a7 /lib64/ld-2.27.so -10 0d6b1d90350778d58f1302d00e59493e11bc0011 ima-ng sha1:ce8204c948b9fe3ae67b94625ad620420c1dc838 /etc/ld.so.cache -10 d69ac2c1d60d28b2da07c7f0cbd49e31e9cca277 ima-ng sha1:8526466068709356630490ff5196c95a186092b8 /lib64/libreadline.so.7.0 -10 ef3212c12d1fbb94de9534b0bbd9f0c8ea50a77b ima-ng sha1:f80ba92b8a6e390a80a7a3deef8eae921fc8ca4e /lib64/libc-2.27.so -10 f805861177a99c61eabebe21003b3c831ccf288b ima-ng sha1:261a3cd5863de3f2421662ba5b455df09d941168 /lib64/libncurses.so.6.1 -10 52f680881893b28e6f0ce2b132d723a885333500 ima-ng sha1:b953a3fa385e64dfe9927de94c33318d3de56260 /lib64/libnss_files-2.27.so -10 4da8ce3c51a7814d4e38be55a2a990a5ceec8b27 ima-ng sha1:99a9c095c7928ecca8c3a4bc44b06246fc5f49de /etc/passwd -``` - -每一条记录从左到右分别是: - -1. PCR:用于扩展度量结果的 PCR 寄存器,默认是 10,只在系统装了 TPM 芯片的情况下有意义。 -2. 模板哈希值:最终被用于扩展的哈希值,组合了文件内容哈希和文件路径的长度和值。 -3. 模板:扩展度量值的模板,如 ima-ng。 -4. 
文件内容哈希值:被度量的文件内容的哈希值。 -5. 文件路径:被度量的文件路径。 - -#### IMA appraisal - -IMA 评估的目的是通过与标准参考值的比较,控制对本地文件的访问。 - -IMA 首先使用安全扩展属性 security.ima 和 security.evm 存储文件完整性度量的参考值: - -- security.ima:存储文件内容的哈希值; -- security.evm:存储文件扩展属性的哈希值签名。 - -访问受保护文件时,将会触发内核中的钩子,依次验证文件扩展属性和内容的完整性: - -1. 使用内核 keyring 中的公钥对文件 security.evm 扩展属性中的签名值验签,与当前文件扩展属性的哈希值比较,如果匹配就证明文件的扩展属性是完整的(包括 security.ima)。 -2. 在文件扩展属性完整的前提下,将文件 security.ima 扩展属性的内容与当前文件内容的摘要值比较,如果匹配就允许对文件的访问。 - -同样,IMA 评估的文件范围和触发条件也可以由用户通过 IMA 策略自行配置。 - -#### IMA Digest Lists - -IMA 摘要列表扩展当前提供对以下三种启动参数组合的支持: - -- IMA measurement 度量模式: - - ```shell - ima_policy=exec_tcb ima_digest_list_pcr=11 - ``` - -- IMA appraisal 日志模式 + IMA measurement 度量模式: - - ```shell - ima_template=ima-sig ima_policy="exec_tcb|appraise_exec_tcb|appraise_exec_immutable" initramtmpfs ima_hash=sha256 ima_appraise=log evm=allow_metadata_writes evm=x509 ima_digest_list_pcr=11 ima_appraise_digest_list=digest - ``` - -- IMA appraisal 强制模式 + IMA measurement 度量模式: - - ```shell - ima_template=ima-sig ima_policy="exec_tcb|appraise_exec_tcb|appraise_exec_immutable" initramtmpfs ima_hash=sha256 ima_appraise=enforce-evm evm=allow_metadata_writes evm=x509 ima_digest_list_pcr=11 ima_appraise_digest_list=digest - ``` - -### 操作指导 - -#### 原生 IMA 场景初次部署 - -第一次启动时,需要在启动参数中配置: - -```shell -ima_appraise=fix ima_policy=appraise_tcb -``` - -`fix` 模式会允许系统在没有参考值的情况下启动,`appraise_tcb` 对应了一种 IMA 策略,具体可参考附录中的“IMA 启动参数”章节。 - -接下来,你需要访问所有需要被校验的文件,从而为它们添加 IMA 扩展属性: - -```shell -# time find / -fstype ext4 -type f -uid 0 -exec dd if='{}' of=/dev/null count=0 status=none \; -``` - -该过程会花费一定时间,请耐心等待。命令执行完成后,你可以从受保护文件的扩展属性中看到参考值已被标记: - -```shell -# getfattr -m - -d /sbin/init -# file: sbin/init -security.ima=0sAXr7Qmun5mkGDS286oZxCpdGEuKT -security.selinux="system_u:object_r:init_exec_t" -``` - -最后,配置以下启动参数并重新启动系统: - -```shell -ima_appraise=enforce ima_policy=appraise_tcb -``` - -#### 摘要列表场景初次部署 - -1. 
配置内核参数进入 log 模式。 - - 编辑 `/boot/efi/EFI/openEuler/grub.cfg` 文件,加入以下参数: - - ```shell - ima_template=ima-sig ima_policy="exec_tcb|appraise_exec_tcb|appraise_exec_immutable" initramtmpfs ima_hash=sha256 ima_appraise=log evm=allow_metadata_writes evm=x509 ima_digest_list_pcr=11 ima_appraise_digest_list=digest - ``` - - 使用 `reboot` 重启系统进入 log 模式,该模式下已开启完整性校验,但不会因校验失败而无法启动。 - -2. 安装依赖包。 - - 使用 yum 安装 digest-list-tools 和 ima-evm-utils,确认不低于以下版本: - - ```shell - # yum install digest-list-tools ima-evm-utils - # rpm -qa | grep digest-list-tools - digest-list-tools-0.3.93-1.oe1.x86_64 - # rpm -qa | grep ima-evm-utils - ima-evm-utils-1.2.1-9.oe1.x86_64 - ``` - -3. 执行 `dracut` 重新生成 initrd: - - ```shell - # dracut -f -e xattr - ``` - - 编辑 `/boot/efi/EFI/openEuler/grub.cfg` 文件,将 ima_appraise=log 改为 ima_appraise=enforce-evm: - - ```shell - ima_template=ima-sig ima_policy="exec_tcb|appraise_exec_tcb|appraise_exec_immutable" initramtmpfs ima_hash=sha256 ima_appraise=enforce-evm evm=allow_metadata_writes evm=x509 ima_digest_list_pcr=11 ima_appraise_digest_list=digest - ``` - - 使用 reboot 重启即可完成初次部署。 - -#### 在 OBS 上进行摘要列表构建 - -OBS 全称 Open Build Service,是一种编译系统,最早在 openSUSE 用于软件包的构建,能够支持多架构的分布式编译。 - -进行摘要列表构建之前,首先确保您的工程包含以下 rpm 包,且来自 openEuler: - -- digest-list-tools -- pesign-obs-integration -- selinux-policy -- rpm -- openEuler-rpm-config - -在交付件工程中增加 Project Config: - -```shell -Preinstall: pesign-obs-integration digest-list-tools selinux-policy-targeted -Macros: -%__brp_digest_list /usr/lib/rpm/openEuler/brp-digest-list %{buildroot} -:Macros -``` - -- 在 Preinstall 中新增 digest-list-tools 用于生成摘要列表,pesign-obs-integration 用于生成摘要列表的签名,新增 selinux-policy-targeted 用于确保生成摘要列表时构建环境内 SELinux 标签正确。 -- 在 Macros 中定义宏 %__brp_digest_list,rpm 将在构建阶段通过这个宏执行命令为编译完成的二进制文件生成摘要列表。这个宏可以作为一个开关控制工程中的摘要列表是否生成。 - -配置完成后,OBS 会自动执行全量构建,正常情况下构建完成后,软件包中会新增以下两个文件: - -- /etc/ima/digest_lists/0-metadata_list-compact-[包名]-[版本号] -- /etc/ima/digest_lists.tlv/0-metadata_list-compact_tlv-[包名]-[版本号] - -#### 在 Koji 上进行摘要列表构建 
- -Koji 是 Fedora 社区的编译系统,openEuler 社区将在后续支持,敬请期待。 - -### FAQ - -1. 为什么进入 enforce 模式后系统无法启动或启动后命令无法执行/服务不正常? - - enforce 模式下 IMA 会对文件访问做控制,如果访问文件的内容或扩展属性不完整,就会被拒绝访问,当影响启动的关键命令无法执行时,就会造成系统无法启动。 - - 请确认是否存在以下问题: - - - **摘要列表是否被加入到 initrd 中?** - - 初次部署时是否执行了 dracut 命令将摘要列表加入内核?如果摘要列表没有加入 initrd,启动阶段就无法导入摘要列表,从而导致启动失败。 - - - **是否使用官方提供的 rpm 包?** - - 如果使用的是非 openEuler 官方提供的 rpm 包,rpm 包可能没有携带摘要列表,或者对摘要列表签名的私钥与内核中的验签公钥不匹配,从而导致摘要列表没有被导入内核。 - - 如果原因还不明确,可以进入 log 模式启动,从错误日志中寻找原因: - - ```shell - # dmesg | grep appraise - ``` - -2. 为什么 enforce 模式下没有对系统文件做访问控制? - - 系统没有按照预期对文件执行访问控制,首先查看启动参数中的 IMA 策略是否已被正确配置: - - ```shell - # cat /proc/cmdline - ...ima_policy=exec_tcb|appraise_exec_tcb|appraise_exec_immutable... - ``` - - 其次查看当前内核中 IMA 策略是否已生效: - - ```shell - # cat /sys/kernel/security/ima/policy - ``` - - 如果 policy 文件是空的,证明策略没有设置成功,系统也就不会进行访问控制。 - -3. 初次部署完成后,安装/升级/卸载软件包后还需要手动执行 dracut 生成 initrd 吗? - - 不需要。rpm 包提供的 digest_list.so 插件能够在 rpm 包粒度提供摘要列表的自动更新,可以实现用户对摘要列表的无感知。 - -### 附录 - -#### IMA securityfs 接口说明 - -原生 IMA 提供的 securityfs 接口如下: - -> 注:以下接口路径都位于 `/sys/kernel/security/` 目录下。 - -| 路径 | 权限 | 说明 | -| ------------------------------ | ---- | ---------------------------------------- | -| ima/policy | 600 | IMA 策略接口 | -| ima/ascii_runtime_measurement | 440 | ascii 码形式表示的 IMA 度量结果 | -| ima/binary_runtime_measurement | 440 | 二进制形式表示的 IMA 度量结果 | -| ima/runtime_measurement_count | 440 | 度量结果数量统计 | -| ima/violations | 440 | IMA 度量结果冲突数 | -| evm | 660 | EVM 模式,即校验文件扩展属性完整性的方式 | - -其中,`/sys/kernel/security/evm` 的取值有以下三种: - -- 0:EVM 未初始化; -- 1:使用 HMAC(对称加密)方式校验扩展属性完整性; -- 2:使用公钥验签(非对称加密)方式校验扩展属性完整性; -- 6:关闭扩展属性完整性校验(openEuler 使用此方式)。 - -IMA 摘要列表扩展额外提供的 securityfs 接口如下: - -| 路径 | 权限 | 说明 | -| ------------------------ | ---- | --------------------------------------- | -| ima/digests_count | 440 | 显示系统哈希表中的总摘要数量(IMA+EVM) | -| ima/digest_list_data | 200 | 摘要列表新增接口 | -| ima/digest_list_data_del | 200 | 摘要列表删除接口 | - -#### IMA 策略语法 - -每条 IMA 策略语句都必须以 action 关键字代表的**动作**开头,后接**筛选条件**: - -- 
action:表示该条策略具体的动作,一条策略只能选一个 action。
-
- > 注:实际书写时**可忽略 action 字样**,直接书写 dont_measure,不需要写成 action=dont_measure。
-
-- func:表示被度量或鉴定的文件类型,常和 mask 匹配使用,一条策略只能选一个 func。
-
- - FILE_CHECK 只能同 MAY_EXEC、MAY_WRITE、MAY_READ 匹配使用。
- - MODULE_CHECK、MMAP_CHECK、BPRM_CHECK 只能同 MAY_EXEC 匹配使用。
- - 匹配关系以外的组合不会产生效果。
-
-- mask:表示文件在做什么操作时将被度量或鉴定,一条策略只能选一个 mask。
-
-- fsmagic:表示文件系统类型的十六进制魔数,定义在 `/usr/include/linux/magic.h` 文件中。
-
- > 注:默认情况下度量所有文件系统,除非使用 dont_measure/dont_appraise 标记不度量某文件系统。
-
-- fsuuid:表示系统设备 uuid 的 16 位的十六进制字符串。
-
-- objtype:表示文件类型,一条策略只能选一个文件类型。
-
- > 注:objtype 相比 func 而言,划分的粒度更细,比如 obj_type=nova_log_t 表示 nova log 类型的文件。
-
-- uid:表示哪个用户(用用户 id 表示)对文件进行操作,一条策略只能选一个 uid。
-
-- fowner:表示文件的属主(用用户 id 表示)是谁,一条策略只能选一个 fowner。
-
-关键字的具体取值及说明如下:
-
-| 关键字 | 值 | 说明 |
-| ------ | ---------- | --------- |
-| action | measure | 开启 IMA 度量 |
-| | dont_measure | 禁用 IMA 度量 |
-| | appraise | 开启 IMA 评估 |
-| | dont_appraise | 禁用 IMA 评估 |
-| | audit | 开启审计 |
-| func | FILE_CHECK | 将要被打开的文件 |
-| | MODULE_CHECK | 将要被装载的内核模块文件 |
-| | MMAP_CHECK | 将要被映射到进程内存空间的动态库文件 |
-| | BPRM_CHECK | 将要被执行的文件(不含通过 `/bin/bash` 等程序打开的脚本文件) |
-| | POLICY_CHECK | 将要被作为补充 IMA 策略装载的文件 |
-| | FIRMWARE_CHECK | 将要被加载到内存中的固件 |
-| | DIGEST_LIST_CHECK | 将要被加载到内核中的摘要列表文件 |
-| | KEXEC_KERNEL_CHECK | 将要切换的 kexec 内核 |
-| mask | MAY_EXEC | 执行文件 |
-| | MAY_WRITE | 写文件。不建议使用,受限于 echo、vim 等开源机制(修改本质是新建临时文件再重命名),并不是每次修改都会触发 MAY_WRITE 的 IMA 度量。 |
-| | MAY_READ | 读文件 |
-| | MAY_APPEND | 扩展文件属性 |
-| fsmagic | fsmagic=xxx | 表示文件系统类型的十六进制魔数 |
-| fsuuid | fsuuid=xxx | 表示系统设备 uuid 的 16 位的十六进制字符串 |
-| fowner | fowner=xxx | 文件属主的用户 id |
-| uid | uid=xxx | 操作文件的用户 id |
-| obj_type | obj_type=xxx_t | 表示文件的类型(基于 SELinux 标签) |
-| pcr | pcr=xxx | 选择 TPM 中用于扩展度量值的 PCR(默认为 10) |
-| appraise_type | imasig | 基于签名进行 IMA 评估 |
-| | meta_immutable | 基于签名进行文件扩展属性的评估(支持摘要列表) |
-
-> 注:PATH_CHECK 等同于 FILE_CHECK,FILE_MMAP 等同于 MMAP_CHECK,不在本表提及。
-
-#### IMA 原生启动参数
-
-原生 IMA 的内核启动参数列表如下:
-
-| 参数名称 | 取值 | 功能 |
-| ---------------- | ------------ | 
------------------- |
-| ima_appraise | off | 关闭 IMA 评估模式,在访问文件时不进行完整性校验,也不为文件生成新的参考值。 |
-| | enforce | 开启 IMA 评估强制模式,在访问文件时进行完整性校验,即计算文件摘要值并与参考值比对,如果比对失败就拒绝对文件的访问。IMA 会为新文件生成新的参考值。 |
-| | fix | 开启 IMA 修复模式,在该模式下允许更新受保护文件的参考值。 |
-| | log | 开启 IMA 评估日志模式,在访问文件时进行完整性校验,但即使校验失败也允许执行命令,只进行日志记录。 |
-| ima_policy | tcb | 度量所有文件执行、动态库映射、内核模块导入以及设备驱动加载,此外,root 用户读文件的行为也会被度量。 |
-| | appraise_tcb | 对所有 root 属主的文件进行评估。 |
-| | secure_boot | 对所有内核模块导入、硬件驱动加载、kexec 内核切换以及 IMA 策略进行评估,前提是这些文件都具有 IMA 签名。 |
-| ima_tcb | 无 | 等价于 ima_policy=tcb |
-| ima_appraise_tcb | 无 | 等价于 ima_policy=appraise_tcb |
-| ima_hash | sha1/md5/... | IMA 摘要算法,默认为 sha1 |
-| ima_template | ima | IMA 度量扩展模板 |
-| | ima-ng | IMA 度量扩展模板 |
-| | ima-sig | IMA 度量扩展模板 |
-| integrity_audit | 0 | 基础完整性审计信息(默认) |
-| | 1 | 额外完整性审计信息 |
-
-> 注:ima_policy 参数可以同时指定多个值,例如 ima_policy=tcb|appraise_tcb,启动后系统的 IMA 策略就是这两种参数对应的策略的总和。
-
-启动参数 `ima_policy=tcb` 对应的 IMA 策略为:
-
-```shell
-# PROC_SUPER_MAGIC = 0x9fa0
-dont_measure fsmagic=0x9fa0
-# SYSFS_MAGIC = 0x62656572
-dont_measure fsmagic=0x62656572
-# DEBUGFS_MAGIC = 0x64626720
-dont_measure fsmagic=0x64626720
-# TMPFS_MAGIC = 0x01021994
-dont_measure fsmagic=0x1021994
-# DEVPTS_SUPER_MAGIC=0x1cd1
-dont_measure fsmagic=0x1cd1
-# BINFMTFS_MAGIC=0x42494e4d
-dont_measure fsmagic=0x42494e4d
-# SECURITYFS_MAGIC=0x73636673
-dont_measure fsmagic=0x73636673
-# SELINUX_MAGIC=0xf97cff8c
-dont_measure fsmagic=0xf97cff8c
-# SMACK_MAGIC=0x43415d53
-dont_measure fsmagic=0x43415d53
-# CGROUP_SUPER_MAGIC=0x27e0eb
-dont_measure fsmagic=0x27e0eb
-# CGROUP2_SUPER_MAGIC=0x63677270
-dont_measure fsmagic=0x63677270
-# NSFS_MAGIC=0x6e736673
-dont_measure fsmagic=0x6e736673
-measure func=MMAP_CHECK mask=MAY_EXEC
-measure func=BPRM_CHECK mask=MAY_EXEC
-measure func=FILE_CHECK mask=MAY_READ uid=0
-measure func=MODULE_CHECK
-measure func=FIRMWARE_CHECK
-```
-
-启动参数 `ima_policy=appraise_tcb` 对应的 IMA 策略为:
-
-```shell
-# PROC_SUPER_MAGIC = 0x9fa0
-dont_appraise fsmagic=0x9fa0
-# SYSFS_MAGIC = 0x62656572 
-dont_appraise fsmagic=0x62656572 -# DEBUGFS_MAGIC = 0x64626720 -dont_appraise fsmagic=0x64626720 -# TMPFS_MAGIC = 0x01021994 -dont_appraise fsmagic=0x1021994 -# RAMFS_MAGIC -dont_appraise fsmagic=0x858458f6 -# DEVPTS_SUPER_MAGIC=0x1cd1 -dont_appraise fsmagic=0x1cd1 -# BINFMTFS_MAGIC=0x42494e4d -dont_appraise fsmagic=0x42494e4d -# SECURITYFS_MAGIC=0x73636673 -dont_appraise fsmagic=0x73636673 -# SELINUX_MAGIC=0xf97cff8c -dont_appraise fsmagic=0xf97cff8c -# SMACK_MAGIC=0x43415d53 -dont_appraise fsmagic=0x43415d53 -# NSFS_MAGIC=0x6e736673 -dont_appraise fsmagic=0x6e736673 -# CGROUP_SUPER_MAGIC=0x27e0eb -dont_appraise fsmagic=0x27e0eb -# CGROUP2_SUPER_MAGIC=0x63677270 -dont_appraise fsmagic=0x63677270 -appraise fowner=0 -``` - -启动参数 `ima_policy=secure_boot` 对应的 IMA 策略为: - -```shell -appraise func=MODULE_CHECK appraise_type=imasig -appraise func=FIRMWARE_CHECK appraise_type=imasig -appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig -appraise func=POLICY_CHECK appraise_type=imasig -``` - -#### IMA 摘要列表启动参数 - -IMA 摘要列表特性额外引入的内核启动参数如下: - -| 参数名称 | 取值 | 功能 | -| -------------- | -------------- | -------------------------- | -| integrity | 0 | IMA 特性总开关关闭(默认) | -| | 1 | IMA 特性总开关打开 | -| ima_appraise | off | 关闭 IMA 评估模式 | -| | enforce-evm | IMA 评估强制模式,在访问文件时进行完整性校验并进行访问控制 | -| ima_appraise_digest_list | digest | 当 EVM 被禁用时,使用摘要列表进行 IMA appraise,摘要列表同时保护文件内容和扩展属性 | -| | digest-nometadata | 在EVM摘要值不存在的情况下,仅基于IMA摘要值进行完整性校验(不保护文件扩展属性) | -| evm | fix | 允许任何对扩展属性的修改(即使修改会导致扩展属性完整性校验失败) | -| | ignore | 只有在扩展属性不存在或不正确的情况下才允许修改 | -| ima_policy | exec_tcb | IMA 度量策略,详见下文策略说明。 | -| | appraise_exec_tcb | IMA 评估策略,详见下文策略说明。 | -| | appraise_exec_immutable | IMA 评估策略,详见下文策略说明。 | -| ima_digest_list_pcr | 11 | 使用 PCR 11 替代 PCR 10,仅使用摘要列表进行度量 | -| | +11 | 依然保留 PCR 10 的度量,在有TPM芯片时也往TPM芯片写度量结果 | -| initramtmpfs | 无 | 添加对 tmpfs 的支持 | - -启动参数 `ima_policy=exec_tcb` 对应的 IMA 策略为: - -```shell -dont_measure fsmagic=0x9fa0 -dont_measure fsmagic=0x62656572 -dont_measure fsmagic=0x64626720 -dont_measure 
fsmagic=0x1cd1 -dont_measure fsmagic=0x42494e4d -dont_measure fsmagic=0x73636673 -dont_measure fsmagic=0xf97cff8c -dont_measure fsmagic=0x43415d53 -dont_measure fsmagic=0x27e0eb -dont_measure fsmagic=0x63677270 -dont_measure fsmagic=0x6e736673 -measure func=MMAP_CHECK mask=MAY_EXEC -measure func=BPRM_CHECK mask=MAY_EXEC -measure func=MODULE_CHECK -measure func=FIRMWARE_CHECK -measure func=POLICY_CHECK -measure func=DIGEST_LIST_CHECK -measure parser -``` - -启动参数 `ima_policy=appraise_exec_tcb` 对应的 IMA 策略为: - -```shell -appraise func=MODULE_CHECK appraise_type=imasig -appraise func=FIRMWARE_CHECK appraise_type=imasig -appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig -appraise func=POLICY_CHECK appraise_type=imasig -appraise func=DIGEST_LIST_CHECK appraise_type=imasig -dont_appraise fsmagic=0x9fa0 -dont_appraise fsmagic=0x62656572 -dont_appraise fsmagic=0x64626720 -dont_appraise fsmagic=0x858458f6 -dont_appraise fsmagic=0x1cd1 -dont_appraise fsmagic=0x42494e4d -dont_appraise fsmagic=0x73636673 -dont_appraise fsmagic=0xf97cff8c -dont_appraise fsmagic=0x43415d53 -dont_appraise fsmagic=0x6e736673 -dont_appraise fsmagic=0x27e0eb -dont_appraise fsmagic=0x63677270 -``` - -启动参数 `ima_policy=appraise_exec_immutable` 对应的 IMA 策略为: - -```shell -appraise func=BPRM_CHECK appraise_type=imasig appraise_type=meta_immutable -appraise func=MMAP_CHECK -appraise parser appraise_type=imasig -``` - -#### IMA 内核编译选项详解 - -原生 IMA 提供的编译选项如下: - -| 编译选项 | 功能 | -| -------------------------------- | --------------------------- | -| CONFIG_INTEGRITY | IMA/EVM 总编译开关 | -| CONFIG_INTEGRITY_SIGNATURE | 使能 IMA 签名校验 | -| CONFIG_INTEGRITY_ASYMMETRIC_KEYS | 使能 IMA 非对称签名校验 | -| CONFIG_INTEGRITY_TRUSTED_KEYRING | 使能 IMA/EVM 密钥环 | -| CONFIG_INTEGRITY_AUDIT | 编译 IMA audit 审计模块 | -| CONFIG_IMA | IMA 总编译开关 | -| CONFIG_IMA_WRITE_POLICY | 允许在运行阶段更新 IMA 策略 | -| CONFIG_IMA_MEASURE_PCR_IDX | 允许指定 IMA 度量 PCR 序号 | -| CONFIG_IMA_LSM_RULES | 允许配置 LSM 规则 | -| CONFIG_IMA_APPRAISE | IMA 评估总编译开关 | -| 
CONFIG_IMA_APPRAISE_BOOTPARAM | 启用 IMA 评估启动参数 |
-| CONFIG_EVM | EVM 总编译开关 |
-
-IMA 摘要列表扩展额外提供的编译选项如下:
-
-| 编译选项 | 功能 |
-| ------------------ | ------------------------- |
-| CONFIG_DIGEST_LIST | 开启 IMA 摘要列表特性开关 |
-
-#### IMA 性能参考数据
-
-下图对比了不开启 IMA、开启原生 IMA、开启 IMA 摘要列表特性时的性能:
-
-![img](./figures/ima_performance.gif)
-
-#### IMA 对kdump服务的影响
-
-开启IMA enforce模式,在策略中配置kexec系统调用校验时,可能导致kdump启动失败。
-
-```shell
-appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig
-```
-
-kdump启动失败原因:由于开启IMA后,需要对文件执行完整性校验,因此限制kdump加载内核映像文件时需使用kexec_file_load系统调用。可通过修改/etc/sysconfig/kdump配置文件的KDUMP_FILE_LOAD开启kexec_file_load系统调用。
-
-```shell
-KDUMP_FILE_LOAD="on"
-```
-
-同时,kexec_file_load系统调用自身也会执行文件的签名校验,因此要求被加载的内核映像文件必须包含正确的安全启动签名,而且当前内核中必须包含对应的验签证书。
-
-#### IMA 根证书配置
-
-当前openEuler使用RPM密钥对IMA摘要列表进行签名,为保证IMA功能开箱可用,openEuler内核编译时默认将RPM根证书(PGP证书)导入内核。当前共包含两本PGP证书,分别为旧版本使用的OBS证书和openEuler 22.03 LTS SP1版本切换的openEuler证书:
-
-```shell
-# cat /proc/keys | grep PGP
-1909b4ad I------ 1 perm 1f030000 0 0 asymmetri private OBS b25e7f66: PGP.rsa b25e7f66 []
-2f10cd36 I------ 1 perm 1f030000 0 0 asymmetri openeuler fb37bc6f: PGP.rsa fb37bc6f []
-```
-
-由于当前内核不支持导入PGP子公钥,而切换后的openEuler证书采用子密钥签名,因此openEuler内核编译前对证书进行了预处理,抽取子公钥并导入内核,具体处理流程可见内核软件包代码仓内的process_pgp_certs.sh脚本文件。
-
-如果用户不使用IMA摘要列表功能或使用其他密钥实现签名/验签,则可将相关代码移除,自行实现内核根证书配置。
-
-## DIM动态完整性度量
-
-本章节为DIM(Dynamic Integrity Measurement)动态完整性度量的特性介绍以及使用说明。
-
-### 背景
-
-随着信息产业的不断发展,信息系统所面临的安全风险也日益增长。信息系统中可能运行大量软件,部分软件不可避免地存在漏洞,这些漏洞一旦被攻击者利用,可能会对系统业务造成严重的影响,如数据泄露、服务不可用等。
-
-绝大部分的软件攻击,都会伴随着完整性破坏,如恶意进程运行、配置文件篡改、后门植入等。因此业界提出了完整性保护技术,指的是从系统启动开始,对关键数据进行度量和校验,从而保证系统运行达到预期效果。当前业界已广泛使用的完整性保护技术(如安全启动、文件完整性度量等)都无法对进程运行时的内存数据进行保护。如果攻击者利用一些手段修改了进程的代码指令,可能导致进程被劫持或被植入后门,具有攻击性强,隐蔽性高的特点。对于这种攻击手段,业界提出了动态完整性度量技术,即对进程的运行时内存中的关键数据进行度量和保护。
-
-### 术语说明
-
-静态基线:针对度量目标的二进制文件进行解析所生成的度量基准数据;
-
-动态基线:针对度量目标执行首次度量的结果;
-
-度量策略:指定度量目标的配置信息;
-
-度量日志:存储度量结果的列表,包含度量对象、度量结果等信息。
-
-### 特性简介
-
-DIM特性通过在程序运行时对内存中的关键数据(如代码段、数据段)进行度量,并将度量结果和基准值进行对比,确定内存数据是否被篡改,从而检测攻击行为,并采取应对措施。
-
-#### 功能范围
-
-- 
当前DIM特性支持在ARM64/X86架构系统中运行; -- 当前DIM特性支持对以下关键内存数据执行度量: - - 用户态进程的代码段:对应ELF文件中属性为PT_LOAD、权限为RX的段,对应进程加载后权限为RX的vma区域; - - 内核模块代码段:起始地址为内核模块对应struct module结构体中的core_layout.base,长度为core_layout.text_size; - - 内核代码段:对应\_stext符号至\_etext,跳过可能由于内核static key机制发生变化的地址。 -- 当前DIM特性支持对接以下硬件平台: - - 支持将度量结果扩展至TPM 2.0芯片的PCR寄存器,以实现远程证明服务对接。 - -#### 技术限制 - -- 对于用户态进程,仅支持度量文件映射代码段,不支持度量匿名代码段; -- 不支持度量内核热补丁; -- 仅支持主动触发机制,如果两次触发过程中发生了篡改-恢复的行为,会导致无法识别攻击; -- 对于主动修改代码段的场景(如代码段重定位、自修改或热补丁),会被识别为攻击; -- 对于内核、内核模块的度量,以触发动态基线时的度量结果作为度量基准值,静态基线值仅作为一个固定标识; -- 度量目标必须在触发动态基线的时刻就已在内存中加载(如进程运行或内核模块加载),否则后续无法度量; -- 在需要使用TPM芯片的PCR寄存器验证度量日志的场景下,DIM模块不允许卸载,否则会导致度量日志清空,而无法和PCR寄存器匹配; ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->特性启用后,会对系统性能存在一定影响,主要包括以下方面: -> - DIM特性自身加载以及基线数据、度量日志管理会对系统内存造成消耗,具体影响与保护策略配置相关; -> - DIM特性执行度量期间需要进行哈希运算,造成CPU消耗,具体影响与需要度量的数据大小有关; -> - DIM特性执行度量期间需要对部分资源执行上锁或获取信号量操作,可能导致其他并发进程等待。 - -#### 规格约束 - -| 规格项 | 值 | -| ------------------------------------------------------------ | ---- | -| 文件大小上限(策略文件、静态基线文件、签名文件、证书文件) | 10MB | -| 同一个度量目标在一次动态基线后多次度量期间最多记录的篡改度量日志条数 | 10条 | -| /etc/dim/policy中度量策略最大可记录数|10000条| - -#### 架构说明 - -DIM包含两个软件包dim_tools和dim,分别提供如下组件: - -| 软件包 | 组件 | 说明 | -| --------- | ---------------- | ------------------------------------------------------------ | -| dim_tools | dim_gen_baseline | 用户态组件,静态基线生成工具,用于生成动态度量所需要的基线数据,该基线数据在DIM特性运行时会被导入并作为度量基准值 | -| dim | dim_core | 内核模块,执行核心的动态度量逻辑,包括策略解析、静态基线解析、动态基线建立、度量执行、度量日志记录、TPM芯片扩展操作等,实现对内存关键数据的度量功能 | -| dim | dim_monitor | 内核模块,执行对dim_core的代码段和关键数据的度量保护,一定程度防止由于dim_core遭受攻击导致的DIM功能失效。 | - -整体架构如下图所示: - -![](./figures/dim_architecture.jpg) - -#### 关键流程说明 - -dim_core和dim_monitor模块均提供了对内存数据的度量功能,包含两个核心流程: - -- 动态基线流程:dim_core模块读取并解析策略和静态基线文件,然后对目标进程执行代码段度量,度量结果在内存中以动态基线形式存放,最后将动态基线数据和静态基线数据进行对比,并将对比结果记录度量日志;dim_monitor模块对dim_core模块的代码段和关键数据进行度量,作为动态基线并记录度量日志; -- 动态度量流程:dim_core和dim_monitor模块对目标执行度量,并将度量结果与动态基线值进行对比,如果对比不一致,则将结果记录度量日志。 - -#### 接口说明 - -##### 文件路径说明 - -| 路径 | 说明 | -| ------------------------------- | 
------------------------------------------------------------ | -| /etc/dim/policy | 度量策略文件 | -| /etc/dim/policy.sig | 度量策略签名文件,用于存放策略文件的签名信息,在签名校验功能开启的情况下使用 | -| /etc/dim/digest_list/*.hash | 静态基线文件,用于存放度量的基准值信息 | -| /etc/dim/digest_list/*.hash.sig | 静态基线签名文件,用于存放静态基线文件的签名信息,在签名校验功能开启的情况下使用 | -| /etc/keys/x509_dim.der | 证书文件,用于校验策略文件和静态基线文件的签名信息,在签名校验功能开启的情况下使用 | -| /sys/kernel/security/dim | DIM文件系统目录,DIM内核模块加载后生成,目录下提供对DIM功能进行操作的内核接口 | - -##### 文件格式说明 - -1. 度量策略文件格式说明 - - 文本文件,以UNIX换行符进行分隔,每行代表一条度量策略,当前支持以下几种配置格式: - - 1. 用户态进程代码段度量配置: - - ``` - measure obj=BPRM_TEXT path=<度量目标进程可执行文件或动态库对应二进制文件的绝对路径> - ``` - - 2. 内核模块代码段度量配置: - - ``` - measure obj=MODULE_TEXT name=<内核模块名> - ``` - - 3. 内核度量配置: - - ``` - measure obj=KERNEL_TEXT - ``` - -**参考示例:** - -``` -# cat /etc/dim/policy -measure obj=BPRM_TEXT path=/usr/bin/bash -measure obj=BPRM_TEXT path=/usr/lib64/libc.so.6 -measure obj=MODULE_TEXT name=ext4 -measure obj=KERNEL_TEXT -``` - -2. 静态基线文件格式说明 - - 文本文件,以UNIX换行符进行分隔,每行代表一条静态基线,当前支持以下几种配置格式: - - 1. 用户态进程基线: - - ``` - dim USER sha256:6ae347be2d1ba03bf71d33c888a5c1b95262597fbc8d00ae484040408a605d2b <度量目标进程可执行文件或动态库对应二进制文件的绝对路径> - ``` - - 2. 内核模块基线: - - ``` - dim KERNEL sha256:a18bb578ff0b6043ec5c2b9b4f1c5fa6a70d05f8310a663ba40bb6e898007ac5 <内核release号>/<内核模块名> - ``` - - 3. 内核基线: - - ``` - dim KERNEL sha256:2ce2bc5d65e112ba691c6ab46d622fac1b7dbe45b77106631120dcc5441a3b9a <内核release号> - ``` - -**参考示例:** - -``` -dim USER sha256:6ae347be2d1ba03bf71d33c888a5c1b95262597fbc8d00ae484040408a605d2b /usr/bin/bash -dim USER sha256:bc937f83dee4018f56cc823f5dafd0dfedc7b9872aa4568dc6fbe404594dc4d0 /usr/lib64/libc.so.6 -dim KERNEL sha256:a18bb578ff0b6043ec5c2b9b4f1c5fa6a70d05f8310a663ba40bb6e898007ac5 6.4.0-1.0.1.4.oe2309.x86_64/dim_monitor -dim KERNEL sha256:2ce2bc5d65e112ba691c6ab46d622fac1b7dbe45b77106631120dcc5441a3b9a 6.4.0-1.0.1.4.oe2309.x86_64 -``` - -3. 度量日志格式说明 - - 文本内容,以UNIX换行符进行分隔,每行代表一条度量日志,格式为: - -``` - <度量日志哈希值> <度量算法>:<度量哈希值> <度量对象> <度量日志类型> -``` - -**参考示例:** - - 1. 
对bash进程代码段执行度量,度量结果与静态基线一致: - - ``` - 12 0f384a6d24e121daf06532f808df624d5ffc061e20166976e89a7bb24158eb87 sha256:db032449f9e20ba37e0ec4a506d664f24f496bce95f2ed972419397951a3792e /usr/bin/bash [static baseline] - ``` - - 2. 对bash进程代码段执行度量,度量结果与静态基线不一致: - - ``` - 12 0f384a6d24e121daf06532f808df624d5ffc061e20166976e89a7bb24158eb87 sha256:db032449f9e20ba37e0ec4a506d664f24f496bce95f2ed972419397951a3792e /usr/bin/bash [tampered] - ``` - - 3. 对ext4内核模块代码段执行度量,未找到静态基线: - - ``` - 12 0f384a6d24e121daf06532f808df624d5ffc061e20166976e89a7bb24158eb87 sha256:db032449f9e20ba37e0ec4a506d664f24f496bce95f2ed972419397951a3792e ext4 [no static baseline] - ``` - - 4. dim_monitor对dim_core执行度量,记录基线时的度量结果: - - ``` - 12 660d594ba050c3ec9a7cdc8cf226c5213c1e6eec50ba3ff51ff76e4273b3335a sha256:bdab94a05cc9f3ad36d29ebbd14aba8f6fd87c22ae580670d18154b684de366c dim_core.text [dynamic baseline] - 12 28a3cefc364c46caffca71e7c88d42cf3735516dec32796e4883edcf1241a7ea sha256:0dfd9656d6ecdadc8ec054a66e9ff0c746d946d67d932cd1cdb69780ccad6fb2 dim_core.data [dynamic baseline] - ``` - -4. 证书/签名文件格式说明 - -为通用格式,详见[开启签名校验](#开启签名校验)章节。 - -##### 内核模块参数说明 - -1. dim_core模块参数 - -| 参数名 | 参数内容 | 取值范围 | 默认值 | -| -------------------- | ------------------------------------------------------------ | ------------------------ | ------ | -| measure_log_capacity | 度量日志最大条数,当dim_core记录的度量日志数量达到参数设置时,停止记录度量日志 | 100-UINT_MAX(64位系统) | 100000 | -| measure_schedule | 度量完一个进程/模块后调度的时间,单位毫秒,设置为0代表不调度 | 0-1000 | 0 | -| measure_interval | 自动度量周期,单位分钟,设置为0代表不设置自动度量 | 0-525600 | 0 | -| measure_hash | 度量哈希算法 | sha256, sm3 | sha256 | -| measure_pcr | 将度量结果扩展至TPM芯片的PCR寄存器,设置为0代表不扩展(注意需要与芯片实际的PCR编号保持一致) | 0-128 | 0 | -| signature | 是否启用策略文件和静态基线文件的签名校验机制,设置为0代表不启用,设置为1代表启用 | 0, 1 | 0 | - -**使用示例**: - -``` -insmod /path/to/dim_core.ko measure_log_capacity=10000 measure_schedule=10 measure_pcr=12 -modprobe dim_core measure_log_capacity=10000 measure_schedule=10 measure_pcr=12 -``` - -2. 
dim_monitor模块参数 - -| 参数名 | 参数内容 | 取值范围 | 默认值 | -| -------------------- | ------------------------------------------------------------ | ------------------------ | ------ | -| measure_log_capacity | 度量日志最大条数,当dim_monitor记录的度量日志数量达到参数设置时,停止记录度量日志 | 100-UINT_MAX(64位系统) | 100000 | -| measure_hash | 度量哈希算法 | sha256, sm3 | sha256 | -| measure_pcr | 将度量结果扩展至TPM芯片的PCR寄存器,设置为0代表不扩展 | 0-128 | 0 | - -**使用示例**: - -``` -insmod /path/to/dim_monitor.ko measure_log_capacity=10000 measure_hash=sm3 -modprobe dim_monitor measure_log_capacity=10000 measure_hash=sm3 -``` - -##### 内核接口说明 - -1. dim_core模块接口 - -| 接口名 | 属性 | 功能 | 示例 | -| -------------------------- | ---- | ------------------------------------------------------------ | --------------------------------------------------------- | -| measure | 只写 | 写入字符串1触发动态度量,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/measure | -| baseline_init | 只写 | 写入字符串1触发动态基线,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/baseline_init | -| ascii_runtime_measurements | 只读 | 读取接口查询度量日志 | cat /sys/kernel/security/dim/ascii_runtime_measurements | -| runtime_status | 只读 | 读取接口返回状态类型信息,失败返回错误码 | cat /sys/kernel/security/dim/runtime_status | -| interval | 读写 | 写入数字字符串设置自动度量周期(范围同measure_interval参数);读取接口查询当前自动度量周期,失败返回错误码 | echo 10 > /sys/kernel/security/dim/interval
cat /sys/kernel/security/dim/interval | - -**dim_core状态类型信息说明:** - -状态信息取值如下: - -- DIM_NO_BASELINE:表示dim_core已加载,但未进行任何操作; -- DIM_BASELINE_RUNNING:表示正在进行动态基线建立; -- DIM_MEASURE_RUNNING:表示正在进行动态度量; -- DIM_PROTECTED:表示已完成动态基线建立,处于受保护状态; -- DIM_ERROR:执行动态基线建立或动态度量时发生错误,需要用户解决错误后重新触发动态基线建立或动态度量。 - -2. dim_monitor模块接口 - -| 接口名 | 属性 | 说明 | 示例 | -| ---------------------------------- | ---- | ---------------------------------------------- | ------------------------------------------------------------ | -| monitor_run | 只写 | 写入字符串1触发度量,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/monitor_run | -| monitor_baseline | 只写 | 写入字符串1触发基线,成功返回0,失败返回错误码 | echo 1 > /sys/kernel/security/dim/monitor_baseline | -| monitor_ascii_runtime_measurements | 只读 | 读取接口查询度量日志 | cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements | -| monitor_status | 只读 | 读取接口返回状态类型信息,失败返回错误码 | cat /sys/kernel/security/dim/monitor_status | - -**dim_monitor状态类型信息说明:** - -- ready:表示dim_monitor已加载,但未进行任何操作; -- running:表示正在进行动态基线建立或动态度量; -- error:执行动态基线建立或动态度量时发生错误,需要用户解决错误后重新触发动态基线建立或动态度量; -- protected:表示已完成动态基线建立,处于受保护状态。 - -##### 用户态工具接口说明 - -dim_gen_baseline命令行接口,详见: 。 - -### 如何使用 - -#### 安装/卸载 - -**前置条件**: - -- OS版本:支持openEuler 23.09及以上版本; -- 内核版本:支持openEuler kernel 5.10/6.4版本。 - -安装dim_tools和dim软件包,以openEuler 23.09版本为例: - -``` -# yum install -y dim_tools dim -``` - -软件包安装完成后,DIM内核组件不会默认加载,可通过如下命令进行加载和卸载: - -``` -# modprobe dim_core 或 insmod /path/to/dim_core.ko -# modprobe dim_monitor 或 insmod /path/to/dim_monitor.ko -# rmmod dim_monitor -# rmmod dim_core -``` - -加载成功后,可以通过如下命令查询: - -``` -# lsmod | grep dim_core -dim_core 77824 1 dim_monitor -# lsmod | grep dim_monitor -dim_monitor 36864 0 -``` - -卸载前需要先卸载ko,再卸载rpm包: - -``` -# rmmod dim_monitor -# rmmod dim_core -# rpm -e dim -``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> dim_monitor必须后于dim_core加载,先于dim_core卸载; -> 也可使用源码编译安装,详见 。 - -#### 度量用户态进程代码段 - -**前置条件**: - -- dim_core模块加载成功; - -- 
用户需要准备一个常驻的度量目标用户态程序,本小节以程序路径/opt/dim/demo/dim_test_demo为例: - - ``` - # /opt/dim/demo/dim_test_demo & - ``` - -**步骤1**:为度量目标进程对应的二进制文件生成静态基线 - -``` -# mkdir -p /etc/dim/digest_list -# dim_gen_baseline /opt/dim/demo/dim_test_demo -o /etc/dim/digest_list/test.hash -``` - -**步骤2**:配置度量策略 - -``` -# echo "measure obj=BPRM_TEXT path=/opt/dim/demo/dim_test_demo" > /etc/dim/policy -``` - -**步骤3**:触发动态基线建立 - -``` -# echo 1 > /sys/kernel/security/dim/baseline_init -``` - -**步骤4**:查询度量日志 - -``` -# cat /sys/kernel/security/dim/ascii_runtime_measurements -0 e9a79e25f091e03a8b3972b1a0e4ae2ccaed1f5652857fe3b4dc947801a6913e sha256:02e28dff9997e1d81fb806ee5b784fd853eac8812059c4dba7c119c5e5076989 /opt/dim/demo/dim_test_demo [static baseline] -``` - -如上度量日志说明目标进程被成功度量,且度量结果与静态基线一致。 - -**步骤5**:触发动态度量 - -``` -# echo 1 > /sys/kernel/security/dim/measure -``` - -度量完成后可通过**步骤4**查询度量日志,如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。如果攻击者尝试篡改目标程序(如采用修改代码重新编译的方式,过程略)并重新启动目标程序: - -``` -# pkill dim_test_demo -# /opt/dim/demo/dim_test_demo & -``` - -再次触发度量并查询度量日志,可以发现有标识为“tampered”的度量日志: - -``` -# echo 1 > /sys/kernel/security/dim/measure -# cat /sys/kernel/security/dim/ascii_runtime_measurements -0 e9a79e25f091e03a8b3972b1a0e4ae2ccaed1f5652857fe3b4dc947801a6913e sha256:02e28dff9997e1d81fb806ee5b784fd853eac8812059c4dba7c119c5e5076989 /opt/dim/demo/dim_test_demo [static baseline] -0 08a2f6f2922ad3d1cf376ae05cf0cc507c2f5a1c605adf445506bc84826531d6 sha256:855ec9a890ff22034f7e13b78c2089e28e8d217491665b39203b50ab47b111c8 /opt/dim/demo/dim_test_demo [tampered] -``` - -#### 度量内核模块代码段 - -**前置条件**: - -- dim_core模块加载成功; - -- 用户需要准备一个度量目标内核模块,本小节假设内核模块路径为/opt/dim/demo/dim_test_module.ko,模块名为dim_test_module: - - ``` - # insmod /opt/dim/demo/dim_test_module.ko - ``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->需要保证内核模块的内核编译环境版本号和当前系统内核版本号一致,可以使用如下方法确认: -> ->``` -># modinfo dim_monitor.ko | grep vermagic | grep "$(uname -r)" ->vermagic: 6.4.0-1.0.1.4.oe2309.x86_64 SMP preempt mod_unload modversions 
->``` - -即内核模块vermagic信息的第一个字段需要和当前内核版本号完全一致。 - -**步骤1**:为度量目标内核模块生成静态基线 - -``` -# mkdir -p /etc/dim/digest_list -# dim_gen_baseline /opt/dim/demo/dim_test_module.ko -o /etc/dim/digest_list/test.hash -``` - -**步骤2**:配置度量策略 - -``` -# echo "measure obj=MODULE_TEXT name=dim_test_module" > /etc/dim/policy -``` - -**步骤3**:触发动态基线建立 - -``` -# echo 1 > /sys/kernel/security/dim/baseline_init -``` - -**步骤4**:查询度量日志 - -``` -# cat /sys/kernel/security/dim/ascii_runtime_measurements -0 9603a9d5f87851c8eb7d2619f7abbe28cb8a91f9c83f5ea59f036794e23d1558 sha256:9da4bccc7ae1b709deab8f583b244822d52f3552c93f70534932ae21fac931c6 dim_test_module [static baseline] -``` - -如上度量日志说明dim_test_module模块被成功度量,并以当前的度量结果作为后续度量的基准值(此时度量日志中的哈希值不代表实际度量值)。 - -**步骤5**:触发动态度量 - -``` -echo 1 > /sys/kernel/security/dim/measure -``` - -度量完成后可通过**步骤4**查询度量日志,如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。如果攻击者尝试篡改内核模块(如采用修改代码重新编译的方式,过程略)并重新加载: - -``` -rmmod dim_test_module -insmod /opt/dim/demo/dim_test_module.ko -``` - -再次触发度量并查询度量日志,可以发现有标识为“tampered”的度量日志: - -``` -# cat /sys/kernel/security/dim/ascii_runtime_measurements -0 9603a9d5f87851c8eb7d2619f7abbe28cb8a91f9c83f5ea59f036794e23d1558 sha256:9da4bccc7ae1b709deab8f583b244822d52f3552c93f70534932ae21fac931c6 dim_test_module [static baseline] -0 6205915fe63a7042788c919d4f0ff04cc5170647d7053a1fe67f6c0943cd1f40 sha256:4cb77370787323140cb572a789703be1a4168359716a01bf745aa05de68a14e3 dim_test_module [tampered] -``` - -#### 度量内核代码段 - -**前置条件**: - -- dim_core模块加载成功。 - -**步骤1**:为内核生成静态基线 - -``` -# mkdir -p /etc/dim/digest_list -# dim_gen_baseline -k "$(uname -r)" -o /etc/dim/digest_list/test.hash /boot/vmlinuz-6.4.0-1.0.1.4.oe2309.x86_64 -``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->/boot/vmlinuz-6.4.0-1.0.1.4.oe2309.x86_64文件名不固定。 - -**步骤2**:配置DIM策略 - -``` -# echo "measure obj=KERNEL_TEXT" > /etc/dim/policy -``` - -**步骤3**:触发动态基线建立 - -``` -# echo 1 > /sys/kernel/security/dim/baseline_init -``` - -**步骤4**:查询度量日志 - -``` -# cat 
/sys/kernel/security/dim/ascii_runtime_measurements -0 ef82c39d767dece1f5c52b31d1e8c7d55541bae68a97542dda61b0c0c01af4d2 sha256:5f1586e95b102cd9b9f7df3585fe13a1306cbd464f2ebe47a51ad34128f5d0af 6.4.0-1.0.1.4.oe2309.x86_64 [static baseline] -``` - -如上度量日志说明内核被成功度量,并以当前的基线结果作为后续度量的基准值(此时度量日志中的哈希值不代表实际度量值)。 - -**步骤5**:触发动态度量 - -``` -# echo 1 > /sys/kernel/security/dim/measure -``` - -度量完成后可通过**步骤4**查询度量日志,如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。 - -#### 度量dim_core模块 - -**前置条件**: - -- dim_core和dim_monitor模块加载成功; -- 度量策略配置完成。 - -**步骤1**:触发dim_core动态基线 - -``` -# echo 1 > /sys/kernel/security/dim/baseline_init -``` - -**步骤2**:触发dim_monitor动态基线 - -``` -# echo 1 > /sys/kernel/security/dim/monitor_baseline -``` - -**步骤3**:查询dim_monitor度量日志 - -``` -# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements -0 c1b0d9909ddb00633fc6bbe7e457b46b57e165166b8422e81014bdd3e6862899 sha256:35494ed41109ebc9bf9bf7b1c190b7e890e2f7ce62ca1920397cd2c02a057796 dim_core.text [dynamic baseline] -0 9be7121cd2c215d454db2a8aead36b03d2ed94fad0fbaacfbca83d57a410674f sha256:f35d20aae19ada5e633d2fde6e93133c3b6ae9f494ef354ebe5b162398e4d7fa dim_core.data [dynamic baseline] -``` - -如上度量日志说明dim_core模块被成功度量,并以当前的基线结果作为后续度量的基准值。 ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->若跳过动态基线创建,直接进行度量,日志中会显示tampered。 - -**步骤4**:触发dim_monitor动态度量 - -``` -# echo 1 > /sys/kernel/security/dim/monitor_run -``` - -如果度量结果和动态基线阶段的度量结果一致,则度量日志不会更新,否则会新增异常度量日志。尝试修改策略后重新触发dim_core动态基线,此时由于度量目标发生变化,dim_core管理的基线数据也会发生变更,从而dim_monitor的度量结果也会发生变化: - -``` -# echo "measure obj=BPRM_TEXT path=/usr/bin/bash" > /etc/dim/policy -# echo 1 > /sys/kernel/security/dim/baseline_init -``` - -再次触发dim_monitor度量并查询度量日志,可以发现有标识为“tampered”的度量日志: - -``` -# echo 1 > /sys/kernel/security/dim/monitor_run -# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements -0 c1b0d9909ddb00633fc6bbe7e457b46b57e165166b8422e81014bdd3e6862899 sha256:35494ed41109ebc9bf9bf7b1c190b7e890e2f7ce62ca1920397cd2c02a057796 dim_core.text [dynamic 
baseline] -0 9be7121cd2c215d454db2a8aead36b03d2ed94fad0fbaacfbca83d57a410674f sha256:f35d20aae19ada5e633d2fde6e93133c3b6ae9f494ef354ebe5b162398e4d7fa dim_core.data [dynamic baseline] -0 6a60d78230954aba2e6ea6a6b20a7b803d7adb405acbb49b297c003366cfec0d sha256:449ba11b0bfc6146d4479edea2b691aa37c0c025a733e167fd97e77bbb4b9dab dim_core.data [tampered] -``` - -#### 扩展TPM PCR寄存器 - -**前置条件**: - -- 系统已安装TPM 2.0芯片,执行如下命令返回不为空: - - ``` - # ls /dev/tpm* - /dev/tpm0 /dev/tpmrm0 - ``` - -- 系统已安装tpm2-tools软件包,执行如下命令返回不为空: - - ``` - # rpm -qa tpm2-tools - ``` - -- 度量策略和静态基线配置完成。 - -**步骤1**:加载dim_core和dim_monitor模块,并配置扩展度量结果的PCR寄存器编号,这里为dim_core度量结果指定PCR 12,为dim_monitor指定PCR 13 - -``` -# modprobe dim_core measure_pcr=12 -# modprobe dim_monitor measure_pcr=13 -``` - -**步骤2**:触发dim_core和dim_monitor基线 - -``` -# echo 1 > /sys/kernel/security/dim/baseline_init -# echo 1 > /sys/kernel/security/dim/monitor_baseline -``` - -**步骤3**:查看度量日志,每条日志都显示了对应的TPM PCR寄存器编号 - -``` -# cat /sys/kernel/security/dim/ascii_runtime_measurements -12 2649c414d1f9fcac1c8d0df8ae7b1c18b5ea10a162b957839bdb8f8415ec6146 sha256:83110ce600e744982d3676202576d8b94cea016a088f99617767ddbd66da1164 /usr/lib/systemd/systemd [static baseline] -# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements -13 72ee3061d5a80eb8547cd80c73a80c3a8dc3b3e9f7e5baa10f709350b3058063 sha256:5562ed25fcdf557efe8077e231399bcfbcf0160d726201ac8edf7a2ca7c55ad0 dim_core.text [dynamic baseline] -13 8ba44d557a9855c03bc243a8ba2d553347a52c1a322ea9cf8d3d1e0c8f0e2656 sha256:5279eadc235d80bf66ba652b5d0a2c7afd253ebaf1d03e6e24b87b7f7e94fa02 dim_core.data [dynamic baseline] -``` - -**步骤4**:检查TPM芯片的PCR寄存器,对应的寄存器均已被写入了扩展值 - -``` -# tpm2_pcrread sha256 | grep "12:" - 12: 0xF358AC6F815BB29D53356DA2B4578B4EE26EB9274E553689094208E444D5D9A2 -# tpm2_pcrread sha256 | grep "13:" - 13: 0xBFB9FF69493DEF9C50E52E38B332BDA8DE9C53E90FB96D14CD299E756205F8EA -``` - -#### 开启签名校验 - -**前置条件**: - -- 用户准备公钥证书和签名私钥,签名算法需要为RSA,哈希算法为sha256,证书格式需要为DER。也可以采用如下方式生成: - - ``` - # 
openssl genrsa -out dim.key 4096 - # openssl req -new -sha256 -key dim.key -out dim.csr -subj "/C=AA/ST=BB/O=CC/OU=DD/CN=DIM Test" - # openssl x509 -req -days 3650 -signkey dim.key -in dim.csr -out dim.crt - # openssl x509 -in dim.crt -out dim.der -outform DER - ``` - -- 度量策略配置完成。 - -**步骤1**:将DER格式的证书放置在/etc/keys/x509_dim.der路径 - -``` -# mkdir -p /etc/keys -# cp dim.der /etc/keys/x509_dim.der -``` - -**步骤2**:对策略文件和静态基线文件进行签名,签名文件必须为原文件名直接添加.sig后缀 - -``` -# openssl dgst -sha256 -out /etc/dim/policy.sig -sign dim.key /etc/dim/policy -# openssl dgst -sha256 -out /etc/dim/digest_list/test.hash.sig -sign dim.key /etc/dim/digest_list/test.hash -``` - -**步骤3**:加载dim_core模块,开启签名校验功能 - -``` -modprobe dim_core signature=1 -``` - -此时,策略文件和静态基线文件均需要通过签名校验后才能加载。 -修改策略文件触发基线,会导致基线失败: - -``` -# echo "" >> /etc/dim/policy -# echo 1 > /sys/kernel/security/dim/baseline_init --bash: echo: write error: Key was rejected by service -``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->如果某个静态基线文件签名校验失败,dim_core会跳过该文件的解析,而不会导致基线失败。 - -#### 配置度量算法 - -**前置条件**: - -- 度量策略配置完成。 - -**步骤1**:加载dim_core和dim_monitor模块,并配置度量算法,这里以sm3算法为例 - -``` -# modprobe dim_core measure_hash=sm3 -# modprobe dim_monitor measure_hash=sm3 -``` - -**步骤2**:配置策略并为度量目标程序生成sm3算法的静态基线 - -``` -# echo "measure obj=BPRM_TEXT path=/opt/dim/demo/dim_test_demo" > /etc/dim/policy -# dim_gen_baseline -a sm3 /opt/dim/demo/dim_test_demo -o /etc/dim/digest_list/test.hash -``` - -**步骤3**:触发基线 - -``` -# echo 1 > /sys/kernel/security/dim/baseline_init -# echo 1 > /sys/kernel/security/dim/monitor_baseline -``` - -**步骤4**:查看度量日志,每条日志都显示了对应的哈希算法 - -``` -# cat /sys/kernel/security/dim/ascii_runtime_measurements -0 585a64feea8dd1ec415d4e67c33633b97abb9f88e6732c8a039064351d24eed6 sm3:ca84504c02bef360ec77f3280552c006ce387ebb09b49b316d1a0b7f43039142 /opt/dim/demo/dim_test_demo [static baseline] -# cat /sys/kernel/security/dim/monitor_ascii_runtime_measurements -0 e6a40553499d4cbf0501f32cabcad8d872416ca12855a389215b2509af76e60b 
sm3:47a1dae98182e9d7fa489671f20c3542e0e154d3ce941440cdd4a1e4eee8f39f dim_core.text [dynamic baseline] -0 2c862bb477b342e9ac7d4dd03b6e6705c19e0835efc15da38aafba110b41b3d1 sm3:a4d31d5f4d5f08458717b520941c2aefa0b72dc8640a33ee30c26a9dab74eae9 dim_core.data [dynamic baseline] -``` - -#### 配置自动周期度量 - -**前置条件**: - -- 度量策略配置完成。 - -**方式1**:加载dim_core模块,配置定时度量间隔,此处配置为1分钟 - -``` -# modprobe dim_core measure_interval=1 -``` - -在模块加载完成后,自动触发动态基线流程,后续每隔1分钟触发一次动态度量。 - ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->此时不能配置dim_core度量自身代码段的度量策略,否则会产生误报。 ->同时需要提前配置/etc/dim/policy,否则指定measure_interval=1加载模块会失败。 - -**方式2**:加载dim_core模块后,也可通过内核模块接口配置定时度量间隔,此处配置为1分钟 - -``` -# modprobe dim_core -# echo 1 > /sys/kernel/security/dim/interval -``` - -此时不会立刻触发度量,1分钟后会触发动态基线或动态度量,后续每隔1分钟触发一次动态度量。 - -#### 配置度量调度时间 - -**前置条件**: - -- 度量策略配置完成。 - -加载dim_core模块,配置定时度量调度时间,此处配置为10毫秒: - -``` -# modprobe dim_core measure_schedule=10 -``` - -触发动态基线或动态度量时,dim_core每度量一个进程,就会调度让出CPU 10毫秒时间。 - - -## 远程证明(鲲鹏安全库) - -### 介绍 - -本项目开发了运行在鲲鹏处理器上的基础安全软件组件,前期主要聚焦在远程证明等可信计算相关领域,使能社区安全开发者。 - -### 软件架构 - -在未使能TEE的平台上,本项目可提供平台远程证明特性,其软件架构如下图所示: - -![img](./figures/RA-arch-1.png) - -在已使能TEE的平台上,本项目可提供TEE远程证明特性,其软件架构如下图所示: - -![img](./figures/RA-arch-2.png) - -### 安装配置 - -1. 使用yum安装程序的rpm包,命令如下: - - ```shell - # yum install kunpengsecl-ras kunpengsecl-rac kunpengsecl-rahub kunpengsecl-qcaserver kunpengsecl-attester kunpengsecl-tas kunpengsecl-devel - ``` - -2. 准备数据库环境:进入 `/usr/share/attestation/ras` 目录,执行 `prepare-database-env.sh` 脚本进行自动化的数据库环境配置。 - -3. 程序运行时依赖的配置文件有三个路径,分别为:当前路径 `./config.yaml` ,家路径 `${HOME}/.config/attestation/ras(rac)(rahub)(qcaserver)(attester)(tas)/config.yaml` ,以及系统路径 `/etc/attestation/ras(rac)(rahub)(qcaserver)(attester)(tas)/config.yaml` 。 - -4. 
(可选)如果需要创建家目录配置文件,可在安装好rpm包后,执行位于 `/usr/share/attestation/ras(rac)(rahub)(qcaserver)(attester)(tas)` 下的脚本 `prepare-ras(rac)(hub)(qca)(attester)(tas)conf-env.sh` 从而完成家目录配置文件的部署。 - -### 相关参数 - -#### RAS启动参数 - -命令行输入 `ras` 即可启动RAS程序。请注意,在当前目录下需要提供**ECDSA**公钥并命名为 `ecdsakey.pub` 。相关参数如下: - -```shell - -H --https http/https模式开关,默认为https(true),false=http - -h --hport https模式下RAS监听的restful api端口 - -p, --port string RAS监听的client api端口 - -r, --rest string http模式下RAS监听的restful api端口 - -T, --token 生成一个测试用的验证码并退出 - -v, --verbose 打印更详细的RAS运行时日志信息 - -V, --version 打印RAS版本并退出 -``` - -#### RAC启动参数 - -命令行输入 `sudo raagent` 即可启动RAC程序,请注意,物理TPM模块的开启需要sudo权限。相关参数如下: - -```shell - -s, --server string 指定待连接的RAS服务端口 - -t, --test 以测试模式启动 - -v, --verbose 打印更详细的RAC运行时日志信息 - -V, --version 打印RAC版本并退出 - -i, --imalog 指定ima文件路径 - -b, --bioslog 指定bios文件路径 - -T, --tatest 以TA测试模式启动 -``` - -**注意:** ->1.若要使用TEE远程证明特性,需要以非TA测试模式启动RAC,并将待证明TA的uuid、是否使用TCB、mem_hash和img_hash按序放入RAC执行路径下的**talist**文件内。同时预装由TEE团队提供的**libqca.so**库和**libteec.so**库。**talist**文件格式如下: -> ->```text ->e08f7eca-e875-440e-9ab0-5f381136c600 false ccd5160c6461e19214c0d8787281a1e3c4048850352abe45ce86e12dd3df9fde 46d5019b0a7ffbb87ad71ea629ebd6f568140c95d7b452011acfa2f9daf61c7a ->``` -> ->2.若不使用TEE远程证明特性,则需要将 `${DESTDIR}/usr/share/attestation/qcaserver` 目录下的libqca.so库和libteec.so库复制到 `/usr/lib` 或 `/usr/lib64` 目录,并以TA测试模式启动RAC。 - -#### QCA启动参数 - -命令行输入 `${DESTDIR}/usr/bin/qcaserver` 即可启动QCA程序,请注意,这里必须要使用qcaserver的完整路径以正常启动QTA,同时需要使QTA中的CA路径参数与该路径保持相同。相关参数如下: - -```shell - -C, --scenario int 设置程序的应用场景,默认为no_as场景(0),1=as_no_daa场景,2=as_with_daa场景 - -S, --server string 指定开放的服务器地址/端口 -``` - -#### ATTESTER启动参数 - -命令行输入 `attester` 即可启动ATTESTER程序。相关参数如下: - -```shell - -B, --basevalue string 设置基准值文件读取路径 - -M, --mspolicy int 设置度量策略,默认为-1,需要手动指定。1=仅比对img-hash值,2=仅比对hash值,3=同时比对img-hash和hash两个值 - -S, --server string 指定待连接的服务器地址 - -U, --uuid int 指定待验证的可信应用 - -V, --version 打印程序版本并退出 - -T, --test 读取固定的nonce值以匹配目前硬编码的可信报告 -``` - -#### TAS启动参数 - -命令行输入 `tas` 
即可启动TAS程序。相关参数如下: - -```shell - -T, --token 生成一个测试用的验证码并退出 -``` - -**注意:** ->1.若要启用TAS服务,需要先为TAS配置好私钥。可以按如下命令修改家目录下的配置文件: -> ->```shell -># cd ${HOME}/.config/attestation/tas -># vim config.yaml -> # 如下DAA_GRP_KEY_SK_X和DAA_GRP_KEY_SK_Y的值仅用于测试,正常使用前请务必更新其内容以保证安全。 ->tasconfig: -> port: 127.0.0.1:40008 -> rest: 127.0.0.1:40009 -> akskeycertfile: ./ascert.crt -> aksprivkeyfile: ./aspriv.key -> huaweiitcafile: ./Huawei IT Product CA.pem -> DAA_GRP_KEY_SK_X: 65a9bf91ac8832379ff04dd2c6def16d48a56be244f6e19274e97881a776543c65a9bf91ac8832379ff04dd2c6def16d48a56be244f6e19274e97881a776543c -> DAA_GRP_KEY_SK_Y: 126f74258bb0ceca2ae7522c51825f980549ec1ef24f81d189d17e38f1773b56126f74258bb0ceca2ae7522c51825f980549ec1ef24f81d189d17e38f1773b56 ->``` -> ->之后再输入`tas`启动TAS程序。 -> ->2.在有TAS的环境中,为提高QCA配置证书的效率,并非每一次启动都需要访问TAS以生成相应证书,而是通过证书的本地化存储,即读取QCA侧 `config.yaml` 中配置的证书路径,通过 `func hasAKCert(s int) bool` 函数检查是否已有TAS签发的证书保存于本地,若成功读取证书,则无需访问TAS,若读取证书失败,则需要访问TAS,并将TAS返回的证书保存于本地。 - -### 接口定义 - -#### RAS接口 - -为了便于管理员对目标服务器、RAS以及目标服务器上部署的TEE中的用户 TA 进行管理,本程序设计了以下接口可供调用: - -| 接口 | 方法 | -| --------------------------------- | --------------------------- | -| / | GET | -| /{id} | GET、POST、DELETE | -| /{from}/{to} | GET | -| /{id}/reports | GET | -| /{id}/reports/{reportid} | GET、DELETE | -| /{id}/basevalues | GET | -| /{id}/newbasevalue | POST | -| /{id}/basevalues/{basevalueid} | GET、POST、DELETE | -| /{id}/ta/{tauuid}/status | GET | -| /{id}/ta/{tauuid}/tabasevalues | GET | -| /{id}/ta/{tauuid}/tabasevalues/{tabasevalueid} | GET、POST、DELETE | -| /{id}/ta/{tauuid}/newtabasevalue | POST | -| /{id}/ta/{tauuid}/tareports | GET | -| /{id}/ta/{tauuid}/tareports/{tareportid} | GET、POST、DELETE | -| /version | GET | -| /config | GET、POST | -| /{id}/container/status | GET | -| /{id}/device/status | GET | - -上述接口的具体用法分别介绍如下。 - -若需要查询所有服务器的信息,可以使用`"/"`接口。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/ -``` - -*** 
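上述RESTful接口也可以在运维脚本中统一封装调用。下面给出一个示意性的shell封装(假设RAS以http模式监听本机40002端口,$AUTHTOKEN已事先通过`ras -T`生成;`ras_get`、`ras_post`为本示例自拟的函数名,并非RAS自带工具):

```shell
# RAS RESTful接口的简单封装示例(函数名为本示例自拟)
RAS_URL="${RAS_URL:-http://localhost:40002}"

# 查询类接口:GET请求,无需携带身份验证码
ras_get() {
    curl -s -X GET -H "Content-Type: application/json" "${RAS_URL}$1"
}

# 修改类接口:POST请求,需携带ras -T生成的身份验证码$AUTHTOKEN
ras_post() {
    curl -s -X POST -H "Authorization: $AUTHTOKEN" \
        -H "Content-Type: application/json" "${RAS_URL}$1" -d "$2"
}
```

例如,`ras_get /1/reports` 可查询1号服务器的全部可信报告,`ras_post /1 '{"registered":false,"isautoupdate":false}'` 可修改其注册状态。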
-若需要查询目标服务器的详细信息,可以使用`"/{id}"`接口的`GET`方法,其中{id}是RAS为目标服务器分配的唯一标识号。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1 -``` - -*** -若需要修改目标服务器的信息,可以使用`"/{id}"`接口的`POST`方法,其中$AUTHTOKEN是事先使用`ras -T`自动生成的身份验证码。 - -```go -type clientInfo struct { - Registered *bool `json:"registered"` // 目标服务器注册状态 - IsAutoUpdate *bool `json:"isautoupdate"`// 目标服务器基准值更新策略 -} -``` - -```shell -# curl -X POST -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1 -d '{"registered":false, "isautoupdate":false}' -``` - -*** -若需要删除目标服务器,可以使用`"/{id}"`接口的`DELETE`方法。 -**注意:** ->使用该方法并非删除目标服务器的所有信息,而是把目标服务器的注册状态置为`false`! - -```shell -# curl -X DELETE -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1 -``` - -*** -若需要查询指定范围内的所有服务器信息,可以使用`"/{from}/{to}"`接口的`GET`方法。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/9 -``` - -*** -若需要查询目标服务器的所有可信报告,可以使用`"/{id}/reports"`接口的`GET`方法。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/reports -``` - -*** -若需要查询目标服务器指定可信报告的详细信息,可以使用`"/{id}/reports/{reportid}"`接口的`GET`方法,其中{reportid}是RAS为目标服务器指定可信报告分配的唯一标识号。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/reports/1 -``` - -*** -若需要删除目标服务器指定可信报告,可以使用`"/{id}/reports/{reportid}"`接口的`DELETE`方法。 -**注意:** ->使用该方法将删除指定可信报告的所有信息,将无法再通过接口对该报告进行查询! 
- -```shell -# curl -X DELETE -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1/reports/1 -``` - -*** -若需要查询目标服务器的所有基准值,可以使用`"/{id}/basevalues"`接口的`GET`方法。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/basevalues -``` - -*** -若需要给目标服务器新增一条基准值信息,可以使用`"/{id}/newbasevalue"`接口的`POST`方法。 - -```go -type baseValueJson struct { - BaseType string `json:"basetype"` // 基准值类型 - Uuid string `json:"uuid"` // 容器或设备的标识号 - Name string `json:"name"` // 基准值名称 - Enabled bool `json:"enabled"` // 基准值是否可用 - Pcr string `json:"pcr"` // PCR值 - Bios string `json:"bios"` // BIOS值 - Ima string `json:"ima"` // IMA值 - IsNewGroup bool `json:"isnewgroup"` // 是否为一组新的基准值 -} -``` - -```shell -# curl -X POST -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1/newbasevalue -d '{"name":"test", "basetype":"host", "enabled":true, "pcr":"testpcr", "bios":"testbios", "ima":"testima", "isnewgroup":true}' -``` - -*** -若需要查询目标服务器指定基准值的详细信息,可以使用`"/{id}/basevalues/{basevalueid}"`接口的`GET`方法,其中{basevalueid}是RAS为目标服务器指定基准值分配的唯一标识号。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/basevalues/1 -``` - -*** -若需要修改目标服务器指定基准值的可用状态,可以使用`"/{id}/basevalues/{basevalueid}"`接口的`POST`方法。 - -```shell -# curl -X POST -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40002/1/basevalues/1 -d '{"enabled":true}' -``` - -*** -若需要删除目标服务器指定基准值,可以使用`"/{id}/basevalues/{basevalueid}"`接口的`DELETE`方法。 -**注意:** ->使用该方法将删除指定基准值的所有信息,将无法再通过接口对该基准值进行查询! 
- -```shell -# curl -X DELETE -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1/basevalues/1 -``` - -*** -若需要查询目标服务器上特定用户 TA 的可信状态,可以使用`"/{id}/ta/{tauuid}/status"`接口的GET方法。其中{id}是RAS为目标服务器分配的唯一标识号,{tauuid}是特定用户 TA 的身份标识号。 - -```shell -# curl -X GET -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40002/1/ta/test/status -``` - -*** -若需要查询目标服务器上特定用户 TA 的所有基准值信息,可以使用`"/{id}/ta/{tauuid}/tabasevalues"`接口的GET方法。 - -```shell -# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tabasevalues -``` - -*** -若需要查询目标服务器上特定用户 TA 的指定基准值的详细信息,可以使用`"/{id}/ta/{tauuid}/tabasevalues/{tabasevalueid}"`接口的GET方法。其中{tabasevalueid}是RAS为目标服务器上特定用户 TA 的指定基准值分配的唯一标识号。 - -```shell -# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tabasevalues/1 -``` - -*** -若需要修改目标服务器上特定用户 TA 的指定基准值的可用状态,可以使用`"/{id}/ta/{tauuid}/tabasevalues/{tabasevalueid}"`接口的`POST`方法。 - -```shell -# curl -X POST -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40002/1/ta/test/tabasevalues/1 --data '{"enabled":true}' -``` - -*** -若需要删除目标服务器上特定用户 TA 的指定基准值,可以使用`"/{id}/ta/{tauuid}/tabasevalues/{tabasevalueid}"`接口的`DELETE`方法。 -**注意:** ->使用该方法将删除指定基准值的所有信息,将无法再通过接口对该基准值进行查询! 
- -```shell -# curl -X DELETE -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" -k http://localhost:40002/1/ta/test/tabasevalues/1 -``` - -*** -若需要给目标服务器上特定用户 TA 新增一条基准值信息,可以使用`"/{id}/ta/{tauuid}/newtabasevalue"`接口的`POST`方法。 - -```go -type tabaseValueJson struct { - Uuid string `json:"uuid"` // 用户 TA 的标识号 - Name string `json:"name"` // 基准值名称 - Enabled bool `json:"enabled"` // 基准值是否可用 - Valueinfo string `json:"valueinfo"` // 镜像哈希值和内存哈希值 -} -``` - -```shell -# curl -X POST -H "Content-Type: application/json" -H "Authorization: $AUTHTOKEN" -k http://localhost:40002/1/ta/test/newtabasevalue -d '{"uuid":"test", "name":"testname", "enabled":true, "valueinfo":"test info"}' -``` - -*** -若需要查询目标服务器上特定用户 TA 的所有可信报告,可以使用`"/{id}/ta/{tauuid}/tareports"`接口的`GET`方法。 - -```shell -# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tareports -``` - -*** -若需要查询目标服务器上特定用户 TA 的指定可信报告的详细信息,可以使用`"/{id}/ta/{tauuid}/tareports/{tareportid}"`接口的`GET`方法,其中{tareportid}是RAS为目标服务器上特定用户 TA 的指定可信报告分配的唯一标识号。 - -```shell -# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tareports/2 -``` - -*** -若需要删除目标服务器上特定用户 TA 的指定可信报告,可以使用`"/{id}/ta/{tauuid}/tareports/{tareportid}"`接口的`DELETE`方法。 -**注意:** ->使用该方法将删除指定可信报告的所有信息,将无法再通过接口对该报告进行查询! 
- -```shell -# curl -X DELETE -H "Content-type: application/json" http://localhost:40002/1/ta/test/tareports/2 -``` - -*** -若需要获取本程序的版本信息,可以使用`"/version"`接口的`GET`方法。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/version -``` - -*** -若需要查询目标服务器/RAS/数据库的配置信息,可以使用`"/config"`接口的`GET`方法。 - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40002/config -``` - -*** -若需要修改目标服务器/RAS/数据库的配置信息,可以使用`"/config"`接口的`POST`方法。 - -```go -type cfgRecord struct { - // 目标服务器配置 - HBDuration string `json:"hbduration" form:"hbduration"` - TrustDuration string `json:"trustduration" form:"trustduration"` - DigestAlgorithm string `json:"digestalgorithm" form:"digestalgorithm"` - // RAS配置 - MgrStrategy string `json:"mgrstrategy" form:"mgrstrategy"` - ExtractRules string `json:"extractrules" form:"extractrules"` - IsAllupdate *bool `json:"isallupdate" form:"isallupdate"` - LogTestMode *bool `json:"logtestmode" form:"logtestmode"` -} -``` - -```shell -# curl -X POST -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/config -d '{"hbduration":"5s","trustduration":"20s","DigestAlgorithm":"sha256"}' -``` - -#### TAS接口 - -为了便于管理员对TAS服务的远程控制,本程序设计了以下接口可供调用: - -| 接口 | 方法 | -| --------------------| ------------------| -| /config | GET、POST | - -若需要查询TAS的配置信息,可使用`"/config"`接口的`GET`方法: - -```shell -# curl -X GET -H "Content-Type: application/json" http://localhost:40009/config -``` - -*** -若需要修改TAS的配置信息,可使用`"/config"`接口的`POST`方法: - -```shell -curl -X POST -H "Content-Type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40009/config -d '{"basevalue":"testvalue"}' -``` - -**注意:** ->TAS的配置信息读取与修改目前仅支持基准值 - -### FAQ - -1. RAS安装后,为什么无法启动? - - >因为在当前RAS的设计逻辑中,程序启动后需要从当前目录查找一份名为 `ecdsakey.pub` 的文件进行读取并作为之后访问该程序的身份验证码,若当前目录没有该文件,则RAS启动会报错。 - >>解决方法一:运行 `ras -T` 生成测试用token后会生成 `ecdsakey.pub` 。 - >>解决方法二:自行部署oauth2认证服务后,将对应JWT token生成方对应的验证公钥保存为 `ecdsakey.pub` 。 - -2. 
为什么RAS启动后,通过restapi无法访问? - - >因为RAS默认以https模式启动,您需要向RAS提供合法的证书才能正常访问,而http模式下启动的RAS则不需要提供证书。 - - - -## 可信平台控制模块(TPCM) - -### 背景 - -可信计算在近40年的研究过程中,经历了不断的发展和完善,已经成为信息安全的一个重要分支。中国的可信计算技术近年发展迅猛,在可信计算2.0的基础上解决了可信体系与现有体系的融合问题、可信管理问题以及可信开发的简化问题,形成了基于主动免疫体系的可信计算技术--可信计算3.0。相对于可信计算2.0被动调用的外挂式体系结构,可信计算3.0提出了以自主密码为基础、控制芯片为支柱、双融主板为平台、可信软件为核心、可信连接为纽带、策略管控成体系、安全可信保应用的全新的可信体系框架,在网络层面解决可信问题。 - -可信平台控制模块(Trusted Platform Control Module,TPCM)是一种可集成在可信计算平台中,用于建立和保障信任源点的基础核心模块。它作为中国可信计算3.0中的创新点之一和主动免疫机制的核心,实现了对整个平台的主动可控。 - -TPCM可信计算3.0架构为双体系架构,分为防护部件和计算部件,以可信密码模块为基础,通过可信平台控制模块对防护部件和计算部件及组件的固件进行可信度量,可信软件基(Trusted Software Base,TSB)对系统软件及应用软件进行可信度量,同时TPCM管理平台实现对可信度量的验证及可信策略同步和管理。 - - - -### 功能描述 - -如下图所示,整体系统方案由防护部件、计算部件和可信管理中心三部分组成。 - -![](./figures/TPCM.png) - -- 可信管理中心:对可信计算节点的防护策略和基准值进行制定、下发、维护、存储等操作的集中管理平台,可信管理中心由第三方厂商提供。 -- 防护部件:独立于计算部件执行,为可信计算平台提供具有主动度量和主动控制特征的可信计算防护功能,实现运算的同时进行安全防护。防护部件包括可信平台控制模块、可信软件基,以及可信密码模块(Trusted Cryptography Module,TCM)。TPCM是可信计算节点中实现可信防护功能的关键部件,可以采用多种技术途径实现,如板卡、芯片、IP核等,其内部包含中央处理器、存储器等硬件,固件,以及操作系统与可信功能组件等软件,支撑其作为一个独立于计算部件的防护部件组件,并行于计算部件按内置防护策略工作,对计算部件的硬件、固件及软件等需防护的资源进行可信监控,是可信计算节点中的可信根。 - -- 计算部件:主要包括硬件、操作系统和应用层软件。其中操作系统分为引导阶段和运行阶段,在引导阶段openEuler的shim和grub2支持可信度量能力,可实现对shim、grub2以及操作系统内核、initramfs等启动文件的可信度量防护;在运行阶段,openEuler操作系统支持部署可信验证要素代理(由第三方厂商可信华泰提供),它负责将数据发送给TPCM模块,用以实现运行阶段的可信度量防护。 - -其中,TPCM作为可信计算节点中实现可信防护功能的关键部件,需要与TSB、TCM、可信管理中心和可信计算节点的计算部件交互,交互方式如下: - -1. TPCM的硬件、固件与软件为TSB提供运行环境,设置的可信功能组件为TSB按策略库解释要求实现度量、控制、支撑与决策等功能提供支持。 -2. TPCM通过访问TCM获取可信密码功能,完成对防护对象可信验证、度量和保密存储等计算任务,并提供TCM服务部件以支持对TCM的访问。 -3. TPCM通过管理接口连接可信管理中心,实现防护策略管理、可信报告处理等功能。 -4. TPCM通过内置的控制器和I/O端口,经由总线与计算部件的控制器交互,实现对计算部件的主动监控。 -5. 计算部件操作系统中内置的防护代理获取预设的防护对象有关代码和数据提供给TPCM,TPCM将监控信息转发给TSB,由TSB依据策略库进行分析处理。 - -### 约束限制 - -适配服务器:TaiShan 200(型号2280)VF
-适配BMC插卡型号:BC83SMMC - -### 应用场景 - -通过TPCM特性构成一个完整的信任链,保障系统启动以后进入一个可信的计算环境。 \ No newline at end of file diff --git "a/docs/zh/docs/Administration/\345\237\272\347\241\200\351\205\215\347\275\256.md" "b/docs/zh/docs/Administration/\345\237\272\347\241\200\351\205\215\347\275\256.md" index 3c2f5d4ecd63b68da8a43158d175fdc48d4078a6..ffa32894a7e65904dd813a7fe92837a12d4c799d 100644 --- "a/docs/zh/docs/Administration/\345\237\272\347\241\200\351\205\215\347\275\256.md" +++ "b/docs/zh/docs/Administration/\345\237\272\347\241\200\351\205\215\347\275\256.md" @@ -12,15 +12,33 @@ - [设置键盘布局](#设置键盘布局) - [设置日期和时间](#设置日期和时间) - [使用timedatectl命令设置](#使用timedatectl命令设置) + - [显示日期和时间](#显示日期和时间) + - [通过远程服务器进行时间同步](#通过远程服务器进行时间同步) + - [修改日期](#修改日期) + - [修改时间](#修改时间) + - [修改时区](#修改时区) - [使用date命令设置](#使用date命令设置) + - [显示当前的日期和时间](#显示当前的日期和时间) + - [修改时间](#修改时间-1) + - [修改日期](#修改日期-1) - [使用hwclock命令设置](#使用hwclock命令设置) + - [硬件时钟和系统时钟](#硬件时钟和系统时钟) + - [显示日期和时间](#显示日期和时间-1) + - [设置日期和时间](#设置日期和时间-1) - [设置kdump](#设置kdump) - [设置kdump预留内存](#设置kdump预留内存) + - [预留内存参数格式](#预留内存参数格式) - [预留内存推荐值](#预留内存推荐值) - [禁用网络相关驱动](#禁用网络相关驱动) - [设置磁盘调度算法](#设置磁盘调度算法) - [临时修改调度策略](#临时修改调度策略) - [永久设置调度策略](#永久设置调度策略) + - [设置NMI watchdog](#设置nmi-watchdog) + - [概述](#概述) + - [注意事项](#注意事项) + - [操作步骤](#操作步骤) + - [关闭NMI watchdog](#关闭nmi-watchdog) + - [修改NMI watchdog阈值](#修改nmi-watchdog阈值) @@ -482,3 +500,64 @@ kdump配置文件(/etc/kdump.conf)中,dracut参数可以设置裁剪的驱 ```text linux /vmlinuz-4.19.90-2003.4.0.0036.oe1.x86_64 root=/dev/mapper/openeuler-root ro resume=/dev/mapper/openeuler-swap rd.lvm.lv=openeuler/root rd.lvm.lv=openeuler/swap quiet crashkernel=512M elevator=mq-deadline ``` + +## 设置NMI watchdog + +本节介绍openEuler在arm64架构上NMI watchdog方案的差异以及配置。 + +### 概述 + +NMI watchdog(Hard lockup detector)是一种用来检测系统是否出现Hard lockup(硬死锁)的机制。一般的watchdog依赖时钟中断进行挂死检测,当系统在原子上下文(如中断处理、关中断的上下文等)中出现挂死时,时钟中断得不到处理,挂死检测随之失效。NMI watchdog一般通过PMC(或者PMU)的NMI中断进行检测,NMI中断可以在原子上下文中产生并处理,因此可以用来检测原子上下文中挂死的场景。 + +主线内核已支持NMI watchdog,当硬件满足以下条件时可以使能NMI watchdog: + +1. 
支持NMI中断 +2. 支持PMC(PMU) + +在arm64上,openEuler基于arm64的SDEI功能实现了SDEI watchdog作为NMI watchdog。因此openEuler在arm64上存在2种NMI watchdog方案: + +1. SDEI watchdog(默认方式) +2. 基于PMC(PMU)中断的NMI watchdog + +### 注意事项 + +对于arm64机器,需要注意以下事项: + +- 默认情况下使用SDEI watchdog。当SDEI watchdog使能失败时,不会切换到NMI watchdog +- 需要使用NMI watchdog时,需要显式地在启动参数中禁用SDEI watchdog:disable_sdei_nmi_watchdog +- 当需要使用NMI watchdog时,需要保证硬件支持NMI中断: + - 当硬件支持NMI中断时,不需要额外处理 + - 当硬件不支持NMI中断,但是支持伪NMI中断时,需要显式地在启动参数中使能伪NMI中断:irqchip.gicv3_pseudo_nmi=1 + +以上事项不影响非arm64平台。 + +### 操作步骤 + +针对arm64架构配置NMI watchdog的操作步骤如下: + +1. 在OS的引导配置文件grub.cfg中添加如下参数:irqchip.gicv3_pseudo_nmi=1(仅通过Pseudo-NMI实现NMI watchdog时添加) disable_sdei_nmi_watchdog +2. 检查NMI watchdog是否加载成功,如果加载成功,内核dmesg日志打印类似如下内容 + + ``` + [ 11.361889][ T129] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter. + ``` + +### 关闭NMI watchdog + +将NMI watchdog临时关闭,此修改重启后会失效;默认nmi_watchdog=1。 + +```shell +# echo 0 > /proc/sys/kernel/nmi_watchdog +``` + +在OS启动时,可以通过配置内核参数nmi_watchdog=0关闭NMI watchdog。 + +### 修改NMI watchdog阈值 + +修改NMI watchdog阈值,此修改重启后会失效;默认watchdog_thresh=10。 + +```shell +# echo 10 > /proc/sys/kernel/watchdog_thresh +``` + +在OS启动时,可以通过配置内核参数watchdog_thresh=[0-60]修改阈值。 diff --git a/docs/zh/docs/Administration/FAQ-54.md "b/docs/zh/docs/Administration/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" similarity index 86% rename from docs/zh/docs/Administration/FAQ-54.md rename to "docs/zh/docs/Administration/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" index 7ba3fd243d35e33216c36fd4a7ece5765f76634f..f3f17f289cbf600d2d64ea9a208ffe872ab7ff16 100644 --- a/docs/zh/docs/Administration/FAQ-54.md +++ "b/docs/zh/docs/Administration/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -1,23 +1,6 @@ -# FAQ +# 常见问题与解决方法 - - -- [FAQ](#faq) - - 
[使用systemctl和top命令查询libvirtd服务占用内存不同](#使用systemctl和top命令查询libvirtd服务占用内存不同) - - [设置RAID0卷,参数stripsize设置为4时出错](#设置raid0卷参数stripsize设置为4时出错) - - [使用rpmbuild编译mariadb失败](#使用rpmbuild编译mariadb失败) - - [使用默认配置启动SNTP服务失败](#使用默认配置启动sntp服务失败) - - [安装时出现软件包冲突、文件冲突或缺少软件包导致安装失败](#安装时出现软件包冲突文件冲突或缺少软件包导致安装失败) - - [libiscsi降级失败](#libiscsi降级失败) - - [xfsprogs降级失败](#xfsprogs降级失败) - - [elfutils降级失败](#elfutils降级失败) - - [cpython/Lib发现CVE-2019-9674:Zip炸弹漏洞](#cpythonlib发现cve-2019-9674zip炸弹漏洞) - - [不合理使用glibc正则表达式引起ReDoS攻击](#不合理使用glibc正则表达式引起redos攻击) - - [带参数f执行modprobe或insmod报错](#带参数f执行modprobe或insmod报错) - - - -## 使用systemctl和top命令查询libvirtd服务占用内存不同 +## 问题1:使用systemctl和top命令查询libvirtd服务占用内存不同 ### 问题描述 @@ -42,7 +25,7 @@ CGroup里的memory.usage\_in\_bytes = cache + RSS + swap。 由上可知,systemd相关命令和top命令的内存占用率含义不同,所以查询结果不同。 -## 设置RAID0卷,参数stripsize设置为4时出错 +## 问题2:设置RAID0卷,参数stripsize设置为4时出错 ### 问题现象 @@ -56,7 +39,7 @@ CGroup里的memory.usage\_in\_bytes = cache + RSS + swap。 不需要修改配置文件,openEuler执行lvcreate命令时,条带化规格支持的stripesize最小值为64KB,将参数stripesize设置为64。 -## 使用rpmbuild编译mariadb失败 +## 问题3:使用rpmbuild编译mariadb失败 ### 问题描述 @@ -90,7 +73,7 @@ mariadb数据库不允许使用root权限的帐号进行测试用例执行,所 该修改关闭了编译阶段执行测试用例的功能,但不会影响编译和编译后的RPM包内容。 -## 使用默认配置启动SNTP服务失败 +## 问题4:使用默认配置启动SNTP服务失败 ### 问题现象 @@ -104,7 +87,7 @@ mariadb数据库不允许使用root权限的帐号进行测试用例执行,所 修改/etc/sysconfig/sntp文件 ,在文件中添加中国NTP快速授时服务器域名:0.generic.pool.ntp.org。 -## 安装时出现软件包冲突、文件冲突或缺少软件包导致安装失败 +## 问题5:安装时出现软件包冲突、文件冲突或缺少软件包导致安装失败 ### 问题现象 @@ -198,7 +181,7 @@ Error: file /usr/bin/build conflicts between attempted installs of python3-edk2-devel-202002-3.oe1.noarch and build-20191114-324.4.oe1.noarch ``` -## libiscsi降级失败 +## 问题6:libiscsi降级失败 ### 问题现象 @@ -227,7 +210,7 @@ libiscsi-1.19.3 之前的版本把 iscsi-xxx 等二进制文件打包进了主 yum remove libiscsi-utils ``` -## xfsprogs降级失败 +## 问题7:xfsprogs降级失败 ### 问题现象 @@ -256,33 +239,7 @@ Problem: problem with installed package xfsprogs-xfs_scrub-5.6.0-2.oe1.x86_64 # yum remove xfsprogs-xfs_scrub ``` -## elfutils降级失败 - -### 问题现象 - -elfutils降级缺少依赖,导致无法降级。 - 
-![](figures/1665628542704.png) - -### 原因分析 - -22.03-LTS、22.03-LTS-Next分支:elfutils-0.185-12 - -master分支:elfutils-0.187-7 - -20.03-LTS-SP1分支:elfutils-0.180-9 - -如上版本,elfutils主包提供的eu-objdump、eu-readelf、eu-nm命令拆分到elfutils-extra子包中。当系统已安装elfutils-extra,且elfutils进行降级时,由于低版本(如上分支版本)无法提供对应的elfutils-extra包,因此elfutils-extra子包不会降级(elfutils-extra依赖于降级前的elfutils包),导致依赖问题无法解决,最终elfutils降级失败。 - -### 解决方案 - -执行以下命令,先卸载elfutils-extra包,再进行降级操作。 - -```shell -# yum remove -y elfutils-extra -``` - -## cpython/Lib发现CVE-2019-9674:Zip炸弹漏洞 +## 问题8:cpython/Lib发现CVE-2019-9674:Zip炸弹漏洞 ### 问题现象 @@ -296,7 +253,7 @@ Python 3.7.2 及以下版本中的 Lib/zipfile.py 允许远程攻击者通过 zi 在 zipfile 文档中添加告警信息: -## 不合理使用glibc正则表达式引起ReDoS攻击 +## 问题9:不合理使用glibc正则表达式引起ReDoS攻击 ### 问题现象 @@ -331,7 +288,7 @@ Segmentation fault (core dumped) 3. 用户程序在检测到进程异常之后,通过重启进程等手段恢复业务,提升程序的可靠性。 -## 安装卸载httpd-devel和apr-util-devel软件包,其中的依赖包gdbm-devel安装、卸载有报错 +## 问题10:安装卸载httpd-devel和apr-util-devel软件包,其中的依赖包gdbm-devel安装、卸载有报错 ### 问题现象 @@ -353,7 +310,7 @@ Segmentation fault (core dumped) 1. 单包升级gdbm,安装使用gdbm-1.18.1-2版本相关软件包后,告警信息消失; 2. 
在单包升级gdbm后,再进行安装依赖的gdbm-devel软件包安装,让其依赖高版本gdbm软件包,告警信息消失。 -## 系统reboot后,执行yum/dnf等命令报错,提示rpmdb error +## 问题11:系统reboot后,执行yum/dnf等命令报错,提示rpmdb error ### 问题现象 @@ -373,7 +330,7 @@ Segmentation fault (core dumped) 步骤2 执行`rm -rf /var/lib/rpm/__db.00*`删除所有db.00的文件。 步骤3 执行`rpmdb --rebuilddb`命令,重建rpm db后即可。 -## 执行 rpmrebuild -d /home/test filesystem对filesystem包rebuild时,rebuild失败 +## 问题12:执行 rpmrebuild -d /home/test filesystem对filesystem包rebuild时,rebuild失败 ### 问题现象 @@ -391,7 +348,7 @@ Segmentation fault (core dumped) 暂时不使用rpmrebuild命令对filesystem进行rebuild。 -## 带参数f执行modprobe或insmod报错 +## 问题13:带参数f执行modprobe或insmod报错 ### 问题现象 diff --git "a/docs/zh/docs/Administration/\346\220\255\345\273\272repo\346\234\215\345\212\241\345\231\250.md" "b/docs/zh/docs/Administration/\346\220\255\345\273\272repo\346\234\215\345\212\241\345\231\250.md" index 29a1d8a96d3a5d1d44c005c049e7e2f37fa481f2..97fa78068762e23bdcc42d3f2dc39516b0acf8d2 100644 --- "a/docs/zh/docs/Administration/\346\220\255\345\273\272repo\346\234\215\345\212\241\345\231\250.md" +++ "b/docs/zh/docs/Administration/\346\220\255\345\273\272repo\346\234\215\345\212\241\345\231\250.md" @@ -1,7 +1,7 @@ # 搭建repo服务器 >![](./public_sys-resources/icon-note.gif) **说明:** ->openEuler提供了多种repo源供用户在线使用,各repo源含义可参考[系统安装](./../Releasenotes/系统安装.md)。若用户无法在线获取openEuler repo源,则可使用openEuler提供的ISO发布包创建为本地openEuler repo源。本章节中以openEuler-21.09-aarch64-dvd.iso发布包为例,请根据实际需要的ISO发布包进行修改。 +>openEuler提供了多种repo源供用户在线使用,各repo源含义可参考[系统安装](./../Releasenotes/系统安装.md)。若用户无法在线获取openEuler repo源,则可使用openEuler提供的ISO发布包创建为本地openEuler repo源。本章节中以openEuler-{version}-aarch64-dvd.iso发布包为例,请根据实际需要的ISO发布包进行修改。 @@ -25,17 +25,17 @@ ## 概述 -将openEuler提供的ISO发布包openEuler-21.09-aarch64-dvd.iso创建为repo源,如下以使用nginx进行repo源部署,提供http服务为例进行说明。 +将openEuler提供的ISO发布包openEuler-{version}-aarch64-dvd.iso创建为repo源,如下以使用nginx进行repo源部署,提供http服务为例进行说明。 ## 创建/更新本地repo源 -使用mount挂载,将openEuler的ISO发布包openEuler-21.09-aarch64-dvd.iso创建为repo源,并能够对repo源进行更新。 
+使用mount挂载,将openEuler的ISO发布包openEuler-{version}-aarch64-dvd.iso创建为repo源,并能够对repo源进行更新。 ### 获取ISO发布包 请从如下网址获取openEuler的ISO发布包: -[https://repo.openeuler.org/openEuler-21.09/ISO/](https://repo.openeuler.org/openEuler-21.09/ISO/) +[https://repo.openeuler.org/openEuler-{version}/ISO/](https://repo.openeuler.org/openEuler-{version}/ISO/) ### 挂载ISO创建repo源 @@ -44,7 +44,7 @@ 示例如下: ```shell -# mount /home/openEuler/openEuler-21.09-aarch64-dvd.iso /mnt/ +# mount /home/openEuler/openEuler-{version}-aarch64-dvd.iso /mnt/ ``` 挂载好的mnt目录如下: @@ -68,7 +68,7 @@ 可以拷贝ISO发布包中相关文件至本地目录以创建本地repo源,示例如下: ```shell -# mount /home/openEuler/openEuler-21.09-aarch64-dvd.iso /mnt/ +# mount /home/openEuler/openEuler-{version}-aarch64-dvd.iso /mnt/ # mkdir -p /home/openEuler/srv/repo/ # cp -r /mnt/Packages /home/openEuler/srv/repo/ # cp -r /mnt/repodata /home/openEuler/srv/repo/ @@ -235,14 +235,14 @@ Packages为rpm包所在的目录,repodata为repo源元数据所在的目录, - 在root权限下拷贝镜像中相关文件至/usr/share/nginx/repo下,并修改目录权限。 ```shell - # mount /home/openEuler/openEuler-21.09-aarch64-dvd.iso /mnt/ + # mount /home/openEuler/openEuler-{version}-aarch64-dvd.iso /mnt/ # cp -r /mnt/Packages /usr/share/nginx/repo # cp -r /mnt/repodata /usr/share/nginx/repo # cp -r /mnt/RPM-GPG-KEY-openEuler /usr/share/nginx/repo # chmod -R 755 /usr/share/nginx/repo ``` - openEuler-21.09-aarch64-dvd.iso存放在/home/openEuler目录下。 + openEuler-{version}-aarch64-dvd.iso存放在/home/openEuler目录下。 - 使用root在/usr/share/nginx/repo下创建repo源的软链接。 @@ -304,10 +304,10 @@ repo可配置为yum源,yum(全称为 Yellow dog Updater, Modified)是一 ```text [base] name=base - baseurl=http://repo.openeuler.org/openEuler-21.09/OS/aarch64/ + baseurl=http://repo.openeuler.org/openEuler-{version}/OS/aarch64/ enabled=1 gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-21.09/OS/aarch64/RPM-GPG-KEY-openEuler + gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/aarch64/RPM-GPG-KEY-openEuler ``` ### repo优先级 diff --git 
"a/docs/zh/docs/Administration/\346\220\255\345\273\272\346\225\260\346\215\256\345\272\223\346\234\215\345\212\241\345\231\250.md" "b/docs/zh/docs/Administration/\346\220\255\345\273\272\346\225\260\346\215\256\345\272\223\346\234\215\345\212\241\345\231\250.md" index c305f5cd18ea9ebdc52e27ee4b259796139f8401..541c6bc31dfbdad0f50410d89a6d7dbe935523a0 100644 --- "a/docs/zh/docs/Administration/\346\220\255\345\273\272\346\225\260\346\215\256\345\272\223\346\234\215\345\212\241\345\231\250.md" +++ "b/docs/zh/docs/Administration/\346\220\255\345\273\272\346\225\260\346\215\256\345\272\223\346\234\215\345\212\241\345\231\250.md" @@ -118,13 +118,13 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 在root权限下停止防火墙。 ```shell - # systemctl stop firewalld + systemctl stop firewalld ``` 2. 在root权限下关闭防火墙。 ```shell - # systemctl disable firewalld + systemctl disable firewalld ``` >![](./public_sys-resources/icon-note.gif) **说明:** @@ -135,7 +135,7 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 在root权限下修改配置文件。 ```shell -# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config +sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config ``` #### 创建组和用户 @@ -146,14 +146,14 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 在root权限下创建PostgreSQL用户(组)。 ```shell - # groupadd postgres - # useradd -g postgres postgres + groupadd postgres + useradd -g postgres postgres ``` 2. 在root权限下设置postgres用户密码(重复输入密码)。 ```shell - # passwd postgres + passwd postgres ``` #### 搭建数据盘 @@ -166,19 +166,19 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 在root权限下创建文件系统(以xfs为例,根据实际需求创建文件系统),若磁盘之前已做过文件系统,执行此命令会出现报错,可使用-f参数强制创建文件系统。 ```shell - # mkfs.xfs /dev/nvme0n1 + mkfs.xfs /dev/nvme0n1 ``` 2. 在root权限下创建数据目录。 ```shell - # mkdir /data + mkdir /data ``` 3. 在root权限下挂载磁盘。 ```shell - # mount -o noatime,nobarrier /dev/nvme0n1 /data + mount -o noatime,nobarrier /dev/nvme0n1 /data ``` #### 数据目录授权 @@ -186,7 +186,7 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 
在root权限下修改目录权限。 ```shell - # chown -R postgres:postgres /data/ + chown -R postgres:postgres /data/ ``` ### 安装、运行和卸载 @@ -197,25 +197,25 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 2. 清除缓存。 ```shell - # dnf clean all + dnf clean all ``` 3. 创建缓存。 ```shell - # dnf makecache + dnf makecache ``` 4. 在root权限下安装PostgreSQL服务器。 ```shell - # dnf install postgresql-server + dnf install postgresql-server ``` 5. 查看安装后的rpm包。 ```shell - # rpm -qa | grep postgresql + rpm -qa | grep postgresql ``` #### 运行 @@ -228,13 +228,13 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 切换到已创建的PostgreSQL用户。 ```shell - # su - postgres + su - postgres ``` 2. 初始化数据库,其中命令中的/usr/bin是命令initdb所在的目录。 ```shell - # usr/bin/initdb -D /data/ + /usr/bin/initdb -D /data/ ``` ##### 启动数据库 @@ -242,13 +242,13 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 启动PostgreSQL数据库。 ```shell - # /usr/bin/pg_ctl -D /data/ -l /data/logfile start + /usr/bin/pg_ctl -D /data/ -l /data/logfile start ``` 2. 确认PostgreSQL数据库进程是否正常启动。 ```shell - # ps -ef | grep postgres + ps -ef | grep postgres ``` 命令执行后,打印信息如下图所示,PostgreSQL相关进程已经正常启动了。 @@ -260,7 +260,7 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 登录数据库。 ```shell - # /usr/bin/psql -U postgres + /usr/bin/psql -U postgres ``` ![](./figures/login.png) @@ -291,7 +291,7 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 停止PostgreSQL数据库。 ```shell - # /usr/bin/pg_ctl -D /data/ -l /data/logfile stop + /usr/bin/pg_ctl -D /data/ -l /data/logfile stop ``` #### 卸载 @@ -299,13 +299,13 @@ PostgreSQL的架构如[图1](#fig26022387391)所示,主要进程说明如[表1 1. 在postgres用户下停止数据库。 ```shell - # /usr/bin/pg_ctl -D /data/ -l /data/logfile stop + /usr/bin/pg_ctl -D /data/ -l /data/logfile stop ``` 2. 
在root用户下执行**dnf remove postgresql-server**卸载PostgreSQL数据库。 ```shell - # dnf remove postgresql-server + dnf remove postgresql-server ``` ### 管理数据库角色 @@ -355,7 +355,7 @@ postgres=# CREATE ROLE roleexample2 WITH LOGIN PASSWORD '123456'; 创建角色名为roleexample3的角色。 ```shell -[postgres@localhost ~]# createuser roleexample3 +[postgres@localhost ~]# createuser roleexample3 ``` #### 查看角色 @@ -455,7 +455,7 @@ postgres=# DROP ROLE userexample1; 删除userexample2角色。 ```shell -[postgres@localhost ~]# dropuser userexample2 +[postgres@localhost ~]# dropuser userexample2 ``` #### 角色授权 @@ -773,8 +773,8 @@ psql命令不会自动创建databasename数据库,所以在执行psql恢复数 将db1.sql脚本文件导入到主机为192.168.202.144,端口为3306,postgres用户下newdb数据库中。 ```shell -[postgres@localhost ~]# createdb newdb -[postgres@localhost ~]# psql -h 192.168.202.144 -p 3306 -U postgres -W -d newdb < db1.sql +[postgres@localhost ~]# createdb newdb +[postgres@localhost ~]# psql -h 192.168.202.144 -p 3306 -U postgres -W -d newdb < db1.sql ``` ## Mariadb服务器 @@ -819,13 +819,13 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 在root权限下停止防火墙。 ```shell - # systemctl stop firewalld + systemctl stop firewalld ``` 2. 在root权限下关闭防火墙。 ```shell - # systemctl disable firewalld + systemctl disable firewalld ``` >![](./public_sys-resources/icon-note.gif) **说明:** @@ -836,7 +836,7 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 在root权限下修改配置文件。 ```shell - # sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux + sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux ``` #### 创建组和用户 @@ -847,17 +847,17 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 在root权限下创建MySQL用户(组)。 ```shell - # groupadd mysql + groupadd mysql ``` ```shell - # useradd -g mysql mysql + useradd -g mysql mysql ``` 2. 在root权限下设置MySQL用户密码。 ```shell - # passwd mysql + passwd mysql ``` 重复输入密码(根据实际需求设置密码)。 @@ -873,7 +873,7 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 创建分区(以/dev/sdb为例,根据实际情况创建) ```shell - # fdisk /dev/sdb + fdisk /dev/sdb ``` 2. 
输入n,按回车确认。 @@ -885,17 +885,17 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 8. 创建文件系统(以xfs为例,根据实际需求创建文件系统) ```shell - # mkfs.xfs /dev/sdb1 + mkfs.xfs /dev/sdb1 ``` 9. 挂载分区到“/data”以供操作系统使用。 ```shell - # mkdir /data + mkdir /data ``` ```shell - # mount /dev/sdb1 /data + mount /dev/sdb1 /data ``` 10. 执行命令“vi /etc/fstab", 编辑“/etc/fstab”使重启后自动挂载数据盘。如下图中,添加最后一行内容。 @@ -915,32 +915,32 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 创建物理卷(sdb为硬盘名称,具体名字以实际为准)。 ```shell - # pvcreate /dev/sdb + pvcreate /dev/sdb ``` 2. 创建物理卷组(其中datavg为创建的卷组名称,具体名字以实际规划为准)。 ```shell - # vgcreate datavg /dev/sdb + vgcreate datavg /dev/sdb ``` 3. 创建逻辑卷(其中600G为规划的逻辑卷大小,具体大小以实际情况为准;datalv为创建的逻辑卷的名字,具体名称以实际规划为准。)。 ```shell - # lvcreate -L 600G -n datalv datavg + lvcreate -L 600G -n datalv datavg ``` 4. 创建文件系统。 ```shell - # mkfs.xfs /dev/datavg/datalv + mkfs.xfs /dev/datavg/datalv ``` 5. 创建数据目录并挂载。 ```shell - # mkdir /data - # mount /dev/datavg/datalv /data + mkdir /data + mount /dev/datavg/datalv /data ``` 6. 执行命令**vi /etc/fstab**,编辑“/etc/fstab”使重启后自动挂载数据盘。如下图中,添加最后一行内容。 @@ -954,10 +954,10 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 在已创建的数据目录 **/data** 基础上,使用root权限继续创建进程所需的相关目录并授权MySQL用户(组)。 ```shell - # mkdir -p /data/mariadb - # cd /data/mariadb - # mkdir data tmp run log - # chown -R mysql:mysql /data + mkdir -p /data/mariadb + cd /data/mariadb + mkdir data tmp run log + chown -R mysql:mysql /data ``` ### 安装、运行和卸载 @@ -968,25 +968,25 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 2. 清除缓存。 ```shell - # dnf clean all + dnf clean all ``` 3. 创建缓存。 ```shell - # dnf makecache + dnf makecache ``` 4. 在root权限下安装mariadb服务器。 ```shell - # dnf install mariadb-server + dnf install mariadb-server ``` 5. 查看安装后的rpm包。 ```shell - # rpm -qa | grep mariadb + rpm -qa | grep mariadb ``` #### 运行 @@ -994,13 +994,13 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 在root权限下开启mariadb服务器。 ```shell - # systemctl start mariadb + systemctl start mariadb ``` 2. 
在root权限下初始化数据库。 ```shell - # /usr/bin/mysql_secure_installation + /usr/bin/mysql_secure_installation ``` 命令执行过程中需要输入数据库的root设置的密码,若没有密码则直接按“Enter”。然后根据提示及实际情况进行设置。 @@ -1008,7 +1008,7 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 3. 登录数据库。 ```shell - # mysql -u root -p + mysql -u root -p ``` 命令执行后提示输入密码。密码为[2](#li197143190587)中设置的密码。 @@ -1021,14 +1021,14 @@ MariaDB的架构如[图2](#fig13492418164520)所示。 1. 在root权限下关闭数据库进程。 ```shell - # ps -ef | grep mysql - # kill -9 进程ID + ps -ef | grep mysql + kill -9 进程ID ``` 2. 在root权限下执行**dnf remove mariadb-server**命令卸载mariadb。 ```shell - # dnf remove mariadb-server + dnf remove mariadb-server ``` ### 管理数据库用户 @@ -1363,31 +1363,31 @@ mysqldump [options] -all-databases > outputfile 备份主机为192.168.202.144,端口为3306,root用户下的所有数据库到alldb.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --all-databases > alldb.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --all-databases > alldb.sql ``` 备份主机为192.168.202.144,端口为3306,root用户下的db1数据库到db1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --databases db1 > db1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --databases db1 > db1.sql ``` 备份主机为192.168.202.144,端口为3306,root用户下的db1数据库的tb1表到db1tb1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 db1 tb1 > db1tb1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 db1 tb1 > db1tb1.sql ``` 只备份主机为192.168.202.144,端口为3306,root用户下的db1数据库的表结构到db1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -d db1 > db1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -d db1 > db1.sql ``` 只备份主机为192.168.202.144,端口为3306,root用户下的db1数据库的数据到db1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -t db1 > db1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -t db1 > db1.sql ``` #### 恢复数据库 @@ -1440,13 +1440,13 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在root权限下停止防火墙。 ```shell - # systemctl stop firewalld + systemctl stop firewalld ``` 2. 
在root权限下关闭防火墙。 ```shell - # systemctl disable firewalld + systemctl disable firewalld ``` >![](./public_sys-resources/icon-note.gif) **说明:** @@ -1457,7 +1457,7 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在root权限下修改配置文件。 ```shell - # sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux + sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux ``` #### 创建组和用户 @@ -1468,17 +1468,17 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在root权限下创建MySQL用户(组)。 ```shell - # groupadd mysql + groupadd mysql ``` ```shell - # useradd -g mysql mysql + useradd -g mysql mysql ``` 2. 在root权限下设置MySQL用户密码。 ```shell - # passwd mysql + passwd mysql ``` 重复输入密码(根据实际需求设置密码)。 @@ -1487,14 +1487,14 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 >![](./public_sys-resources/icon-note.gif) **说明:** >- 进行性能测试时,数据目录使用单独硬盘,需要对硬盘进行格式化并挂载,参考方法一或者方法二。 ->- 非性能测试时,在root权限下执行`mkdir /data`创建数据目录即可。然后跳过本小节: +>- 非性能测试时,在root权限下执行`mkdir /data`创建数据目录即可。然后跳过本小节: ##### 方法一:在root权限下使用fdisk进行磁盘管理 1. 创建分区(以/dev/sdb为例,根据实际情况创建) ```shell - # fdisk /dev/sdb + fdisk /dev/sdb ``` 2. 输入n,按回车确认。 @@ -1506,17 +1506,17 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 8. 创建文件系统(以xfs为例,根据实际需求创建文件系统) ```shell - # mkfs.xfs /dev/sdb1 + mkfs.xfs /dev/sdb1 ``` 9. 挂载分区到“/data”以供操作系统使用。 ```shell - # mkdir /data + mkdir /data ``` ```shell - # mount /dev/sdb1 /data + mount /dev/sdb1 /data ``` 10. 执行命令“vi /etc/fstab", 编辑“/etc/fstab”使重启后自动挂载数据盘。如下图中,添加最后一行内容。 @@ -1529,41 +1529,41 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 >![](./public_sys-resources/icon-note.gif) **说明:** >此步骤需要安装镜像中的lvm2相关包,步骤如下: ->1. 配置本地yum源,详细信息请参考[搭建repo服务器](./搭建repo服务器.html)。如果已经执行,则可跳过此步。 ->2. 执行`yum install lvm2`安装lvm2。 +>1. 配置本地yum源,详细信息请参考[搭建repo服务器](./搭建repo服务器.html)。如果已经执行,则可跳过此步。 +>2. 执行`yum install lvm2`安装lvm2。 1. 创建物理卷(sdb为硬盘名称,具体名字以实际为准)。 ```shell - #pvcreate /dev/sdb + pvcreate /dev/sdb ``` 2. 创建物理卷组(其中datavg为创建的卷组名称,具体名字以实际规划为准)。 ```shell - #vgcreate datavg /dev/sdb + vgcreate datavg /dev/sdb ``` 3. 
创建逻辑卷(其中600G为规划的逻辑卷大小,具体大小以实际情况为准;datalv为创建的逻辑卷的名字,具体名称以实际规划为准。)。 ```shell - #lvcreate -L 600G -n datalv datavg + lvcreate -L 600G -n datalv datavg ``` 4. 创建文件系统。 ```shell - #mkfs.xfs /dev/datavg/datalv + mkfs.xfs /dev/datavg/datalv ``` 5. 创建数据目录并挂载。 ```shell - #mkdir /data + mkdir /data ``` ```shell - #mount /dev/datavg/datalv /data + mount /dev/datavg/datalv /data ``` 6. 执行命令**vi /etc/fstab**,编辑“/etc/fstab”使重启后自动挂载数据盘。如下图中,添加最后一行内容。 @@ -1577,10 +1577,10 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在已创建的数据目录 **/data** 基础上,使用root权限继续创建进程所需的相关目录并授权MySQL用户(组)。 ```shell - # mkdir -p /data/mysql - # cd /data/mysql - # mkdir data tmp run log - # chown -R mysql:mysql /data + mkdir -p /data/mysql + cd /data/mysql + mkdir data tmp run log + chown -R mysql:mysql /data ``` ### 安装、运行和卸载 @@ -1591,25 +1591,25 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 2. 清除缓存。 ```shell - # dnf clean all + dnf clean all ``` 3. 创建缓存。 ```shell - # dnf makecache + dnf makecache ``` 4. 在root权限下安装MySQL服务器。 ```shell - # dnf install mysql-server + dnf install mysql-server ``` 5. 查看安装后的rpm包。 ```shell - # rpm -qa | grep mysql-server + rpm -qa | grep mysql-server ``` #### 运行 @@ -1618,7 +1618,7 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在root权限下创建my.cnf文件,其中文件路径(包括软件安装路径basedir、数据路径datadir等)根据实际情况修改。 ```shell - # vi /etc/my.cnf + vi /etc/my.cnf ``` 编辑my.cnf内容如下: @@ -1646,7 +1646,7 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 2. 确保my.cnf配置文件修改正确。 ```shell - # cat /etc/my.cnf + cat /etc/my.cnf ``` ![](./figures/zh-cn_image_0231563132.png) @@ -1657,14 +1657,14 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 3. 在root权限下修改/etc/my.cnf文件的组和用户为mysql:mysql ```shell - # chown mysql:mysql /etc/my.cnf + chown mysql:mysql /etc/my.cnf ``` 2. 配置环境变量。 1. 安装完成后,在root权限下将MySQL二进制文件路径到PATH。 ```shell - # echo export PATH=$PATH:/usr/local/mysql/bin >> /etc/profile + echo export PATH=$PATH:/usr/local/mysql/bin >> /etc/profile ``` >![](./public_sys-resources/icon-caution.gif) **注意:** @@ -1673,13 +1673,13 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 2. 
在root权限下使环境变量配置生效。 ```shell - # source /etc/profile + source /etc/profile ``` 3. 在root权限下初始化数据库。 >![](./public_sys-resources/icon-note.gif) **说明:** - >本步骤倒数第2行中有初始密码,请注意保存,登录数据库时需要使用。 + >本步骤倒数第2行中有初始密码,请注意保存,登录数据库时需要使用。 ```shell # mysqld --defaults-file=/etc/my.cnf --initialize @@ -1698,22 +1698,22 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在root权限下修改文件权限。 ```shell - # chmod 777 /usr/local/mysql/support-files/mysql.server - # chown mysql:mysql /var/log/mysql/* + chmod 777 /usr/local/mysql/support-files/mysql.server + chown mysql:mysql /var/log/mysql/* ``` 2. 在root权限下启动MySQL。 ```shell - # cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysql - # chkconfig mysql on + cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysql + chkconfig mysql on ``` 以mysql用户启动数据库。 ```shell - # su - mysql - # service mysql start + su - mysql + service mysql start ``` 5. 登录数据库。 @@ -1765,14 +1765,14 @@ MySQL所使用的SQL语言是用于访问数据库的最常用标准化语言。 1. 在root权限下关闭数据库进程。 ```shell - # ps -ef | grep mysql - # kill -9 进程ID + ps -ef | grep mysql + kill -9 进程ID ``` 2. 
在root权限下执行**dnf remove mysql**命令卸载MySQL。 ```shell - # dnf remove mysql + dnf remove mysql ``` ### 管理数据库用户 @@ -2104,31 +2104,31 @@ mysqldump [options] -all-databases > outputfile 备份主机为192.168.202.144,端口为3306,root用户下的所有数据库到alldb.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --all-databases > alldb.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --all-databases > alldb.sql ``` 备份主机为192.168.202.144,端口为3306,root用户下的db1数据库到db1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --databases db1 > db1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 --databases db1 > db1.sql ``` 备份主机为192.168.202.144,端口为3306,root用户下的db1数据库的tb1表到db1tb1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 db1 tb1 > db1tb1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 db1 tb1 > db1tb1.sql ``` 只备份主机为192.168.202.144,端口为3306,root用户下的db1数据库的表结构到db1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -d db1 > db1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -d db1 > db1.sql ``` 只备份主机为192.168.202.144,端口为3306,root用户下的db1数据库的数据到db1.sql中。 ```shell -# mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -t db1 > db1.sql +mysqldump -h 192.168.202.144 -P 3306 -uroot -p123456 -t db1 > db1.sql ``` #### 恢复数据库 diff --git "a/docs/zh/docs/Administration/\346\237\245\347\234\213\347\263\273\347\273\237\344\277\241\346\201\257.md" "b/docs/zh/docs/Administration/\346\237\245\347\234\213\347\263\273\347\273\237\344\277\241\346\201\257.md" index e35099b2b02326ad921af2cc586443c25a23deab..b729b1a0886343c74038952c6e102c9320463ba5 100644 --- "a/docs/zh/docs/Administration/\346\237\245\347\234\213\347\263\273\347\273\237\344\277\241\346\201\257.md" +++ "b/docs/zh/docs/Administration/\346\237\245\347\234\213\347\263\273\347\273\237\344\277\241\346\201\257.md" @@ -6,16 +6,10 @@ cat /etc/os-release ``` - 例如,命令和输出如下: + 例如,命令如下: ```shell $ cat /etc/os-release - NAME="openEuler" - VERSION="23.09" - 
ID="openEuler" - VERSION_ID="23.09" - PRETTY_NAME="openEuler 23.09" - ANSI_COLOR="0;31" ``` - 查看系统相关的资源信息。 diff --git "a/docs/zh/docs/Administration/\350\247\243\351\207\212\345\231\250\347\261\273\345\272\224\347\224\250\347\250\213\345\272\217\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244\347\224\250\346\210\267\346\226\207\346\241\243.md" "b/docs/zh/docs/Administration/\350\247\243\351\207\212\345\231\250\347\261\273\345\272\224\347\224\250\347\250\213\345\272\217\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244\347\224\250\346\210\267\346\226\207\346\241\243.md" new file mode 100644 index 0000000000000000000000000000000000000000..5f28d77b7718447b139e4204d0f06d3f7e1bb006 --- /dev/null +++ "b/docs/zh/docs/Administration/\350\247\243\351\207\212\345\231\250\347\261\273\345\272\224\347\224\250\347\250\213\345\272\217\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244\347\224\250\346\210\267\346\226\207\346\241\243.md" @@ -0,0 +1,263 @@ +# 解释器类应用程序安全防护 + +## 背景介绍 + +业界主要使用Linux IMA机制对系统运行的程序发起完整性检查,一方面可以检测文件是否被篡改,另一方面通过IMA的白名单机制可以保证只有经过认证(如签名或HMAC)的文件才可以被运行。 + +Linux IMA机制支持对通过read()、exec()、mmap()等系统调用访问的文件进行完整性度量/评估。尽管从功能上而言,IMA能够配置对解释器运行脚本所采用的read()系统调用的完整性保护功能,但是在实际场景中,IMA摘要列表主要配置为针对exec()、mmap()进行拦截,而无法有效拦截未授权的恶意脚本执行。原因是IMA无法将脚本文件和其他可变的数据文件进行区分。一旦针对read()系统调用配置完整性保护,则会将其他可变的配置文件、临时文件、数据文件等纳入保护范围,而这些文件无法预先生成基线或认证凭据,从而导致完整性检查失败。因此实际场景无法配置基于read()的度量/评估策略,无法针对性地实现拦截和保护: + +| **执行方式** | 实际应用场景下能否通过IMA保护 | +| ------------ | ----------------------------- | +| ./test.sh | 是 | +| bash test.sh | 否 | + +## 特性介绍 + +本特性旨在通过内核系统调用保证通过执行方式运行脚本文件(如./test.sh)和通过解释器运行脚本文件(bash ./test.sh)具备相同的权限检查流程,具体说明如下: + +### execveat()支持AT_CHECK检查参数 + +execveat()是自Linux 3.19/glibc 2.34版本开始支持的系统调用函数,该函数允许传入一个已打开的文件描述符并执行该文件。本特性针对execveat()系统调用扩展AT_CHECK检查机制,实现对某个文件是否可执行进行检查操作,而不真正地执行该文件。 + 
+当调用者通过execveat()系统调用函数,传入目标文件描述符并指定AT_CHECK参数时,在内核的系统调用执行逻辑中,首先针对传入的文件描述符进行权限检查,该流程与普通的文件执行流程一致,包含文件的DAC权限位、LSM访问控制规则、IMA等检查,如果检查不通过则退出并返回错误码-EACCES。直到所有权限检查流程完成后,execveat()系统调用判断参数是否包含AT_CHECK,如果包含,则不执行后续的执行流程,退出并返回0,表示该文件的可执行权限检查通过。 +![](./figures/Process_Of_EXECVAT_ATCHECK.png) + +### 解释器支持调用execveat对程序进行权限检查 + +在支持基于execveat()的AT_CHECK检查机制的基础上,当解释器打开待运行的脚本后,可主动调用execveat()系统调用函数发起对文件进行可执行权限检查,只有当权限检查成功后,才可继续运行脚本文件。 +![](./figures/AT_CHECK_Process.png) + +## 接口介绍 + +### 系统调用接口说明 + +execveat()系统调用的函数类型为: + +``` +int execveat(int dirfd, const char *pathname, + char *const _Nullable argv[], + char *const _Nullable envp[], + int flags); +``` + +本特性涉及新增flags参数AT_CHECK,说明如下: + +| **参数** | **取值** | **说明** | +| -------- | -------- | -------------------------------------------------- | +| AT_CHECK | 0x10000 | 对目标文件进行可执行权限检查,而不真正地执行该文件 | + +### 内核启动参数说明 + +本特性支持如下内核启动参数: + +| **参数** | **取值** | **说明** | +| ------------------- | -------- | ------------------------------------------------------------ | +| exec_check.bash= | 0/1 | 默认为0,设置为1时,bash解释器进程运行脚本前调用execveat进行脚本文件的可执行权限检查 | +| exec_check.java= | 0/1 | 默认为0,设置为1时,jdk运行class文件和jar文件时,需要调用execveat进行目标文件的可执行权限检查 | +| exec_check.= | 0/1 | 后续可扩展其他解释器 | + +**注意:上述启动参数实际由各个解释器进程进行读取解析,内核并不实际使用这些参数。** + +## 特性范围 + +本特性于openEuler 24.03 LTS SP1(6.6内核)版本开始支持,需要配合对应版本的内核使用。特性支持的解释器类型如下: + +| **解释器** | **目标文件** | **说明** | +| ---------- | ----------------- | ---------------------------------------------------- | +| bash | shell脚本文件 | bash进程对打开的shell文件进行可执行权限检查 | +| jdk | class文件/jar文件 | java虚拟机对加载的class文件和jar包进行可执行权限检查 | + +社区开发人员或用户可基于该机制,自行扩展其他解释器或类似机制的支持。 + +## 使用说明 + +### AT_CHECK参数使用示例 + +#### 前置条件 + +内核版本大于6.6.0-54.0.0.58,glibc版本大于等于2.38-41。 + +``` +glibc-2.38-41.oe2403sp1.x86_64 +kernel-6.6.0-54.0.0.58.oe2403sp1.x86_64 +``` + +#### 操作指导 + +可编写如下测试程序(test.c)进行参数功能测试: + +``` +#define _GNU_SOURCE + +#include <fcntl.h> +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> + +#define AT_CHECK 0x10000 + +int main(void) +{ + int fd; + int access_ret; + + fd = open("./", 
O_RDONLY); + access_ret = execveat(fd, "test.sh", NULL, NULL, AT_CHECK); + perror("execveat"); + printf("access_ret = %d\n", access_ret); + close(fd); + return 0; +} +``` + +**步骤1:** 编译测试代码: + +``` +gcc test.c -o test +``` + +**步骤2:** 创建测试脚本test.sh: + +``` +echo "sleep 10" > test.sh +``` + +**步骤3:** 如果测试脚本具备合法的可执行权限,则execveat返回0: + +``` +# chmod +x test.sh +# ./test +execveat: Success +access_ret = 0 +``` + +**步骤4:** 如果测试脚本不具备合法的权限,则execveat返回-1,错误码为Permission denied: + +``` +# chmod -x test.sh +# ./test +execveat: Permission denied +access_ret = -1 +``` + +### bash解释器支持脚本可执行权限检查 + +#### 前置条件 + +内核版本大于6.6.0-54,glibc版本大于等于2.38-41,bash版本大于等于5.2.15-13。 + +```bash +bash-5.2.15-13.oe2403sp1.x86_64 +glibc-2.38-41.oe2403sp1.x86_64 +kernel-6.6.0-54.0.0.58.oe2403sp1.x86_64 +``` + +#### 操作指导 + +**步骤1:** 设置系统中所有脚本文件的权限为可执行 + +```bash +find / -name "*.sh" -exec chmod +x {} \; +``` + +**步骤2:** 设置启动参数并重启系统,添加的启动参数为: + +``` +exec_check.bash=1 +``` + +**步骤3:** 验证只有具备可执行权限的脚本才可被bash解释器运行: + +```bash +# echo "echo hello world" > test.sh +# bash test.sh +bash: line 0: [1402] denied sourcing non-executable test.sh +# chmod +x test.sh +# bash test.sh +hello world +``` + +### jdk支持脚本可执行权限检查 + +#### 前置条件 + +获取支持该特性的jdk代码: + +``` +https://gitee.com/openeuler/bishengjdk-8/tree/IMA_Glibc2_34 +``` + +按照如下流程编译: + +``` +https://gitee.com/openeuler/bishengjdk-8/wikis/%E4%B8%AD%E6%96%87%E6%96%87%E6%A1%A3/%E6%AF%95%E6%98%87JDK%208%20%E6%BA%90%E7%A0%81%E6%9E%84%E5%BB%BA%E8%AF%B4%E6%98%8E +``` + +#### 操作指导 + +**步骤1:** 确保系统中所有.class文件和.jar文件具备可执行权限 + +``` +find / -name "*.class" -exec chmod +x {} \; +find / -name "*.jar" -exec chmod +x {} \; +``` + +**步骤2:** 设置启动参数并重启系统,添加的启动参数为: + +``` +exec_check.java=1 +``` + +**步骤3:** 验证只有具备可执行权限的class文件或jar文件才可被jvm运行: + +可编写如下测试程序(HelloWorld.java)进行参数功能测试: + +``` +public class HelloWorld { + public static void main(String[] args) { + System.out.println("Hello, World!"); + } +} +``` + +```bash +# javac HelloWorld.java +Access denied to 
/home/bishengjdk/bishengjdk-8/install/jvm/openjdk-1.8.0_432-internal/lib/tools.jar +# chmod +x /home/bishengjdk/bishengjdk-8/install/jvm/openjdk-1.8.0_432-internal/lib/tools.jar +# javac HelloWorld.java +# java HelloWorld +Access denied to HelloWorld.class + +# chmod +x HelloWorld.class +# java HelloWorld +Hello, World! +``` + +### 结合IMA摘要列表实现解释器类应用完整性保护 + +#### 前置条件 + +开启IMA摘要列表功能,详见[**内核完整性度量(IMA)** ](内核完整性度量(IMA).md)文档章节。 + +#### 操作指导 + +**步骤1:** 为目标应用程序生成IMA摘要列表(过程略,摘要列表生成方式详见[**内核完整性度量(IMA)** ](内核完整性度量(IMA).md)文档章节)。 + +**步骤2:** 开启IMA摘要列表功能(过程略,摘要列表生成方式详见[**内核完整性度量(IMA)** ](内核完整性度量(IMA).md)文档章节),以开启摘要列表+shell脚本校验为例,配置的内核启动参数如下: + +```bash +ima_appraise=enforce ima_appraise_digest_list=digest-nometadata ima_policy="appraise_exec_tcb" initramtmpfs module.sig_enforce exec_check.bash=1 +``` + +**步骤3:** 验证IMA对bash脚本完整性保护 + +```bash +# echo "echo hello world" > test.sh +# chmod +x test.sh +# bash test.sh +bash: line 0: [2520] denied sourcing non-executable test.sh + +# 生成摘要列表后签名并导入(略) +# echo /etc/ima/digest_lists/0-metadata_list-compact-test.sh > /sys/kernel/security/ima/digest_list_data +# bash test.sh +hello world +``` diff --git "a/docs/zh/docs/Administration/\350\277\234\347\250\213\350\257\201\346\230\216\357\274\210\351\262\262\351\271\217\345\256\211\345\205\250\345\272\223\357\274\211.md" "b/docs/zh/docs/Administration/\350\277\234\347\250\213\350\257\201\346\230\216\357\274\210\351\262\262\351\271\217\345\256\211\345\205\250\345\272\223\357\274\211.md" new file mode 100644 index 0000000000000000000000000000000000000000..e0b223804caf60e189f51b1cf2722c510f8bce28 --- /dev/null +++ "b/docs/zh/docs/Administration/\350\277\234\347\250\213\350\257\201\346\230\216\357\274\210\351\262\262\351\271\217\345\256\211\345\205\250\345\272\223\357\274\211.md" @@ -0,0 +1,412 @@ +# 远程证明(鲲鹏安全库) + +## 介绍 + +本项目开发运行在鲲鹏处理器上的基础安全软件组件,前期主要聚焦在远程证明等可信计算相关领域,使能社区安全开发者。 + +## 软件架构 + +在未使能TEE的平台上,本项目可提供平台远程证明特性,其软件架构如下图所示: + +![img](./figures/RA-arch-1.png) + 
+在已使能TEE的平台上,本项目可提供TEE远程证明特性,其软件架构如下图所示: + +![img](./figures/RA-arch-2.png) + +## 安装配置 + +1. 使用yum安装程序的rpm包,命令如下: + + ```shell + # yum install kunpengsecl-ras kunpengsecl-rac kunpengsecl-rahub kunpengsecl-qcaserver kunpengsecl-attester kunpengsecl-tas kunpengsecl-devel + ``` + +2. 准备数据库环境:进入 `/usr/share/attestation/ras` 目录,执行 `prepare-database-env.sh` 脚本进行自动化的数据库环境配置。 + +3. 程序运行时依赖的配置文件有三个路径,分别为:当前路径 `./config.yaml` ,家路径 `${HOME}/.config/attestation/ras(rac)(rahub)(qcaserver)(attester)(tas)/config.yaml` ,以及系统路径 `/etc/attestation/ras(rac)(rahub)(qcaserver)(attester)(tas)/config.yaml` 。 + +4. (可选)如果需要创建家目录配置文件,可在安装好rpm包后,执行位于 `/usr/share/attestation/ras(rac)(rahub)(qcaserver)(attester)(tas)` 下的脚本 `prepare-ras(rac)(hub)(qca)(attester)(tas)conf-env.sh` 从而完成家目录配置文件的部署。 + +## 相关参数 + +### RAS启动参数 + +命令行输入 `ras` 即可启动RAS程序。请注意,在当前目录下需要提供**ECDSA**公钥并命名为 `ecdsakey.pub` 。相关参数如下: + +```shell + -H --https http/https模式开关,默认为https(true),false=http + -h --hport https模式下RAS监听的restful api端口 + -p, --port string RAS监听的client api端口 + -r, --rest string http模式下RAS监听的restful api端口 + -T, --token 生成一个测试用的验证码并退出 + -v, --verbose 打印更详细的RAS运行时日志信息 + -V, --version 打印RAS版本并退出 +``` + +### RAC启动参数 + +命令行输入 `sudo raagent` 即可启动RAC程序,请注意,物理TPM模块的开启需要sudo权限。相关参数如下: + +```shell + -s, --server string 指定待连接的RAS服务端口 + -t, --test 以测试模式启动 + -v, --verbose 打印更详细的RAC运行时日志信息 + -V, --version 打印RAC版本并退出 + -i, --imalog 指定ima文件路径 + -b, --bioslog 指定bios文件路径 + -T, --tatest 以TA测试模式启动 +``` + +**注意:** +>1.若要使用TEE远程证明特性,需要以非TA测试模式启动RAC,并将待证明TA的uuid、是否使用TCB、mem_hash和img_hash按序放入RAC执行路径下的**talist**文件内。同时预装由TEE团队提供的**libqca.so**库和**libteec.so**库。**talist**文件格式如下: +> +>```text +>e08f7eca-e875-440e-9ab0-5f381136c600 false ccd5160c6461e19214c0d8787281a1e3c4048850352abe45ce86e12dd3df9fde 46d5019b0a7ffbb87ad71ea629ebd6f568140c95d7b452011acfa2f9daf61c7a +>``` +> +>2.若不使用TEE远程证明特性,则需要将 `${DESTDIR}/usr/share/attestation/qcaserver` 目录下的libqca.so库和libteec.so库复制到 `/usr/lib` 或 `/usr/lib64` 目录,并以TA测试模式启动RAC。 + +### QCA启动参数 + +命令行输入 
`${DESTDIR}/usr/bin/qcaserver` 即可启动QCA程序,请注意,这里必须要使用qcaserver的完整路径以正常启动QTA,同时需要使QTA中的CA路径参数与该路径保持相同。相关参数如下: + +```shell + -C, --scenario int 设置程序的应用场景,默认为no_as场景(0),1=as_no_daa场景,2=as_with_daa场景 + -S, --server string 指定开放的服务器地址/端口 +``` + +### ATTESTER启动参数 + +命令行输入 `attester` 即可启动ATTESTER程序。相关参数如下: + +```shell + -B, --basevalue string 设置基准值文件读取路径 + -M, --mspolicy int 设置度量策略,默认为-1,需要手动指定。1=仅比对img-hash值,2=仅比对hash值,3=同时比对img-hash和hash两个值 + -S, --server string 指定待连接的服务器地址 + -U, --uuid int 指定待验证的可信应用 + -V, --version 打印程序版本并退出 + -T, --test 读取固定的nonce值以匹配目前硬编码的可信报告 +``` + +### TAS启动参数 + +命令行输入 `tas` 即可启动TAS程序。相关参数如下: + +```shell + -T, --token 生成一个测试用的验证码并退出 +``` + +**注意:** +>1.若要启用TAS服务,需要先为TAS配置好私钥。可以按如下命令修改家目录下的配置文件: +> +>```shell +># cd ${HOME}/.config/attestation/tas +># vim config.yaml +> # 如下DAA_GRP_KEY_SK_X和DAA_GRP_KEY_SK_Y的值仅用于测试,正常使用前请务必更新其内容以保证安全。 +>tasconfig: +> port: 127.0.0.1:40008 +> rest: 127.0.0.1:40009 +> akskeycertfile: ./ascert.crt +> aksprivkeyfile: ./aspriv.key +> huaweiitcafile: ./Huawei IT Product CA.pem +> DAA_GRP_KEY_SK_X: 65a9bf91ac8832379ff04dd2c6def16d48a56be244f6e19274e97881a776543c65a9bf91ac8832379ff04dd2c6def16d48a56be244f6e19274e97881a776543c +> DAA_GRP_KEY_SK_Y: 126f74258bb0ceca2ae7522c51825f980549ec1ef24f81d189d17e38f1773b56126f74258bb0ceca2ae7522c51825f980549ec1ef24f81d189d17e38f1773b56 +>``` +> +>之后再输入`tas`启动TAS程序。 +> +>2.在有TAS环境中,为提高QCA配置证书的效率,并非每一次启动都需要访问TAS以生成相应证书,而是通过证书的本地化存储,即读取QCA侧 `config.yaml` 中配置的证书路径,通过 `func hasAKCert(s int) bool` 函数检查是否已有TAS签发的证书保存于本地,若成功读取证书,则无需访问TAS,若读取证书失败,则需要访问TAS,并将TAS返回的证书保存于本地。 + +## 接口定义 + +### RAS接口 + +为了便于管理员对目标服务器、RAS以及目标服务器上部署的TEE中的用户 TA 进行管理,本程序设计了以下接口可供调用: + +| 接口 | 方法 | +| --------------------------------- | --------------------------- | +| / | GET | +| /{id} | GET、POST、DELETE | +| /{from}/{to} | GET | +| /{id}/reports | GET | +| /{id}/reports/{reportid} | GET、DELETE | +| /{id}/basevalues | GET | +| /{id}/newbasevalue | POST | +| /{id}/basevalues/{basevalueid} | GET、POST、DELETE | +| 
/{id}/ta/{tauuid}/status | GET | +| /{id}/ta/{tauuid}/tabasevalues | GET | +| /{id}/ta/{tauuid}/tabasevalues/{tabasevalueid} | GET、POST、DELETE | +| /{id}/ta/{tauuid}/newtabasevalue | POST | +| /{id}/ta/{tauuid}/tareports | GET | +| /{id}/ta/{tauuid}/tareports/{tareportid} | GET、POST、DELETE | +| /{id}/basevalues/{basevalueid} | GET、DELETE | +| /version | GET | +| /config | GET、POST | +| /{id}/container/status | GET | +| /{id}/device/status | GET | + +上述接口的具体用法分别介绍如下。 + +若需要查询所有服务器的信息,可以使用`"/"`接口。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/ +``` + +*** +若需要查询目标服务器的详细信息,可以使用`"/{id}"`接口的`GET`方法,其中{id}是RAS为目标服务器分配的唯一标识号。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1 +``` + +*** +若需要修改目标服务器的信息,可以使用`"/{id}"`接口的`POST`方法,其中$AUTHTOKEN是事先使用`ras -T`自动生成的身份验证码。 + +```go +type clientInfo struct { + Registered *bool `json:"registered"` // 目标服务器注册状态 + IsAutoUpdate *bool `json:"isautoupdate"`// 目标服务器基准值更新策略 +} +``` + +```shell +# curl -X POST -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1 -d '{"registered":false, "isautoupdate":false}' +``` + +*** +若需要删除目标服务器,可以使用`"/{id}"`接口的`DELETE`方法。 +**注意:** +>使用该方法并非删除目标服务器的所有信息,而是把目标服务器的注册状态置为`false`! 
+ +```shell +# curl -X DELETE -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1 +``` + +*** +若需要查询指定范围内的所有服务器信息,可以使用`"/{from}/{to}"`接口的`GET`方法。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/9 +``` + +*** +若需要查询目标服务器的所有可信报告,可以使用`"/{id}/reports"`接口的`GET`方法。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/reports +``` + +*** +若需要查询目标服务器指定可信报告的详细信息,可以使用`"/{id}/reports/{reportid}"`接口的`GET`方法,其中{reportid}是RAS为目标服务器指定可信报告分配的唯一标识号。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/reports/1 +``` + +*** +若需要删除目标服务器指定可信报告,可以使用`"/{id}/reports/{reportid}"`接口的`DELETE`方法。 +**注意:** +>使用该方法将删除指定可信报告的所有信息,将无法再通过接口对该报告进行查询! + +```shell +# curl -X DELETE -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1/reports/1 +``` + +*** +若需要查询目标服务器的所有基准值,可以使用`"/{id}/basevalues"`接口的`GET`方法。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/basevalues +``` + +*** +若需要给目标服务器新增一条基准值信息,可以使用`"/{id}/newbasevalue"`接口的`POST`方法。 + +```go +type baseValueJson struct { + BaseType string `json:"basetype"` // 基准值类型 + Uuid string `json:"uuid"` // 容器或设备的标识号 + Name string `json:"name"` // 基准值名称 + Enabled bool `json:"enabled"` // 基准值是否可用 + Pcr string `json:"pcr"` // PCR值 + Bios string `json:"bios"` // BIOS值 + Ima string `json:"ima"` // IMA值 + IsNewGroup bool `json:"isnewgroup"` // 是否为一组新的基准值 +} +``` + +```shell +# curl -X POST -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1/newbasevalue -d '{"name":"test", "basetype":"host", "enabled":true, "pcr":"testpcr", "bios":"testbios", "ima":"testima", "isnewgroup":true}' +``` + +*** +若需要查询目标服务器指定基准值的详细信息,可以使用`"/{id}/basevalues/{basevalueid}"`接口的`GET`方法,其中{basevalueid}是RAS为目标服务器指定基准值分配的唯一标识号。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/1/basevalues/1 
+``` + +*** +若需要修改目标服务器指定基准值的可用状态,可以使用`"/{id}/basevalues/{basevalueid}"`接口的`POST`方法。 + +```shell +# curl -X POST -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40002/1/basevalues/1 -d '{"enabled":true}' +``` + +*** +若需要删除目标服务器指定基准值,可以使用`"/{id}/basevalues/{basevalueid}"`接口的`DELETE`方法。 +**注意:** +>使用该方法将删除指定基准值的所有信息,将无法再通过接口对该基准值进行查询! + +```shell +# curl -X DELETE -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/1/basevalues/1 +``` + +*** +若需要查询目标服务器上特定用户 TA 的可信状态,可以使用`"/{id}/ta/{tauuid}/status"`接口的GET方法。其中{id}是RAS为目标服务器分配的唯一标识号,{tauuid}是特定用户 TA 的身份标识号。 + +```shell +# curl -X GET -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40002/1/ta/test/status +``` + +*** +若需要查询目标服务器上特定用户 TA 的所有基准值信息,可以使用`"/{id}/ta/{tauuid}/tabasevalues"`接口的GET方法。 + +```shell +# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tabasevalues +``` + +*** +若需要查询目标服务器上特定用户 TA 的指定基准值的详细信息,可以使用`"/{id}/ta/{tauuid}/tabasevalues/{tabasevalueid}"`接口的GET方法。其中{tabasevalueid}是RAS为目标服务器上特定用户 TA 的指定基准值分配的唯一标识号。 + +```shell +# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tabasevalues/1 +``` + +*** +若需要修改目标服务器上特定用户 TA 的指定基准值的可用状态,可以使用`"/{id}/ta/{tauuid}/tabasevalues/{tabasevalueid}"`接口的`POST`方法。 + +```shell +# curl -X POST -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40002/1/ta/test/tabasevalues/1 --data '{"enabled":true}' +``` + +*** +若需要删除目标服务器上特定用户 TA 的指定基准值,可以使用`"/{id}/ta/{tauuid}/tabasevalues/{tabasevalueid}"`接口的`DELETE`方法。 +**注意:** +>使用该方法将删除指定基准值的所有信息,将无法再通过接口对该基准值进行查询! 
+ +```shell +# curl -X DELETE -H "Content-type: application/json" -H "Authorization: $AUTHTOKEN" -k http://localhost:40002/1/ta/test/tabasevalues/1 +``` + +*** +若需要给目标服务器上特定用户 TA 新增一条基准值信息,可以使用`"/{id}/ta/{tauuid}/newtabasevalue"`接口的`POST`方法。 + +```go +type tabaseValueJson struct { + Uuid string `json:"uuid"` // 用户 TA 的标识号 + Name string `json:"name"` // 基准值名称 + Enabled bool `json:"enabled"` // 基准值是否可用 + Valueinfo string `json:"valueinfo"` // 镜像哈希值和内存哈希值 +} +``` + +```shell +# curl -X POST -H "Content-Type: application/json" -H "Authorization: $AUTHTOKEN" -k http://localhost:40002/1/ta/test/newtabasevalue -d '{"uuid":"test", "name":"testname", "enabled":true, "valueinfo":"test info"}' +``` + +*** +若需要查询目标服务器上特定用户 TA 的所有可信报告,可以使用`"/{id}/ta/{tauuid}/tareports"`接口的`GET`方法。 + +```shell +# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tareports +``` + +*** +若需要查询目标服务器上特定用户 TA 的指定可信报告的详细信息,可以使用`"/{id}/ta/{tauuid}/tareports/{tareportid}"`接口的`GET`方法,其中{tareportid}是RAS为目标服务器上特定用户 TA 的指定可信报告分配的唯一标识号。 + +```shell +# curl -X GET -H "Content-type: application/json" http://localhost:40002/1/ta/test/tareports/2 +``` + +*** +若需要删除目标服务器上特定用户 TA 的指定可信报告,可以使用`"/{id}/ta/{tauuid}/tareports/{tareportid}"`接口的`DELETE`方法。 +**注意:** +>使用该方法将删除指定可信报告的所有信息,将无法再通过接口对该报告进行查询! 
+ +```shell +# curl -X DELETE -H "Content-type: application/json" http://localhost:40002/1/ta/test/tareports/2 +``` + +*** +若需要获取本程序的版本信息,可以使用`"/version"`接口的`GET`方法。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/version +``` + +*** +若需要查询目标服务器/RAS/数据库的配置信息,可以使用`"/config"`接口的`GET`方法。 + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40002/config +``` + +*** +若需要修改目标服务器/RAS/数据库的配置信息,可以使用`"/config"`接口的`POST`方法。 + +```go +type cfgRecord struct { + // 目标服务器配置 + HBDuration string `json:"hbduration" form:"hbduration"` + TrustDuration string `json:"trustduration" form:"trustduration"` + DigestAlgorithm string `json:"digestalgorithm" form:"digestalgorithm"` + // RAS配置 + MgrStrategy string `json:"mgrstrategy" form:"mgrstrategy"` + ExtractRules string `json:"extractrules" form:"extractrules"` + IsAllupdate *bool `json:"isallupdate" form:"isallupdate"` + LogTestMode *bool `json:"logtestmode" form:"logtestmode"` +} +``` + +```shell +# curl -X POST -H "Authorization: $AUTHTOKEN" -H "Content-Type: application/json" http://localhost:40002/config -d '{"hbduration":"5s","trustduration":"20s","DigestAlgorithm":"sha256"}' +``` + +### TAS接口 + +为了便于管理员对TAS服务的远程控制,本程序设计了以下接口可供调用: + +| 接口 | 方法 | +| --------------------| ------------------| +| /config | GET、POST | + +若需要查询TAS的配置信息,可使用`"/config"`接口的`GET`方法: + +```shell +# curl -X GET -H "Content-Type: application/json" http://localhost:40009/config +``` + +*** +若需要修改TAS的配置信息,可使用`"/config"`接口的`POST`方法: + +```shell +curl -X POST -H "Content-Type: application/json" -H "Authorization: $AUTHTOKEN" http://localhost:40009/config -d '{"basevalue":"testvalue"}' +``` + +**注意:** +>TAS的配置信息读取与修改目前仅支持基准值 + +## FAQ + +1. RAS安装后,为什么无法启动? + + >因为在当前RAS的设计逻辑中,程序启动后需要从当前目录查找一份名为 `ecdsakey.pub` 的文件进行读取并作为之后访问该程序的身份验证码,若当前目录没有该文件,则RAS启动会报错。 + >>解决方法一:运行 `ras -T` 生成测试用token后会生成 `ecdsakey.pub` 。 + >>解决方法二:自行部署oauth2认证服务后,将对应JWT token生成方对应的验证公钥保存为 `ecdsakey.pub` 。 + +2. 为什么RAS启动后,通过restapi无法访问? 
+ + >因为RAS默认以https模式启动,您需要向RAS提供合法的证书才能正常访问,而http模式下启动的RAS则不需要提供证书。 \ No newline at end of file diff --git a/docs/zh/docs/ApplicationDev/FAQ.md "b/docs/zh/docs/ApplicationDev/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" similarity index 86% rename from docs/zh/docs/ApplicationDev/FAQ.md rename to "docs/zh/docs/ApplicationDev/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" index b997844d819dd62509e039d9483c4305053927ee..4214cec89612a68eef0da120006d756a41f7facf 100644 --- a/docs/zh/docs/ApplicationDev/FAQ.md +++ "b/docs/zh/docs/ApplicationDev/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -1,6 +1,6 @@ -# FAQ +# 常见问题与解决方法 -## 部分依赖java-devel的应用程序自编译失败 +## 问题1:部分依赖java-devel的应用程序自编译失败 ### 问题描述 diff --git "a/docs/zh/docs/BiSheng-Autotuner/BiSheng-Autotuner\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/BiSheng-Autotuner/BiSheng-Autotuner\344\275\277\347\224\250\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..bb9b201919fb4c875b33b371367d9b1efdc6c87c --- /dev/null +++ "b/docs/zh/docs/BiSheng-Autotuner/BiSheng-Autotuner\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -0,0 +1,252 @@ +# BiSheng-Autotuner 使用手册 + +## BiSheng-Autotuner 介绍 + +[BiSheng-Autotuner](https://gitee.com/openeuler/BiSheng-Autotuner) 是一个基于 BiSheng-opentuner 的命令行工具,与支持调优的编译器(如 BiSheng 编译器、LLVM for openEuler、GCC for openEuler)配合使用。它负责生成搜索空间、操作参数并驱动整个调优过程。 + +[BiSheng-opentuner](https://github.com/Huawei-CPLLab/bisheng-opentuner) 是一个开源框架,用于构建特定领域的、多目标程序的自动调优器。 + +>![](./public_sys-resources/icon-note.gif) **说明:** GCC for openEuler的细粒度调优请参考 [AI4C用户使用指南](../AI4C/AI4C用户使用指南.md)。 + +## BiSheng-Autotuner 调优流程 + +调优流程(如图一所示)由两个阶段组成:初始编译阶段 (initial compilation)和调优阶段 (tuning process)。 + +![图1 BiSheng-Autotuner调优流程](figures/image1.png) + +图1 
BiSheng-Autotuner调优流程 + +### 初始编译阶段 + +初始编译阶段发生在调优开始之前,BiSheng-Autotuner首先会让编译器对目标程序代码做一次编译,在编译的过程中,编译器会生成一些包含所有可调优结构的YAML文件,告诉我们在这个目标程序中哪些结构可以用来调优,比如文件(module)、函数(function)、循环(loop)。例如,循环展开是编译器中最常见的优化方法之一,它通过多次复制循环体代码,达到增大指令调度的空间,减少循环分支指令的开销等优化效果。若以循环展开次数(unroll factor)为对象进行调优,编译器会在 YAML 文件中生成所有可被循环展开的循环作为可调优结构。 + +### 调优阶段 + +当可调优结构顺利生成之后,调优阶段便会开始: + +1. BiSheng-Autotuner 首先读取生成好的可调优结构的YAML 文件,从而产生对应的搜索空间,也就是生成针对每个可调优代码结构的具体的参数和范围。 + +2. 调优阶段会根据设定的搜索算法尝试一组参数的值,生成一个YAML格式的编译配置文件(compilation config),从而让编译器编译目标程序代码产生二进制文件。 + +3. 最后 Autotuner 将编译好的文件以用户定义的方式运行并取得性能信息作为反馈。 + +4. 经过一定数量的迭代之后,Autotuner 将找出最终的最优配置,生成最优编译配置文件,以 YAML 的形式储存。 + +## BiSheng-Autotuner 使用 + +### 环境要求 + +必选: + +- 操作系统:openEuler 24.03 LTS 系列、openEuler 25.03 及后续的 openEuler 系统版本 + +- 架构:AArch64、X86_64 + +- Python 3.11.x + +- SQLite 3.0 + +可选: + +- LibYAML:推荐安装,可提升 BiSheng-Autotuner 文件解析速度 + +### BiSheng-Autotuner 获取 + +若用户使用最新的 openEuler 系统,可以直接安装 `BiSheng-Autotuner` 和 `clang` 软件包。 + +```shell +yum install -y BiSheng-Autotuner +yum install -y clang +``` + +若需源码构建 `BiSheng-Autotuner` ,可以参考以下步骤。 + +1. 安装 [BiSheng-opentuner](https://gitee.com/openeuler/BiSheng-opentuner/tree/master) + + ```shell + yum install -y BiSheng-opentuner + ``` + +2. 克隆并安装 [BiSheng-Autotuner](https://gitee.com/openeuler/BiSheng-Autotuner/tree/master) + + ```shell + cd BiSheng-Autotuner + ./dev_install.sh + ``` + +### BiSheng-Autotuner 运行 + +我们将以 coremark 为示例展示如何运行自动调优, coremark 源码请从 [github 社区](https://github.com/eembc/coremark)获取。更多 llvm-autotune 的详细用法,请参阅[帮助信息](#帮助信息)章节。以下为以 20 次迭代调优 coremark 的脚本示例: + +``` +export AUTOTUNE_DATADIR=/tmp/autotuner_data/ +CompileCommand="clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. 
-Iposix -g -DFLAGS_STR=\"\"" + +$CompileCommand -fautotune-generate; +llvm-autotune minimize; +for i in $(seq 20) +do + $CompileCommand -fautotune ; + time=`{ /usr/bin/time -p ./coremark 0x0 0x0 0x66 300000; } 2>&1 | grep "real" | awk '{print $2}'`; + echo "iteration: " $i "cost time:" $time; + llvm-autotune feedback $time; +done +llvm-autotune finalize; +``` +以下为分步说明: + +1. 配置环境变量 + + 使用环境变量 `AUTOTUNE_DATADIR` 指定调优相关的数据的存放位置(指定目录需要为空)。 + + ``` + export AUTOTUNE_DATADIR=/tmp/autotuner_data/ + ``` + +2. 初始编译步骤 + + 添加编译器选项 `-fautotune-generate` ,编译生成可调优代码结构。 + + ``` + cd examples/coremark/ + clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -fautotune-generate + ``` + + >![](./public_sys-resources/icon-notice.gif)**注意:** + > 建议仅将此选项应用于需要重点调优的热点代码文件。若应用的代码文件过多(超过 500 个文件),则会生成数量庞大的可调优代码结构的文件,进而可能导致步骤3的初始化时间长(可长达数分钟),以及巨大的搜索空间导致的调优效果不显著、收敛时间长等问题。 + +3. 初始化调优 + + 运行 `llvm-autotune` 命令,初始化调优任务。生成最初的编译配置供下一次编译使用。 + + ``` + llvm-autotune minimize + ``` + + `minimize` 表示调优目标,旨在最小化指标(例如程序运行时间)。也可使用 `maximize` ,旨在最大化指标(例如程序吞吐量)。 + +4. 调优编译步骤 + + 添加毕昇编译器选项 `-fautotune` ,读取当前 `AUTOTUNE_DATADIR` 配置并编译。 + + ``` + clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -fautotune + ``` + +5. 
性能反馈 + + 用户运行程序,并根据自身需求获取性能数字,使用 `llvm-autotune feedback` 反馈。例如,如果我们想以 coremark 运行速度为指标进行调优,可以采用如下方式: + + ``` + time -p ./coremark 0x0 0x0 0x66 300000 2>&1 1>/dev/null | grep real | awk '{print $2}' + ``` + + ![2](figures/image2.png) + + ``` + llvm-autotune feedback 31.09 + ``` + + >![](./public_sys-resources/icon-notice.gif)**注意:** + > 建议在使用 `llvm-autotune feedback` 之前, 先验证步骤 4 编译是否正常,及编译好的程序是否运行正确。若出现编译或者运行异常的情况,请输入相应调优目标的最差值(例如,调优目标为 minimize ,可输入 `llvm-autotune feedback 9999` ;maximize 可输入 0 或者 -9999)。 + > + > 若输入的性能反馈不正确,可能会影响最终调优的结果。 + +6. 调优迭代 + + 根据用户设定的迭代次数,重复4和5进行调优迭代。 + +7. 结束调优 + + 进行多次迭代后,用户可选择终止调优,并保存最优的配置文件。配置文件会被保存在环境变量AUTOTUNE_DATADIR指定的目录下。 + + ``` + llvm-autotune finalize + ``` + +8. 最终编译 + + 使用步骤7得到最优配置文件,进行最后编译。在环境变量未改变的情况下,可直接使用-fautotune选项: + + ``` + clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -fautotune + ``` + + 或者使用 `-mllvm -auto-tuning-input=` 直接指向配置文件。 + + ``` + clang -O2 -o coremark core_list_join.c core_main.c core_matrix.c core_state.c core_util.c posix/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=300000 -I. -Iposix -g -DFLAGS_STR=\"\" -mllvm -auto-tuning-input=/tmp/autotuner_data/config.yaml + ``` + +### 帮助信息 + +llvm-autotune 执行格式如下所示: + +``` +llvm-autotune [-h] {minimize,maximize,feedback,dump,finalize} +``` + +可选指令: + +- `minimize`:初始化调优并生成初始的编译器配置文件,旨在最小化指标(例如运行时间)。 + +- `maximize`:初始化调优并生成初始的编译器配置文件,旨在最大程度地提高指标(例如吞吐量)。 + +- `feedback`:反馈性能调优结果并生成新的编译器配置。 + +- `dump`:生成当前的最优配置,而不终止调优(可继续执行 `feedback`)。 + +- `finalize`: 终止调优,并生成最佳的编译器配置(不可再执行 `feedback`)。 + +帮助信息: + +- `--help/-h` + + ``` + usage: llvm-autotune [-h] {minimize,maximize,feedback,dump,finalize} ... + + positional arguments: + {minimize,maximize,feedback,dump,finalize} + minimize Initialize tuning and generate the initial compiler + configuration file, aiming to minimize the metric + (e.g. 
run time) + maximize Initialize tuning and generate the initial compiler + configuration file, aiming to maximize the metric + (e.g. throughput) + feedback Feed back performance tuning result and generate a new + test configuration + dump Dump the current best configuration without + terminating the tuning run + finalize Finalize tuning and generate the optimal compiler + configuration + + optional arguments: + -h, --help show this help message and exit + ``` + +### 编译器相关选项 + +llvm-autotune 需要与 LLVM 编译器选项 `-fautotune-generate` 和 `-fautotune` 配合使用。 + +- `-fautotune-generate`: + + - 在 `autotune_datadir` 目录下生成可调优的代码结构列表,此默认目录可由环境变量 `AUTOTUNE_DATADIR` 改写。 + + - 作为调优准备工作的第一步,通常需要在 `llvm-autotune minimize/maximize` 命令执行前使用。 + + - 此选项还可以赋值来改变调优的颗粒度(可选值为`Other`、`Function`、`Loop`、`CallSite`, `MachineBasicBlock`、`Switch`、`LLVMParam`、`ProgramParam`,其中 `LLVMParam` 和 `ProgramParam` 对应粗粒度选项调优)。例如 `-fautotune-generate=Loop` 会开启类型仅为循环的可调优代码结构,每个循环在调优过程中会被赋予不同的参数值;而 `Other` 表示全局,生成的可调优代码结构对应每个编译单元(代码文件)。 + + - `-fautotune-generate`默认等效于`-fautotune-generate=Function,Loop,CallSite`。通常建议使用默认值。 + + - 若要启用选项调优(`LLVMParam`和`ProgramParam`),需要为 llvm-autotune 指定拓展搜索空间,默认的搜索空间不包含预设调优选项。 + + ``` + llvm-autotune minimize --search-space /usr/lib64/python/site-packages/autotuner/search_space_config/extended_search_space.yaml + ``` + + `site-packages`目录可以通过 `pip show autotuner` 指令找到。 + +- -fautotune: + + - 使用`autotune_datadir`下的编译器配置进行调优编译(此默认目录可由环境变量`AUTOTUNE_DATADIR`改写); + + - 通常在调优迭代过程中,`llvm-autotune minimize/maximize/feedback` 命令之后使用。 \ No newline at end of file diff --git a/docs/zh/docs/BiSheng-Autotuner/figures/image1.png b/docs/zh/docs/BiSheng-Autotuner/figures/image1.png new file mode 100644 index 0000000000000000000000000000000000000000..10e002c1402574e3e5b1f9d4f050efb4f439c22e Binary files /dev/null and b/docs/zh/docs/BiSheng-Autotuner/figures/image1.png differ diff --git a/docs/zh/docs/BiSheng-Autotuner/figures/image2.png b/docs/zh/docs/BiSheng-Autotuner/figures/image2.png new file 
mode 100644 index 0000000000000000000000000000000000000000..0953ed58710d150ff4fec8fad4410dc62206ba28 Binary files /dev/null and b/docs/zh/docs/BiSheng-Autotuner/figures/image2.png differ diff --git "a/docs/zh/docs/CPDS/CPDS\344\273\213\347\273\215.md" "b/docs/zh/docs/CPDS/CPDS\344\273\213\347\273\215.md" index 32db5d854b367cd149dcedfbfaa0767146d8d18c..09478e1242efa02be97c3a539a5c5cc7256e14a2 100644 --- "a/docs/zh/docs/CPDS/CPDS\344\273\213\347\273\215.md" +++ "b/docs/zh/docs/CPDS/CPDS\344\273\213\347\273\215.md" @@ -12,7 +12,7 @@ CPDS (Container Problem Detect System) 容器故障检测系统,是由北京 **2. 集群异常检测** -采集各节点原始数据,基于异常规则对采集的原始数据进行异常检测,提取关键信息。同时基于异常规则对采集数据进行异常检测,后将检测结果数据和原始据进行在线上传,并同步进行持久化操作。 +采集各节点原始数据,基于异常规则对采集的原始数据进行异常检测,提取关键信息。同时基于异常规则对采集数据进行异常检测,后将检测结果数据和原始数据进行在线上传,并同步进行持久化操作。 **3. 节点、业务容器故障/亚健康诊断** diff --git "a/docs/zh/docs/CPDS/\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/CPDS/\344\275\277\347\224\250\346\211\213\345\206\214.md" index 87f563c8f6ca7cf031030cdb98baf627371b8211..8e30500797a04e92156d50779e1bf8141d846a66 100644 --- "a/docs/zh/docs/CPDS/\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/CPDS/\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -61,7 +61,7 @@ CPDS页面布局分为导航栏、导航菜单、操作区。 | ---- | ---- | | 容器健康状态 | 显示集群中运行中的容器个数占全部容器个数的百分比,并显示全部容器、运行中的容器、停止的容器的个数。 | | 集群节点状态 | 显示在线节点占全部节点的百分比,并显示全部节点、在线节点、离线节点的个数。 | -| 集群资源用量 | 显示集群 CUP、内容、磁盘的使用的量、总量和使用百分比。 | +| 集群资源用量 | 显示集群 CPU、内存、磁盘的使用的量、总量和使用百分比。 | | 节点监控状态 | 显示集群节点的 ip 地址、节点状态、节点运行容器数量占比。点击下方的查看更多,会跳转至“监控告警-节点健康”,可以查看更详细的节点信息。 | | 诊断结果 | 显示触发规则的名称、当前状态、规则第一次触发的时间,以及后续触发的最新时间。点击下方的查看更多,会跳转至“健康诊断-诊断结果”,查看更详细的诊断结果。 | diff --git "a/docs/zh/docs/CTinspector/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/CTinspector/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index c41d2f9fc608806f0a37a07954221cca31da0163..aa22af4b62f041e2eb58834a50af5c803d0bf0d5 100644 --- 
"a/docs/zh/docs/CTinspector/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/CTinspector/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -1,9 +1,5 @@ # 安装与部署 -## 软件要求 - -* 操作系统:openEuler 23.09 - ## 硬件要求 * x86_64架构 diff --git "a/docs/zh/docs/CVE-ease/CVE-ease\344\273\213\347\273\215\345\222\214\345\256\211\350\243\205\350\257\264\346\230\216.md" "b/docs/zh/docs/CVE-ease/CVE-ease\344\273\213\347\273\215\345\222\214\345\256\211\350\243\205\350\257\264\346\230\216.md" index 4397783bdddb3579b86d628b17824cb274d00df9..b1347d0cebc7f1a9d907088fb7a9d9cf386684ff 100644 --- "a/docs/zh/docs/CVE-ease/CVE-ease\344\273\213\347\273\215\345\222\214\345\256\211\350\243\205\350\257\264\346\230\216.md" +++ "b/docs/zh/docs/CVE-ease/CVE-ease\344\273\213\347\273\215\345\222\214\345\256\211\350\243\205\350\257\264\346\230\216.md" @@ -168,7 +168,7 @@ db_user = db_password = db_host = db_port = -product = openEuler-23.09 +product = openEuler-{version} expiration_days = 14 # notifier diff --git a/docs/zh/docs/CertSignature/figures/cert-tree.png b/docs/zh/docs/CertSignature/figures/cert-tree.png new file mode 100644 index 0000000000000000000000000000000000000000..cfbea157cb7b7308d668196ca5b0b0386067fd9e Binary files /dev/null and b/docs/zh/docs/CertSignature/figures/cert-tree.png differ diff --git a/docs/zh/docs/CertSignature/figures/mokutil-db.png b/docs/zh/docs/CertSignature/figures/mokutil-db.png new file mode 100644 index 0000000000000000000000000000000000000000..82dbe6e04cafe3e9ac039ba19acd5996d4cf2259 Binary files /dev/null and b/docs/zh/docs/CertSignature/figures/mokutil-db.png differ diff --git a/docs/zh/docs/CertSignature/figures/mokutil-sb-off.png b/docs/zh/docs/CertSignature/figures/mokutil-sb-off.png new file mode 100644 index 0000000000000000000000000000000000000000..f3018c9fd0236e9c2cf560f0da3827ed2a877f6d Binary files /dev/null and b/docs/zh/docs/CertSignature/figures/mokutil-sb-off.png differ diff --git 
a/docs/zh/docs/CertSignature/figures/mokutil-sb-on.png b/docs/zh/docs/CertSignature/figures/mokutil-sb-on.png new file mode 100644 index 0000000000000000000000000000000000000000..449b6774dc61a601cf884845fbd0be5d314108e1 Binary files /dev/null and b/docs/zh/docs/CertSignature/figures/mokutil-sb-on.png differ diff --git a/docs/zh/docs/CertSignature/figures/mokutil-sb-unsupport.png b/docs/zh/docs/CertSignature/figures/mokutil-sb-unsupport.png new file mode 100644 index 0000000000000000000000000000000000000000..525c72f78b897ffaba0d356406ab9d9e64024d91 Binary files /dev/null and b/docs/zh/docs/CertSignature/figures/mokutil-sb-unsupport.png differ diff --git "a/docs/zh/docs/CertSignature/\345\256\211\345\205\250\345\220\257\345\212\250.md" "b/docs/zh/docs/CertSignature/\345\256\211\345\205\250\345\220\257\345\212\250.md" new file mode 100644 index 0000000000000000000000000000000000000000..1ee911d4009e22aff405c7b049d765b8c866c883 --- /dev/null +++ "b/docs/zh/docs/CertSignature/\345\256\211\345\205\250\345\220\257\345\212\250.md" @@ -0,0 +1,70 @@ +# 安全启动 + +## 概述 +安全启动(Secure Boot)就是利用公私钥对启动部件进行签名和验证。在启动过程中,前一个部件验证后一个部件的数字签名,如果能验证通过,则运行后一个部件;如果验证不通过,则暂停启动。通过安全启动可以保证系统启动过程中各个部件的完整性,防止没有经过认证的部件被加载运行,从而防止对系统及用户数据产生安全威胁。 + +安全启动涉及的验证组件: BIOS->shim->grub->vmlinuz(依次验签通过并加载),其中vmlinuz是内核镜像。 + +相关的EFI启动组件采用signcode方式进行签名。公钥证书由BIOS集成到签名数据库DB中,启动过程中BIOS对shim进行验证,shim和grub组件从BIOS的签名数据库DB中获取公钥证书并对下一级组件进行验证。 + +## 背景和解决方案 +前期openEuler版本中,安全启动相关组件没有签名,无法直接使用安全启动功能保障系统组件的完整性。 + +从22.03-LTS-SP3版本开始,openEuler使用社区签名平台对OS侧的相关组件进行签名,包括grub和vmlinuz组件,并将社区签名根证书内嵌于shim组件中。 + +从24.03-LTS版本开始,openEuler提供了CFCA签名的安全启动组件。 + +## 使能安全启动 + +### 前置条件 +- 已安装openEuler-22.03-LTS-SP3及以上版本(若使用CFCA安全启动,则需安装openEuler-24.03-LTS及以上版本) +- 配置openEuler-everything源 +- 系统配置为uefi启动方式 + +### 使用步骤 +**步骤1:** 获取安全启动证书 + +如采用openEuler证书,则在如下网址获取:,进入“证书中心”目录下载。网页上根证书识别名称为“openEuler Shim Default CA”,default-x509ca.cert。 + +如采用CFCA官网获取根证书,则在如下网址获取:(当前证书暂未发布,如需使用,可联系openEuler安全委员会openeuler-security@openeuler.org) + +**步骤2:** 
将获取的证书放置到/boot/efi/EFI目录: +``` +mv <证书文件> /boot/efi/EFI/ +``` + +**步骤3:** 安装shim-signed子包(若不使用CFCA签名的安全启动组件则跳过): +``` +yum install -y shim-signed +``` + +**步骤4:** 如使用openEuler签名的shim文件,则跳过该步骤;如使用CFCA的shim文件,则需要按照如下步骤备份以及替换: +``` +mv /boot/efi/EFI/openEuler/shimx64.efi /boot/efi/EFI/openEuler/shimx64_bck.efi +mv /boot/efi/EFI/BOOT/BOOTX64.EFI /boot/efi/EFI/BOOT/BOOTX64_bck.EFI +cp /boot/efi/EFI/BOOT/BOOTX64_CFCA.EFI /boot/efi/EFI/BOOT/BOOTX64.EFI +cp /boot/efi/EFI/BOOT/BOOTX64_CFCA.EFI /boot/efi/EFI/openEuler/shimx64.efi +``` + +**步骤5:** 将根证书导入BIOS的db证书库中,并在BIOS中开启安全启动开关,可实现安全启动功能。BIOS证书导入方法及安全启动开启方法可参考具体BIOS厂商提供的资料。 + +**步骤6:** 重启后,查看系统安全启动状态: +``` +mokutil --sb +``` +- SecureBoot disabled:安全启动关闭 + +![](./figures/mokutil-sb-off.png) + +- SecureBoot enabled:安全启动开启 + +![](./figures/mokutil-sb-on.png) + +- not supported:系统不支持安全启动 + +![](./figures/mokutil-sb-unsupport.png) + +## 约束限制 +- **软件限制**:OS系统需要采用UEFI启动 +- **架构限制**:ARM/X86 +- **硬件约束**:需要BIOS支持安全启动相关校验功能 \ No newline at end of file diff --git "a/docs/zh/docs/CertSignature/\346\200\273\344\275\223\346\246\202\350\277\260.md" "b/docs/zh/docs/CertSignature/\346\200\273\344\275\223\346\246\202\350\277\260.md" new file mode 100644 index 0000000000000000000000000000000000000000..80028df2a1176b23b01be0ed3c754846fe32dbea --- /dev/null +++ "b/docs/zh/docs/CertSignature/\346\200\273\344\275\223\346\246\202\350\277\260.md" @@ -0,0 +1,27 @@ +# 认识证书和签名 + +## 概述 +数字签名是保护操作系统完整性的重要技术。通过对系统关键组件添加签名,并在后续的组件启动加载、运行访问等流程中进行签名验证,可以有效检查组件的完整性,避免组件被篡改而导致的安全问题。业界已支持多种系统完整性保护机制,在系统运行的各个阶段对不同类型的组件进行完整性保护,典型的技术机制有: + +- 安全启动; +- 内核模块签名; +- IMA完整性度量架构; +- RPM签名验证。 + +上述完整性保护的安全机制都需要依赖签名(通常需要在组件发布阶段集成),而开源社区普遍缺乏签名私钥和证书管理机制,因此社区发布的操作系统发行版通常不提供默认签名,或仅使用构建阶段临时生成的私钥进行签名。往往需要用户或者下游OSV厂商进行二次签名后才可开启这些完整性保护安全机制,增加了安全功能的使用成本并降低了易用性。 + +## 解决方案 +openEuler社区基础设施支持签名服务,通过签名平台统一管理签名私钥和证书,并与EulerMaker构建平台结合,实现在社区发行版的软件包构建过程中对关键文件进行自动签名。当前支持的文件类型有: + +- EFI文件; +- 内核模块文件; +- IMA摘要列表文件; +- RPM软件包。 + +## 约束限制 +openEuler社区的签名服务功能存在如下约束限制: + +- 
当前仅支持为openEuler社区官方发布分支进行签名,暂不支持对个人构建分支进行签名; +- 当前仅支持对OS安全启动相关的EFI文件进行签名,包括shim、grub、kernel文件; +- 当前仅支持对kernel软件包提供的内核模块文件进行签名。 + diff --git "a/docs/zh/docs/CertSignature/\347\255\276\345\220\215\350\257\201\344\271\246\344\273\213\347\273\215.md" "b/docs/zh/docs/CertSignature/\347\255\276\345\220\215\350\257\201\344\271\246\344\273\213\347\273\215.md" new file mode 100644 index 0000000000000000000000000000000000000000..2023390d9d33df4bdcc7e5cae5cbe63e6e47c907 --- /dev/null +++ "b/docs/zh/docs/CertSignature/\347\255\276\345\220\215\350\257\201\344\271\246\344\273\213\347\273\215.md" @@ -0,0 +1,47 @@ +# 签名证书介绍 + +openEuler当前支持两种签名机制:openPGP和CMS,分别用于不同的文件类型: + +| 文件类型 | 签名类型 | 签名格式 | +| --------------- | ------------ | -------- | +| EFI文件 | authenticode | CMS | +| 内核模块文件 | modsig | CMS | +| IMA摘要列表文件 | modsig | CMS | +| RPM软件包 | RPM | openPGP | + +## openPGP证书签名 + +openEuler通过openPGP证书实现RPM软件包的签名,签名证书随操作系统镜像发布,用户可通过两种方式获取到当前openEuler版本所使用的证书: + +方法一:通过REPO源下载,以openEuler 24.03 LTS版本为例,可通过如下路径下载: + +``` +https://repo.openeuler.org/openEuler-24.03-LTS/OS/aarch64/RPM-GPG-KEY-openEuler +``` + +方法二:进入系统通过指定路径获取: + +``` +cat /etc/pki/rpm-gpg/RPM-GPG-KEY-openEuler +``` + +## CMS证书签名 + +openEuler签名平台采用三级证书链管理签名的私钥和证书: + +![](./figures/cert-tree.png) + +根据不同等级的证书分别具有不同的有效期,当前规划为: + +| 证书类型 | 有效期 | +| -------- | ------ | +| 根证书 | 30年 | +| 二级证书 | 10年 | +| 三级证书 | 3年 | + +openEuler根证书可通过社区证书中心下载: + +``` +https://www.openeuler.org/zh/security/certificate-center/ +``` + diff --git "a/docs/zh/docs/Container/CRI\346\216\245\345\217\243.md" "b/docs/zh/docs/Container/CRI-v1alpha2\346\216\245\345\217\243.md" similarity index 99% rename from "docs/zh/docs/Container/CRI\346\216\245\345\217\243.md" rename to "docs/zh/docs/Container/CRI-v1alpha2\346\216\245\345\217\243.md" index 0695f784358f8ee6126124e6eb2b0ba6dc03418b..9d1ac2bd92929e7d10ef959c79c1ae0cd8b5fd96 100644 --- "a/docs/zh/docs/Container/CRI\346\216\245\345\217\243.md" +++ 
"b/docs/zh/docs/Container/CRI-v1alpha2\346\216\245\345\217\243.md" @@ -1,4 +1,4 @@ -# CRI接口 +# CRI V1alpha2 接口 ## 描述 @@ -6,7 +6,7 @@ CRI API 接口是由kubernetes 推出的容器运行时接口,CRI定义了容 因为容器运行时与镜像的生命周期是彼此隔离的,因此需要定义两个服务。该接口使用[Protocol Buffer](https://developers.google.com/protocol-buffers/)定义,基于[gRPC](https://grpc.io/)。 -当前实现CRI版本为v1alpha1版本,官方API描述文件如下: +当前iSulad使用默认CRI版本为v1alpha2版本,官方API描述文件如下: [https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto](https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto), diff --git "a/docs/zh/docs/Container/CRI-v1\346\216\245\345\217\243.md" "b/docs/zh/docs/Container/CRI-v1\346\216\245\345\217\243.md" new file mode 100644 index 0000000000000000000000000000000000000000..55202eb71688de94b726bda2b07be6f86761bca7 --- /dev/null +++ "b/docs/zh/docs/Container/CRI-v1\346\216\245\345\217\243.md" @@ -0,0 +1,312 @@ +# CRI V1接口支持 + +## 概述 + +CRI(Container Runtime Interface, 容器运行时接口)是kublet与容器引擎通信使用的主要协议。 +在K8S 1.25及之前,K8S存在CRI v1alpha2 和 CRI V1两种版本的CRI接口,但从1.26开始,K8S仅提供对于CRI V1的支持。 + +iSulad同时提供对[CRI v1alpha2](./CRI-v1alpha2接口.md)和CRI v1的支持, +对于CRI v1,iSulad支持[CRI v1alpha2](./CRI-v1alpha2接口.md)所述功能, +并提供对CRI V1中所定义新接口和字段的支持。 + +目前iSulad支持的CRI V1版本为1.29,对应官网描述API如下: + +[https://github.com/kubernetes/cri-api/blob/kubernetes-1.29.0/pkg/apis/runtime/v1/api.proto](https://github.com/kubernetes/cri-api/blob/kubernetes-1.29.0/pkg/apis/runtime/v1/api.proto) + +iSulad使用的API描述文件,与官方API略有出入,以本文档描述的接口为准。 + +## 新增字段描述 + +- **CgroupDriver** + + cgroup驱动的enum值列表 + + + + + + + + + + + + + +

+    | 参数成员 | 描述 |
+    | ------------ | ------------------ |
+    | SYSTEMD = 0 | systemd cgroup驱动 |
+    | CGROUPFS = 1 | cgroupfs驱动 |
+ +- **LinuxRuntimeConfiguration** + + 容器引擎所使用的cgroup驱动 + + + + + + + + + + +

+    | 参数成员 | 描述 |
+    | -------------------------- | -------------------------------- |
+    | CgroupDriver cgroup_driver | 容器引擎所使用的cgroup驱动枚举值 |
+ +- **ContainerEventType** + + 容器事件类型枚举值 + + + + + + + + + + + + + + + + + + + +

+    | 参数成员 | 描述 |
+    | --------------------------- | ------------ |
+    | CONTAINER_CREATED_EVENT = 0 | 容器创建类型 |
+    | CONTAINER_STARTED_EVENT = 1 | 容器启动类型 |
+    | CONTAINER_STOPPED_EVENT = 2 | 容器停止类型 |
+    | CONTAINER_DELETED_EVENT = 3 | 容器删除类型 |
+ +- **SwapUsage** + + 虚拟内存使用情况 + + + + + + + + + + + + + + + + +

+    | 参数成员 | 描述 |
+    | -------------------------------- | -------------------- |
+    | int64 timestamp | 时间戳信息 |
+    | UInt64Value swap_available_bytes | 可使用虚拟内存字节数 |
+    | UInt64Value swap_usage_bytes | 已使用虚拟内存字节数 |
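上述新增字段可以用如下示意性代码建模,帮助理解各枚举取值与字段含义(仅为便于理解的示意,并非iSulad实现或CRI官方SDK,枚举取值参照CRI V1 api.proto):

```python
from dataclasses import dataclass
from enum import IntEnum

# 示意:CRI V1新增的cgroup驱动枚举(取值与api.proto一致)
class CgroupDriver(IntEnum):
    SYSTEMD = 0
    CGROUPFS = 1

# 示意:CRI V1新增的容器事件类型枚举
class ContainerEventType(IntEnum):
    CONTAINER_CREATED_EVENT = 0
    CONTAINER_STARTED_EVENT = 1
    CONTAINER_STOPPED_EVENT = 2
    CONTAINER_DELETED_EVENT = 3

# 示意:ContainerStats新增的虚拟内存使用情况
@dataclass
class SwapUsage:
    timestamp: int              # 时间戳信息
    swap_available_bytes: int   # 可使用虚拟内存字节数
    swap_usage_bytes: int       # 已使用虚拟内存字节数

usage = SwapUsage(timestamp=0,
                  swap_available_bytes=1 << 30,
                  swap_usage_bytes=256 << 20)
print(CgroupDriver.SYSTEMD.value, ContainerEventType.CONTAINER_DELETED_EVENT.value)
print(usage.swap_usage_bytes)
```

实际使用中,这些字段由gRPC生成的protobuf消息承载,此处仅展示字段间的对应关系。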
+ +## 新增接口描述 + +### RuntimeConfig + +#### 接口原型 + +```text +rpc RuntimeConfig(RuntimeConfigRequest) returns (RuntimeConfigResponse) {} +``` + +#### 接口描述 + +获取cgroup驱动配置 cgroupfs 或 systemd-cgroup + +#### 参数 RuntimeConfigRequest + +无字段 + +#### 返回值 RuntimeConfigResponse + + + + + + + + + +

+| 返回值 | 描述 |
+| ------------------------------- | -------------------------------------------------- |
+| LinuxRuntimeConfiguration linux | 描述cgroupfs或者systemd-cgroup的CgroupDriver枚举值 |
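RuntimeConfig返回的cgroup驱动取决于iSulad的systemd-cgroup配置项,两者的对应关系可用如下示意代码说明(仅为示意,并非iSulad源码;daemon.json中的键名参照本文"使用手册"章节):

```python
import json

SYSTEMD, CGROUPFS = 0, 1  # CgroupDriver枚举值

# 示意:根据daemon.json中systemd-cgroup配置推导RuntimeConfig应返回的驱动
def cgroup_driver_from_config(daemon_json: str) -> int:
    cfg = json.loads(daemon_json)
    # 未配置systemd-cgroup时,iSulad默认使用cgroupfs驱动
    return SYSTEMD if cfg.get("systemd-cgroup", False) else CGROUPFS

print(cgroup_driver_from_config('{"group": "isula", "systemd-cgroup": true}'))  # 0
print(cgroup_driver_from_config('{"group": "isula"}'))                          # 1
```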
+ +### GetContainerEvents + +#### 接口原型 + +```text +rpc GetContainerEvents(GetEventsRequest) returns (stream ContainerEventResponse) {} +``` + +#### 接口描述 + +获取Pod生命周期事件流 + +#### 参数 GetEventsRequest + +无字段 + +#### 返回值 ContainerEventResponse + + + + + + + + + + + + + + + + + + + + + +

+| 返回值 | 描述 |
+| -------------------------------------------- | --------------------------------- |
+| string container_id | 容器id |
+| ContainerEventType container_event_type | 容器事件类型 |
+| int64 created_at | 容器事件产生时间 |
+| PodSandboxStatus pod_sandbox_status | 容器所属Pod的status信息 |
+| repeated ContainerStatus containers_statuses | 容器所属Pod内所有容器的status信息 |
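GetContainerEvents以流的形式持续返回事件,客户端按事件类型逐条处理即可。以下为消费该事件流的一个极简示意(使用模拟数据,并非真实gRPC流;字段名参照上文表格,事件类型取值参照CRI V1 api.proto):

```python
from typing import Iterator, List

# 示意:模拟一个Pod生命周期事件流:创建(0) -> 启动(1) -> 停止(2) -> 删除(3)
def fake_event_stream() -> Iterator[dict]:
    for event_type in (0, 1, 2, 3):
        yield {
            "container_id": "abc123",
            "container_event_type": event_type,
            "created_at": 1700000000 + event_type,
        }

# 消费事件流,筛选出停止与删除事件
def collect_stopped_or_deleted(stream: Iterator[dict]) -> List[dict]:
    return [e for e in stream if e["container_event_type"] in (2, 3)]

events = collect_stopped_or_deleted(fake_event_stream())
print(len(events))  # 2
```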
+ +## 变更描述 + +### CRI V1.29更新变更描述 + +#### [获取cgroup驱动配置](https://github.com/kubernetes/kubernetes/pull/118770) + +`RuntimeConfig` 获取cgroup驱动配置 cgroupfs 或 systemd-cgroup + +#### [GetContainerEvents支持pod生命周期事件](https://github.com/kubernetes/kubernetes/pull/111384) + +`GetContainerEents`,提供对pod生命周期相关事件流 + +`PodSandboxStatus`有相应调整,增加ContainerStatuses提供沙箱内容器status信息 + +#### [ContainerStats虚拟内存信息](https://github.com/kubernetes/kubernetes/pull/118865) + +`ContainerStats`新增虚拟内存使用情况信息: `SwapUsage` + +#### [ContainerStatus reason字段OOMKilled设置](https://github.com/kubernetes/kubernetes/pull/112977) + +ContainerStatus中reason字段在cgroup out-of-memory时应该设置为OOMKilled + +#### [PodSecurityContext.SupplementalGroups描述修改](https://github.com/kubernetes/kubernetes/pull/113047) + +描述修改,优化`PodSecurityContext.SupplementalGroups`的注释,明确容器镜像定义的主UID不在该列表下的行为 + +#### [ExecSync输出限制](https://github.com/kubernetes/kubernetes/pull/110435) + +ExecSync返回值输出小于16MB + +## 使用手册 + +### 配置iSulad支持CRI V1 + +该需求需要iSulad对K8S新版本CRI接口1.29提供支持, + +对于1.25及之前的CRI接口,V1alpha2和V1功能保持一致,1.26及之后新增的特性仅在CRI V1中提供支持。 +此次升级的功能和特性仅在CRI V1中提供支持,因此新增特性均需要按照以下配置使能CRI V1。 + +CRI V1使能: + +iSulad daemon.json中enable-cri-v1设置为true,重启iSulad + +```json +{ + "group": "isula", + "default-runtime": "runc", + ... + "enable-cri-v1": true +} +``` + +若通过源码进行编译安装iSulad需开启ENABLE_CRI_API_V1编译选项 + +```bash +cmake ../ -D ENABLE_CRI_API_V1=ON +``` + +### RuntimeConfig获取cgroup驱动配置 + +#### systemd-cgroup配置 + +iSulad同时提供对systemd和cgroupfs两种cgroup驱动支持, +默认使用cgroupfs作为cgroup驱动,可以通过配置iSulad容器引擎提供对systemd cgroup驱动支持。 +iSulad仅提供底层运行时为runc时systemd-cgroup的支持。通过修改iSulad配置文件daemon.json, +设置systemd-cgroup为true,重启iSulad,则使用systemd cgroup驱动。 + +```json +{ + "group": "isula", + "default-runtime": "runc", + ... + "enable-cri-v1": true, + "systemd-cgroup": true +} +``` + +### GetContainerEvents Pod 生命周期事件生成 + +#### Pod Events配置 + +修改iSulad配置文件daemon.json, +设置enable-pod-events为true,重启iSulad。 + +```json +{ + "group": "isula", + "default-runtime": "runc", + ... 
+ "enable-cri-v1": true, + "enable-pod-events": true +} +``` + +## 使用限制 + +1. 以上新增特性,iSulad仅提供容器运行时设置为runc时的支持。 +2. 由于cgroup oom会同时触发容器cgroup路径删除,若iSulad对oom事件处理发生在 +cgroup路径删除之后,iSulad则无法成功捕捉容器oom事件, +可能导致ContainerStatus中reason字段设置不正确。 +3. iSulad不支持交叉使用不同的cgroup驱动管理容器,启动容器后iSulad的cgroup驱动配置不应该发生变化。 diff --git "a/docs/zh/docs/Container/iSula-shim-v2\345\257\271\346\216\245stratovirt.md" "b/docs/zh/docs/Container/iSula-shim-v2\345\257\271\346\216\245stratovirt.md" index 1be4f4bb5bc65156e7e856f8ec08ebe75c19ff46..c604fb8e6adff38c1fc7c610552f3eaf1b11cf28 100755 --- "a/docs/zh/docs/Container/iSula-shim-v2\345\257\271\346\216\245stratovirt.md" +++ "b/docs/zh/docs/Container/iSula-shim-v2\345\257\271\346\216\245stratovirt.md" @@ -198,14 +198,14 @@ containerd-shim-kata-v2 使用的虚拟化组件为 StratoVirt 时,iSula 对 $ lsmod |grep vhost_vsock ``` - 下载对应版本和架构的 kernel 并放到 /var/lib/kata/ 路径下, 如下载 openEuler 21.03 版本 x86 架构的内核 [openeuler repo](): + 下载对应版本和架构的 kernel 并放到 /var/lib/kata/ 路径下, [openeuler repo](): ```bash $ cd /var/lib/kata - $ wget https://repo.openeuler.org/openEuler-21.03/stratovirt_img/x86_64/vmlinux.bin + $ wget https://repo.openeuler.org/openEuler-{version}/stratovirt_img/x86_64/vmlinux.bin ``` -3. 使用 busybox 镜像运行安全容器并检查使用的 runtime 为 io.containerd.kata.v2 +3. 
使用 busybox 镜像运行安全容器并检查使用的 runtime 为 io.containerd.kata.v2 ```bash $ id=`isula run -tid busybox sh` @@ -218,7 +218,3 @@ containerd-shim-kata-v2 使用的虚拟化组件为 StratoVirt 时,iSula 对 ```bash $ ps -ef | grep stratovirt ``` - - - - diff --git "a/docs/zh/docs/Container/iSulad\346\224\257\346\214\201CDI.md" "b/docs/zh/docs/Container/iSulad\346\224\257\346\214\201CDI.md" new file mode 100644 index 0000000000000000000000000000000000000000..f066dd81d84eff824c1930fe1ecd70f96866b0d2 --- /dev/null +++ "b/docs/zh/docs/Container/iSulad\346\224\257\346\214\201CDI.md" @@ -0,0 +1,120 @@ +# iSulad支持CDI + +## 概述 + +CDI(Container Device Interface,容器设备接口)是容器运行时的一种规范,用于支持第三方设备。 + +CDI解决了如下问题: +在Linux上,为了使容器具有设备感知能力,过去只需在该容器中暴露一个设备节点。但是,随着设备和软件变得越来越复杂,供应商希望执行更多的操作,例如: + +- 向容器公开设备可能需要公开多个设备节点、从运行时命名空间挂载文件或隐藏procfs条目。 +- 执行容器和设备之间的兼容性检查(例如:检查容器是否可以在指定设备上运行)。 +- 执行特定于运行时的操作(例如:虚拟机与基于Linux容器的运行时)。 +- 执行特定于设备的操作(例如:清理GPU的内存或重新配置FPGA)。 + +在缺乏第三方设备标准的情况下,供应商通常不得不为不同的运行时编写和维护多个插件,甚至直接在运行时中贡献特定于供应商的代码。此外,运行时不统一地暴露插件系统(甚至根本不暴露插件系统),导致在更高级别的抽象(例如Kubernetes设备插件)中重复功能。 + +CDI解决上述问题的方法: +CDI描述了一种允许第三方供应商与设备交互的机制,从而不需要更改容器运行时。 + +使用的机制是一个JSON文件(类似于容器网络接口(CNI)),它允许供应商描述容器运行时应该对容器的OCI规范执行的操作。 + +iSulad目前已支持[CDI v0.6.0](https://github.com/cncf-tags/container-device-interface/blob/v0.6.0/SPEC.md)规范。 + +## 配置iSulad支持CDI + +需要对daemon.json做如下配置,然后重启iSulad: + +```json +{ + ... 
+ "enable-cri-v1": true, + "cdi-spec-dirs": ["/etc/cdi", "/var/run/cdi"], + "enable-cdi": true +} +``` + +其中"cdi-spec-dirs"用于指定CDI specs所在目录,如果不指定则默认为"/etc/cdi", "/var/run/cdi"。 + +## 使用示例 + +### CDI specification实例 + +具体每个字段含义详见[CDI v0.6.0](https://github.com/cncf-tags/container-device-interface/blob/v0.6.0/SPEC.md) + +```bash +$ mkdir /etc/cdi +$ cat > /etc/cdi/vendor.json < ![](./public_sys-resources/icon-note.gif) **说明:** +> +> 安装完成后,需要手工启动isula-build服务。启动请参见[管理服务](isula-build构建工具.md#管理服务)。 + +# 配置与管理服务 + +## 配置服务 + +在安装完 isula-build 软件包之后,systemd 管理服务会以 isula-build 软件包自带的 isula-build 服务端默认配置启动 isula-build 服务。如果 isula-build 服务端的默认配置文件不能满足用户的需求,可以参考如下介绍进行定制化配置。需要注意的是,修改完默认配置之后,需要重启 isula-build 服务端使新配置生效,具体操作可参考下一章节。 + +目前 isula-build 服务端包含如下配置文件: + +* /etc/isula-build/configuration.toml:isula-builder 总体配置文件,用于设置 isula-builder 日志级别、持久化目录和运行时目录、OCI runtime等。其中各参数含义如下: + +| 配置项 | 是否可选 | 配置项含义 | 配置项取值 | +| --------- | -------- | --------------------------------- | ----------------------------------------------- | +| debug | 可选 | 设置是否打开debug日志 | true:打开debug日志
false:关闭debug日志 | +| loglevel | 可选 | 设置日志级别 | debug
info
warn
error | +| run_root | 必选 | 设置运行时数据根目录 | 运行时数据根目录路径,例如/var/run/isula-build/ | +| data_root | 必选 | 设置本地持久化目录 | 本地持久化目录路径,例如/var/lib/isula-build/ | +| runtime | 可选 | 设置runtime种类,目前仅支持runc | runc | +| group | 可选 | 设置本地套接字isula_build.sock文件属组使得加入该组的非特权用户可以操作isula-build | isula | +| experimental | 可选 | 设置是否开启实验特性 | true:开启实验特性;false:关闭实验特性 | + +* /etc/isula-build/storage.toml: 本地持久化存储的配置文件,包含所使用的存储驱动的配置。 + +| 配置项 | 是否可选 | 配置项含义 | +| ------ | -------- | ------------------------------ | +| driver | 可选 | 存储驱动类型,目前支持overlay2 | + + 更多设置可参考 [containers-storage.conf.5](https://github.com/containers/storage/blob/main/docs/containers-storage.conf.5.md)。 + +* /etc/isula-build/registries.toml : 针对各个镜像仓库的配置文件。 + +| 配置项 | 是否可选 | 配置项含义 | +| ------------------- | -------- | ------------------------------------------------------------ | +| registries.search | 可选 | 镜像仓库搜索域,在此list的镜像仓库可以被感知,不在此列的不被感知。 | +| registries.insecure | 可选 | 可访问的不安全镜像仓库地址,在此列表中的镜像仓库将不会通过鉴权,不推荐使用。 | + + 更多设置可参考 [containers-registries.conf.5](https://github.com/containers/image/blob/main/docs/containers-registries.conf.5.md)。 + +* /etc/isula-build/policy.json:镜像pull/push策略文件。当前不支持对其进行配置。 + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> * isula-build 支持最大 1MiB 的上述配置文件。 +> * isula-build 不支持将持久化工作目录 dataroot 配置在内存盘上,比如 tmpfs。 +> * isula-build 目前仅支持使用overlay2为底层 graphdriver。 +> * 在设置--group参数前,需保证本地OS已经创建了对应的用户组,且非特权用户已经加入该组。重启isula-builder之后即可使该非特权用户使用isula-build功能。同时,为了保持权限一致性,isula-build的配置文件目录/etc/isula-build的属组也会被设置为--group指定的组。 + +## 管理服务 + +目前 openEuler 采用 systemd 管理软件服务,isula-build 软件包已经自带了 systemd 的服务文件,用户安装完 isula-build 软件包之后,可以直接通过 systemd 工具对它进行服务启停等操作。用户同样可以手动启动 isula-build 服务端软件。需要注意的是,同一个节点上不可以同时启动多个 isula-build 服务端软件。 + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> 同一个节点上不可以同时启动多个 isula-build 服务端软件。 + +### 通过 systemd 管理(推荐方式) + +用户可以通过如下 systemd 的标准指令控制 isula-build 服务的启动、停止、重启等动作: + +* 启动 isula-build 服务: + + ```sh + sudo systemctl start isula-build.service + ``` + +* 停止 
isula-build 服务: + + ```sh + sudo systemctl stop isula-build.service + ``` + +* 重启 isula-build 服务: + + ```sh + sudo systemctl restart isula-build.service + ``` + +isula-build 软件包安装的 systemd 服务文件保存在 `/usr/lib/systemd/system/isula-build.service`。如果用户需要修改 isula-build 服务的 systemd 配置,可以修改该文件,执行如下命令使配置生效,之后再根据上面提到的 systemd 管理指令重启 isula-build 服务 + +```sh +sudo systemctl daemon-reload +``` + +### 直接运行 isula-build 服务端 + +您也可以通过执行 isula-build 服务端命令( isula-builder)的方式启动服务。其中,服务端启动配置,可通过isula-builder命令支持的 flags 设置。isula-build 服务端目前支持的 flags 如下: + +* -D, --debug: 是否开启调测模式。 +* --log-level: 日志级别,支持 “debug”, “info”, “warn” or “error”,默认为 “info”。 +* --dataroot: 本地持久化路径,默认为”/var/lib/isula-build/“。 +* --runroot: 运行时路径,默认为”/var/run/isula-build/“。 +* --storage-driver:底层存储驱动类型。 +* --storage-opt: 底层存储驱动配置。 +* --group: 设置本地套接字isula_build.sock文件属组使得加入该组的非特权用户可以操作isula-build,默认为“isula”。 +* --experimental: 是否开启实验特性,默认为false。 + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> 当命令行启动参数中传递了与配置文件相同的配置选项时,优先使用命令行参数启动。 + +启动 isula-build 服务。例如指定本地持久化路径/var/lib/isula-build,且不开启调试的参考命令如下: + +```sh +sudo isula-builder --dataroot "/var/lib/isula-build" --debug=false +``` + +# 使用指南 + +## 前提条件 + +isula-build 构建 Dockerfile 内的 RUN 指令时依赖可执行文件 runc ,需要 isula-build 的运行环境上预装好 runc。安装方式视用户使用场景而定,如果用户不需要使用完整的 docker-engine 工具链,则可以仅安装 docker-runc rpm包: + +```sh +sudo yum install -y docker-runc +``` + +如果用户需要使用完整的 docker-engine 工具链,则可以安装 docker-engine rpm包,默认包含可执行文件 runc : + +```sh +sudo yum install -y docker-engine +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> 用户需保证OCI runtime(runc)可执行文件的安全性,避免被恶意替换。 + +## 总体说明 + +isula-build 客户端提供了一系列命令用于构建和管理容器镜像,当前 isula-build 包含的命令行指令如下: + +* ctr-img,容器镜像管理。ctr-img又包含如下子命令: + * build,根据给定dockerfile构建出容器镜像。 + * images,列出本地容器镜像。 + * import,导入容器基础镜像。 + * load,导入层叠镜像。 + * rm,删除本地容器镜像。 + * save,导出层叠镜像至本地磁盘。 + * tag,给本地容器镜像打tag。 + * pull,拉取镜像到本地。 + * push,推送本地镜像到远程仓库。 +* info,查看isula-build的运行环境和系统信息。 +* login,登录远端容器镜像仓库。 +* logout,退出远端容器镜像仓库。 +* 
version,查看isula-build和isula-builder的版本号。 +* manifest(实验特性),管理manifest列表。 + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> * isula-build completion 和 isula-builder completion 命令用于生成bash命令补全脚本。该命令为命令行框架隐式提供,不会显示在help信息中。 +> * isula-build客户端不包含配置文件,当用户需要使用isula-build实验特性时,需要在客户端通过命令`export ISULABUILD_CLI_EXPERIMENTAL=enabled`配置环境变量ISULABUILD_CLI_EXPERIMENTAL来开启实验特性。 + +以下按照上述维度依次详细介绍这些命令行指令的使用。 + +## ctr-img: 容器镜像管理 + +isula-build 将所有容器镜像管理相关命令划分在子命令 `ctr-img` 下,命令原型为: + +```sh +isula-build ctr-img [command] +``` + +### build: 容器镜像构建 + +ctr-img 的子命令 build 用于构建容器镜像,命令原型为: + +```sh +isula-build ctr-img build [flags] +``` + +其中 build 包含如下 flags: + +* --build-arg:string列表,构建过程中需要用到的变量。 +* --build-static:KeyValue值,构建二进制一致性。目前包含如下Key值: + * build-time:string,使用固定时间戳来构建容器镜像;时间戳格式为“YYYY-MM-DD HH-MM-SS”。 +* -f, --filename:string,Dockerfile的路径,不指定则是使用当前路径的Dockerfile文件。 +* --format: string, 设置构建镜像的镜像格式:oci | docker(需开启实验特性选项)。 +* --iidfile:string,输出 image ID 到本地文件。 +* -o, --output:string,镜像导出的方式和路径。 +* --proxy:布尔值,继承主机侧环境的proxy环境变量(默认为true)。 +* --tag:string,设置构建成功的镜像的tag值。 +* --cap-add:string列表,构建过程中RUN指令所需要的权限。 + +**以下为各个 flags 的详解。** + +**\--build-arg** + +从命令行接受参数作为Dockerfile中的参数,用法: + +```sh +$ echo "This is bar file" > bar.txt +$ cat Dockerfile_arg +FROM busybox +ARG foo +ADD ${foo}.txt . +RUN cat ${foo}.txt +$ sudo isula-build ctr-img build --build-arg foo=bar -f Dockerfile_arg +STEP 1: FROM busybox +Getting image source signatures +Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad +Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b +Writing manifest to image destination +Storing signatures +STEP 2: ARG foo +STEP 3: ADD ${foo}.txt . 
+STEP 4: RUN cat ${foo}.txt +This is bar file +Getting image source signatures +Copying blob sha256:6194458b07fcf01f1483d96cd6c34302ffff7f382bb151a6d023c4e80ba3050a +Copying blob sha256:6bb56e4a46f563b20542171b998cb4556af4745efc9516820eabee7a08b7b869 +Copying config sha256:39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2 +Writing manifest to image destination +Storing signatures +Build success with image id: 39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2 +``` + +**\--build-static** + +指定为静态构建,即使用isula-build构建容器镜像时消除所有时间戳和其他构建因素(例如容器ID、hostname等)的差异。最终构建出满足静态要求的容器镜像。 + +在使用isula-build进行容器镜像构建时,假如给 build 子命令一个固定的时间戳,并在限定如下条件的时候: + +* 构建环境前后保持一致。 +* 构建Dockerfile前后保持一致。 +* 构建产生的中间数据前后保持一致。 +* 构建命令相同。 +* 第三方库版本一致。 + +对于容器镜像构建,isula-build支持相同的Dockerfile。如果构建环境相同,则多次构建生成的镜像内容和镜像ID相同。 + +--build-static接受k=v形式的键值对选项,当前支持的选项有: + +* build-time:字符串类型。构建静态镜像的固定时间戳,格式为“YYYY-MM-DD HH-MM-SS”。时间戳影响diff层创建修改时间的文件属性。 + + 使用示例如下: + + ```sh + sudo isula-build ctr-img build -f Dockerfile --build-static='build-time=2020-05-23 10:55:33' . + ``` + + 以此方式,同一环境多次构建出来的容器镜像和镜像ID均会保持一致。 + +**\--format** +开始实验特性后该选项可用,默认为OCI镜像格式。可以手动指定镜像格式进行构建,例如,下面分别为构建OCI镜像格式以及Docker镜像格式镜像的命令。 + + ```sh + export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format oci . + ``` + + ```sh + export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format docker . 
+ ``` + +**\--iidfile** + +将构建的镜像ID输出到文件,用法: + +```sh +isula-build ctr-img build --iidfile filename +``` + +例如,将容器镜像ID输出到testfile的参考命令如下: + + ```sh +sudo isula-build ctr-img build -f Dockerfile_arg --iidfile testfile + ``` + + 查看testfile中的容器镜像ID: + + ```sh +$ cat testfile +76cbeed38a8e716e22b68988a76410eaf83327963c3b29ff648296d5cd15ce7b + ``` + +**\-o, --output** + +目前 -o, --output 支持如下形式: + +* `isulad:image:tag`:将构建成功的镜像直接推送到 iSulad。比如:`-o isulad:busybox:latest`。同时需要注意如下约束: + + * isula-build 和 iSulad 必须在同一个节点上 + * tag必须配置 + * isula-build client端需要将构建成功的镜像暂存成 `/var/tmp/isula-build-tmp-%v.tar` 再导入至 iSulad,用户需要保证 `/var/tmp/` 目录有足够磁盘空间 + +* `docker-daemon:image:tag`:将构建成功的镜像直接推送到 Docker daemon。比如:`-o docker-daemon:busybox:latest`。同时需要注意如下约束: + * isula-build 和 docker 必须在同一个节点上 + * tag必须配置 + +* `docker://registry.example.com/repository:tag`:将构建成功的镜像以Docker镜像格式直接推送到远端镜像仓库。比如:`-o docker://localhost:5000/library/busybox:latest`。 + +* `docker-archive:/:image:tag`:将构建成功的镜像以Docker镜像格式保存至本地。比如:`-o docker-archive:/root/image.tar:busybox:latest`。 + +打开实验特性之后,可以启用相应OCI镜像的构建: + +* `oci://registry.example.com/repository:tag`:将构建成功的镜像以OCI镜像格式直接推送到远端镜像仓库(远程镜像仓库须支持OCI镜像格式)。比如:`-o oci://localhost:5000/library/busybox:latest`。 + +* `oci-archive:/:image:tag`:将构建成功的镜像以OCI镜像的格式保存至本地。比如:`-o oci-archive:/root/image.tar:busybox:latest`。 + +除去各个flags之外,build子命令的命令行最后还会接收一个argument,该argument类型是string,意义为context,即该Dockerfile构建环境的上下文。该参数缺省值为isula-build被执行的当前路径。该路径会影响 .dockerignore 和 Dockerfile的ADD/COPY指令 所检索的路径。 + +**\--proxy** + +选择构建时RUN指令启动的容器是否从环境上继承proxy相关环境变量“http_proxy”,“https_proxy”,“ftp_proxy”,“no_proxy”,“HTTP_PROXY”,“HTTPS_PROXY”,“FTP_PROXY”,“NO_PROXY”,默认为true。 + +当用户在Dockerfile配置proxy相关ARG或ENV,将覆盖所继承的环境变量。 + +注意:若client与daemon不在同一个终端运行,所能继承的环境变量为daemon所在终端的环境变量。 + +**\--tag** + +设置镜像构建成功之后,该镜像在本地磁盘存储时的tag。 + +**\--cap-add** + +添加构建过程中RUN指令所需权限,用法: + +```sh +isula-build ctr-img build --cap-add ${CAP} +``` + +使用举例: + +```sh +sudo isula-build ctr-img build --cap-add CAP_SYS_ADMIN 
--cap-add CAP_SYS_PTRACE -f Dockerfile +``` + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> * isula-build最大支持并发构建100个容器镜像。 +> * isula-build支持Dockerfile最大为1MiB。 +> * isula-build支持 .dockerignore 最大为 1MiB。 +> * 用户需保证Dockerfile文件的权限为仅当前用户可读写,避免别的用户进行篡改。 +> * 构建时,RUN指令会启动容器在容器内进行构建,目前 isula-build 仅支持使用主机网络。 +> * isula-build 导出的镜像压缩格式,目前仅支持tar格式。 +> * isula-build 在每一个镜像构建stage完成后做一次提交,而不是每执行 Dockerfile的一行就提交一次。 +> * isula-build 暂不支持构建缓存。 +> * isula-build 仅在构建RUN指令时会启动构建容器。 +> * 目前不支持docker镜像格式的history功能。 +> * isula-build 的stage name支持以数字开头。 +> * isula-build 的stage name最长可为64个字符。 +> * isula-build 暂不支持对单次Dockerfile的构建进行资源限制。如有资源限制需求,可通过对 isula-builder 服务端配置资源限额的方式进行限制。 +> * isula-build 目前不支持Dockerfile里的ADD指令提供的数据来源是远端url。 +> * isula-build 使用docker-archive以及oci-archive类型导出的本地tar包未经压缩。如有需求,用户可以手动进行压缩。 + +### image: 查看本地持久化构建镜像 + +可通过images命令查看当前本地持久化存储的镜像: + +```sh +$ sudo isula-build ctr-img images +--------------------------------------- ----------- ----------------- ------------------------ ------------ +REPOSITORY TAG IMAGE ID CREATED SIZE +--------------------------------------- ----------- ----------------- ------------------------ ------------ +localhost:5000/library/alpine latest a24bb4013296 2022-01-17 10:02:19 5.85 MB + 39b62a3342ee 2022-01-17 10:01:12 1.45 MB +--------------------------------------- ----------- ----------------- ------------------------ ------------ +``` + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> 通过`isula-build ctr-img images`查看的镜像大小与`docker images`的显示上有一定差异。这是因为统计镜像大小时,isula-build是直接计算每层tar包大小之和,而docker是通过解压tar遍历diff目录计算文件大小之和,因此存在统计上的差异。 + +### import: 导入容器基础镜像 + +可以通过`ctr-img import`指令将rootfs形式的tar文件导入到isula-build中。 + +命令原型如下: + +```sh +isula-build ctr-img import [flags] +``` + +使用举例: + +```sh +$ sudo isula-build ctr-img import busybox.tar mybusybox:latest +Getting image source signatures +Copying blob sha256:7b8667757578df68ec57bfc9fb7754801ec87df7de389a24a26a7bf2ebc04d8d +Copying config 
sha256:173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1 +Writing manifest to image destination +Storing signatures +Import success with image id: "173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1" +$ sudo isula-build ctr-img images +--------------------------------------- ----------- ----------------- ------------------------ ------------ +REPOSITORY TAG IMAGE ID CREATED SIZE +--------------------------------------- ----------- ----------------- ------------------------ ------------ +mybusybox latest 173b3cf612f8 2022-01-12 16:02:31 1.47 MB +--------------------------------------- ----------- ----------------- ------------------------ ------------ +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> isula-build 支持导入最大1GiB的容器基础镜像。 + +### load: 导入层叠镜像 + +层叠镜像指的是通过 docker save 或 isula-build ctr-img save 等指令,将一个构建完成的镜像保存至本地之后,镜像压缩包内是一层一层 layer.tar 的镜像包。可以通过 ctr-img load 指令将它导入至 isula-build。 + +命令原型如下: + +```sh +isula-build ctr-img load [flags] +``` + +目前支持的 flags 为: + +* -i, --input:本地tar包的路径 + +使用举例如下: + +```sh +$ sudo isula-build ctr-img load -i ubuntu.tar +Getting image source signatures +Copying blob sha256:cf612f747e0fbcc1674f88712b7bc1cd8b91cf0be8f9e9771235169f139d507c +Copying blob sha256:f934e33a54a60630267df295a5c232ceb15b2938ebb0476364192b1537449093 +Copying blob sha256:943edb549a8300092a714190dfe633341c0ffb483784c4fdfe884b9019f6a0b4 +Copying blob sha256:e7ebc6e16708285bee3917ae12bf8d172ee0d7684a7830751ab9a1c070e7a125 +Copying blob sha256:bf6751561805be7d07d66f6acb2a33e99cf0cc0a20f5fd5d94a3c7f8ae55c2a1 +Copying blob sha256:c1bd37d01c89de343d68867518b1155cb297d8e03942066ecb44ae8f46b608a3 +Copying blob sha256:a84e57b779297b72428fc7308e63d13b4df99140f78565be92fc9dbe03fc6e69 +Copying blob sha256:14dd68f4c7e23d6a2363c2320747ab88986dfd43ba0489d139eeac3ac75323b2 +Copying blob sha256:a2092d776649ea2301f60265f378a02405539a2a68093b2612792cc65d00d161 +Copying blob 
sha256:879119e879f682c04d0784c9ae7bc6f421e206b95d20b32ce1cb8a49bfdef202 +Copying blob sha256:e615448af51b848ecec00caeaffd1e30e8bf5cffd464747d159f80e346b7a150 +Copying blob sha256:f610bd1e9ac6aa9326d61713d552eeefef47d2bd49fc16140aa9bf3db38c30a4 +Copying blob sha256:bfe0a1336d031bf5ff3ce381e354be7b2bf310574cc0cd1949ad94dda020cd27 +Copying blob sha256:f0f15db85788c1260c6aa8ad225823f45c89700781c4c793361ac5fa58d204c7 +Copying config sha256:c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e +Writing manifest to image destination +Storing signatures +Loaded image as c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> * isula-build 支持导入最大50G的容器层叠镜像。 +> * isula-build 会自动识别容器层叠镜像的格式并进行导入。 + +### rm: 删除本地持久化镜像 + +可通过rm命令删除当前本地持久化存储的镜像。命令原型为: + +```sh +isula-build ctr-img rm IMAGE [IMAGE...] [FLAGS] +``` + +目前支持的 flags 为: + +* -a, --all:删除所有本地持久化存储的镜像。 +* -p, --prune:删除所有没有tag的本地持久化存储的镜像。 + +使用示例如下: + +```sh +$ sudo isula-build ctr-img rm -p +Deleted: sha256:78731c1dde25361f539555edaf8f0b24132085b7cab6ecb90de63d72fa00c01d +Deleted: sha256:eeba1bfe9fca569a894d525ed291bdaef389d28a88c288914c1a9db7261ad12c +``` + +### save: 导出层叠镜像 + +可通过save命令导出层叠镜像到本地磁盘。命令原型如下: + +```sh +isula-build ctr-img save [REPOSITORY:TAG]|imageID -o xx.tar +``` + +目前支持的 flags 为: + +* -f, --format:导出层叠镜像的镜像格式:oci | docker(需开启实验特性选项) +* -o, --output:本地tar包路径 + +以下示例通过 `image/tag` 的形式将镜像进行导出: + +```sh +$ sudo isula-build ctr-img save busybox:latest -o busybox.tar +Getting image source signatures +Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a +Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f +Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef +Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 +Writing manifest to image destination +Storing signatures +Save success with image: 
busybox:latest +``` + +以下示例通过 `ImageID` 的形式将镜像进行导出: + +```sh +$ sudo isula-build ctr-img save 21c3e96ac411 -o busybox.tar +Getting image source signatures +Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a +Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f +Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef +Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 +Writing manifest to image destination +Storing signatures +Save success with image: 21c3e96ac411 +``` + +以下示例导出多个镜像到同一个tar包: + +```sh +$ sudo isula-build ctr-img save busybox:latest nginx:latest -o all.tar +Getting image source signatures +Copying blob sha256:eb78099fbf7fdc70c65f286f4edc6659fcda510b3d1cfe1caa6452cc671427bf +Copying blob sha256:29f11c413898c5aad8ed89ad5446e89e439e8cfa217cbb404ef2dbd6e1e8d6a5 +Copying blob sha256:af5bd3938f60ece203cd76358d8bde91968e56491daf3030f6415f103de26820 +Copying config sha256:b8efb18f159bd948486f18bd8940b56fd2298b438229f5bd2bcf4cedcf037448 +Writing manifest to image destination +Storing signatures +Getting image source signatures +Copying blob sha256:e2d6930974a28887b15367769d9666116027c411b7e6c4025f7c850df1e45038 +Copying config sha256:a33de3c85292c9e65681c2e19b8298d12087749b71a504a23c576090891eedd6 +Writing manifest to image destination +Storing signatures +Save success with image: [busybox:latest nginx:latest] +``` + +>![](./public_sys-resources/icon-note.gif) **说明:** +> +> * save 导出的镜像默认格式为未压缩的tar格式,如有需求,用户可以再save之后手动压缩。 +> * 在使用镜像名导出镜像时,需要给出完整的镜像名格式:REPOSITORY:TAG。 + +### tag: 给本地持久化镜像打标签 + +可使用tag命令给本地持久化的容器镜像打tag。命令原型如下: + +```sh +isula-build ctr-img tag / busybox:latest +``` + +使用举例: + +```sh +$ sudo isula-build ctr-img images +--------------------------------------- ----------- ----------------- -------------------------- ------------ +REPOSITORY TAG IMAGE ID CREATED SIZE +--------------------------------------- ----------- 
----------------- -------------------------- ------------ +alpine latest a24bb4013296 2020-05-29 21:19:46 5.85 MB +--------------------------------------- ----------- ----------------- -------------------------- ------------ +$ sudo isula-build ctr-img tag a24bb4013296 alpine:v1 +$ sudo isula-build ctr-img images +--------------------------------------- ----------- ----------------- ------------------------ ------------ +REPOSITORY TAG IMAGE ID CREATED SIZE +--------------------------------------- ----------- ----------------- ------------------------ ------------ +alpine latest a24bb4013296 2020-05-29 21:19:46 5.85 MB +alpine v1 a24bb4013296 2020-05-29 21:19:46 5.85 MB +--------------------------------------- ----------- ----------------- ------------------------ ------------ +``` + +### pull: 拉取镜像到本地 + +可通过pull命令拉取远程镜像仓库中的镜像到本地。命令原型如下: + +```sh +isula-build ctr-img pull REPOSITORY[:TAG] +``` + +使用示例: + +```sh +$ sudo isula-build ctr-img pull example-registry/library/alpine:latest +Getting image source signatures +Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad +Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b +Writing manifest to image destination +Storing signatures +Pull success with image: example-registry/library/alpine:latest +``` + +### push: 将本地镜像推送到远程仓库 + +可通过push命令将本地镜像推送到远程仓库。命令原型如下: + +```sh +isula-build ctr-img push REPOSITORY[:TAG] +``` + +目前支持的 flags 为: + +* -f, --format:推送的镜像格式:oci|docker(需开启实验特性选项) + +使用示例: + +```sh +$ sudo isula-build ctr-img push example-registry/library/mybusybox:latest +Getting image source signatures +Copying blob sha256:d2421964bad195c959ba147ad21626ccddc73a4f2638664ad1c07bd9df48a675 +Copying config sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f +Writing manifest to image destination +Storing signatures +Push success with image: example-registry/library/mybusybox:latest +``` + +>![](./public_sys-resources/icon-note.gif) 
**说明:** +> +> 推送镜像时,需要先登录对应的镜像仓库 + +## info: 查看运行环境与系统信息 + +可以通过“isula-build info”指令查看 isula-build 目前的运行环境与系统信息。命令原型如下: + +```sh +isula-build info [flags] +``` + +支持如下Flags: + +* -H, --human-readable 布尔值,以常用内存表示格式打印内存信息,使用1000次幂 +* -V, --verbose 布尔值,显示运行时内存占用信息 + +使用示例: + +```sh +$ sudo isula-build info -HV + General: + MemTotal: 7.63 GB + MemFree: 757 MB + SwapTotal: 8.3 GB + SwapFree: 8.25 GB + OCI Runtime: runc + DataRoot: /var/lib/isula-build/ + RunRoot: /var/run/isula-build/ + Builders: 0 + Goroutines: 12 + Store: + Storage Driver: overlay + Backing Filesystem: extfs + Registry: + Search Registries: + oepkgs.net + Insecure Registries: + localhost:5000 + oepkgs.net + Runtime: + MemSys: 68.4 MB + HeapSys: 63.3 MB + HeapAlloc: 7.41 MB + MemHeapInUse: 8.98 MB + MemHeapIdle: 54.4 MB + MemHeapReleased: 52.1 MB +``` + +## login: 登录远端镜像仓库 + +用户可以运行 login 命令来登录远程镜像仓库。命令原型如下: + +```sh + isula-build login SERVER [FLAGS] +``` + +目前支持的flag有: + +```Conf + Flags: + -p, --password-stdin Read password from stdin + -u, --username string Username to access registry +``` + +通过stdin输入密码。以下示例通过通过管道将creds.txt里的密码传给isula-build的stdin进行输入: + +```sh + $ cat creds.txt | sudo isula-build login -u cooper -p mydockerhub.io + Login Succeeded +``` + +通过交互式输入密码: + +```sh + $ sudo isula-build login mydockerhub.io -u cooper + Password: + Login Succeeded +``` + +## logout: 退出远端镜像仓库 + +用户可以运行 logout 命令来登出远程镜像仓库。命令原型如下: + +```sh +isula-build logout [SERVER] [FLAGS] +``` + +目前支持的flag有: + +```sh + Flags: + -a, --all Logout all registries +``` + +使用示例如下: + +```sh +$ sudo isula-build logout -a + Removed authentications +``` + +## version: 版本查询 + +可通过version命令查看当前版本信息: + +```sh +$ sudo isula-build version +Client: + Version: 0.9.6-4 + Go Version: go1.15.7 + Git Commit: 83274e0 + Built: Wed Jan 12 15:32:55 2022 + OS/Arch: linux/amd64 + +Server: + Version: 0.9.6-4 + Go Version: go1.15.7 + Git Commit: 83274e0 + Built: Wed Jan 12 15:32:55 2022 + OS/Arch: linux/amd64 +``` + +## manifest: manifest列表管理 + 
+manifest列表包含不同系统架构对应的镜像信息,通过使用manifest列表,用户可以在不同的架构中使用相同的manifest(例如openeuler:latest)获取对应架构的镜像,manifest包含create、annotate、inspect和push子命令。 +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> manifest为实验特性,使用时需开启客户端和服务端的实验选项,方式详见客户端总体说明和配置服务章节。 + +### create: manifest列表创建 + +manifest的子命令create用于创建manifest列表,命令原型为: + +```sh +isula-build manifest create MANIFEST_LIST MANIFEST [MANIFEST...] +``` + +用户可以指定manifest列表的名称以及需要加入到列表中的远程镜像,若不指定任何远程镜像,则会创建一个空的manifest列表。 + +使用示例如下: + +```sh +sudo isula-build manifest create openeuler localhost:5000/openeuler_x86:latest localhost:5000/openeuler_aarch64:latest +``` + +### annotate: manifest列表更新 + +manifest的子命令annotate用于更新manifest列表,命令原型为: + +```sh +isula-build manifest annotate MANIFEST_LIST MANIFEST [flags] +``` + +用户可以指定需要更新的manifest列表以及其中的镜像,通过flags指定需要更新的选项,此命令也可用于添加新的镜像到列表中。 + +其中annotate包含如下flags: + +* --arch: string,重写镜像适用架构 +* --os: string,重写镜像适用系统 +* --os-features: string列表,指定镜像需要的OS特性,很少使用 +* --variant: string,指定列表中记录镜像的变量 + +使用示例如下: + +```sh +sudo isula-build manifest annotate --os linux --arch arm64 openeuler:latest localhost:5000/openeuler_aarch64:latest +``` + +### inspect: manifest列表查询 + +manifest子命令inspect用于查询manifest列表信息,命令原型为: + +```sh +isula-build manifest inspect MANIFEST_LIST +``` + +使用示例如下: + +```sh +$ sudo isula-build manifest inspect openeuler:latest +{ + "schemaVersion": 2, + "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json", + "manifests": [ + { + "mediaType": "application/vnd.docker.distribution.manifest.v2+json", + "size": 527, + "digest": "sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff", + "platform": { + "architecture": "amd64", + "os": "linux" + } + }, + { + "mediaType": "application/vnd.docker.distribution.manifest.v2+json", + "size": 527, + "digest": "sha256:f814888b4bb6149bd39ba8375a1932fb15071b4dbffc7f76c7b602b06abbb820", + "platform": { + "architecture": "arm64", + "os": "linux" + } + } + ] +} +``` + +### push: 将manifest列表推送到远程仓库 + 
+manifest子命令push用于将manifest列表推送到远程仓库,命令原型为: + +```sh +isula-build manifest push MANIFEST_LIST DESTINATION +``` + +使用示例如下: + +```sh +sudo isula-build manifest push openeuler:latest localhost:5000/openeuler:latest +``` + +# 直接集成容器引擎 + +isula-build可以与iSulad和docker集成,将构建好的容器镜像导入到容器引擎的本地存储中。 + +## 与iSulad集成 + +支持将构建成功的镜像直接导出到iSulad。 + +命令行举例: + +```sh +sudo isula-build ctr-img build -f Dockerfile -o isulad:busybox:2.0 +``` + +通过在-o参数中指定iSulad,将构建好的容器镜像导出到iSulad,可以通过isula images查询: + +```sh +$ sudo isula images +isula images +REPOSITORY TAG IMAGE ID CREATED SIZE +busybox 2.0 2d414a5cad6d 2020-08-01 06:41:36 5.577 MB +``` + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> * 要求isula-build和iSulad在同一节点。 +> * 直接导出镜像到iSulad时,isula-build client端需要将构建成功的镜像暂存成 `/var/lib/isula-build/tmp/[buildid]/isula-build-tmp-%v.tar` 再导入至 iSulad,用户需要保证 /var/lib/isula-build/tmp/ 目录有足够磁盘空间;同时如果在导出过程中 isula-build client进程被KILL或Ctrl+C终止,需要依赖用户手动清理 `/var/lib/isula-build/tmp/[buildid]/isula-build-tmp-%v.tar` 文件。 + +## 与Docker集成 + +支持将构建成功的镜像直接导出到Docker daemon。 + +命令行举例: + +```sh +sudo isula-build ctr-img build -f Dockerfile -o docker-daemon:busybox:2.0 +``` + +通过在-o参数中指定docker-daemon,将构建好的容器镜像导出到docker, 可以通过docker images查询。 + +```sh +$ sudo docker images +REPOSITORY TAG IMAGE ID CREATED SIZE +busybox 2.0 2d414a5cad6d 2 months ago 5.22MB +``` + +> ![](./public_sys-resources/icon-note.gif) **说明:** +> +> 要求isula-build和Docker在同一节点。 + +# 使用注意事项 + +本章节主要介绍在使用isula-build构建镜像时相关的约束和限制,以及与docker build的差异。 + +## 约束和限制 + +1. 当导出镜像到[`iSulad`](https://gitee.com/openeuler/iSulad/blob/master/README.md/)时,镜像必须指明tag。 +2. 因为isula-builder运行`RUN`指令时,需要调用系统中的oci 运行时(如`runc`),用户需要保证该运行时的安全性,不受篡改。 +3. `DataRoot`不能设置在内存盘上(tmpfs)。 +4. `Overlay2`是目前isula-builder唯一支持的存储驱动。 +5. `Docker`镜像是目前唯一支持的镜像格式,未来即将支持`oci`格式镜像。 +6. `Dockerfile`文件权限强烈建议设置为**0600**以防止恶意篡改。 +7. `RUN`命令中目前只支持主机侧网络(host network)。 +8. 当导出镜像到本地tar包时,目前只支持保存为`tar`格式。 +9. 
当使用`import`功能导入基础镜像时,最大支持**1G**。 + +## 与“docker build”差异 + +`isula-build`兼容[Docker镜像格式规范](https://docs.docker.com/engine/reference/builder/),但仍然和`docker build`存在一些差异: + +1. 支持镜像压缩,即对每个`stage`进行提交而非每一行。 +2. 目前不支持构建缓存。 +3. 只有`RUN`指令会运行容器进行构建。 +4. 目前不支持查询镜像构建历史。 +5. `Stage`名称可以用数字开头。 +6. `Stage`名称最大长度为64。 +7. `ADD`命令不支持远端URL格式。 +8. 暂不支持对单次构建进行资源限额,可采取对isula-builder配置资源限额的方式进行限制。 +9. 统计镜像大小时,isula-build是直接计算每层tar包大小之和,而docker是通过解压tar遍历diff目录计算文件大小之和,因此通过`isula-build ctr-img images`查看的镜像大小与`docker images`的显示上有一定差异。 +10. 操作时的镜像名称需要明确,格式为IMAGE_NAME:IMAGE_TAG。例如 busybox:latest, 其中latest不可省略。 \ No newline at end of file diff --git "a/docs/zh/docs/Container/isula-build\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" "b/docs/zh/docs/Container/isula-build\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" new file mode 100644 index 0000000000000000000000000000000000000000..587eb62ec76152fe737be756305f05025b71600b --- /dev/null +++ "b/docs/zh/docs/Container/isula-build\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -0,0 +1,6 @@ +# 常见问题与解决方法 +## **问题1:isula-build拉取镜像报错:pinging container registry xx: get xx: dial tcp host:repo: connect: connection refused** + +原因:拉取的镜像来源于非授信仓库。 + +解决方法:修改isula-build镜像仓库的配置文件/etc/isula-build/registries.toml,将该非授信仓库加入[registries.insecure],重启isula-build。 \ No newline at end of file diff --git "a/docs/zh/docs/Container/isula-build\346\236\204\345\273\272\345\267\245\345\205\267.md" "b/docs/zh/docs/Container/isula-build\346\236\204\345\273\272\345\267\245\345\205\267.md" index 9536a5a0ac8d134563ae62edc5aa02db8235db19..682b7a3e6dc17e6f6818bb2224ff07f1c3c5448b 100644 --- "a/docs/zh/docs/Container/isula-build\346\236\204\345\273\272\345\267\245\345\205\267.md" +++ "b/docs/zh/docs/Container/isula-build\346\236\204\345\273\272\345\267\245\345\205\267.md" @@ -1,7 +1,5 
@@ # 容器镜像构建 -## 概述 - isula-build是iSula容器团队推出的容器镜像构建工具,支持通过Dockerfile文件快速构建容器镜像。 isula-build采用服务端/客户端模式,其中,isula-build为客户端,提供了一组命令行工具,用于镜像构建及管理等;isula-builder为服务端,用于处理客户端管理请求,作为守护进程常驻后台。 @@ -12,1029 +10,3 @@ isula-build采用服务端/客户端模式,其中,isula-build为客户端, > > isula-build当前支持OCI镜像格式([OCI Image Format Specification](https://github.com/opencontainers/image-spec/blob/main/spec.md/))以及Docker镜像格式([Image Manifest Version 2, Schema 2](https://docs.docker.com/registry/spec/manifest-v2-2/))。通过命令`export ISULABUILD_CLI_EXPERIMENTAL=enabled`开启实验特性以支持OCI镜像格式。不开启实验特性时,isula-build默认采用Docker镜像格式;当开启实验特性后,将默认采用OCI镜像格式。 -## 安装 - -### 环境准备 - -为了确保isula-build成功安装,需满足以下软件硬件要求。 - -* 支持的机器架构:x86_64 和 AArch64 -* 支持的操作系统:openEuler -* 用户具有root权限。 - -#### 安装isula-build - -使用isula-build构建容器镜像,需要先安装以下软件包。 - -##### (推荐)方法一:使用yum安装 - -1. 配置openEuler yum源。 - -2. 使用root权限,登录目标服务器,安装isula-build。 - - ```sh - sudo yum install -y isula-build - ``` - -##### 方法二:使用rpm包安装 - -1. 从openEuler yum源中获取isula-build对应安装包isula-build-*.rpm。例如isula-build-0.9.6-4.oe1.x86_64.rpm。 - -2. 将获取的rpm软件包上传至目标服务器的任一目录,例如 /home/。 - -3. 使用root权限,登录目标服务器,参考如下命令安装isula-build。 - - ```sh - sudo rpm -ivh /home/isula-build-*.rpm - ``` - -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> 安装完成后,需要手工启动isula-build服务。启动请参见[管理服务](isula-build构建工具.md#管理服务)。 - -## 配置与管理服务 - -### 配置服务 - -在安装完 isula-build 软件包之后,systemd 管理服务会以 isula-build 软件包自带的 isula-build 服务端默认配置启动 isula-build 服务。如果 isula-build 服务端的默认配置文件不能满足用户的需求,可以参考如下介绍进行定制化配置。需要注意的是,修改完默认配置之后,需要重启 isula-build 服务端使新配置生效,具体操作可参考下一章节。 - -目前 isula-build 服务端包含如下配置文件: - -* /etc/isula-build/configuration.toml:isula-builder 总体配置文件,用于设置 isula-builder 日志级别、持久化目录和运行时目录、OCI runtime等。其中各参数含义如下: - -| 配置项 | 是否可选 | 配置项含义 | 配置项取值 | -| --------- | -------- | --------------------------------- | ----------------------------------------------- | -| debug | 可选 | 设置是否打开debug日志 | true:打开debug日志
false:关闭debug日志 | -| loglevel | 可选 | 设置日志级别 | debug
info
warn
error | -| run_root | 必选 | 设置运行时数据根目录 | 运行时数据根目录路径,例如/var/run/isula-build/ | -| data_root | 必选 | 设置本地持久化目录 | 本地持久化目录路径,例如/var/lib/isula-build/ | -| runtime | 可选 | 设置runtime种类,目前仅支持runc | runc | -| group | 可选 | 设置本地套接字isula_build.sock文件属组使得加入该组的非特权用户可以操作isula-build | isula | -| experimental | 可选 | 设置是否开启实验特性 | true:开启实验特性;false:关闭实验特性 | - -* /etc/isula-build/storage.toml: 本地持久化存储的配置文件,包含所使用的存储驱动的配置。 - -| 配置项 | 是否可选 | 配置项含义 | -| ------ | -------- | ------------------------------ | -| driver | 可选 | 存储驱动类型,目前支持overlay2 | - - 更多设置可参考 [containers-storage.conf.5](https://github.com/containers/storage/blob/main/docs/containers-storage.conf.5.md)。 - -* /etc/isula-build/registries.toml : 针对各个镜像仓库的配置文件。 - -| 配置项 | 是否可选 | 配置项含义 | -| ------------------- | -------- | ------------------------------------------------------------ | -| registries.search | 可选 | 镜像仓库搜索域,在此list的镜像仓库可以被感知,不在此列的不被感知。 | -| registries.insecure | 可选 | 可访问的不安全镜像仓库地址,在此列表中的镜像仓库将不会通过鉴权,不推荐使用。 | - - 更多设置可参考 [containers-registries.conf.5](https://github.com/containers/image/blob/main/docs/containers-registries.conf.5.md)。 - -* /etc/isula-build/policy.json:镜像pull/push策略文件。当前不支持对其进行配置。 - -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> * isula-build 支持最大 1MiB 的上述配置文件。 -> * isula-build 不支持将持久化工作目录 dataroot 配置在内存盘上,比如 tmpfs。 -> * isula-build 目前仅支持使用overlay2为底层 graphdriver。 -> * 在设置--group参数前,需保证本地OS已经创建了对应的用户组,且非特权用户已经加入该组。重启isula-builder之后即可使该非特权用户使用isula-build功能。同时,为了保持权限一致性,isula-build的配置文件目录/etc/isula-build的属组也会被设置为--group指定的组。 - -### 管理服务 - -目前 openEuler 采用 systemd 管理软件服务,isula-build 软件包已经自带了 systemd 的服务文件,用户安装完 isula-build 软件包之后,可以直接通过 systemd 工具对它进行服务启停等操作。用户同样可以手动启动 isula-build 服务端软件。需要注意的是,同一个节点上不可以同时启动多个 isula-build 服务端软件。 - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> 同一个节点上不可以同时启动多个 isula-build 服务端软件。 - -#### 通过 systemd 管理(推荐方式) - -用户可以通过如下 systemd 的标准指令控制 isula-build 服务的启动、停止、重启等动作: - -* 启动 isula-build 服务: - - ```sh - sudo systemctl start isula-build.service - ``` - -* 停止 
isula-build 服务: - - ```sh - sudo systemctl stop isula-build.service - ``` - -* 重启 isula-build 服务: - - ```sh - sudo systemctl restart isula-build.service - ``` - -isula-build 软件包安装的 systemd 服务文件保存在 `/usr/lib/systemd/system/isula-build.service`。如果用户需要修改 isula-build 服务的 systemd 配置,可以修改该文件,执行如下命令使配置生效,之后再根据上面提到的 systemd 管理指令重启 isula-build 服务 - -```sh -sudo systemctl daemon-reload -``` - -#### 直接运行 isula-build 服务端 - -您也可以通过执行 isula-build 服务端命令( isula-builder)的方式启动服务。其中,服务端启动配置,可通过isula-builder命令支持的 flags 设置。isula-build 服务端目前支持的 flags 如下: - -* -D, --debug: 是否开启调测模式。 -* --log-level: 日志级别,支持 “debug”, “info”, “warn” or “error”,默认为 “info”。 -* --dataroot: 本地持久化路径,默认为”/var/lib/isula-build/“。 -* --runroot: 运行时路径,默认为”/var/run/isula-build/“。 -* --storage-driver:底层存储驱动类型。 -* --storage-opt: 底层存储驱动配置。 -* --group: 设置本地套接字isula_build.sock文件属组使得加入该组的非特权用户可以操作isula-build,默认为“isula”。 -* --experimental: 是否开启实验特性,默认为false。 - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> 当命令行启动参数中传递了与配置文件相同的配置选项时,优先使用命令行参数启动。 - -启动 isula-build 服务。例如指定本地持久化路径/var/lib/isula-build,且不开启调试的参考命令如下: - -```sh -sudo isula-builder --dataroot "/var/lib/isula-build" --debug=false -``` - -## 使用指南 - -### 前提条件 - -isula-build 构建 Dockerfile 内的 RUN 指令时依赖可执行文件 runc ,需要 isula-build 的运行环境上预装好 runc。安装方式视用户使用场景而定,如果用户不需要使用完整的 docker-engine 工具链,则可以仅安装 docker-runc rpm包: - -```sh -sudo yum install -y docker-runc -``` - -如果用户需要使用完整的 docker-engine 工具链,则可以安装 docker-engine rpm包,默认包含可执行文件 runc : - -```sh -sudo yum install -y docker-engine -``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> 用户需保证OCI runtime(runc)可执行文件的安全性,避免被恶意替换。 - -### 总体说明 - -isula-build 客户端提供了一系列命令用于构建和管理容器镜像,当前 isula-build 包含的命令行指令如下: - -* ctr-img,容器镜像管理。ctr-img又包含如下子命令: - * build,根据给定dockerfile构建出容器镜像。 - * images,列出本地容器镜像。 - * import,导入容器基础镜像。 - * load,导入层叠镜像。 - * rm,删除本地容器镜像。 - * save,导出层叠镜像至本地磁盘。 - * tag,给本地容器镜像打tag。 - * pull,拉取镜像到本地。 - * push,推送本地镜像到远程仓库。 -* info,查看isula-build的运行环境和系统信息。 -* login,登录远端容器镜像仓库。 -* logout,退出远端容器镜像仓库。 -* 
version,查看isula-build和isula-builder的版本号。 -* manifest(实验特性),管理manifest列表。 - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> * isula-build completion 和 isula-builder completion 命令用于生成bash命令补全脚本。该命令为命令行框架隐式提供,不会显示在help信息中。 -> * isula-build客户端不包含配置文件,当用户需要使用isula-build实验特性时,需要在客户端通过命令`export ISULABUILD_CLI_EXPERIMENTAL=enabled`配置环境变量ISULABUILD_CLI_EXPERIMENTAL来开启实验特性。 - -以下按照上述维度依次详细介绍这些命令行指令的使用。 - -### ctr-img: 容器镜像管理 - -isula-build 将所有容器镜像管理相关命令划分在子命令 `ctr-img` 下,命令原型为: - -```sh -isula-build ctr-img [command] -``` - -#### build: 容器镜像构建 - -ctr-img 的子命令 build 用于构建容器镜像,命令原型为: - -```sh -isula-build ctr-img build [flags] -``` - -其中 build 包含如下 flags: - -* --build-arg:string列表,构建过程中需要用到的变量。 -* --build-static:KeyValue值,构建二进制一致性。目前包含如下Key值: - * build-time:string,使用固定时间戳来构建容器镜像;时间戳格式为“YYYY-MM-DD HH-MM-SS”。 -* -f, --filename:string,Dockerfile的路径,不指定则是使用当前路径的Dockerfile文件。 -* --format: string, 设置构建镜像的镜像格式:oci | docker(需开启实验特性选项)。 -* --iidfile:string,输出 image ID 到本地文件。 -* -o, --output:string,镜像导出的方式和路径。 -* --proxy:布尔值,继承主机侧环境的proxy环境变量(默认为true)。 -* --tag:string,设置构建成功的镜像的tag值。 -* --cap-add:string列表,构建过程中RUN指令所需要的权限。 - -**以下为各个 flags 的详解。** - -**\--build-arg** - -从命令行接受参数作为Dockerfile中的参数,用法: - -```sh -$ echo "This is bar file" > bar.txt -$ cat Dockerfile_arg -FROM busybox -ARG foo -ADD ${foo}.txt . -RUN cat ${foo}.txt -$ sudo isula-build ctr-img build --build-arg foo=bar -f Dockerfile_arg -STEP 1: FROM busybox -Getting image source signatures -Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad -Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b -Writing manifest to image destination -Storing signatures -STEP 2: ARG foo -STEP 3: ADD ${foo}.txt . 
STEP 4: RUN cat ${foo}.txt
-This is bar file
-Getting image source signatures
-Copying blob sha256:6194458b07fcf01f1483d96cd6c34302ffff7f382bb151a6d023c4e80ba3050a
-Copying blob sha256:6bb56e4a46f563b20542171b998cb4556af4745efc9516820eabee7a08b7b869
-Copying config sha256:39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2
-Writing manifest to image destination
-Storing signatures
-Build success with image id: 39b62a3342eed40b41a1bcd9cd455d77466550dfa0f0109af7a708c3e895f9a2
-```
-
-**\--build-static**
-
-指定为静态构建,即使用isula-build构建容器镜像时,消除所有时间戳和其他构建因素(例如容器ID、hostname等)的差异,最终构建出满足静态要求的容器镜像。
-
-使用isula-build进行容器镜像构建时,若给 build 子命令指定一个固定的时间戳,且满足如下条件:
-
-* 构建环境前后保持一致。
-* 构建Dockerfile前后保持一致。
-* 构建产生的中间数据前后保持一致。
-* 构建命令相同。
-* 第三方库版本一致。
-
-则相同的Dockerfile多次构建生成的镜像内容和镜像ID相同。
-
---build-static接受k=v形式的键值对选项,当前支持的选项有:
-
-* build-time:字符串类型。构建静态镜像的固定时间戳,格式为“YYYY-MM-DD HH-MM-SS”。时间戳影响diff层创建修改时间的文件属性。
-
-  使用示例如下:
-
-  ```sh
-  sudo isula-build ctr-img build -f Dockerfile --build-static='build-time=2020-05-23 10:55:33' .
-  ```
-
-  以此方式,同一环境多次构建出来的容器镜像和镜像ID均会保持一致。
-
-**\--format**
-
-开启实验特性后该选项可用,默认为OCI镜像格式。可以手动指定镜像格式进行构建,例如,下面分别为构建OCI镜像格式以及Docker镜像格式镜像的命令。
-
-  ```sh
-  export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format oci .
-  ```
-
-  ```sh
-  export ISULABUILD_CLI_EXPERIMENTAL=enabled; sudo isula-build ctr-img build -f Dockerfile --format docker . 
- ``` - -**\--iidfile** - -将构建的镜像ID输出到文件,用法: - -```sh -isula-build ctr-img build --iidfile filename -``` - -例如,将容器镜像ID输出到testfile的参考命令如下: - - ```sh -sudo isula-build ctr-img build -f Dockerfile_arg --iidfile testfile - ``` - - 查看testfile中的容器镜像ID: - - ```sh -$ cat testfile -76cbeed38a8e716e22b68988a76410eaf83327963c3b29ff648296d5cd15ce7b - ``` - -**\-o, --output** - -目前 -o, --output 支持如下形式: - -* `isulad:image:tag`:将构建成功的镜像直接推送到 iSulad。比如:`-o isulad:busybox:latest`。同时需要注意如下约束: - - * isula-build 和 iSulad 必须在同一个节点上 - * tag必须配置 - * isula-build client端需要将构建成功的镜像暂存成 `/var/tmp/isula-build-tmp-%v.tar` 再导入至 iSulad,用户需要保证 `/var/tmp/` 目录有足够磁盘空间 - -* `docker-daemon:image:tag`:将构建成功的镜像直接推送到 Docker daemon。比如:`-o docker-daemon:busybox:latest`。同时需要注意如下约束: - * isula-build 和 docker 必须在同一个节点上 - * tag必须配置 - -* `docker://registry.example.com/repository:tag`:将构建成功的镜像以Docker镜像格式直接推送到远端镜像仓库。比如:`-o docker://localhost:5000/library/busybox:latest`。 - -* `docker-archive:/:image:tag`:将构建成功的镜像以Docker镜像格式保存至本地。比如:`-o docker-archive:/root/image.tar:busybox:latest`。 - -打开实验特性之后,可以启用相应OCI镜像的构建: - -* `oci://registry.example.com/repository:tag`:将构建成功的镜像以OCI镜像格式直接推送到远端镜像仓库(远程镜像仓库须支持OCI镜像格式)。比如:`-o oci://localhost:5000/library/busybox:latest`。 - -* `oci-archive:/:image:tag`:将构建成功的镜像以OCI镜像的格式保存至本地。比如:`-o oci-archive:/root/image.tar:busybox:latest`。 - -除去各个flags之外,build子命令的命令行最后还会接收一个argument,该argument类型是string,意义为context,即该Dockerfile构建环境的上下文。该参数缺省值为isula-build被执行的当前路径。该路径会影响 .dockerignore 和 Dockerfile的ADD/COPY指令 所检索的路径。 - -**\--proxy** - -选择构建时RUN指令启动的容器是否从环境上继承proxy相关环境变量“http_proxy”,“https_proxy”,“ftp_proxy”,“no_proxy”,“HTTP_PROXY”,“HTTPS_PROXY”,“FTP_PROXY”,“NO_PROXY”,默认为true。 - -当用户在Dockerfile配置proxy相关ARG或ENV,将覆盖所继承的环境变量。 - -注意:若client与daemon不在同一个终端运行,所能继承的环境变量为daemon所在终端的环境变量。 - -**\--tag** - -设置镜像构建成功之后,该镜像在本地磁盘存储时的tag。 - -**\--cap-add** - -添加构建过程中RUN指令所需权限,用法: - -```sh -isula-build ctr-img build --cap-add ${CAP} -``` - -使用举例: - -```sh -sudo isula-build ctr-img build --cap-add CAP_SYS_ADMIN 
--cap-add CAP_SYS_PTRACE -f Dockerfile -``` - -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> * isula-build最大支持并发构建100个容器镜像。 -> * isula-build支持Dockerfile最大为1MiB。 -> * isula-build支持 .dockerignore 最大为 1MiB。 -> * 用户需保证Dockerfile文件的权限为仅当前用户可读写,避免别的用户进行篡改。 -> * 构建时,RUN指令会启动容器在容器内进行构建,目前 isula-build 仅支持使用主机网络。 -> * isula-build 导出的镜像压缩格式,目前仅支持tar格式。 -> * isula-build 在每一个镜像构建stage完成后做一次提交,而不是每执行 Dockerfile的一行就提交一次。 -> * isula-build 暂不支持构建缓存。 -> * isula-build 仅在构建RUN指令时会启动构建容器。 -> * 目前不支持docker镜像格式的history功能。 -> * isula-build 的stage name支持以数字开头。 -> * isula-build 的stage name最长可为64个字符。 -> * isula-build 暂不支持对单次Dockerfile的构建进行资源限制。如有资源限制需求,可通过对 isula-builder 服务端配置资源限额的方式进行限制。 -> * isula-build 目前不支持Dockerfile里的ADD指令提供的数据来源是远端url。 -> * isula-build 使用docker-archive以及oci-archive类型导出的本地tar包未经压缩。如有需求,用户可以手动进行压缩。 - -#### image: 查看本地持久化构建镜像 - -可通过images命令查看当前本地持久化存储的镜像: - -```sh -$ sudo isula-build ctr-img images ---------------------------------------- ----------- ----------------- ------------------------ ------------ -REPOSITORY TAG IMAGE ID CREATED SIZE ---------------------------------------- ----------- ----------------- ------------------------ ------------ -localhost:5000/library/alpine latest a24bb4013296 2022-01-17 10:02:19 5.85 MB - 39b62a3342ee 2022-01-17 10:01:12 1.45 MB ---------------------------------------- ----------- ----------------- ------------------------ ------------ -``` - -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> 通过`isula-build ctr-img images`查看的镜像大小与`docker images`的显示上有一定差异。这是因为统计镜像大小时,isula-build是直接计算每层tar包大小之和,而docker是通过解压tar遍历diff目录计算文件大小之和,因此存在统计上的差异。 - -#### import: 导入容器基础镜像 - -可以通过`ctr-img import`指令将rootfs形式的tar文件导入到isula-build中。 - -命令原型如下: - -```sh -isula-build ctr-img import [flags] -``` - -使用举例: - -```sh -$ sudo isula-build ctr-img import busybox.tar mybusybox:latest -Getting image source signatures -Copying blob sha256:7b8667757578df68ec57bfc9fb7754801ec87df7de389a24a26a7bf2ebc04d8d -Copying config 
sha256:173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1 -Writing manifest to image destination -Storing signatures -Import success with image id: "173b3cf612f8e1dc34e78772fcf190559533a3b04743287a32d549e3c7d1c1d1" -$ sudo isula-build ctr-img images ---------------------------------------- ----------- ----------------- ------------------------ ------------ -REPOSITORY TAG IMAGE ID CREATED SIZE ---------------------------------------- ----------- ----------------- ------------------------ ------------ -mybusybox latest 173b3cf612f8 2022-01-12 16:02:31 1.47 MB ---------------------------------------- ----------- ----------------- ------------------------ ------------ -``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> isula-build 支持导入最大1GiB的容器基础镜像。 - -#### load: 导入层叠镜像 - -层叠镜像指的是通过 docker save 或 isula-build ctr-img save 等指令,将一个构建完成的镜像保存至本地之后,镜像压缩包内是一层一层 layer.tar 的镜像包。可以通过 ctr-img load 指令将它导入至 isula-build。 - -命令原型如下: - -```sh -isula-build ctr-img load [flags] -``` - -目前支持的 flags 为: - -* -i, --input:本地tar包的路径 - -使用举例如下: - -```sh -$ sudo isula-build ctr-img load -i ubuntu.tar -Getting image source signatures -Copying blob sha256:cf612f747e0fbcc1674f88712b7bc1cd8b91cf0be8f9e9771235169f139d507c -Copying blob sha256:f934e33a54a60630267df295a5c232ceb15b2938ebb0476364192b1537449093 -Copying blob sha256:943edb549a8300092a714190dfe633341c0ffb483784c4fdfe884b9019f6a0b4 -Copying blob sha256:e7ebc6e16708285bee3917ae12bf8d172ee0d7684a7830751ab9a1c070e7a125 -Copying blob sha256:bf6751561805be7d07d66f6acb2a33e99cf0cc0a20f5fd5d94a3c7f8ae55c2a1 -Copying blob sha256:c1bd37d01c89de343d68867518b1155cb297d8e03942066ecb44ae8f46b608a3 -Copying blob sha256:a84e57b779297b72428fc7308e63d13b4df99140f78565be92fc9dbe03fc6e69 -Copying blob sha256:14dd68f4c7e23d6a2363c2320747ab88986dfd43ba0489d139eeac3ac75323b2 -Copying blob sha256:a2092d776649ea2301f60265f378a02405539a2a68093b2612792cc65d00d161 -Copying blob 
sha256:879119e879f682c04d0784c9ae7bc6f421e206b95d20b32ce1cb8a49bfdef202 -Copying blob sha256:e615448af51b848ecec00caeaffd1e30e8bf5cffd464747d159f80e346b7a150 -Copying blob sha256:f610bd1e9ac6aa9326d61713d552eeefef47d2bd49fc16140aa9bf3db38c30a4 -Copying blob sha256:bfe0a1336d031bf5ff3ce381e354be7b2bf310574cc0cd1949ad94dda020cd27 -Copying blob sha256:f0f15db85788c1260c6aa8ad225823f45c89700781c4c793361ac5fa58d204c7 -Copying config sha256:c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e -Writing manifest to image destination -Storing signatures -Loaded image as c07ddb44daa97e9e8d2d68316b296cc9343ab5f3d2babc5e6e03b80cd580478e -``` - ->![](./public_sys-resources/icon-note.gif) **说明:** -> -> * isula-build 支持导入最大50G的容器层叠镜像。 -> * isula-build 会自动识别容器层叠镜像的格式并进行导入。 - -#### rm: 删除本地持久化镜像 - -可通过rm命令删除当前本地持久化存储的镜像。命令原型为: - -```sh -isula-build ctr-img rm IMAGE [IMAGE...] [FLAGS] -``` - -目前支持的 flags 为: - -* -a, --all:删除所有本地持久化存储的镜像。 -* -p, --prune:删除所有没有tag的本地持久化存储的镜像。 - -使用示例如下: - -```sh -$ sudo isula-build ctr-img rm -p -Deleted: sha256:78731c1dde25361f539555edaf8f0b24132085b7cab6ecb90de63d72fa00c01d -Deleted: sha256:eeba1bfe9fca569a894d525ed291bdaef389d28a88c288914c1a9db7261ad12c -``` - -#### save: 导出层叠镜像 - -可通过save命令导出层叠镜像到本地磁盘。命令原型如下: - -```sh -isula-build ctr-img save [REPOSITORY:TAG]|imageID -o xx.tar -``` - -目前支持的 flags 为: - -* -f, --format:导出层叠镜像的镜像格式:oci | docker(需开启实验特性选项) -* -o, --output:本地tar包路径 - -以下示例通过 `image/tag` 的形式将镜像进行导出: - -```sh -$ sudo isula-build ctr-img save busybox:latest -o busybox.tar -Getting image source signatures -Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a -Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f -Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef -Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6 -Writing manifest to image destination -Storing signatures -Save success with 
image: busybox:latest
-```
-
-以下示例通过 `ImageID` 的形式将镜像进行导出:
-
-```sh
-$ sudo isula-build ctr-img save 21c3e96ac411 -o busybox.tar
-Getting image source signatures
-Copying blob sha256:50644c29ef5a27c9a40c393a73ece2479de78325cae7d762ef3cdc19bf42dd0a
-Copying blob sha256:824082a6864774d5527bda0d3c7ebd5ddc349daadf2aa8f5f305b7a2e439806f
-Copying blob sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef
-Copying config sha256:21c3e96ac411242a0e876af269c0cbe9d071626bdfb7cc79bfa2ddb9f7a82db6
-Writing manifest to image destination
-Storing signatures
-Save success with image: 21c3e96ac411
-```
-
-以下示例导出多个镜像到同一个tar包:
-
-```sh
-$ sudo isula-build ctr-img save busybox:latest nginx:latest -o all.tar
-Getting image source signatures
-Copying blob sha256:eb78099fbf7fdc70c65f286f4edc6659fcda510b3d1cfe1caa6452cc671427bf
-Copying blob sha256:29f11c413898c5aad8ed89ad5446e89e439e8cfa217cbb404ef2dbd6e1e8d6a5
-Copying blob sha256:af5bd3938f60ece203cd76358d8bde91968e56491daf3030f6415f103de26820
-Copying config sha256:b8efb18f159bd948486f18bd8940b56fd2298b438229f5bd2bcf4cedcf037448
-Writing manifest to image destination
-Storing signatures
-Getting image source signatures
-Copying blob sha256:e2d6930974a28887b15367769d9666116027c411b7e6c4025f7c850df1e45038
-Copying config sha256:a33de3c85292c9e65681c2e19b8298d12087749b71a504a23c576090891eedd6
-Writing manifest to image destination
-Storing signatures
-Save success with image: [busybox:latest nginx:latest]
-```
-
->![](./public_sys-resources/icon-note.gif) **说明:**
->
-> * save 导出的镜像默认格式为未压缩的tar格式,如有需求,用户可以在save之后手动压缩。
-> * 在使用镜像名导出镜像时,需要给出完整的镜像名格式:REPOSITORY:TAG。
-
-#### tag: 给本地持久化镜像打标签
-
-可使用tag命令给本地持久化的容器镜像打tag。命令原型如下:
-
-```sh
-isula-build ctr-img tag [REPOSITORY:TAG]|imageID REPOSITORY:TAG
-```
-
-使用举例:
-
-```sh
-$ sudo isula-build ctr-img images
---------------------------------------- ----------- ----------------- -------------------------- ------------
-REPOSITORY TAG IMAGE ID CREATED SIZE
---------------------------------------
----------- ----------------- -------------------------- ------------ -alpine latest a24bb4013296 2020-05-29 21:19:46 5.85 MB ---------------------------------------- ----------- ----------------- -------------------------- ------------ -$ sudo isula-build ctr-img tag a24bb4013296 alpine:v1 -$ sudo isula-build ctr-img images ---------------------------------------- ----------- ----------------- ------------------------ ------------ -REPOSITORY TAG IMAGE ID CREATED SIZE ---------------------------------------- ----------- ----------------- ------------------------ ------------ -alpine latest a24bb4013296 2020-05-29 21:19:46 5.85 MB -alpine v1 a24bb4013296 2020-05-29 21:19:46 5.85 MB ---------------------------------------- ----------- ----------------- ------------------------ ------------ -``` - -#### pull: 拉取镜像到本地 - -可通过pull命令拉取远程镜像仓库中的镜像到本地。命令原型如下: - -```sh -isula-build ctr-img pull REPOSITORY[:TAG] -``` - -使用示例: - -```sh -$ sudo isula-build ctr-img pull example-registry/library/alpine:latest -Getting image source signatures -Copying blob sha256:8f52abd3da461b2c0c11fda7a1b53413f1a92320eb96525ddf92c0b5cde781ad -Copying config sha256:e4db68de4ff27c2adfea0c54bbb73a61a42f5b667c326de4d7d5b19ab71c6a3b -Writing manifest to image destination -Storing signatures -Pull success with image: example-registry/library/alpine:latest -``` - -#### push: 将本地镜像推送到远程仓库 - -可通过push命令将本地镜像推送到远程仓库。命令原型如下: - -```sh -isula-build ctr-img push REPOSITORY[:TAG] -``` - -目前支持的 flags 为: - -* -f, --format:推送的镜像格式:oci|docker(需开启实验特性选项) - -使用示例: - -```sh -$ sudo isula-build ctr-img push example-registry/library/mybusybox:latest -Getting image source signatures -Copying blob sha256:d2421964bad195c959ba147ad21626ccddc73a4f2638664ad1c07bd9df48a675 -Copying config sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f -Writing manifest to image destination -Storing signatures -Push success with image: example-registry/library/mybusybox:latest -``` - 
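
将本地镜像推送到远程仓库前,需要先通过 login 完成对应仓库的登录。下面给出一个把 login 与 push 串联起来的示意流程,其中 creds.txt(保存密码的文件)、cooper(用户名)与 example-registry(仓库地址)均为示意值,需替换为实际环境中的内容:

```sh
# 通过stdin传入密码完成登录(-p 即 --password-stdin),
# 避免密码直接出现在命令行参数中
cat creds.txt | sudo isula-build login -u cooper -p example-registry

# 登录成功后,将本地镜像推送到该仓库
sudo isula-build ctr-img push example-registry/library/mybusybox:latest

# 推送完成后可按需登出
sudo isula-build logout example-registry
```
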
->![](./public_sys-resources/icon-note.gif) **说明:** -> -> 推送镜像时,需要先登录对应的镜像仓库 - -### info: 查看运行环境与系统信息 - -可以通过“isula-build info”指令查看 isula-build 目前的运行环境与系统信息。命令原型如下: - -```sh -isula-build info [flags] -``` - -支持如下Flags: - -* -H, --human-readable 布尔值,以常用内存表示格式打印内存信息,使用1000次幂 -* -V, --verbose 布尔值,显示运行时内存占用信息 - -使用示例: - -```sh -$ sudo isula-build info -HV - General: - MemTotal: 7.63 GB - MemFree: 757 MB - SwapTotal: 8.3 GB - SwapFree: 8.25 GB - OCI Runtime: runc - DataRoot: /var/lib/isula-build/ - RunRoot: /var/run/isula-build/ - Builders: 0 - Goroutines: 12 - Store: - Storage Driver: overlay - Backing Filesystem: extfs - Registry: - Search Registries: - oepkgs.net - Insecure Registries: - localhost:5000 - oepkgs.net - Runtime: - MemSys: 68.4 MB - HeapSys: 63.3 MB - HeapAlloc: 7.41 MB - MemHeapInUse: 8.98 MB - MemHeapIdle: 54.4 MB - MemHeapReleased: 52.1 MB -``` - -### login: 登录远端镜像仓库 - -用户可以运行 login 命令来登录远程镜像仓库。命令原型如下: - -```sh - isula-build login SERVER [FLAGS] -``` - -目前支持的flag有: - -```Conf - Flags: - -p, --password-stdin Read password from stdin - -u, --username string Username to access registry -``` - -通过stdin输入密码。以下示例通过通过管道将creds.txt里的密码传给isula-build的stdin进行输入: - -```sh - $ cat creds.txt | sudo isula-build login -u cooper -p mydockerhub.io - Login Succeeded -``` - -通过交互式输入密码: - -```sh - $ sudo isula-build login mydockerhub.io -u cooper - Password: - Login Succeeded -``` - -### logout: 退出远端镜像仓库 - -用户可以运行 logout 命令来登出远程镜像仓库。命令原型如下: - -```sh -isula-build logout [SERVER] [FLAGS] -``` - -目前支持的flag有: - -```sh - Flags: - -a, --all Logout all registries -``` - -使用示例如下: - -```sh -$ sudo isula-build logout -a - Removed authentications -``` - -### version: 版本查询 - -可通过version命令查看当前版本信息: - -```sh -$ sudo isula-build version -Client: - Version: 0.9.6-4 - Go Version: go1.15.7 - Git Commit: 83274e0 - Built: Wed Jan 12 15:32:55 2022 - OS/Arch: linux/amd64 - -Server: - Version: 0.9.6-4 - Go Version: go1.15.7 - Git Commit: 83274e0 - Built: Wed Jan 12 15:32:55 2022 - OS/Arch: 
linux/amd64 -``` - -### manifest: manifest列表管理 - -manifest列表包含不同系统架构对应的镜像信息,通过使用manifest列表,用户可以在不同的架构中使用相同的manifest(例如openeuler:latest)获取对应架构的镜像,manifest包含create、annotate、inspect和push子命令。 -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> manifest为实验特性,使用时需开启客户端和服务端的实验选项,方式详见客户端总体说明和配置服务章节。 - -#### create: manifest列表创建 - -manifest的子命令create用于创建manifest列表,命令原型为: - -```sh -isula-build manifest create MANIFEST_LIST MANIFEST [MANIFEST...] -``` - -用户可以指定manifest列表的名称以及需要加入到列表中的远程镜像,若不指定任何远程镜像,则会创建一个空的manifest列表。 - -使用示例如下: - -```sh -sudo isula-build manifest create openeuler localhost:5000/openeuler_x86:latest localhost:5000/openeuler_aarch64:latest -``` - -#### annotate: manifest列表更新 - -manifest的子命令annotate用于更新manifest列表,命令原型为: - -```sh -isula-build manifest annotate MANIFEST_LIST MANIFEST [flags] -``` - -用户可以指定需要更新的manifest列表以及其中的镜像,通过flags指定需要更新的选项,此命令也可用于添加新的镜像到列表中。 - -其中annotate包含如下flags: - -* --arch: string,重写镜像适用架构 -* --os: string,重写镜像适用系统 -* --os-features: string列表,指定镜像需要的OS特性,很少使用 -* --variant: string,指定列表中记录镜像的变量 - -使用示例如下: - -```sh -sudo isula-build manifest annotate --os linux --arch arm64 openeuler:latest localhost:5000/openeuler_aarch64:latest -``` - -#### inspect: manifest列表查询 - -manifest子命令inspect用于查询manifest列表信息,命令原型为: - -```sh -isula-build manifest inspect MANIFEST_LIST -``` - -使用示例如下: - -```sh -$ sudo isula-build manifest inspect openeuler:latest -{ - "schemaVersion": 2, - "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json", - "manifests": [ - { - "mediaType": "application/vnd.docker.distribution.manifest.v2+json", - "size": 527, - "digest": "sha256:bf510723d2cd2d4e3f5ce7e93bf1e52c8fd76831995ac3bd3f90ecc866643aff", - "platform": { - "architecture": "amd64", - "os": "linux" - } - }, - { - "mediaType": "application/vnd.docker.distribution.manifest.v2+json", - "size": 527, - "digest": "sha256:f814888b4bb6149bd39ba8375a1932fb15071b4dbffc7f76c7b602b06abbb820", - "platform": { - "architecture": "arm64", - "os": "linux" - } - } - 
] -} -``` - -#### push: 将manifest列表推送到远程仓库 - -manifest子命令push用于将manifest列表推送到远程仓库,命令原型为: - -```sh -isula-build manifest push MANIFEST_LIST DESTINATION -``` - -使用示例如下: - -```sh -sudo isula-build manifest push openeuler:latest localhost:5000/openeuler:latest -``` - -## 直接集成容器引擎 - -isula-build可以与iSulad和docker集成,将构建好的容器镜像导入到容器引擎的本地存储中。 - -### 与iSulad集成 - -支持将构建成功的镜像直接导出到iSulad。 - -命令行举例: - -```sh -sudo isula-build ctr-img build -f Dockerfile -o isulad:busybox:2.0 -``` - -通过在-o参数中指定iSulad,将构建好的容器镜像导出到iSulad,可以通过isula images查询: - -```sh -$ sudo isula images -isula images -REPOSITORY TAG IMAGE ID CREATED SIZE -busybox 2.0 2d414a5cad6d 2020-08-01 06:41:36 5.577 MB -``` - -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> * 要求isula-build和iSulad在同一节点。 -> * 直接导出镜像到iSulad时,isula-build client端需要将构建成功的镜像暂存成 `/var/lib/isula-build/tmp/[buildid]/isula-build-tmp-%v.tar` 再导入至 iSulad,用户需要保证 /var/lib/isula-build/tmp/ 目录有足够磁盘空间;同时如果在导出过程中 isula-build client进程被KILL或Ctrl+C终止,需要依赖用户手动清理 `/var/lib/isula-build/tmp/[buildid]/isula-build-tmp-%v.tar` 文件。 - -### 与Docker集成 - -支持将构建成功的镜像直接导出到Docker daemon。 - -命令行举例: - -```sh -sudo isula-build ctr-img build -f Dockerfile -o docker-daemon:busybox:2.0 -``` - -通过在-o参数中指定docker-daemon,将构建好的容器镜像导出到docker, 可以通过docker images查询。 - -```sh -$ sudo docker images -REPOSITORY TAG IMAGE ID CREATED SIZE -busybox 2.0 2d414a5cad6d 2 months ago 5.22MB -``` - -> ![](./public_sys-resources/icon-note.gif) **说明:** -> -> 要求isula-build和Docker在同一节点。 - -## 使用注意事项 - -本章节主要介绍在使用isula-build构建镜像时相关的约束和限制,以及与docker build的差异。 - -### 约束和限制 - -1. 当导出镜像到[`iSulad`](https://gitee.com/openeuler/iSulad/blob/master/README.md/)时,镜像必须指明tag。 -2. 因为isula-builder运行`RUN`指令时,需要调用系统中的oci 运行时(如`runc`),用户需要保证该运行时的安全性,不受篡改。 -3. `DataRoot`不能设置在内存盘上(tmpfs)。 -4. `Overlay2`是目前isula-builder唯一支持的存储驱动。 -5. `Docker`镜像是目前唯一支持的镜像格式,未来即将支持`oci`格式镜像。 -6. `Dockerfile`文件权限强烈建议设置为**0600**以防止恶意篡改。 -7. `RUN`命令中目前只支持主机侧网络(host network)。 -8. 当导出镜像到本地tar包时,目前只支持保存为`tar`格式。 -9. 
当使用`import`功能导入基础镜像时,最大支持**1G**。 - -### 与“docker build”差异 - -`isula-build`兼容[Docker镜像格式规范](https://docs.docker.com/engine/reference/builder/),但仍然和`docker build`存在一些差异: - -1. 支持镜像压缩,即对每个`stage`进行提交而非每一行。 -2. 目前不支持构建缓存。 -3. 只有`RUN`指令会运行容器进行构建。 -4. 目前不支持查询镜像构建历史。 -5. `Stage`名称可以用数字开头。 -6. `Stage`名称最大长度为64。 -7. `ADD`命令不支持远端URL格式。 -8. 暂不支持对单次构建进行资源限额,可采取对isula-builder配置资源限额的方式进行限制。 -9. 统计镜像大小时,isula-build是直接计算每层tar包大小之和,而docker是通过解压tar遍历diff目录计算文件大小之和,因此通过`isula-build ctr-img images`查看的镜像大小与`docker images`的显示上有一定差异。 -10. 操作时的镜像名称需要明确,格式为IMAGE_NAME:IMAGE_TAG。例如 busybox:latest, 其中latest不可省略。 - -## 附录 - -### 命令行参数说明 - -**表1** ctr-img build 命令参数列表 - -| **命令** | **参数** | **说明** | -| ------------- | -------------- | ------------------------------------------------------------ | -| ctr-img build | --build-arg | string列表,构建过程中需要用到的变量 | -| | --build-static | KV值,构建二进制一致性。目前包含如下K值:- build-time:string,使用固定时间戳来构建容器镜像;时间戳格式为“YYYY-MM-DD HH-MM-SS” | -| | -f, --filename | string,Dockerfile的路径,不指定则是使用当前路径的Dockerfile文件 | -| | --format | string,设置构建镜像的镜像格式:oci|docker(需开启实验特性选项)| -| | --iidfile | string,输出 image ID 到本地文件 | -| | -o, --output | string,镜像导出的方式和路径 | -| | --proxy | 布尔值,继承主机侧环境的proxy环境变量(默认为true) | -| | --tag | string,给构建的镜像添加tag | -| | --cap-add | string列表,构建过程中RUN指令所需要的权限 | - -**表2** ctr-img load 命令参数列表 - -| **命令** | **参数** | **说明** | -| ------------ | ----------- | --------------------------------- | -| ctr-img load | -i, --input | string,需要导入的本地tar包的路径 | - -**表3** ctr-img push 命令参数列表 - -| **命令** | **参数** | **说明** | -| ------------ | ----------- | --------------------------------- | -| ctr-img push | -f, --format | string,推送的镜像格式:oci|docker(需开启实验特性选项)| - -**表4** ctr-img rm 命令参数列表 - -| **命令** | **参数** | **说明** | -| ---------- | ----------- | --------------------------------------------- | -| ctr-img rm | -a, --all | 布尔值,删除所有本地持久化存储的镜像 | -| | -p, --prune | 布尔值,删除所有没有tag的本地持久化存储的镜像 | - -**表5** ctr-img save 命令参数列表 - -| **命令** | **参数** | **说明** | -| ------------ | ------------ | 
---------------------------------- | -| ctr-img save | -o, --output | string,镜像导出后在本地的存储路径 | -| | -f, --format | string,导出层叠镜像的镜像格式:oci|docker(需开启实验特性选项)| - -**表6** login 命令参数列表 - -| **命令** | **参数** | **说明** | -| -------- | -------------------- | ------------------------------------------------------- | -| login | -p, --password-stdin | 布尔值,是否通过stdin读入密码;或采用交互式界面输入密码 | -| | -u, --username | string,登录镜像仓库所使用的用户名 | - -**表7** logout 命令参数列表 - -| **命令** | **参数** | **说明** | -| -------- | --------- | ------------------------------------ | -| logout | -a, --all | 布尔值,是否登出所有已登录的镜像仓库 | - -**表8** manifest annotate命令参数列表 - -| **命令** | **说明** | **参数** | -| ----------------- | ------------- | ------------------------------------------ | -| manifest annotate | --arch | string,重写镜像适用架构 | -| | --os | string,重写镜像适用系统 | -| | --os-features | string列表,指定镜像需要的OS特性,很少使用 | -| | --variant | string,指定列表中记录镜像的变量 | - -### 通信矩阵 - -isula-build两个组件进程之间通过unix socket套接字文件进行通信,无端口通信。 - -### 文件与权限 - -* isula-build 所有的操作均需要使用 root 权限。如需使用非特权用户操作,则需要配置--group参数 - -* isula-build 运行涉及文件权限如下表所示: - -| **文件路径** | **文件/文件夹权限** | **说明** | -| ------------------------------------------- | ------------------- | ------------------------------------------------------------ | -| /usr/bin/isula-build | 550 | 命令行工具二进制文件。 | -| /usr/bin/isula-builder | 550 | 服务端isula-builder进程二进制文件。 | -| /usr/lib/systemd/system/isula-build.service | 640 | systemd配置文件,用于管理isula-build服务。 | -| /etc/isula-build | 650 | isula-builder 配置文件根目录 | -| /etc/isula-build/configuration.toml | 600 | isula-builder 总配置文件,包含设置 isula-builder 日志级别、持久化目录和运行时目录、OCI runtime等。 | -| /etc/isula-build/policy.json | 600 | 签名验证策略文件的语法文件。 | -| /etc/isula-build/registries.toml | 600 | 针对各个镜像仓库的配置文件,含可用的镜像仓库列表、镜像仓库黑名单。 | -| /etc/isula-build/storage.toml | 600 | 本地持久化存储的配置文件,包含所使用的存储驱动的配置。 | -| /etc/isula-build/isula-build.pub | 400 | 非对称加密公钥文件 | -| /var/run/isula_build.sock | 660 | 服务端isula-builder的本地套接字。 | -| /var/lib/isula-build | 700 | 本地持久化目录。 | -| 
/var/run/isula-build | 700 | 本地运行时目录。 | -| /var/lib/isula-build/tmp/[buildid]/isula-build-tmp-*.tar | 644 | 镜像导出至iSulad时的本地暂存目录。 | diff --git "a/docs/zh/docs/Container/isula-build\351\231\204\345\275\225.md" "b/docs/zh/docs/Container/isula-build\351\231\204\345\275\225.md" new file mode 100644 index 0000000000000000000000000000000000000000..35b0fbcc9e4ad33ac420f0706a2c35acdaefa780 --- /dev/null +++ "b/docs/zh/docs/Container/isula-build\351\231\204\345\275\225.md" @@ -0,0 +1,91 @@ +# 附录 + +## 命令行参数说明 + +**表1** ctr-img build 命令参数列表 + +| **命令** | **参数** | **说明** | +| ------------- | -------------- | ------------------------------------------------------------ | +| ctr-img build | --build-arg | string列表,构建过程中需要用到的变量 | +| | --build-static | KV值,构建二进制一致性。目前包含如下K值:- build-time:string,使用固定时间戳来构建容器镜像;时间戳格式为“YYYY-MM-DD HH-MM-SS” | +| | -f, --filename | string,Dockerfile的路径,不指定则是使用当前路径的Dockerfile文件 | +| | --format | string,设置构建镜像的镜像格式:oci|docker(需开启实验特性选项)| +| | --iidfile | string,输出 image ID 到本地文件 | +| | -o, --output | string,镜像导出的方式和路径 | +| | --proxy | 布尔值,继承主机侧环境的proxy环境变量(默认为true) | +| | --tag | string,给构建的镜像添加tag | +| | --cap-add | string列表,构建过程中RUN指令所需要的权限 | + +**表2** ctr-img load 命令参数列表 + +| **命令** | **参数** | **说明** | +| ------------ | ----------- | --------------------------------- | +| ctr-img load | -i, --input | string,需要导入的本地tar包的路径 | + +**表3** ctr-img push 命令参数列表 + +| **命令** | **参数** | **说明** | +| ------------ | ----------- | --------------------------------- | +| ctr-img push | -f, --format | string,推送的镜像格式:oci|docker(需开启实验特性选项)| + +**表4** ctr-img rm 命令参数列表 + +| **命令** | **参数** | **说明** | +| ---------- | ----------- | --------------------------------------------- | +| ctr-img rm | -a, --all | 布尔值,删除所有本地持久化存储的镜像 | +| | -p, --prune | 布尔值,删除所有没有tag的本地持久化存储的镜像 | + +**表5** ctr-img save 命令参数列表 + +| **命令** | **参数** | **说明** | +| ------------ | ------------ | ---------------------------------- | +| ctr-img save | -o, --output | string,镜像导出后在本地的存储路径 | +| | -f, --format | 
string,导出层叠镜像的镜像格式:oci|docker(需开启实验特性选项)| + +**表6** login 命令参数列表 + +| **命令** | **参数** | **说明** | +| -------- | -------------------- | ------------------------------------------------------- | +| login | -p, --password-stdin | 布尔值,是否通过stdin读入密码;或采用交互式界面输入密码 | +| | -u, --username | string,登录镜像仓库所使用的用户名 | + +**表7** logout 命令参数列表 + +| **命令** | **参数** | **说明** | +| -------- | --------- | ------------------------------------ | +| logout | -a, --all | 布尔值,是否登出所有已登录的镜像仓库 | + +**表8** manifest annotate命令参数列表 + +| **命令** | **参数** | **说明** | +| ----------------- | ------------- | ------------------------------------------ | +| manifest annotate | --arch | string,重写镜像适用架构 | +| | --os | string,重写镜像适用系统 | +| | --os-features | string列表,指定镜像需要的OS特性,很少使用 | +| | --variant | string,指定列表中记录镜像的变量 | + +## 通信矩阵 + +isula-build两个组件进程之间通过unix socket套接字文件进行通信,无端口通信。 + +## 文件与权限 + +* isula-build 所有的操作均需要使用 root 权限。如需使用非特权用户操作,则需要配置--group参数 + +* isula-build 运行涉及文件权限如下表所示: + +| **文件路径** | **文件/文件夹权限** | **说明** | +| ------------------------------------------- | ------------------- | ------------------------------------------------------------ | +| /usr/bin/isula-build | 550 | 命令行工具二进制文件。 | +| /usr/bin/isula-builder | 550 | 服务端isula-builder进程二进制文件。 | +| /usr/lib/systemd/system/isula-build.service | 640 | systemd配置文件,用于管理isula-build服务。 | +| /etc/isula-build | 650 | isula-builder 配置文件根目录 | +| /etc/isula-build/configuration.toml | 600 | isula-builder 总配置文件,包含设置 isula-builder 日志级别、持久化目录和运行时目录、OCI runtime等。 | +| /etc/isula-build/policy.json | 600 | 签名验证策略文件的语法文件。 | +| /etc/isula-build/registries.toml | 600 | 针对各个镜像仓库的配置文件,含可用的镜像仓库列表、镜像仓库黑名单。 | +| /etc/isula-build/storage.toml | 600 | 本地持久化存储的配置文件,包含所使用的存储驱动的配置。 | +| /etc/isula-build/isula-build.pub | 400 | 非对称加密公钥文件 | +| /var/run/isula_build.sock | 660 | 服务端isula-builder的本地套接字。 | +| /var/lib/isula-build | 700 | 本地持久化目录。 | +| /var/run/isula-build | 700 | 本地运行时目录。 | +| /var/lib/isula-build/tmp/[buildid]/isula-build-tmp-*.tar | 644 | 镜像导出至iSulad时的本地暂存目录。 
| diff --git "a/docs/zh/docs/Container/isula\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" "b/docs/zh/docs/Container/isula\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" new file mode 100644 index 0000000000000000000000000000000000000000..b41f205b7a5dc5bb145da44a387ec15742a39480 --- /dev/null +++ "b/docs/zh/docs/Container/isula\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -0,0 +1,22 @@ +# 常见问题与解决方法 + +## **问题1:修改`iSulad`默认运行时为`lxc`,启动容器报错:Failed to initialize engine or runtime** + +原因:`iSulad`默认运行时为`runc`,设置默认运行时为`lxc`时缺少依赖。 + +解决方法:若需修改`iSulad`默认运行时为`lxc`,需要安装`lcr`、`lxc`软件包依赖,且配置`iSulad`配置文件中`runtime`为`lcr` +或者启动容器时指定`--runtime lcr`。启动容器后不应该随意卸载`lcr`、`lxc`软件包,否则可能会导致删除容器时的资源残留。 + +## **问题2:使用`iSulad` `CRI V1`接口,报错:rpc error: code = Unimplemented desc =** + +原因:`iSulad`同时支持`CRI V1alpha2`和`CRI V1`接口,默认使用`CRI V1alpha2`,若使用`CRI V1`,需要开启相应的配置。 + +解决方法:在`iSulad`配置文件`/etc/isulad/daemon.json`中开启`CRI V1`的配置。 + +```json +{ + "enable-cri-v1": true +} +``` + +若使用源码编译`iSulad`,还需在编译时增加`cmake`编译选项`-D ENABLE_CRI_API_V1=ON`。 diff --git "a/docs/zh/docs/Container/\345\215\207\347\272\247.md" "b/docs/zh/docs/Container/\345\215\207\347\272\247.md" index c93a36cbeb91dde4ffd6d783f1608b2bec9a6a64..805ef107b5d279d25e66398f2d17779b02a7b71c 100644 --- "a/docs/zh/docs/Container/\345\215\207\347\272\247.md" +++ "b/docs/zh/docs/Container/\345\215\207\347\272\247.md" @@ -10,7 +10,7 @@ >![](./public_sys-resources/icon-note.gif) **说明:** >- 可通过** sudo rpm -qa |grep iSulad** 或 **isula version** 命令确认当前iSulad的版本号。 ->- 相同大版本之间,如果希望手动升级,请下载iSulad及其所有依赖库的RPM包进行升级,参考命令如下: +>- 相同大版本之间,如果希望手动升级,请下载iSulad及其所有依赖的RPM包进行升级,参考命令如下: > ``` > # sudo rpm -Uhv iSulad-xx.xx.xx-YYYYmmdd.HHMMSS.gitxxxxxxxx.aarch64.rpm > ``` @@ -18,4 +18,8 @@ > ``` > # sudo rpm -Uhv --force 
iSulad-xx.xx.xx-YYYYmmdd.HHMMSS.gitxxxxxxxx.aarch64.rpm > ``` - +>- 如若iSulad依赖的libisula组件发生升级,iSulad应该与对应版本的libisula一起升级,参考命令如下: +> ``` +> # sudo rpm -Uvh libisula-xx.xx.xx-YYYYmmdd.HHMMSS.gitxxxxxxxx.aarch64.rpm iSulad-xx.xx.xx-YYYYmmdd.HHMMSS.gitxxxxxxxx.aarch64.rpm +> ``` +>- iSulad在openEuler 22.03-LTS-SP3之前的版本使用lcr作为默认容器运行时。因此,跨此版本升级时,升级之前创建的容器仍使用lcr作为容器运行时,只有升级之后创建的容器才会采用新版本的默认运行时runc。若在新版本中仍需使用lcr容器运行时,需要将iSulad默认配置文件(默认为/etc/isulad/daemon.json)中的default-runtime修改为lcr,或者在运行容器时指定容器运行时为lcr(--runtime lcr)。升级时若对应的lcr、lxc版本发生升级,同样应与iSulad一起升级。 diff --git "a/docs/zh/docs/Container/\345\256\211\350\243\205\344\270\216\351\205\215\347\275\256.md" "b/docs/zh/docs/Container/\345\256\211\350\243\205\344\270\216\351\205\215\347\275\256.md" index 1f12f24530031cee5faddbe16571daf5a60dfec6..8f36bbb0558592c4c87b505d564ede5a3c4157b2 100644 --- "a/docs/zh/docs/Container/\345\256\211\350\243\205\344\270\216\351\205\215\347\275\256.md" +++ "b/docs/zh/docs/Container/\345\256\211\350\243\205\344\270\216\351\205\215\347\275\256.md" @@ -24,7 +24,7 @@ iSulad可以通过yum或rpm命令两种方式安装,由于yum会自动安装 ``` -- 使用rpm安装iSulad,需要下载iSulad及其所有依赖库的RPM包,然后手动安装。安装单个iSulad的RPM包(依赖包安装方式相同),参考命令如下: +- 使用rpm安装iSulad,需要下载iSulad及其所有依赖的RPM包,然后手动安装。安装单个iSulad的RPM包(依赖包安装方式相同),参考命令如下: ``` # sudo rpm -ihv iSulad-xx.xx.xx-xx.xxx.aarch64.rpm diff --git "a/docs/zh/docs/DPU-OS/DPU-OS\350\243\201\345\211\252\346\214\207\345\257\274.md" "b/docs/zh/docs/DPU-OS/DPU-OS\350\243\201\345\211\252\346\214\207\345\257\274.md" index 9a5da0a865ef1e5d75a8dfe9e1c6aab404b8cbce..a1f4790b9aa0ae067675048a57bf69e5f5c51800 100644 --- "a/docs/zh/docs/DPU-OS/DPU-OS\350\243\201\345\211\252\346\214\207\345\257\274.md" +++ "b/docs/zh/docs/DPU-OS/DPU-OS\350\243\201\345\211\252\346\214\207\345\257\274.md" @@ -4,9 +4,9 @@ #### 准备imageTailor和所需的rpm包 -参照[imageTailor使用指导文档](https://docs.openeuler.org/zh/docs/22.03_LTS/docs/TailorCustom/imageTailor%E4%BD%BF%E7%94%A8%E6%8C%87%E5%8D%97.html)安装好`imageTailor`工具并将裁剪所要用到的rpm包准备好。 
+参照[imageTailor使用指导文档](https://docs.openeuler.org/zh/docs/24.03_LTS/docs/TailorCustom/imageTailor%E4%BD%BF%E7%94%A8%E6%8C%87%E5%8D%97.html)安装好`imageTailor`工具并将裁剪所要用到的rpm包准备好。 -可以使用openEuler提供安装镜像作为镜像裁剪所需要rpm包源,`openEuler-22.03-LTS-everything-debug-aarch64-dvd.iso`中的rpm比较全但是此镜像很大,可以用镜像`openEuler-22.03-LTS-aarch64-dvd.iso`中的rpm包和一个`install-scripts.noarch`实现。 +可以使用openEuler提供安装镜像作为镜像裁剪所需要rpm包源,`openEuler-{version}-everything-debug-aarch64-dvd.iso`中的rpm比较全但是此镜像很大,可以用镜像`openEuler-{version}-aarch64-dvd.iso`中的rpm包和一个`install-scripts.noarch`实现。 `install-scripts.noarch`包括可以从everything包中获取,或者在系统中通过yum下载: @@ -53,7 +53,7 @@ dpuos 1 rpm-dir euler_base * `kiwi/minios/cfg_dpuos/rpm.conf` -密码生成及修改方法可详见openEuler imageTailor手册[配置初始密码](https://docs.openeuler.org/zh/docs/22.03_LTS/docs/TailorCustom/imageTailor%E4%BD%BF%E7%94%A8%E6%8C%87%E5%8D%97.html#%E9%85%8D%E7%BD%AE%E5%88%9D%E5%A7%8B%E5%AF%86%E7%A0%81)章节。 +密码生成及修改方法可详见openEuler imageTailor手册[配置初始密码](https://docs.openeuler.org/zh/docs/24.03_LTS/docs/TailorCustom/imageTailor%E4%BD%BF%E7%94%A8%E6%8C%87%E5%8D%97.html#%E9%85%8D%E7%BD%AE%E5%88%9D%E5%A7%8B%E5%AF%86%E7%A0%81)章节。 #### 执行裁剪命令 diff --git "a/docs/zh/docs/EulerMaker/EulerMaker\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/EulerMaker/EulerMaker\347\224\250\346\210\267\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..06f316169160bda833607e73caa23090b2fb11bb --- /dev/null +++ "b/docs/zh/docs/EulerMaker/EulerMaker\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -0,0 +1,568 @@ +# 简介 + +统一构建平台是一款软件包构建平台,实现从源码到二进制软件包与软件仓库的平台服务。该平台通过统一的软件包配置管理,软件包依赖关系分析,实现基于依赖关系的软件包高效构建。帮助开发者与合作伙伴建设自己的用户个人仓,OS核心仓,构建门禁能力。 + +# 快速入门 + +## 用户注册 + +### 登录注册流程 + +地址:[https://eulermaker.compass-ci.openeuler.openatom.cn](https://eulermaker.compass-ci.openeuler.openatom.cn),访问前需添加白名单。 + +1. 
登录 + 首页右上角点击“登录”,跳转至openEuler社区登录页面。 + ![image](images/login.png) + 输入openEuler社区帐号用户名、密码。 + ![image](images/certification.png) + openEuler授权成功后,根据用户是否已经绑定eulermaker帐号,跳转页面。 + - 已绑定:页面返回首页 + ![image](images/home.png) + - 未绑定:页面跳转至帐号注册页面,填写用户信息,完成注册。 + ![image](images/regist.png) + +### 创建工程 + +1. 创建project + 首页点击“all projects”的view按钮,跳转页面到projects总体页面。 + ![image](images/home.png) + 切换至Private Projects页签点击"Add project"按钮,填写project工程基本信息,完成创建。 + ![image](images/create_project.png) + +2. 查看新建的project + 成功新建project工程后,点击private projects中该工程名,页面跳转至project overview页面。 + +### 添加package + + 点击“add package”按钮,输入package信息,添加package。 + ![image](images/add_package.png) + +### 工程配置 + +1. 点击"config"tab栏, 跳转至project config页面。 +2. 此页面可编辑build targets、flags、publish、config等工程配置信息。config支持yaml格式输入,配置使能开关。 + ![image](images/config.png) + +### 继承工程 + +1. 工程继承 + project overview页面点击"inherit Project"按钮,填写继承项目名称,完成继承项目创建。页面跳转至继承工程overview页面,项目继承只继承父项目工程配置信息。 + ![image](images/inherit_project.png) + +2. 包继承 + package overview页面点击"Branch Package"按钮,弹窗中默认显示包继承项目信息,完成继承项目创建。页面跳转至继承工程overview页面,包继承会继承父项目工程配置信息及该软件包。 + ![image](images/branch_package.png) + +### 创建构建任务 + +**单包构建** + +1. 在project overview页面Packages列表中选择需要构建的包,点击包名进入package overview页面。 +2. Package overview页面Git Repo中可修改Git Url及Branch。 +3. 在build targets中完成构建设置,Os_variant设置构建任务执行环境,Arch设置执行机架构,Ground projects地基项目选择,Build Flag设置是否构建,Publish Flag设置是否发布,Operation列可对已添加数据进行修改及删除。 +4. 点击start Build按钮。 + ![image](images/single_build.png) +5. 切换至build tab页中可查看该构建任务执行情况,也可返回该项目build tab中查看构建状态。 + +**全量构建** + +1. 在project overview页面点击Full build按钮,可触发全量构建。 + ![image](images/full_build.png) +2. 切换至project build页面,点击build information列表中选择需要查看的build_id,右侧Details中展示该构建任务中待构建包列表及构建状态,点击上方export按钮可将待构建包列表导出为xlsx文件。 + +**增量构建** + +1. 在project overview页面点击Incremental build按钮,可触发增量构建。 + ![image](images/incremental_build.png) +2. 
切换至project build 页面,点击build information列表中选择需要查看的build_id,右侧Details中展示该构建任务中待构建包列表及构建状态,点击上方export按钮可将待构建包列表导出为xlsx文件。 + +### 构建历史查看 + +**工程构建历史查看** + +1. 在project overview页面点击构建历史页面,可查看该工程历史构建任务及构建详情。 + ![image](images/build_history.png) +2. 点击构建历史中任意一条构建ID,页面右侧显示该次构建任务构建包的spec名称,状态,构建详情。 + ![image](images/build_detail.png) +3. 点击构建详情页面DAG关系表按钮,可查看本次构建任务依赖关系及构建相关信息。 + ![image](images/dag_relationships.png) + +### 获取软件包 + +1. 在project overview页面Packages列表中选择需要下载的包,点击包名进入package overview页面。 +2. 切换至Download tab页,该页面展示最新构建出的软件包,点击Download 按钮下载包,点击View Details查看软件包详情。 + ![image](images/download.png) + +## 镜像定制 + +### 添加流水线 + +**创建流水线** + + 1. 首页点击“镜像定制”的查看按钮,跳转页面到流水线列表总体页面。 + + ![image](images/pipeline_list.png) + + 2. 点击“添加流水线”功能按钮,填写流水线工程页面基本信息,流水线类型可选择版本镜像及镜像定制两种类型,点击确认按钮完成创建。 + + ![image](images/pipeline_add.png) + +**查看新建的流水线工程** + 成功新建流水线工程后,点击创建后的流水线名称,页面跳转至流水线详情页面。 + +![image](images/pipeline_start.png) + +### 版本镜像类型 + +#### 版本参数配置 + +1. 点击“版本参数配置”框,右侧跳转至版本参数配置页面,点击“修改”按钮,添加版本参数配置信息,其中表单字段校验规则如下: + ![image](images/pipeline_param.png) + +| 表单字段 | 校验规则 | +| ---------- | ------------------------------------------------------------ | +| repo源地址 | 地址长度小于1000,且配置多个repo源时用空格分割 | +| 镜像类型 | 只针对iso镜像,必填选择框 | +| 产品名称 | 只允许字母数字,且“产品名称-版本号[-Release号]-架构”的字符个数不超过32 | +| 版本号 | 只允许字母数字和“-”和“.”,必须字母或数字开头,且“产品名称-版本号[-Release号]-架构”的字符个数不超过32 | +| Release号 | 可缺省,若填写,则规则与版本号相同 | + +### 镜像定制类型 + +#### 定制业务包 + +##### iso/cpio格式 + +1. 点击创建好的镜像定制类型流水线名称,进入流水线详情。 +2. 点击定制业务包按钮,右边切换至定制业务包界面,界面如下图,点击修改按钮进入修改界面。 + ![image](images/custom_package.png) +3. 填写产品名称,填写安装镜像时的所展示的产品名称。 +4. 配置内核参数,为了系统能够更稳定高效地运行,用户可以根据需要修改内核命令行参数,多个参数使用空格分隔。 +
 例如:"net.ifnames=0 biosdevname=0 crashkernel=512M oops=panic softlockup_panic=1 reserve_kbox_mem=16M crash_kexec_post_notifiers panic=3 console=tty0" +5. 添加repo源,输入需要添加rpm包的repo源的正确url地址。 +6. 添加rpm包:配置好正确的repo源后,点击添加按钮,弹出添加rpm包弹窗,勾选需要的rpm包,也可以通过弹窗右上角搜索需要的rpm包,点击确认即可保存。若要删除已添加的rpm包,点击rpm包表格操作列的移除或者勾选需要删除的rpm包,点击批量移除按钮即可。 + ![image](images/add_rpms.png) +7. 添加驱动:填写驱动文件路径及文件名,多个以空格分隔,可选填。 +8. 添加命令:填写系统命令,多个以空格分隔,可选填。 +9. 添加库文件:填写库文件名称,多个以空格分隔,可选填。 +10. 删除其他文件:填写需要删除的文件路径及文件名,多个以空格分隔,可选填。 +11. 配置分区:点击加号图标,新增一条配置分区,根据需要填写以下配置。 + ![image](images/config_partition.png) + +- 磁盘索引:磁盘的编号,请按照hdx的格式填写,x指第x块盘。 +- 挂载目录:指定分区挂载的路径。用户既可以配置业务分区,也可以对默认配置中的系统分区进行调整。 +- 分区大小:分区大小的取值有四种,可通过点击单位按钮来更换,最大上限为16T。其中MAX为指定将硬盘上剩余的空间全部用来创建一个分区,只能在最后一个分区配置该值。 +- 分区类型:分区有三种,主分区:primary;扩展分区:extended(该分区只需配置hd磁盘号即可);逻辑分区:logical。 +- 分区文件系统类型:目前支持的文件系统类型有:ext4、vfat。 +- 是否二次格式化:表示二次安装时是否格式化,选择是或否。 + +12. 配置网络:点击加号图标,新增一条配置网络,根据需要填写以下配置。
+![image](images/config_net.png) + +- BOOTPROTO:none:引导时不使用协议,不配地址;static:静态分配地址;dhcp:使用DHCP协议动态获取地址。
+- DEVICE:网卡名称,如eth0。 +- IPADDR:IP地址。当BOOTPROTO参数为static时,该参数必配,如:192.168.11.100;其他情况下,该参数不用配置。
+- NETMASK:子网掩码。当BOOTPROTO参数为static时,该参数必配,如:255.255.255.0;其他情况下,该参数不用配置。
+- STARTMODE:启用网卡的方法。manual:用户在终端执行ifup命令启用网卡。auto\hotplug\ifplugd\nfsroot:当OS识别到该网卡时,便启用该网卡。off:任何情况下,网卡都无法被启用。 + +13. 添加用户自定义文件:点击加号图标,弹出文件管理器弹窗,选择需要上传的文件,点击打开按钮即可上传,上传成功后状态变为成功,并且需要填写目标存放路径。若要删除文件,点击操作列的删除字段即可。 + + > 注:用户自定义文件上传大小不得超过16M + +14. 添加hook脚本:点击加号图标,弹出文件管理器弹窗,选择需要上传的hook脚本文件,文件名称需符合“S+数字(至少两位,个位数以0开头)”开头,数字代表hook脚本的执行顺序(脚本名称示例:S01xxx.sh),点击打开按钮即可上传,上传成功后状态变为成功,并且需要选择hook脚本存放的子目录。若要删除文件,点击操作列的删除字段即可。 + + > 注:hook脚本上传文件大小不得超过16M + + ![image](images/add_file.png) + +15. 填好各项配置后,点击右上角保存按钮。 + +##### docker/mini_docker/mini_cpio/iso_normal/qcow2格式 + +1. 点击创建好的镜像定制类型流水线名称,进入流水线详情。 +2. 点击定制业务包按钮,右边切换至定制业务包界面,界面如下图,点击修改按钮进入修改界面。 + ![image](images/custom_package_2.png) +3. 添加repo源,输入需要添加rpm包的repo源的正确url地址。 +4. 添加rpm包:配置好正确的repo源后,点击添加按钮,弹出添加rpm包弹窗,勾选需要的rpm包,也可以通过弹窗右上角搜索需要的rpm包,点击确认即可保存。若要删除已添加的rpm包,点击rpm包表格操作列的移除或者勾选需要删除的rpm包,点击批量移除按钮即可。 + ![image](images/add_rpms_2.png) +5. 填好各项配置后,点击右上角保存按钮。 + +#### 配置系统参数 + +1. 只有`iso/cpio`格式需要配置系统参数。 +2. 进入创建好的流水线后,点击配置系统参数按钮,右边切换至配置系统参数界面,点击修改按钮进入修改界面。 + ![image](images/config_system.png) +3. 配置主机参数:根据需要填写或选择相关参数,其中主机名为字母、数字、“-”的组合,首字母必须是字母或数字。 + ![image](images/host_parameters.png) +4. 配置root初始密码:【密码校验格式】,两次需保持一致,可点击右边可视图标观测输入密码。 +5. 配置grub初始密码:【密码校验格式】,两次需保持一致,可点击右边可视图标观测输入密码。 + ![image](images/config_passwd.png) +6. 填好各项配置后,点击右上角保存按钮。 + +### 制作系统 + + 点击“制作系统”框,右侧自动跳转至制作系统页面,点击下方镜像构建按钮,弹出构建弹窗,点击确认按钮,页面显示相应的构建状态以及相关构建日志。 + ![image](images/release-image_build.png) + +### 镜像下载及构建日志查询 + +1. 方法一:镜像构建完成后,点击流水线中制作系统页面,下方显示构建日志信息,点击镜像下载按钮,将镜像下载至本地。 + ![image](images/image-build-1.png) + +2. 方法二:镜像构建完成后,点击流水线中的构建历史页面,操作栏显示镜像下载、查看日志选项。点击查看日志可进入构建日志页面。点击镜像下载按钮,将镜像下载至本地。 + ![image](images/image-build-2.png) + +### 构建历史 + +1. 在流水线详情页面,点击构建历史,跳转至流水线构建历史页面。 + ![image](images/image-his.png) + +2. 根据该页面查看流水线历史的镜像构建信息以及相关的镜像和构建日志。 + ![image](images/image-his-2.png) + +### 用户管理 + +1. 在流水线详情页面,点击用户管理,跳转至流水线用户管理页面。 + ![image](images/user_manager.png) + +2. 
点击“添加用户”按钮,填写相关用户名称,并设置相应的用户权限(Maintainer/Reader)。 + ![image](images/user_add.png) + +### 流水线工程克隆 + +1. 点击需要克隆的流水线,进入到流水线详情页面,点击下方“克隆按钮”,填写或选择克隆流水线分组,填写新流水线名称后,点击确认按钮,完成流水线克隆。 + ![image](images/pipeline_clone.png) + +2. 页面跳转至克隆的流水线详情页面,流水线克隆了相应的配置参数信息。 + +### 删除流水线 + + 点击需要删除的流水线,进入到流水线详情页面,点击下方“删除按钮”,弹出确认窗口,点击确认按钮,删除整条流水线,包括相关构建的镜像。 + ![image](images/pipeline_delete.png) + +# 基于命令行进行构建 + +## 本地安装EulerMaker客户端 + +EulerMaker将[lkp-tests](https://gitee.com/openeuler-customization/lkp-tests.git)作为客户端,需要在本地安装lkp-tests。lkp-tests提交任务依赖ruby,建议安装ruby2.5及以上版本。 + +## 下载安装 lkp-tests + +运行以下命令安装并设置lkp-tests: + +```shell + git clone https://gitee.com/openeuler-customization/lkp-tests.git + cd lkp-tests + make install + source ~/.${SHELL##*/}rc +``` + +## 配置文件 + +本地配置文件中配置用户名、密码、网关等信息: + +- `~/.config/cli/defaults/*.yaml`(优先) + +``` + GATEWAY_IP: xxx + GATEWAY_PORT: xxx + SRV_HTTP_REPOSITORIES_HOST: xxx + SRV_HTTP_REPOSITORIES_PORT: xxx + SRV_HTTP_REPOSITORIES_PROTOCOL: xxx + SRV_HTTP_RESULT_HOST: xxx + SRV_HTTP_RESULT_PORT: xxx + SRV_HTTP_RESULT_PROTOCOL: xxx + DAG_HOST: xxx + DAG_PORT: xxx + ACCOUNT: xx + PASSWORD: xx + OAUTH_TOKEN_URL: https://omapi.osinfra.cn/oneid/oidc/token + OAUTH_REDIRECT_URL: xx + PUBLIC_KEY_URL: xx +``` + +其中网关GATEWAY_IP和GATEWAY_PORT必须配置;gitee帐号GITEE_ID和GITEE_PASSWORD若不配置,则只能执行游客可执行的命令;SRV_HTTP_RESULT_HOST、SRV_HTTP_RESULT_PORT和SRV_HTTP_RESULT_PROTOCOL是存储job日志的微服务配置,仅用于ccb log子命令;SRV_HTTP_REPOSITORIES_HOST、SRV_HTTP_REPOSITORIES_PORT和SRV_HTTP_REPOSITORIES_PROTOCOL是repo源配置,仅用于ccb download子命令,若无下载需求,可以不配置。 + +配置完成后可执行以下命令,查看是否可以正常使用ccb命令。 + +``` +ccb -h/--help +``` + +# CLI 客户端 + +## 命令总览 + +``` +# CRUD +ccb create k=v|--json JSON|--yaml YAML # Details see "ccb create -h". +ccb update k=v|--json JSON|--yaml YAML + +ccb select k=v|--json JSON-file|--yaml YAML-file [-f/--field key1,key2...] 
+ [-s/--sort key1:asc/desc,key2:asc/desc...] +ccb download os_project= packages= architecture=x86/aarch64 [--sub] [--source] [--debuginfo] +ccb cancel $build_id +ccb log +ccb build-single os_project= packages= k=v|--json JSON-file|--yaml YAML-file + +``` + +## 命令详情 + +### 1. ccb select查询各表信息 + +**查询所有的projects全部信息** + +``` +ccb select projects +``` + +注意: +rpms和rpm_repos,这两个表由于数据量过大,无法通过`ccb select 表名`命令直接查询该表的全部信息, +查询rpms表必须使用-f指定过滤字段或者使用key=value指定明确的过滤条件。 + +``` +ccb select rpms -f repo_id +ccb select rpms repo_id=openEuler-24.03-LTS:baseos-openEuler:24.03-LTS-x86_64-313 +``` + +查询rpm_repos表必须使用key=value指定明确的过滤条件,如果不知道value的值,可以先查询其他表获取,然后再使用key=value查询rpm_repos表。 + +``` +ccb select builds -f repo_id # 先查询builds获得repo_id值 +ccb select rpm_repos repo_id=openEuler-24.03-LTS:baseos-openEuler:24.03-LTS-x86_64-313 # 使用上个命令获得的repo_id值查询rpm_repos表 +``` + + **查询符合要求的projects的全部信息** + +``` +ccb select projects os_project=openEuler:Mainline owner=xxx +``` + + **查询符合要求的projects的部分信息** + +``` +ccb select projects os_project=openEuler:Mainline --field os_project,users +``` + + **查询符合要求的projects的部分信息并排序** + +``` +ccb select projects os_project=openEuler:Mainline --field os_project,users --sort create_time:desc,os_project:asc +``` + +**列出指定project的所有snapshot** + +``` +ccb select snapshots os_project=openEuler:Mainline +``` + +**注:查看其他表类似** + +### 2. 
ccb create project + +**创建project** + +``` +ccb create projects test-project --json config.json +config.json: +{ + "os_project": "test-project", + "description": "this is a private project of zhangshan", + "my_specs": [ + { + "spec_name": "gcc", + "spec_url": "https://gitee.com/src-openEuler/gcc.git", + "spec_branch": "openEuler-24.03-LTS" + }, + { + "spec_name": "python-flask", + "spec_url": "https://gitee.com/src-openEuler/python-flask.git", + "spec_branch": "openEuler-24.03-LTS" + } + ], + "build_targets": [ + { + "os_variant": "openEuler_24.03", + "architecture": "x86_64" + }, + { + "os_variant": "openEuler_24.03", + "architecture": "aarch64" + } + ], + "flags": { + "build": true, + "publish": true + } +} +``` + +### 3. ccb update projects $os_project + +**project添加package** + +``` +ccb update projects $os_project --json update.json +update.json: +{ + "my_specs+": [ + { + "spec_name": , + "spec_url": , + "spec_branch": + }, + ... + ] +} +``` + +**project增删user** + +``` +ccb update projects $os_project --json update.json +update.json: +{ + "users-": ["zhangsan"], + "users+": { + "lisi": "reader", + "wangwu": "maintainer" + } +} +``` + +**锁定某个package** + +``` +ccb update projects test-project package_overrides.$package.lock=true +``` + +**解锁某个package** + +``` +ccb update projects test-project package_overrides.$package.lock=false +``` + +### 4. 单包构建 + +build_targets可以不传,可以有1或多个,如果不传,采用os_project默认配置的build_targets。 + +``` +ccb build-single os_project=test-project packages=gcc --json build_targets.json +build_targets.json: +{ + "build_targets": [ + { + "os_variant": "openEuler:24.03", + "architecture": "x86_64" + }, + { + "os_variant": "openEuler:24.03", + "architecture": "aarch64" + } + ] +} +``` + +### 5. 
全量/增量构建 + +指定build_type=full则为全量构建,指定build_type=incremental则为增量构建; +build_targets参数与单包构建中的build_targets一样,可以不传,可以有1或多个,如果不传,采用os_project默认配置的build_targets; +如果指定snapshot_id,则os_project可不传,表示基于某个历史快照创建全量/增量构建。 + +``` +ccb build os_project=test-project build_type=full --json build_targets.json # 全量构建 +ccb build snapshot_id=xxx build_type=incremental --json build_targets.json # 增量构建 +build_targets.json: +{ + "build_targets": [ + { + "os_variant": "openEuler:24.03", + "architecture": "x86_64" + }, + { + "os_variant": "openEuler:24.03", + "architecture": "aarch64" + } + ] +} +``` + +### 6. 下载软件包 + +如果指定snapshot_id,则os_project可不传; +dest表示指定软件包下载后存放的路径,可不传,默认使用当前路径。 + +#### 基本用法 + +``` +ccb download os_project=test-project packages=python-flask architecture=aarch64 dest=/tmp/rpm +ccb download snapshot_id=123456 packages=python-flask architecture=aarch64 dest=/tmp/rpm +``` + +#### -s的用法 + +``` +# 使用-s 表示下载该packages的源码包。示例如下所示: +ccb download os_project=test-project packages=python-flask architecture=aarch64 -s +``` + +#### -d的用法 + +``` +# 使用-d 表示下载该packages的debug(debuginfo和debugsource)包。示例如下所示: +ccb download os_project=test-project packages=python-flask architecture=aarch64 -d +``` + +#### -b的用法 + +``` +# 使用-b all 表示下载该packages的所有子包。示例如下所示: +ccb download os_project=test-project packages=python-flask architecture=aarch64 -b all + +# 使用-b $rpm 表示下载该packages的指定子包$rpm,指定多个子包以逗号分隔。示例如下所示: +ccb download os_project=test-project packages=python-flask architecture=aarch64 -b python2-flask +ccb download os_project=test-project packages=python-flask architecture=aarch64 -b python2-flask,python3-flask +``` + +#### -s -d -b 组合使用 + +``` +# 使用-b all -s -d 表示下载该packages的debug包,源码包和所有子包。示例如下所示: +ccb download os_project=test-project packages=python-flask architecture=aarch64 -b all -s -d + +# 使用-b $rpm -s -d 表示下载该packages的debug包,源码包和指定子包(指定多个子包以逗号分隔)。示例如下所示: +ccb download os_project=test-project packages=python-flask architecture=aarch64 -b python2-flask -s -d +ccb download 
os_project=test-project packages=python-flask architecture=aarch64 -b python2-flask,python3-flask -s -d +``` + +### 7. cancel 取消构建任务 + +``` +ccb cancel $build_id + +``` + +### 8. 查看job日志 + +``` +ccb log +``` + +# 概念/术语 + +| 术语 | 含义 | +| ------------ | ------------------------------------------------------------ | +| project | 一组包的集合,其中包含build_target配置,软件包的git_url配置 | +| build_target | 指定构建目标,包含目标的os系统及版本,cpu架构 | +| package | 软件包,使用spec_name来标识,一个package会生成一个或多个rpm。 | +| snapshot | project的快照,将记录当前时刻下,各个软件包的commit_id,以及依赖project的snapshot,保障构建一个包所需要的依赖固定,snapshot_id全局唯一 | +| build | 对project的构建任务,基于snapshot来创建,每个build_target会生成一个build,build_id全局唯一 | +| job | 每一次build任务会生成一个或多个job,每个job对应一个软件包的构建 | +| build_single | 单包构建 | +| build_type | 构建类型,可指定为full/incremental | diff --git a/docs/zh/docs/EulerMaker/figures/1686189862936_image.png b/docs/zh/docs/EulerMaker/figures/1686189862936_image.png new file mode 100644 index 0000000000000000000000000000000000000000..25d9365f454d8ac950673c8c89ff5abcf7fb4157 Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686189862936_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686190779219_image.png b/docs/zh/docs/EulerMaker/figures/1686190779219_image.png new file mode 100644 index 0000000000000000000000000000000000000000..c94d01cd9057cdb9e3a51eefb7f389ceab72c3ee Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686190779219_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686190839529_image.png b/docs/zh/docs/EulerMaker/figures/1686190839529_image.png new file mode 100644 index 0000000000000000000000000000000000000000..146eedad4cd02978bfb18ee5f4aa6bb05092cda8 Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686190839529_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686193530087_image.png b/docs/zh/docs/EulerMaker/figures/1686193530087_image.png new file mode 100644 index 0000000000000000000000000000000000000000..e89f4e78266e7dcb4ea320d74f73610438d500b0 
Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686193530087_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686193606679_image.png b/docs/zh/docs/EulerMaker/figures/1686193606679_image.png new file mode 100644 index 0000000000000000000000000000000000000000..3070dddbbcd1dca259bef95d62e5ec18dcc44499 Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686193606679_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686193747460_image.png b/docs/zh/docs/EulerMaker/figures/1686193747460_image.png new file mode 100644 index 0000000000000000000000000000000000000000..76c8c5fd75b6ed406737f6c3445559c355fd21c0 Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686193747460_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686194008501_image.png b/docs/zh/docs/EulerMaker/figures/1686194008501_image.png new file mode 100644 index 0000000000000000000000000000000000000000..82134424e83f72f6c3aba04077d34f555149015d Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686194008501_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/1686194042686_image.png b/docs/zh/docs/EulerMaker/figures/1686194042686_image.png new file mode 100644 index 0000000000000000000000000000000000000000..60f00d2b818b75b0778caacefcf7de0dae1e6663 Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/1686194042686_image.png differ diff --git a/docs/zh/docs/EulerMaker/figures/image.png b/docs/zh/docs/EulerMaker/figures/image.png new file mode 100644 index 0000000000000000000000000000000000000000..1051e6fc1a7068898108b862aea1835b43799030 Binary files /dev/null and b/docs/zh/docs/EulerMaker/figures/image.png differ diff --git a/docs/zh/docs/Ods-Pipeline/image/.keep b/docs/zh/docs/EulerMaker/images/.keep similarity index 100% rename from docs/zh/docs/Ods-Pipeline/image/.keep rename to docs/zh/docs/EulerMaker/images/.keep diff --git a/docs/zh/docs/EulerMaker/images/add_file.png 
b/docs/zh/docs/EulerMaker/images/add_file.png new file mode 100644 index 0000000000000000000000000000000000000000..6fc4c5b089237ecbb5dabdae6954093c5234c380 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/add_file.png differ diff --git a/docs/zh/docs/EulerMaker/images/add_package.png b/docs/zh/docs/EulerMaker/images/add_package.png new file mode 100644 index 0000000000000000000000000000000000000000..1c58f18f4781f6c34c995d56dee131149b835df8 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/add_package.png differ diff --git a/docs/zh/docs/EulerMaker/images/add_rpms.png b/docs/zh/docs/EulerMaker/images/add_rpms.png new file mode 100644 index 0000000000000000000000000000000000000000..1bb748b49523bfa1b328f4bd9fbf2bf45fcf9bf2 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/add_rpms.png differ diff --git a/docs/zh/docs/EulerMaker/images/add_rpms_2.png b/docs/zh/docs/EulerMaker/images/add_rpms_2.png new file mode 100644 index 0000000000000000000000000000000000000000..25c845415c8ce1fdedac308b777422405502bffa Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/add_rpms_2.png differ diff --git a/docs/zh/docs/EulerMaker/images/brach_package.png b/docs/zh/docs/EulerMaker/images/brach_package.png new file mode 100644 index 0000000000000000000000000000000000000000..ab72263596c18b2e4a11d69c68cde4f775dcc1c5 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/brach_package.png differ diff --git a/docs/zh/docs/EulerMaker/images/branch_package.png b/docs/zh/docs/EulerMaker/images/branch_package.png new file mode 100644 index 0000000000000000000000000000000000000000..d9230ffbfc6b9bd006fdf6268537a8b70b43a8f4 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/branch_package.png differ diff --git a/docs/zh/docs/EulerMaker/images/build_detail.png b/docs/zh/docs/EulerMaker/images/build_detail.png new file mode 100644 index 0000000000000000000000000000000000000000..52a49744799e5387ba0dc76dfce3e0d4ebeb1c2f Binary files /dev/null and 
b/docs/zh/docs/EulerMaker/images/build_detail.png differ diff --git a/docs/zh/docs/EulerMaker/images/build_history.png b/docs/zh/docs/EulerMaker/images/build_history.png new file mode 100644 index 0000000000000000000000000000000000000000..b413f18a53409a3ba6fb0891e887a9a6a10c001a Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/build_history.png differ diff --git a/docs/zh/docs/EulerMaker/images/certification.png b/docs/zh/docs/EulerMaker/images/certification.png new file mode 100644 index 0000000000000000000000000000000000000000..3bd145b7070b8fd2a1f5c29e762214540f747f8b Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/certification.png differ diff --git a/docs/zh/docs/EulerMaker/images/config.png b/docs/zh/docs/EulerMaker/images/config.png new file mode 100644 index 0000000000000000000000000000000000000000..2042e3bb09a98d34429586322de51e398ed99a20 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/config.png differ diff --git a/docs/zh/docs/EulerMaker/images/config_net.png b/docs/zh/docs/EulerMaker/images/config_net.png new file mode 100644 index 0000000000000000000000000000000000000000..64f514ded1a9575708c1a379e118b5297d0ee580 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/config_net.png differ diff --git a/docs/zh/docs/EulerMaker/images/config_partition.png b/docs/zh/docs/EulerMaker/images/config_partition.png new file mode 100644 index 0000000000000000000000000000000000000000..8f63e16cd6c9c07795ad3174a6f4de621ba6bb37 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/config_partition.png differ diff --git a/docs/zh/docs/EulerMaker/images/config_passwd.png b/docs/zh/docs/EulerMaker/images/config_passwd.png new file mode 100644 index 0000000000000000000000000000000000000000..e9947adc07c9147faae1ec5ad58c327b703ae86c Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/config_passwd.png differ diff --git a/docs/zh/docs/EulerMaker/images/config_system.png b/docs/zh/docs/EulerMaker/images/config_system.png 
new file mode 100644 index 0000000000000000000000000000000000000000..147fc5ba087113ec34ec4b73bc615b5d5c222d16 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/config_system.png differ diff --git a/docs/zh/docs/EulerMaker/images/create-project.png b/docs/zh/docs/EulerMaker/images/create-project.png new file mode 100644 index 0000000000000000000000000000000000000000..e4c80324bf81eb3a9c54d300ae2795a693959898 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/create-project.png differ diff --git a/docs/zh/docs/EulerMaker/images/create_project.png b/docs/zh/docs/EulerMaker/images/create_project.png new file mode 100644 index 0000000000000000000000000000000000000000..22f369f3558657c5287bb62bbe344880ea530034 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/create_project.png differ diff --git a/docs/zh/docs/EulerMaker/images/custom_package.png b/docs/zh/docs/EulerMaker/images/custom_package.png new file mode 100644 index 0000000000000000000000000000000000000000..7f89731a1734e9426099ad26cd9a6ac85528adf6 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/custom_package.png differ diff --git a/docs/zh/docs/EulerMaker/images/custom_package_2.png b/docs/zh/docs/EulerMaker/images/custom_package_2.png new file mode 100644 index 0000000000000000000000000000000000000000..612a003bd902b6e50ddc51bf94aebb6e3e08a49e Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/custom_package_2.png differ diff --git a/docs/zh/docs/EulerMaker/images/dag_relation.PNG b/docs/zh/docs/EulerMaker/images/dag_relation.PNG new file mode 100644 index 0000000000000000000000000000000000000000..64bc551096b0978d20dabdaee1a69bb4cfcf14fb Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/dag_relation.PNG differ diff --git a/docs/zh/docs/EulerMaker/images/dag_relationships.png b/docs/zh/docs/EulerMaker/images/dag_relationships.png new file mode 100644 index 0000000000000000000000000000000000000000..acfca3666f7937105337b57e1b4305c7af0681c7 Binary files 
/dev/null and b/docs/zh/docs/EulerMaker/images/dag_relationships.png differ diff --git a/docs/zh/docs/EulerMaker/images/download.png b/docs/zh/docs/EulerMaker/images/download.png new file mode 100644 index 0000000000000000000000000000000000000000..40e4d418f2f8a57fb730dec03960f995cf255ca4 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/download.png differ diff --git a/docs/zh/docs/EulerMaker/images/enter_pipeline.png b/docs/zh/docs/EulerMaker/images/enter_pipeline.png new file mode 100644 index 0000000000000000000000000000000000000000..bf41d190deed529ec11d7fce0c929fb914360cc2 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/enter_pipeline.png differ diff --git a/docs/zh/docs/EulerMaker/images/fork_backlight.png b/docs/zh/docs/EulerMaker/images/fork_backlight.png new file mode 100644 index 0000000000000000000000000000000000000000..0000eff2d35e972bf61fd2af1a7d0ae01f6d0122 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/fork_backlight.png differ diff --git a/docs/zh/docs/EulerMaker/images/full_build.png b/docs/zh/docs/EulerMaker/images/full_build.png new file mode 100644 index 0000000000000000000000000000000000000000..cdc25c7a4be1b2a6ea2e744086e9c31f4a5d9346 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/full_build.png differ diff --git a/docs/zh/docs/EulerMaker/images/home.png b/docs/zh/docs/EulerMaker/images/home.png new file mode 100644 index 0000000000000000000000000000000000000000..32d301ac9c3edea498c5445c732cbdf87c1056a8 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/home.png differ diff --git a/docs/zh/docs/EulerMaker/images/host_parameters.png b/docs/zh/docs/EulerMaker/images/host_parameters.png new file mode 100644 index 0000000000000000000000000000000000000000..dd13cee631c0ef793cc529f702f5de239f1a2b39 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/host_parameters.png differ diff --git a/docs/zh/docs/EulerMaker/images/image-build-1.png b/docs/zh/docs/EulerMaker/images/image-build-1.png 
new file mode 100644 index 0000000000000000000000000000000000000000..1cf2b33f2141144ffbd4164cd307f8fa81e67568 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/image-build-1.png differ diff --git a/docs/zh/docs/EulerMaker/images/image-build-2.png b/docs/zh/docs/EulerMaker/images/image-build-2.png new file mode 100644 index 0000000000000000000000000000000000000000..54a99f7d29965db62522f55370b1fbc3ebcced3b Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/image-build-2.png differ diff --git a/docs/zh/docs/EulerMaker/images/image-build.png b/docs/zh/docs/EulerMaker/images/image-build.png new file mode 100644 index 0000000000000000000000000000000000000000..71238d92d6c08fda87c4c900d68ba4cfd1f81f77 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/image-build.png differ diff --git a/docs/zh/docs/EulerMaker/images/image-his-2.png b/docs/zh/docs/EulerMaker/images/image-his-2.png new file mode 100644 index 0000000000000000000000000000000000000000..23956c3506f782175e920feb70c2a1d958e5818a Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/image-his-2.png differ diff --git a/docs/zh/docs/EulerMaker/images/image-his.png b/docs/zh/docs/EulerMaker/images/image-his.png new file mode 100644 index 0000000000000000000000000000000000000000..43dcee777c89c499efd250d8adb6c9377e18a3f9 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/image-his.png differ diff --git a/docs/zh/docs/EulerMaker/images/image_details.png b/docs/zh/docs/EulerMaker/images/image_details.png new file mode 100644 index 0000000000000000000000000000000000000000..9e05e11da1f2265c2f5101b247cfd980f1e250e7 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/image_details.png differ diff --git a/docs/zh/docs/EulerMaker/images/incremental_build.png b/docs/zh/docs/EulerMaker/images/incremental_build.png new file mode 100644 index 0000000000000000000000000000000000000000..1550b4025eb2bd7b55b9b9391e8f8130cd546a99 Binary files /dev/null and 
b/docs/zh/docs/EulerMaker/images/incremental_build.png differ diff --git a/docs/zh/docs/EulerMaker/images/inherit_project.png b/docs/zh/docs/EulerMaker/images/inherit_project.png new file mode 100644 index 0000000000000000000000000000000000000000..6faee2fa96a4ec76fb32a5acd43c469c28dbd0d3 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/inherit_project.png differ diff --git a/docs/zh/docs/EulerMaker/images/jobs.png b/docs/zh/docs/EulerMaker/images/jobs.png new file mode 100644 index 0000000000000000000000000000000000000000..0469194c994e0b022463b43cea7f2c92e7c74cbd Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/jobs.png differ diff --git a/docs/zh/docs/EulerMaker/images/login.png b/docs/zh/docs/EulerMaker/images/login.png new file mode 100644 index 0000000000000000000000000000000000000000..32383e5176d5f691fdbd079df2546385e7ce0aac Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/login.png differ diff --git a/docs/zh/docs/EulerMaker/images/openeuler-community-login.png b/docs/zh/docs/EulerMaker/images/openeuler-community-login.png new file mode 100644 index 0000000000000000000000000000000000000000..9e9bede08d8dba2ce9c0bd32644a42746bff48f4 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/openeuler-community-login.png differ diff --git a/docs/zh/docs/EulerMaker/images/package_overview.png b/docs/zh/docs/EulerMaker/images/package_overview.png new file mode 100644 index 0000000000000000000000000000000000000000..0242c985fec75ef004d1aaec46c675ac486b3412 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/package_overview.png differ diff --git a/docs/zh/docs/EulerMaker/images/pipeline_add.png b/docs/zh/docs/EulerMaker/images/pipeline_add.png new file mode 100644 index 0000000000000000000000000000000000000000..2a7464d743a7a243311d7be10d32f0e75532233c Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/pipeline_add.png differ diff --git a/docs/zh/docs/EulerMaker/images/pipeline_clone.png 
b/docs/zh/docs/EulerMaker/images/pipeline_clone.png new file mode 100644 index 0000000000000000000000000000000000000000..111ba197509d71436d52734da73f4c6882b43df6 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/pipeline_clone.png differ diff --git a/docs/zh/docs/EulerMaker/images/pipeline_delete.png b/docs/zh/docs/EulerMaker/images/pipeline_delete.png new file mode 100644 index 0000000000000000000000000000000000000000..2a5d91c407d99a98561dbaae9c0f73e6a6700d60 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/pipeline_delete.png differ diff --git a/docs/zh/docs/EulerMaker/images/pipeline_list.png b/docs/zh/docs/EulerMaker/images/pipeline_list.png new file mode 100644 index 0000000000000000000000000000000000000000..01722cdbc16f3a7de8998978aea227168ebbfe50 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/pipeline_list.png differ diff --git a/docs/zh/docs/EulerMaker/images/pipeline_param.png b/docs/zh/docs/EulerMaker/images/pipeline_param.png new file mode 100644 index 0000000000000000000000000000000000000000..70852e6d7112e08b54c1741b74c4923e15b6a128 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/pipeline_param.png differ diff --git a/docs/zh/docs/EulerMaker/images/pipeline_start.png b/docs/zh/docs/EulerMaker/images/pipeline_start.png new file mode 100644 index 0000000000000000000000000000000000000000..1f4d7ba885bb08cde3c523f7e23c0751d42f516f Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/pipeline_start.png differ diff --git a/docs/zh/docs/EulerMaker/images/regist.png b/docs/zh/docs/EulerMaker/images/regist.png new file mode 100644 index 0000000000000000000000000000000000000000..32c00ccebee78b4652ac57b9507d107d31f24f6a Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/regist.png differ diff --git a/docs/zh/docs/EulerMaker/images/release-image_build.png b/docs/zh/docs/EulerMaker/images/release-image_build.png new file mode 100644 index 
0000000000000000000000000000000000000000..56d7c636cdce03e226efd57f3c8127a579fd06be Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/release-image_build.png differ diff --git a/docs/zh/docs/EulerMaker/images/run-job.png b/docs/zh/docs/EulerMaker/images/run-job.png new file mode 100644 index 0000000000000000000000000000000000000000..744e674f6eed82525d60fe4c6ce6f383852c7db9 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/run-job.png differ diff --git a/docs/zh/docs/EulerMaker/images/sign-up-local-account.png b/docs/zh/docs/EulerMaker/images/sign-up-local-account.png new file mode 100644 index 0000000000000000000000000000000000000000..7f7ebb44d2314cab3939c57b1871d4f8b1063acf Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/sign-up-local-account.png differ diff --git a/docs/zh/docs/EulerMaker/images/single_build.png b/docs/zh/docs/EulerMaker/images/single_build.png new file mode 100644 index 0000000000000000000000000000000000000000..76d92fdd9c95afb984720de169712d9a5e402dcb Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/single_build.png differ diff --git a/docs/zh/docs/EulerMaker/images/user_add.png b/docs/zh/docs/EulerMaker/images/user_add.png new file mode 100644 index 0000000000000000000000000000000000000000..644d014f9561a7bccdc7f92cd494e38f42f20435 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/user_add.png differ diff --git a/docs/zh/docs/EulerMaker/images/user_manager.png b/docs/zh/docs/EulerMaker/images/user_manager.png new file mode 100644 index 0000000000000000000000000000000000000000..fd44cb94dedea095eba93a5110e8ac1a973e2b0d Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/user_manager.png differ diff --git a/docs/zh/docs/EulerMaker/images/web-project.PNG b/docs/zh/docs/EulerMaker/images/web-project.PNG new file mode 100644 index 0000000000000000000000000000000000000000..4f53c375d41eb3a0481dc8ad192cb7c139248311 Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/web-project.PNG 
differ diff --git a/docs/zh/docs/EulerMaker/images/wgcloud-web.PNG b/docs/zh/docs/EulerMaker/images/wgcloud-web.PNG new file mode 100644 index 0000000000000000000000000000000000000000..3ed2a07058365b5609cf923926171df8a9b11e0f Binary files /dev/null and b/docs/zh/docs/EulerMaker/images/wgcloud-web.PNG differ diff --git a/docs/zh/docs/EulerMaker/merge-configs.md b/docs/zh/docs/EulerMaker/merge-configs.md new file mode 100644 index 0000000000000000000000000000000000000000..2e719ac31e91379818aaccbe8f747144658685f0 --- /dev/null +++ b/docs/zh/docs/EulerMaker/merge-configs.md @@ -0,0 +1,109 @@ +# 概述 + +本特性可以按照用户自定义需求,修改、定制、迭代软件包的构建文件,实现构建文件各版本之间、各软件包之间宏定义的差异化管理。 + +# 安装与卸载 + +#### 安装 + + pip install merge_configs-0.0.6-py3-none-any.whl + +#### 卸载 + + pip uninstall merge-configs + +# 使用方法 + +#### 命令行 + + merge-configs --help + -p PACKAGES, --packages PACKAGES: 设置需要merge的软件包,多个软件包用空格隔开。 + -c CONFIG_FILE, --config_file CONFIG_FILE: 设置分层根目录文件config.yaml。 + -o OUTPUT, --output OUTPUT: 设置输出文件的路径。 + -d, --debug: 是否将日志模式设置为debug。 + -l LIST_FEATURES, --list-features LIST_FEATURES: 不为空时显示用户配置信息,设置存在于-p参数值中的软件包,多个软件包用逗号隔开。 + -a TARGET_ARCH, --arch TARGET_ARCH: 设置merge的目标架构,例如:x86_64,aarch64。 +常用命令:merge-configs -p \\\$\{package} -c \\\$\{config_path}/config.yaml -o \\\$\{output_path} -a \\\$\{target_arch_name} -l \\\$\{package} + +常见的yaml结构如下: + +![](./figures/image.png) + +经过转换后: + +![](./figures/1686189862936_image.png) + +#### 软件包定制 + +软件包编译信息按分层架构分开保存,分成yaml主要配置、files.yaml文件配置、编译执行脚本、运行时执行脚本和changelog。其各个文件中定制的内容经由merge-configs工具解析转换,会在编译中生效。 + +##### 参数定制 + +1. 参数名定制: +参数名可以根据定制需求修改,通常只修改source、patch的编号,随意修改可能导致spec语法不支持。 + +2. 
参数值定制: +参数值定制的范围比较大,根据需求可随意修改内容,但是尽量不要修改值类型,如string类型改为list类型,可能导致转换错误。 + +修改Patch编号和它的值: + +![](./figures/1686190779219_image.png) + +经过转换后: + +![](./figures/1686190839529_image.png) + +##### 条件定制 + +条件定制的方式是在yaml配置层的key中添加when条件。 + + Source: + 0: http://ftp.gnu.org/gnu/libtool/libtool-%{version}.tar.xz + source when arch in aarch64: + 100: libtool-aarch-%{version}.tar.xz +有三种方式定制: + +1. 架构定制。 + + ```sh + buildRequires: + - "gcc" + buildRequires when arch in x86_64: + - "gcc-c++" + buildRequires when arch not in x86_64: + - "gzip" + ``` + +2. 标签定制: +defineFlags字段将会转换为bcond_with/bcond_without。 + + ```sh + defineFlags: + +auto_compile: "" + patchset when +auto_compile: + 1001: libtool-0.0.1-auto_compile.patch + ``` + +3. 宏定制: +%%{rpmGlobal.}表示在包信息中定义的宏,%%%{rpmGlobal.}表示在rpm系统中定义的宏。 + + ```sh + rpmGlobal: + posttest: 0 + source when %%{rpmGlobal.posttest}: + 1: posttest.sh + source when %%%{rpmGlobal._debugsource_packages}: + 2: openEuler_setup.py + ``` + +定制后: + +![](./figures/1686194042686_image.png) + +经过定制再转换后: + +![](./figures/1686194008501_image.png) + +#### 转换 + +目前EulerMaker仅支持yaml->spec的转换,仅支持rpmbuild编译rpm包的编译方式。 diff --git "a/docs/zh/docs/FangTian/FangTian\346\224\257\346\214\201Wayland\345\272\224\347\224\250\345\217\212\351\270\277\350\222\231\345\272\224\347\224\250.md" "b/docs/zh/docs/FangTian/FangTian\346\224\257\346\214\201Wayland\345\272\224\347\224\250\345\217\212\351\270\277\350\222\231\345\272\224\347\224\250.md" new file mode 100644 index 0000000000000000000000000000000000000000..9723d13c9d738ebedd84ca3f647ee70540c6ae0b --- /dev/null +++ "b/docs/zh/docs/FangTian/FangTian\346\224\257\346\214\201Wayland\345\272\224\347\224\250\345\217\212\351\270\277\350\222\231\345\272\224\347\224\250.md" @@ -0,0 +1,74 @@ +# Linux Wayland 应用及鸿蒙应用的支持 + +FangTian 视窗引擎融合了多个应用生态,可支持 Linux、鸿蒙应用在 openEuler 上同时运行。 + +## Wayland应用的支持 + +### Wayland协议 + +FangTian 为了支持 Linux 原生应用,对 Wayland 应用做了兼容。由于 Wayland 协议庞杂,FangTian 当前主要兼容了 Core/Stable/Unstable 等。 + +### 应用运行 + +1. 
在启动[引擎](./FangTian环境配置.md#启动引擎)之后,启动 wayland 适配器的 sa。 + + ```shell + mkdir -p ~/tmp + sa_main /system/profile/ft/ft_wl.xml > ~/tmp/ftwlsa.log 2>&1 & + ``` + +2. 配置 wl 环境。 + + ```shell + export XDG_SESSION_TYPE=wayland + export WAYLAND_DISPLAY="wayland-0" + export QT_QPA_PLATFORMTHEME=ukui + ``` + +3. 安装 Linux Wayland 应用。 + + ```shell + sudo dnf install kylin-calculator deepin-terminal + ``` + +4. 运行结果如下。 + +![](./figures/wayland_apps.png) + +## 鸿蒙应用的支持 + +### ArkUI框架 + +FangTian 当前支持 ArkUI 部分控件,如文本、按钮、图片等。开发者可以基于[DevEco Studio](https://developer.harmonyos.com/cn/develop/deveco-studio/)完成鸿蒙应用的开发。 + +### 应用代码 + +- [电子相册](https://gitee.com/openharmony/codelabs/tree/master/ETSUI/ElectronicAlbum) +- [简易计算器](https://gitee.com/openharmony/codelabs/tree/master/ETSUI/SimpleCalculator) + +### 安装运行 + +1. 从 DevEco Studio 复制应用 hap 到 openEuler 目录下,如`~/apps/tmp`。 + +2. 解压该 hap,如`eletronicAlbum.hap`。 + + ```shell + unzip eletronicAlbum.hap + ``` + 解压之后的路径为`~/apps/tmp/eletronicAlbum`。 + +3. 在启动[引擎](./FangTian环境配置.md#启动引擎)之后,运行 hap。 + + ```shell + hap_executor ~/apps/tmp/eletronicAlbum + ``` + +4. 
运行结果如下。 + +![](./figures/arkui_ele.png) + +### 限制条件 + +- 当前 ArkUI 控件支持不全,web、视频类等控件不可用,napi 接口需要自行开发、迁移。 + +- ArkUI 在该版本上仅支持 x86 架构。 diff --git "a/docs/zh/docs/FangTian/FangTian\347\216\257\345\242\203\351\205\215\347\275\256.md" "b/docs/zh/docs/FangTian/FangTian\347\216\257\345\242\203\351\205\215\347\275\256.md" new file mode 100644 index 0000000000000000000000000000000000000000..492ecd580902a552c00161e44a44ce7732727178 --- /dev/null +++ "b/docs/zh/docs/FangTian/FangTian\347\216\257\345\242\203\351\205\215\347\275\256.md" @@ -0,0 +1,82 @@ +# 安装与部署 + +本章介绍在 openEuler 中安装 FangTian 的方法。 + +## 软硬件要求 + +### 硬件要求 + +当前仅支持 x86 和 AArch64 架构。 + +### 环境准备 + +安装 openEuler 系统,安装方法参考《[openEuler 安装指南](./../Installation/installation.md)》。 + +### FangTian 软件包安装 + + x86架构下: + + ```shell + sudo dnf install ft_multimedia ft_mmi ft_flutter ft_engine arkui-linux ft_utils + sudo dnf install ft_multimedia-devel ft_mmi-devel ft_flutter-devel ft_engine-devel + ``` + + AArch64架构下: + + ```shell + sudo dnf install ft_multimedia ft_mmi ft_flutter ft_engine ft_utils + sudo dnf install ft_multimedia-devel ft_mmi-devel ft_flutter-devel ft_engine-devel + ``` + +## 启动引擎 + +- 系统服务samgr的启动 + + 预设置:安装 binder、ashmem 等。 + + ```shell + sudo /usr/share/sa/pre_oneshot_samgr + ``` + + 可以直接拉起samgr。 + + ```shell + mkdir -p ~/tmp + sudo samgr > ~/tmp/samgr.log 2>&1 & + ``` + + 或者,自行配置为 service 再启动服务。 + + ```shell + sudo systemctl restart samgr + ``` + +- 引擎sa的启动 + + ```shell + sa_main /system/profile/ft/ft.xml > ~/tmp/ftsa.log 2>&1 & + ``` + + > 说明 + > + > - sa 代表一种系统能力,一个进程可包含多个 sa。ft.xml 配置了多个 sa,共同组成进程 ft。关于 samgr 及 sa 概念可以参考 OpenHarmony。 + + > - 关于 sa 的配置 xml 及 sa_main、samgr 均已在软件包安装时自动部署。 + +## 基于 FangTian 做简单 GUI 应用开发运行 + +C++ GUI 简单应用可参考《[示例](https://gitee.com/openeuler/ft_engine/blob/master/samples/)》。 + +运行: + +```shell +desktop & +``` + +结果如下: + +![](./figures/desktop_simple_apps.png) + + > 说明 + > + > 
开发者可以查看[FT接口](https://gitee.com/openeuler/ft_engine/wikis/1.0-alpha%E6%8E%A5%E5%8F%A3/1.0-alpha%20Interface%20Overview)进行应用的开发。 diff --git a/docs/zh/docs/FangTian/figures/arkui_ele.png b/docs/zh/docs/FangTian/figures/arkui_ele.png new file mode 100644 index 0000000000000000000000000000000000000000..d2c8010cddaa99a852c072f7852f51e48c9b9675 Binary files /dev/null and b/docs/zh/docs/FangTian/figures/arkui_ele.png differ diff --git a/docs/zh/docs/FangTian/figures/desktop_simple_apps.png b/docs/zh/docs/FangTian/figures/desktop_simple_apps.png new file mode 100644 index 0000000000000000000000000000000000000000..cf625a544183dafb9747ececc544722dd1e42f87 Binary files /dev/null and b/docs/zh/docs/FangTian/figures/desktop_simple_apps.png differ diff --git a/docs/zh/docs/FangTian/figures/wayland_apps.png b/docs/zh/docs/FangTian/figures/wayland_apps.png new file mode 100644 index 0000000000000000000000000000000000000000..bb62dd4f352625b273b22b8681a9de34123eb0ca Binary files /dev/null and b/docs/zh/docs/FangTian/figures/wayland_apps.png differ diff --git a/docs/zh/docs/FangTian/overview.md b/docs/zh/docs/FangTian/overview.md new file mode 100644 index 0000000000000000000000000000000000000000..b303fcdf1e0174822b99c2accb6b4e8d77d987cf --- /dev/null +++ b/docs/zh/docs/FangTian/overview.md @@ -0,0 +1,8 @@ +# FangTian 视窗引擎指南 + +本文档介绍基于 openEuler 系统的 FangTian 视窗引擎的安装及开发使用指南。 + +本文档适用于使用 openEuler 系统并希望了解和使用 FangTian 视窗引擎的社区开发者、开源爱好者以及相关合作伙伴。使用人员需要具备以下经验和技能: + +* 熟悉 Linux 基本操作。 +* 了解 Linux GUI 开发、ArkUI 开发。 diff --git "a/docs/zh/docs/GCC/\345\205\250\345\234\272\346\231\257\351\223\276\346\216\245\346\227\266\344\272\214\350\277\233\345\210\266\345\272\223\345\206\205\350\201\224\344\274\230\345\214\226.md" "b/docs/zh/docs/GCC/\345\205\250\345\234\272\346\231\257\351\223\276\346\216\245\346\227\266\344\272\214\350\277\233\345\210\266\345\272\223\345\206\205\350\201\224\344\274\230\345\214\226.md" new file mode 100644 index 
0000000000000000000000000000000000000000..77f2c0322852e272720948b4a3e6fe452de09b2e --- /dev/null +++ "b/docs/zh/docs/GCC/\345\205\250\345\234\272\346\231\257\351\223\276\346\216\245\346\227\266\344\272\214\350\277\233\345\210\266\345\272\223\345\206\205\350\201\224\344\274\230\345\214\226.md" @@ -0,0 +1,59 @@ +# 全场景链接时二进制库内联优化 + +本特性支持全场景链接时的二进制内联,通过多版本二进制的组合输出,及LTO插件形式的多版本符号解析,支持不同版本编译器的LTO的内联优化;并设计跨模块的编译选项分析和匹配,适配不同编译模块的融合,实现全场景的链接时优化。 + +## 选项 -fmulti-version-lib= + +### 说明 + +该选项为链接时选项,与`-flto`结合使用,用于LTO链接时指示传入的库为多版本的LTO格式,需要编译器切换旧版本的LTO形式读取 (目前支持 openEuler 24.03 SP1、openEuler 24.09 编译的LTO格式),用于兼容不同LTO的静态库或目标文件进行融合链接及优化编译。 + +### 使用方式 + +在选项中加入该选项;如传入多个文件,文件名之间用逗号隔开: + +~~~bash +-flto -fmulti-version-lib=liba.a,libb.a +~~~ + +举例: + +~~~bash +# gcc for openEuler 24.09 +gcc -O2 -fPIC -flto -c fa.c -o fa.o +gcc-ar rcs liba.a fa.o + +# gcc for openEuler latest +gcc -O2 -fPIC -flto -fmulti-version-lib=liba.a main.c liba.a -o exe +~~~ + +## 选项 -finline-force= + +### 说明 + +该选项为链接时选项,与`-flto`结合使用,用于在LTO链接时,指示对传入的目标静态库或目标文件尝试进行内联,该选项将对目标文件中的函数尝试内联,并增强以下内联扩展: + +- 架构选项的兼容,当调用函数和被调用函数使用的march/mcpu等信息不同时,尽可能将被调用函数的架构选项与调用函数的架构选项切换一致,并进行内联编译。 +- `inline`关键字,类似于被调用函数增加了`inline`关键字,指示在编译过程中能找到函数实体,就尽可能内联。 +- `always_inline`属性,类似于被调用函数增加了`__attribute__((always_inline))`属性。 + +### 使用方式 + +在选项中加入该选项;如传入多个文件,文件名之间用逗号隔开: + +~~~bash +-flto -finline-force=liba.a,libb.a +~~~ + +注:不带目标文件名的`-finline-force`形式仅用于全局内联的调试分析,不建议直接使用。 + +举例: + +~~~bash +gcc -O2 -fPIC -flto -c fa.c -o fa.o +gcc-ar rcs liba.a fa.o +gcc -O2 -fPIC -flto -finline-force=liba.a main.c liba.a -o exe +~~~ + diff --git "a/docs/zh/docs/GMEM/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/GMEM/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index 2a71e421688efdcde259e2235612d48841013712..44505e44e3fc03f86b0b9e52f92baa942924d564 100644 --- "a/docs/zh/docs/GMEM/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ 
"b/docs/zh/docs/GMEM/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -6,7 +6,7 @@ * 鲲鹏920处理器 * 昇腾910芯片 -* 操作系统:openEuler 23.09 +* 操作系统:openEuler 24.03 ## 环境准备 @@ -24,7 +24,7 @@ | 来源 | 软件包 | | ------------------------------------------------------------ | ------------------------------------------------------------ | - | openEuler 23.09 | kernel-6.4.0-xxx.aarch64.rpm
kernel-devel-6.4.0-xxx.aarch64.rpm
libgmem-xxx.aarch64.rpm
libgmem-devel-xxx.aarch64.rpm | + | openEuler 24.03 | kernel-6.6.0-xxx.aarch64.rpm
kernel-devel-6.6.0-xxx.aarch64.rpm
libgmem-xxx.aarch64.rpm
libgmem-devel-xxx.aarch64.rpm | | 昇腾社区 | # CANN软件包
Ascend-cann-toolkit-xxx-linux.aarch64.rpm
# NPU固件与驱动
Ascend-hdk-910-npu-driver-xxx.aarch64.rpm
Ascend-hdk-910-npu-firmware-xxx.noarch.rpm | | 联系GMEM社区维护人员
[@yang_yanchao](https://gitee.com/yang_yanchao) email:
[@LemmyHuang](https://gitee.com/LemmyHuang) email: | gmem-example-xxx.aarch64.rpm
mindspore-xxx-linux_aarch64.whl | diff --git a/docs/zh/docs/Gazelle/Gazelle.md b/docs/zh/docs/Gazelle/Gazelle.md index a40b562a7a6f64a10e8a519bee4e7ab26e61ec54..7dfbbbf4875468a244369c9b5453fe57c8781a74 100644 --- a/docs/zh/docs/Gazelle/Gazelle.md +++ b/docs/zh/docs/Gazelle/Gazelle.md @@ -12,7 +12,7 @@ Gazelle是一款高性能用户态协议栈。它基于DPDK在用户态直接读 完全兼容POSIX,零修改,适用不同类型的应用。 - 单进程且网卡支持多队列时,只需使用liblstack.so有更短的报文路径。其余场景使用ltran进程分发报文到各个线程。 + 单进程且网卡支持多队列时,只需使用liblstack.so有更短的报文路径。 ## 安装 @@ -36,22 +36,7 @@ dpdk >= 21.11-2 ### 1. 使用root权限安装ko -根据实际情况选择使用ko,提供虚拟网口、绑定网卡到用户态功能。 -若使用虚拟网口功能,则使用rte_kni.ko。 - -```sh -modprobe rte_kni carrier="on" -``` - -配置NetworkManager不托管kni网卡 - -```sh -[root@localhost ~]# cat /etc/NetworkManager/conf.d/99-unmanaged-devices.conf -[keyfile] -unmanaged-devices=interface-name:kni -[root@localhost ~]# systemctl reload NetworkManager -``` - +根据实际情况选择使用ko,提供绑定网卡到用户态功能。 网卡从内核驱动绑为用户态驱动的ko,根据实际情况选择一种。 ```sh @@ -99,21 +84,15 @@ cat查询实际预留页个数,连续内存不足时可能比预期少 ### 4. 挂载大页内存 -创建两个目录,分别给lstack的进程、ltran进程访问大页内存使用。操作步骤如下: +创建目录,给lstack的进程访问大页内存使用。操作步骤如下: ```sh -mkdir -p /mnt/hugepages-ltran mkdir -p /mnt/hugepages-lstack -chmod -R 700 /mnt/hugepages-ltran chmod -R 700 /mnt/hugepages-lstack -mount -t hugetlbfs nodev /mnt/hugepages-ltran -o pagesize=2M mount -t hugetlbfs nodev /mnt/hugepages-lstack -o pagesize=2M ``` ->说明: -/mnt/hugepages-ltran和/mnt/hugepages-lstack必须挂载同样pagesize大页 - ### 5. 应用程序使用Gazelle 有两种使用Gazelle方法,根据需要选择其一 @@ -150,31 +129,36 @@ gcc test.c -o test ${LSTACK_LIBS} |选项|参数格式|说明| |:---|:---|:---| -|dpdk_args|--socket-mem(必需)
--huge-dir(必需)
--proc-type(必需)
--legacy-mem
--map-perfect
-d|dpdk初始化参数,参考dpdk说明
--map-perfect为扩展特性,用于防止dpdk占用多余的地址空间,保证ltran有额外的地址空间分配给lstack。
-d参数加载指定so库文件| +|dpdk_args|--socket-mem(必需)
--huge-dir(必需)
--proc-type(必需)
--legacy-mem
--map-perfect
-d|dpdk初始化参数,参考dpdk说明
--map-perfect为扩展特性,用于防止dpdk占用多余的地址空间,保证有额外的地址空间分配给lstack。
-d参数加载指定so库文件| |listen_shadow| 0/1 | 是否使用影子fd监听。单listen线程,多协议栈线程时使能| -|use_ltran| 0/1 | 是否使用ltran | +|use_ltran| 0/1 | 是否使用ltran,功能已衰退,不再支持| |num_cpus|"0,2,4 ..."|lstack线程绑定的cpu编号,编号的数量为lstack线程个数(小于等于网卡多队列数量)。可按NUMA选择cpu| -|num_wakeup|"1,3,5 ..."|wakeup线程绑定的cpu编号,编号的数量为wakeup线程个数,与lstack线程的数量保持一致。与numcpus选择对应NUMA的cpu。不配置则为不使用唤醒线程| |low_power_mode|0/1|是否开启低功耗模式,暂不支持| -|kni_switch|0/1|rte_kni开关,默认为0。只有不使用ltran时才能开启| +|kni_switch|0/1|rte_kni开关,默认为0。功能已衰退,不再支持| |unix_prefix|"string"|gazelle进程间通信使用的unix socket文件前缀字符串,默认为空,和需要通信的ltran.conf的unix_prefix或gazellectl的-u参数配置一致。不能含有特殊字符,最大长度为128。| |host_addr|"192.168.xx.xx"|协议栈的IP地址,也是应用程序的IP地址| |mask_addr|"255.255.xx.xx"|掩码地址| |gateway_addr|"192.168.xx.1"|网关地址| -|devices|"aa:bb:cc:dd:ee:ff"|网卡通信的mac地址,需要与ltran.conf的bond_macs配置一致| +|devices|"aa:bb:cc:dd:ee:ff"|网卡通信的mac地址,在bond1模式下作为bond的主网口| |app_bind_numa|0/1|应用的epoll和poll线程是否绑定到协议栈所在的numa,缺省值是1,即绑定| |send_connect_number|4|设置为正整数,表示每次协议栈循环中发包处理的连接个数| |read_connect_number|4|设置为正整数,表示每次协议栈循环中收包处理的连接个数| |rpc_number|4|设置为正整数,表示每次协议栈循环中rpc消息处理的个数| |nic_read_num|128|设置为正整数,表示每次协议栈循环中从网卡读取的数据包的个数| -|mbuf_pool_size|1024000|设置为小于5120000的正整数,表示初始化时申请的mbuf地址池大小,需要根据网卡硬件支持进行合理配置,配置过小会启动失败| +|bond_mode|-1|设置bond模式,默认值-1表示关闭bond,当前支持两个网口组bond,支持bond1/4/6模式| +|bond_slave_mac|"aa:bb:cc:dd:ee:ff;AA:BB:CC:DD:EE:FF"|设置组bond网口的mac地址信息,以;分隔| +|bond_miimon|10|设置bond模式的监听间隔,默认值10,取值范围0~1500| +|udp_enable|0/1|是否开启udp功能,默认值1开启| +|nic_vlan_mode|-1|是否开启vlan模式,默认值-1关闭,取值范围-1~4095,0和4095是业界通用预留id,无实际效果| +|tcp_conn_count|1500|tcp的最大连接数,该参数乘以mbuf_count_per_conn是初始化时申请的mbuf池大小,配置过小会启动失败,tcp_conn_count * mbuf_count_per_conn * 2048字节不能大于大页大小| +|mbuf_count_per_conn|170|每个tcp连接需要的mbuf个数,该参数乘以tcp_conn_count是初始化时申请的mbuf地址池大小,配置过小会启动失败,tcp_conn_count * mbuf_count_per_conn * 2048字节不能大于大页大小| lstack.conf示例: ```sh dpdk_args=["--socket-mem", "2048,0,0,0", "--huge-dir", "/mnt/hugepages-lstack", "--proc-type", "primary", "--legacy-mem", "--map-perfect"] -use_ltran=1 +use_ltran=0 kni_switch=0 low_power_mode=0 @@ -192,56 
+176,15 @@ read_connect_number=4 rpc_number=4 nic_read_num=128 mbuf_pool_size=1024000 -``` - -- ltran.conf用于指定ltran启动的参数,默认路径为/etc/gazelle/ltran.conf。使用ltran时,lstack.conf内配置use_ltran=1,配置参数如下: - -|选项|参数格式|说明| -|:---|:---|:---| -|forward_kit|"dpdk"|指定网卡收发模块。
保留字段,目前未使用。| -|forward_kit_args|-l
--socket-mem(必需)
--huge-dir(必需)
--proc-TYPE(必需)
--legacy-mem(必需)
--map-perfect(必需)
-d|dpdk初始化参数,参考dpdk说明。
--map-perfect为扩展特性,用于防止dpdk占用多余的地址空间,保证ltran有额外的地址空间分配给lstack。
-d参数加载指定so库文件| -|kni_switch|0/1|rte_kni开关,默认为0| -|unix_prefix|"string"|gazelle进程间通信使用的unix socket文件前缀字符串,默认为空,和需要通信的lstack.conf的unix_prefix或gazellectl的-u参数配置一致| -|dispatch_max_clients|n|ltran支持的最大client数。
lstack的协议栈线程总数不大于32| -|dispatch_subnet|192.168.xx.xx|子网掩码,表示ltran能识别的IP所在子网网段。参数为样例,子网按实际值配置。| -|dispatch_subnet_length|n|子网长度,表示ltran能识别的子网长度,例如length为4时,192.168.1.1-192.168.1.16| -|bond_mode|n|bond模式,目前只支持Active Backup(Mode1),取值为1| -|bond_miimon|n|bond链路监控时间,单位为ms,取值范围为1到2^64 - 1 - (1000 * 1000)| -|bond_ports|"0x01"|使用的dpdk网卡,0x01表示第一块| -|bond_macs|"aa:bb:cc:dd:ee:ff"|绑定的网卡mac地址,需要跟kni的mac地址保持一致| -|bond_mtu|n|最大传输单元,默认是1500,不能超过1500,最小值为68,不能低于68| - -ltran.conf示例: - -```sh -forward_kit_args="-l 0,1 --socket-mem 1024,0,0,0 --huge-dir /mnt/hugepages-ltran --proc-type primary --legacy-mem --map-perfect --syslog daemon" -forward_kit="dpdk" - -kni_switch=0 - -dispatch_max_clients=30 -dispatch_subnet="192.168.1.0" -dispatch_subnet_length=8 - bond_mode=1 -bond_mtu=1500 -bond_miimon=100 -bond_macs="aa:bb:cc:dd:ee:ff" -bond_ports="0x1" - -tcp_conn_scan_interval=10 +bond_slave_mac="aa:bb:cc:dd:ee:ff;AA:BB:CC:DD:EE:FF" +udp_enable=1 +nic_vlan_mode=-1 ``` -### 7. 启动应用程序 - -- 启动ltran进程 - - 单进程且网卡支持多队列,则直接使用网卡多队列分发报文到各线程,不启动ltran进程,lstack.conf的use_ltran配置为0. - 启动ltran时不使用-config-file指定配置文件,则使用默认路径/etc/gazelle/ltran.conf +- ltran模式功能已衰退,多进程使用需求可尝试使用SR-IOV硬件虚拟化组网模式。 - ```sh - ltran --config-file ./ltran.conf - ``` +### 7. 启动应用程序 - 启动应用程序 @@ -258,40 +201,29 @@ Gazelle wrap应用程序POSIX接口,应用程序无需修改代码。 ### 9. 
调测命令 -- 不使用ltran模式时不支持gazellectl ltran xxx命令,以及gazellectl lstack show {ip | pid} -r命令 - ```sh Usage: gazellectl [-h | help] - or: gazellectl ltran {quit | show | set} [LTRAN_OPTIONS] [time] [-u UNIX_PREFIX] or: gazellectl lstack {show | set} {ip | pid} [LSTACK_OPTIONS] [time] [-u UNIX_PREFIX] - quit ltran process exit - - where LTRAN_OPTIONS := - show ltran all statistics - -r, rate show ltran statistics per second - -i, instance show ltran instance register info - -b, burst show ltran NIC packet len per second - -l, latency show ltran latency - set: - loglevel {error | info | debug} set ltran loglevel - where LSTACK_OPTIONS := show lstack all statistics -r, rate show lstack statistics per second -s, snmp show lstack snmp -c, connetct show lstack connect -l, latency show lstack latency + -x, xstats show lstack xstats + -k, nic-features show state of protocol offload and other feature + -a, aggregatin [time] show lstack send/recv aggregation set: loglevel {error | info | debug} set lstack loglevel lowpower {0 | 1} set lowpower enable [time] measure latency time default 1S ``` --u参数指定gazelle进程间通信的unix socket前缀,和需要通信的ltran.conf或lstack.conf的unix_prefix配置一致。 +-u参数指定gazelle进程间通信的unix socket前缀,和需要通信的lstack.conf的unix_prefix配置一致。 **抓包工具** -gazelle使用的网卡由dpdk接管,因此普通的tcpdump无法抓到gazelle的数据包。作为替代,gazelle使用dpdk-tools软件包中提供的gazelle-pdump作为数据包捕获工具,它使用dpdk的多进程模式和lstack/ltran进程共享内存。在ltran模式下,gazelle-pdump只能抓取和网卡直接通信的ltran的数据包,通过tcpdump的数据包过滤,可以过滤特定lstack的数据包。 +gazelle使用的网卡由dpdk接管,因此普通的tcpdump无法抓到gazelle的数据包。作为替代,gazelle使用dpdk-tools软件包中提供的gazelle-pdump作为数据包捕获工具,它使用dpdk的多进程模式和lstack进程共享内存。 [详细使用方法](https://gitee.com/openeuler/gazelle/blob/master/doc/pdump.md) **线程名绑定** @@ -316,11 +248,9 @@ lstack启动时可以通过指定环境变量GAZELLE_THREAD_NAME来指定lstack - 不支持accept阻塞模式或者connect阻塞模式。 - 最多支持1500个TCP连接。 -- 当前仅支持TCP、ICMP、ARP、IPv4 协议。 +- 当前仅支持TCP、ICMP、ARP、IPv4、UDP 协议。 - 在对端ping Gazelle时,要求指定报文长度小于等于14000B。 - 不支持使用透明大页。 -- ltran不支持使用多种类型的网卡混合组bond。 -- 
ltran的bond1主备模式,只支持链路层故障主备切换(例如网线断开),不支持物理层故障主备切换(例如网卡下电、拔网卡)。 - 虚拟机网卡不支持多队列。 ### 操作约束 @@ -328,24 +258,25 @@ lstack启动时可以通过指定环境变量GAZELLE_THREAD_NAME来指定lstack - 提供的命令行、配置文件默认root权限。非root用户使用,需先提权以及修改文件所有者。 - 将用户态网卡绑回到内核驱动,必须先退出Gazelle。 - 大页内存不支持在挂载点里创建子目录重新挂载。 -- ltran需要最低大页内存为1GB。 - 每个应用实例协议栈线程最低大页内存为800MB 。 - 仅支持64位系统。 - 构建x86版本的Gazelle使用了-march=native选项,基于构建环境的CPU(Intel® Xeon® Gold 5118 CPU @ 2.30GHz指令集进行优化。要求运行环境CPU支持 SSE4.2、AVX、AVX2、AVX-512 指令集。 - 最大IP分片数为10(ping 最大包长14790B),TCP协议不使用IP分片。 - sysctl配置网卡rp_filter参数为1,否则可能不按预期使用Gazelle协议栈,而是依然使用内核协议栈。 -- 不使用ltran模式,KNI网口不可配置只支持本地通讯使用,且需要启动前配置NetworkManager不管理KNI网卡。 -- 虚拟KNI网口的IP及mac地址,需要与lstack.conf配置文件保持一致 。 +- 不支持使用多种类型的网卡混合组bond。 +- bond1主备模式,只支持链路层故障主备切换(例如网线断开),不支持物理层故障主备切换(例如网卡下电、拔网卡)。 +- 发送udp报文包长超过45952(32 * 1436)B时,需要将send_ring_size扩大为至少64个。 ## 注意事项 用户根据使用场景评估使用Gazelle +ltran模式及kni模块由于上游社区及依赖包变更,功能在新版本中不再支持. + 共享内存 - 现状 - 大页内存 mount 至 /mnt/hugepages-lstack 目录,进程初始化时在 /mnt/hugepages-lstack 目录下创建文件,每个文件对应一个大页,并 mmap 这些文件。ltran 在收到 lstask 的注册信息后,根据大页内存配置信息页 mmap 目录下文件,实现大页内存共享。 - ltran 在 /mnt/hugepages-ltran 目录的大页内存同理。 + 大页内存 mount 至 /mnt/hugepages-lstack 目录,进程初始化时在 /mnt/hugepages-lstack 目录下创建文件,每个文件对应一个大页,并 mmap 这些文件。 - 当前消减措施 大页文件权限 600,只有 OWNER 用户才能访问文件,默认 root 用户,支持配置成其他用户; 大页文件有 DPDK 文件锁,不能直接写或者映射。 @@ -356,4 +287,4 @@ lstack启动时可以通过指定环境变量GAZELLE_THREAD_NAME来指定lstack Gazelle没有做流量限制,用户有能力发送最大网卡线速流量的报文到网络,可能导致网络流量拥塞。 **进程仿冒** -合法注册到ltran的两个lstack进程,进程A可仿冒进程B发送仿冒消息给ltran,修改ltran的转发控制信息,造成进程B通讯异常,进程B报文转发给进程A信息泄露等问题。建议lstack进程都为可信任进程。 +建议lstack进程都为可信任进程。 diff --git "a/docs/zh/docs/HCK/HCK\347\211\271\346\200\247\345\217\212\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/HCK/HCK\347\211\271\346\200\247\345\217\212\344\275\277\347\224\250\346\211\213\345\206\214.md" deleted file mode 100644 index 22450003dcb75042fc12e3ac1f7089f1e911d847..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/HCK/HCK\347\211\271\346\200\247\345\217\212\344\275\277\347\224\250\346\211\213\345\206\214.md" 
+++ /dev/null @@ -1,111 +0,0 @@ -# 特性说明 -HCK(High-performance Computing Kit)是HPC应用的软件底座,是在通用Linux平台上通过定制系统调用,调度优化等手段,为HPC应用提供隔离/低底噪的运行环境,同时兼容Linux生态,达到提高HPC应用性能的目的。在构建内核时需要将选项`CONFIG_PURPOSE_BUILT_KERNEL`开启来使能HCK特性。通过在系统启动阶段隔离出一部分CPU用来运行HPC应用,内核线程和其他用户进程则运行在非隔离核上。 - -# 使用说明 -在内核支持HCK特性前提下,需要在内核启动阶段增加启动参数`pbk_cpus`设置预留的CPU资源,默认不预留。可以通过修改grub的配置文件`grub.cfg`或grub的启动界面设置此参数。 -启动参数的格式为: pbk_cpus=,cpulist支持`","`和`"-"`连接,例如`1,3-4`表示预留CPU1/CPU3和CPU4。HCK特性暂不支持x86架构,需要注意的是,在aarch64架构下不支持预留0核。 -当配置好启动参数后,内核会自动完成CPU资源的隔离预留,系统启动后不需要额外的部署步骤。 - -# 用户态launcher工具使用 - -HCK特性有一个用户态的launcher工具,通过其可以指定程序在特定的隔离核上运行,该工具需要单独安装。 - -## 基本选项 - -参见 `launcher -?`: -``` -Usage: launcher [OPTION...] - -launcher: launch process to pbk domain - -EXAMPLES: - launcher -c 1,2 prog # alloc CPU 1,2 from pbk root domain to run prog - launcher -n 2 prog # alloc 2 CPUs from pbk root domain to run prog - launcher -v prog # prog will only touch pbk - launcher -c 1 prog # alloc CPU 1 and 1M memory - - -c, --cpulist=CPULIST cpulist to run prog - -n, --nr_cpu=NR_CPU nr_cpus to run prog - -v, --pbk_view run prog with pbk view - -?, --help Give this help list - --usage Give a short usage message - -Mandatory or optional arguments to long options are also mandatory or optional -for any corresponding short options. -``` -## 选项详解 - -### launcher -? 
- -显示该命令的使用说明。 - -### launcher -c - -指定`cpulist`运行程序,例如: - -``` -launcher -c 1,2 ./test -``` - -launcher将会申请CPU1和CPU2用于执行test程序,如果CPU1和CPU2不是隔离核或者已经被其他程序申请,那么返回失败。 - -存在一种例外情况,例如执行以下操作: - -``` -launcher -c 1,2 ./test1 -launcher -c 1,2 ./test2 -``` - -执行test2时,指定了相同的`cpulist`,此时不会返回失败,而是将test2也运行到CPU1和CPU2上。 - -### launcher -n - -指定CPU数量运行程序,例如 - -``` -launcher -n 1 ./test -``` - -launcher将会从隔离核中申请一个CPU来运行test程序,如果剩余的隔离核不足,则返回失败。 - -### launcher -v - -在CPU隔离后,CPU资源划分为Linux资源和隔离核资源。为了适配mpirun等具备资源识别分配功能的程序,在launcher中提供修改资源视角的功能,例如: - -``` -launcher -v cat /proc/cpuinfo -``` - -此时只会显示隔离核而非所有的CPU。 - -此选项还可以结合`-n`和`-c`选项使用,例如 - -``` -launcher -c 1,2 -v cat /proc/cpuinfo -``` - -此时只会显示CPU1和CPU2的信息。 - -受影响的procfs和sysfs的接口如下: - -procfs: - -``` -/proc/cpuinfo -``` - -sysfs: - -``` -/sys/devices/system/cpu/online -/sys/devices/system/cpu/present -/sys/devices/system/cpu/possible -/sys/devices/system/cpu/cpu/online -/sys/devices/system/node/node/cpumap -/sys/devices/system/node/node/cpulist -/sys/fs/cgroup/cpuset/cpuset.cpus -/sys/fs/cgroup/cpuset/cpuset.effective_cpus -``` - -注:除了cgroup相关的两个接口默认仅显示Linux资源外(这是因为隔离核没有加入到cgroup的cpuset中),其他接口默认都会显示Linux资源+隔离核资源,不受隔离影响。 - diff --git a/docs/zh/docs/Installation/RISC-V-LicheePi4A.md b/docs/zh/docs/Installation/RISC-V-LicheePi4A.md new file mode 100644 index 0000000000000000000000000000000000000000..5614f42a70bb1b4427e947b14bf346c014d7db10 --- /dev/null +++ b/docs/zh/docs/Installation/RISC-V-LicheePi4A.md @@ -0,0 +1,109 @@ +# 在 Licheepi4A 上安装 + +## 硬件准备 + +- `Sipeed LicheePi 4A` 设备 1 台(`8 GB` 或 `16 GB` 款均可) + +- 显示器一台 + +- `USB` 键盘及鼠标一套 + +- 串口操作所需设备/组件(可选) + +- `RJ45` 网线 1 条以上及路由器/交换机等设备供有线网络连接使用 + +## 设备固件 + +`LicheePi4A` 不同内存版本需要使用不同的固件。 + +- `u-boot-with-spl-lpi4a.bin` 为 8 GB 版本的 u-boot 文件。 +- `u-boot-with-spl-lpi4a-16g.bin` 为 16 GB 版本的 u-boot 文件。 + +以下的烧录方式以 `16GB + 128GB` 核心板为例,假设你已经下载好了 `base` 镜像和对应的固件文件。 + +## 烧录方法 + +### 烧录工具 + +主要利用 `fastboot` 烧录,可以从 `https://dl.sipeed.com/shareURL/LICHEE/licheepi4a/07_Tools` 下载 
`burn_tools.zip`,压缩包内包含了针对 `Windows/macOS/Linux` 三个系统的烧录工具。 + +### 设置硬件进入烧录模式 + +> 请先注意检查底板的拨码开关是否为 EMMC 启动模式,在确认无误之后即可烧录。 + +按住板上的 `BOOT` 按键不放,然后插入 `USB-C` 线缆上电(线缆另一头接 `PC` ),即可进入 `USB` 烧录模式。 + +在 `Windows` 下使用设备管理器查看,会出现 `USB download gadget` 设备。 + +在 `Linux` 下,使用 `lsusb` 查看设备,会显示以下设备: `ID 2345:7654 T-HEAD USB download gadget`。 + +### Windows 下驱动安装 + +> 注意: +> +> 我们提供下载的镜像内并未包含 Windows 侧的驱动程序。你可以从[这里](https://dl.sipeed.com/shareURL/LICHEE/licheepi4a/07_Tools)下载 burn_tools.zip,在压缩包内找到 windows/usb_driver-fullmask 的文件夹,这个文件夹内的文件是 Windows 系统下需要安装的驱动。 + +Windows 下烧录时,需要先进入高级启动模式,禁用数字签名,才能正常安装下面的驱动。禁用数字签名请按照下面的步骤操作。 + +#### Windows 10 + +1. 进入 `Windows 10` 的设置,点击“更新和安全”。 +2. 点击左侧的`恢复`,之后在右边点击高级启动下面的`重新启动`,此时电脑会重新启动。如果当前有未完成的工作,请先保存后再执行。 + +#### Windows 11 + +1. 进入 `Windows 11` 的设置,找到`系统`菜单之后,点击`恢复`。 + +2. 之后在右边点击高级启动下面的“重新启动”,此时电脑会重新启动。如果当前有未完成的工作,请先保存后再执行。 + +#### 重启之后 + +1. 点击`疑难解答`,然后点击`高级` -> `启动设置`,随后系统将再次重启。 + +2. 重启之后会进入启动设置,在这里我们需要选择`禁用强制驱动程序签名`。通常这个选项为数字 7,但实际可能有变。在按下对应选项的数字后,系统会再次重新启动。 + +3. 在重启进入系统之后,我们可以开始安装驱动了。打开`设备管理器`,找到`其它设备`内的 `USB download gadget`,双击该设备。 + +4. 点击`常规`选项卡下方的`更新驱动程序`。 + +5. 随后,在`浏览计算机上的驱动程序`页面粘贴你复制的 `usb_driver-fullmask` 目录的路径。 + +6.
点击下一步,此时驱动可以被安装成功。 + +### 烧录镜像 + +进入烧录模式后,我们就可以使用 fastboot 进行烧录操作,在 macOS 或者 Linux 下,若 fastboot 为自行安装的,则你可能需要先赋予 fastboot 可执行权限。 + +#### Windows 系统步骤 + +请先将 `fastboot` 添加至系统环境变量 `PATH` 内,或者将 `fastboot` 放置于同一目录下。不要忘记将镜像也解压。随后打开 `PowerShell`,执行以下命令: + +``` bash +# 将这里的文件替换成跟板子版本对应的 u-boot 文件 +fastboot flash ram u-boot-with-spl-lpi4a-16g.bin +fastboot reboot +# 在执行重启命令之后,等待 5 秒钟后继续执行 +# 将这里的文件替换成跟板子版本对应的 u-boot 文件 +fastboot flash uboot u-boot-with-spl-lpi4a-16g.bin +fastboot flash boot openEuler-24.03-LTS-SP1-riscv64-lpi4a-base-boot.ext4 +fastboot flash root openEuler-24.03-LTS-SP1-riscv64-lpi4a-base-root.ext4 +``` + +#### Linux/macOS 系统步骤 + +你可能需要在 `fastboot` 指令前加入 `sudo` 指令。 + +``` bash +# 将这里的文件替换成跟板子版本对应的 u-boot 文件 +fastboot flash ram u-boot-with-spl-lpi4a-16g.bin +fastboot reboot +# 在执行重启命令之后,等待 5 秒钟后继续执行 +# 将这里的文件替换成跟板子版本对应的 u-boot 文件 +fastboot flash uboot u-boot-with-spl-lpi4a-16g.bin +fastboot flash boot openEuler-24.03-LTS-SP1-riscv64-lpi4a-base-boot.ext4 +fastboot flash root openEuler-24.03-LTS-SP1-riscv64-lpi4a-base-root.ext4 +``` + +## 硬件可用性 + +官方发布版本基于 [`openEuler kernel6.6`](./RISCV-OLK6.6同源版本指南.md) 同源版本构建,并非所有内核模块都完整支持。该版本强调官方生态体验完整一致,如果需要更完善的硬件功能,需要使用第三方发布版本。 diff --git a/docs/zh/docs/Installation/RISC-V-Pioneer1.3.md b/docs/zh/docs/Installation/RISC-V-Pioneer1.3.md new file mode 100644 index 0000000000000000000000000000000000000000..6d4db22e41882784bdc4835c5699533b8257437c --- /dev/null +++ b/docs/zh/docs/Installation/RISC-V-Pioneer1.3.md @@ -0,0 +1,103 @@ +# Pioneer Box 上安装 + +## 硬件准备 + +- `Milk-V Pioneer v1.3` 版本设备 1 台,或主板(及必需外设) 1 套 + +- `m.2 NVMe` 固态硬盘 1 块 + +> 若原先有数据则需格式化清除(个人资料请注意备份)。 +> +> 若有 `PCIe` 转接卡,则可通过转接卡置于设备 `PCIe` 第一槽位(较推荐)。 +> +> 若无 `PCIe` 转接卡,则可置于板载 `NVMe` 接口。 + +- `AMD R5 230` 显卡 1 块 + +> 置于设备 `PCIe` 第二槽位。 + +- `U 盘` 1 个 + +> 大小应为 `16GiB` 以上。 + +- `microSD 卡` 1 张 + +> 大小应为 `4GiB` 以上。 + +- 显示器 1 台(显示接口需与显卡对应) + +- `USB` 键盘及鼠标 1 套 + +- 串口操作所需设备/组件(可选) + +- `RJ45` 网线 1 条以上及路由器/交换机等设备供有线网络连接使用 + +> 推荐使用设备板载 `RJ45` 网口而非厂家附送的 `PCIe` 电口网卡。 
+> +> 设备出厂未附送 `WiFi` 网卡,不具备 `WiFi` 或蓝牙连接能力。如有需要请自备对应设备。 + +## 镜像种类 + +### ISO + +> `ISO` 镜像支持 `UEFI` 方式启动,对应下文的 **UEFI 版**固件。 + +从官网下载页面获取 `ISO` 文件(如 `openEuler-24.03-LTS-SP1-riscv64-dvd.iso`),并将其烧录至 **U盘** 中即可。 + +- 推荐使用 `Balena Etcher` 软件进行图形化烧录(从 `https://github.com/balena-io/etcher/releases/latest` 下载),烧写过程此处略过。 +- 命令行环境下也可采用 `dd` 方式进行烧录,参考命令如下: + +``` txt +~$ sudo dd if=openEuler-24.03-LTS-SP1-riscv64-dvd.iso of=/dev/sda bs=512K iflag=fullblock oflag=direct conv=fsync status=progress +``` + +### Image + +> `Image` 镜像支持 `Legacy` 方式启动,对应下文的**非 UEFI 版**固件。 + +从官网下载页面获取内含镜像 `Image` 的 `Zip` 压缩包(如 `openEuler-24.03-LTS-SP1-riscv64-sg2042.img.zip`),并将其直接烧录至 **SDCARD** 或 **固态硬盘** 中即可。 + +## 设备固件 + +> 因设备出厂固件目前并未支持 `UEFI`,`ISO` 版本使用者需先手动替换固件为基于 `EDK2` 的 `UEFI` 版固件。 +> +从官网下载页面 **嵌入式分类** 中下载设备固件压缩包:`sg2042_firmware_uefi.zip`,解压后烧录其中 `img` 文件至 **SDCARD**。 + +```txt +~$ sudo dd if=firmware_single_sg2042-master.img of=/dev/sda bs=512K iflag=fullblock oflag=direct conv=fsync status=progress +输入了 512+0 块记录 +输出了 512+0 块记录 +268435456 字节 (268 MB, 256 MiB) 已复制,20.5611 s,13.1 MB/s +``` + +> 因设备出厂固件版本较老,Image 镜像使用者如果想要使用较新版本的固件,可以更新**非 UEFI 版**固件。 +> +从官网下载页面 **嵌入式分类** 中下载设备固件压缩包:`sg2042_firmware_uboot.zip`,参考 UEFI 版固件的操作,解压后烧录其中 img 文件至 **SDCARD**。 + +烧录完成后,请将 **SDCARD** 插入设备的卡槽中。 + +## 启动前检查 + +`ISO` 版本使用者需检查: + +- 载有 `UEFI` 版固件的 `microSD 卡`是否插入设备卡槽中。 + + > `UEFI` 版固件目前尚无法手动调整或指定启动顺序,敬请谅解。 + +- 如使用设备原厂附送固态硬盘,或硬盘上存在另一可启动的 `RISC-V` 操作系统,则需卸下固态硬盘进行格式化或更换内容为空的另一块固态硬盘,以避免启动顺序上的干涉。 + +`Image` 版本需检查: + +- 如使用设备原厂附送固态硬盘,或硬盘上存在另一可启动的 `RISC-V` 操作系统,则需卸下固态硬盘进行格式化或更换内容为空的另一块固态硬盘,以避免启动顺序上的干涉。 + +## 使用须知 + +`ISO` 版本使用者: + +- 由于当前版本 `UEFI` 固件的局限性,启动时若显卡插入 `PCIe` 槽位,`Grub2` 启动菜单可能需要花费较长时间 (~15s) 才能加载完成且响应较为迟缓。 + +`Image` 版本使用者: + +- 由于当前出厂固件的局限性,设备启动时 `RISC-V` 串口回显并不完整,操作系统未加载完成时串口输出即会关闭。需将显卡插入 `PCIe` 槽位并连接显示器才能观察到完整的启动过程。 diff --git a/docs/zh/docs/Installation/RISC-V-QEMU.md
b/docs/zh/docs/Installation/RISC-V-QEMU.md new file mode 100644 index 0000000000000000000000000000000000000000..89a8166a5e2b87bb912a0c1ebbcf6656a20038f2 --- /dev/null +++ b/docs/zh/docs/Installation/RISC-V-QEMU.md @@ -0,0 +1,71 @@ +# 在 QEMU 上安装 + +## 固件 + +### 标准 EFI 固件 + +自下载页面下载如下二进制: + +``` text +RISCV_VIRT_CODE.fd +RISCV_VIRT_VARS.fd +``` + +或者根据[官方文档](https://github.com/tianocore/edk2/tree/master/OvmfPkg/RiscVVirt)本地自行编译最新 EDK2 OVMF 固件。 + +### 具备 Penglai TEE 支持的 EFI 固件 + +自下载页面下载如下二进制: + +``` text +fw_dynamic_oe_2403_penglai.bin +``` + +## QEMU 版本 + +>为了支持 UEFI,需使用 8.1 版本以上的 QEMU。 +> +>编译时需要安装 libslirp 依赖库(包名根据发行版不同而不同,openEuler 为 libslirp-devel)并添加 --enable-slirp 参数。 + +``` text +~$ qemu-system-riscv64 --version +QEMU emulator version 8.2.2 +Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers +``` + +## qcow2 镜像 + +获取 qcow2 镜像压缩包(如 `openEuler-24.03-LTS-SP1-riscv64.qcow2.xz`)。 + +``` bash +~$ ls *.qcow2.xz +openEuler-24.03-LTS-SP1-riscv64.qcow2.xz +``` + +## 获取启动脚本 + +自下载页面获取启动脚本。 + +- start_vm.sh:默认脚本,需要手动安装桌面。 +- start_vm_penglai.sh:蓬莱 TEE 功能支持脚本。 + +脚本参数: + +- ssh_port:本地 SSH 转发端口,默认为 12055。 +- vcpu:QEMU 执行时线程数量,默认为 8 核心,可随需要调整。 +- memory:QEMU 执行时分配内存大小,默认为 8GiB,可随需要调整。 +- fw:启动固件 payload。 +- drive:虚拟磁盘路径,可随需要调整。 +- bios(可选):启动固件,可以用来装载使能了 penglai TEE 的固件。 + +## 创建虚拟硬盘文件 + +创建新的虚拟硬盘文件,如下列例子中虚拟硬盘的大小为 40GiB。 +> 请勿使用先前存有数据的 qcow2 虚拟硬盘文件,以避免启动过程出现预期之外的情况。 +> +> 请确保当前目录中有且仅有一个 qcow2 虚拟硬盘文件,以避免启动脚本识别出错。 + +``` bash +~$ qemu-img create -f qcow2 qemu.qcow2 40G +``` diff --git "a/docs/zh/docs/Installation/RISCV-OLK6.6\345\220\214\346\272\220\347\211\210\346\234\254\346\214\207\345\215\227.md" "b/docs/zh/docs/Installation/RISCV-OLK6.6\345\220\214\346\272\220\347\211\210\346\234\254\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..00f6b368e162d47821c0764990852a2b801357b6 --- /dev/null +++
"b/docs/zh/docs/Installation/RISCV-OLK6.6\345\220\214\346\272\220\347\211\210\346\234\254\346\214\207\345\215\227.md" @@ -0,0 +1,54 @@ +# RISCV-OLK6.6同源版本指南 + +## RISCV-OLK6.6 同源计划 + +目前各个 RISC-V SoC 厂商维护的 kernel 版本并不一致,而 openEuler 系统要求每个版本统一内核。这导致基于各种开发板发布的各种操作系统版本都是内核不一致的第三方版本,增大了维护的难度并且带来了生态的分裂。riscv-kernel 目标是针对 RISC-V 架构在 openEuler 建立统一的 kernel 生态,共享欧拉生态建设与影响。该项目处于开发中,欢迎各方力量积极贡献。 + +目前项目主要依托在 的 OLK-6.6 分支进行开发,并且会进一步合入到 OLK 的源码仓和制品仓上。 + +![riscv-olk6.6](figures/riscv-olk6.6.jpg) +目前项目已经基本完成 SG2042 的同源工作,并且完成 TH1520 的基础同源工作。 + +## 支持的特性 + +- SG2042 验证平台:MilkV Pioneer 1.3 + +- TH1520 验证平台:LicheePi4A + +### Milk-V Pioneer 特性列表 + +| Features | Status | +| ----------------------- | :----: | +| 64 Core CPU | O | +| PCIe Network Card | O | +| PCIe Graphic Card | O | +| PCIe Slots | O | +| 4x DDR4 128GB RAM | O | +| USB | O | +| Reset | O | +| eMMC | O | +| Micro USB debug console | O | +| micro SD card | O | +| SPI flash | O | +| RVV 0.71 | X | + +### LicheePi 4A 特性列表 + +| Features | Status | +| ---------------- | :----: | +| 4 Core CPU | O | +| RAM | O | +| eMMC | O | +| Ethernet | O | +| WIFI | X | +| GPU IMG BXM-4-64 | X | +| NPU 4TOPS@INT8 | X | +| DSP | X | +| USB | O | +| MicroSD | O | +| GPIO | O | +| PWM-fan | O | +| PVT Sensor | O | +| Reboot | O | +| Poweroff | O | +| cpufreq | O | diff --git a/docs/zh/docs/Installation/figures/riscv-olk6.6.jpg b/docs/zh/docs/Installation/figures/riscv-olk6.6.jpg new file mode 100644 index 0000000000000000000000000000000000000000..8a00c4fd2033954b1d0d99eb376ea5c1436db7fb Binary files /dev/null and b/docs/zh/docs/Installation/figures/riscv-olk6.6.jpg differ diff --git a/docs/zh/docs/Installation/riscv_more.md b/docs/zh/docs/Installation/riscv_more.md deleted file mode 100644 index 6a106f35a83dcdfc9cfe883357b182968292771f..0000000000000000000000000000000000000000 --- a/docs/zh/docs/Installation/riscv_more.md +++ /dev/null @@ -1,4 +0,0 @@ -# 参考资料 - -- visionfive 上使用 openEuler RISC-V -- 在 RISC-V 平台玩转 openEuler diff --git
a/docs/zh/docs/Installation/riscv_qemu.md b/docs/zh/docs/Installation/riscv_qemu.md deleted file mode 100644 index 6ed7c17dc6524b737124950bc20da4273632ee4c..0000000000000000000000000000000000000000 --- a/docs/zh/docs/Installation/riscv_qemu.md +++ /dev/null @@ -1,341 +0,0 @@ -# 安装指导 - -本章以 QEMU 安装为例介绍安装openEuler,其他安装方式参考开发板安装页面。 - -## 安装 QEMU - -### 系统环境 - -目前该方案测试过的环境包括 WSL2 (Ubuntu 20.04.4 LTS and Ubuntu 22.04.1 LTS) 和 Ubuntu 22.04.1 live-server LTS。 - -## 安装支持 RISC-V 架构的 QEMU 模拟器 - -安装发行版提供的 `qemu-system-riscv64` 软件包。截止本文档编写时,openEuler 23.09 x86_64 提供 QEMU 6.2.0 (qemu-system-riscv-6.2.0-80.oe2309.x86_64): - -``` bash -dnf install -y qemu-system-riscv -``` - -由于 QEMU 8.0 及更新版本提供了大量针对 RISC-V 的修复和更新,我们推荐使用 QEMU 8.0 或更新版本以获得更佳体验。下面以 QEMU 8.1.2 为例。 - -### 手动编译安装 - -由于自带的包常常过旧,若软件包过旧,使用以下方案编译和安装。 - -``` bash -wget https://download.qemu.org/qemu-8.1.2.tar.xz -tar -xvf qemu-8.1.2.tar.xz -cd qemu-8.1.2 -mkdir res -cd res -sudo apt install libspice-protocol-dev libepoxy-dev libgtk-3-dev libspice-server-dev build-essential autoconf automake autotools-dev pkg-config bc curl gawk git bison flex texinfo gperf libtool patchutils mingw-w64 libmpc-dev libmpfr-dev libgmp-dev libexpat-dev libfdt-dev zlib1g-dev libglib2.0-dev libpixman-1-dev libncurses5-dev libncursesw5-dev meson libvirglrenderer-dev libsdl2-dev -y -../configure --target-list=riscv64-softmmu,riscv64-linux-user --prefix=/usr/local/bin/qemu-riscv64 --enable-slirp -make -j$(nproc) -sudo make install -``` - -上述指令会将 QEMU 安装到 `/usr/local/bin/qemu-riscv64`。将 `/usr/local/bin/qemu-riscv64/bin` 添加至 `$PATH` 即可使用。 - -如需在其他操作系统下,包括 openEuler 下进行编译安装,请参考 [QEMU 官方文档](https://wiki.qemu.org/Hosts/Linux)。 - -openEuler 编译所需依赖包可参考 RHEL / CentOS,如下: - -``` bash -sudo dnf install -y git glib2-devel libfdt-devel pixman-devel zlib-devel bzip2 ninja-build python3 \ - libaio-devel libcap-ng-devel libiscsi-devel capstone-devel \ - gtk3-devel vte291-devel ncurses-devel \ - libseccomp-devel nettle-devel libattr-devel libjpeg-devel \ - 
brlapi-devel libgcrypt-devel lzo-devel snappy-devel \ - librdmacm-devel libibverbs-devel cyrus-sasl-devel libpng-devel \ - libuuid-devel pulseaudio-libs-devel curl-devel libssh-devel \ - systemtap-sdt-devel libusbx-devel -curl -LO https://download.qemu.org/qemu-8.1.2.tar.xz -tar -xvf qemu-8.1.2.tar.xz -cd qemu-8.1.2 -mkdir res -cd res -../configure --target-list=riscv64-softmmu,riscv64-linux-user --prefix=/usr/local/bin/qemu-riscv64 -make -j$(nproc) -sudo make install -``` - -## 准备 openEuler RISC-V 磁盘映像 - -### 下载磁盘映像 - -需要下载启动固件 (`fw_payload_oe_uboot_2304.bin`),磁盘映像(`openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz`)和启动脚本(`start_vm.sh`)。 - -### 下载目录 - -目前的构建位于 [openEuler Repo](https://repo.openeuler.org/openEuler-23.09/virtual_machine_img/riscv64/) 中。您也可以访问 [openEuler 官网](https://www.openeuler.org/zh/download/),从其他镜像源获取镜像。 - -### 内容说明 - -- `fw_payload_oe_uboot_2304.bin`: 启动固件 -- `openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz`: openEuler RISC-V QEMU 虚拟机磁盘映像压缩包 -- `openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz.sha256sum`: openEuler RISC-V QEMU 虚拟机磁盘映像压缩包的校验。使用 `sha256sum -c openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz.sha256sum` 校验。 -- `start_vm.sh`: 官方虚拟机启动脚本 - -### [可选] 配置 copy-on-write(COW)磁盘 - -> 写时复制(copy-on-write,缩写COW)技术不会对原始的映像文件做更改,变化的部分写在另外的映像文件中,这种特性在 QEMU 中只有 QCOW 格式支持,多个磁盘映像可以指向同一映像同时测试多个配置, 而不会破坏原映像。 - -#### 创建新映像 - -使用如下的命令创建新的映像,并在下方启动虚拟机时使用新映像。假设原映像为 `openEuler-23.09-RISC-V-qemu-riscv64.qcow2`,新映像为 `test.qcow2`。 - -``` bash -qemu-img create -o backing_file=openEuler-23.09-RISC-V-qemu-riscv64.qcow2,backing_fmt=qcow2 -f qcow2 test.qcow2 -``` - -#### 查看映像信息 - -``` bash -qemu-img info --backing-chain test.qcow2 -``` - -#### 修改基础映像位置 - -使用如下的命令修改基础映像位置。假设新的基础映像为 `another.qcow2`,欲修改映像为 `test.qcow2`。 - -``` bash -qemu-img rebase -b another.qcow2 test.qcow2 -``` - -#### 合并映像 - -将修改后的镜像合并到原来的镜像。假设新映像为 `test.qcow2`。 - -``` bash -qemu-img commit test.qcow2 -``` - -#### 扩容根分区 - -为了扩大根分区以获得更大的可使用空间,按照如下操作进行。 - -扩大磁盘镜像。 - -``` bash -qemu-img resize test.qcow2 +100G -``` 
- -输出 - -```text -Image resized. -``` - -启动虚拟机,使用下列指令检查磁盘大小。 - -``` bash -lsblk -``` - -列出分区情况。 - -``` bash -fdisk -l -``` - -修改根分区。 - -``` bash -fdisk /dev/vda -Welcome to fdisk (util-linux 2.35.2). -Changes will remain in memory only, until you decide to write them. -Be careful before using the write command. - -Command (m for help): p # 输出分区情况 -Disk /dev/vda: 70 GiB, 75161927680 bytes, 146800640 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: dos -Disk identifier: 0x247032e6 - -Device Boot Start End Sectors Size Id Type -/dev/vda1 2048 4194303 4192256 2G e W95 FAT16 (LBA) -/dev/vda2 4194304 83886079 79691776 38G 83 Linux - -Command (m for help): d # 删除原有分区 -Partition number (1,2, default 2): 2 - -Partition 2 has been deleted. - -Command (m for help): n # 新建分区 -Partition type - p primary (1 primary, 0 extended, 3 free) - e extended (container for logical partitions) -Select (default p): p # 选择主分区 -Partition number (2-4, default 2): 2 -First sector (4194304-146800639, default 4194304): # 此处和上文的 /dev/vda2 的起始块应当一致 -Last sector, +/-sectors or +/-size{K,M,G,T,P} (4194304-146800639, default 146800639): #保持默认直接分配到最尾端 - -Created a new partition 2 of type 'Linux' and of size 68 GiB. -Partition #2 contains a ext4 signature.Do you want to remove the signature? [Y]es/[N]o: n - -Command (m for help): p #再次检查 - -Disk /dev/vda: 70 GiB, 75161927680 bytes, 146800640 sectors -Units: sectors of 1 * 512 = 512 bytes -Sector size (logical/physical): 512 bytes / 512 bytes -I/O size (minimum/optimal): 512 bytes / 512 bytes -Disklabel type: dos -Disk identifier: 0x247032e6 - -Device Boot Start End Sectors Size Id Type -/dev/vda1 2048 4194303 4192256 2G e W95 FAT16 (LBA) -/dev/vda2 4194304 146800639 142606336 68G 83 Linux - -Command (m for help): w # 写入到磁盘 -The partition table has been altered. -Syncing disks. 
-``` - -更新磁盘信息。 - -``` bash -resize2fs /dev/vda2 -``` - -## 启动 openEuler RISC-V 虚拟机 - -### 启动虚拟机 - -- 确认当前目录内包含 `fw_payload_oe_uboot_2304.bin`,磁盘映像压缩包,以及启动脚本。 -- 解压映像压缩包 `xz -dk openEuler-23.09-RISC-V-qemu-riscv64.qcow2.xz` -- 调整启动参数 -- 执行启动脚本 `$ bash start_vm.sh` - -### [可选] 启动参数调整 - -- `vcpu` 为 QEMU 运行线程数,与 CPU 核数没有严格对应。当设定的 `vcpu` 值大于宿主机核心值时,可能导致运行阻塞和速度严重降低。默认为 `4`。 -- `memory` 为虚拟机内存大小,可随需要调整。默认为 `2`。 -- `drive` 为虚拟磁盘路径,如果在上文中配置了 COW 映像,此处填写创建的新映像。 -- `fw` 为 U-Boot 镜像路径。 -- `ssh_port` 为转发的 SSH 端口,默认为 `12055`。设定为空以关闭该功能。 - -## 登录虚拟机 - -脚本提供了 SSH 登录支持。 - -如果这是暴露在外网的虚拟机,请在登录成功之后立即修改 root 用户密码。 - -### SSH 登录 - -Secure Shell(安全外壳协议,简称 SSH)是一种加密的网络传输协议,可在不安全的网络中为网络服务提供安全的传输环境。SSH 通过在网络中创建安全隧道来实现SSH客户端与服务器之间的连接。SSH 最常见的用途是远程登录系统,人们通常利用SSH来传输命令行界面和远程执行命令。SSH 使用频率最高的场合是类 Unix 系统,但是 Windows 操作系统也能有限度地使用 SSH。2015 年,微软宣布将在未来的操作系统中提供原生SSH协议支持,Windows 10 1803 及更新版本中已提供 OpenSSH 客户端。 - -- 用户名: `root` 或 `openeuler` -- 默认密码: `openEuler12#$` -- 登录方式: 参见脚本提示 (或使用您偏好的 ssh 客户端) - -登录成功之后,可以看到如下的信息: - -``` bash -Authorized users only. All activities may be monitored and reported. - -Authorized users only. All activities may be monitored and reported. 
-Last login: Sun Oct 15 17:19:52 2023 from 10.0.2.2 - -Welcome to 6.4.0-10.1.0.20.oe2309.riscv64 - -System information as of time: Sun Oct 15 19:40:07 CST 2023 - -System load: 0.47 -Processes: 161 -Memory used: .7% -Swap used: 0.0% -Usage On: 11% -IP address: 10.0.2.15 -Users online: 1 - -[root@openeuler ~]# -``` - -### VNC 登录 - -这是一个类似于远程操作真机的方式,但是没有声音,受 QEMU 原生支持。 - -> VNC(Virtual Network Computing),为一种使用 RFB 协议的屏幕画面分享及远程操作软件。此软件借由网络,可发送键盘与鼠标的动作及即时的屏幕画面。 -> -> VNC 与操作系统无关,因此可跨平台使用,例如可用 Windows 连接到某 Linux 的电脑,反之亦同。甚至在没有安装客户端程序的电脑中,只要有支持 Java 的浏览器,也可使用。 - -#### 安装 VNC Viewer - -前往 [此处](https://sourceforge.net/projects/tigervnc/files/stable/) 下载 TigerVNC,或前往 [此处](https://www.realvnc.com/en/connect/download/viewer/) 下载 VNC Viewer。 - -#### 修改启动脚本 - -在启动脚本 `sleep 2` 行之前中添加如下内容: - -``` bash -vnc_port=12056 -echo -e "\033[37mVNC Port: \033[0m \033[34m"$vnc_port"\033[0m" -cmd="${cmd} -vnc :"$((vnc_port-5900)) -``` - -#### 连接到 VNC - -启动 TigerVNC 或 VNC Viewer,粘贴地址按下回车即可。操作界面和真机类似。 - -## 修改默认软件源配置 - -openEuler 23.09 RISC-V 版本的软件源目前仅包含 [OS] 和 [source] 仓库,而默认配置文件中包含了其他 RISC-V 版本并未提供的仓库。 - -用户在使用包管理器安装软件包之前,需要手动编辑软件源配置,移除 [OS] 和 [source] 两节之外的内容。 - -SSH 或 VNC 连接至虚拟机,使用 root 用户登录(若使用非特权用户登录,需要使用 sudo),在终端中进行如下操作: - -### 修改 /etc/yum.repos.d/openEuler.repo - -``` bash -vi /etc/yum.repos.d/openEuler.repo -# 或者 nano /etc/yum.repos.d/openEuler.repo -``` - -删除 [everything], [EPOL], [debuginfo], [update], [update-source] 小节,仅保留 [OS] 和 [source] 两部分。 - -修改完成后,您的 openEuler.repo 配置应该与下述基本一致: - -``` text -#generic-repos is licensed under the Mulan PSL v2. -#You can use this software according to the terms and conditions of the Mulan PSL v2. -#You may obtain a copy of Mulan PSL v2 at: -# http://license.coscl.org.cn/MulanPSL2 -#THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR -#IMPLIED, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY OR FIT FOR A PARTICULAR -#PURPOSE. -#See the Mulan PSL v2 for more details. 
- -[OS] -name=OS -baseurl=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/ -metalink=https://mirrors.openeuler.org/metalink?repo=$releasever/OS&arch=$basearch -metadata_expire=1h -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler - -[source] -name=source -baseurl=http://repo.openeuler.org/openEuler-23.09/source/ -metalink=https://mirrors.openeuler.org/metalink?repo=$releasever&arch=source -metadata_expire=1h -enabled=1 -gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-23.09/source/RPM-GPG-KEY-openEuler -``` - -接下来就可以正常使用 `dnf` 包管理器进行软件包的安装了。初次安装的时候需要导入 openEuler 的 GPG key,若出现如下提示时,需输入 y 并确认,还请注意。 - -``` text -retrieving repo key for OS unencrypted from http://repo.openeuler.org/openEuler-23.09/OS/riscv64/RPM-GPG-KEY-openEuler -OS 18 kB/s | 2.1 kB 00:00 -Importing GPG key 0xB25E7F66: - Userid : "private OBS (key without passphrase) " - Fingerprint: 12EA 74AC 9DF4 8D46 C69C A0BE D557 065E B25E 7F66 - From : http://repo.openeuler.org/openEuler-23.09/OS/riscv64/RPM-GPG-KEY-openEuler -Is this ok [y/N]: y -Key imported successfully -``` diff --git "a/docs/zh/docs/Installation/\344\275\277\347\224\250kickstart\350\207\252\345\212\250\345\214\226\345\256\211\350\243\205.md" "b/docs/zh/docs/Installation/\344\275\277\347\224\250kickstart\350\207\252\345\212\250\345\214\226\345\256\211\350\243\205.md" index 67610368588cbe4cfe6023428193303ffc4ff444..42c6d02991b5415d32c6a10dd6c84c54c8d3a35a 100644 --- "a/docs/zh/docs/Installation/\344\275\277\347\224\250kickstart\350\207\252\345\212\250\345\214\226\345\256\211\350\243\205.md" +++ "b/docs/zh/docs/Installation/\344\275\277\347\224\250kickstart\350\207\252\345\212\250\345\214\226\345\256\211\350\243\205.md" @@ -80,7 +80,7 @@ TFTP(Trivial File Transfer Protocol,简单文件传输协议),该协议 - 物理机/虚拟机(虚拟机创建可参考对应厂商的资料)。包括使用kickstart工具进行自动化安装的计算机和被安装的计算机。 - httpd:存放kickstart文件。 -- ISO: openEuler-21.09-aarch64-dvd.iso +- ISO: openEuler-{version}-aarch64-dvd.iso ### 操作步骤 @@ -159,8 
+159,8 @@ TFTP(Trivial File Transfer Protocol,简单文件传输协议),该协议 >![](./public_sys-resources/icon-note.gif) **说明:** >密码密文生成方式: -> - >``` + > + >``` ># python3 >Python 3.7.0 (default, Apr 1 2019, 00:00:00) >[GCC 7.3.0] on linux @@ -178,7 +178,7 @@ TFTP(Trivial File Transfer Protocol,简单文件传输协议),该协议 **安装系统** 1. 启动系统进入安装选择界面。 - 1. 在“[启动安装](./安装指导.html#启动安装)”中的“安装引导界面”中选择“Install openEuler 21.09”,并按下“e”键。 + 1. 在“[启动安装](./安装指导.html#启动安装)”中的“安装引导界面”中选择“Install openEuler {version}”,并按下“e”键。 2. 启动参数中追加“inst.ks= ip/ks/openEuler-ks.cfg”。 ![](./figures/startparam.png) @@ -201,7 +201,7 @@ TFTP(Trivial File Transfer Protocol,简单文件传输协议),该协议 - httpd:存放kickstart文件。 - tftp:提供vmlinuz和initrd文件。 - dhcpd/pxe:提供DHCP服务。 -- ISO:openEuler-21.09-aarch64-dvd.iso。 +- ISO:openEuler-{version}-aarch64-dvd.iso。 ### 操作步骤 @@ -252,7 +252,7 @@ TFTP(Trivial File Transfer Protocol,简单文件传输协议),该协议 3. 安装源的制作。 ``` - # mount openEuler-21.09-aarch64-dvd.iso /mnt + # mount openEuler-{version}-aarch64-dvd.iso /mnt # cp -r /mnt/* /var/www/html/openEuler/ ``` @@ -317,7 +317,7 @@ TFTP(Trivial File Transfer Protocol,简单文件传输协议),该协议 ### BEGIN /etc/grub.d/10_linux ### - menuentry 'Install openEuler 21.03 ' --class red --class gnu-linux --class gnu --class os { + menuentry 'Install openEuler {version} ' --class red --class gnu-linux --class gnu --class os { set root=(tftp,192.168.1.1) linux /vmlinuz ro inst.geoloc=0 console=ttyAMA0 console=tty0 rd.iscsi.waitnet=0 inst.ks=http://192.168.122.1/ks/openEuler-ks.cfg initrd /initrd.img diff --git "a/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207-1.md" "b/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207-1.md" index 53c60be5e4b03895a8e2cfab86a0d506ed317c19..df48d8ce5921bdcb3cb0eaa59073bf692ff7fdcb 100644 --- "a/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207-1.md" +++ "b/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207-1.md" @@ -7,10 +7,10 @@ 在安装开始前,您需要获取 openEuler 发布的树莓派镜像及其校验文件。 
1. 登录[openEuler Repo](https://repo.openeuler.org/)网站。 -2. 在版本列表单击“openEuler 22.03 LTS SP2”,进入openEuler 22.03 LTS SP2下载列表。 +2. 在版本列表单击“openEuler 24.03 LTS”,进入openEuler 24.03 LTS下载列表。 3. 单击“raspi_img”,进入树莓派镜像的下载列表。 -4. 单击“openEuler-22.03-LTS-SP2-raspi-aarch64.img.xz”,将 openEuler 发布的树莓派镜像下载到本地。 -5. 单击“openEuler-22.03-LTS-SP2-raspi-aarch64.img.xz.sha256sum”,将 openEuler 发布的树莓派镜像的校验文件下载到本地。 +4. 单击“openEuler-24.03-LTS-raspi-aarch64.img.xz”,将 openEuler 发布的树莓派镜像下载到本地。 +5. 单击“openEuler-24.03-LTS-raspi-aarch64.img.xz.sha256sum”,将 openEuler 发布的树莓派镜像的校验文件下载到本地。 ## 镜像完整性校验 @@ -24,9 +24,9 @@ 在校验镜像文件的完整性之前,需要准备如下文件: -镜像文件:openEuler-22.03-LTS-SP2-raspi-aarch64.img.xz +镜像文件:openEuler-24.03-LTS-raspi-aarch64.img.xz -校验文件:openEuler-22.03-LTS-SP2-raspi-aarch64.img.xz.sha256sum +校验文件:openEuler-24.03-LTS-raspi-aarch64.img.xz.sha256sum ### 操作指导 @@ -35,13 +35,13 @@ 1. 获取校验文件中的校验值。执行命令如下: ```shell - cat openEuler-22.03-LTS-SP2-raspi-aarch64.img.xz.sha256sum + cat openEuler-24.03-LTS-raspi-aarch64.img.xz.sha256sum ``` 2. 计算文件的 sha256 校验值。执行命令如下: ```shell - sha256sum openEuler-22.03-LTS-SP2-raspi-aarch64.img.xz + sha256sum openEuler-24.03-LTS-raspi-aarch64.img.xz ``` 命令执行完成后,输出校验值。 diff --git "a/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207.md" "b/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207.md" index e9ab494bba8c776910b2b3abb6e220fb469c9786..ac6d9fc455c1b6bf99399bb2362a0bb959e6853e 100644 --- "a/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207.md" +++ "b/docs/zh/docs/Installation/\345\256\211\350\243\205\345\207\206\345\244\207.md" @@ -11,16 +11,16 @@ 1. 登录[openEuler社区](https://openeuler.org)网站。 2. 单击“下载”。 3. 单击“社区发行版”,显示版本列表。 -4. 在版本列表的“openEuler 22.03 LTS SP2”版本处单击“前往下载”按钮,进入openEuler 22.03_LTS_SP2版本下载列表。 +4. 在版本列表的“openEuler 24.03 LTS SP1”版本处单击“前往下载”按钮,进入openEuler 24.03_LTS_SP1版本下载列表。 5. 根据实际待安装环境的架构和场景选择需要下载的 openEuler 的发布包和校验文件。 1. 若为AArch64架构。 1. 单击“AArch64”。 - 2. 
若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-22.03-LTS-SP2-aarch64-dvd.iso”下载到本地。 - 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-22.03-LTS-SP2-netinst-aarch64-dvd.iso”下载到本地。 + 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-24.03-LTS-SP1-aarch64-dvd.iso”下载到本地。 + 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-24.03-LTS-SP1-netinst-aarch64-dvd.iso”下载到本地。 2. 若为x86_64架构。 1. 单击“x86_64”。 - 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-22.03-LTS-SP2-x86_64-dvd.iso”下载到本地。 - 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-22.03-LTS-SP2-netinst-x86_64-dvd.iso ”下载到本地。 + 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-24.03-LTS-SP1-x86_64-dvd.iso”下载到本地。 + 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-24.03-LTS-SP1-netinst-x86_64-dvd.iso ”下载到本地。 >![](./public_sys-resources/icon-note.gif) **说明:** > @@ -42,9 +42,9 @@ 在校验发布包完整性之前,需要准备如下文件: -iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso +- iso文件:openEuler-24.03-LTS-SP1-aarch64-dvd.iso。 -校验文件:ISO对应完整性校验值,复制保存对应的ISO值 +- 校验文件:ISO对应完整性校验值,复制保存对应的ISO值。 ### 操作指导 @@ -53,14 +53,14 @@ iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso 1. 计算文件的sha256校验值。执行命令如下: ``` - sha256sum openEuler-22.03-LTS-SP2-aarch64-dvd.iso + sha256sum openEuler-24.03-LTS-SP1-aarch64-dvd.iso ``` 命令执行完成后,输出校验值。 2. 
对比步骤1计算的校验值与刚刚复制的SHA256值是否一致。 - 如果校验值一致说明iso文件破坏,如果校验值不一致则可以确认文件完整性已被破坏,需要重新获取。 + 如果校验值一致说明iso文件完整,如果校验值不一致则可以确认文件完整性已被破坏,需要重新获取。 ## 物理机的安装要求 @@ -68,18 +68,7 @@ iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso ### 硬件兼容支持 -openEuler安装时,应注意硬件兼容性方面的问题,当前已支持的服务器类型如[表1](#table14948632047)所示。 - ->![](./public_sys-resources/icon-note.gif) **说明:** -> ->- TaiShan 200服务器基于华为鲲鹏920处理器。 ->- 当前仅支持华为TaiShan服务器和FusionServer Pro 机架服务器,后续将逐步增加对其他厂商服务器的支持。 - -**表 1** 支持的服务器类型 -| 服务器形态 | 服务器名称 | 服务器型号 | -| :---- | :---- | :---- | -| 机架服务器 | TaiShan 200 | 2280均衡型 | -| 机架服务器 | FusionServer Pro 机架服务器 | FusionServer Pro 2288H V5
说明:
服务器要求配置Avago 3508 RAID控制卡和启用LOM-X722网卡| +openEuler安装时,应注意硬件兼容性方面的问题,当前已支持的服务器类型请参考[兼容性列表](https://www.openeuler.org/zh/compatibility/)。 ### 最小硬件要求 diff --git a/docs/zh/docs/Installation/riscv.md "b/docs/zh/docs/Installation/\345\256\211\350\243\205\345\234\250RISC-V.md" similarity index 87% rename from docs/zh/docs/Installation/riscv.md rename to "docs/zh/docs/Installation/\345\256\211\350\243\205\345\234\250RISC-V.md" index 11128008cc0c31fa395bba64e1a36e21020ce76e..ab70b907eb756ac95f310f6a52c50830238df7ef 100644 --- a/docs/zh/docs/Installation/riscv.md +++ "b/docs/zh/docs/Installation/\345\256\211\350\243\205\345\234\250RISC-V.md" @@ -1,3 +1,3 @@ -# RISC-V安装指南 +# 安装在 RISC-V 本文是介绍 openEuler 操作系统在 RISC-V 架构的安装方法,使用本手册的用户需要具备基础的 Linux 系统管理知识。 diff --git "a/docs/zh/docs/Installation/\345\256\211\350\243\205\346\214\207\345\257\274.md" "b/docs/zh/docs/Installation/\345\256\211\350\243\205\346\214\207\345\257\274.md" index 2caae5a352479380c1525be1386a7d01a49f76bf..97744568da7c410e2236c64759db4db3687bfa1c 100644 --- "a/docs/zh/docs/Installation/\345\256\211\350\243\205\346\214\207\345\257\274.md" +++ "b/docs/zh/docs/Installation/\345\256\211\350\243\205\346\214\207\345\257\274.md" @@ -30,11 +30,11 @@ ### 安装引导界面 -系统使用引导介质完成引导后会显示引导菜单。该引导菜单除启动安装程序外还提供一些选项。安装系统时,默认采用“Test this media & install openEuler 21.09”方式进行安装。如果要选择默认选项之外的选项,请使用键盘中的“↑”和“↓”方向键进行选择,并在选项为高亮状态时按“Enter”。 +系统使用引导介质完成引导后会显示引导菜单。该引导菜单除启动安装程序外还提供一些选项。安装系统时,默认采用“Test this media & install openEuler {version}”方式进行安装。如果要选择默认选项之外的选项,请使用键盘中的“↑”和“↓”方向键进行选择,并在选项为高亮状态时按“Enter”。 >![](./public_sys-resources/icon-note.gif) **说明:** > ->- 如果60秒内未按任何键,系统将从默认选项“Test this media & install openEuler 21.09”自动进入安装界面。 +>- 如果60秒内未按任何键,系统将从默认选项“Test this media & install openEuler {version}”自动进入安装界面。 >- 安装物理机时,如果使用键盘上下键无法选择启动选项,按“Enter”键无响应,可以单击BMC界面上的鼠标控制图标“![](./figures/zh-cn_image_0229420473.png)”,设置“键鼠复位”。 **图 4** 安装引导界面 @@ -42,19 +42,19 @@ 安装引导选项说明如下: -- Install openEuler 21.09 —— 在您的服务器上使用图形用户界面模式安装。 +- Install openEuler 
{version} —— 在您的服务器上使用图形用户界面模式安装。 -- Test this media & install openEuler 21.09 —— 默认选项,在您的服务器上使用图形用户界面模式安装,但在启动安装程序前会进行安装介质的完整性检查。 +- Test this media & install openEuler {version} —— 默认选项,在您的服务器上使用图形用户界面模式安装,但在启动安装程序前会进行安装介质的完整性检查。 - Troubleshooting —— 问题定位模式,系统无法正常安装时使用。进入问题定位模式后,有如下两个选项。 - - Install openEuler 21.09 in basic graphics mode —— 简单图形安装模式,该模式下在系统启动并运行之前不启动视频驱动程序。 + - Install openEuler {version} in basic graphics mode —— 简单图形安装模式,该模式下在系统启动并运行之前不启动视频驱动程序。 - Rescue the openEuler system —— 救援模式,用于修复系统。该模式下输出定向到VNC或BMC(Baseboard Management Controller)端,串口不可用。 在安装引导界面,按“e”进入已选选项的参数编辑界面,按“c”进入命令行模式。 ### 图形化模式安装 -在“安装引导界面”中选择“Test this media & install openEuler 21.09”进入图形化模式安装。 +在“安装引导界面”中选择“Test this media & install openEuler {version}”进入图形化模式安装。 可以通过键盘操作图形化安装程序。 @@ -141,29 +141,27 @@ - http 或 https 方式 - http 或 https 方式的安装源如下图所示。 + http 或 https 方式的安装源如下图所示。输入框内容以实际版本发布的安装源地址为准,如 ,其中 version 为版本号,$basearch 为CPU 架构,可根据实际情况输入。 ![](./figures/installsource.png) 如果https服务器使用的是私有证书,则需要在安装引导界面按“e”进入已选选项的参数编辑界面,在参数中增加 inst.noverifyssl 参数。 - 输入框内容以实际版本发布的安装源地址为准,如 ,其中openEuler-21.03 为版本号,x86_64 为CPU 架构,可根据实际情况输入。 - - ftp 方式 - ftp 方式的安装源如下图所示,输入框内容根据的 ftp 地址输入。 + ftp 方式的安装源如下图所示,输入框内容根据的 ftp 地址输入。 ![](./figures/sourceftp.png) - ftp服务器需要用户自己搭建,将openEuler-21.09-x86_64-dvd.iso镜像进行挂载,挂载出的文件拷贝到ftp的共享目录中。其中x86_64为CPU 架构,可根据实际情况使用镜像。 + ftp服务器需要用户自己搭建,将iso镜像进行挂载,挂载出的文件拷贝到ftp的共享目录中。其中x86_64为CPU 架构,可根据实际情况使用镜像。 - nfs 方式 - nfs 方式的安装源如下图所示,输入框内容根据的 nfs 地址输入。 + nfs 方式的安装源如下图所示,输入框内容根据的 nfs 地址输入。 ![](./figures/sourcenfs.png) - nfs服务器需要用户自己搭建,将openEuler-21.09-x86_64-dvd.iso镜像进行挂载,挂载出的文件拷贝到nfs的共享目录中。其中x86_64为CPU 架构,可根据实际情况使用镜像。 + nfs服务器需要用户自己搭建,将iso镜像进行挂载,挂载出的文件拷贝到nfs的共享目录中。其中x86_64为CPU 架构,可根据实际情况使用镜像。 安装过程中,如果“设置安装源”有疑问,可参考“[选择安装源出现异常](./FAQ.html#选择安装源出现异常)”。 @@ -201,7 +199,7 @@ >![](./public_sys-resources/icon-note.gif) **说明:** > >- 在进行分区时,出于系统性能和安全的考虑,建议您划分如下单独分区:/boot、/var、/var/log 、/var/log/audit、/home、/tmp。 分区建议如[表1](#table1)所示。 ->- 系统如果配置了swap分区,当系统的物理内存不够用时,会使用swap分区。虽然 
swap分区可以增大物理内存大小的限制,但是如果由于内存不足使用到swap分区,会增加系统的响应时间,性能变差。因此在物理内存充足或者性能敏感的系统中,不建议配置swap分区。 +>- 系统如果配置了swap分区,当系统的物理内存不够用时,会使用swap分区。虽然 swap分区可以增大物理内存大小的限制,但是如果由于内存不足使用到swap分区,会增加系统的响应时间,性能变差。因此在物理内存充足或者性能敏感的系统中,不建议配置swap分区。 >- 如果需要拆分逻辑卷组则需要选择“自定义”进行手动分区,并在“手动分区”界面单击“卷组”区域中的“修改”按钮重新配置卷组。 **表1** 磁盘分区建议 diff --git "a/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215-1.md" "b/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215-1.md" index d293cbb7371e88be92f59565940cbb556ec338f5..4ca949b493f418ce47140204dcc2a93e0a87cd7a 100644 --- "a/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215-1.md" +++ "b/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215-1.md" @@ -2,7 +2,7 @@ >![](./public_sys-resources/icon-notice.gif) **须知:** > ->- 硬件仅支持树莓派 3B/3B+/4B。 +>- 硬件仅支持树莓派 3B/3B+/4B/400。 >- 采用刷写镜像到 SD 卡方式安装。本章节提供 Windows/Linux/Mac 上刷写镜像的操作方法。 >- 本章节使用的镜像是参考“[安装准备](./安装准备-1.html)”获取 openEuler 的树莓派版本镜像。 @@ -45,9 +45,9 @@ ### 写入 SD 卡 >![](./public_sys-resources/icon-notice.gif) **须知:** ->如果获取的是压缩后的镜像文件“openEuler-21.09-raspi-aarch64.img.xz”,需要先将压缩文件解压得到 “openEuler-21.09-raspi-aarch64.img”镜像文件。 +>如果获取的是压缩后的镜像文件“openEuler-{version}-raspi-aarch64.img.xz”,需要先将压缩文件解压得到 “openEuler-{version}-raspi-aarch64.img”镜像文件。 -请按照以下步骤将“openEuler-21.09-raspi-aarch64.img”镜像文件写入 SD 卡: +请按照以下步骤将“openEuler-{version}-raspi-aarch64.img”镜像文件写入 SD 卡: 1. 下载并安装刷写镜像的工具,以下操作以 Win32 Disk Imager 工具为例。 2. 右键选择“以管理员身份运行”,打开 Win32 Disk Imager。 @@ -75,10 +75,10 @@ ### 写入 SD 卡 -1. 如果获取的是压缩后的镜像,需要先执行 `xz -d openEuler-21.09-raspi-aarch64.img.xz` 命令将压缩文件解压得到“openEuler-21.09-raspi-aarch64.img”镜像文件;否则,跳过该步骤。 -2. 将镜像 `openEuler-21.09-raspi-aarch64.img` 刷写入 SD 卡,在 root 权限下执行以下命令: +1. 如果获取的是压缩后的镜像,需要先执行 `xz -d openEuler-{version}-raspi-aarch64.img.xz` 命令将压缩文件解压得到“openEuler-{version}-raspi-aarch64.img”镜像文件;否则,跳过该步骤。 +2. 
将镜像 `openEuler-{version}-raspi-aarch64.img` 刷写入 SD 卡,在 root 权限下执行以下命令: - `dd bs=4M if=openEuler-21.09-raspi-aarch64.img of=/dev/sdb` + `dd bs=4M if=openEuler-{version}-raspi-aarch64.img of=/dev/sdb` >![](./public_sys-resources/icon-note.gif) **说明:** >一般情况下,将块大小设置为 4M。如果写入失败或者写入的镜像无法使用,可以尝试将块大小设置为 1M 重新写入,但是设置为 1M 比较耗时。 @@ -102,10 +102,10 @@ ### 写入 SD 卡 -1. 如果获取的是压缩后的镜像,需要先执行 `xz -d openEuler-21.09-raspi-aarch64.img.xz` 命令将压缩文件解压得到“openEuler-21.09-raspi-aarch64.img”镜像文件;否则,跳过该步骤。 -2. 将镜像 `openEuler-21.09-raspi-aarch64.img` 刷入 SD 卡,在 root 权限下执行以下命令: +1. 如果获取的是压缩后的镜像,需要先执行 `xz -d openEuler-{version}-raspi-aarch64.img.xz` 命令将压缩文件解压得到“openEuler-{version}-raspi-aarch64.img”镜像文件;否则,跳过该步骤。 +2. 将镜像 `openEuler-{version}-raspi-aarch64.img` 刷入 SD 卡,在 root 权限下执行以下命令: - `dd bs=4m if=openEuler-21.09-raspi-aarch64.img of=/dev/disk3` + `dd bs=4m if=openEuler-{version}-raspi-aarch64.img of=/dev/disk3` >![](./public_sys-resources/icon-note.gif) **说明:** >一般情况下,将块大小设置为 4m。如果写入失败或者写入的镜像无法使用,可以尝试将块大小设置为 1m 重新写入,但是设置为 1m 比较耗时。 diff --git "a/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215.md" "b/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215.md" index 43ecdeca953393e88a1cb887e9793264fae2fe84..f389d5649b2a9aa512e7fa12624888537cc80d25 100644 --- "a/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215.md" +++ "b/docs/zh/docs/Installation/\345\256\211\350\243\205\346\226\271\345\274\217\344\273\213\347\273\215.md" @@ -2,7 +2,7 @@ >![](./public_sys-resources/icon-notice.gif) **须知:** > ->- 硬件服务器仅支持Taishan 200服务器和FusionServer Pro 机架服务器,具体支持的服务器型号可参考“[硬件兼容支持](./安装准备.html#硬件兼容支持)”;虚拟化平台仅支持openEuler自有的虚拟化组件(HostOS为openEuler,虚拟化组件为发布包中的qemu、KVM)创建的虚拟化平台和华为公有云的x86虚拟化平台。 +>- 支持的服务器型号可参考“[硬件兼容支持](./安装准备.html#硬件兼容支持)”;虚拟化平台仅支持openEuler自有的虚拟化组件(HostOS为openEuler,虚拟化组件为发布包中的qemu、KVM)创建的虚拟化平台和华为公有云的x86虚拟化平台。 >- 
安装方式当前仅支持光盘、USB盘安装、网络安装、qcow2镜像安装和私有镜像安装。其中仅华为公有云的x86虚拟化平台支持私有镜像安装。 @@ -98,10 +98,10 @@ 使用您下载的ISO镜像文件的完整路径替换 /path/to/image.iso,使用之前由 dmesg 命令给出的设备名称替换device,同时设置合理的块大小(例如:512k)替换 blocksize,这样可以加快写入进度。 -例如:如果该ISO镜像文件位于 /home/testuser/Downloads/openEuler-21.09-aarch64-dvd.iso,同时探测到的设备名称为sdb,则该命令如下: +例如:如果该ISO镜像文件位于 /home/testuser/Downloads/openEuler-{version}-aarch64-dvd.iso,同时探测到的设备名称为sdb,则该命令如下: ``` - # dd if=/home/testuser/Downloads/openEuler-21.09-aarch64-dvd.iso of=/dev/sdb bs=512k + # dd if=/home/testuser/Downloads/openEuler-{version}-aarch64-dvd.iso of=/dev/sdb bs=512k ``` >![](./public_sys-resources/icon-note.gif) **说明:** >如isolinux描述,由mkisofs命令创建的ISO 9660 文件系统会通过BIOS固件启动,但只能从CD、DVD和BD等介质启动。所以在使用dd命令制作x86的启动U盘前需要使用 isohybrid -u your.iso 对iso进行处理,然后正常使用dd命令将iso写入u盘即可。(该问题仅影响X86)。 @@ -126,7 +126,7 @@ 3. 在计算机中插入USB盘。 4. 重启计算机系统。 -在短暂的延迟后会出现图形化引导页面,该页面包含不同引导选项。如果您在一分钟内未进行任何操作,安装程序将自动开始安装。 +在短暂的延迟后会出现图形化引导界面,该界面包含不同引导选项。如果您在一分钟内未进行任何操作,安装程序将自动以默认选项开始运行。 ## 使用PXE通过网络安装 diff --git a/docs/zh/docs/Installation/FAQ-1.md "b/docs/zh/docs/Installation/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225-1.md" similarity index 95% rename from docs/zh/docs/Installation/FAQ-1.md rename to "docs/zh/docs/Installation/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225-1.md" index ec8c058b9f698d436e14ac96e5a7a93d68da76f0..b3acc810f1c811969fb8eaf185041f4d9f6ea96d 100644 --- a/docs/zh/docs/Installation/FAQ-1.md +++ "b/docs/zh/docs/Installation/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225-1.md" @@ -1,6 +1,6 @@ -# FAQ +# 常见问题与解决方法 -## 树莓派启动失败 +## 问题1:树莓派启动失败 ### 问题现象 @@ -17,7 +17,7 @@ 将完整的镜像重新刷写入 SD 卡。 -## nmcli 命令连接 WIFI 失败 +## 问题2:nmcli 命令连接 WIFI 失败 ### 问题现象 @@ -46,7 +46,7 @@ 7. 
查看添加的 WIFI 连接是否已激活(已激活的连接名称前有 `*` 标记)。如果未激活,选择该 WIFI 连接,然后按下键盘右方向键选择 `Activate`,按 `Enter` 激活该连接。待激活完成后,选择 `Back`,按 `Enter` 退出该激活界面,回退到最初的 nmtui 字符界面。 8. 选择 `Quit`,然后按下键盘右方向键选择 `OK`,按 `Enter` 退出 nmtui 字符界面。 -## tensorflow包及相关包安装失败 +## 问题3:tensorflow包及相关包安装失败 ### 问题现象 diff --git a/docs/zh/docs/Installation/FAQ.md "b/docs/zh/docs/Installation/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" similarity index 94% rename from docs/zh/docs/Installation/FAQ.md rename to "docs/zh/docs/Installation/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" index d39fc49f12f8fedbadf92d3fc8db458e0c2b4cbe..10b7682e0f603718b97b6f9cdc8c9230ee13aaba 100644 --- a/docs/zh/docs/Installation/FAQ.md +++ "b/docs/zh/docs/Installation/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -1,6 +1,6 @@ -# FAQ +# 常见问题与解决方法 -## 安装openEuler时选择第二盘位为安装目标,操作系统无法启动 +## 问题1:安装openEuler时选择第二盘位为安装目标,操作系统无法启动 ### 问题现象 @@ -22,7 +22,7 @@ - 当系统处于安装过程中,在选择磁盘(选择第一块或者两块都选择)后,指定引导程序安装到第一块盘sda中。 - 当系统已经安装完成,若BIOS支持选择从哪个磁盘启动,则可以通过修改BIOS中磁盘启动顺序,尝试重新启动系统。 -## openEuler开机后进入emergency模式 +## 问题2:openEuler开机后进入emergency模式 ### 问题现象 @@ -59,7 +59,7 @@ UUID=afcc811f-4b20-42fc-9d31-7307a8cfe0df /boot ext4 defaults,x-systemd.device-t /dev/mapper/openEuler-swap swap swap defaults 0 0 ``` -## 系统中存在无法激活的逻辑卷组时,重装系统失败 +## 问题3:系统中存在无法激活的逻辑卷组时,重装系统失败 ### 问题现象 @@ -105,7 +105,7 @@ UUID=afcc811f-4b20-42fc-9d31-7307a8cfe0df /boot ext4 defaults,x-systemd.device-t # vgremove -y testvg32947 ``` -## 选择安装源出现异常 +## 问题4:选择安装源出现异常 ### 问题现象 @@ -119,7 +119,7 @@ UUID=afcc811f-4b20-42fc-9d31-7307a8cfe0df /boot ext4 defaults,x-systemd.device-t 检查安装源是否存在异常。使用新的安装源。 -## 如何手动开启kdump服务 +## 问题5:如何手动开启kdump服务 ### 问题现象 @@ -204,7 +204,7 @@ kdump内核预留内存参数说明如下: -## 多块磁盘组成逻辑卷安装系统后,再次安装不能只选其中一块磁盘 +## 问题6:多块磁盘组成逻辑卷安装系统后,再次安装不能只选其中一块磁盘 ### 问题现象 @@ -244,7 +244,7 @@ kdump内核预留内存参数说明如下: 
> ![](./public_sys-resources/icon-note.gif) **说明:** > 图形模式下也可以按“Ctrl+Alt+F6”回到图形界面,点击[图1](#fig115949762617)右下角的“Refresh”刷新存储配置生效。 -## x86物理机UEFI模式由于security boot安全选项问题无法安装 +## 问题7:x86物理机UEFI模式由于security boot安全选项问题无法安装 ### 问题现象 @@ -276,7 +276,7 @@ x86物理机安装系统时,由于设置了BIOS选项security boot 为enable > ![](./public_sys-resources/icon-note.gif) **说明:** > 设置security boot为disable之后,保存退出,重新安装即可。 -## 安装openEuler时,软件选择页面选择“服务器-性能工具”,安装后messages日志有pmie_check报错信息 +## 问题8:安装openEuler时,软件选择页面选择“服务器-性能工具”,安装后messages日志有pmie_check报错信息 ### 问题现象 @@ -302,7 +302,7 @@ anaconda不支持在chroot环境中安装selinux策略模块,当安装pcp-seli # sudo dnf reinstall pcp-selinux ``` -## 在两块已经安装了系统的磁盘上进行重复选择,并自定义分区时,安装失败 +## 问题9:在两块已经安装了系统的磁盘上进行重复选择,并自定义分区时,安装失败 ### 问题现象 @@ -323,7 +323,7 @@ anaconda不支持在chroot环境中安装selinux策略模块,当安装pcp-seli -## 安装LSI MegaRAID卡的物理机kdump无法生成vmcore +## 问题10:安装LSI MegaRAID卡的物理机kdump无法生成vmcore ### 问题现象 diff --git "a/docs/zh/docs/Kernel/\345\206\205\345\255\230\345\217\257\351\235\240\346\200\247\345\210\206\347\272\247\347\211\271\346\200\247\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/Kernel/\345\206\205\345\255\230\345\217\257\351\235\240\346\200\247\345\210\206\347\272\247\347\211\271\346\200\247\344\275\277\347\224\250\346\214\207\345\215\227.md" index 58a6d97e562129ed78f867039258097061ebaa1b..b2358623310d2b2b7421be671fa87670772e87c0 100644 --- "a/docs/zh/docs/Kernel/\345\206\205\345\255\230\345\217\257\351\235\240\346\200\247\345\210\206\347\272\247\347\211\271\346\200\247\344\275\277\347\224\250\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/Kernel/\345\206\205\345\255\230\345\217\257\351\235\240\346\200\247\345\210\206\347\272\247\347\211\271\346\200\247\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -324,7 +324,7 @@ echo 10000000 > /proc/sys/vm/shmem_reliable_bytes_limit #### 使用方法 -确保内核开启config开关CONFIG_ARCH_HAS_COPY_MC,/proc/sys/kernel/machine_check_safe值为1时代表全场景使能,改为0代表不使能,其他值均为非法。 
+确保内核开启config开关CONFIG_ARCH_HAS_COPY_MC,/proc/sys/debug/machine_check_safe值为1时代表全场景使能,改为0代表不使能,其他值均为非法。 当前各场景容错处理机制如下: diff --git "a/docs/zh/docs/KernelLiveUpgrade/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/KernelLiveUpgrade/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index 9da677b7a58e0e73814ae82616bd92817af49758..f420e5c328a48c378b637823c69e4228f6b7ee9a 100644 --- "a/docs/zh/docs/KernelLiveUpgrade/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/KernelLiveUpgrade/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -21,7 +21,7 @@ ### 软件要求 -- 操作系统:openEuler 22.03 +- 操作系统:openEuler 24.03 ## 环境准备 @@ -35,21 +35,21 @@ 安装内核热升级工具的操作步骤如下: -1. 挂载openEuler的iso文件 +1. 挂载openEuler的iso文件 - ``` - # mount openEuler-22.03-LTS-everything-aarch64-dvd.iso /mnt + ```shell + # mount openEuler-{version}-everything-aarch64-dvd.iso /mnt ``` -2. 配置本地yum源 +2. 配置本地yum源 - ``` + ```shell # vim /etc/yum.repos.d/local.repo ``` 配置内容如下所示: - ``` + ```shell [local] name=local baseurl=file:///mnt @@ -57,27 +57,26 @@ enabled=1 ``` -3. 将RPM数字签名的GPG公钥导入系统 +3. 将RPM数字签名的GPG公钥导入系统 - ``` + ```shell # rpm --import /mnt/RPM-GPG-KEY-openEuler ``` 4. 安装内核热升级工具 - ``` + ```shell # yum install nvwa -y ``` -5. 验证是否安装成功。命令和回显如下表示安装成功 +5. 
验证是否安装成功。命令和回显如下表示安装成功 - ``` + ```shell # rpm -qa | grep nvwa nvwa-xxx ``` - ## 部署内核热升级工具 本章介绍内核热升级工具的配置部署: diff --git "a/docs/zh/docs/Kmesh/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\345\212\236\346\263\225.md" "b/docs/zh/docs/Kmesh/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" similarity index 87% rename from "docs/zh/docs/Kmesh/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\345\212\236\346\263\225.md" rename to "docs/zh/docs/Kmesh/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" index 66afb72547ad9c416a4227aabd84f5332b5f3957..a8b92c15bb19ee4f0a3226af5b359a1fbab1dd3f 100644 --- "a/docs/zh/docs/Kmesh/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\345\212\236\346\263\225.md" +++ "b/docs/zh/docs/Kmesh/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -1,4 +1,4 @@ -# 常见问题及解决办法 +# 常见问题及解决方法 ## 问题1:在使用集群启动模式时,若没有配置控制面程序ip信息,Kmesh服务启动后会报错退出 @@ -6,7 +6,7 @@ 原因:集群启动模式下,Kmesh服务需要跟控制面程序通信,然后从控制面获取配置信息,因此需要设置正确的控制面程序ip信息。 -解决办法:参考[安装与部署](./安装与部署.md)章节中集群启动模式,设置正确的控制面程序ip信息。 +解决方法:参考[安装与部署](./安装与部署.md)章节中集群启动模式,设置正确的控制面程序ip信息。 ## 问题2:Kmesh服务在启动时,提示"get kube config error!" 
@@ -14,7 +14,7 @@ 原因:集群启动模式下,Kmesh服务会根据k8s的配置,自动获取控制面程序ip信息,若环境中没有配置k8s的kubeconfig路径,会导致获取kubeconfig失败,然后提示上述信息。(若已经手动修改Kmesh的配置文件,正确配置控制面程序ip信息,该问题可忽略) -解决办法:按如下方式配置kubeconfig: +解决方法:按如下方式配置kubeconfig: ```shell mkdir -p $HOME/.kube diff --git "a/docs/zh/docs/KubeEdge/KubeEdge\344\275\277\347\224\250\346\226\207\346\241\243.md" "b/docs/zh/docs/KubeEdge/KubeEdge\344\275\277\347\224\250\346\226\207\346\241\243.md" index 3fcb029e67790128c688c757cc5afdbe1fdef689..53d8b946f4f6194680f382ed76e1eb3c37b6c00f 100644 --- "a/docs/zh/docs/KubeEdge/KubeEdge\344\275\277\347\224\250\346\226\207\346\241\243.md" +++ "b/docs/zh/docs/KubeEdge/KubeEdge\344\275\277\347\224\250\346\226\207\346\241\243.md" @@ -1,6 +1,6 @@ # KubeEdge使用文档 -KubeEdge将Kubernetes的能力延伸到了边缘场景中,为云和边缘之间的网络,应用部署和元数据同步提供基础架构支持。KubeEdge在使用上与Kubernetes保持完全一致,除此之外还扩展了对边缘设备的管理与控制。本节将通过一个简单的例子向用户演示如何通过KubeEdge完成设备边云协同任务。 +KubeEdge将Kubernetes的能力延伸到了边缘场景中,为云和边缘之间的网络、应用部署和元数据同步提供基础架构支持。KubeEdge在使用上与Kubernetes保持完全一致,除此之外还扩展了对边缘设备的管理与控制。本节将通过一个简单的例子向用户演示如何通过KubeEdge完成设备边云协同任务。 ## 1. 
准备工作 diff --git a/docs/zh/docs/KubeEdge/overview.md b/docs/zh/docs/KubeEdge/overview.md index 234e5f0668af755c64c909e4bb14466ac278ce80..c4ef29a92390e6540b6fb2b446bf4e90e3d11d91 100644 --- a/docs/zh/docs/KubeEdge/overview.md +++ b/docs/zh/docs/KubeEdge/overview.md @@ -1,3 +1,3 @@ # KubeEdge 边缘计算平台用户指南 -本文档主要介绍了边缘计算平台 KubeEdge 的部署指南与使用文档,让用户了解 KubeEdge,并指导用户和管理员安装和使用 KubeEdge。 \ No newline at end of file +本文档主要介绍了边缘计算平台 KubeEdge 的部署与使用,让用户了解 KubeEdge,并指导用户和管理员安装和使用 KubeEdge。 \ No newline at end of file diff --git a/docs/zh/docs/KubeOS/overview.md b/docs/zh/docs/KubeOS/overview.md index beb6b093378dc4c24fd7d81e1f37081c1ebf6a91..3b9405aa48795ce6ede124d3aa67b17f00c590d9 100644 --- a/docs/zh/docs/KubeOS/overview.md +++ b/docs/zh/docs/KubeOS/overview.md @@ -6,4 +6,3 @@ * 熟悉Linux基本操作。 * 对kubernetes和docker有一定了解 - diff --git "a/docs/zh/docs/KubeOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" "b/docs/zh/docs/KubeOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" index 22db85467ed49e737ff7bd105ec135e1bac545db..6c143abdaf0eb8958a370fef16bbcd882a0f7581 100644 --- "a/docs/zh/docs/KubeOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" +++ "b/docs/zh/docs/KubeOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" @@ -2,183 +2,596 @@ - - - [使用方法](#使用方法) - - [注意事项](#注意事项) - + - [OS CR参数说明](#os-cr参数说明) - [升级指导](#升级指导) - + - [配置(Settings)指导](#配置settings指导) - [回退指导](#回退指导) - - [使用场景](#使用场景) - - - [手动回退](#手动回退) - - - [工具回退](#工具回退) - - + - [手动回退指导](#手动回退指导) + - [工具回退指导](#工具回退指导) + - [附录](#附录) + - [Setting 列表](#setting-列表) + - [kernel Settings](#kernel-settings) + - [Grub Settings](#grub-settings) + - [kubelet配置](#kubelet配置) + - [containerd配置](#containerd配置) + - [Pam Limits配置](#pam-limits配置) - ## 注意事项 -1. 容器 OS 升级为所有软件包原子升级,默认不在容器 OS 内提供单包升级能力。 -2. 容器 OS 升级为双区升级的方式,不支持更多分区数量。 -3. 单节点的升级过程的日志可在节点的/var/log/messages文件查看。 -4. 请严格按照提供的升级和回退流程进行操作,异常调用顺序可能会导致系统无法升级或回退。 -5. 使用docker镜像升级和mtls双向认证仅支持 openEuler 22.09 及之后的版本 -6. 
不支持跨大版本升级 +* 公共注意事项 + * 仅支持虚拟机和物理机x86和arm64 UEFI场景。 + * 使用kubectl apply通过YAML创建或更新OS的CR时,不建议并发apply,当并发请求过多时,kube-apiserver会无法处理请求导致失败。 + * 如用户配置了容器镜像仓的证书或密钥,请用户保证证书或密钥文件的权限最小。 +* 升级注意事项 + * 升级为所有软件包原子升级,默认不提供单包升级能力。 + * 升级为双区升级的方式,不支持更多分区数量。 + * 当前暂不支持跨大版本升级。 + * 单节点的升级过程的日志可在节点的 /var/log/messages 文件查看。 + * 请严格按照提供的升级和回退流程进行操作,异常调用顺序可能会导致系统无法升级或回退。 + * 节点上containerd如需配置ctr使用的私有镜像,请将配置文件host.toml按照ctr指导放在/etc/containerd/certs.d目录下。 + * 使用OCI 镜像升级和mtls双向认证仅支持 openEuler 22.09 及之后的版本。 + * nodeselector、executionmode、timewindow和timeinterval 仅支持openEuler 24.09及之后版本。 + * KubeOS 24.03-LTS-SP1 版本与历史版本不兼容。 + * 使用从http/https服务器下载升级镜像功能需要同步使用对应版本镜像制作工具。 + +* 配置注意事项 + * 用户自行指定配置内容,用户需保证配置内容安全可靠 ,尤其是持久化配置(kernel.sysctl.persist、grub.cmdline.current、grub.cmdline.next、kubernetes.kubelet、container.containerd、pam.limits),KubeOS不对参数有效性进行检验。 + * opstype=config时,若osversion与当前集群节点的OS版本不一致,配置不会进行。 + * 当前仅支持kernel参数临时配置(kernel.sysctl)、持久化配置(kernel.sysctl.persist)和grub cmdline配置(grub.cmdline.current和grub.cmdline.next)、kubelet配置(kubernetes.kubelet)、containerd配置(container.containerd)和pam limits配置(pam.limits)。 + * 持久化配置会写入persist持久化分区,升级重启后配置保留;kernel参数临时配置重启后不保留。 + * 配置grub.cmdline.current或grub.cmdline.next时,如为单个参数(非key=value格式参数),请指定key为该参数,value为空。 + * 进行配置删除(operation=delete)时,key=value形式的配置需保证key、value和实际配置一致。 + * 配置不支持回退,如需回退,请修改配置版本和配置内容,重新下发配置。 + * 配置出现错误,节点状态陷入config时,请将配置版本恢复成上一版本并重新下发配置,从而使节点恢复至idle状态。 但是请注意:出现错误前已经配置完成的参数无法恢复。 + * 在配置grub.cmdline.current或grub.cmdline.next时,若需要将已存在的“key=value”格式的参数更新为只有key无value格式,比如将“rd.info=0”更新成rd.info,需要先删除“key=value”,然后在下一次配置时,添加key。不支持直接更新或者更新删除动作在同一次完成。 + +## OS CR参数说明 + +在集群中创建类别为OS的定制对象,设置相应字段。类别OS来自于[安装和部署章节](./安装与部署.md)创建的CRD对象,字段及说明如下: + +* imageurl指定的地址里包含协议,只支持http或https协议。imageurl为https协议时为安全传输,imageurl为http地址时,需指定flagSafe为true,即用户明确该地址为安全时,才会下载镜像。如imageurl为http地址且没有指定flagSafe为true,默认该地址不安全,不会下载镜像并且在升级节点的日志中提示用户该地址不安全。 +* 
对于imageurl,推荐使用https协议,使用https协议需要升级的机器已安装相应证书。如果镜像服务器由用户自己维护,需要用户自己进行签名,并保证升级节点已安装对应证书。用户需要将证书放在容器OS```/etc/KubeOS/certs```目录下。地址由管理员传入,管理员应该保证网址的安全性,推荐采用内网地址。
+* 容器OS镜像的合法性检查需要由容器OS镜像服务提供者做合法性检查,确保下载的容器OS镜像来源可靠。
+* 集群存在多OS版本即存在多个OS的实例时,OS的nodeselector字段需要与其他OS不同,即通过label区分的一类node只能对应一个OS实例:
+  * 当有OS的nodeselector为all-label时,集群只能存在这一个OS的有效实例(有效实例为存在与这个OS对应的节点)。
+  * nodeselector不配置的OS也只能有一个,因为nodeselector不配置时认为是对没有label的节点进行操作。
+* timeinterval参数说明:
+  * 参数不设置时默认为15s。
+  * 参数设置为0时,由于k8s controller-runtime的rate limit限制,operator下发任务的时间间隔会逐渐增加直至1000s。
+  * 并行时为每批次operator下发升级/配置的时间间隔。
+  * 在串行时为每批次节点串行升级完毕后与下次升级/配置下发的时间间隔,批次内部的时间间隔为15s。
+  * OS的实例字段进行更新会立刻触发operator。
+
+  | 参数 |参数类型 | 参数说明 | 使用说明 | 是否必选 |
+  | -------------- | ------ | ------------------------------------------------------------ | ----- | ---------------- |
+  | imagetype | string | 升级镜像的类型 | 仅支持docker ,containerd ,或者是 disk,仅在升级场景有效。**注意**:若使用containerd,agent优先使用crictl工具拉取镜像,没有crictl时才会使用ctr命令拉取镜像。使用ctr拉取镜像时,镜像如果在私有仓内,需按照[官方文档](https://github.com/containerd/containerd/blob/main/docs/hosts.md)在/etc/containerd/certs.d目录下配置私有仓主机信息,才能成功拉取镜像。 |是 |
+  | opstype | string | 操作类型:升级,回退或者配置 | 仅支持upgrade ,config 或者 rollback |是 |
+  | osversion | string | 升级/回退的目标版本 | osversion需与节点的目标os版本对应(节点上/etc/os-release中PRETTY_NAME字段或k8s检查到的节点os版本) 例如:KubeOS 1.0.0。 |是 |
+  | maxunavailable | int | 每批同时进行升级/回退/配置的节点数。 | maxunavailable值大于实际节点数时,取实际节点数进行升级/回退/配置。 |是 |
+  | containerimage | string | 用于升级的容器镜像 | 仅在imagetype是容器类型时生效,仅支持以下3种格式的容器镜像地址: repository/name repository/name@sha256:xxxx repository/name:tag |是 |
+  | imageurl | string | 用于升级的磁盘镜像的地址 | imageurl中包含协议,只支持http或https协议,例如:```https://192.168.122.15/update.img``` ,仅在使用磁盘镜像升级场景下有效 |是 |
+  | checksum | string | 用于升级的磁盘镜像校验的checksum(SHA-256)值或者是用于升级的容器镜像的digests值 | 仅在升级场景下有效 |是 |
+  | flagSafe | bool | 当imageurl的地址使用http协议表示是否是安全的 | 需为 true 或者 false ,仅在imageurl使用http协议时有效 |是 |
+  | mtls | bool | 用于表示与imageurl连接是否采用https双向认证 | 需为 true 或者 false ,仅在imageurl使用https协议时有效|是 |
+  | cacert | 
string | https或者https双向认证时使用的根证书文件 | 仅在imageurl使用https协议时有效| imageurl使用https协议时必选 | + | clientcert | string | https双向认证时使用的客户端证书文件 | 仅在使用https双向认证时有效|mtls为true时必选 | + | clientkey | string | https双向认证时使用的客户端公钥 | 仅在使用https双向认证时有效|mtls为true时必选 | + | evictpodforce | bool | 升级/回退时是否强制驱逐pod | 需为 true 或者 false ,仅在升级或者回退时有效| 必选 | + | sysconfigs | / | 配置设置 | 1. “opstype=config”时只进行配置。
2.“opstype=upgrade/rollback”时,代表升级/回退后配置,即在升级/回退重启后进行配置,详细字段说明请见[配置(Settings)指导](#配置settings指导) | “opstype=config”时必选 | + | upgradeconfigs | / | 升级前配置设置 | 在升级或者回退时有效,在升级或者回退操作之前起效,详细字段说明请见[配置(Settings)指导](#配置settings指导)| 可选 | + | nodeselector | string | 需要进行升级/配置/回滚操作的节点label | 用于只对具有某些特定label的节点而不是集群所有worker节点进行运维的场景,需要进行运维操作的节点需要包含key为upgrade.openeuler.org/node-selector的label,nodeselector为该label的value值。
注意事项:
1.此参数不配置时,或者配置为“no-label”时对没有upgrade.openeuler.org/node-selector的节点进行操作
2.此参数为“”时,对具有upgrade.openeuler.org/node-selector=“”的节点进行操作
3.如需忽略label,对所有节点进行操作,需指定此参数为all-label| 可选 | + | timewindow | / | 升级/配置/回滚操作的时间窗口 |1.指定时间窗口时starttime和endtime都需指定,即二者需要同时为空或者同时不为空
2.starttime和endtime类型为string,需要为YYYY-MM-DD HH:MM:SS格式或者HH:MM:SS格式,且二者格式需一致
3.为HH:MM:SS格式时,starttime < endtime认为starttime是下一天的该时间
4.timewindow不配置时默认为不存在时间窗限制| 可选 | + | timeinterval | int | 升级/配置/回滚操作每批次任务下发的时间间隔 |参数单位为秒,时间间隔为operator下发任务的时间间隔,如k8s集群繁忙无法立即响应operator请求,实际时间间隔可能会大于指定时间| 可选 | + | executionmode | string | 升级/配置/回滚操作执行的方式 |仅支持serial或者parallel,即串行或者并行,当次参数不设置时,默认采用并行的方式| 可选 | ## 升级指导 -在集群中创建类别为OS的定制对象,设置相应字段。类别OS来自于安装和部署章节创建的CRD对象,字段及说明如下: +1.编写YAML文件,在集群中部署 OS 的cr实例,用于部署cr实例的YAML示例如下,假定将上面的YAML保存到upgrade_v1alpha1_os.yaml; + * 使用磁盘镜像进行升级 + + ```yaml + apiVersion: upgrade.openeuler.org/v1alpha1 + kind: OS + metadata: + name: os-sample + spec: + imagetype: disk + opstype: upgrade + osversion: edit.os.version + maxunavailable: edit.node.upgrade.number + containerimage: "" + evictpodforce: true/false + imageurl: edit.image.url + checksum: image.checksum + flagSafe: imageurl.safety + mtls: imageurl use mtls or not + cacert: ca certificate + clientcert: client certificate + clientkey: client certificate key + ``` + + * 使用容器镜像进行升级 + * 使用容器镜像进行升级前请先制作升级所需的容器镜像,制作方式请见[《容器OS镜像制作指导》](./容器OS镜像制作指导.md)中 [KubeOS OCI 镜像制作](./容器OS镜像制作指导.md#kubeos-oci-镜像制作)。 + * 节点容器引擎为docker + + ``` yaml + apiVersion: upgrade.openeuler.org/v1alpha1 + kind: OS + metadata: + name: os-sample + spec: + imagetype: docker + opstype: upgrade + osversion: edit.os.version + maxunavailable: edit.node.upgrade.number + containerimage: container image like repository/name:tag + evictpodforce: true/false + imageurl: "" + checksum: container image digests + flagSafe: false + mtls: true + ``` + + * 节点容器引擎为containerd + + ```yaml + apiVersion: upgrade.openeuler.org/v1alpha1 + kind: OS + metadata: + name: os-sample + spec: + imagetype: containerd + opstype: upgrade + osversion: edit.os.version + maxunavailable: edit.node.upgrade.number + containerimage: container image like repository/name:tag + evictpodforce: true/false + imageurl: "" + checksum: container image digests + flagSafe: false + mtls: true + ``` + + * 升级并且进行配置的示例如下: + * 
以节点容器引擎为containerd为例,升级方式对配置无影响,upgradeconfigs在升级前起效,sysconfigs在升级后起效,配置参数说明请见[配置(Settings)指导](#配置settings指导)。 + * 升级并且配置时opstype字段需为upgrade。 + * upgradeconfig为升级之前执行的配置,sysconfigs为升级机器重启后执行的配置,用户可按需进行配置。 + + ```yaml + apiVersion: upgrade.openeuler.org/v1alpha1 + kind: OS + metadata: + name: os-sample + spec: + imagetype: "" + opstype: upgrade + osversion: edit.os.version + maxunavailable: edit.node.upgrade.number + containerimage: "" + evictpodforce: true/false + imageurl: "" + checksum: container image digests + flagSafe: false + mtls: false + sysconfigs: + version: edit.os.version + configs: + - model: kernel.sysctl + contents: + - key: kernel param key1 + value: kernel param value1 + - key: kernel param key2 + value: kernel param value2 + - model: kernel.sysctl.persist + configpath: persist file path + contents: + - key: kernel param key3 + value: kernel param value3 + - key: "" + value: "" + upgradeconfigs: + version: 1.0.0 + configs: + - model: kernel.sysctl + contents: + - key: kernel param key4 + value: kernel param value4 + ``` + * 设置nodeselector、timewindow、timeinterval、executionmode升级部分节点示例如下: + * 以节点容器引擎为containerd为例,升级方式对节点筛选无影响。 + * 需要进行升级的节点需包含key为`upgrade.openeuler.org/node-selector`的label,nodeselector的值为该label的value,即假定nodeselector值为kubeos,则只对包含`upgrade.openeuler.org/node-selector=kubeos`的label的worker节点进行升级。 + * nodeselector、timewindow、timeinterval、executionmode对配置和回滚同样有效。 + * 节点添加label、修改label、删除label和查看label命令示例如下: + ``` shell + # 为节点kubeos-node1增加label + kubectl label nodes kubeos-node1 upgrade.openeuler.org/node-selector=kubeos-v1 + # 修改节点kubeos-node1的label + kubectl label --overwrite nodes kubeos-node1 upgrade.openeuler.org/node-selector=kubeos-v2 + # 删除节点kubeos-node1的label + kubectl label nodes kubeos-node1 upgrade.openeuler.org/node-selector- + # 查看节点的label + kubectl get nodes --show-labels + ``` + * yaml示例如下: + ```yaml + apiVersion: upgrade.openeuler.org/v1alpha1 + kind: OS + metadata: + name: os-sample + spec: + imagetype: containerd + 
opstype: upgrade + osversion: edit.os.version + maxunavailable: edit.node.upgrade.number + containerimage: container image like repository/name:tag + evictpodforce: true/false + imageurl: "" + checksum: container image digests + flagSafe: false + mtls: true + nodeselector: edit.node.label.key + timewindow: + starttime: "HH::MM::SS/YYYY-MM-DD HH::MM::SS" + endtime: "HH::MM::SS/YYYY-MM-DD HH::MM::SS" + timeinterval: time intervel like 30 + executionmode: serial/parallel + ``` + +2. 查看未升级的节点的 OS 版本。 + + ```shell + kubectl get nodes -o custom-columns='NAME:.metadata.name,OS:.status.nodeInfo.osImage' + ``` -| 参数 |参数类型 | 参数说明 | 使用说明 | 是否必选 | -| -------------- | ------ | ------------------------------------------------------------ | ----- | ---------------- | -| imagetype | string | 使用的升级镜像的类型 | 需为 docker 或者 disk ,其他值无效,且该参数仅在升级场景有效|是 | -| opstype | string | 进行的操作,升级或者回退 | 需为 upgrade ,或者 rollback ,其他值无效 |是 | -| osversion | string | 用于升级或回退的镜像的OS版本 | 需为 KubeOS version , 例如: KubeOS 1.0.0|是 | -| maxunavailable | int | 同时进行升级或回退的节点数 | maxunavailable值设置为大于实际集群的节点数时也可正常部署,升级或回退时会按照集群内实际节点数进行|是 | -| dockerimage | string | 用于升级的容器镜像 | 需要为容器镜像格式:repository/name:tag,仅在使用容器镜像升级场景下有效|是 | -| imageurl | string | 用于升级的磁盘镜像的地址 | imageurl中包含协议,只支持http或https协议,例如:https://192.168.122.15/update.img 仅在使用磁盘镜像升级场景下有效|是 | -| checksum | string | 用于升级的磁盘镜像校验的checksum(SHA-256)值 | 仅在使用磁盘镜像升级场景下有效 |是 | -| flagSafe | bool | 当imageurl的地址使用http协议表示是否是安全的 | 需为 true 或者 false ,仅在imageurl使用http协议时有效 |是 | -| mtls | bool | 用于表示与imageurl连接是否采用https双向认证 | 需为 true 或者 false ,仅在imageurl使用https协议时有效|是 | -| cacert | string | https或者https双向认证时使用的根证书文件 | 仅在imageurl使用https协议时有效| imageurl使用https协议时必选 | -| clientcert | string | https双向认证时使用的客户端证书文件 | 仅在使用https双向认证时有效|mtls为true时必选 | -| clientkey | string | https双向认证时使用的客户端公钥 | 仅在使用https双向认证时有效|mtls为true时必选 | +3. 
执行命令,在集群中部署cr实例后,节点会根据配置的参数信息进行升级。 -imageurl指定的地址里包含协议,只支持http或https协议。imageurl为https协议时为安全传输,imageurl为http地址时,需指定flagSafe为true,即用户明确该地址为安全时,才会下载镜像。如imageurl为http地址且没有指定flagSafe为true,默认该地址不安全,不会下载镜像并且在升级节点的日志中提示用户该地址不安全 + ```shell + kubectl apply -f upgrade_v1alpha1_os.yaml + ``` -对于imageurl,推荐使用https协议,使用https协议需要升级的机器已安装相应证书。如果镜像服务器由用户自己维护,需要用户自己进行签名,并保证升级节点已安装对应证书。用户需要将证书放在容器OS /etc/KubeOS/certs目录下。地址由管理员传入,管理员应该保证网址的安全性,推荐采用内网地址。 +4. 再次查看节点的 OS 版本来确认节点是否升级完成。 -容器OS镜像的合法性检查需要由容器OS镜像服务提供者做合法性检查,确保下载的容器OS镜像来源可靠 + ```shell + kubectl get nodes -o custom-columns='NAME:.metadata.name,OS:.status.nodeInfo.osImage' + ``` -编写YAML文件,在集群中部署 OS 的cr实例,用于部署cr实例的YAML示例如下: +> ![](./public_sys-resources/icon-note.gif)**说明**: +> +> 如果后续需要再次升级,与上面相同对 upgrade_v1alpha1_os.yaml 的 相应字段进行相应修改。 -* 使用磁盘镜像进行升级 +## 配置(Settings)指导 - ``` - apiVersion: upgrade.openeuler.org/v1alpha1 - kind: OS - metadata: - name: os-sample - spec: - imagetype: disk - opstype: upgrade - osversion: edit.os.version - maxunavailable: edit.node.upgrade.number - dockerimage: "" - imageurl: edit.image.url - checksum: image.checksum - flagSafe: imageurl.safety - mtls: imageurl use mtls or not - cacert: ca certificate - clientcert: client certificate - clientkey: client certificate key - ``` +* Settings参数说明: -* 使用容器镜像升级 + 基于示例YAML对配置的参数进行说明,示例YAML如下,配置的格式(缩进)需和示例保持一致: - ``` shell + ```yaml apiVersion: upgrade.openeuler.org/v1alpha1 kind: OS metadata: name: os-sample spec: - imagetype: docker - opstype: upgrade + imagetype: "" + opstype: config osversion: edit.os.version - maxunavailable: edit.node.upgrade.number - dockerimage: dockerimage like repository/name:tag - imageurl: "" + maxunavailable: edit.node.config.number + containerimage: "" + evictpodforce: false checksum: "" - flagSafe: false - mtls: true + sysconfigs: + version: edit.sysconfigs.version + configs: + - model: kernel.sysctl + contents: + - key: kernel param key1 + value: kernel param value1 + - key: kernel param key2 + value: kernel param value2 + 
operation: delete
+      - model: kernel.sysctl.persist
+        configpath: persist file path
+        contents:
+          - key: kernel param key3
+            value: kernel param value3
+      - model: grub.cmdline.current
+        contents:
+          - key: boot param key1
+          - key: boot param key2
+            value: boot param value2
+          - key: boot param key3
+            value: boot param value3
+            operation: delete
+      - model: grub.cmdline.next
+        contents:
+          - key: boot param key4
+          - key: boot param key5
+            value: boot param value5
+          - key: boot param key6
+            value: boot param value6
+            operation: delete
+    ```
+
+  配置的参数说明如下:
+
+  | 参数 | 参数类型 | 参数说明 | 使用说明 | 配置中是否必选 |
+  | ---------- | -------- | --------------------------- | ------------------------------------------------------------ | ----------------------- |
+  | version | string | 配置的版本 | 通过version是否相等来判断配置是否触发,version为空(为""或者没有值)时同样进行判断,所以不配置sysconfigs/upgradeconfigs时,已存在的version值会被清空并触发配置。 | 是 |
+  | configs | / | 具体配置内容 | 包含具体配置项列表。 | 是 |
+  | model | string | 配置的类型 | 支持的配置类型请看附录下的[Settings列表](#setting-列表) | 是 |
+  | configpath | string | 配置文件路径 | 仅在kernel.sysctl.persist配置类型中生效,请看附录下的[Settings列表](#setting-列表)对配置文件路径的说明。 | 否 |
+  | contents | / | 具体key/value的值及操作类型 | 包含具体配置参数列表。 | 是 |
+  | key | string | 参数名称 | key不能为空,不能包含“=”,不建议配置含空格、tab键的字符串,具体请看附录下的[Settings列表](#setting-列表)中每种配置类型对key的说明。 | 是 |
+  | value | string | 参数值 | key=value形式的参数中,value不能为空,不建议配置含空格、tab键的字符串,具体请看附录下的[Settings列表](#setting-列表)中对每种配置类型对value的说明。 | key=value形式的参数必选 |
+  | operation | string | 对参数进行的操作 | 仅对kernel.sysctl.persist、grub.cmdline.current、grub.cmdline.next类型的参数生效。默认为添加或更新。仅支持配置为delete,代表删除已存在的参数(key=value需完全一致才能删除)。 | 否 |
+
+  * upgradeconfigs与sysconfigs参数相同,upgradeconfigs为升级/回退前进行的配置,仅在upgrade/rollback场景起效,sysconfigs既支持只进行配置,也支持在升级/回退重启后进行配置。
+
+* 使用说明
+
+  * 编写YAML文件,在集群中部署 OS 
的cr实例,用于部署cr实例的YAML示例如上,假定将上面的YAML保存到upgrade_v1alpha1_os.yaml。 -``` -kubectl apply -f upgrade_v1alpha1_os.yaml -``` + * 查看配置之前的节点的配置的版本和节点状态(NODESTATUS状态为idle)。 -再次查看节点的 OS 版本来确认节点是否升级完成 + ```shell + kubectl get osinstances -o custom-columns='NAME:.metadata.name,NODESTATUS:.spec.nodestatus,SYSCONFIG:status.sysconfigs.version,UPGRADECONFIG:status.upgradeconfigs.version' + ``` -``` -kubectl get nodes -o custom-columns='NAME:.metadata.name,OS:.status.nodeInfo.osImage' -``` + * 执行命令,在集群中部署cr实例后,节点会根据配置的参数信息进行配置,再次查看节点状态(NODESTATUS变成config)。 -> ![](./public_sys-resources/icon-note.gif)**说明**: -> -> 如果后续需要再次升级,与上面相同对 upgrade_v1alpha1_os.yaml 的 imageurl ,osversion,checksum,maxunavailable,flagSafe 或者dockerimage字段进行相应修改。 + ```shell + kubectl apply -f upgrade_v1alpha1_os.yaml + kubectl get osinstances -o custom-columns='NAME:.metadata.name,NODESTATUS:.spec.nodestatus,SYSCONFIG:status.sysconfigs.version,UPGRADECONFIG:status.upgradeconfigs.version' + ``` + + * 再次查看节点的配置的版本确认节点是否配置完成(NODESTATUS恢复为idle)。 + + ```shell + kubectl get osinstances -o custom-columns='NAME:.metadata.name,NODESTATUS:.spec.nodestatus,SYSCONFIG:status.sysconfigs.version,UPGRADECONFIG:status.upgradeconfigs.version' + ``` + +* 如果后续需要再次配置,与上面相同对 upgrade_v1alpha1_os.yaml 的相应字段进行相应修改。 ## 回退指导 ### 使用场景 -- 虚拟机无法正常启动时,需要退回到上一可以启动的版本时进行回退操作,仅支持手动回退容器 OS 。 -- 虚拟机能够正常启动并且进入系统,需要将当前版本退回到老版本时进行回退操作,支持工具回退(类似升级方式)和手动回退,建议使用工具回退。 +* 虚拟机无法正常启动时,可在grub启动项页面手动切换启动项,使系统回退至上一版本(即手动回退)。 +* 虚拟机能够正常启动并且进入系统时,支持工具回退和手动回退,建议使用工具回退。 +* 工具回退有两种方式: + 1. rollback模式直接回退至上一版本。 + 2. upgrade模式重新升级至上一版本。 -### 手动回退 +### 手动回退指导 -手动重启虚拟机,选择第二启动项进行回退,手动回退仅支持回退到本次升级之前的版本。 +* 手动重启虚拟机,进入启动项页面后,选择第二启动项进行回退,手动回退仅支持回退到上一个版本。 -### 工具回退 +### 工具回退指导 -* 回退至任意版本 - * 修改 OS 的cr实例的YAML 配置文件(例如 upgrade_v1alpha1_os.yaml),设置相应字段为期望回退的老版本镜像信息。类别OS来自于安装和部署章节创建的CRD对象,字段说明及示例请见上一节升级指导。 +* 回退至任意版本 + 1. 
修改 OS 的cr实例的YAML 配置文件(例如 upgrade_v1alpha1_os.yaml),设置相应字段为期望回退的老版本镜像信息。类别OS来自于安装和部署章节创建的CRD对象,字段说明及示例请见上一节升级指导。 -    * YAML修改完成后执行更新命令,在集群中更新定制对象后,节点会根据配置的字段信息进行回退 +    2. YAML修改完成后执行更新命令,在集群中更新定制对象后,节点会根据配置的字段信息进行回退 -        ``` -        kubectl apply -f upgrade_v1alpha1_os.yaml -        ``` +        ```shell +        kubectl apply -f upgrade_v1alpha1_os.yaml +        ``` * 回退至上一版本 +    * OS回退至上一版本:修改upgrade_v1alpha1_os.yaml,设置osversion为上一版本,opstype为rollback,回退至上一版本(即切换至上一分区)。YAML示例如下: -    * 修改upgrade_v1alpha1_os.yaml,设置osversion为上一版本,opstype为rollback,回退至上一版本(即切换至上一分区)。YAML示例如下: - -        ``` +        ```yaml +        apiVersion: upgrade.openeuler.org/v1alpha1 +        kind: OS +        metadata: +            name: os-sample +        spec: +            imagetype: "" +            opstype: rollback +            osversion: KubeOS previous version +            maxunavailable: 2 +            containerimage: "" +            evictpodforce: true/false +            imageurl: "" +            checksum: "" +            flagSafe: false +            mtls: true +        ``` + +    * 配置回退至上一版本:修改upgrade_v1alpha1_os.yaml,设置sysconfigs/upgradeconfigs的version为上一版本,回退至上一版本(已配置的参数无法回退)。YAML示例如下: + +        ```yaml         apiVersion: upgrade.openeuler.org/v1alpha1         kind: OS         metadata:             name: os-sample         spec:             imagetype: "" -            opstype: rollback -            osversion: KubeOS pervious version -            maxunavailable: 2 -            dockerimage: "" +            opstype: config +            osversion: edit.os.version +            maxunavailable: edit.node.config.number +            containerimage: "" +            evictpodforce: true/false             imageurl: ""             checksum: ""             flagSafe: false -            mtls:true +            mtls: false +            sysconfigs: +                version: previous config version +                configs: +                  - model: kernel.sysctl +                    contents: +                        - key: kernel param key1 +                          value: kernel param value1 +                        - key: kernel param key2 +                          value: kernel param value2 +                  - model: kernel.sysctl.persist +                    configpath: persist file path +                    contents: +                        - key: kernel param key3 +                          value: kernel param value3         ``` -    * YAML修改完成后执行更新命令,在集群中更新定制对象后,节点会根据配置的字段信息进行回退 +* YAML修改完成后执行更新命令,在集群中更新定制对象后,节点会根据配置的字段信息进行回退。 -        ``` -        kubectl apply -f upgrade_v1alpha1_os.yaml +  ```shell +  kubectl apply -f upgrade_v1alpha1_os.yaml +  ``` + +  更新完成后,节点会根据配置信息回退容器 OS。 +* 查看节点容器 OS 
版本(回退OS版本)或节点config版本&节点状态为idle(回退config版本),确认回退是否成功。 + +  ```shell +  kubectl get osinstances -o custom-columns='NAME:.metadata.name,NODESTATUS:.spec.nodestatus,SYSCONFIG:status.sysconfigs.version,UPGRADECONFIG:status.upgradeconfigs.version' +  ``` + +## 附录 + +### Setting 列表 + +#### kernel Settings + +* kernel.sysctl:临时设置内核参数,重启后无效,key/value 表示内核参数的 key/value, key与value均不能为空且key不能包含“=”,该参数不支持删除操作(operation=delete)。示例如下: + +  ```yaml +  configs: +    - model: kernel.sysctl +      contents: +        - key: user.max_user_namespaces +          value: 16384 +        - key: net.ipv4.tcp_tw_recycle +          value: 0 +          operation: delete +  ``` + +* kernel.sysctl.persist: 设置持久化内核参数,key/value表示内核参数的key/value,key与value均不能为空且key不能包含“=”, configpath为配置文件路径,支持新建(需保证父目录存在),如不指定configpath默认修改/etc/sysctl.conf,示例如下: +  ```yaml +  configs: +    - model: kernel.sysctl.persist +      configpath: /etc/persist.conf +      contents: +        - key: user.max_user_namespaces +          value: 16384 +        - key: net.ipv4.tcp_tw_recycle +          value: 0 +          operation: delete  ``` -    更新完成后,节点会根据配置信息回退容器 OS。 +#### Grub Settings + +* grub.cmdline.current/next: 设置grub.cfg文件中的内核引导参数,该行参数在grub.cfg文件中类似如下示例: -* 查看节点容器 OS 版本,确认回退是否成功。 +    ```shell +    linux   /boot/vmlinuz root=/dev/sda2 ro rootfstype=ext4 nomodeset quiet oops=panic softlockup_panic=1 nmi_watchdog=1 rd.shell=0 selinux=0 crashkernel=256M panic=3 +    ``` + +  * 在dm-verity模式下,grub.cmdline配置下发无效。 + +  * KubeOS使用双分区,grub.cmdline.current/next支持对当前分区或下一分区进行配置: + +    * grub.cmdline.current:对当前分区的启动项参数进行配置。 +    * grub.cmdline.next:对下一分区的启动项参数进行配置。 + +  * 注意:升级/回退前后的配置,始终基于升级/回退操作下发时的分区位置进行current/next的区分。假设当前分区为A分区,下发升级操作并在sysconfigs(升级重启后配置)中配置grub.cmdline.current,重启后进行配置时仍修改A分区对应的grub cmdline。 + +  * grub.cmdline.current/next支持“key=value”(value不能为空),也支持单key。若value中有“=”,例如“root=UUID=some-uuid”,key应设置为第一个“=”前的所有字符,value为第一个“=”后的所有字符。配置方法示例如下: + +    ```yaml +    configs: +      - model: grub.cmdline.current +        contents: +          - key: selinux +            value: "0" +          - key: root +            value: UUID=e4f1b0a0-590e-4c5f-9d8a-3a2c7b8e2d94 +          - key: panic +            value: "3" +            operation: 
delete +          - key: crash_kexec_post_notifiers +      - model: grub.cmdline.next +        contents: +          - key: selinux +            value: "0" +          - key: root +            value: UUID=e4f1b0a0-590e-4c5f-9d8a-3a2c7b8e2d94 +          - key: panic +            value: "3" +            operation: delete +          - key: crash_kexec_post_notifiers +    ``` +#### kubelet配置 + +* kubernetes.kubelet: 配置节点kubelet的配置文件中的参数,参数说明和约束如下: +  * 仅支持```KubeletConfiguration```中的配置参数。 +  * 节点kubelet配置文件需要为yaml格式的文件。 +  * 如不指定configpath,默认配置文件路径为```/var/lib/kubelet/config.yaml```,并且需要注意的是配置文件的路径需要与kubelet启动时的```--config```参数指定的路径一致才能生效,用户需保证配置文件路径有效。 +  * kubelet配置的value参数类型支持为空/null、int、float、string、boolean和数组。当为数组时,数组元素允许重复,数组参数进行更新时会追加到已有数组中。如需修改数组中的元素,需要先删除数组,再新增数组来完成修改。 +  * 如配置存在嵌套,则通过```'.'```连接嵌套的key值,例如,如果修改如下yaml示例中```cacheAuthorizedTTL```参数为1s。 + +    ```yaml +    authorization: +      mode: Webhook +      webhook: +        cacheAuthorizedTTL: 0s  ``` +    参数配置示例如下: +    ```yaml +    configs: +      - model: kubernetes.kubelet +        configpath: /etc/test.yaml +        contents: +          - key: authorization.webhook.cacheAuthorizedTTL +            value: 1s  ``` +  * kubernetes.kubelet进行删除时,不对value与配置文件中的值进行比较。 +#### containerd配置 + +* container.containerd: 配置节点上containerd的配置文件中的参数,参数说明和约束如下: +  * containerd需要配置文件为toml格式,所以key为toml中该参数的表头.键名,例如希望修改如下toml示例中```no_shim```为true。 +    ```toml +    [plugins."io.containerd.runtime.v1.linux"] +      no_shim=false +      runtime="runc" +      runtime_root="" +    ``` +    参数配置示例如下: +    ```yaml +    configs: +      - model: container.containerd +        configpath: /etc/test.toml +        contents: +          - key: plugins."io.containerd.runtime.v1.linux".no_shim +            value: true +    ``` +  * toml使用```.```分割键,os-agent识别时与toml保持一致,所以当键名中包含```.```时,该键名需要使用```""```,例如上例中的```"io.containerd.runtime.v1.linux"```为一个键 +  * 如不指定configpath,默认配置文件路径为```/etc/containerd/config.toml```,用户需要保证配置文件路径有效。 +  * container.containerd配置的key和value均不能为空,value参数类型支持int、float、string、boolean和数组。当为数组时,数组元素允许重复,数组参数进行更新时会追加到已有数组中。如需修改数组中的元素,需要先删除数组,再新增数组来完成修改。 +  * 
container.containerd进行删除时,不对value与配置文件中的值进行比较。 + +#### Pam Limits配置 + +* pam.limits:配置节点上/etc/security/limits.conf文件 +  * key为domain值,value的格式需要为type.item.value(limits.conf文件要求每行格式为:\<domain\> \<type\> \<item\> \<value\>),例如: +    ```yaml +    configs: +      - model: pam.limits +        contents: +          - key: ftp +            value: soft.core.0 +    ``` +  * 更新时,如不需要对type/item/value更新时,可以使用```_```,忽略对此参数的更新,但value必须为点隔的三段式,例如: +    ```yaml +    configs: +      - model: pam.limits +        contents: +          - key: ftp +            value: hard._.1 +    ``` +  * pam.limits新增时,value中不允许包含```_``` +  * pam.limits删除时,会对value进行校验,当value与配置文件中的值不同时,删除失败 +  * pam.limits配置的key和value均不能为空 diff --git "a/docs/zh/docs/KubeOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/KubeOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index a613ae27e30700461c06a734ed7cc026490c9055..e678699819a38166cc3e1508d19b3baa4f5930c0 100644 --- "a/docs/zh/docs/KubeOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/KubeOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -4,25 +4,22 @@ - - - [安装与部署](#安装与部署) -  - [软硬件要求](#软硬件要求) -  - [硬件要求](#硬件要求)   - [软件要求](#软件要求)   - [环境准备](#环境准备) -  - [安装容器OS升级工具](#安装容器os升级工具) -  - [部署容器OS升级工具](#部署容器os升级工具) -  - [制作os-operator和os-proxy镜像](#制作os-operator和os-proxy镜像) -  - [制作容器OS镜像](#制作容器os镜像) -  - [部署CRD,operator和proxy](#部署crd,operator和proxy) - - +    - [环境准备](#环境准备-1) +    - [操作步骤](#操作步骤) +  - [制作容器OS虚拟机镜像](#制作容器os虚拟机镜像) +    - [注意事项](#注意事项) +    - [操作步骤](#操作步骤-1) +  - [部署CRD,operator和proxy](#部署crdoperator和proxy) +    - [注意事项](#注意事项-1) +    - [操作步骤](#操作步骤-2) @@ -34,7 +31,7 @@ ### 软件要求 -* 操作系统:openEuler 22.09 +* 操作系统:openEuler 24.03-LTS-SP1 ### 环境准备 @@ -45,33 +42,23 @@ 安装容器 OS 升级工具的操作步骤如下: -1. 
配置 yum 源:openEuler 22.09 和 openEuler 22.09 EPOL - - ``` - [openEuler22.09] # openEuler 22.09 官方发布源 - name=openEuler22.09 - baseurl=http://repo.openeuler.org/openEuler-22.09/everything/$basearch/ - enabled=1 - gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.09/everything/$basearch/RPM-GPG-KEY-openEuler - ``` +1. 配置 openEuler 24.03-LTS-SP1 yum 源: ``` - [Epol] # openEuler 22.09:Epol 官方发布源 - name=Epol - baseurl=http://repo.openeuler.org/openEuler-22.09/EPOL/main/$basearch/ + [openEuler24.03-LTS-SP1] # openEuler 24.03-LTS-SP1 官方发布源 + name=openEuler24.03-LTS-SP1 + baseurl=http://repo.openeuler.org/openEuler-24.03-LTS-SP1/everything/$basearch/ enabled=1 gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.09/OS/$basearch/RPM-GPG-KEY-openEuler + gpgkey=http://repo.openeuler.org/openEuler-24.03-LTS-SP1/everything/$basearch/RPM-GPG-KEY-openEuler ``` 2. 使用 root 帐户安装容器 OS 升级工具: ```shell - # yum install KubeOS KubeOS-scripts -y + yum install KubeOS KubeOS-scripts -y ``` - > ![](./public_sys-resources/icon-note.gif)**说明**: > > 容器 OS 升级工具会安装在 /opt/kubeOS 目录下,包括os-operator,os-proxy,os-agent二进制,制作容器 OS 工具及相应配置文件 。 @@ -106,23 +93,23 @@ export IMG_OPERATOR=your_imageRepository/os-operator_imageName:version ``` -4. 请用户自行编写Dockerfile来构建镜像 ,Dockfile编写请注意以下几项 +4. 
请用户自行编写Dockerfile来构建镜像,Dockerfile编写请注意以下几项: -    * os-operator和os-proxy镜像需要基于baseimage进行构建,请用户保证baseimage的安全性 -    * 需将os-operator和os-proxy二进制文件分别拷贝到对应的镜像中 -    * 请确保os-proxy镜像中os-proxy二进制文件件属主和属组为root,文件权限为500 -    * 请确保os-operator镜像中os-operator二进制文件属主和属组为容器内运行os-operator进程的用户,文件权限为500 +    * os-operator和os-proxy镜像需要基于baseimage进行构建,请用户保证baseimage的安全性。 +    * 需将os-operator和os-proxy二进制文件分别拷贝到对应的镜像中。 +    * 请确保os-proxy镜像中os-proxy二进制文件属主和属组为root,文件权限为500。 +    * 请确保os-operator镜像中os-operator二进制文件属主和属组为容器内运行os-operator进程的用户,文件权限为500。     * os-operator和os-proxy的二进制文件在镜像内的位置和容器启动时运行的命令需与部署的yaml中指定的字段相对应。     Dockerfile示例如下 -    ``` +    ```dockerfile     FROM your_baseimage     COPY ./bin/proxy /proxy     ENTRYPOINT ["/proxy"]     ``` -    ``` +    ```dockerfile     FROM your_baseimage     COPY --chown=6552:6552 ./bin/operator /operator     ENTRYPOINT ["/operator"] @@ -149,20 +136,19 @@     docker push ${IMG_PROXY}     ``` - ### 制作容器OS虚拟机镜像 #### 注意事项 -* 以虚拟机镜像为例,如需进行物理机的镜像制作请见《容器OS镜像制作指导》 -* 制作容器OS 镜像需要使用 root 权限 -* 容器OS 镜像制作工具的 rpm 包源为 openEuler 具体版本的 everything 仓库和 EPOL 仓库。制作镜像时提供的 repo 文件中,yum 源建议同时配置 openEuler 具体版本的 everything 仓库和 EPOL 仓库 -* 使用默认 rpmlist 制作的容器OS虚拟机镜像,默认和制作工具保存在相同路径,该分区至少有 25GiB 的剩余磁盘空间 -* 制作容器 OS 镜像时,不支持用户自定义配置挂载文件 +* 以虚拟机镜像为例,如需进行物理机的镜像制作请见《[容器OS镜像制作指导](./容器OS镜像制作指导.md)》。 +* 制作容器OS 镜像需要使用 root 权限。 +* 容器OS 镜像制作工具的 rpm 包源为 openEuler 具体版本的 everything 仓库和 EPOL 仓库。制作镜像时提供的 repo 文件中,yum 源建议同时配置 openEuler 具体版本的 everything 仓库和 EPOL 仓库。 +* 使用默认 rpmlist 制作的容器OS虚拟机镜像,默认保存在调用`kbimg`路径下的`scripts-auto`文件夹内,该分区至少有 25GiB 的剩余磁盘空间。 +* 制作容器 OS 镜像时,不支持用户自定义配置挂载文件。 #### 操作步骤 -制作容器OS 虚拟机镜像使用 kbimg.sh 脚本,命令详情请见《容器OS镜像制作指导》 +制作容器OS 虚拟机镜像使用 kbimg,命令详情请见《[容器OS镜像制作指导](./容器OS镜像制作指导.md)》。 制作容器OS 虚拟机镜像的步骤如下: @@ -172,32 +158,30 @@     cd /opt/kubeOS/scripts     ``` -2. 执行 kbming.sh 制作容器OS,参考命令如下: +2. 
执行 kbimg 制作容器OS,参考命令如下:     ```shell -   bash kbimg.sh create vm-image -p xxx.repo -v v1 -b ../bin/os-agent -e '''$1$xyz$RdLyKTL32WEvK3lg8CXID0''' +   ./kbimg create -f ./kbimg.toml vm-img     ``` -   其中 xx.repo 为制作镜像所需要的 yum 源,yum 源建议配置为 openEuler 具体版本的 everything 仓库和 EPOL 仓库。 -   容器 OS 镜像制作完成后,会在 /opt/kubeOS/scripts 目录下生成: +   容器 OS 镜像制作完成后,会在 /opt/kubeOS/scripts/scripts-auto 目录下生成: -   - raw格式的系统镜像system.img,system.img大小默认为20G,支持的根文件系统分区大小<2020MiB,持久化分区<16GB。 -   - qcow2 格式的系统镜像 system.qcow2。 -   - 可用于升级的根文件系统分区镜像 update.img 。 +   - raw格式的系统镜像system.img,system.img大小默认为20G,支持的根文件系统分区大小<2560MiB,持久化分区<15GB。 +   - qcow2 格式的系统镜像 system.qcow2。 +   - 可用于升级的根文件系统 kubeos.tar。    制作出来的容器 OS 虚拟机镜像目前只能用于 CPU 架构为 x86 和 AArch64 的虚拟机场景,不支持 x86 架构的虚拟机使用 legacy 启动模式启动。 - ### 部署CRD,operator和proxy #### 注意事项 -* 请先部署 Kubernetes 集群,部署方法参考《openEuler 22.09 Kubernetes 集群部署指南》 +* 请先部署 Kubernetes 集群,部署方法参考[《openEuler 24.03-LTS-SP1 Kubernetes 集群部署指南》](../Kubernetes/Kubernetes.md)。 -- 集群中准备进行升级的 Worker 节点的 OS 需要为使用上一节方式制作出来的容器 OS,如不是,请用 system.qcow2重新部署虚拟机,虚拟机部署请见《openEuler 22.09 虚拟化用户指南》,Master节点目前不支持容器 OS 升级,请用openEuler 22.09部署Master节点 +- 集群中准备进行升级的 Worker 节点的 OS 需要为使用上一节方式制作出来的容器 OS,如不是,请用 system.qcow2重新部署虚拟机,虚拟机部署请见[《openEuler 24.03-LTS-SP1 虚拟化用户指南》](../Virtualization/virtualization.md),Master节点目前不支持容器 OS 升级,请用openEuler 24.03-LTS-SP1部署Master节点。 - 部署 OS 的 CRD(CustomResourceDefinition),os-operator,os-proxy 以及 RBAC (Role-based access control) 机制的 YAML 需要用户自行编写。 -- operator 和 proxy 部署在 kubernetes 集群中,operator 应部署为 deployment,proxy 应部署为damonset -- 尽量部署好 kubernetes 的安全措施,如 rbac 机制,pod 的 service account 和 security policy 配置等 +- operator 和 proxy 部署在 kubernetes 集群中,operator 应部署为 deployment,proxy 应部署为daemonset。 +- 尽量部署好 kubernetes 的安全措施,如 rbac 机制,pod 的 service account 和 security policy 配置等。 #### 操作步骤 @@ -216,9 +200,3 @@    ```shell    kubectl get pods -A    ``` - - - - - - diff --git "a/docs/zh/docs/KubeOS/\345\256\271\345\231\250OS\351\225\234\345\203\217\345\210\266\344\275\234\346\214\207\345\257\274.md" 
"b/docs/zh/docs/KubeOS/\345\256\271\345\231\250OS\351\225\234\345\203\217\345\210\266\344\275\234\346\214\207\345\257\274.md" index 3409b87389beacaf911bb122a9955af52a043e7f..332404fef063e87d16823181213fd80346b1dc48 100644 --- "a/docs/zh/docs/KubeOS/\345\256\271\345\231\250OS\351\225\234\345\203\217\345\210\266\344\275\234\346\214\207\345\257\274.md" +++ "b/docs/zh/docs/KubeOS/\345\256\271\345\231\250OS\351\225\234\345\203\217\345\210\266\344\275\234\346\214\207\345\257\274.md" @@ -1,162 +1,534 @@ -# 容器OS镜像制作指导# - -## 简介 ## +# 容器OS镜像制作指导 + +- [容器OS镜像制作指导](#容器os镜像制作指导) + - [简介](#简介) + - [命令介绍](#命令介绍) + - [命令格式](#命令格式) + - [配置文件说明](#配置文件说明) + - [from\_repo](#from_repo) + - [admin\_container](#admin_container) + - [pxe\_config](#pxe_config) + - [users](#users) + - [copy\_files](#copy_files) + - [grub](#grub) + - [systemd\_service](#systemd_service) + - [chroot\_script](#chroot_script) + - [disk\_partition](#disk_partition) + - [persist\_mkdir](#persist_mkdir) + - [dm\_verity](#dm_verity) + - [使用说明](#使用说明) + - [注意事项](#注意事项) + - [KubeOS OCI 镜像制作](#kubeos-oci-镜像制作) + - [注意事项](#注意事项-1) + - [使用示例](#使用示例) + - [KubeOS 虚拟机镜像制作](#kubeos-虚拟机镜像制作) + - [注意事项](#注意事项-2) + - [使用示例](#使用示例-1) + - [KubeOS 物理机安装所需镜像及文件制作](#kubeos-物理机安装所需镜像及文件制作) + - [注意事项](#注意事项-3) + - [使用示例](#使用示例-2) + - [附录](#附录) + - [异常退出清理方法](#异常退出清理方法) + - [详细toml配置文件示例](#详细toml配置文件示例) + +## 简介 kbimg是KubeOS部署和升级所需的镜像制作工具,可以使用kbimg制作KubeOS docker,虚拟机和物理机镜像。 -## 命令介绍 ## +## 命令介绍 -### 命令格式 ### +### 命令格式 -**bash kbimg.sh** \[ --help | -h \] create \[ COMMANDS \] \[ OPTIONS \] +kbimg - CLI tool for generating various types of image for KubeOS -### 参数说明 ### +```text +Usage: kbimg [OPTIONS] -* COMMANDS +Commands: + create Create a new KubeOS image + help Print this message or the help of the given subcommand(s) - | 参数 | 描述 | - | ------------- | ---------------------------------------------- | - | upgrade-image | 生成用于安装和升级的docker镜像格式的 KubeOS 镜像 | - | vm-image | 生成用于部署和升级的虚拟机镜像 | - | pxe-image | 生成物理机安装所需的镜像及文件 | +Options: + 
-d, --debug Enable debug mode, generate the scripts without execution + -h, --help Print help + -V, --version Print version +``` - +kbimg-create - Create a new KubeOS image -* OPTIONS +```text +Usage: kbimg create --file - | 参数 | 描述 | - | ------------ | ------------------------------------------------------------ | - | -p | repo 文件的路径,repo 文件中配置制作镜像所需要的 yum 源 | - | -v | 制作出来的KubeOS镜像的版本 | - | -b | os-agent二进制的路径 | - | -e | KubeOS 镜像 root 用户密码,加密后的带盐值的密码,可以用 openssl,kiwi 命令生成 | - | -d | 生成或者使用的 docke r镜像 | - | -h --help | 查看帮助信息 | +Arguments: + [possible values: vm-img, pxe-img, upgrade-img, admin-container] - +Options: + -f, --file Path to the toml configuration file + -h, --help Print help +``` -## 使用说明 ## +### 配置文件说明 -#### 注意事项 ### +#### from_repo -* kbimg.sh 执行需要 root 权限 -* 当前仅支持 x86和 AArch64 架构使用 -* 容器 OS 镜像制作工具的 rpm 包源为 openEuler 具体版本的 everything 仓库和 EPOL 仓库。制作镜像时提供的 repo 文件中,yum 源建议同时配置 openEuler 具体版本的 everything 仓库和 EPOL 仓库 +从 repo 创建升级容器镜像、虚拟机镜像或PXE物理机镜像 -### KubeOS docker镜像制作 ### + | 参数 | 描述 | + | --- | --- | + | agent_path | os-agent 二进制的路径 | + | legacy_bios | 目前仅支持设置为`false`,即UEFI引导 | + | repo_path | repo 文件的路径,repo 文件中配置制作镜像所需要的 yum 源 | + | root_passwd | root 用户密码,与/etc/shadow文件内密码格式一致,可使用`openssl passwd -6 -salt $(head -c18 /dev/urandom \| openssl base64)`命令生成 | + | version | KubeOS 镜像的版本,将写入/etc/os-release文件内作为OS标识 | + | rpmlist | 期望安装进镜像内的rpm包列表 | + | upgrade_img | [可选项]指定生成的升级容器镜像的镜像名(制作升级容器镜像必需) | -#### 注意事项 #### +#### admin_container -* 制作的 docker 镜像仅用于后续的虚拟机/物理机镜像制作或升级使用,不支持启动容器 -* 使用默认 rpmlist 进行容器OS镜像制作时所需磁盘空间至少为6G,如自已定义 rpmlist 可能会超过6G +制作admin运维容器 -#### 使用示例 #### -* 如需进行DNS配置,请先在```scripts```目录下自定义```resolv.conf```文件 -```shell - cd /opt/kubeOS/scripts - touch resolv.conf - vim resolv.conf -``` -* 制作KubeOS容器镜像 -``` shell -cd /opt/kubeOS/scripts -bash kbimg.sh create upgrade-image -p xxx.repo -v v1 -b ../bin/os-agent -e '''$1$xyz$RdLyKTL32WEvK3lg8CXID0''' -d your_imageRepository/imageName:version -``` + | 参数 | 描述 | + | --- | --- | + | hostshell 
| hostshell二进制路径,可在项目根目录下通过`make hostshell`编译 | + | img_name | 指定生成的容器镜像名 | + +#### pxe_config + +在制作PXE物理机镜像时,配置该参数用于PXE安装。制作PXE物理机镜像时必需。 + + | 参数 | 描述 | + | --- | --- | + | server_ip | 用于下载根文件系统 tar 包的 HTTP 服务器地址 | + | rootfs_name | 放置于 HTTP 服务器的文件系统 tar 包名称 | + | disk | 安装 KubeOS 系统的目标磁盘名 | + | route_ip | 配置目标机器网卡的路由 IP | + | dhcp | [可选项] 是否启用 DHCP 模式配置网络,默认为 false | + | local_ip | [可选项] 配置目标机器网卡的 IP,dhcp 为 false 时必需 | + | net_name | [可选项] 配置目标机器网卡名,dhcp 为 false 时必需 | + | netmask | [可选项] 配置目标机器网卡的子网掩码,dhcp 为 false 时必需 | + +**注意**:`pxe_config`下的配置参数无法进行校验,需要用户自行确认其正确性。 + +#### users + +[可选项] 添加用户 + + | 参数 | 描述 | + | --- | --- | + | name | 用户名 | + | passwd | 密码 | + | primary_groups | [可选项] 用户主组(默认为用户同名组) | + | groups | [可选项] 用户附加组 | + +**注意**:添加用户会默认创建用户同名组,配置用户附加组时,若组不存在会报错失败。若有特殊配置需求,用户可通过[chroot_script](#chroot_script)脚本自行实现。 + +#### copy_files + +[可选项] 拷贝文件到rootfs内指定目录 + + | 参数 | 描述 | + | --- | --- | + | dst | 目标路径 | + | src | 源文件路径 | + | create_dir | [可选项]拷贝前创建文件夹 | + +**注意**:拷贝文件无法保留权限,如果需要特殊权限,可借助[chroot_script](#chroot_script)脚本自行实现。 + +#### grub + +[可选项] grub配置,配置dm-verity时必需 + + | 参数 | 描述 | + | --- | --- | + | passwd | grub 明文密码 | + +#### systemd_service + +[可选项] 配置 systemd 服务开机自启 + + | 参数 | 描述 | + | --- | --- | + | name | systemd 服务名 | + +#### chroot_script + +[可选项] 自定义 chroot 脚本 + + | 参数 | 描述 | + | --- | --- | + | path | 脚本路径 | + | rm | [可选项]执行完毕后是否删除该脚本,配置`true`删除,`false`或空保留 | + +#### disk_partition + +[可选项] 自定义分区大小和镜像大小 -* 制作完成后查看制作出来的KubeOS容器镜像 + | 参数 | 描述 | + | --- | --- | + | root | root分区大小, 单位为MiB,默认2560MiB | + | img_size | [可选项]镜像大小,单位为GB,默认20GB | -``` shell -docker images +#### persist_mkdir + +[可选项] persist 分区新建目录 + + | 参数 | 描述 | + | --- | --- | + | name | 目录名 | + +#### dm_verity + +[可选项] 制作启用dm-verity功能的虚拟机或升级镜像 + + | 参数 | 描述 | + | --- | --- | + | efi_key | efi明文口令 | + | grub_key | grub明文口令 | + | keys_dir |[可选项]可指定密钥文件夹,复用先前制作镜像创建的密钥 | + +## 使用说明 + +### 注意事项 + +* kbimg 执行需要 root 权限。 +* 当前仅支持 x86和 AArch64 架构使用。 +* 
不支持并发执行。如果使用脚本`&`连续执行可能会出现异常情况。制作过程中碰到异常掉电或中断后无法清理环境时,可参考[异常退出清理方法](#异常退出清理方法)清理后重新制作。 +* 容器 OS 镜像制作工具的 rpm 包源为 openEuler 具体版本的 everything 仓库和 EPOL 仓库。制作镜像时提供的 repo 文件中,yum 源建议同时配置 openEuler 具体版本的 everything 仓库和 EPOL 仓库。 +* dm-verity使用说明: + * 仅支持虚拟机场景,暂不支持物理机环境。 + * 不支持通过 HTTP/HTTPS 服务器下载升级镜像进行系统升级。仅支持从容器镜像仓库下载升级镜像进行升级。 + * 启动虚拟机时,必须配置使用 virtio 类型设备。 + * 启用dm-verity功能的升级容器镜像不可用于升级未开启dm-verity的容器OS。同理,未启动dm-verity功能的升级容器镜像不可用于升级开启dm-verity功能的容器OS。在集群内,部分节点开启dm-verity功能,部分未开启,需要用户控制下发对应的升级镜像。 + * 制作升级容器镜像和虚拟机镜像时,推荐使用相同的密钥(配置`keys_dir`为先前制作镜像时创建的密钥文件路径。配置`efi_key`或`grub_key`一致不能保证密钥文件是一模一样的)。若密钥不一致,在切换备用分区时可能导致证书校验失败,从而无法启动系统。出现证书校验失败问题时,需要重新导入备用分区证书进行修复。 + +### KubeOS OCI 镜像制作 + +#### 注意事项 + +* 制作出的 OCI 镜像仅用于后续的虚拟机/物理机镜像升级使用,不支持启动容器。 +* 使用默认 rpmlist 进行容器OS镜像制作时所需磁盘空间至少为6G,若使用自定义 rpmlist 可能会超过6G。 + +#### 使用示例 + +* 配置文件示例 + +```toml +[from_repo] +agent_path = "./bin/rust/release/os-agent" +legacy_bios = false +repo_path = "/etc/yum.repos.d/openEuler.repo" +root_passwd = "$1$xyz$RdLyKTL32WEvK3lg8CXID0" # default passwd: openEuler12#$ +rpmlist = [ + "NetworkManager", + "cloud-init", + "conntrack-tools", + "containerd", + "containernetworking-plugins", + "cri-tools", + "dhcp", + "ebtables", + "ethtool", + "iptables", + "kernel", + "kubernetes-kubeadm", + "kubernetes-kubelet", + "openssh-server", + "passwd", + "rsyslog", + "socat", + "tar", + "vi", +] +upgrade_img = "kubeos-upgrade:v1" +version = "v1" ``` -### KubeOS 虚拟机镜像制作 ### +* 结果说明 + * 制作完成后,通过`docker images`查看制作出来的KubeOS容器镜像 + * update-boot.img/update-root.img/update-hash.img: 仅在dm-verity模式下生成,可忽略。 + +### KubeOS 虚拟机镜像制作 + +#### 注意事项 + +* 制作出来的容器 OS 虚拟机镜像目前只能用于 CPU 架构为 x86 和 AArch64 的虚拟机。 +* 容器 OS 目前不支持 x86 架构的虚拟机使用 legacy 启动模式启动。 +* 默认root密码为openEuler12#$ +* 使用默认rpmlist进行容器OS镜像制作时所需磁盘空间至少为25G,若使用自定义rpmlist可能会超过25G。 + +#### 使用示例 + +* 配置文件示例 + +```toml +[from_repo] +agent_path = "./bin/rust/release/os-agent" +legacy_bios = false +repo_path = "/etc/yum.repos.d/openEuler.repo" +root_passwd = 
"$1$xyz$RdLyKTL32WEvK3lg8CXID0" # default passwd: openEuler12#$ +rpmlist = [ + "NetworkManager", + "cloud-init", + "conntrack-tools", + "containerd", + "containernetworking-plugins", + "cri-tools", + "dhcp", + "ebtables", + "ethtool", + "iptables", + "kernel", + "kubernetes-kubeadm", + "kubernetes-kubelet", + "openssh-server", + "passwd", + "rsyslog", + "socat", + "tar", + "vi", +] +version = "v1" +``` -#### 注意事项 #### +* 结果说明 +容器 OS 镜像制作完成后,会在 ./scripts-auto 目录下生成 + * system.qcow2: 用于启动虚拟机的qcow2 格式的系统镜像,大小默认为 20GiB,支持的根文件系统分区大小 < 2560 MiB,持久化分区 < 15GB 。 + * system.img: 用于启动虚拟机的img 格式的系统镜像,大小默认为 20GiB,支持的根文件系统分区大小 < 2560 MiB,持久化分区 < 15GB 。 + * kubeos.tar: 用于升级的根文件系统tar包。 + * update-boot.img/update-root.img/update-hash.img: 仅在dm-verity模式下生成,可忽略。 -* 如使用 docker 镜像制作请先拉取相应镜像或者先制作docker镜像,并保证 docker 镜像的安全性 -* 制作出来的容器 OS 虚拟机镜像目前只能用于 CPU 架构为 x86 和 AArch64 的虚拟机 -* 容器 OS 目前不支持 x86 架构的虚拟机使用 legacy 启动模式启动 -* 使用默认rpmlist进行容器OS镜像制作时所需磁盘空间至少为25G,如自已定义rpmlist可能会超过25G +### KubeOS 物理机安装所需镜像及文件制作 -#### 使用示例 #### +#### 注意事项 -* 使用repo源制作 - * 如需进行DNS配置,请先在```scripts```目录下自定义```resolv.conf```文件 - ```shell - cd /opt/kubeOS/scripts - touch resolv.conf - vim resolv.conf - ``` - * KubeOS虚拟机镜像制作 - ``` shell - cd /opt/kubeOS/scripts - bash kbimg.sh create vm-image -p xxx.repo -v v1 -b ../bin/os-agent -e '''$1$xyz$RdLyKTL32WEvK3lg8CXID0''' - ``` +* 制作出来的容器 OS 物理安装所需的镜像目前只能用于 CPU 架构为 x86 和 AArch64 的物理机安装。 +* `pxe_config`配置中指定的ip为安装时使用的临时ip,请在系统安装启动后请参考[《openEuler 24.03-LTS-SP1 管理员指南-配置网络》](../Administration/配置网络.md)进行网络配置。 +* 不支持多个磁盘都安装KubeOS,可能会造成启动失败或挂载紊乱。 +* 容器OS 目前不支持 x86 架构的物理机使用 legacy 启动模式启动。 +* 使用默认rpmlist进行镜像制作时所需磁盘空间至少为5G,如自已定义 rpmlist 可能会超过5G。 +* PXE物理机镜像制作不支持dm-verity功能 +* 在 PXE 安装阶段,需要从 HTTP 服务器的根目录下载根分区 tar 包(tar包名称为toml配置文件中配置的名称)。请确保机器拥有足够的内存空间以存储根分区 tar 包及临时中间文件。 -* 使用docker镜像制作 +#### 使用示例 - ``` shell - cd /opt/kubeOS/scripts - bash kbimg.sh create vm-image -d your_imageRepository/imageName:version - ``` -* 结果说明 - 容器 OS 镜像制作完成后,会在 /opt/kubeOS/scripts 目录下生成: - * system.qcow2: 
qcow2 格式的系统镜像,大小默认为 20GiB,支持的根文件系统分区大小 < 2020 MiB,持久化分区 < 16GiB 。 - * update.img: 用于升级的根文件系统分区镜像 - - -### KubeOS 物理机安装所需镜像及文件制作 ### - -#### 注意事项 #### - -* 如使用 docker 镜像制作请先拉取相应镜像或者先制作 docker 镜像,并保证 docker 镜像的安全性 -* 制作出来的容器 OS 物理安装所需的镜像目前只能用于 CPU 架构为 x86 和 AArch64 的物理机安装 -* Global.cfg配置中指定的ip为安装时使用的临时ip,请在系统安装启动后请参考《openEuler 22.09 管理员指南-配置网络》进行网络配置 -* 不支持多个磁盘都安装KubeOS,可能会造成启动失败或挂载紊乱 -* 容器OS 目前不支持 x86 架构的物理机使用 legacy 启动模式启动 -* 使用默认rpmlist进行镜像制作时所需磁盘空间至少为5G,如自已定义 rpmlist 可能会超过5G -#### 使用示例 #### - -* 首先需要修改```00bootup/Global.cfg```的配置,对相关参数进行配置,参数均为必填,ip目前仅支持ipv4,配置示例如下 - - ```shell +* 首先需要修改```kbimg.toml```中```pxe_config```的配置,对相关参数进行配置,详细参数可见[参数说明](#pxe_config),ip目前仅支持ipv4,配置示例如下 + + ```toml + [pxe_config] + dhcp = false # rootfs file name - rootfs_name=kubeos.tar + rootfs_name = "kubeos.tar" # select the target disk to install kubeOS - disk=/dev/sda + disk = "/dev/vda" # pxe server ip address where stores the rootfs on the http server - server_ip=192.168.1.50 - # target machine temporary ip - local_ip=192.168.1.100 - # target machine temporary route - route_ip=192.168.1.1 - # target machine temporary netmask - netmask=255.255.255.0 + server_ip = "192.168.122.50" + # target machine ip + local_ip = "192.168.122.100" + # target machine route + route_ip = "192.168.122.1" + # target machine netmask + netmask = "255.255.255.0" # target machine netDevice name - net_name=eth0 + net_name = "eth0" ``` -* 使用 repo 源制作 - * 如需进行DNS配置,请在```scripts```目录下自定义```resolv.conf```文件 - ```shell - cd /opt/kubeOS/scripts - touch resolv.conf - vim resolv.conf - ``` - * KubeOS物理机安装所需镜像制作 - ``` - cd /opt/kubeOS/scripts - bash kbimg.sh create pxe-image -p xxx.repo -v v1 -b ../bin/os-agent -e '''$1$xyz$RdLyKTL32WEvK3lg8CXID0''' +* 如需进行DNS配置,请先自定义```resolv.conf```文件,并启用```copy_files```字段将配置文件拷贝到```/etc```目录 + + ```toml + [[copy_files]] + dst = "/etc" + src = "" ``` -* 使用 docker 镜像制作 - ``` shell - cd /opt/kubeOS/scripts - bash kbimg.sh create pxe-image -d your_imageRepository/imageName:version +* 
KubeOS物理机安装所需镜像制作,及pxe_config配置全示例 + + ```toml + [from_repo] + agent_path = "./bin/rust/release/os-agent" + legacy_bios = false + repo_path = "/etc/yum.repos.d/openEuler.repo" + root_passwd = "$1$xyz$RdLyKTL32WEvK3lg8CXID0" # default passwd: openEuler12#$ + rpmlist = [ + "NetworkManager", + "cloud-init", + "conntrack-tools", + "containerd", + "containernetworking-plugins", + "cri-tools", + "dhcp", + "ebtables", + "ethtool", + "iptables", + "kernel", + "kubernetes-kubeadm", + "kubernetes-kubelet", + "openssh-server", + "passwd", + "rsyslog", + "socat", + "tar", + "vi", + "coreutils", + "dosfstools", + "dracut", + "gawk", + "hwinfo", + "net-tools", + "parted", + ] + version = "v1" + + [pxe_config] + dhcp = true + rootfs_name = "kubeos.tar" + disk = "/dev/vda" + server_ip = "192.168.122.50" + route_ip = "192.168.122.1" + #local_ip = "192.168.1.100" + #netmask = "255.255.255.0" + #net_name = "eth0" ``` * 结果说明 + * initramfs.img: 用于pxe启动用的 initramfs 镜像 + * kubeos.tar: pxe安装所用的根分区文件系统 + +## 附录 + +### 异常退出清理方法 + +1. 若在使用`kbimg`制作镜像过程中,异常退出,无法清理环境,可使用如下方法进行清理: + +```bash +function unmount_dir() { + local dir=$1 + if [ -L "${dir}" ] || [ -f "${dir}" ]; then + echo "${dir} is not a directory, please check it." + return 1 + fi + if [ ! -d "${dir}" ]; then + return 0 + fi + local real_dir=$(readlink -e "${dir}") + local mnts=$(awk '{print $2}' < /proc/mounts | grep "^${real_dir}" | sort -r) + for m in ${mnts}; do + echo "Unmount ${m}" + umount -f "${m}" || true + done + return 0 +} +ls -l ./scripts-auto/test.lock && rm -rf ./scripts-auto/test.lock +unmount_dir ./scripts-auto/rootfs/proc +unmount_dir ./scripts-auto/rootfs/sys +unmount_dir ./scripts-auto/rootfs/dev/pts +unmount_dir ./scripts-auto/rootfs/dev +unmount_dir ./scripts-auto/mnt/boot/grub2 +unmount_dir ./scripts-auto/mnt +rm -rf ./scripts-auto/rootfs ./scripts-auto/mnt +``` - * initramfs.img: 用于pxe启动用的 initramfs 镜像 - * kubeos.tar: pxe安装所用的 OS +2. 
如果执行以上命令仍然无法删除目录,可尝试先调用如下命令,再重新执行第一步的命令。 +```bash +fuser -kvm ./scripts-auto/rootfs +fuser -kvm ./scripts-auto/mnt +``` + +### 详细toml配置文件示例 + +请根据需求和[配置文件说明](#配置文件说明),修改如下示例配置文件,生成所需镜像。 + +```toml +[from_repo] +agent_path = "./bin/rust/release/os-agent" +legacy_bios = false +repo_path = "/etc/yum.repos.d/openEuler.repo" +root_passwd = "$1$xyz$RdLyKTL32WEvK3lg8CXID0" # default passwd: openEuler12#$, use "openssl passwd -6 -salt $(head -c18 /dev/urandom | openssl base64)" to generate your passwd +rpmlist = [ + "NetworkManager", + "cloud-init", + "conntrack-tools", + "containerd", + "containernetworking-plugins", + "cri-tools", + "dhcp", + "ebtables", + "ethtool", + "iptables", + "kernel", + "kubernetes-kubeadm", + "kubernetes-kubelet", + "openssh-server", + "passwd", + "rsyslog", + "socat", + "tar", + "vi", + # Below packages are required for pxe-image. Uncomment them if you want to generate pxe-image. + # "coreutils", + # "dosfstools", + # "dracut", + # "gawk", + # "hwinfo", + # "net-tools", + # "parted", +] +upgrade_img = "kubeos-upgrade:v1" +version = "v1" + +# [admin_container] +# img_name = "kubeos-admin-container:v1" +# hostshell = "./bin/hostshell" + +# [pxe_config] +# dhcp = false +# disk = "/dev/vda" +# local_ip = "192.168.1.100" +# net_name = "eth0" +# netmask = "255.255.255.0" +# rootfs_name = "kubeos.tar" +# route_ip = "192.168.1.1" +# server_ip = "192.168.1.50" + +# [[users]] +# groups = ["admin", "wheel"] +# name = "foo" +# passwd = "foo" +# primary_group = "foo" + +# [[users]] +# groups = ["example"] +# name = "bar" +# passwd = "bar" + +# [[copy_files]] +# create_dir = "/root/test" +# dst = "/root/test/foo.txt" +# src = "/root/KubeOS/foo.txt" + +# [[copy_files]] +# dst = "/etc/bar.txt" +# src = "../bar.txt" + +# [grub] +# passwd = "foo" + +# [systemd_service] +# name = ["containerd", "kubelet"] + +# [chroot_script] +# path = "./my_chroot.sh" +# rm = true + +# [disk_partition] +# img_size = 30 # GB +# root = 3000 # MiB + +# [persist_mkdir] +# name = 
["bar", "foo"] + +# [dm_verity] +# efi_key = "foo" +# grub_key = "bar" +# keys_dir = "./keys" +``` diff --git a/docs/zh/docs/Kubernetes/Kubernetes.md b/docs/zh/docs/Kubernetes/Kubernetes.md index 0ac8aaeb8010db54b03c7d1551bcefad601fc788..b8ac396432eed21aa43b6048fa9d6280a5ce1e0e 100644 --- a/docs/zh/docs/Kubernetes/Kubernetes.md +++ b/docs/zh/docs/Kubernetes/Kubernetes.md @@ -8,6 +8,5 @@ 本文所使用的集群状态如下: -- 集群结构:6 个 `openEuler 21.09`系统的虚拟机,3 个 master 和 3 个 node 节点 -- 物理机:`openEuler 21.09 `的 `x86/ARM`服务器 - +- 集群结构:6 个 openEuler 系统的虚拟机,3 个 master 和 3 个 node 节点 +- 物理机:openEuler 的 `x86/ARM`服务器 diff --git "a/docs/zh/docs/Kubernetes/Kubernetes\351\233\206\347\276\244\351\203\250\347\275\262\346\214\207\345\215\227 - containerd.md" "b/docs/zh/docs/Kubernetes/Kubernetes\351\233\206\347\276\244\351\203\250\347\275\262\346\214\207\345\215\227 - containerd.md" new file mode 100644 index 0000000000000000000000000000000000000000..18f89355956c264ca8e54eecd9c1f6232e2fee53 --- /dev/null +++ "b/docs/zh/docs/Kubernetes/Kubernetes\351\233\206\347\276\244\351\203\250\347\275\262\346\214\207\345\215\227 - containerd.md" @@ -0,0 +1,244 @@ +# Kubernetes集群部署指南 - containerd +Kubernetes自1.21版本开始不再支持Kubernetes+docker部署Kubernetes集群,本文介绍以containerd作为容器运行时快速搭建Kubernetes集群。若需要对集群进行个性化配置,请查阅[官方文档](https://kubernetes.io/zh-cn/docs/home/) 。 +## 软件包安装 +### 1. 安装必要软件包 +``` +$ yum install -y containerd +$ yum install -y kubernetes* +$ yum install -y cri-tools +``` +> ![](./public_sys-resources/icon-note.gif)**说明** +> +> - 如果系统中已经安装了Docker,请确保在安装containerd之前卸载Docker,否则可能会引发冲突。 + +要求使用1.6.22-15或更高版本的containerd,如果下载的版本过低请运行以下命令升级成1.6.22-15版本,或自行升级。 +``` +$ wget --no-check-certificate https://repo.openeuler.org/openEuler-24.03-LTS/update/x86_64/Packages/containerd-1.6.22-15.oe2403.x86_64.rpm +$ rpm -Uvh containerd-1.6.22-15.oe2403.x86_64.rpm +``` +本教程中通过yum下载的软件包版本如下所示: +``` +1. containerd + -架构:x86_64 + -版本:1.6.22-15 +2. kubernetes - client/help/kubeadm/kubelet/master/node + -架构:x86_64 + -版本:1.29.1-4 +3. 
cri-tools + -架构:X86_64 + -版本:1.29.0-3 +``` + +### 2. 下载cni组件 + +``` +$ mkdir -p /opt/cni/bin +$ cd /opt/cni/bin +$ wget --no-check-certificate https://github.com/containernetworking/plugins/releases/download/v1.5.1/cni-plugins-linux-amd64-v1.5.1.tgz +$ tar -xzvf ./cni-plugins-linux-amd64-v1.5.1.tgz -C . +``` +> ![](./public_sys-resources/icon-note.gif)**说明** +> +> - 这里提供的是AMD64架构版本的下载链接,请根据系统架构选择合适的版本,其他版本可从[github仓库](https://github.com/containernetworking/plugins/releases/)获取。 + +### 3. 下载CNI插件(Flannel) +``` +$ wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml --no-check-certificate +``` +## 环境配置 +本节对Kubernetes运行时所需的操作系统环境进行配置。 +### 1. 设置主机名 + +``` +$ hostnamectl set-hostname nodeName +``` +### 2. 配置防火墙 +**方法一:** + +配置防火墙规则以开放etcd和API Server的端口,确保控制平面和工作节点之间的正常通信。 +开放etcd的端口: +``` +$ firewall-cmd --zone=public --add-port=2379/tcp --permanent +$ firewall-cmd --zone=public --add-port=2380/tcp --permanent +``` +开放API Server的端口: +``` +$ firewall-cmd --zone=public --add-port=6443/tcp --permanent +``` +使防火墙规则生效: + +``` +$ firewall-cmd --reload +``` +> ![](./public_sys-resources/icon-note.gif)**说明** +> +> - 防火墙配置可能会导致某些容器镜像无法正常使用。为了确保其顺利运行,需要根据所使用的镜像开放相应的端口。 + +**方法二:** + +使用以下命令禁用防火墙: + +``` +$ systemctl stop firewalld +$ systemctl disable firewalld +``` +### 3. 禁用SELinux +SELinux的安全策略可能会阻止容器内的某些操作,比如写入特定目录、访问网络资源、或执行具有特权的操作。这会导致 CoreDNS 等关键服务无法正常运行,并表现为CrashLoopBackOff或 Error状态。可以使用以下命令来禁用SELinux: +``` +$ setenforce 0 +$ sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config +``` +### 4. 禁用swap +Kubernetes的资源调度器根据节点的可用内存和CPU资源来决定将哪些Pod分配到哪些节点上。如果节点上启用了swap,实际可用的物理内存和逻辑上可用的内存可能不一致,这会影响调度器的决策,导致某些节点出现过载,或者在某些情况下调度错误。因此需要禁用swap: +``` +$ swapoff -a +$ sed -ri 's/.*swap.*/#&/' /etc/fstab +``` +### 5. 
网络配置
+启用桥接网络上的IPv6和IPv4流量通过iptables进行过滤,并开启IP转发,允许内核转发IPv4报文,确保跨节点的Pod间通信:
+
+```
+$ cat > /etc/sysctl.d/k8s.conf << EOF
+net.bridge.bridge-nf-call-ip6tables = 1
+net.bridge.bridge-nf-call-iptables = 1
+net.ipv4.ip_forward = 1
+vm.swappiness=0
+EOF
+$ modprobe br_netfilter
+$ sysctl -p /etc/sysctl.d/k8s.conf
+```
+## 配置containerd
+本节对containerd进行配置,包括设置pause_image、cgroup驱动、关闭"registry.k8s.io"镜像源证书验证、配置代理。
+
+首先,生成containerd的默认配置文件并将其输出到containerd_conf指定的文件:
+
+```
+$ containerd_conf="/etc/containerd/config.toml"
+$ mkdir -p /etc/containerd
+$ containerd config default > "${containerd_conf}"
+```
+配置pause_image:
+```
+$ pause_img=$(kubeadm config images list | grep pause | tail -1)
+$ sed -i "/sandbox_image/s#\".*\"#\"${pause_img}\"#" "${containerd_conf}"
+```
+将cgroup驱动指定为systemd:
+```
+$ sed -i "/SystemdCgroup/s/=.*/= true/" "${containerd_conf}"
+```
+关闭"registry.k8s.io"镜像源证书验证:
+```
+$ sed -i '/plugins."io.containerd.grpc.v1.cri".registry.configs/a\[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.k8s.io".tls]\n insecure_skip_verify = true' /etc/containerd/config.toml
+```
+配置代理(将HTTP_PROXY、HTTPS_PROXY、NO_PROXY中的"***"替换为自己的代理信息):
+```
+$ server_path="/etc/systemd/system/containerd.service.d"
+$ mkdir -p "${server_path}"
+$ cat > "${server_path}"/http-proxy.conf << EOF
+[Service]
+Environment="HTTP_PROXY=***"
+Environment="HTTPS_PROXY=***"
+Environment="NO_PROXY=***"
+EOF
+```
+重启containerd,使得以上配置生效:
+```
+$ systemctl daemon-reload
+$ systemctl restart containerd
+```
+## 配置crictl使用containerd作为容器运行时
+```
+$ crictl config runtime-endpoint unix:///run/containerd/containerd.sock
+$ crictl config image-endpoint unix:///run/containerd/containerd.sock
+```
+## 配置kubelet使用systemd作为cgroup驱动
+
+```
+$ systemctl enable kubelet.service
+$ echo 'KUBELET_EXTRA_ARGS="--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"' >> /etc/sysconfig/kubelet
+$ systemctl restart kubelet
+```
+## 使用Kubeadm创建集群(仅控制平面需要)
+### 1. 
配置集群信息 +``` +$ kubeadm config print init-defaults --component-configs KubeletConfiguration >> kubeletConfig.yaml +$ vim kubeletConfig.yaml +``` +在kubeletConfig.yaml文件中,配置节点名称、广播地址(advertiseAddress)以及Pod网络的CIDR。 +
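这三处配置在 kubeletConfig.yaml 中的大致位置如下(示意片段,省略了无关字段;其中主机名、IP 地址和网段均为示例值,请替换为实际环境信息):

```yaml
# InitConfiguration 部分(示意)
localAPIEndpoint:
  advertiseAddress: 192.168.122.10    # 控制平面的ip地址(示例值)
nodeRegistration:
  name: k8s-master01                  # 与设置的主机名保持一致(示例值)
---
# ClusterConfiguration 部分(示意)
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16            # Pod网络CIDR,flannel默认网段为10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
```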
+**修改name为主机名,与环境配置[第一步](#1-设置主机名)一致:** +
+![](./figures/name.png) +
+**将advertiseAddress修改为控制平面的ip地址:** +
+![](./figures/advertiseAddress.png) +
+**在Networking中添加podSubnet指定CIDR范围:** +
+![](./figures/podSubnet.png) + +### 2. 部署集群 +这里使用kubeadm部署集群,许多配置是默认生成的(如认证证书),如需修改请查阅[官方文档](https://kubernetes.io/zh-cn/docs/home/ )。 + +**关闭代理(如有):** +``` +$ unset http_proxy https_proxy +``` +使用kubeadm init部署集群: + +``` +$ kubeadm init --config kubeletConfig.yaml +``` +指定kubectl使用的配置文件: +``` +$ mkdir -p "$HOME"/.kube +$ cp -i /etc/kubernetes/admin.conf "$HOME"/.kube/config +$ chown "$(id -u)":"$(id -g)" "$HOME"/.kube/config +$ export KUBECONFIG=/etc/kubernetes/admin.conf +``` +### 3. 部署cni插件(flannel) +本教程中使用flannel作为cni插件,以下介绍flannel下载和部署。 +以下使用的flannel从registry-1.docker.io镜像源下载,为避免证书验证失败的问题,请在containerd配置文件(/etc/containerd/config.toml)中配置该镜像源跳过证书验证。 +
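例如,可在该配置文件中追加如下片段(示意;配置节路径以实际生成的 config.toml 为准,写法与前文关闭 registry.k8s.io 证书验证的配置相同):

```toml
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry-1.docker.io".tls]
  insecure_skip_verify = true
```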
+![](./figures/flannelConfig.png) +
使用kubectl apply部署最开始在软件包安装中下载的kube-flannel.yml。
+```
+$ kubectl apply -f kube-flannel.yml
+```
+> ![](./public_sys-resources/icon-note.gif)**说明**
+>
+> 控制平面可能会有污点的问题,导致kubectl get nodes中节点状态无法变成ready,请查阅[官方文档](https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/taint-and-toleration/)去除污点。
+## 加入集群(仅工作节点需要)
+**关闭代理(如有):**
+```
+$ unset http_proxy https_proxy
+```
+工作节点安装并配置完环境后,可以通过以下命令加入集群:
+
+```
+$ kubeadm join <control-plane-ip>:<port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>
+```
+这条命令会在控制平面kubeadm init结束后打印,也可以在控制平面通过以下命令获取:
+
+```
+$ kubeadm token create #生成token
+$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
+ openssl dgst -sha256 -hex | sed 's/^.* //' #获取hash
+```
+
+加入后可以在控制平面通过以下命令查看工作节点的状态:
+
+```
+$ kubectl get nodes
+```
+如果节点状态显示为not ready,可能是因为Flannel插件未成功部署。在这种情况下,请运行本地生成的Flannel可执行文件来完成部署。
+
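上文的 hash 计算管道可以在无集群环境中离线验证:下面的脚本用临时自签名证书代替 /etc/kubernetes/pki/ca.crt,仅用于演示计算过程(示意脚本,证书内容为临时生成):

```shell
# 生成临时自签名证书,代替真实的 ca.crt(仅作演示)
tmpdir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout "${tmpdir}/ca.key" -out "${tmpdir}/ca.crt" \
  -subj "/CN=demo-ca" -days 1 2>/dev/null
# 与上文一致的 hash 计算管道:取证书公钥的 DER 编码并求 sha256
hash=$(openssl x509 -pubkey -in "${tmpdir}/ca.crt" \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:${hash}"
rm -rf "${tmpdir}"
```

在控制平面上,也可以直接执行 `kubeadm token create --print-join-command`,一次性打印出包含 token 和 hash 的完整加入命令。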
+**在工作节点运行kubectl命令(可选):** + +如果需要在工作节点上运行kubectl命令,需要将控制面板的配置文件/etc/kubernetes/admin.conf复制到同样的目录,然后运行以下命令进行配置: + +``` +$ export KUBECONFIG=/etc/kubernetes/admin.conf +``` \ No newline at end of file diff --git "a/docs/zh/docs/Kubernetes/eggo\350\207\252\345\212\250\345\214\226\351\203\250\347\275\262.md" "b/docs/zh/docs/Kubernetes/eggo\350\207\252\345\212\250\345\214\226\351\203\250\347\275\262.md" index 1b13cc16ddb99ccd5d46000287100c1adfe95760..bccbb49b97a8a8d371cf085bb742ec9efb336b60 100644 --- "a/docs/zh/docs/Kubernetes/eggo\350\207\252\345\212\250\345\214\226\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/Kubernetes/eggo\350\207\252\345\212\250\345\214\226\351\203\250\347\275\262.md" @@ -1,18 +1,16 @@ # 自动化部署 -由于手动部署 Kubernetes 集群依赖人工部署各类组件,该方式耗时耗力。尤其是在大规模部署 Kubernetes 集群环境时,面临效率和出错的问题。为了解决该问题,openEuler 自 21.09 版本推出 Kubernetes 集群部署工具,该工具实现了大规模 Kubernetes 的自动化部署、部署流程追踪等功能,并且具备高度的灵活性。 +由于手动部署 Kubernetes 集群依赖人工部署各类组件,该方式耗时耗力。尤其是在大规模部署 Kubernetes 集群环境时,面临效率和出错的问题。为了解决该问题,openEuler 推出 Kubernetes 集群部署工具,该工具实现了大规模 Kubernetes 的自动化部署、部署流程追踪等功能,并且具备高度的灵活性。 这里介绍 Kubernetes 集群自动化部署工具的使用方法。 ## 架构简介 - - ![](./figures/arch.png) 自动化集群部署整体架构如图所示,各模块含义如下: -- GitOps:负责集群配置信息的管理,如更新、创建、删除等; 21.09 版本暂时不提供集群管理集群的功能。 +- GitOps:负责集群配置信息的管理,如更新、创建、删除等; - InitCluster:元集群,作为中心集群管理其他业务集群。 - eggops:自定义 CRD 和 controller 用于抽象 k8s 集群。 - master:k8s 的 master 节点,承载集群的控制面。 @@ -20,4 +18,3 @@ - ClusterA、ClusterB、ClusterC:业务集群,承载用户业务。 如果您对openEuler提供的k8s集群部署工具感兴趣,欢迎访问源码仓:[https://gitee.com/openeuler/eggo](https://gitee.com/openeuler/eggo) - diff --git "a/docs/zh/docs/Kubernetes/eggo\351\203\250\347\275\262\351\233\206\347\276\244.md" "b/docs/zh/docs/Kubernetes/eggo\351\203\250\347\275\262\351\233\206\347\276\244.md" index deac5b80cb22f740bcf7d9999b7fb683b2dfbfdc..4ee2bb86b74e2f25b4babe42d1b0c20f19bd9287 100644 --- "a/docs/zh/docs/Kubernetes/eggo\351\203\250\347\275\262\351\233\206\347\276\244.md" +++ "b/docs/zh/docs/Kubernetes/eggo\351\203\250\347\275\262\351\233\206\347\276\244.md" @@ -192,7 
+192,7 @@ $ eggo template -f template.yaml 或者直接使用命令行方式修改默认配置,参考命令如下: ```shell -$ eggo template -f template.yaml -n k8s-cluster -u username -p password --masters 192.168.0.1 --masters 192.168.0.2 --workers 192.168.0.3 --etcds 192.168.0.4 --loadbalancer 192.168.0.5 +$ eggo template -f template.yaml -n k8s-cluster -u username -p password --masters 192.168.0.1 --masters 192.168.0.2 --workers 192.168.0.3 --etcds 192.168.0.4 --loadbalance 192.168.0.5 ``` ## 安装 Kubernetes 集群 diff --git a/docs/zh/docs/Kubernetes/figures/advertiseAddress.png b/docs/zh/docs/Kubernetes/figures/advertiseAddress.png new file mode 100644 index 0000000000000000000000000000000000000000..b36e5c4664f2d2e5faaa23128fd4711c11e30179 Binary files /dev/null and b/docs/zh/docs/Kubernetes/figures/advertiseAddress.png differ diff --git a/docs/zh/docs/Kubernetes/figures/flannelConfig.png b/docs/zh/docs/Kubernetes/figures/flannelConfig.png new file mode 100644 index 0000000000000000000000000000000000000000..dc9e7c665edd02fad16d3e6f4970e3125efcbef8 Binary files /dev/null and b/docs/zh/docs/Kubernetes/figures/flannelConfig.png differ diff --git a/docs/zh/docs/Kubernetes/figures/name.png b/docs/zh/docs/Kubernetes/figures/name.png new file mode 100644 index 0000000000000000000000000000000000000000..dd6ddfdc3476780e8c896bfd5095025507f62fa8 Binary files /dev/null and b/docs/zh/docs/Kubernetes/figures/name.png differ diff --git a/docs/zh/docs/Kubernetes/figures/podSubnet.png b/docs/zh/docs/Kubernetes/figures/podSubnet.png new file mode 100644 index 0000000000000000000000000000000000000000..b368f77dd7dfd7722dcf7751b3e37ec28755e42d Binary files /dev/null and b/docs/zh/docs/Kubernetes/figures/podSubnet.png differ diff --git "a/docs/zh/docs/Kubernetes/\345\207\206\345\244\207\350\231\232\346\213\237\346\234\272.md" "b/docs/zh/docs/Kubernetes/\345\207\206\345\244\207\350\231\232\346\213\237\346\234\272.md" index 36f314d3d42e884667155966ddf4ef50c8fed40f..fd04c77015537a5dd1eee76bbd7513ed6c964de5 100644 --- 
"a/docs/zh/docs/Kubernetes/\345\207\206\345\244\207\350\231\232\346\213\237\346\234\272.md" +++ "b/docs/zh/docs/Kubernetes/\345\207\206\345\244\207\350\231\232\346\213\237\346\234\272.md" @@ -100,7 +100,7 @@ $ systemctl stop firewalld - + diff --git "a/docs/zh/docs/Kubernetes/\345\256\211\350\243\205Kubernetes\350\275\257\344\273\266\345\214\205.md" "b/docs/zh/docs/Kubernetes/\345\256\211\350\243\205Kubernetes\350\275\257\344\273\266\345\214\205.md" index 98fc9552e3d9d5fe80981ee9bf43d42f2d6cfba4..924795f889b5bf8747baed8d21904f1dda27aba9 100644 --- "a/docs/zh/docs/Kubernetes/\345\256\211\350\243\205Kubernetes\350\275\257\344\273\266\345\214\205.md" +++ "b/docs/zh/docs/Kubernetes/\345\256\211\350\243\205Kubernetes\350\275\257\344\273\266\345\214\205.md" @@ -1,13 +1,11 @@ # 安装 Kubernetes 软件包 - ```bash -$ dnf install -y docker conntrack-tools socat +dnf install -y docker conntrack-tools socat ``` 配置EPOL源之后,可以直接通过 dnf 安装 K8S ```bash -$ rpm -ivh kubernetes*.rpm +dnf install kubernetes ``` - diff --git "a/docs/zh/docs/Kubernetes/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" "b/docs/zh/docs/Kubernetes/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" new file mode 100644 index 0000000000000000000000000000000000000000..acb84643b9ef8cfaf54c8432df1aa2f058d7f485 --- /dev/null +++ "b/docs/zh/docs/Kubernetes/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -0,0 +1,13 @@ +# 常见问题与解决方法 + +## **问题1:Kubernetes + docker为什么无法部署** + +原因:Kubernetes自1.21版本开始不再支持Kubernetes + docker部署Kubernetes集群。 + +解决方法:改为使用cri-dockerd+docker部署集群,也可以使用containerd或者iSulad部署集群。 + +## **问题2:openEuler无法通过yum直接安装Kubernetes相关的rpm包** + +原因:Kubernetes相关的rpm包需要配置yum的repo源有关EPOL的部分。 + +解决方法:[参考链接](https://forum.openeuler.org/t/topic/768)中repo源,重新配置环境中的EPOL源。 \ No newline at end of file diff --git 
"a/docs/zh/docs/Migration-tools/\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/Migration-tools/\347\224\250\346\210\267\346\214\207\345\215\227.md" index 09040cae3942a69cdd5754373ef7c361527e43d1..c68cdf7b9c80b61685a38cd02859fedca2e4e46a 100644 --- "a/docs/zh/docs/Migration-tools/\347\224\250\346\210\267\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/Migration-tools/\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -13,13 +13,13 @@ migration-tools 工具提供网页界面方式进行操作,以供使用者在 1. 支持将 AMD64 和 ARM64 架构的 CentOS 系列系统迁移到 UOS 系统,迁移前需自行准备目标系统的全量源。 -2. openeuler迁移:目前仅支持 centos 7.4 cui 系统迁移至 openeuler 20.03-LTS-SP1。 +2. openeuler迁移:目前仅支持 centos 7.4 cui 系统迁移至 openeuler 。 3. 不建议对安装了 i686 架构的 rpm 包的原系统进行迁移,如果对这种原系统进行迁移会出现迁移失败的结果。 |原系统|目标系统|使用的软件源| |---|---|---| -|centos 7.4 cui|openeuler 20.03-LTS-SP1|使用openeuler外网源| +|centos 7.4 cui|openeuler|使用openeuler外网源| |centos 7.0~7.7|UOS 1002a|UOS 1002a(全量源)| |centos 8.0~8.2|UOS 1050a|UOS 1050a(全量源)| @@ -122,7 +122,7 @@ migration-tools 工具提供网页界面方式进行操作,以供使用者在 在准备迁移的 centos 机器上执行以下步骤: ->**注意:** 目前 migration-tools 仅支持 centos7.4 cui 迁移至 openeuler 20.03-LTS-SP1。 +>**注意:** 目前 migration-tools 仅支持 centos7.4 cui 迁移至 openeuler。 - 关闭防火墙。 diff --git "a/docs/zh/docs/NestOS/NestOS For Container\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/NestOS/NestOS For Container\347\224\250\346\210\267\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..47385becf24cb21b9eda03ae665920b02e990079 --- /dev/null +++ "b/docs/zh/docs/NestOS/NestOS For Container\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -0,0 +1,995 @@ +# NestOS用户使用指南 + +## 1. 
NestOS介绍 + +### 1.1 前言 + +NestOS是麒麟软件在openEuler社区开源孵化的云底座操作系统,集成了rpm-ostree支持、ignition配置等技术,采用双根文件系统互为主备、原子化更新的设计思路,提供nestos-assembler工具快速集成构建。NestOS针对K8S、OpenStack平台进行适配,优化容器运行底噪,使系统具备十分便捷的集群组建能力,可以更安全的运行大规模的容器化工作负载。 + +本手册将完整阐述从构建、安装部署到使用NestOS的全流程,帮助用户充分利用NestOS的优势,快速高效地完成系统的配置和部署。 + +### 1.2 应用场景与优势 + +NestOS 适合作为以容器化应用为主的云场景基础运行环境,解决了在使用容器技术与容器编排技术实现业务发布、运维时与底层环境高度解耦而带来的运维技术栈不统一,运维平台重复建设等问题,保证了业务与底座操作系统运维的一致性。 + +![figure1](./figures/figure1.png) + +## 2. 环境准备 + +### 2.1 构建环境要求 + +#### 2.1.1 制作构建工具nestos-assembler环境要求 + +- 推荐使用openEuler环境 + +- 剩余可用硬盘空间 > 5G + +#### 2.1.2 构建NestOS环境要求 + +| **类别** | **要求** | +| :------: | :---------------: | +| CPU | 4vcpu | +| 内存 | 4GB | +| 硬盘 | 剩余可用空间>10GB | +| 架构 | x86_64或aarch64 | +| 其他 | 支持kvm | + +### 2.2 部署配置要求 + +| **类别** | **推荐配置** | **最低配置** | +| :------: | :-------------: | :----------: | +| CPU | >4vcpu | 1vcpu | +| 内存 | >4GB | 512M | +| 硬盘 | >20GB | 10GB | +| 架构 | x86_64、aarch64 | / | + +## 3. 快速使用 + +### 3.1 快速构建 + +1)获取nestos-assembler容器镜像 + +推荐使用基于openEuler的base镜像,更多说明请参考6.1 + +``` +docker pull hub.oepkgs.net/nestos/nestos-assembler:24.03-LTS.20240903.0-aarch64 +``` + +2)编写名为nosa的脚本并存放至/usr/local/bin,并赋予可执行权限 + +``` +#!/bin/bash + +sudo docker run --rm -it --security-opt label=disable --privileged --user=root \ + -v ${PWD}:/srv/ --device /dev/kvm --device /dev/fuse --network=host \ + --tmpfs /tmp -v /var/tmp:/var/tmp -v /root/.ssh/:/root/.ssh/ -v /etc/pki/ca-trust/:/etc/pki/ca-trust/ \ + ${COREOS_ASSEMBLER_CONFIG_GIT:+-v $COREOS_ASSEMBLER_CONFIG_GIT:/srv/src/config/:ro} \ + ${COREOS_ASSEMBLER_GIT:+-v $COREOS_ASSEMBLER_GIT/src/:/usr/lib/coreos-assembler/:ro} \ + ${COREOS_ASSEMBLER_CONTAINER_RUNTIME_ARGS} \ + ${COREOS_ASSEMBLER_CONTAINER:-nestos-assembler:your_tag} "$@" +``` + +注意修改COREOS_ASSEMBLER_CONTAINER 的值为本地环境中实际的nestos-assembler容器镜像。 + +3)获取nestos-config + +使用nosa init 初始化构建工作目录,拉取构建配置,创建工作目录nestos-build,在该目录下执行如下命令 + +``` +nosa init https://gitee.com/openeuler/nestos-config +``` + +4)调整构建配置 + 
+nestos-config提供默认构建配置,无需额外操作。如需调整,请参考第5章。 + +5)NestOS镜像构建 + +``` +# 拉取构建配置、更新缓存 +nosa fetch +# 生成根文件系统、qcow2及OCI镜像 +nosa build +# 生成live iso及PXE镜像 +nosa buildextend-metal +nosa buildextend-metal4k +nosa buildextend-live +``` + +详细构建及部署流程请参考第6章。 + +### 3.2 快速部署 + +以NestOS ISO镜像为例,启动进入live环境后,执行如下命令根据向导提示完成安装: + +``` +sudo installnestos +``` + +其他部署方式请参考第8章。 + +## 4. 系统默认配置 + +| **选项** | **默认配置** | +| :-------------: | :---------------------: | +| docker服务 | 默认disable,需主动开启 | +| ssh服务安全策略 | 默认仅支持密钥登录 | + +## 5. 构建配置nestos-config + +### 5.1 获取配置 + +nestos-config的仓库地址为https://gitee.com/openeuler/nestos-config + +### 5.2 配置目录结构说明 + +| **目录/****文件** | **说明** | +| :---------------: | :--------------------: | +| live/* | 构建liveiso的引导配置 | +| overlay.d/* | 自定义文件配置 | +| tests/* | 用户自定义测试用例配置 | +| *.repo | repo源配置 | +| .yaml,manifests/ | 主要构建配置 | + +### 5.3 主要文件解释 + +#### 5.3.1 repo文件 + +目录下的repo文件可用来配置用于构建nestos的软件仓库。 + +#### 5.3.2 yaml配置文件 + +目录下的yaml文件主要是提供nestos构建的各种配置,详见5.4章节。 + +### 5.4 主要字段解释 + +| **字段名称** | **作用** | +| :------------------------------------------ | ------------------------------------------------------------ | +| packages-aarch64、packages-x86_64、packages | 软件包集成范围 | +| exclude-packages | 软件包集成黑名单 | +| remove-from-packages | 从指定软件包删除文件(夹) | +| remove-files | 删除特定文件(夹) | +| extra-kargs | 额外内核引导参数 | +| initramfs-args | initramfs构建参数 | +| postprocess | 文件系统构建后置脚本 | +| default-target | 配置default-target,如 multi-user.target | +| rolij.name、releasever | 镜像相关信息(镜像名称、版本) | +| lockfile-repos | 构建可使用的仓库名列表,与5.3.1 介绍的repo文件中的仓库名需要对应 | + +### 5.5 用户可配置项说明 + +#### 5.5.1 repo源配置 + +1)在配置目录编辑repo文件,将内容修改为期望的软件仓库 + +``` +$ vim nestos-pool.repo +[repo_name_1] +Name=xxx +baseurl = https://ip.address/1 +enabled = 1 + +[repo_name_2] +Name=xxx +baseurl = https://ip.address/2 +enabled = 1 +``` + +2)修改yaml配置文件中的lockfile-repo字段内容为相应的仓库名称列表 + +注:仓库名称为repo文件中[]内的内容,不是name字段内容 + +``` +$ vim manifests/rpmlist.yaml +修改lockfile-repo字段内容为 +lockfile-repos: +- repo_name_1 +- repo_name_2 
+``` + +#### 5.5.2 软件包裁剪 + +修改packages、packages-aarch64、packages-x86_64字段,可在其中添加或删除软件包。 + +如下所示,在package字段中添加了nano,构建安装后系统中会有nano 。 + +``` +$ vim manifests/rpmlist.yaml +packages: +- bootupd +... +- authselect +- nano +... +packages-aarch64: +- grub2-efi-aa64 +packages-x86_64: +- microcode_ctl +- grub2-efi-x64 +``` + +#### 5.5.3 自定义镜像名称与版本号 + +修改yaml文件中的releasever及rolij.name 字段,这些字段分别控制镜像的版本号及名称。 + +``` +$ vim manifest.yaml + +releasever: "1.0" +rojig: + license: MIT + name: nestos + summary: NestOS stable +``` + +如上配置,构建出的镜像格式为:nestos-1.0.$(date "+%Y%m%d").$build_num.$type,其中build_num为构建次数,type为类型后缀。 + +#### 5.5.4 自定义镜像中的release信息 + +正常release信息是由我们集成的release包(如openeuler-release)提供的,但是我们也可以通过添加postprocess脚本对/etc/os-release文件进行重写操作。 + +``` +$ vim manifests/ system-configuration.yaml +在postprocess添加如下内容,若已存在相关内容,则只需修改对应release信息即可 +postprocess: + - | + #!/usr/bin/env bash + set -xeuo pipefail + export OSTREE_VERSION="$(tail -1 /etc/os-release)" + date_now=$(date "+%Y%m%d") + echo -e 'NAME="openEuler NestOS"\nVERSION="24.03-LTS"\nID="openeuler"\nVERSION_ID="24.03-LTS"\nPRETTY_NAME="NestOS"\nANSI_COLOR="0;31"\nBUILDID="'${date_now}'"\nVARIANT="NestOS"\nVARIANT_ID="nestos"\n' > /usr/lib/os-release + echo -e $OSTREE_VERSION >> /usr/lib/os-release + cp -f /usr/lib/os-release /etc/os-release +``` + +#### 5.5.5 成自定义文件 + +在overlay.d目录下每个目录进行自定义文件的添加和修改,这种操作可以实现构建镜像内容的自定义。 + +``` +mkdir -p overlay.d/15nestos/etc/test/test.txt +echo "This is a test message !" > overlay.d/15nestos/etc/test/test.txt +``` + +使用如上配置进行镜像构建,启动构建出的镜像,查看系统中对应文件内容即为我们上述自定义添加的内容。 + +``` +[root@nosa-devsh ~]# cat /etc/test/test.txt +This is a test message ! 
+``` + +## 6.构建流程 + +NestOS采用容器化的方式将构建工具链集成为一个完整的容器镜像,称为NestOS-assembler。 + +NestOS提供构建NestOS-assembler容器镜像能力,方便用户使用。使用该容器镜像,用户可在任意linux发行版环境中构建多种形态NestOS镜像(例如在现有CICD环境中使用),也可对构建发布件进行管理、调试和自动化测试。 + +### 6.1 制作构建工具NestOS-assembler容器镜像 + +#### 6.1.1 前置步骤 + +1)准备容器base镜像 + +NestOS-assembler容器镜像需要基于支持yum/dnf软件包管理器的base镜像制作,理论上可由任意发行版base镜像制作,但为最大程度减少软件包兼容性问题,仍推荐使用基于openEuler的base镜像。 + +2)安装必要软件包 + +安装必备依赖docker + +``` +dnf install -y docker +``` + +3)克隆nestos-assembler源代码仓库 + +``` +git clone --depth=1 --single-branch https://gitee.com/openeuler/nestos-assembler.git +``` + +#### 6.1.2 构建NestOS-assembler容器镜像 + +使用openEuler容器镜像作为base镜像,使用以下指令构建: + +``` +cd nestos-assembler/ +docker build -f Dockerfile . -t nestos-assembler:your_tag +``` + +### 6.2 使用NestOS-assembler容器镜像 + +#### 6.2.1 前置步骤 + +1)准备nestos-assembler容器镜像 + +参考6.1章节构建nestos-assembler容器镜像后,可通过私有化部署容器镜像仓库对该容器镜像进行管理和分发。请确保构建NestOS前,拉取适当版本的nestos-assembler容器镜像至当前环境。 + +2)编写使用脚本nosa + +因NestOS构建过程需多次调用nestos-assembler容器镜像执行不同命令,同时需配置较多参数,为简化用户操作,可编写nosa命令脚本,可参见3.1快速构建部分。 + +#### 6.2.2 使用说明 + +构建工具命令一览 + +| **命令** | **功能说明** | +| :-------------------: | :-------------------------------------------------: | +| init | 初始化构建环境及构建配置,详见6.3 | +| fetch | 根据构建配置获取最新软件包至本地缓存 | +| build | 构建ostree commit,是构建NestOS的核心命令 | +| run | 直接启动一个qemu实例,默认使用最新构建版本 | +| prune | 清理历史构建版本,默认保留最新3个版本 | +| clean | 删除全部构建发布件,添加--all参数时同步清理本地缓存 | +| list | 列出当前构建环境中存在的版本及发布件 | +| build-fast | 基于前次构建记录快速构建新版本 | +| push-container | 推送容器镜像发布件至容器镜像仓库 | +| buildextend-live | 构建支持live环境的ISO发布件及PXE镜像 | +| buildextend-metal | 构建裸金属raw发布件 | +| buildextend-metal4k | 构建原生4K模式的裸金属raw发布件 | +| buildextend-openstack | 构建适用于openstack平台的qcow2发布件 | +| buildextend-qemu | 构建适用于qemu的qcow2发布件 | +| basearch | 获得当前架构信息 | +| compress | 压缩发布件 | +| kola | 自动化测试框架 | +| kola-run | 输出汇总结果的自动化测试封装 | +| runc | 以容器方式挂载当前构建根文件系统 | +| tag | 管理构建工程tag | +| virt-install | 通过virt-install为指定构建版本创建实例 | +| meta | 管理构建工程元数据 | +| shell | 进入nestos-assembler容器镜像 | + +### 6.3 准备构建环境 + 
+NestOS构建环境需要独立的空文件夹作为工作目录,且支持多次构建,保留、管理历史构建版本。创建构建环境前需首先准备构建配置(参考第5章)。 + +建议一份独立维护的构建配置对应一个独立的构建环境,即如果您希望构建多个不同用途的NestOS,建议同时维护多份构建配置及对应的构建环境目录,这样可以保持不同用途的构建配置独立演进和较为清晰的版本管理。 + +#### 6.3.1 初始化构建环境 + +进入待初始化工作目录,执行如下命令即可初始化构建环境: + +``` +nosa init https://gitee.com/openeuler/nestos-config +``` + +仅首次构建时需初始化构建环境,后续构建在不对构建配置做出重大更改的前提下,可重复使用该构建环境。 + +#### 6.3.2 构建环境说明 + +初始化完成后,工作目录创建出如下文件夹: + +**builds:**构建发布件及元数据存储目录,latest子目录软链接指向最新构建版本。 + +**cache:**缓存目录,根据构建配置中的软件源及软件包列表拉取至本地,历史构建NestOS的ostree repo均缓存于此目录。 + +**overrides:**构建过程希望附加到最终发布件rootfs中的文件或rpm包可置于此目录。 + +**src:**构建配置目录,存放nestos-config相关内容。 + +**tmp:**临时目录,构建过程、自动化测试等场景均会使用该目录作为临时目录,构建发生异常时可在此处查看虚拟机命令行输出、journal日志等信息。 + +### 6.4 构建步骤 + +NestOS构建主要步骤及参考命令如下: + +![figure2](./figures/figure2.png) + +#### 6.4.1 首次构建 + +首次构建时需初始化构建环境,详见6.3。 + +非首次构建可直接使用原构建环境,可通过nosa list查看当前构建环境已存在版本及对应发布件。 + +#### 6.4.2 更新构建配置及缓存 + +初始化构建环境后,执行如下命令更新构建配置及缓存: + +``` +nosa fetch +``` + +该步骤初步校验构建配置是否可用,并通过配置的软件源拉取软件包至本地缓存。当构建配置发生变更或单纯希望更新软件源中最新版本软件包,均需要重新执行该步骤,否则可能导致构建失败或不符合预期。 + +当构建配置发生较大变更,希望清空本地缓存重新拉取时,需执行如下命令: + +``` +nosa clean --all +``` + +#### 6.4.3 构建不可变根文件系统 + +NestOS不可变操作系统的核心是基于ostree技术的不可变根文件系统,执行如下步骤构建ostree文件系统: + +``` +nosa build +``` + +build命令默认会生成ostree文件系统和OCI归档文件,您也可以在执行命令时同步添加qemu、metal、metal4k中的一个或多个,同步构建发布件,等效于后续继续执行buildextend-qemu、buildextend-metal和buildextend-metal4k命令。 + +``` +nosa build qemu metal metal4k +``` + +如您希望在构建NestOS时,添加自定义文件或rpm包,请在执行build命令前将相应文件放入构建环境overrides目录下rootfs/或rpm/文件夹。 + +#### 6.4.4 构建各类发布件 + +build命令执行完毕后,可继续执行buildextend-XXX命令用于构建各类型发布件,具体介绍如下: + +- 构建qcow2镜像 + +``` +nosa buildextend-qemu +``` + +- 构建带live环境的ISO镜像或PXE启动组件 + +``` +nosa buildextend-metal +nosa buildextend-metal4k +nosa buildextend-live +``` + +- 构建适用于openstack环境的qcow2镜像 + +``` +nosa buildextend-openstack +``` + +- 构建适用于容器镜像方式更新的容器镜像 + +执行nosa build命令构建ostree文件系统时,会同时生成ociarchive格式镜像,该镜像可直接执行如下命令推送到本地或远程镜像仓库,无需执行其他构建步骤。 + +``` +nosa push-container [container-image-name] +``` + + 
远程镜像仓库地址需附加到推送容器镜像名称中,且除隔离镜像tag外,不得出现":"。如未检测到":",该命令会自动生成{latest_build}-{arch}格式的tag。示例如下: + +``` +nosa push-container registry.example.com/nestos:1.0.20240903.0-x86_64 +``` + +该命令支持以下可选参数: + +--authfile :指定登录远程镜像仓库的鉴权文件 + +--insecure:如远程镜像仓库采用自签名证书等场景,添加该参数可不校验SSL/TLS协议 + +--transport:指定目标镜像推送协议,默认为docker,具体支持项及说明如下: + +​ containers-storage:推送至podman、crio等容器引擎本地存储目录 + +​ dir:推送至指定本地目录 + +​ docker:以docker API推送至私有或远端容器镜像仓库 + +​ docker-archive:等效于docker save导出归档文件,可供docker load使用 + +​ docker-daemon:推送至docker容器引擎本地存储目录 + +### 6.5 获取发布件 + +构建完毕后,发布件均生成于构建环境中如下路径: + +``` +builds/{version}/{arch}/ +``` + +如您仅关心最新构建版本或通过CI/CD调用,提供latest目录软链接至最新版本目录,即: + +``` +builds/latest/{arch}/ +``` + +为方便传输,您可以调用如下命令,压缩发布件体积: + +``` +nosa compress +``` + +压缩后原文件会被移除,会导致部分调试命令无法使用,可以调用解压命令恢复原文件: + +``` +nosa uncompress +``` + +### 6.6 构建环境维护 + +在构建NestOS环境前后,可能存在如下需求,可使用推荐的命令解决相应问题: + +#### 6.6.1 清理历史或无效构建版本,以释放磁盘空间 + +可以通过以下命令清理历史版本构建: + +``` +nosa prune +``` + +也可删除当前构建环境中的全部发布件: + +``` +nosa clean +``` + +如构建配置更换过软件源或历史缓存无保留价值,可彻底清理当前构建环境缓存: + +``` +nosa clean --all +``` + +#### 6.6.2 临时运行构建版本实例,用于调试或确认构建正确 + +``` +nosa run +``` + +可通过--qemu-image或--qemu-iso指定启动镜像地址,其余参数请参考nosa run --help说明。 + +实例启动后,构建环境目录会被挂载至/var/mnt/workdir,可通过构建环境目录 + +#### 6.6.3 运行自动化测试 + +``` +nosa kola run +``` + +该命令会执行预设的测试用例,也可在其后追加测试用例名称,单独执行单条用例。 + +``` +nosa kola testiso +``` + +该命令会执行iso或pxe live环境安装部署测试,可作为构建工程的冒烟测试。 + +#### 6.6.4 调试验证构建工具(NestOS-assembler) + +``` +nosa shell +``` + +该命令可启动进入构建工具链容器的shell环境,您可以通过此命令验证构建工具链工作环境是否正常。 + +## 7. 
部署配置 + +### 7.1 前言 + +在开始部署NestOS之前,了解和准备必要的配置是至关重要的。NestOS通过点火文件(ignition文件)提供了一系列灵活的配置选项,可以通过Butane工具进行管理,方便用户进行自动化部署和环境设置。 + +在本章节中,将详细的介绍Butane工具的功能和使用方法,并根据不同场景提供配置示例。这些配置将帮助您快速启动和运行NestOS,在满足应用需求的同时,确保系统的安全性和可靠性。此外,还会介绍如何自定义镜像,将点火文件预集成至镜像中,以满足特定应用场景的需求,从而实现高效的配置和部署NestOS。 + +### 7.2 Butane简介 + +Butane是一个用于将人类可读的YAML配置文件转换为NestOS点火文件(Ignition 文件)的工具。Butane工具简化了复杂配置的编写过程,允许用户以更易读的格式编写配置文件,然后将其转换为适合NestOS使用的JSON格式。 + +NestOS对Butane进行了适配修改,新增nestos变体支持和配置规范版本v1.0.0,对应的点火(ignition)配置规范为v3.3.0,确保了配置的稳定性和兼容性。 + +### 7.3 Butane使用 + +安装butane软件包 + +``` +dnf install butane +``` + +编辑example.yaml并执行以下指令将其转换为点火文件example.ign,其中关于yaml文件的编写,将在后续展开: + +``` +butane example.yaml -o example.ign -p +``` + +### 7.4 支持的功能场景 + +以下配置示例(example.yaml)简述了NestOS主要支持的功能场景和进阶使用方法。 + +#### 7.4.1 设置用户和组并配置密码/密钥 + +``` +variant: nestos +version: 1.0.0 +passwd: + users: + - name: nest + ssh_authorized_keys: + - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDHn2eh... + - name: jlebon + groups: + - wheel + ssh_authorized_keys: + - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDC5QFS... + - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIveEaMRW... + - name: miabbott + groups: + - docker + - wheel + password_hash: $y$j9T$aUmgEDoFIDPhGxEe2FUjc/$C5A... + ssh_authorized_keys: + - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDTey7R... +``` + +#### 7.4.2 文件操作——以配置网卡为例 + +``` +variant: nestos +version: 1.0.0 +storage: + files: + - path: /etc/NetworkManager/system-connections/ens2.nmconnection + mode: 0600 + contents: + inline: | + [connection] + id=ens2 + type=ethernet + interface-name=ens2 + [ipv4] + address1=10.10.10.10/24,10.10.10.1 + dns=8.8.8.8; + dns-search= + may-fail=false + method=manual +``` + +#### 7.4.3 创建目录、文件、软连接并配置权限 + +``` +variant: nestos +version: 1.0.0 +storage: + directories: + - path: /opt/tools + overwrite: true + files: + - path: /var/helloworld + overwrite: true + contents: + inline: Hello, world! 
+ mode: 0644 + user: + name: dnsmasq + group: + name: dnsmasq + - path: /opt/tools/transmogrifier + overwrite: true + contents: + source: https://mytools.example.com/path/to/archive.gz + compression: gzip + verification: + hash: sha512-00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + mode: 0555 + links: + - path: /usr/local/bin/transmogrifier + overwrite: true + target: /opt/tools/transmogrifier + hard: false +``` + +#### 7.4.4 编写systemd服务——以启停容器为例 + +``` +variant: nestos +version: 1.0.0 +systemd: + units: + - name: hello.service + enabled: true + contents: | + [Unit] + Description=MyApp + After=network-online.target + Wants=network-online.target + + [Service] + TimeoutStartSec=0 + ExecStartPre=-/bin/podman kill busybox1 + ExecStartPre=-/bin/podman rm busybox1 + ExecStartPre=/bin/podman pull busybox + ExecStart=/bin/podman run --name busybox1 busybox /bin/sh -c ""trap 'exit 0' INT TERM; while true; do echo Hello World; sleep 1; done"" + + [Install] + WantedBy=multi-user.target +``` + +### 7.5 点火文件预集成 + +NestOS构建工具链支持用户根据实际使用场景和需求定制镜像。在镜像制作完成后,nestos-installer还提供了针对镜像部署与应用等方面进行自定义的一系列功能,如嵌入点火文件、预分配安装位置、增删内核参数等功能,以下将针对主要功能进行介绍。 + +#### 7.5.1 点火文件预集成至ISO镜像 + +准备好NestOS的ISO镜像至本地;安装nestos-installer软件包;编辑example.yaml,并使用butane工具将其转换为ign文件,在这里,我们仅配置简单的用户名和密码(密码要求加密,示例中为qwer1234),内容如下: + +``` +variant: nestos +version: 1.0.0 +passwd: + users: + - name: root + password_hash: "$1$root$CPjzNGH.NqmQ7rh26EeXv1" +``` + +将上述yaml转换为ign文件后,执行如下指令嵌入点火文件并指定目标磁盘位置,其中xxx.iso为准备至本地的NestOS ISO镜像: + +``` +nestos-installer iso customize --dest-device /dev/sda --dest-ignition example.ign xxx.iso +``` + +使用该集成点火文件的ISO镜像进行安装时,NestOS会自动读取点火文件并安装至目标磁盘,待进度条完成度为100%后,自动进入安装好的NestOS环境,用户可根据ign文件配置的用户名和密码进入系统。 + +#### 7.5.2 点火文件预集成至PXE镜像 + +准备好NestOS的PXE镜像至本地,组件获取方式参考6.5【获取发布件】章节,其他步骤同上。 + +为了方便用户使用,nestos-installer也支持从ISO镜像中提取PXE组件的功能,执行如下指令,其中xxx.iso为保存至本地的NestOS ISO镜像: + +``` +nestos-installer iso extract pxe 
xxx.iso
+```
+
+得到如下输出件:
+
+```
+xxx-initrd.img
+xxx-rootfs.img
+xxx-vmlinuz
+```
+
+执行如下指令嵌入点火文件并指定目标磁盘位置:
+
+```
+nestos-installer pxe customize --dest-device /dev/sda --dest-ignition example.ign xxx-initrd.img --output custom-initrd.img
+```
+
+根据使用PXE安装NestOS的方式,替换相应的xxx-initrd.img为custom-initrd.img。启动后NestOS会自动读取点火文件并安装至目标磁盘,待进度条完成度为100%后,自动进入安装好的NestOS环境,用户可根据ign文件配置的用户名和密码进入系统。
+
+## 8. 部署流程
+
+### 8.1 简介
+
+NestOS支持多种部署平台及常见部署方式,当前主要支持qcow2、ISO与PXE三种部署方式。与常见通用OS部署相比,主要区别在于如何传入以ign文件为特征的自定义部署配置,以下各部分将会分别介绍。
+
+### 8.2 使用qcow2镜像安装
+
+#### 8.2.1 使用qemu创建qcow2实例
+
+准备NestOS的qcow2镜像及相应点火文件(详见第7章),终端执行如下步骤:
+
+```
+IGNITION_CONFIG="/path/to/example.ign"
+IMAGE="/path/to/image.qcow2"
+IGNITION_DEVICE_ARG="-fw_cfg name=opt/com.coreos/config,file=${IGNITION_CONFIG}"
+
+qemu-img create -f qcow2 -F qcow2 -b ${IMAGE} my-nestos-vm.qcow2
+```
+
+aarch64环境执行如下命令:
+
+```
+qemu-kvm -m 2048 -M virt -cpu host -nographic -drive if=virtio,file=my-nestos-vm.qcow2 ${IGNITION_DEVICE_ARG} -nic user,model=virtio,hostfwd=tcp::2222-:22 -bios /usr/share/edk2/aarch64/QEMU_EFI-pflash.raw
+```
+
+x86_64环境执行如下命令:
+
+```
+qemu-kvm -m 2048 -M pc -cpu host -nographic -drive if=virtio,file=my-nestos-vm.qcow2 ${IGNITION_DEVICE_ARG} -nic user,model=virtio,hostfwd=tcp::2222-:22
+```
+
+#### 8.2.2 使用virt-install创建qcow2实例
+
+假设libvirt服务正常,网络默认采用default子网,绑定virbr0网桥,您可参考以下步骤创建NestOS实例。
+
+准备NestOS的qcow2镜像及相应点火文件(详见第7章),终端执行如下步骤:
+
+```
+IGNITION_CONFIG="/path/to/example.ign"
+IMAGE="/path/to/image.qcow2"
+VM_NAME="nestos"
+VCPUS="4"
+RAM_MB="4096"
+DISK_GB="10"
+IGNITION_DEVICE_ARG=(--qemu-commandline="-fw_cfg name=opt/com.coreos/config,file=${IGNITION_CONFIG}")
+```
+
+**注意:使用virt-install安装,qcow2镜像及ign文件需指定绝对路径。**
+
+执行如下命令创建实例:
+
+```
+virt-install --connect="qemu:///system" --name="${VM_NAME}" --vcpus="${VCPUS}" --memory="${RAM_MB}" --os-variant="kylin-hostos10.0" --import --graphics=none --disk="size=${DISK_GB},backing_store=${IMAGE}" --network bridge=virbr0 "${IGNITION_DEVICE_ARG[@]}"
+```
+
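无论采用 qemu 还是 virt-install 方式,创建实例前都可以先对点火文件做一次简单的格式自检,避免因文件损坏导致实例无法按预期启动(示意脚本,点火文件内容为演示用最小示例;更完整的语义校验可使用社区的 ignition-validate 工具,如环境中已安装):

```shell
# 构造一个演示用的最小点火文件(实际使用时替换为自己的 .ign 文件路径)
IGN=$(mktemp --suffix=.ign)
printf '{"ignition":{"version":"3.3.0"}}' > "${IGN}"
# 点火文件本质上是 JSON,先确认其可以被正常解析
if python3 -m json.tool "${IGN}" > /dev/null 2>&1; then
  echo "点火文件为合法 JSON"
else
  echo "点火文件格式损坏,请重新使用 butane 生成" >&2
fi
rm -f "${IGN}"
```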
+### 8.3 使用ISO镜像安装
+
+准备NestOS的ISO镜像并启动。首次启动的NestOS ISO镜像会默认进入Live环境,该环境为易失的内存环境。
+
+#### 8.3.1 通过nestos-installer安装向导脚本安装OS至目标磁盘
+
+1)在NestOS的Live环境中,根据首次进入时的打印提示,输入以下指令,即可自动生成一份简易的点火文件并完成安装重启:
+
+```
+sudo installnestos
+```
+
+2)根据终端提示信息依次输入用户名和密码;
+
+3)选择目标磁盘安装位置,可直接选择回车设置为默认项/dev/sda;
+
+4)执行完以上步骤后,nestos-installer开始根据我们提供的配置将NestOS安装至目标磁盘,待进度条100%后,自动重启;
+
+5)重启后自动进入NestOS,在grub菜单直接回车或者等待5s后启动系统,随后根据此前配置的用户名和密码进入系统。至此,安装完成。
+
+#### 8.3.2 通过nestos-installer命令手动安装OS至目标磁盘
+
+1)准备好点火文件example.ign(详见第7章);
+
+2)根据首次进入NestOS的Live环境打印的提示,输入以下指令开始安装:
+
+```
+sudo nestos-installer install /dev/sda --ignition-file example.ign
+```
+
+如具备网络条件,点火文件也可通过网络获取,如:
+
+```
+sudo nestos-installer install /dev/sda --ignition-file http://www.example.com/example.ign
+```
+
+3)执行完上述指令后,nestos-installer开始根据我们提供的配置将NestOS安装至目标磁盘,待进度条100%后,自动重启;
+
+4)重启后自动进入NestOS,在grub菜单直接回车或者等待5s后启动系统,随后根据此前配置的用户名和密码进入系统。至此,安装完成。
+
+### 8.4 PXE部署
+
+NestOS的PXE安装组件包括kernel、initramfs.img和rootfs.img。这些组件由nosa buildextend-live命令生成(详见第6章)。
+
+1)使用PXELINUX 的kernel命令行指定内核,简单示例如下:
+
+```
+KERNEL nestos-live-kernel-x86_64
+```
+
+2)使用PXELINUX 的append命令行指定initrd和rootfs,简单示例如下:
+
+```
+APPEND initrd=nestos-live-initramfs.x86_64.img,nestos-live-rootfs.x86_64.img
+```
+
+**注意:如您采用7.5章节所述,已将点火文件预集成至PXE组件,则仅需在此进行替换,无需执行后续步骤。**
+
+3)指定安装位置,以/dev/sda为例,在APPEND后追加,示例如下:
+
+```
+nestos.inst.install_dev=/dev/sda
+```
+
+4)指定点火文件,需通过网络获取,在APPEND后追加相应地址,示例如下:
+
+```
+nestos.inst.ignition_url=http://www.example.com/example.ign
+```
+
+5)启动后NestOS会自动读取点火文件并安装至目标磁盘,待进度条完成度为100%后,自动进入安装好的NestOS环境,用户可根据ign文件配置的用户名和密码进入系统。
+
+## 9. 
基本使用
+
+### 9.1 简介
+
+NestOS采用基于ostree和rpm-ostree技术的操作系统封装方案,将关键目录设置为只读状态,核心系统文件和配置不会被意外修改;采用overlay分层思想,允许用户在基础ostree文件系统之上分层管理RPM包,不会破坏初始系统体系结构;同时支持构建OCI格式镜像,实现以镜像为最小粒度进行操作系统版本的切换。
+
+### 9.2 SSH连接
+
+出于安全考虑,NestOS 默认不支持用户使用密码进行SSH登录,而只能使用密钥认证方式。这一设计旨在增强系统的安全性,防止因密码泄露或弱密码攻击导致的潜在安全风险。
+
+NestOS通过密钥进行SSH连接的方法与openEuler一致,如果用户需要临时开启密码登录,可按照以下步骤执行:
+
+1)编辑ssh服务附加配置文件:
+
+```
+vi /etc/ssh/sshd_config.d/40-disable-passwords.conf
+```
+
+2)修改默认配置PasswordAuthentication为如下内容:
+
+```
+PasswordAuthentication yes
+```
+
+3)重启sshd服务,便可实现临时使用密码进行SSH登录。
+
+### 9.3 RPM包安装
+
+**注意:不可变操作系统不提倡在运行环境中安装软件包,提供此方法仅供临时调试等场景使用,因业务需求需要变更集成软件包列表请通过更新构建配置重新构建实现。**
+
+NestOS不支持常规的包管理器dnf/yum,而是通过rpm-ostree来管理系统更新和软件包安装。rpm-ostree结合了镜像和包管理的优势,允许用户在基础系统之上分层安装和管理rpm包,并且不会破坏初始系统的结构。使用以下命令安装rpm包:
+
+```
+rpm-ostree install <软件包名>
+```
+
+安装完成后,重新启动操作系统,引导加载菜单中会出现两个分支,默认第一个分支为最新版本:
+
+```
+systemctl reboot
+```
+
+重启进入系统后,查看系统包分层状态,可看到当前版本已安装该软件包:
+
+```
+rpm-ostree status -v
+```
+
+### 9.4 版本回退(临时/永久)
+
+更新/rpm包安装完成后,上一版本的操作系统部署仍会保留在磁盘上。如果更新导致问题,用户可以使用rpm-ostree进行版本回退,该操作需要用户手动执行,具体流程如下:
+
+#### 9.4.1 临时回退
+
+要临时回滚到之前的OS部署,在系统启动过程中按住shift键,当引导加载菜单出现时,在菜单中选择相应的分支(默认有两个,选择另外一个即可)。在此之前,可以使用以下指令查看当前环境中已存在的两个版本分支:
+
+```
+rpm-ostree status
+```
+
+#### 9.4.2 永久回退
+
+要永久回滚到之前的操作系统部署,用户需在当前版本中运行如下指令,此操作将使用之前版本的系统部署作为默认部署。
+
+```
+rpm-ostree rollback
+```
+
+重新启动以生效,引导加载菜单的默认部署选项已经改变,无需用户手动切换。
+
+```
+systemctl reboot
+```
+
+## 10. 
容器镜像方式更新 + +### 10.1 应用场景说明 + +NestOS作为基于不可变基础设施思想的容器云底座操作系统,将文件系统作为一个整体进行分发和更新。这一方案在运维与安全方面带来了巨大的便利。然而,在实际生产环境中,官方发布的版本往往难以满足用户的需求。例如,用户可能希望在系统中默认集成自维护的关键基础组件,或者根据特定场景的需求对软件包进行进一步的裁剪,以减少系统的运行负担。因此,与通用操作系统相比,用户对NestOS有着更强烈和更频繁的定制需求。 + + NestOS-assembler 可提供符合OCI标准的容器镜像,且不仅是将根文件系统打包分发,利用ostree native container特性,可使容器云场景用户使用熟悉的技术栈,只需编写一个ContainerFile(Dockerfile)文件,即可轻松构建定制版镜像,用于自定义集成组件或后续的升级维护工作。 + +### 10.2 使用方式 + +#### 10.2.1 定制镜像 + +- 基本步骤 + +(1) 参考第6章构建NestOS容器镜像,可使用nosa push-container命令推送至公共或私有容器镜像仓库。 + +(2) 编写Containerfile(Dockerfile)示例如下: + +``` +FROM registry.example.com/nestos:1.0.20240603.0-x86_64 + +# 执行自定义构建步骤,例如安装软件或拷贝自构建组件 +# 此处以安装strace软件包为例 +RUN rpm-ostree install strace && rm -rf /var/cache && ostree container commit +``` + +(3)执行docker build或集成于CICD中构建相应镜像 + +- 注意事项 + +(1) NestOS 无yum/dnf包管理器,如需安装软件包可采用rpm-ostree install命令安装本地rpm包或软件源中提供软件 + +(2) 如有需求也可修改/etc/yum.repo.d/目录下软件源配置 + +(3) 每层有意义的构建命令末尾均需添加&& ostree container commit命令,从构建容器镜像最佳实践角度出发,建议尽可能减少RUN层的数量 + +(4) 构建过程中会对非/usr或/etc目录内容进行清理,因此通过容器镜像方式定制主要适用于软件包或组件更新,请勿通过此方式进行系统维护或配置变更(例如添加用户useradd) + +#### 10.2.2 部署/升级镜像 + +假设上述步骤构建容器镜像被推送为registry.example.com/nestos:1.0.20240903.0-x86_64。 + +在已部署NestOS的环境中执行如下命令: + +``` +sudo rpm-ostree rebase ostree-unverified-registry:registry.example.com/nestos:1.0.20240903.0-x86_64 +``` + +重新引导后完成定制版本部署。 + +当您使用容器镜像方式部署后,rpm-ostree upgrade 默认会将更新源从ostree更新源地址更新为容器镜像地址。之后,您可以在相同的tag下更新容器镜像,使用 rpm-ostree upgrade 可以检测远端镜像是否已经更新,如果有变更,它会拉取最新的镜像并完成部署。 diff --git a/docs/zh/docs/NestOS/figures/figure1.png b/docs/zh/docs/NestOS/figures/figure1.png new file mode 100644 index 0000000000000000000000000000000000000000..b4eb9017ed202e854c076802492d8561942dfc88 Binary files /dev/null and b/docs/zh/docs/NestOS/figures/figure1.png differ diff --git a/docs/zh/docs/NestOS/figures/figure2.png b/docs/zh/docs/NestOS/figures/figure2.png new file mode 100644 index 0000000000000000000000000000000000000000..90049769c04e2bd494533da1613e38a5199da3d7 Binary files /dev/null and 
b/docs/zh/docs/NestOS/figures/figure2.png differ diff --git a/docs/zh/docs/NestOS/overview.md b/docs/zh/docs/NestOS/overview.md index e3b20eb3686e4fce9b34a3786fec7b33e6678599..1d744c820ca42dca9441b53db275b9d72cddaa21 100644 --- a/docs/zh/docs/NestOS/overview.md +++ b/docs/zh/docs/NestOS/overview.md @@ -1,3 +1,4 @@ -# NestOS用户指南 +# NestOS云底座操作系统 -本文介绍云底座操作系统NestOS的安装部署与各个特性说明和使用方法,使用户能够快速了解并使用NestOS。Nestos搭载了docker、iSulad、podman、cri-o等常见容器引擎,将ignition配置、rpm-ostree、OCI支持、SElinux强化等技术集成在一起,采用基于双系统分区、容器技术和集群架构的设计思路,可以适配云场景下多种基础运行环境。同时NestOS针对Kubernetes进行优化,在IaaS生态构建方面,针对openStack、oVirt等平台提供支持;在PaaS生态构建方面,针对OKD、Rancher等平台提供支持,使系统具备十分便捷的集群组件能力,可以更安全的运行大规模的容器化工作负载。镜像下载地址详见[NestOS仓库](https://gitee.com/openeuler/NestOS)。 \ No newline at end of file +本文介绍云底座操作系统NestOS For Container(下称NestOS)的安装部署与各个特性说明和使用方法,使用户能够快速了解并使用NestOS。NestOS For Virt的使用方法与通用操作系统使用方法一致,可参考openEuler官方文档。 +NestOS搭载了docker、iSulad、podman、cri-o等常见容器引擎,将ignition配置、rpm-ostree、OCI支持、SElinux强化等技术集成在一起,采用基于双系统分区、容器技术和集群架构的设计思路,可以适配云场景下多种基础运行环境。同时NestOS针对Kubernetes进行优化,在IaaS生态构建方面,针对openStack、oVirt等平台提供支持;在PaaS生态构建方面,针对OKD、Rancher等平台提供支持,使系统具备十分便捷的集群组建能力,可以更安全的运行大规模的容器化工作负载。镜像下载地址详见[NestOS官网](https://nestos.openeuler.org/)。 diff --git "a/docs/zh/docs/NestOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" "b/docs/zh/docs/NestOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" deleted file mode 100644 index 64d86973bc37dc83903c05f0a5b6d826edbb92c6..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/NestOS/\344\275\277\347\224\250\346\226\271\346\263\225.md" +++ /dev/null @@ -1,906 +0,0 @@ -# 基于NestOS容器化部署Kubernetes - -​ - -## 整体方案 - -Kubernetes(k8s)是为容器服务而生的一个可移植容器的编排管理工具。本指南旨在提供NestOS快速容器化部署k8s的解决方案。该方案以虚拟化平台创建多个NestOS节点作为部署k8s的验证环境,并通过编写Ignition文件的方式,提前将k8s所需的环境配置到一个yaml文件中。在安装NestOS操作系统的同时,即可完成对k8s所需资源的部署并创建节点。裸金属环境也可以参考本文并结合NestOS裸金属安装文档完成k8s部署。 - -- 版本信息: - - - NestOS镜像版本:22.09 - - - k8s版本:v1.23.10 - - - isulad版本:2.0.16 - -- 安装要求 - - 每台机器2GB或更多的RAM - - CPU2核心及以上 - - 集群中所有机器之间网络互通 - - 
节点之中不可以有重复的主机名 - - 可以访问外网,需要拉取镜像 - - 禁止swap分区 - - 关闭selinux -- 部署内容 - - NestOS镜像以集成isulad和kubeadm、kubelet、kubectl等二进制文件 - - 部署k8s Master节点 - - 部署容器网络插件 - - 部署k8s Node节点,将节点加入k8s集群中 - -## K8S节点配置 - -NestOS通过Ignition文件机制实现节点批量配置。本章节简要介绍Ignition文件的生成方法,并提供容器化部署k8s时的Ignition配置示例。NestOS节点系统配置内容如下: - -| 配置项 | 用途 | -| ------------ | -------------------------------------- | -| passwd | 配置节点登录用户和访问鉴权等相关信息 | -| hostname | 配置节点的hostname | -| 时区 | 配置节点的默认时区 | -| 内核参数 | k8s部署环境需要开启部分内核参数 | -| 关闭selinux | k8s部署环境需要关闭selinux | -| 设置时间同步 | k8s部署环境通过chronyd服务同步集群时间 | - -### 生成登录密码 - -使用密码登录方式访问NestOS实例,可使用下述命令生成${PASSWORD_HASH} 供点火文件配置使用: - -``` -openssl passwd -1 -salt yoursalt -``` - -### 生成ssh密钥对 - -采用ssh公钥方式访问NestOS实例,可通过下述命令生成ssh密钥对: - -``` -ssh-keygen -N '' -f /root/.ssh/id_rsa -``` - -查看公钥文件id_rsa.pub,获取ssh公钥信息后供Ignition文件配置使用: - -``` -cat /root/.ssh/id_rsa.pub -``` - -### 编写butane配置文件 - -本配置文件示例中,下列字段均需根据实际部署情况自行配置。部分字段上文提供了生成方法: - -- ${PASSWORD_HASH}:指定节点的登录密码 -- ${SSH-RSA}:配置节点的公钥信息 -- ${MASTER_NAME}:配置主节点的hostname -- ${MASTER_IP}:配置主节点的IP -- ${MASTER_SEGMENT}:配置主节点的网段 -- ${NODE_NAME}:配置node节点的hostname -- ${NODE_IP}:配置node节点的IP -- ${GATEWAY}:配置节点网关 -- ${service-cidr}:指定service分配的ip段 -- ${pod-network-cidr}:指定pod分配的ip段 -- ${image-repository}:指定镜像仓库地址,例:https://registry.cn-hangzhou.aliyuncs.com -- ${token}:加入集群的token信息,通过master节点获取 - -master节点butane配置文件示例: - -```yaml -variant: fcos -version: 1.1.0 -##passwd相关配置 -passwd: - users: - - name: root - ##登录密码 - password_hash: "${PASSWORD_HASH}" - "groups": [ - "adm", - "sudo", - "systemd-journal", - "wheel" - ] - ##ssh公钥信息 - ssh_authorized_keys: - - "${SSH-RSA}" -storage: - directories: - - path: /etc/systemd/system/kubelet.service.d - overwrite: true - files: - - path: /etc/hostname - mode: 0644 - contents: - inline: ${MASTER_NAME} - - path: /etc/hosts - mode: 0644 - overwrite: true - contents: - inline: | - 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 - ::1 localhost localhost.localdomain 
localhost6 localhost6.localdomain6 - ${MASTER_IP} ${MASTER_NAME} - ${NODE_IP} ${NODE_NAME} - - path: /etc/NetworkManager/system-connections/ens2.nmconnection - mode: 0600 - overwrite: true - contents: - inline: | - [connection] - id=ens2 - type=ethernet - interface-name=ens2 - [ipv4] - address1=${MASTER_IP}/24,${GATEWAY} - dns=8.8.8.8 - dns-search= - method=manual - - path: /etc/sysctl.d/kubernetes.conf - mode: 0644 - overwrite: true - contents: - inline: | - net.bridge.bridge-nf-call-iptables=1 - net.bridge.bridge-nf-call-ip6tables=1 - net.ipv4.ip_forward=1 - - path: /etc/isulad/daemon.json - mode: 0644 - overwrite: true - contents: - inline: | - { - "exec-opts": ["native.cgroupdriver=systemd"], - "group": "isula", - "default-runtime": "lcr", - "graph": "/var/lib/isulad", - "state": "/var/run/isulad", - "engine": "lcr", - "log-level": "ERROR", - "pidfile": "/var/run/isulad.pid", - "log-opts": { - "log-file-mode": "0600", - "log-path": "/var/lib/isulad", - "max-file": "1", - "max-size": "30KB" - }, - "log-driver": "stdout", - "container-log": { - "driver": "json-file" - }, - "hook-spec": "/etc/default/isulad/hooks/default.json", - "start-timeout": "2m", - "storage-driver": "overlay2", - "storage-opts": [ - "overlay2.override_kernel_check=true" - ], - "registry-mirrors": [ - "docker.io" - ], - "insecure-registries": [ - "${image-repository}" - ], - "pod-sandbox-image": "k8s.gcr.io/pause:3.6", - "native.umask": "secure", - "network-plugin": "cni", - "cni-bin-dir": "/opt/cni/bin", - "cni-conf-dir": "/etc/cni/net.d", - "image-layer-check": false, - "use-decrypted-key": true, - "insecure-skip-verify-enforce": false, - "cri-runtimes": { - "kata": "io.containerd.kata.v2" - } - } - - path: /root/pull_images.sh - mode: 0644 - overwrite: true - contents: - inline: | - #!/bin/sh - KUBE_VERSION=v1.23.10 - KUBE_PAUSE_VERSION=3.6 - ETCD_VERSION=3.5.1-0 - DNS_VERSION=v1.8.6 - CALICO_VERSION=v3.19.4 - username=${image-repository} - images=( - kube-proxy:${KUBE_VERSION} - 
kube-scheduler:${KUBE_VERSION} - kube-controller-manager:${KUBE_VERSION} - kube-apiserver:${KUBE_VERSION} - pause:${KUBE_PAUSE_VERSION} - etcd:${ETCD_VERSION} - ) - for image in ${images[@]} - do - isula pull ${username}/${image} - isula tag ${username}/${image} k8s.gcr.io/${image} - isula rmi ${username}/${image} - done - isula pull ${username}/coredns:${DNS_VERSION} - isula tag ${username}/coredns:${DNS_VERSION} k8s.gcr.io/coredns/coredns:${DNS_VERSION} - isula rmi ${username}/coredns:${DNS_VERSION} - isula pull calico/node:${CALICO_VERSION} - isula pull calico/cni:${CALICO_VERSION} - isula pull calico/kube-controllers:${CALICO_VERSION} - isula pull calico/pod2daemon-flexvol:${CALICO_VERSION} - touch /var/log/pull-images.stamp - - path: /etc/systemd/system/kubelet.service.d/10-kubeadm.conf - mode: 0644 - contents: - inline: | - # Note: This dropin only works with kubeadm and kubelet v1.11+ - [Service] - Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf" - Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml" - # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically - EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env - # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use - # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file. 
- EnvironmentFile=-/etc/sysconfig/kubelet - ExecStart= - ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS - - path: /root/init-config.yaml - mode: 0644 - contents: - inline: | - apiVersion: kubeadm.k8s.io/v1beta2 - kind: InitConfiguration - nodeRegistration: - criSocket: /var/run/isulad.sock - name: k8s-master01 - kubeletExtraArgs: - volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/" - --- - apiVersion: kubeadm.k8s.io/v1beta2 - kind: ClusterConfiguration - controllerManager: - extraArgs: - flex-volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/" - kubernetesVersion: v1.23.10 - imageRepository: k8s.gcr.io - controlPlaneEndpoint: "${MASTER_IP}:6443" - networking: - serviceSubnet: "${service-cidr}" - podSubnet: "${pod-network-cidr}" - dnsDomain: "cluster.local" - dns: - type: CoreDNS - imageRepository: k8s.gcr.io/coredns - imageTag: v1.8.6 - links: - - path: /etc/localtime - target: ../usr/share/zoneinfo/Asia/Shanghai - -systemd: - units: - - name: kubelet.service - enabled: true - contents: | - [Unit] - Description=kubelet: The Kubernetes Node Agent - Documentation=https://kubernetes.io/docs/ - Wants=network-online.target - After=network-online.target - - [Service] - ExecStart=/usr/bin/kubelet - Restart=always - StartLimitInterval=0 - RestartSec=10 - - [Install] - WantedBy=multi-user.target - - - name: set-kernel-para.service - enabled: true - contents: | - [Unit] - Description=set kernel para for Kubernetes - ConditionPathExists=!/var/log/set-kernel-para.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=modprobe br_netfilter - ExecStart=sysctl -p /etc/sysctl.d/kubernetes.conf - ExecStart=/bin/touch /var/log/set-kernel-para.stamp - - [Install] - WantedBy=multi-user.target - - - name: pull-images.service - enabled: true - contents: | - [Unit] - Description=pull images for kubernetes - ConditionPathExists=!/var/log/pull-images.stamp - - 
[Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=systemctl start isulad - ExecStart=systemctl enable isulad - ExecStart=sh /root/pull_images.sh - - [Install] - WantedBy=multi-user.target - - - name: disable-selinux.service - enabled: true - contents: | - [Unit] - Description=disable selinux for kubernetes - ConditionPathExists=!/var/log/disable-selinux.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=bash -c "sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config" - ExecStart=setenforce 0 - ExecStart=/bin/touch /var/log/disable-selinux.stamp - - [Install] - WantedBy=multi-user.target - - - name: set-time-sync.service - enabled: true - contents: | - [Unit] - Description=set time sync for kubernetes - ConditionPathExists=!/var/log/set-time-sync.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=bash -c "sed -i '3aserver ntp1.aliyun.com iburst' /etc/chrony.conf" - ExecStart=bash -c "sed -i '24aallow ${MASTER_SEGMENT}' /etc/chrony.conf" - ExecStart=bash -c "sed -i '26alocal stratum 10' /etc/chrony.conf" - ExecStart=systemctl restart chronyd.service - ExecStart=/bin/touch /var/log/set-time-sync.stamp - - [Install] - WantedBy=multi-user.target - - - name: init-cluster.service - enabled: true - contents: | - [Unit] - Description=init kubernetes cluster - Requires=set-kernel-para.service pull-images.service disable-selinux.service set-time-sync.service - After=set-kernel-para.service pull-images.service disable-selinux.service set-time-sync.service - ConditionPathExists=/var/log/set-kernel-para.stamp - ConditionPathExists=/var/log/set-time-sync.stamp - ConditionPathExists=/var/log/disable-selinux.stamp - ConditionPathExists=/var/log/pull-images.stamp - ConditionPathExists=!/var/log/init-k8s-cluster.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=kubeadm init --config=/root/init-config.yaml --upload-certs - ExecStart=/bin/touch /var/log/init-k8s-cluster.stamp - - [Install] - 
WantedBy=multi-user.target - - - - name: install-cni-plugin.service - enabled: true - contents: | - [Unit] - Description=install cni network plugin for kubernetes - Requires=init-cluster.service - After=init-cluster.service - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=bash -c "curl https://docs.projectcalico.org/v3.19/manifests/calico.yaml -o /root/calico.yaml" - ExecStart=/bin/sleep 6 - ExecStart=bash -c "sed -i 's#usr/libexec/#opt/libexec/#g' /root/calico.yaml" - ExecStart=kubectl apply -f /root/calico.yaml --kubeconfig=/etc/kubernetes/admin.conf - - [Install] - WantedBy=multi-user.target - -``` - -Node节点butane配置文件示例: - -```yaml -variant: fcos -version: 1.1.0 -passwd: - users: - - name: root - password_hash: "${PASSWORD_HASH}" - "groups": [ - "adm", - "sudo", - "systemd-journal", - "wheel" - ] - ssh_authorized_keys: - - "${SSH-RSA}" -storage: - directories: - - path: /etc/systemd/system/kubelet.service.d - overwrite: true - files: - - path: /etc/hostname - mode: 0644 - contents: - inline: ${NODE_NAME} - - path: /etc/hosts - mode: 0644 - overwrite: true - contents: - inline: | - 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 - ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 - ${MASTER_IP} ${MASTER_NAME} - ${NODE_IP} ${NODE_NAME} - - path: /etc/NetworkManager/system-connections/ens2.nmconnection - mode: 0600 - overwrite: true - contents: - inline: | - [connection] - id=ens2 - type=ethernet - interface-name=ens2 - [ipv4] - address1=${NODE_IP}/24,${GATEWAY} - dns=8.8.8.8; - dns-search= - method=manual - - path: /etc/sysctl.d/kubernetes.conf - mode: 0644 - overwrite: true - contents: - inline: | - net.bridge.bridge-nf-call-iptables=1 - net.bridge.bridge-nf-call-ip6tables=1 - net.ipv4.ip_forward=1 - - path: /etc/isulad/daemon.json - mode: 0644 - overwrite: true - contents: - inline: | - { - "exec-opts": ["native.cgroupdriver=systemd"], - "group": "isula", - "default-runtime": "lcr", - "graph": 
"/var/lib/isulad", - "state": "/var/run/isulad", - "engine": "lcr", - "log-level": "ERROR", - "pidfile": "/var/run/isulad.pid", - "log-opts": { - "log-file-mode": "0600", - "log-path": "/var/lib/isulad", - "max-file": "1", - "max-size": "30KB" - }, - "log-driver": "stdout", - "container-log": { - "driver": "json-file" - }, - "hook-spec": "/etc/default/isulad/hooks/default.json", - "start-timeout": "2m", - "storage-driver": "overlay2", - "storage-opts": [ - "overlay2.override_kernel_check=true" - ], - "registry-mirrors": [ - "docker.io" - ], - "insecure-registries": [ - "${image-repository}" - ], - "pod-sandbox-image": "k8s.gcr.io/pause:3.6", - "native.umask": "secure", - "network-plugin": "cni", - "cni-bin-dir": "/opt/cni/bin", - "cni-conf-dir": "/etc/cni/net.d", - "image-layer-check": false, - "use-decrypted-key": true, - "insecure-skip-verify-enforce": false, - "cri-runtimes": { - "kata": "io.containerd.kata.v2" - } - } - - path: /root/pull_images.sh - mode: 0644 - overwrite: true - contents: - inline: | - #!/bin/sh - KUBE_VERSION=v1.23.10 - KUBE_PAUSE_VERSION=3.6 - ETCD_VERSION=3.5.1-0 - DNS_VERSION=v1.8.6 - CALICO_VERSION=v3.19.4 - username=${image-repository} - images=( - kube-proxy:${KUBE_VERSION} - kube-scheduler:${KUBE_VERSION} - kube-controller-manager:${KUBE_VERSION} - kube-apiserver:${KUBE_VERSION} - pause:${KUBE_PAUSE_VERSION} - etcd:${ETCD_VERSION} - ) - for image in ${images[@]} - do - isula pull ${username}/${image} - isula tag ${username}/${image} k8s.gcr.io/${image} - isula rmi ${username}/${image} - done - isula pull ${username}/coredns:${DNS_VERSION} - isula tag ${username}/coredns:${DNS_VERSION} k8s.gcr.io/coredns/coredns:${DNS_VERSION} - isula rmi ${username}/coredns:${DNS_VERSION} - touch /var/log/pull-images.stamp - - path: /etc/systemd/system/kubelet.service.d/10-kubeadm.conf - mode: 0644 - contents: - inline: | - # Note: This dropin only works with kubeadm and kubelet v1.11+ - [Service] - 
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf" - Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml" - # This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically - EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env - # This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use - # the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file. - EnvironmentFile=-/etc/sysconfig/kubelet - ExecStart= - ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS - - path: /root/join-config.yaml - mode: 0644 - contents: - inline: | - apiVersion: kubeadm.k8s.io/v1beta3 - caCertPath: /etc/kubernetes/pki/ca.crt - discovery: - bootstrapToken: - apiServerEndpoint: ${MASTER_IP}:6443 - token: ${token} - unsafeSkipCAVerification: true - timeout: 5m0s - tlsBootstrapToken: ${token} - kind: JoinConfiguration - nodeRegistration: - criSocket: /var/run/isulad.sock - imagePullPolicy: IfNotPresent - name: ${NODE_NAME} - taints: null - links: - - path: /etc/localtime - target: ../usr/share/zoneinfo/Asia/Shanghai - -systemd: - units: - - name: kubelet.service - enabled: true - contents: | - [Unit] - Description=kubelet: The Kubernetes Node Agent - Documentation=https://kubernetes.io/docs/ - Wants=network-online.target - After=network-online.target - - [Service] - ExecStart=/usr/bin/kubelet - Restart=always - StartLimitInterval=0 - RestartSec=10 - - [Install] - WantedBy=multi-user.target - - - name: set-kernel-para.service - enabled: true - contents: | - [Unit] - Description=set kernel para for kubernetes - ConditionPathExists=!/var/log/set-kernel-para.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - 
ExecStart=modprobe br_netfilter - ExecStart=sysctl -p /etc/sysctl.d/kubernetes.conf - ExecStart=/bin/touch /var/log/set-kernel-para.stamp - - [Install] - WantedBy=multi-user.target - - - name: pull-images.service - enabled: true - contents: | - [Unit] - Description=pull images for kubernetes - ConditionPathExists=!/var/log/pull-images.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=systemctl start isulad - ExecStart=systemctl enable isulad - ExecStart=sh /root/pull_images.sh - - [Install] - WantedBy=multi-user.target - - - name: disable-selinux.service - enabled: true - contents: | - [Unit] - Description=disable selinux for kubernetes - ConditionPathExists=!/var/log/disable-selinux.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=bash -c "sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config" - ExecStart=setenforce 0 - ExecStart=/bin/touch /var/log/disable-selinux.stamp - - [Install] - WantedBy=multi-user.target - - - name: set-time-sync.service - enabled: true - contents: | - [Unit] - Description=set time sync for kubernetes - ConditionPathExists=!/var/log/set-time-sync.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - ExecStart=bash -c "sed -i '3aserver ${MASTER_IP}' /etc/chrony.conf" - ExecStart=systemctl restart chronyd.service - ExecStart=/bin/touch /var/log/set-time-sync.stamp - - [Install] - WantedBy=multi-user.target - - - name: join-cluster.service - enabled: true - contents: | - [Unit] - Description=node join kubernetes cluster - Requires=set-kernel-para.service pull-images.service disable-selinux.service set-time-sync.service - After=set-kernel-para.service pull-images.service disable-selinux.service set-time-sync.service - ConditionPathExists=/var/log/set-kernel-para.stamp - ConditionPathExists=/var/log/set-time-sync.stamp - ConditionPathExists=/var/log/disable-selinux.stamp - ConditionPathExists=/var/log/pull-images.stamp - - [Service] - Type=oneshot - RemainAfterExit=yes - 
ExecStart=kubeadm join --config=/root/join-config.yaml - - [Install] - WantedBy=multi-user.target - -``` - -### 生成Ignition文件 - -为了方便使用者读、写,Ignition文件增加了一步转换过程。将Butane配置文件(yaml格式)转换成Ignition文件(json格式),并使用生成的Ignition文件引导新的NestOS镜像。Butane配置转换成Ignition配置命令: - -``` -podman run --interactive --rm quay.io/coreos/butane:release --pretty --strict < your_config.bu > transpiled_config.ign -``` - - - -## K8S集群搭建 - -利用上一节配置的Ignition文件,执行下述命令创建k8s集群的Master节点,其中 vcpus、ram 和 disk 参数可自行调整,详情可参考 virt-install 手册。 - -``` -virt-install --name=${NAME} --vcpus=4 --ram=8192 --import --network=bridge=virbr0 --graphics=none --qemu-commandline="-fw_cfg name=opt/com.coreos/config,file=${IGNITION_FILE_PATH}" --disk=size=40,backing_store=${NESTOS_RELEASE_QCOW2_PATH} --network=bridge=virbr1 --disk=size=40 -``` - -Master节点系统安装成功后,系统后台会起一系列环境配置服务,其中set-kernel-para.service会配置内核参数,pull-images.service会拉取集群所需的镜像,disable-selinux.service会关闭selinux,set-time-sync.service服务会设置时间同步,init-cluster.service会初始化集群,之后install-cni-plugin.service会安装cni网络插件。整个集群部署过程中由于要拉取镜像,所以需要等待几分钟。 - -通过kubectl get pods -A命令可以查看是否所有pod状态都为running - - -在Master节点上通过下面命令查看token: - -``` -kubeadm token list -``` - -将查询到的token信息添加到Node节点的ignition文件中,并利用该ignition文件创建Node节点。Node节点创建完成后,在Master节点上通过执行kubectl get nodes命令,可以查看Node节点是否加入到了集群中。 - -至此,k8s部署成功 - -# rpm-ostree使用 - -## rpm-ostree安装软件包 - -安装wget - -``` -rpm-ostree install wget -``` - -重启系统,可在启动时通过键盘上下按键选择rpm包安装完成后或安装前的系统状态,其中【ostree:0】为安装之后的版本。 - -``` -systemctl reboot -``` - -查看wget是否安装成功 - -``` -rpm -qa | grep wget -``` - -## rpm-ostree 手动更新升级 NestOS - -在NestOS中执行命令可查看当前rpm-ostree状态,可看到当前版本号 - -``` -rpm-ostree status -``` - -执行检查命令查看是否有升级可用,发现存在新版本 - -``` -rpm-ostree upgrade --check -``` - -预览版本的差异 - -``` -rpm-ostree upgrade --preview -``` - -在最新版本中,我们将nano包做了引入。 -执行如下指令会下载最新的ostree和RPM数据,不需要进行部署 - -``` -rpm-ostree upgrade --download-only -``` - -重启NestOS,重启后可看到系统的新旧版本两个状态,选择最新版本的分支进入 - -``` -rpm-ostree upgrade --reboot -``` - -## 比较NestOS版本差别 - 
-检查状态,确认此时ostree有两个版本,分别为LTS.20210927.dev.0和LTS.20210928.dev.0 - -``` -rpm-ostree status -``` - -根据commit号比较2个ostree的差别 - -``` -rpm-ostree db diff 55eed9bfc5ec fe2408e34148 -``` - -## 系统回滚 - -当一个系统更新完成,之前的NestOS部署仍然在磁盘上,如果更新导致了系统出现问题,可以使用之前的部署回滚系统。 - -### 临时回滚 - -要临时回滚到之前的OS部署,在系统启动过程中按住shift键,当引导加载菜单出现时,在菜单中选择相关的分支。 - -### 永久回滚 - -要永久回滚到之前的操作系统部署,登录到目标节点,运行rpm-ostree rollback,此操作将使用之前的系统部署作为默认部署,并重新启动到其中。 -执行命令,回滚到前面更新前的系统。 - -``` -rpm-ostree rollback -``` - -重启后失效。 - -## 切换版本 - -在上一步将NestOS回滚到了旧版本,可以通过命令切换当前 NestOS 使用的rpm-ostree版本,将旧版本切换为新版本。 - -``` -rpm-ostree deploy -r 22.03.20220325.dev.0 -``` - -重启后确认目前NestOS已经使用的是新版本的ostree了。 - - -# zincati自动更新使用 - -zincati负责NestOS的自动更新,zincati通过cincinnati提供的后端来检查当前是否有可更新版本,若检测到有可更新版本,会通过rpm-ostree进行下载。 - -目前系统默认关闭zincati自动更新服务,可通过修改配置文件设置为开机自动启动自动更新服务。 - -``` -vi /etc/zincati/config.d/95-disable-on-dev.toml -``` - -将updates.enabled设置为true -同时增加配置文件,修改cincinnati后端地址 - -``` -vi /etc/zincati/config.d/update-cincinnati.toml -``` - -添加如下内容 - -``` -[cincinnati] -base_url="http://nestos.org.cn:8080" -``` - -重新启动zincati服务 - -``` -systemctl restart zincati.service -``` - -当有新版本时,zincati会自动检测到可更新版本,此时查看rpm-ostree状态,可以看到状态是“busy”,说明系统正在升级中。 - -一段时间后NestOS将自动重启,此时再次登录NestOS,可以再次确认rpm-ostree的状态,其中状态转为"idle",而且当前版本已经是“20220325”,这说明rpm-ostree版本已经升级了。 - -查看zincati服务的日志,确认升级的过程和重启系统的日志。另外日志显示的"auto-updates logic enabled"也说明更新是自动的。 - -# 定制NestOS - -我们可以使用nestos-installer 工具对原始的NestOS ISO文件进行加工,将Ignition文件打包进去从而生成定制的 NestOS ISO文件。使用定制的NestOS ISO文件可以在系统启动完成后自动执行NestOS的安装,因此NestOS的安装会更加简单。 - -在开始定制NestOS之前,需要做如下准备工作: - -- 下载 NestOS ISO -- 准备 config.ign文件 - -## 生成定制NestOS ISO文件 - -### 设置参数变量 - -``` -$ export COREOS_ISO_ORIGIN_FILE=nestos-22.03.20220324.x86_64.iso -$ export COREOS_ISO_CUSTOMIZED_FILE=my-nestos.iso -$ export IGN_FILE=config.ign -``` - -### ISO文件检查 - -确认原始的NestOS ISO文件中是没有包含Ignition配置。 - -``` -$ nestos-installer iso ignition show $COREOS_ISO_ORIGIN_FILE - -Error: No embedded Ignition config. 
-``` - -### 生成定制NestOS ISO文件 - -将Ignition文件和原始NestOS ISO文件打包生成定制的NestOS ISO文件。 - -``` -$ nestos-installer iso ignition embed $COREOS_ISO_ORIGIN_FILE --ignition-file $IGN_FILE $COREOS_ISO_ORIGIN_FILE --output $COREOS_ISO_CUSTOMIZED_FILE -``` - -### ISO文件检查 - -确认定制NestOS ISO 文件中已经包含Ignition配置了 - -``` -$ nestos-installer iso ignition show $COREOS_ISO_CUSTOMIZED_FILE -``` - -执行命令,将会显示Ignition配置内容 - -## 安装定制NestOS ISO文件 - -使用定制的 NestOS ISO 文件可以直接引导安装,并根据Ignition自动完成NestOS的安装。在完成安装后,我们可以直接在虚拟机的控制台上用nest/password登录NestOS。 \ No newline at end of file diff --git "a/docs/zh/docs/NestOS/\345\212\237\350\203\275\347\211\271\346\200\247\346\217\217\350\277\260.md" "b/docs/zh/docs/NestOS/\345\212\237\350\203\275\347\211\271\346\200\247\346\217\217\350\277\260.md" index c513fd2f0ab3d70525af432656d53d78388e9c8b..f0f661d462d4f467b910c0aa902434865937af6d 100644 --- "a/docs/zh/docs/NestOS/\345\212\237\350\203\275\347\211\271\346\200\247\346\217\217\350\277\260.md" +++ "b/docs/zh/docs/NestOS/\345\212\237\350\203\275\347\211\271\346\200\247\346\217\217\350\277\260.md" @@ -102,4 +102,4 @@ Afterburn包含了很多可以在实例生命周期中不同时间段运行的 (2)从实例元数据中检索属性 -(3)给提供者登记以便报道成功的启动或实例供应 \ No newline at end of file +(3)给提供者登记以便报道成功的启动或实例供应 diff --git "a/docs/zh/docs/NestOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/NestOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" deleted file mode 100644 index ba1f7cacbd6b9e8cccd2623f66ba2e9f7d3e3c25..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/NestOS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ /dev/null @@ -1,129 +0,0 @@ -# 安装与部署 - -## 在 VMware 上部署 NestOS - -本指南展示了如何在VMware虚拟机管理程序上配置最新的 NestOS。 - -目前NestOS已支持x86_64和aarch64架构。 - -### 开始之前 - -​ 在开始部署 NestOS 之前,需要做如下准备工作: - -- 下载 NestOS ISO -- 准备 config.bu 文件 -- 配置 butane 工具(Linux环境/win10环境) -- 安装有VMware的宿主机 - -### 初步安装与启动 - -#### 启动 NestOS - -初次启动 NestOS ,ignition 尚未安装,可根据系统提示使用 nestos-installer 组件进行ignition的安装。 - -### 配置 ignition 
文件 - -#### 获取 Butane - -可以通过 Butane 将 bu 文件转化为 ignition 文件。ignition 配置文件被设计为可读但难以编写,是为了 -阻止用户尝试手动编写配置。 -Butane 提供了多种环境的支持,可以在 linux/windows 宿主机中或容器环境中进行配置。 - -``` -docker pull quay.io/coreos/butane:release -``` - -#### 生成登录密码 - -在宿主机执行如下命令,并输入你的密码。 - -``` -# openssl passwd -1 -salt yoursalt -Password: -$1$yoursalt$1QskegeyhtMG2tdh0ldQN0 -``` - -#### 生成ssh-key - -在宿主机执行如下命令,获取公钥和私钥以供后续 ssh 登录。 - -``` -# ssh-keygen -N '' -f ./id_rsa -Generating public/private rsa key pair. -Your identification has been saved in ./id_rsa -Your public key has been saved in ./id_rsa.pub -The key fingerprint is: -SHA256:4fFpDDyGHOYEd2fPaprKvvqst3T1xBQuk3mbdon+0Xs root@host-12-0-0-141 -``` - -``` -The key's randomart image is: -+---[RSA 3072]----+ -| ..= . o . | -| * = o * . | -| + B = * | -| o B O + . | -| S O B o | -| * = . . | -| . +o . . | -| +.o . .E | -| o*Oo ... | -+----[SHA256]-----+ -``` - -可以在当前目录查看id_rsa.pub公钥: - -``` -# cat id_rsa.pub -ssh-rsa -AAAAB3NzaC1yc2... -``` - -#### 编写bu文件 - -进行最简单的初始配置,如需更多详细的配置,参考后面的 ignition 详解。 -如下为最简单的 config.bu 文件: - -``` -variant: fcos -version: 1.1.0 -passwd: - users: - - name: nest - password_hash: "$1$yoursalt$1QskegeyhtMG2tdh0ldQN0" - ssh_authorized_keys: - - "ssh-rsa - AAAAB3NzaC1yc2EAAA..." 
-``` - -#### 生成ignition文件 - -将 config.bu 通过 Butane 工具转换为 config.ign 文件,如下为在容器环境下进行转换。 - -``` -# docker run --interactive --rm quay.io/coreos/butane:release \ ---pretty --strict < your_config.bu > transpiled_config.ign -``` - -### 安装 NestOS - -将宿主机生成的config.ign文件通过scp拷贝到前面初步启动的 NestOS 中,该OS目前运行在内存中, -并没有安装到硬盘。 - -``` -sudo -i -scp root@your_ipAddress:/root/config.ign /root -``` - -根据系统所给提示,执行如下指令完成安装。 - -``` -nestos-installer install /dev/sda --ignition-file config.ign -``` - -安装完成后重启 NestOS 。 - -``` -systemctl reboot -``` -完成。 diff --git "a/docs/zh/docs/NfsMultipath/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/NfsMultipath/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index 3afa92ecff217809f3bb6e65fec61453abbee794..a00bf66181a2946e0e4ed0930199fb3f3cfa30b5 100644 --- "a/docs/zh/docs/NfsMultipath/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/NfsMultipath/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -1,9 +1,5 @@ # 安装与部署 -## 软件要求 - -* 操作系统:openEuler 23.03 - ## 硬件要求 * x86_64架构 diff --git a/docs/zh/docs/Ods-Pipeline/grammar/v1.1/v1.1_grammar.md b/docs/zh/docs/Ods-Pipeline/grammar/v1.1/v1.1_grammar.md new file mode 100644 index 0000000000000000000000000000000000000000..d8fe36953afe4e6dab897b3c6a8e537b53c9cb17 --- /dev/null +++ b/docs/zh/docs/Ods-Pipeline/grammar/v1.1/v1.1_grammar.md @@ -0,0 +1,1301 @@ +# V1.1语法说明文档 + +workflow,即流水线,是一连串具备一定串并联关系的任务组合,描述一连串的任务之间存在的依赖关系、输入输出参数,以及整个流水线的触发条件。 + +不同版本具备不同的语法规则,从零编写和学会workflow的声明方法请详阅对应版本的文档,避免无法正常解析。 + +## 版本信息 + +| 版本 | v1.1 | +| --- | --- | +| 维护者 | wanglin | +| 创建时间 | 2024-01-03 | +| 是否废弃 | 否 | + + + +## 1. 
语法特性 + +流水线通过YAML描述,描述文件的YAML语法规则基于YAML 1.2版本,书写时需遵循YAML 1.2支持的书写方式。后文中提及的语法特性描述的是在此基础之上的解析规则,不涉及YAML 1.2语法的说明。 + +> YAML(YAML Ain't Markup Language)是一种人类可读的数据序列化标准,它被广泛用于配置文件、数据交换语言、云计算等场景。YAML 1.2 是 YAML 的最新版本,于 2009 年发布。 +> +> 相比于之前的版本,做了一些重要的改进和修正,包括: +> +> - 更严格的类型转换规则,以避免一些常见的类型转换错误。 +> - 支持 JSON,即任何有效的 JSON 文件也是一个有效的 YAML 1.2 文件。 +> - 更好的 Unicode 支持。 +> +> YAML 1.2 的官方文档可以在以下链接找到:[YAML 1.2 官方文档](http://yaml.org/spec/1.2/spec.html)。这份文档详细地描述了 YAML 1.2 的所有特性和语法规则。 + + + +### 1.0. 编码风格 + +- 键命名风格 + + 为使流水线描述文档风格统一,建议所有的键命名均采用"lower_case"的命名方式,尽量不使用大写字符,如下示例: + + ```yaml + this_is_a_key: value + jobs.this_is_a_job: job + ``` + + 注意:命名风格不等于命名规则,如果不遵循建议的键命名风格并不会出现错误。 + + + +- 一级key声明风格 + + 一级key的含义为整个YAML文档的第一级键,虽然第一级键无论以何种顺序排列不会影响解析结果,但基于统一风格的出发点考虑,建议用户按照如下顺序对一级key进行排列,且一级key之间通过一行空行间隔,如下示例: + + ```yaml + # 版本声明,可以不存在,则默认使用v1.0语法解析 + version: v1.1 + + # 流水线命名,必填 + name: + + # 流水线触发设置,可以不存在 + on: + + # 流水线变量,可以不存在 + vars: + + # 流水线额外事件声明,可以不存在 + events.xxx: + + # 流水线job声明,至少需要声明一个job + jobs.xxx: + + # 流水线控制流说明 + sequence: + xxx: + ``` + + 对于本版本流水线语法而言,一级key仅识别上述7类关键字,在这七种关键字之外的一级key将被忽略。如果某个关键字不存在,比如vars,剩余关键字建议仍保持上述先后顺序排列。 + + 对于关键字的含义和详细语法说明见后文。 + + + +- 每行文本长度 + + 为了保持良好的可读性,建议每行文本长度不要超过**80**个字符。这是一种常见的编程规范,可以使代码在大多数编辑器和终端中看起来更清晰。但这并不是强制性的规定,根据实际情况和个人习惯,可以适当调整。 + + 对于长文本,可以利用YAML的特性转行声明,如下示例: + + ```yaml + # 通过"|"语法保留换行符"\n" + key: | + this is a long long story, + you could learn it step by step. + # key = "this is a long long story,\nyou could learn it step by step." + + # 通过"|+"语法保留所有换行符"\n" + key: |+ + this is a long long story, + you could learn it step by step. + + + # key = "this is a long long story,\nyou could learn it step by step.\n\n\n" + + # 通过"|-"语法,去除末尾换行符"\n" + key: |- + this is a long long story, + you could learn it step by step. + + + # key = "this is a long long story,\nyou could learn it step by step."
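+
+  # 补充示例:通过">-"语法,折行内容以空格连接,并去除末尾换行符"\n"
+  key: >-
+    this is a long long story,
+    you could learn it step by step.
+  # key = "this is a long long story, you could learn it step by step."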
+ + # 通过">"语法,虽然内容书写存在换行,但解析后的内容去除换行,以空格代替 + key: > + https://repo1/ + https://repo2/ + # key = "https://repo1/ https://repo2" + ``` + + 更多的说明请参考YAML1.2官方文档(见 **章节1. 语法特性** 开头) + + + +### 1.1. 基本声明 + +基本声明包含**version**与**name**两个一级key,前者用以锚定语法解析版本,后者用以标识流水线名称。 + +#### 1.1.1. 语法版本声明 + +workflow支持多版本语法解析,对于不声明version的workflow而言,采用默认版本语法(v1.0)进行解析。 +声明版本通过关键字version定义: + +```yaml +# 一般情况下,version会被声明在workflow.yaml的顶部 +# 但version的位置并不会影响解析,确保version并非嵌套于其他key下即可 + +version: v1.1 +``` + +根据声明的version的不同,请查阅不同版本的语法特性介绍。 + +#### 1.1.2. 流水线命名 + +无论在什么版本,workflow的名字均由name字段定义。流水线的名字不要求唯一,可以是任意**字符串**。 + +name为一定需要定义的key,如果流水线yaml中缺少这个key解析器将不予通过。 + +```yaml +version: v1.0 + +name: my workflow +``` + + + +### 1.2. 触发条件定义 + +#### 1.2.1. workflow支持三种触发方式 + + - 手动触发: 基本的触发方式,不传递触发事件数据的方式,触发后将提交所有不存在依赖的任务 + - 定时触发: 周期性自动触发方式,通过设定时间条件,系统自动触发。 + - webhook触发: git仓库配置服务的webhook回调地址等信息,通过webhook回调请求自动触发。 + + 定时触发和webhook触发方式通过关键字"on"定义,如果不需要这两种触发方式,on可以不声明。 + + + 示例: + + ```yaml + version: v1.1 + + name: workflow + + on: + # webhook触发事件定义 + - type: webhook/pr + git_repo: https://gitee.com/openeuler/radiaTest.git + branch: master + # 定时触发事件定义 + - type: cron + crontab: 0 15 10 ? * MON-FRI + + other_keys: other_values + ``` + +#### 1.2.2. 
webhook触发 + + webhook事件分为webhook/pr,webhook/push,webhook/note,webhook/issue四种类型。 + + - PullRequest类事件 + + 声明的事件键值对必须包含type(webhook/pr),git_repo(仓库地址),branch(仓库分支),action(场景,共8类) + + 即当对应的仓库分支存在PullRequest相关事件时,均会触发此流水线,包括新建PR/删除PR/合入PR/...等事件。 + + 如果需要对PR事件进行更细致的筛选,用户可以指定action进行过滤。 + + - Push类事件 + + 声明的事件键值对必须包含type(webhook/push),git_repo(仓库地址),branch(仓库分支) + + 即当对应的仓库分支被推送更新后,均会触发此流水线。 + + - 评论类事件 + + 声明的事件键值对必须包含type(webhook/note),git_repo(仓库地址),branch(仓库分支,仅支持PullRequest场景存在),notable_type(评论主体),notes(评论钩子) + + 即当对应的仓库分支的指定被评论主体(如PullRequest作为被评论主体)并且评论内容能够匹配评论钩子时,会触发此流水线。 + + - Issue类事件 + + 声明的事件键值对必须包含type(webhook/issue),git_repo(仓库地址),state(问题单状态) + + 即当对应的仓库分支存在Issue相关事件时,均会触发此流水线,包括新建issue/删除issue/...等事件。 + + state提供了对issue的场景细分能力,通过配置状态可以拆分不同场景。 + + + + webhook事件的声明支持矩阵式声明方式,从而帮助减少重复描述,如下所示: + + ```yaml +on: + - type: webhook/note + git_repo: https://gitee.com/openeuler/radiaTest.git + branch: + - master + - dev + - test + notable_type: + - PullRequest + notes: + - /retry + - /retest + ``` + +这个例子意味着多个分支下的PullRequest被评论了"/retry"或"/retest"时,都会触发此流水线。 + +以上关于webhook的讲解较为粗略,建议阅读更详细的webhook配置文档,以便更好地使用: + +https://gitee.com/openeuler-customization/ods/blob/master/src/workflow_webhook/README.md + + + +特别说明: 如果不仅仅需要触发,还需要在流水线中引用(该特性将在后文详述)触发事件中的字段,建议编辑者通过查阅不同git仓库的webhook文档了解。 + + 1. Gitee: https://help.gitee.com/enterprise/code-manage/%E9%9B%86%E6%88%90%E4%B8%8E%E7%94%9F%E6%80%81/WebHook/WebHook%20%E7%AE%80%E4%BB%8B + + 2. Github: https://docs.github.com/webhooks + + 3. Gitlab: https://docs.gitlab.com/ee/user/project/integrations/webhooks.html + + +#### 1.2.3. 定时触发事件 + + 当配置的触发事件type字段为[cron,interval,date]值时,说明该事件为定时事件,对于某一个定时任务,type仅可为其中某一个取值,以下为简单示例: + + ```yaml +on: + - type: cron + crontab: 0 15 10 ? 
* MON-FRI + - type: interval + seconds: 60 + - type: date + run_date: 2024-01-01 00:00:00 + - type: date + run_date: 164900500 + ``` + +以上配置,意味着时间满足任意一个场景时,流水线被执行。 + +关于三种类型的定时参数,参看以下官方文档配置即可: + +https://apscheduler.readthedocs.io/en/stable/modules/triggers/cron.html + +https://apscheduler.readthedocs.io/en/stable/modules/triggers/interval.html + +https://apscheduler.readthedocs.io/en/stable/modules/triggers/date.html + +### 1.3. 流水线全局变量定义 + +流水线的全局变量通过vars字段声明,当前版本支持字符串、数组、对象(字典/哈希表)三种格式。 + +```yaml +version: v1.1 + +name: workflow + +vars: + # 字符串 + varA: string + # 数组 + varB: + - elementA + - elementB + # 对象(支持多级结构嵌套) + varC: + keyA: valueA + # 嵌套数组 + keyB: + - valueB1 + - valueB2 + # 嵌套对象 + keyC: + keyC1: +``` + +流水线变量定义的目的在于定义整个流水线可以利用的若干变量(常量),定义后的变量可以被流水线任意阶段任务引用,从而避免每个任务对于该变量的重复冗余声明。 + +举例而言,假设某个流水线的10个任务均需要上述案例的varB变量作为Input(输入/入参),则只需要引用varB赋予给对应参数即可。 + +具体的引用方式详见后文对于引用语法的介绍。 + +如果一个流水线不需要定义任何流水线变量时,vars关键字可以不存在: + +```yaml +version: v1.1 + +name: workflow + +other_keys: other_values +``` + +> 💡 注:v1.1相较上一个版本v1.0的新语法说明 + +除了在一级key中声明vars外,还可以在sequence中定义某些job特殊的vars,并且局部vars的优先级高于全局vars。 + +### 1.4. 额外事件声明 + +如果流水线内部的某个任务除了依赖于前置的任务外,还实际依赖于额外的webhook事件,或者依赖于一些额外的事件,则需要对这些额外的事件进行预声明。 + +用户通过events.xxx模式的key进行额外事件的声明,"xxx"为额外事件的命名。 + +**注意:**大多数情况下,用户不需要定义额外事件。额外事件不支持定时以及手动事件声明。 + +#### 1.4.1. webhook事件 + +当定义额外webhook事件时,该事件需求定义的key-values与上文流水线触发设置中介绍的一致,如下示例: + +```yaml +version: v1.1 + +name: workflow + +vars: + +events.eventA: + type: webhook/pr + git_repo: https://gitee.com/openeuler/repositry.git + branch: + - master + - dev +``` + +#### 1.4.2. 
job事件
+
+除了webhook事件外,额外事件可以定义一种新的事件类型,即job类型事件。job类型事件分为stage和step两个子类,如下示例:
+
+```yaml
+version: v1.1
+
+name: workflow
+
+vars:
+
+# 额外事件A - jobA进入boot阶段
+events.eventA:
+  type: job/stage
+  job: jobA
+  job_stage: boot
+
+# 额外事件B - jobA进入名为testcase001的步骤
+events.eventB:
+  type: job/step
+  job: jobA
+  job_step: testcase001
+
+# 额外事件C - jobA进入finish阶段且incomplete
+events.eventC:
+  type: job/stage
+  job: jobA
+  job_stage: finish
+  job_health: incomplete
+```
+
+job类型的事件除了type和job必填外,其他字段可以根据需求从job类型事件的全集keys中选取(job_stage/job_health/job_step/nickname)。
+
+定义后的job类型事件如何使用见后文任务定义和控制流声明章节。
+
+
+
+### 1.5. 任务定义
+
+#### 1.5.1. 基础概念
+
+对于所有流水线涉及的任务,都需要通过jobs.xxx模式的一级key进行一次声明,主要目的为定义任务的Input(输入/入参),且每个被声明job的value均要求为对象(字典/哈希表)格式(或者为空)。
+
+jobs.xxx类关键字常见的两种二级关键字为**defaults**和**overrides**,这两个二级key可以不声明,但如果具备value,则value必须为对象(字典/哈希表)格式,用以声明待提交的job在模板基础上所具备的参数。
+
+如下所示:
+
+```yaml
+name: workflow
+
+vars:
+
+# 空value的job缺省声明
+jobs.jobA:
+
+# 空defaults value, 空overrides value声明
+jobs.jobB:
+  defaults:
+  overrides:
+
+# 缺省defaults,overrides非空声明
+jobs.jobC:
+  overrides:
+    # 字符串
+    keyA: valueA
+    # 数组
+    keyB:
+      - valueB1
+      - valueB2
+    # 对象(支持多级嵌套)
+    keyC:
+      keyC1: valueC1
+      keyC2:
+        - valueC2
+```
+
+
+
+除了嵌套的声明方式,流水线语法支持扁平化的方式减少声明的难度,如下示例:
+
+```yaml
+jobs.jobB:
+  overrides:
+    keyC.keyC1: valueC1
+
+# 等价于
+jobs.jobB:
+  overrides:
+    keyC:
+      keyC1: valueC1
+```
+
+**注意:**这种等价仅在defaults和overrides下有效。
+
+
+
+defaults和overrides意义如字面含义所示,defaults中定义的key-values如果原job中存在对应key,则以原job中的value为实际提交value;overrides中定义的key-values将无条件覆盖到原job的值提交。
+
+对于job的概念,以及原job.yaml的内容,建议查阅compass-ci/lkp-tests的文档进行了解:
+
+1. 如何向compass-ci/lkp-tests新增job: https://gitee.com/compass-ci/lkp-tests/blob/master/doc/add-testcase.zh.md
+2. job的定义: https://gitee.com/compass-ci/lkp-tests/blob/master/jobs/README.md
+3. job示例: https://gitee.com/compass-ci/lkp-tests/blob/master/programs/ltp/jobs/ltp-bm.yaml
+
+
+
+> 💡 注:v1.1相较上一个版本v1.0的新语法说明
+
+在v1.1版语法中,以jobs.xxx方式定义的job允许不全部被sequence控制流引用,解析时不会报错。
+
+#### 1.5.2. 
任务别名定义
+
+通常情况下,jobs.xxx模式中xxx即为被声明的job名,如果计划声明一个ltp-bm的任务,则声明jobs.ltp-bm。但在某些流水线中,可能同一个任务需要运行多次,且任务实际的入参并不相同,因此设计了多次声明的能力。在这种情况下,则需要利用"别名"语法特性。
+
+别名的声明方式如下示例:
+
+```yaml
+jobs.ltp-bm:first-ltp-bm:
+
+jobs.ltp-bm:second-ltp-bm:
+```
+
+这两个被声明的任务实际指向的都是ltp-bm这同一个job,但是因为别名,流水线会将其看作两个不同的个体。
+
+
+
+#### 1.5.3. 额外事件依赖声明
+
+除了defaults和overrides两种常用的二级keys外,jobs.xxx还支持声明depends字段,本字段用以声明额外依赖(额外事件,即上文**章节1.4**内容的应用)
+
+```yaml
+events.eventA:
+  xxx: xxx
+
+events.eventC:
+  xxx: xxx
+
+jobs.jobA:
+  defaults:
+    default_keyA: valueA
+  overrides:
+    override_keyB: valueB
+  depends:
+    # 额外依赖于上文中通过events.eventA声明的事件
+    # 只需要写key:的形式,不需要填value,填了也会被忽略
+    eventA:
+    # 且额外依赖于上文中通过events.eventC声明的事件
+    eventC:
+```
+
+对于jobA而言,depends字段相当于定义了若干AND逻辑关系的额外依赖,当且仅当所有依赖的事件均发生后jobA才会被提交执行。
+
+当depends不声明的时候,jobA的依赖仅取决于其处于控制流的位置(详见后文控制流声明);否则,实际依赖为控制流依赖与额外依赖的逻辑与结果。
+
+```
+submit_jobA = [jobA's depends parsed from controlflow] AND [jobA's depends defined from 'depends']
+```
+
+
+
+### 1.6. 控制流声明
+
+#### 1.6.1. 基础特性
+
+流水线任务的串并行结构通过控制流声明进行编排,即对已通过一级key“jobs.xxx”预声明的各个任务,以一级key“sequence”定义其串并联关系,如下所示:
+
+```yaml
+version: v1.1
+
+name: workflow
+
+# jobA:first进入boot阶段
+events.eventA:
+  type: job/stage
+  job: jobA
+  nickname: first
+  job_stage: boot
+
+# 定义别名为first的jobA
+jobs.jobA:first:
+  overrides:
+
+jobs.jobB:
+  overrides:
+  # 额外依赖于"jobA:first进入boot阶段"事件
+  depends:
+    eventA:
+
+# 控制流声明
+sequence:
+  # 声明并行子结构
+  parallel:
+    # jobA:first和jobB并行
+    jobA:first:
+    jobB:
+```
+
+
+
+控制流声明中,存在sequence、parallel、matrix、vars四种关键字,除了关键字外,所有key都会被认作对已预声明的job的引用。对于所有job的引用,需要确保引用的job全称(包含别名)在流水线一级key中存在(以jobs.xxx预声明)。如果sequence中引用了jobA,但流水线一级key中缺少jobs.jobA这个key,解析器将不予通过。
+
+**注意:**控制流声明的根(一级key)必须为sequence。
+
+- 关键字sequence
+
+  sequence意在声明一个串行结构,在sequence下的所有key将被解析为按声明顺序(从上到下)排列的一连串成员,每一个成员必然依赖于其上面一个成员。
+
+  ```
+  sequence:        |---------|             |---------|
+    jobA:      =   |  jobA   |  =========> |  jobB   |
+    jobB:          |---------|             |---------|
+  ```
+
+  
sequence和job一样,可以通过sequence:xxx:的方式定义别名,该别名仅在一级串行子结构中存在实际意义,非一级子结构的别名仅起标识作用,具体参考下文stage声明说明。
+
+  ```yaml
+  # 根sequence,不可添加别名,为控制流声明关键字
+  sequence:
+    # 一级串行子结构,别名为seqA
+    sequence:seqA:
+      # 二级串行子结构,别名为seqB
+      sequence:seqB:
+  ```
+
+  如上所示,sequence下的key并不一定只能是job,当sequence内部的key同样是sequence时,意味着串行结构的嵌套。当然,纯sequence的嵌套不具备实际意义,仅为分组标识,单纯的串行嵌套相当于没有嵌套。
+
+  ```
+  sequence:            < - - - - - seqA - - - - - - >
+    sequence:seqA:     |---------|             |---------|             |---------|
+      jobA:        =   |  jobA   |  =========> |  jobB   |  =========> |  jobC   |
+      jobB:            |---------|             |---------|             |---------|
+    jobC:
+  ```
+
+- 关键字parallel
+
+  parallel意在声明一个并行结构,在parallel下的所有key将被解析为并列的若干成员,每一个成员都依赖于整个parallel的前置依赖,互相之间不存在控制流定义的依赖关系(可以存在通过depends额外声明的依赖,这类额外依赖不受控制流声明控制)
+
+  ```
+                 ___________|___________
+  sequence:      |                     |
+    parallel:    |---------|           |---------|
+      jobA:  =   |  jobA   |           |  jobB   |
+      jobB:      |---------|           |---------|
+                 |______________________|
+                            |
+  ```
+
+  注意,parallel一定不可以声明在workflow.yaml的一级key,对于控制流声明而言,根key一定是sequence。
+
+  和sequence一致,parallel也可以以parallel:xxx:的方式定义别名,该别名同样仅在一级并行子结构中存在实际含义,非一级子结构的别名仅起标识作用,具体参考下文stage声明说明。
+
+  同理,单纯的并行嵌套相当于没有嵌套,如下示例,等价于jobA、jobB、jobC三者并行。
+
+  ```
+                      ___    _________________|_______________
+  sequence:            |    |        _________|__________        ___
+    parallel:prlA:     |    |---------|  |----|----|  |----|----|  |
+      jobA:        = prlA   |  jobA   |  |  jobB   |  |  jobC   | prlB
+    parallel:prlB:     |    |---------|  |----|----|  |----|----|  |
+      jobB:            |                 |__________________|     _|_
+      jobC:           _|_   |_______________________________|
+                                         |
+  ```
+
+- 关键字matrix
+
+  matrix关键字将在1.8章节中详细说明。
+
+
+
+> 💡 注:vars关键字为v1.1相较上一个版本v1.0的新语法
+
+- 关键字vars
+
+  sequence中声明的vars和一级key中vars的含义和用法完全相同。sequence中的vars存在局部作用域,并且局部作用域的优先级高于全局作用域,即声明位置越近优先级越高,常用于某些job要引用的变量与全局变量的值不同、需要覆盖全局定义变量的场景。另外,vars可以定义在任意sequence或parallel结构中。
+
+  优先级的说明如下示例:
+
+  ```yaml
+  # 全局定义的变量
+  vars:
+    keyA: valueA
+
+  jobs.job1:
+    overrides:
+      key1: ${{ vars.keyA }}
+
+  jobs.job2:
+    overrides:
+      key1: ${{ vars.keyA }}
+
+  jobs.job3:
+    overrides:
+      key1: ${{ vars.keyA }}
+
+  sequence:
+    job1:
+    sequence:s1: 
+      # 此处定义的vars只对sequence:s1结构生效
+      vars:
+        keyA: valueB
+      job2:
+      parallel:p1:
+        # 此处定义的vars只对parallel:p1结构生效
+        vars:
+          keyA: valueC
+        job3:
+
+  # 各job变量引用的实际值:
+  # job1.key1 = valueA
+  # job2.key1 = valueB
+  # job3.key1 = valueC
+
+  ```
+
+
+
+#### 1.6.2. job的补充声明
+
+由上文可知,job无论是defaults、overrides还是depends的声明,都可以声明在jobs.xxx这个key之下,这也是比较推荐的用法。但其实在控制流声明中,用户也可以对job进行补充声明,补充的声明将深层update到预定义的job声明中,如下所示:
+
+```yaml
+name: workflow
+
+jobs.jobA:
+  defaults:
+    keyA: valueA
+  overrides:
+    keyB:
+      keyB1: valueB1
+      keyB2:
+        keyB21: valueB21
+
+sequence:
+  jobA:
+    overrides:
+      # 与上文中jobs下的overrides特性相同
+      # 采用keyB.keyB2.keyB21和keyB.keyB2.keyB22为key,即
+      # keyB.keyB2.keyB21: valueB21_new
+      # keyB.keyB2.keyB22: valueB22
+      # 与下述声明方式等价
+      keyB:
+        keyB2:
+          keyB21: valueB21_new
+          keyB22: valueB22
+```
+
+在这个例子中,sequence中将jobA预定义的overrides下的keyB21重新定义为valueB21_new,且在keyB2下新创建了一个keyB22的键值对。
+
+补充声明特性以对象(字典/哈希表)的递归update实现,一定为控制流中的定义覆写jobs的预定义。
+
+
+
+> 💡 注:以下为v1.1相较上一个版本v1.0的新语法
+
+在sequence中除了可以补充定义job的overrides和defaults字段外,还支持定义always、if、unless执行条件语法。
+
+- always关键字
+
+  always用于决定job是否一定会被提交。在常规控制流的依赖关系中,如果前置job执行失败,后面依赖它的job将会阻塞,不会再被提交;如果在job中声明了**always: true**,即使前置job运行失败或者异常,后面的任务也都会被提交。
+
+  ```yaml
+  sequence:
+    jobA:
+    jobB:
+    # jobB运行结束,无论运行结果成功、失败或是异常,jobC都会被提交运行
+    jobC:
+      always: true
+
+  ```
+
+  always关键字有一种语法糖的写法:jobX!,如下示例的写法和上面yaml作用相同:
+
+  ```yaml
+  sequence:
+    jobA:
+    jobB:
+    # jobB运行结束,无论运行结果成功、失败或是异常,jobC都会被提交运行
+    jobC!:
+
+  ```
+
+  另外,如果always关键字没有声明,缺省值取false。
+
+- if/unless关键字
+
+  if和unless关键字用于控制job是否需要被提交:如果if条件判断的结果为true,job才会被提交,否则此job将被跳过,并将job_stage设置为finish,job_health设置为skipped;unless的判断逻辑和if正好相反。
+
+  ```yaml
+  sequence:
+    jobA:
+    jobB:
+      if: ${{ jobs.jobA.result.id }} # 如果jobA的result.id有值,jobB会被提交,否则jobB不会被运行
+    jobC:
+  ```
+
+  如果job中没有声明关键字if,缺省值取true;如果job中always和if关键字同时存在,优先判断if关键字的执行逻辑。
+
+
+
+### 1.7. 流水线阶段(stage)声明
+
+在**章节1.6. 控制流声明**中有提及,无论是sequence还是parallel,分别可以通过sequence:xxx:和parallel:xxx:的形式声明别名。流水线web服务将基于下述规则划分控制流的不同阶段,规则如下所示:
+
+```
+1. 
当且仅当sequence和parallel为根sequence下的一级结构时,其别名等同于阶段名。
+2. 当根sequence下存在job名时(非sequence也非parallel),该job以自身job的别名作为阶段名(若无别名则以job名)独立被识别为一个阶段。
+3. 阶段存在向后包裹的特点,直到下一个有效阶段声明前,所有结构属于同一个阶段。
+4. 沿着根sequence向下检索,在遇到第一个有效的stage命名之前,所有的结构均属于“未命名”阶段。
+sequence:
+                        阶段(stage)
+  parallel:        ——|
+    job0:            > 未命名
+  job1:            ——|
+
+  job2:build-job:  ——  > build-job
+
+  jobA:            ——|
+  sequence:           > jobA
+    jobB:          ——|
+
+  parallel:prlA:   ——|
+    jobC:            |
+    jobD:            |
+    sequence:seqB:   > prlA
+      jobE:          |
+      jobF:        ——|
+
+  sequence:seqC:   ——  > seqC
+    jobG:          ——|
+```
+
+p.s. 阶段仅会影响web端的渲染,控制流的实际意义不依赖于阶段的定义。换而言之,如果不考虑可视化的便利性,可以不对阶段命名深究。
+
+
+
+### 1.8. Matrix语法特性
+
+#### 1.8.1. 基本概念
+
+用户可以在控制流**串行结构的任意位置**插入一个matrix关键字,用以混入(Mixin)局部的矩阵(参数组合),从而改变后续任务的上下文(Context)。
+
+matrix关键字同样可以声明别名,用以避免对象(字典/哈希表)的重key异常,但除了加以区分外没有实际意义。
+
+**注意:** matrix不能直接声明在parallel关键字下,只能声明在sequence关键字下。
+
+matrix的声明结构一定为如下格式:
+
+```yaml
+sequence:
+  matrix:
+    paramA:
+      - valueA1
+      - valueA2
+      - valueAn
+    paramB:
+      - valueB1
+      - valueB2
+```
+
+即,matrix是一个对象(字典/哈希表),且所有一级value均为数组(列表)。
+
+上述例子中matrix的含义为,对所处位置的流水线上下文混入矩阵,其中paramA有三种可能的取值,paramB有两种可能的取值,即共3×2=6种取值组合。
+
+```
+matrix:              _
+  paramA:           |  1. paramA = valueA1; paramB = valueB1
+    - valueA1       |  2. paramA = valueA1; paramB = valueB2
+    - valueA2       |  3. paramA = valueA2; paramB = valueB1
+    - valueAn  =>  {   4. paramA = valueA2; paramB = valueB2
+  paramB:           |  5. paramA = valueAn; paramB = valueB1
+    - valueB1       |_ 6. paramA = valueAn; paramB = valueB2
+    - valueB2
+```
+
+当流水线上下文混入(Mixin)一个局部的矩阵后,流水线的上下文将会根据参数取值组合的种数裂解成多个“分支”,每一个“分支”的上下文依据其中一种取值组合。当这个分支之后的任务直接引用上下文中的paramA时,会取到当前“分支”上下文中paramA的值,后续任务的驱动也会与其他“分支”相互独立。
+
+举例而言:
+
+```yaml
+sequence:
+  jobA:
+  parallel:
+    jobB:
+    sequence:
+      matrix:
+        arch:
+          - aarch64
+          - x86_64
+      jobC:
+  jobD:
+```
+
+根据 **章节1.6. 
控制流声明** 的介绍,不难看出,这个描述声明的结构如下: + +``` + ________ + | | + |-------| jobB |----------------------------| + | |________| | + ________ | | ________ + | | | | | | +-------| jobA |-------| |-------| jobD |--------> + |________| | | |________| + | ________ | + | / \ | | | + |-------| matrix |-------| jobC |---------| + \ / |________| + 1. arch = aarch64; + 2. arch = x86_64 +``` + +对于jobA和jobB,如果他们在被提交的时候引用“当前上下文”(所谓当前为被提交的时间点)中的arch变量,他们将取不到任何值。 + +p.s. 关于引用的概念详见**章节1.6.** + +而对于jobC和jobD而言,他们实际上被裂解到了并行的两个“分支”上,其中一个分支上下文中的arch是aarch64而另一个分支上的arch是x86_64,即上述控制流结构等价于: + +``` +1. arch = aarch64; + ________ + | | + |----------| jobB |---------| + | |________| | + ________ | | ________ + | | | | | | +-------| jobA |-------| |-------| jobD |--------> + |________| | | |________| + | ________ | aarch64 + | | | | + |----------| jobC |---------| + |________| + aarch64 + +2. arch = x86_64 + ________ + | | + |----------| jobB |---------| + | |________| | + ________ | | ________ + | | | | | | +-------| jobA |-------| |-------| jobD |--------> + |________| | | |________| + | ________ | x86_64 + | | | | + |----------| jobC |---------| + |________| + x86_64 +``` + +这两个矩阵参数组合“分支”共享jobA和jobB的前置依赖,但aarch64的jobD只会依赖于aarch64的jobC,即各分支依赖独立。 + +这样避免了在很多场景下的相同结构的重复声明。 + +#### 1.8.2. 矩阵x矩阵 + +流水线控制流支持多matrix在不同位置声明,在这种情况下,下文矩阵受到上文矩阵影响,下文矩阵实际为上下文矩阵相乘的结果,如下示例: + +```yaml +sequence: + matrix:m1: + os: + - openeuler + os_version: + - 20.03 + - 22.03-LTS + jobA: + matrix:m2: + arch: + - aarch64 + - x86_64 + jobB: +``` + +对于这个例子而言,jobA共有两种上下文分支,而jobB共有4种,如下所示: + +``` + ________ ________ + | | | | +--------------| jobA |-----------------------------------------| jobB |-----------------------> + |________| |________| + 1. os=openeuler; os_version=20.03 1. os=openeuler;os_version=20.03;arch=aarch64; + 2. os=openeuler; os_version=22.03-LTS 2. os=openeuler;os_version=20.03;arch=x86_64; + 3. os=openeuler;os_version=22.03-LTS;arch=aarch64; + 4. 
os=openeuler;os_version=22.03-LTS;arch=x86_64;
+```
+
+因此,对于声明了此例控制流的流水线,jobA实际会被提交两次,jobB会被提交4次;jobB的1和2分支依赖于jobA的1分支,jobB的3和4分支依赖于jobA的2分支。jobB的最终参数组合,即jobA之前声明的matrix与jobB之前的matrix相乘的结果。
+
+#### 1.8.3. excludes语法特性
+
+matrix支持通过excludes声明排除特定的组合,如下所示:
+
+```yaml
+sequence:
+  matrix:
+    os:
+      - openeuler
+      - centos
+    os_version:
+      - "20.03"
+      - 7
+    excludes:
+      # 下述两种描述形式均可支持
+      - {"os": "openeuler", "os_version": "7"}
+      - os: centos
+        os_version: "20.03"
+```
+
+此声明方式意为此矩阵只存在两种参数组合,即 “os=openeuler;os_version=20.03” 和 “os=centos;os_version=7”。
+
+#### 1.8.4. 参数组合语法糖
+
+同时,matrix具备一种简化excludes声明的语法糖"|",上述样例可以改写为:
+
+```yaml
+sequence:
+  matrix:
+    os|os_version:
+      - openeuler | 20.03 #有无空格或者制表符均支持
+      - centos    | 7     #推荐以制表符分隔,这样的声明较为直观
+```
+
+
+### 1.9. "引用"表达式声明
+
+#### 1.9.1. 基本概念
+
+对于一条正在运作的流水线而言,其上下文是动态的:每执行完成一个任务,每感知到一个有效事件,“当前”上下文都会发生变化。
+
+流水线运行上下文(Context)由六个固定的namespace组成:
+
+- vars,流水线变量空间(流水线静态变量全集)
+- event,事件空间(事件数据全集)
+- jobs,任务空间(前置已完成的任务数据)
+- matrix,矩阵空间(当前矩阵参数组合分支的参数集合)
+- depends,未满足的依赖事件清单(此namespace一般不会被引用)
+- fullfilled,已履行的依赖事件清单(此namespace一般不会被引用)
+
+流水线不仅仅支持对定值的声明,还具备“引用”的语法特性,可以对“当前上下文”的变量进行引用,以及进行字符串拼接和python表达式运算。
+
+“引用”由模式 ${{ xxxx }} 识别,通过"."的方式获取不同namespace下的value,支持下述两种使用方式:
+
+- 字符串拼接引用
+
+  ```yaml
+  # 取vars空间中的varA变量的值,与vars空间中的varB变量的值,通过"-"拼接
+  key: ${{ vars.varA }}-${{ vars.varB }}
+  # 取当前矩阵参数组合的os、os_version、arch拼接命名
+  project_name: my_project:${{ matrix.os }}:${{ matrix.os_version }}:${{ matrix.arch }}
+  ```
+
+  对于这种拼接引用的方式,需要用户确保引用变量的值一定是字符串。如果实际的值不为字符串或者无法转换为字符串,那么采用这种引用声明的job很可能无法正常提交。
+
+- 单引用
+
+  单引用的情况下,引用表达式的结果可以为字符串、数字、数组(列表)或者哈希表(字典),不受类型影响。
+
+  ```yaml
+  # 取vars空间中的数组arrayA,作为key的值
+  key: ${{ vars.arrayA }}
+  # 取前置已完成的jobA的输出result.arrayB,作为key的值
+  key: ${{ jobs.jobA.result.arrayB }}
+  ```
+
+**注意:**引用特性仅支持在defaults和overrides下使用,即jobs.xxx下的defaults和overrides,或者sequence下某个job的defaults和overrides。后续演进的语法版本中将加入"在matrix中引用vars变量"的支持。
+
+#### 1.9.2. 
python语法支持
+
+对于任意引用内部而言,在引用的变量被实际的值替换后,替换后的内容将会被当作python表达式运行,如下示例:
+
+```yaml
+# 取多个不同namespace的变量进行数值运算
+key: ${{ vars.numA + jobs.jobA.result.success_num }}
+
+# 调用python datetime模块,获取年月日并拼接字符串
+project_name: ${{ vars.my_name }}-${{ datetime.datetime.now().year }}-${{ datetime.datetime.now().month }}-${{ datetime.datetime.now().day }}
+
+# 调用字符串处理方法,对字符串进行大小写转换,split等操作
+key: ${{ vars.stringA.lower() }}
+key: ${{ vars.stringA.split(':') }}
+
+# 单纯通过python表达式计算数值,不对变量进行引用,如计算一天一共有多少秒
+key: ${{ 24*60*60 }}
+```
+
+支持的Python标准库模块:
+
+| 模块名 | 作用 | 官方文档链接 |
+| ------ | ---- | ------------ |
+| re | 提供正则表达式匹配操作 | https://docs.python.org/3/library/re.html |
+| math | 提供数学运算函数 | https://docs.python.org/3/library/math.html |
+| time | 提供时间相关函数 | https://docs.python.org/3/library/time.html |
+| datetime | 提供日期和时间处理函数 | https://docs.python.org/3/library/datetime.html |
+
+支持的安全Python内置函数与类型:
+
+| 类型 | 名称 |
+| ---- | ------ |
+| 数据类型 | object, bool, int, float, complex, str, bytes, bytearray, tuple, list, set, frozenset, dict |
+| 数学运算 | abs, round, pow, divmod |
+| 迭代器 | iter, next |
+| 集合操作 | len, sum, min, max, all, any, map, filter, zip, enumerate, sorted, reversed |
+| 数字转换 | bin, hex, oct |
+| 字符串格式化 | ascii, repr, chr, ord, format |
+| 变量和内存 | dir, locals, globals, id, hash |
+| 类型检查 | isinstance, issubclass, callable |
+
+
+
+## 2. 
workflow.yaml完整示例 + +以下是一个完整的workflow.yaml文件示例: + +```yaml +# 语法版本声明 +version: v1.0 + +# 流水线命名 +name: 每日构建 + +# 触发设置 +on: + # 设定定时触发事件,每天00:00触发 + - type: cron + week_day: + - Monday + - Tuesday + - Wednesday + - Thursday + - Friday + - Saturday + - Sunday + time: "00:00" + start_date: "2023-10-18" + +# 流水线变量设置 +vars: + eulermaker_account: account + eulermaker_password: passwd + os: os + os_version: version + +# 任务声明 +jobs.eulermaker-build-project:everything: + overrides: + project_name: ${{ vars.os }}-${{ vars.os_version }}:everything + build_type: full + build_arch: ${{ matrix.arch }} + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + testbox: vm-2p8g + +jobs.eulermaker-build-project:epol: + overrides: + project_name: ${{ vars.os }}-${{ vars.os_version }}:epol + build_type: full + build_arch: ${{ matrix.arch }} + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + testbox: vm-2p8g + +jobs.eulermaker-create-image: + overrides: + image_project_params: + pipeline_info: + pipeline_name: ${{ vars.os }}-${{ vars.os_version }}-${{ datetime.datetime.now().year }}-${{ datetime.datetime.now().month }}-${{ datetime.datetime.now().day }}-1 + group: dailybuild + category: standard + scene: cloud + image_format: qcow2 + arch: ${{ matrix.arch }} + image_config: + release_image_config: + repo_url: > + http://xxxxx/ + http://xxxx/ + http://xxx/ + http://xxxxxx/ + product: ${{ vars.os.lower() }} + version: ${{ vars.os_version }}-${{ datetime.datetime.now().year }}-${{ datetime.datetime.now().month }}-${{ datetime.datetime.now().day }} + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + testbox: vm-2p8g + +jobs.eulermaker-build-image: + overrides: + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + pipeline_id: ${{ jobs.eulermaker-create-image.result.id }} + testbox: vm-2p8g + runtime: ${{ 24*60*60 
}} + +jobs.qcow2rootfs: + overrides: + qcow2rootfs.qcow2_os: ${{ jobs.eulermaker-build-image.result.product }} + qcow2rootfs.qcow2_arch: ${{ matrix.arch }} + qcow2rootfs.qcow2_version: ${{ jobs.eulermaker-build-image.result.version }} + qcow2rootfs.qcow2_url: ${{ jobs.eulermaker-build-image.result.image_link }} + qcow2rootfs.rootfs_protocol: nfs + qcow2rootfs.rootfs_server: "172.168.131.2" + qcow2rootfs.rootfs_path: os-rw + testbox: vm-2p32g + +jobs.mugen-smoke-baseinfo: + overrides: + os: ${{ jobs.qcow2rootfs.result.os }} + os_version: ${{ jobs.qcow2rootfs.result.version }} + os_mount: nfs + arch: ${{ jobs.qcow2rootfs.result.arch }} + testbox: vm-2p8g + +jobs.mugen-smoke-basic-os: + overrides: + os: ${{ jobs.qcow2rootfs.result.os }} + os_version: ${{ jobs.qcow2rootfs.result.version }} + os_mount: nfs + arch: ${{ jobs.qcow2rootfs.result.arch }} + testbox: vm-2p8g + +# 控制流声明 +sequence: + # 矩阵声明 + matrix: + arch: + - aarch64 + - x86_64 + # 并行子结构声明 + parallel:build: + eulermaker-build-project:everything: + eulermaker-build-project:epol: + # 串行子结构声明 + sequence:create-image: + eulermaker-create-image: + eulermaker-build-image: + qcow2rootfs: + parallel:AT: + mugen-smoke-baseinfo: + mugen-smoke-basic-os: +``` + + + +## 3. 
v1.1新语法 + +#### 3.1 sequence局部变量定义 + +控制流中新增对vars关键字的支持。sequence中声明的vars和一级key中vars的含义和用法完全相同,sequence中的vars存在局部作用域,并且局部作用域的优先级高于全局作用域的优先级,即声明位置越近优先级越高,常用于某些job要引用的变量与全局变量的值不同,需要覆盖全局定义变量的场景。另外,vars可以定义在任意sequence或parallel结构中。 + +优先级的说明如下示例: + +```yaml +# 全局定义的变量 +vars: + keyA: valueA + +jobs.job1: + overrides: + key1: ${{ vars.keyA }} + +jobs.job2: + overrides: + key1: ${{ vars.keyA }} + +jobs.job3: + overrides: + key1: ${{ vars.keyA }} + +sequence: + job1: + sequence:s1: + # 此处定义的vars只对sequence:s1结构生效 + vars: + keyA: valueB + job2: + parallel:p1: + # 此处定义的vars只对parallel:p1结构生效 + vars: + keyA: valueC + job3: + +# 各job变量引用的实际值: +# job1.key1 = valueA +# job2.key1 = valueB +# job3.key1 = valueC + +``` + +#### 3.2 判断job执行条件 + +在sequence中除了可以补充定义job的overrides和defaults字段外,还支持定义always、if、unless执行条件语法。 + +- always关键字 + + always用于决定job是否一定会被提交。在常规控制流的依赖关系中,如果前置job执行失败,后面依赖它的job将会阻塞不会再被提交,如果在job中声明了**always: true**,前置job运行失败或者异常,后面的任务都会被提交。 + + ```yaml + sequence: + jobA: + jobB: + # jobB运行结束,运行结果成功、失败或是异常,jobC都会被提交运行 + jobC: + always: true + + ``` + + always关键字有一种语法糖的写法:jobX!,如下示例的写法和上面yaml作用相同: + + ```yaml + sequence: + jobA: + jobB: + # jobB运行结束,运行结果成功、失败或是异常,jobC都会被提交运行 + jobC!: + + ``` + +​ 另外,如果always关键字没有声明,缺省值取false。 + +- if/unless关键字 + + if和unless关键字用于控制job是否需要被提交,如果if条件判断的结果为true,job才会被提交,否则此job将被跳过,并将job_stage设置为finish,job_health设置为skipped,unless的判断逻辑和if正好相反。 + + ```yaml + sequence: + jobA: + jobB: + if: ${{ jobs.jobA.result.id }} # 如果jobA的result.id有值,jobB会被提交,否则jobB不会被运行 + jobC: + ``` + + 如果job中没有声明关键字if,缺省值取true;如果job中always和if关键字同时存在,优先判断if关键字的执行逻辑。 + +#### 3.3 job声明不选用 + +通过jobs.jobX方式声明的job不一定全部在sequence中选用。 + +```yaml +jobs.jobA: +jobs.jobB: + +sequence: + jobB: +``` diff --git a/docs/zh/docs/Ods-Pipeline/grammar/v1/v1.0_grammar.md b/docs/zh/docs/Ods-Pipeline/grammar/v1/v1.0_grammar.md new file mode 100644 index 0000000000000000000000000000000000000000..a96793f285323c6fd578383c45bbe0dec03688c3 --- /dev/null +++ 
b/docs/zh/docs/Ods-Pipeline/grammar/v1/v1.0_grammar.md @@ -0,0 +1,1093 @@ +# 从零开始编写workflow.yaml + +workflow,即流水线,是一连串具备一定串并联关系的任务组合,描述一连串的任务之间存在的依赖关系、输入输出参数,以及整个流水线的触发条件。 + +不同版本具备不同的语法规则,从零编写和学会workflow的声明方法请详阅对应版本的文档,避免无法正常解析。 + +## 版本信息 + +| 版本 | v1.0 | +| -------- | ------------------------------------- | +| 维护者 | Ethan-Zhang(ethanzhang55@outlook.com) | +| 创建时间 | 2023-09-30 | +| 是否废弃 | 否 | + + + +## 1. 语法特性 + +流水线通过YAML描述,描述文件的YAML语法规则基于YAML 1.2版本,书写时需要基于YAML 1.2支持的书写方式进行书写。后文中提及的语法特性是描述的基于此之上的解析规则,不涉及YAML 1.2语法的说明。 + +> YAML(YAML Ain't Markup Language)是一种人类可读的数据序列化标准,它被广泛用于配置文件、数据交换语言、云计算等场景。YAML 1.2 是 YAML 的最新版本,于 2009 年发布。 +> +> 相比于之前的版本,做了一些重要的改进和修正,包括: +> +> - 更严格的类型转换规则,以避免一些常见的类型转换错误。 +> - 支持 JSON,即任何有效的 JSON 文件也是一个有效的 YAML 1.2 文件。 +> - 更好的 Unicode 支持。 +> +> YAML 1.2 的官方文档可以在以下链接找到:[YAML 1.2 官方文档](http://yaml.org/spec/1.2/spec.html)。这份文档详细地描述了 YAML 1.2 的所有特性和语法规则。 + + + +### 1.0. 编码风格 + +- 键命名风格 + + 为使流水线描述文档风格统一,建议所有的键命名均采用"lower_case"的命名方式,尽量不使用大写字符,如下示例: + + ```yaml + this_is_a_key: value + jobs.this_is_a_job: job + ``` + + 注意:命名风格不等于命名规则,如果不遵循建议的键命名风格并不会出现错误。 + + + +- 一级key声明风格 + + 一级key的含义为整个YAML文档的第一级键,虽然第一级键无论以何种顺序排列不会影响解析结果,但基于统一风格的出发点考虑,建议用户按照如下顺序对一级key进行排列,且一级key之间通过一行空行间隔,如下示例: + + ```yaml + # 版本声明,可以不存在 + version: + + # 流水线命名 + name: + + # 流水线触发设置,可以不存在 + on: + + # 流水线变量,可以不存在 + vars: + + # 流水线额外事件声明,可以不存在 + events.xxx: + + # 流水线job声明,至少需要声明一个job + jobs.xxx: + + # 流水线控制流说明 + sequence: + xxx: + ``` + + 对于本版本流水线语法而言,一级key仅识别上述6类关键字,在这七种关键字之外的一级key将被忽略。如果某个关键字不存在,比如vars,剩余关键字建议仍保持上述先后顺序排列。 + + 对于关键字的含义和详细语法说明见后文。 + + + +- 每行文本长度 + + 为了保持良好的可读性,建议每行文本长度不要超过**80**个字符。这是一种常见的编程规范,可以使代码在大多数编辑器和终端中看起来更清晰。但这并不是强制性的规定,根据实际情况和个人习惯,可以适当调整。 + + 对于长文本,可以利用YAML的特性转行声明,如下示例: + + ```yaml + # 通过"|"语法保留换行符"\n" + key: | + this is a long long story, + you could learn it step by step. + # key = "this is a long long story,\nyou could learn it step by step." + + # 通过"|+"语法保留所有换行符"\n" + key: |+ + this is a long long story, + you could learn it step by step. 
+ + + # key = "this is a long long story,\nyou could learn it step by step.\n\n\n" + + # 通过"|-"语法,去除末尾换行符"\n" + key: |- + this is a long long story, + you could learn it step by step. + + + # key = "this is a long long story,\nyou could learn it step by step." + + # 通过">"语法,虽然内容书写存在换行,但解析后的内容去除换行,以空格代替 + key: > + https://repo1/ + https://repo2/ + # key = "https://repo1/ https://repo2" + ``` + + 更多的说明请参考YAML1.2官方文档(见 **章节1. 语法特性** 开头) + + + +### 1.1. 基本声明 + +基本声明包含**version**与**name**两个一级key,前者用以锚定语法解析版本,后者用以标识流水线名称。 + +#### 1.1.1. 语法版本声明 + +workflow支持多版本语法解析,对于不声明version的workflow而言,采用默认版本语法(v1.0)进行解析。 +声明版本通过关键字version定义: + +```yaml +# 一般情况下,version会被声明在workflow.yaml的顶部 +# 但version的位置并不会影响解析,确保version并非嵌套于其他key下即可 + +version: v1.0 +``` + +根据声明的version的不同,请查阅不同版本的语法特性介绍。 + +#### 1.1.2. 流水线命名 + +无论在什么版本,workflow的名字均由name字段定义。流水线的名字不要求唯一,可以是任意**字符串**。 + +name为一定需要定义的key,如果流水线yaml中缺少这个key解析器将不予通过。 + +```yaml +version: v1.0 + +name: my workflow +``` + + + +### 1.2. 触发条件定义 + +#### 1.2.1. workflow支持三种触发方式 + + - 手动触发: 基本的触发方式,不传递触发事件数据的方式,触发后将提交所有不存在依赖的任务 + - 定时触发: 周期性自动触发方式,通过设定时间条件,系统自动触发。 + - webhook触发: git仓库配置服务的webhook回调地址等信息,通过webhook回调请求自动触发。 + + 定时触发和webhook触发方式通过关键字"on"定义,如果不需要这两种触发方式,on可以不声明。 + + + 示例: + +```yaml + version: v1.1 + + name: workflow + + on: + # webhook触发事件定义 + - type: webhook/pr + git_repo: https://gitee.com/openeuler/radiaTest.git + branch: master + # 定时触发事件定义 + - type: cron + crontab: 0 15 10 ? * MON-FRI + + other_keys: other_values +``` + +#### 1.2.2. 
webhook触发
+
+  webhook事件分为webhook/pr,webhook/push,webhook/note,webhook/issue四种类型。
+
+  - PullRequest类事件
+
+    声明的事件键值对必须包含type(webhook/pr),git_repo(仓库地址),branch(仓库分支),action(场景,共8类)
+
+    即当对应的仓库分支存在PullRequest相关事件时,均会触发此流水线,包括新建PR/删除PR/合入PR/...等事件。
+
+    如果需要对PR事件进行更细致的筛选,用户可以指定action进行过滤。
+
+  - Push类事件
+
+    声明的事件键值对必须包含type(webhook/push),git_repo(仓库地址),branch(仓库分支)
+
+    即当对应的仓库分支被推送更新后,均会触发此流水线。
+
+  - 评论类事件
+
+    声明的事件键值对必须包含type(webhook/note),git_repo(仓库地址),branch(仓库分支,仅在PullRequest场景下存在),notable_type(评论主体),notes(评论钩子)
+
+    即当对应仓库分支下的指定评论主体(如PullRequest作为被评论主体)被评论,并且评论内容能够匹配评论钩子时,会触发此流水线。
+
+  - Issue类事件
+
+    声明的事件键值对必须包含type(webhook/issue),git_repo(仓库地址),state(问题单状态)
+
+    即当对应的仓库分支存在Issue相关事件时,均会触发此流水线,包括新建issue/删除issue/...等事件。
+
+    state提供了对issue的场景细分能力,通过配置状态可以拆分不同场景。
+
+
+
+  webhook事件的声明支持矩阵式声明方式,从而帮助减少重复描述,如下所示:
+
+```yaml
+on:
+  - type: webhook/note
+    git_repo: https://gitee.com/openeuler/radiaTest.git
+    branch:
+      - master
+      - dev
+      - test
+    notable_type:
+      - PullRequest
+    notes:
+      - /retry
+      - /retest
+```
+
+这个例子意味着,上述多个分支下的PullRequest被评论了"/retry"或"/retest"时,都会触发此流水线。
+
+以上关于webhook的讲解较为粗略,建议阅读更详细的webhook配置文档,以便更好地使用:
+
+https://gitee.com/openeuler-customization/ods/blob/master/src/workflow_webhook/README.md
+
+
+
+特别说明: 如果不仅仅需要触发,还需要在流水线中引用(该特性将在后文详述)触发事件中的字段,建议编辑者查阅不同git仓库的webhook文档,了解事件字段的详情。
+
+ 1. Gitee: https://help.gitee.com/enterprise/code-manage/%E9%9B%86%E6%88%90%E4%B8%8E%E7%94%9F%E6%80%81/WebHook/WebHook%20%E7%AE%80%E4%BB%8B
+
+ 2. Github: https://docs.github.com/webhooks
+
+ 3. Gitlab: https://docs.gitlab.com/ee/user/project/integrations/webhooks.html
+
+
+#### 1.2.3. 定时触发事件
+
+  当配置的触发事件type字段为[cron,interval,date]中的值时,该事件为定时事件;对于某一个定时任务,type仅可取其中某一个值,以下为简单示例:
+
+```yaml
+on:
+  - type: cron
+    crontab: 0 15 10 ? 
* MON-FRI + - type: interval + seconds: 60 + - type: date + run_date: 2024-01-01 00:00:00 + - type: date + run_date: 164900500 +``` + +以上配置,意味着时间满足任意一个场景时,流水线被执行。 + +关于三种类型的定时参数,参看以下官方文档配置即可: + +https://apscheduler.readthedocs.io/en/stable/modules/triggers/cron.html + +https://apscheduler.readthedocs.io/en/stable/modules/triggers/interval.html + +https://apscheduler.readthedocs.io/en/stable/modules/triggers/date.html + + + +### 1.3. 流水线全局变量定义 + +流水线的全局变量通过vars字段声明,当前版本支持字符串、数组、对象(字典/哈希表)三种格式。 + +```yaml +version: v1.0 + +name: workflow + +vars: + # 字符串 + varA: string + # 数组 + varB: + - elementA + - elementB + # 对象(支持多级结构嵌套) + varC: + keyA: valueA + # 嵌套数组 + keyB: + - valueB1 + - valueB2 + # 嵌套对象 + keyC: + keyC1: +``` + +流水线变量定义的目的在于定义整个流水线可以利用的若干变量(常量),定义后的变量可以被流水线任意阶段任务引用,从而避免每个任务对于该变量的重复冗余声明。 + +举例而言,假设某个流水线的10个任务均需要上述案例的varB变量作为Input(输入/入参),则只需要引用varB赋予给对应参数即可。 + +具体的引用方式详见后文对于引用语法的介绍。 + + +如果一个流水线不需要定义任何流水线变量时,vars关键字可以不存在: + +```yaml +version: v1.0 + +name: workflow + +other_keys: other_values +``` + + + +### 1.4. 额外事件声明 + +如果流水线内部的某个任务除了依赖于前置的任务外,还实际依赖于额外的webhook事件,或者依赖于一些额外的事件,则需要对这些额外的事件进行预声明。 + +用户通过events.xxx模式的key进行额外事件的声明,"xxx"为额外事件的命名。 + +**注意:**大多数情况下,用户不需要定义额外事件。额外事件不支持定时以及手动事件声明。 + +#### 1.4.1. webhook事件 + +当定义额外webhook事件时,该事件需求定义的key-values与上文流水线触发设置中介绍的一致,如下示例: + +```yaml +version: v1.0 + +name: workflow + +vars: + +events.eventA: + type: webhook/pr + git_repo: https://gitee.com/openeuler/repositry.git + branch: + - master + - dev +``` + +#### 1.4.2. 
job事件
+
+除了webhook事件外,额外事件可以定义一种新的事件类型,即job类型事件。job类型事件分为stage和step两个子类,如下示例:
+
+```yaml
+version: v1.0
+
+name: workflow
+
+vars:
+
+# 额外事件A - jobA进入boot阶段
+events.eventA:
+  type: job/stage
+  job: jobA
+  job_stage: boot
+
+# 额外事件B - jobA进入名为testcase001的步骤
+events.eventB:
+  type: job/step
+  job: jobA
+  job_step: testcase001
+
+# 额外事件C - jobA进入finish阶段且incomplete
+events.eventC:
+  type: job/stage
+  job: jobA
+  job_stage: finish
+  job_health: incomplete
+```
+
+job类型的事件除了type和job必填外,其他字段可以根据需求从job类型事件的全集keys中选取(job_stage/job_health/job_step/nickname)。
+
+定义后的job类型事件如何使用见后文任务定义和控制流声明章节。
+
+
+
+### 1.5. 任务定义
+
+#### 1.5.1. 基础概念
+
+对于所有流水线涉及的任务,都需要通过jobs.xxx模式的一级key进行一次声明,主要目的为定义任务的Input(输入/入参),且每个被声明job的value均要求为对象(字典/哈希表)格式(或者为空)。
+
+jobs.xxx类关键字常见的两种二级关键字为**defaults**和**overrides**,这两个二级key可以不声明,但如果具备value,则value必须为对象(字典/哈希表)格式,用以声明待提交的job在模板基础上所具备的参数。
+
+如下所示:
+
+```yaml
+name: workflow
+
+vars:
+
+# 空value的job缺省声明
+jobs.jobA:
+
+# 空defaults value, 空overrides value声明
+jobs.jobB:
+  defaults:
+  overrides:
+
+# 缺省defaults,overrides非空声明
+jobs.jobC:
+  overrides:
+    # 字符串
+    keyA: valueA
+    # 数组
+    keyB:
+      - valueB1
+      - valueB2
+    # 对象(支持多级嵌套)
+    keyC:
+      keyC1: valueC1
+      keyC2:
+        - valueC2
+```
+
+
+
+除了嵌套的声明方式,流水线语法支持扁平化的方式减少声明的难度,如下示例:
+
+```yaml
+jobs.jobB:
+  overrides:
+    keyC.keyC1: valueC1
+
+# 等价于
+jobs.jobB:
+  overrides:
+    keyC:
+      keyC1: valueC1
+```
+
+**注意:**这种等价仅在defaults和overrides下有效。
+
+
+
+defaults和overrides意义如字面含义所示,defaults中定义的key-values如果原job中存在对应key,则以原job中的value为实际提交value;overrides中定义的key-values将无条件覆盖到原job的值提交。
+
+对于job的概念,以及原job.yaml的内容,建议查阅compass-ci/lkp-tests的文档进行了解:
+
+1. 如何向compass-ci/lkp-tests新增job: https://gitee.com/compass-ci/lkp-tests/blob/master/doc/add-testcase.zh.md
+2. job的定义: https://gitee.com/compass-ci/lkp-tests/blob/master/jobs/README.md
+3. job示例: https://gitee.com/compass-ci/lkp-tests/blob/master/programs/ltp/jobs/ltp-bm.yaml
+
+
+
+#### 1.5.2. 
任务别名定义
+
+通常情况下,jobs.xxx模式中xxx即为被声明的job名,如果计划声明一个ltp-bm的任务,则声明jobs.ltp-bm。但在某些流水线中,可能同一个任务需要运行多次,且任务实际的入参并不相同,因此设计了多次声明的能力。在这种情况下,则需要利用"别名"语法特性。
+
+别名的声明方式如下示例:
+
+```yaml
+jobs.ltp-bm:first-ltp-bm:
+
+jobs.ltp-bm:second-ltp-bm:
+```
+
+这两个被声明的任务实际指向的都是ltp-bm这同一个job,但是因为别名,流水线会将其看作两个不同的个体。
+
+
+
+#### 1.5.3. 额外事件依赖声明
+
+除了defaults和overrides两种常用的二级keys外,jobs.xxx还支持声明depends字段,本字段用以声明额外依赖(额外事件,即上文**章节1.4**内容的应用)
+
+```yaml
+events.eventA:
+  xxx: xxx
+
+events.eventC:
+  xxx: xxx
+
+jobs.jobA:
+  defaults:
+    default_keyA: valueA
+  overrides:
+    override_keyB: valueB
+  depends:
+    # 额外依赖于上文中通过events.eventA声明的事件
+    # 只需要写key:的形式,不需要填value,填了也会被忽略
+    eventA:
+    # 且额外依赖于上文中通过events.eventC声明的事件
+    eventC:
+```
+
+对于jobA而言,depends字段相当于定义了若干AND逻辑关系的额外依赖,当且仅当所有依赖的事件均发生后jobA才会被提交执行。
+
+当depends不声明的时候,jobA的依赖仅取决于其处于控制流的位置(详见后文控制流声明);否则,实际依赖为控制流依赖与额外依赖的逻辑与结果。
+
+```
+submit_jobA = [jobA's depends parsed from controlflow] AND [jobA's depends defined from 'depends']
+```
+
+
+
+### 1.6. 控制流声明
+
+#### 1.6.1. 基础特性
+
+流水线任务的串并行结构通过控制流声明进行编排,即对已通过一级key“jobs.xxx”预声明的各个任务,以一级key“sequence”定义其串并联关系,如下所示:
+
+```yaml
+version: v1.0
+
+name: workflow
+
+# jobA:first进入boot阶段
+events.eventA:
+  type: job/stage
+  job: jobA
+  nickname: first
+  job_stage: boot
+
+# 定义别名为first的jobA
+jobs.jobA:first:
+  overrides:
+
+jobs.jobB:
+  overrides:
+  # 额外依赖于"jobA:first进入boot阶段"事件
+  depends:
+    eventA:
+
+# 控制流声明
+sequence:
+  # 声明并行子结构
+  parallel:
+    # jobA:first和jobB并行
+    jobA:first:
+    jobB:
+```
+
+控制流声明中,存在sequence、parallel、matrix三种关键字,除了关键字外,所有key都会被认作对已预声明的job的引用。对于所有job的引用,需要确保引用的job全称(包含别名)在流水线一级key中存在(以jobs.xxx预声明)。如果sequence中引用了jobA,但流水线一级key中缺少jobs.jobA这个key,解析器将不予通过。
+
+**注意:**控制流声明的根(一级key)必须为sequence。
+
+- 关键字sequence
+
+  sequence意在声明一个串行结构,在sequence下的所有key将被解析为按声明顺序(从上到下)排列的一连串成员,每一个成员必然依赖于其上面一个成员。
+
+  ```
+  sequence:        |---------|             |---------|
+    jobA:      =   |  jobA   |  =========> |  jobB   |
+    jobB:          |---------|             |---------|
+  ```
+
+  sequence和job一样,可以通过sequence:xxx:的方式定义别名,该别名仅在一级串行子结构中存在实际意义,非一级子结构的别名仅起标识作用,具体参考下文stage声明说明。
+ 
+ ```yaml + # 根sequence,不可添加别名,为控制流声明关键字 + sequence: + # 一级串行子结构,别名为seqA + sequence:seqA: + # 二级串行子结构,别名为seqB + sequence:seqB: + ``` + + 如上所示,sequence的key并不一定只能是job,当sequence内部的key同样是sequence时,意味着串行结构的嵌套。当然,纯sequence的嵌套不具备实际意义,仅为分组标识,单纯的串行嵌套相当于没有嵌套。 + + ``` + sequence: < - - - - - seqA - - - - - - > + sequence:seqA: |---------| |---------| |---------| + jobA: = | jobA | =========> | jobB | =========> | jobC | + jobB: |---------| |---------| |---------| + jobC: + ``` + +- 关键字parallel + + parallel意在声明一个并行结构,在parallel下的所有key将被解析为并列的若干成员,每一个成员都依赖于整个parallel的前置依赖,互相之间不存在控制流定义的依赖关系(可以存在通过depends额外声明的依赖,这类额外的“跳线”依赖不受控制流声明控制)。 + + ``` + ___________|___________ + sequence: | | + parallel: |---------| |---------| + jobA: = | jobA | | jobB | + jobB: |---------| |---------| + |______________________| + | + ``` + + 注意,parallel一定不可以声明为workflow.yaml的一级key,对于控制流声明而言,根key一定是sequence。 + + 和sequence一致,parallel也可以以parallel:xxx:的方式定义别名,该别名同样仅在一级并行子结构中存在实际含义,非一级子结构的别名仅起标识作用,具体参考下文stage声明说明。 + + 同理,单纯的并行嵌套相当于没有嵌套,如下示例,等价于jobA、jobB、jobC三者并行。 + + ``` + ___ _________________|_______________ + sequence: | | _________|__________ ___ + parallel:prlA: | |---------| |----|----| |----|----| | + jobA: = prlA | jobA | | jobB | | jobC | prlB + parallel:prlB: | |---------| |----|----| |----|----| | + jobB: | | |__________________| _|_ + jobC: _|_ |_______________________________| + | + ``` + +#### 1.6.2. 
job的补充声明 + +由上文可知,job无论是defaults、overrides还是depends的声明,都可以声明在jobs.xxx这个key之下,这也是比较推荐的用法。但其实在控制流声明中,用户也可以对job进行补充声明,补充的声明将深层update到预定义的job声明中,如下所示: + +```yaml +name: workflow + +jobs.jobA: + defaults: + keyA: valueA + overrides: + keyB: + keyB1: valueB1 + keyB2: + keyB21: valueB21 + +sequence: + jobA: + overrides: + # 与上文中jobs下的overrides特性相同 + # 采用keyB.keyB2.keyB21和keyB.keyB2.keyB22为key,即 + # keyB.keyB2.keyB21: valueB21_new + # keyB.keyB2.keyB22: valueB22 + # 与下述声明方式等价 + keyB: + keyB2: + keyB21: valueB21_new + keyB22: valueB22 +``` + +在这个例子中,sequence中将jobA预定义的overrides下的keyB21重新定义为valueB21_new,且在keyB2下新创建了一个keyB22的键值对。 + +补充声明特性以对象(字典/哈希表)的递归update实现,一定为控制流中的定义覆写jobs的预定义。 + + + +### 1.7. 流水线阶段(stage)声明 + +在**章节1.6. 控制流声明**中有提及,无论是sequence还是parallel,分别可以通过sequence:xxx:和parallel:xxx:的形式声明别名。流水线web服务将基于下述规则划分控制流的不同阶段,规则如下所示: + +``` +1. 当且仅当sequence和parallel为根sequence下的一级结构时,其别名等同于阶段名。 +2. 当根sequence下存在job名时(非sequence也非parallel),该job以自身job的别名作为阶段名(若无别名则以job名)独立被识别为一个阶段。 +3. 阶段存在向后包裹的特点,直到下一个有效阶段声明前,所有结构属于同一个阶段。 +4. 沿着根sequence向下检索,在遇到第一个有效的stage命名之前,所有的结构均属于“未命名”阶段。 +sequence: + 阶段(stage) + parallel: ——| + job0: > 未命名 + job1: ——| + + job2:build-job: —— > build-job + + jobA: ——| + sequence: > jobA + jobB: ——| + + parallel:prlA: ——| + jobC: | + jobD: | + sequence:seqB: > prlA + jobE: | + jobF: ——| + + sequence:seqC: —— > seqC + jobG: ——| +``` + +p.s. 阶段仅会影响web端的渲染,控制流的实际意义不依赖于阶段的定义,换而言之,如果不考虑可视化的便利性,可以不深究阶段命名。 + + + +### 1.8. Matrix语法特性 + +#### 1.8.1. 基本概念 + +用户可以在控制流**串行结构的任意位置**插入一个matrix关键字,用以混入(Mixin)局部的矩阵(参数组合),从而改变后续任务的上下文(Context)。 + +matrix关键字同样可以声明别名,用以避免对象(字典/哈希表)的重key异常,但除作区分外没有实际意义。 + +**注意:** matrix不能直接声明在parallel关键字下,只能声明在sequence关键字下。 + +matrix的声明结构一定为如下格式: + +```yaml +sequence: + matrix: + paramA: + - valueA1 + - valueA2 + - valueAn + paramB: + - valueB1 + - valueB2 +``` + +即,matrix是一个对象(字典/哈希表),且所有一级value均为数组(列表)。 + +上述例子中matrix的含义为,对所处位置的流水线上下文混入矩阵,其中paramA有三种可能的取值,paramB有两种可能的取值,即共3*2=6种取值组合。 + +``` +matrix: _ + paramA: | 1. 
paramA = valueA1; paramB = valueB1 + - valueA1 | 2. paramA = valueA1; paramB = valueB2 + - valueA2 | 3. paramA = valueA2; paramB = valueB1 + - valueAn => { 4. paramA = valueA2; paramB = valueB2 + paramB: | 5. paramA = valueAn; paramB = valueB1 + - valueB1 |_ 6. paramA = valueAn; paramB = valueB2 + - valueB2 +``` + +当流水线上下文混入(Mixin)一个局部的矩阵后,流水线的上下文将会根据参数取值组合的种数裂解成多个“分支”,每一个“分支”的上下文对应其中一种取值组合。当这个分支之后的任务直接引用上下文中的paramA时,会取到当前分支上下文中的paramA值,后续任务的驱动也会与其他“分支”独立。 + +举例而言: + +```yaml +sequence: + jobA: + parallel: + jobB: + sequence: + matrix: + arch: + - aarch64 + - x86_64 + jobC: + jobD: +``` + +根据 **章节1.6. 控制流声明** 的介绍,不难看出,这个描述声明的结构如下: + +``` + ________ + | | + |-------| jobB |----------------------------| + | |________| | + ________ | | ________ + | | | | | | +-------| jobA |-------| |-------| jobD |--------> + |________| | | |________| + | ________ | + | / \ | | | + |-------| matrix |-------| jobC |---------| + \ / |________| + 1. arch = aarch64; + 2. arch = x86_64 +``` + +对于jobA和jobB,如果它们在被提交的时候引用“当前上下文”(所谓当前为被提交的时间点)中的arch变量,它们将取不到任何值。 + +p.s. 关于引用的概念详见**章节1.9.** + +而对于jobC和jobD而言,它们实际上被裂解到了并行的两个“分支”上,其中一个分支上下文中的arch是aarch64而另一个分支上的arch是x86_64,即上述控制流结构等价于: + +``` +1. arch = aarch64; + ________ + | | + |----------| jobB |---------| + | |________| | + ________ | | ________ + | | | | | | +-------| jobA |-------| |-------| jobD |--------> + |________| | | |________| + | ________ | aarch64 + | | | | + |----------| jobC |---------| + |________| + aarch64 + +2. arch = x86_64 + ________ + | | + |----------| jobB |---------| + | |________| | + ________ | | ________ + | | | | | | +-------| jobA |-------| |-------| jobD |--------> + |________| | | |________| + | ________ | x86_64 + | | | | + |----------| jobC |---------| + |________| + x86_64 +``` + +这两个矩阵参数组合“分支”共享jobA和jobB的前置依赖,但aarch64的jobD只会依赖于aarch64的jobC,即各分支依赖独立。 + +这样避免了在很多场景下对相同结构的重复声明。 + +#### 1.8.2. 
矩阵x矩阵 + +流水线控制流支持多matrix在不同位置声明,在这种情况下,下文矩阵受到上文矩阵影响,下文矩阵实际为上文矩阵与下文矩阵相乘的结果,如下示例: + +```yaml +sequence: + matrix:m1: + os: + - openeuler + os_version: + - 20.03 + - 22.03-LTS + jobA: + matrix:m2: + arch: + - aarch64 + - x86_64 + jobB: +``` + +对于这个例子而言,jobA共有两种上下文分支,而jobB共有4种,如下所示: + +``` + ________ ________ + | | | | +--------------| jobA |-----------------------------------------| jobB |-----------------------> + |________| |________| + 1. os=openeuler; os_version=20.03 1. os=openeuler;os_version=20.03;arch=aarch64; + 2. os=openeuler; os_version=22.03-LTS 2. os=openeuler;os_version=20.03;arch=x86_64; + 3. os=openeuler;os_version=22.03-LTS;arch=aarch64; + 4. os=openeuler;os_version=22.03-LTS;arch=x86_64; +``` + +因此对于声明此例控制流的流水线,实际jobA将会被提交两次,jobB将会被提交4次,jobB的1和2分支依赖于jobA的1分支,jobB的3和4分支依赖于jobA的2分支,jobB的最终参数组合即jobA之前声明的matrix与jobB之前的matrix相乘的结果。 + +#### 1.8.3. excludes语法特性 + +matrix支持通过excludes声明排除特定的组合,如下所示: + +```yaml +sequence: + matrix: + os: + - openeuler + - centos + os_version: + - "20.03" + - 7 + excludes: + # 下述两种描述形式均可支持 + - {"os": "openeuler", "os_version": "7"} + - os: centos + os_version: "20.03" +``` + +此声明方式意为此矩阵只存在两种参数组合,即 “os=openeuler;os_version=20.03” 和 “os=centos;os_version=7”。 + +#### 1.8.4. 参数组合语法糖 + +同时,matrix具备一种简化excludes声明的语法糖"|",上述样例可以改写为: + +```yaml +sequence: + matrix: + os|os_version: + - openeuler | 20.03 #有无空格或者制表符均支持 + - centos | 7 #推荐以制表符分隔,这样的声明较为直观 +``` + + + +### 1.9. "引用"表达式声明 + +#### 1.9.1. 
基本概念 + +对于一条正在运作的流水线而言,其上下文是动态的,每执行完成一个任务,每感知到一个有效事件,“当前”上下文都会发生变化。 + +流水线运行上下文(Context)由六个固定的namespace组成: + +- vars,流水线变量空间(流水线静态变量全集) +- event,事件空间(事件数据全集) +- jobs,任务空间(前置已完成的任务数据) +- matrix,矩阵空间(当前矩阵参数组合分支的参数集合) +- depends,未满足的依赖事件清单(此namespace一般不会被引用) +- fullfilled,已履行的依赖事件清单(此namespace一般不会被引用) + +流水线不仅支持对定值的声明,还具备“引用”的语法特性,可以对“当前上下文”的变量进行引用,以及进行字符串拼接和python表达式运算。 + +“引用”由模式 ${{ xxxx }} 识别,通过"."的方式获取不同namespace下的value,支持下述两种使用方式: + +- 字符串拼接引用 + + ```yaml + # 取vars空间中varA变量与varB变量的值,通过"-"拼接 + key: ${{ vars.varA }}-${{ vars.varB }} + # 取当前矩阵参数组合的os、os_version、arch拼接命名 + project_name: my_project:${{ matrix.os }}:${{ matrix.os_version }}:${{ matrix.arch }} + ``` + + 对于这种拼接引用的方式,需要用户确保引用变量的值一定是字符串。如果实际的值不为字符串或者无法转换为字符串,那么采用这种引用声明的job很可能无法正常提交。 + +- 单引用 + + 单引用的情况下,引用表达式的结果可以为字符串、数字、数组(列表)或者哈希表(字典),不受类型影响。 + + ```yaml + # 取vars空间中的数组arrayA,作为key的值 + key: ${{ vars.arrayA }} + # 取前置已完成的jobA的输出result.arrayB,作为key的值 + key: ${{ jobs.jobA.result.arrayB }} + ``` + +**注意:**引用特性仅支持在defaults和overrides下使用,即jobs.xxx下的defaults和overrides或者sequence下某个job的defaults和overrides。后续演进的语法版本中将加入"在matrix中引用vars变量"的支持。 + +#### 1.9.2. 
python语法支持 + +对于任意引用而言,在引用的变量被实际的值替换后,替换后的内容将会被当作python表达式运行,如下示例: + +```yaml +# 取多个不同namespace的变量进行数值运算 +key: ${{ vars.numA + jobs.jobA.result.success_num }} + +# 调用python datetime模块,获取年月日并拼接字符串 +project_name: ${{ vars.my_name }}-${{ datetime.datetime.now().year }}-${{ datetime.datetime.now().month }}-${{ datetime.datetime.now().day }} + +# 调用字符串处理方法,对字符串进行大小写转换,split等操作 +key: ${{ vars.stringA.lower() }} +key: ${{ vars.stringA.split(':') }} + +# 单纯通过python表达式计算数值,不对变量进行引用,如计算一天一共有多少秒 +key: ${{ 24*60*60 }} +``` + +支持的非内置Python模块: + +| 模块名 | 作用 | 官方文档链接 | +| -------- | ---------------------- | ----------------------------------------------- | +| re | 提供正则表达式匹配操作 | https://docs.python.org/3/library/re.html | +| math | 提供数学运算函数 | https://docs.python.org/3/library/math.html | +| time | 提供时间相关函数 | https://docs.python.org/3/library/time.html | +| datetime | 提供日期和时间处理函数 | https://docs.python.org/3/library/datetime.html | + +支持的安全Python内置函数与类型: + +| 类型 | 名称 | +| ------------ | ------------------------------------------------------------ | +| 数据类型 | object, bool, int, float, complex, str, bytes, bytearray, tuple, list, set, frozenset, dict | +| 数学运算 | abs, round, pow, divmod | +| 迭代器 | iter, next | +| 集合操作 | len, sum, min, max, all, any, map, filter, zip, enumerate, sorted, reversed | +| 数字转换 | bin, hex, oct | +| 字符串格式化 | ascii, repr, chr, ord, format | +| 变量和内存 | dir, locals, globals, id, hash | +| 类型检查 | isinstance, issubclass, callable | + + + +## 2. 
workflow.yaml完整示例 + +以下是一个完整的workflow.yaml文件示例: + +```yaml +# 语法版本声明 +version: v1.0 + +# 流水线命名 +name: 每日构建 + +# 触发设置 +on: + # 设定定时触发事件,每天00:00触发 + - type: cron + week_day: + - Monday + - Tuesday + - Wednesday + - Thursday + - Friday + - Saturday + - Sunday + time: 00:00 + start_date: 2023-10-18 + +# 流水线变量设置 +vars: + eulermaker_account: account + eulermaker_password: passwd + os: os + os_version: version + +# 任务声明 +jobs.eulermaker-build-project:everything: + overrides: + project_name: ${{ vars.os }}-${{ vars.os_version }}:everything + build_type: full + build_arch: ${{ matrix.arch }} + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + testbox: vm-2p8g + +jobs.eulermaker-build-project:epol: + overrides: + project_name: ${{ vars.os }}-${{ vars.os_version }}:epol + build_type: full + build_arch: ${{ matrix.arch }} + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + testbox: vm-2p8g + +jobs.eulermaker-create-image: + overrides: + image_project_params: + pipeline_info: + pipeline_name: ${{ vars.os }}-${{ vars.os_version }}-${{ datetime.datetime.now().year }}-${{ datetime.datetime.now().month }}-${{ datetime.datetime.now().day }}-1 + group: dailybuild + category: standard + scene: cloud + image_format: qcow2 + arch: ${{ matrix.arch }} + image_config: + release_image_config: + repo_url: > + http://xxxxx/ + http://xxxx/ + http://xxx/ + http://xxxxxx/ + product: ${{ vars.os.lower() }} + version: ${{ vars.os_version }}-${{ datetime.datetime.now().year }}-${{ datetime.datetime.now().month }}-${{ datetime.datetime.now().day }} + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + testbox: vm-2p8g + +jobs.eulermaker-build-image: + overrides: + secrets: + ACCOUNT: ${{ vars.eulermaker_account }} + PASSWORD: ${{ vars.eulermaker_password }} + pipeline_id: ${{ jobs.eulermaker-create-image.result.id }} + testbox: vm-2p8g + runtime: ${{ 24*60*60 }} 
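+    # 补充说明(以下注释为对本示例中引用表达式的解读,非语法要求):
+    # pipeline_id 通过jobs命名空间引用了前置任务eulermaker-create-image的输出id(见章节1.9.1);
+    # runtime 则为纯python表达式,求值结果为 86400,即24小时对应的秒数(见章节1.9.2)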
+ +jobs.qcow2rootfs: + overrides: + qcow2rootfs.qcow2_os: ${{ jobs.eulermaker-build-image.result.product }} + qcow2rootfs.qcow2_arch: ${{ matrix.arch }} + qcow2rootfs.qcow2_version: ${{ jobs.eulermaker-build-image.result.version }} + qcow2rootfs.qcow2_url: ${{ jobs.eulermaker-build-image.result.image_link }} + qcow2rootfs.rootfs_protocol: nfs + qcow2rootfs.rootfs_server: "172.168.131.2" + qcow2rootfs.rootfs_path: os-rw + testbox: vm-2p32g + +jobs.mugen-smoke-baseinfo: + overrides: + os: ${{ jobs.qcow2rootfs.result.os }} + os_version: ${{ jobs.qcow2rootfs.result.version }} + os_mount: nfs + arch: ${{ jobs.qcow2rootfs.result.arch }} + testbox: vm-2p8g + +jobs.mugen-smoke-basic-os: + overrides: + os: ${{ jobs.qcow2rootfs.result.os }} + os_version: ${{ jobs.qcow2rootfs.result.version }} + os_mount: nfs + arch: ${{ jobs.qcow2rootfs.result.arch }} + testbox: vm-2p8g + +# 控制流声明 +sequence: + # 矩阵声明 + matrix: + arch: + - aarch64 + - x86_64 + # 并行子结构声明 + parallel:build: + eulermaker-build-project:everything: + eulermaker-build-project:epol: + # 串行子结构声明 + sequence:create-image: + eulermaker-create-image: + eulermaker-build-image: + qcow2rootfs: + parallel:AT: + mugen-smoke-baseinfo: + mugen-smoke-basic-os: +``` diff --git "a/docs/zh/docs/Ods-Pipeline/image/UI\347\274\226\346\216\222.jpg" "b/docs/zh/docs/Ods-Pipeline/image/UI\347\274\226\346\216\222.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..c0ed1c60374edba909ca898bf82ed48b9228bc4a Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/UI\347\274\226\346\216\222.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/YAML\347\274\226\346\216\222.jpg" "b/docs/zh/docs/Ods-Pipeline/image/YAML\347\274\226\346\216\222.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..782c9cd987712f9087990a5159971d2a05358f7f Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/YAML\347\274\226\346\216\222.jpg" differ diff --git a/docs/zh/docs/Ods-Pipeline/image/build_stage.png 
b/docs/zh/docs/Ods-Pipeline/image/build_stage.png deleted file mode 100644 index ecc0a751b9bd114f62b47e8a590452457f79e397..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/build_stage.png and /dev/null differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/cci\345\270\220\346\210\267\346\263\250\345\206\214.jpg" "b/docs/zh/docs/Ods-Pipeline/image/cci\345\270\220\346\210\267\346\263\250\345\206\214.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..ca163e10f3198b5aa120aea410e4a25c6fb3ffe2 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/cci\345\270\220\346\210\267\346\263\250\345\206\214.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/cci\345\270\220\346\210\267\347\273\221\345\256\232.jpg" "b/docs/zh/docs/Ods-Pipeline/image/cci\345\270\220\346\210\267\347\273\221\345\256\232.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..8cb9da2ba91a75aae8a6b112753c64d1e6f4c76e Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/cci\345\270\220\346\210\267\347\273\221\345\256\232.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/matrix\346\200\273\350\247\210.jpg" "b/docs/zh/docs/Ods-Pipeline/image/matrix\346\200\273\350\247\210.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..1244aa6147af43a67fd000855b66ba11711e27fe Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/matrix\346\200\273\350\247\210.jpg" differ diff --git a/docs/zh/docs/Ods-Pipeline/image/pipeline.png b/docs/zh/docs/Ods-Pipeline/image/pipeline.png deleted file mode 100644 index 2ade49082715c53365cb7fccbde09f0c7e8cf3a7..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/pipeline.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/post_pipeline.png b/docs/zh/docs/Ods-Pipeline/image/post_pipeline.png deleted file mode 100644 index 
df2d56b25448b9a01419d93be8f5233876f8ba7c..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/post_pipeline.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/post_submit.png b/docs/zh/docs/Ods-Pipeline/image/post_submit.png deleted file mode 100644 index ff3e6f66cfde597a4625d1ac37022c663330ef4b..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/post_submit.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/post_workflow.png b/docs/zh/docs/Ods-Pipeline/image/post_workflow.png deleted file mode 100644 index 8447a2a388ce67c8876285120b4ac90b75ca47a2..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/post_workflow.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/pre_deploy_stage.png b/docs/zh/docs/Ods-Pipeline/image/pre_deploy_stage.png deleted file mode 100644 index 35fea094ba62775d5b80d980d4bc1946e16dd700..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/pre_deploy_stage.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/snapshot.png b/docs/zh/docs/Ods-Pipeline/image/snapshot.png deleted file mode 100644 index 27f8a9d652559dff6a005b56e0d7ec07c0f1be47..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/snapshot.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/test_stage.png b/docs/zh/docs/Ods-Pipeline/image/test_stage.png deleted file mode 100644 index c94bf4fc8542ec6f5dc6f7c1e4b08f3a6f60259f..0000000000000000000000000000000000000000 Binary files a/docs/zh/docs/Ods-Pipeline/image/test_stage.png and /dev/null differ diff --git a/docs/zh/docs/Ods-Pipeline/image/yaml2jobs.png b/docs/zh/docs/Ods-Pipeline/image/yaml2jobs.png deleted file mode 100644 index 6456d49c0215df0388006d50745a9c33a47ce8c1..0000000000000000000000000000000000000000 Binary files 
a/docs/zh/docs/Ods-Pipeline/image/yaml2jobs.png and /dev/null differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\273\223\345\272\223\351\205\215\347\275\256webhook1.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\273\223\345\272\223\351\205\215\347\275\256webhook1.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..8900f78845be8b45e3f677e9fc255b9f4d092f6d Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\273\223\345\272\223\351\205\215\347\275\256webhook1.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\273\223\345\272\223\351\205\215\347\275\256webhook2.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\273\223\345\272\223\351\205\215\347\275\256webhook2.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..1f3d4d7cb390b1a04377da0d5a1426a174b7ba79 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\273\223\345\272\223\351\205\215\347\275\256webhook2.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\273\273\345\212\241\350\257\246\346\203\205.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\273\273\345\212\241\350\257\246\346\203\205.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..1e7f99e6b83371ad665df0353129ae4cf45c5be6 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\273\273\345\212\241\350\257\246\346\203\205.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\273\273\345\212\241\351\207\215\350\257\225.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\273\273\345\212\241\351\207\215\350\257\225.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..7ad1ce1b8b1013b89096607bed6ce132c796015c Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\273\273\345\212\241\351\207\215\350\257\225.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262.jpg" new 
file mode 100644 index 0000000000000000000000000000000000000000..0487c824d31c50be7edf0f4f80859c14c3fe5951 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262\346\235\203\351\231\2201.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262\346\235\203\351\231\2201.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..dfcc830a5a734a12c73af9f37819b950ec520163 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262\346\235\203\351\231\2201.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262\346\235\203\351\231\2202.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262\346\235\203\351\231\2202.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..abfbe32c67022f385250f73fed14153118f88ffa Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\344\277\256\346\224\271\350\247\222\350\211\262\346\235\203\351\231\2202.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\345\210\240\351\231\244\346\265\201\346\260\264\347\272\277.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\345\210\240\351\231\244\346\265\201\346\260\264\347\272\277.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..cdac7a7a2649e82b0eea40a51104a5adbf4f83c1 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\345\210\240\351\231\244\346\265\201\346\260\264\347\272\277.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\345\216\206\345\217\262\350\277\220\350\241\214.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\345\216\206\345\217\262\350\277\220\350\241\214.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..a0e6d7558e2b853921646a4882dd0c93b1d7c5b0 Binary files 
/dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\345\216\206\345\217\262\350\277\220\350\241\214.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\345\217\230\346\233\264\346\265\201\346\260\264\347\272\277\347\261\273\345\236\213.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\345\217\230\346\233\264\346\265\201\346\260\264\347\272\277\347\261\273\345\236\213.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..3fad8fbb1d1176cde38329d16f1c1004eb18bea0 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\345\217\230\346\233\264\346\265\201\346\260\264\347\272\277\347\261\273\345\236\213.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\346\210\220\345\221\230\347\256\241\347\220\206.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\346\210\220\345\221\230\347\256\241\347\220\206.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..c1f73666826bbf68f8a9a6ffcc9c3b298c52fe5f Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\346\210\220\345\221\230\347\256\241\347\220\206.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\346\226\260\345\273\272\346\265\201\346\260\264\347\272\277.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\346\226\260\345\273\272\346\265\201\346\260\264\347\272\277.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..451793dc756d3de06559bb15ba343a61bcc9925b Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\346\226\260\345\273\272\346\265\201\346\260\264\347\272\277.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\346\237\245\347\234\213\346\234\200\346\226\260\350\277\220\350\241\214\350\257\246\346\203\205.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\346\237\245\347\234\213\346\234\200\346\226\260\350\277\220\350\241\214\350\257\246\346\203\205.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..5edd5100c1aa78a58c3b466c806bb4ef7d393b09 Binary files /dev/null and 
"b/docs/zh/docs/Ods-Pipeline/image/\346\237\245\347\234\213\346\234\200\346\226\260\350\277\220\350\241\214\350\257\246\346\203\205.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\346\267\273\345\212\240webhook\350\247\246\345\217\221\346\235\241\344\273\266.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\346\267\273\345\212\240webhook\350\247\246\345\217\221\346\235\241\344\273\266.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..168a7e480f8a4b6f01e3339305d30bd11c29db96 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\346\267\273\345\212\240webhook\350\247\246\345\217\221\346\235\241\344\273\266.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\346\267\273\345\212\240\346\210\220\345\221\230.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\346\267\273\345\212\240\346\210\220\345\221\230.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..b7bf76af38091df75ed1954d55eaa3f698c12270 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\346\267\273\345\212\240\346\210\220\345\221\230.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\347\231\273\345\275\225.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\347\231\273\345\275\225.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..defa329929d065e39ef27fd7cf65d33117ed49f2 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\347\231\273\345\275\225.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\347\244\276\345\214\272\351\211\264\346\235\203.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\347\244\276\345\214\272\351\211\264\346\235\203.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..d1a665f670873ed8e9538c784a8d00afad8f1925 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\347\244\276\345\214\272\351\211\264\346\235\203.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\347\273\223\345\257\271\345\244\215\347\216\260.jpg" 
"b/docs/zh/docs/Ods-Pipeline/image/\347\273\223\345\257\271\345\244\215\347\216\260.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..5a393bed7116cc0c959398480c447e80ed35a257 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\347\273\223\345\257\271\345\244\215\347\216\260.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\347\273\223\345\257\271\345\244\215\347\216\260\350\260\203\350\257\225.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\347\273\223\345\257\271\345\244\215\347\216\260\350\260\203\350\257\225.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..3436e7bedd5cda5dad5e55966db94ba576e1b97f Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\347\273\223\345\257\271\345\244\215\347\216\260\350\260\203\350\257\225.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\347\274\226\346\216\222\347\233\256\346\240\207\346\265\201\346\260\264\347\272\277.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\347\274\226\346\216\222\347\233\256\346\240\207\346\265\201\346\260\264\347\272\277.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..c4aae306bec750f9a3121bcfac2014846d9adeeb Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\347\274\226\346\216\222\347\233\256\346\240\207\346\265\201\346\260\264\347\272\277.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\350\256\276\347\275\256\346\265\201\346\260\264\347\272\277\346\235\203\351\231\220.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\350\256\276\347\275\256\346\265\201\346\260\264\347\272\277\346\235\203\351\231\220.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..c4d2f21e6051000969295c3011c22500a12daed2 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\350\256\276\347\275\256\346\265\201\346\260\264\347\272\277\346\235\203\351\231\220.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\346\265\201\346\260\264\347\272\2771.jpg" 
"b/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\346\265\201\346\260\264\347\272\2771.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..e9d48f17fb05a5ed9966973e212fbea83081a2c5 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\346\265\201\346\260\264\347\272\2771.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\346\265\201\346\260\264\347\272\2772.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\346\265\201\346\260\264\347\272\2772.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..4c00631e6d785cfca1493748a9f529f9b73c469a Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\346\265\201\346\260\264\347\272\2772.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\350\257\246\346\203\205.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\350\257\246\346\203\205.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..db831e2baa85e04f65a5ca9565885ef7fe2016ee Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\350\277\220\350\241\214\350\257\246\346\203\205.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\345\244\215\347\216\260\346\226\271\345\274\217.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\345\244\215\347\216\260\346\226\271\345\274\217.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..d746819f65486e9ba0f98250d95e926b6ea0aa6d Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\345\244\215\347\216\260\346\226\271\345\274\217.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\345\256\242\346\210\267\347\253\257.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\345\256\242\346\210\267\347\253\257.jpg" new file mode 100644 index 
0000000000000000000000000000000000000000..675f30fcf1dd480cf3e6a1f6202d4912596444fc Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\345\256\242\346\210\267\347\253\257.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\346\250\241\346\235\277.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\346\250\241\346\235\277.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..ebde36c2b8ae3f30aff0d9519208bc5b41651108 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\351\200\211\346\213\251\346\250\241\346\235\277.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\351\205\215\347\275\256\345\256\232\346\227\266\350\247\246\345\217\221\346\235\241\344\273\266.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\351\205\215\347\275\256\345\256\232\346\227\266\350\247\246\345\217\221\346\235\241\344\273\266.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..990d99a5ffa13bfdabdcd05ab73428eaca278e68 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\351\205\215\347\275\256\345\256\232\346\227\266\350\247\246\345\217\221\346\235\241\344\273\266.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\351\205\215\347\275\256\346\265\201\346\260\264\347\272\277\345\217\230\351\207\217.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\351\205\215\347\275\256\346\265\201\346\260\264\347\272\277\345\217\230\351\207\217.jpg" new file mode 100644 index 0000000000000000000000000000000000000000..5c58a5ada93c8260daf2ff4cbe5af4659b45badd Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\351\205\215\347\275\256\346\265\201\346\260\264\347\272\277\345\217\230\351\207\217.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/image/\351\207\215\350\257\225\346\265\201\346\260\264\347\272\2771.jpg" "b/docs/zh/docs/Ods-Pipeline/image/\351\207\215\350\257\225\346\265\201\346\260\264\347\272\2771.jpg" new file mode 100644 index 
0000000000000000000000000000000000000000..91cdb6a6b06855c1293602665080c41cebfba301 Binary files /dev/null and "b/docs/zh/docs/Ods-Pipeline/image/\351\207\215\350\257\225\346\265\201\346\260\264\347\272\2771.jpg" differ diff --git "a/docs/zh/docs/Ods-Pipeline/ods pipeline\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/Ods-Pipeline/ods pipeline\347\224\250\346\210\267\346\214\207\345\215\227.md" index 57aa7758fc7a6a7f8d320a434138a1b8be4b8c39..069922054ccd3fe231269bbcacc3774e3cae9e44 100644 --- "a/docs/zh/docs/Ods-Pipeline/ods pipeline\347\224\250\346\210\267\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/Ods-Pipeline/ods pipeline\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -1,161 +1,185 @@ -# ods pipeline用户指南 - -## 前言 - -### 概述 - -本文档介绍了ods的pipeline,workflow定义,编写规则,功能说明,使用的方法,以及日志结果查询。 - -### 使用对象 - -- 社区开发者 -- 版本Maitainer -- 性能优化工程师 - -### 术语说明 - -| 名称 | 说明 | -| :------: | :----------------------------------------------------------: | -| DevOps | 是指在一组开发者服务的集合,开发者可以直接使用服务,也可以对服务进行编排。 | -| pipeline | 对于编排好的一系列任务,可被系统执行的语法yaml 文件。 | -| workflow | 为完成一组功能的工作流(如:构建为repo、生成ISO等),是一个编排好的jobs,可以是被pipeline 引用。 | -| job | 在某个执行机器上具体指定的一组执行接口。 | - -### 修订记录 - -| 文档版本 | 发布日期 | 修改说明 | -| -------- | ---------- | -------- | -| 01 | 2023-06-28 | | - -## 工具概述 - -ods是基于pipeline执行流水线测试任务并输出结果的工具,支持配置化管理pipeline以及workflow,灵活组合workflow,降低人工参与成本。解决任务流中繁琐人工操作,效率低的问题。 - -ods主要提供以下功能: - -- pipeline与workflow管理 - -对用户上传的pipeline.yaml或者workflow.yaml进行管理,并能生成pipeline的快照。 - -- 运行pipeline - -用户可运行指定的pipeline,并生成日志。 - -### 应用场景 - -- 操作系统串行测试 :构建任务完成后 ,测试在某台机器上完成 - -``` --提交构建repo任务 --生成镜像(基于多个repo,生成ISO、image) --iso做成compass-ci做成compass-ci 支持的rootfs --提交串行测试任务 --汇总测试结果 -``` - -- 操作系统并行测试 :构建任务完成后 ,测试在不同的台机器上完成 - -``` --构建repo --生成镜像(基于多个repo,生成ISO、image) --iso做成compass-ci做成compass-ci 支持的rootfs --执行并行测试任务 --汇总测试结果 -``` - -- 操作系统包有更新 :操作系统本身不变,基于repo 更新rpm,开展测试 - -``` --构建rpm repo --加载对应系统的rootfs --执行测试任务 --汇总测试结果 -``` - -- 场景测试(mysql, ceph, spark => 构建+部署+冒烟测试) - - 
-```
-- Build the software packages
-- Load the rootfs of the corresponding system
-- Run the test tasks
-- Output the test report
-```
-
-### Usage
-
-In this example, the following fields must all be configured according to your actual deployment. Generation methods for some fields are provided above:
-
-${HOST_IP}: IP address of the host running the ods services
-
-${ODS_MANAGER_PORT}: port of the ods management service
-
-${ODS_RUNNER_PORT}: port of the ods task runner service
-
-${PIPELINE_NAME}: name of the pipeline
-
-${WORKFLOW_NAME}: name of the workflow
-
-${SUBMIT_USER}: username of the user submitting the task
-
-Step 1: Write and submit workflow.yaml
-Take the following figures as an example:
-
-![build_stage](./image/build_stage.png)
-
-![pre_deploy_stage](./image/pre_deploy_stage.png)
-
-![test_stage](./image/test_stage.png)
-
-Workflows named build_stage, pre_test_stage, and test_stage are written. The content is not limited to what the figures show; keep it consistent with how compass-ci handles job parameters.
-
-Call the interface to submit each of them:
-
-![post_workflow](./image/post_workflow.png)
-
-
-Step 2: Write and submit pipeline.yaml
-
-Take the following figure as an example:
-
-![pipeline](./image/pipeline.png)
-
-A pipeline named test is written. The content is not limited to what the figure shows.
-
-Call the interface to submit it:
-
-![post_pipeline](./image/post_pipeline.png)
-
-
-Note: after a pipeline is submitted, the workflow information is looked up by the workflow names the pipeline contains and expanded to generate a snapshot. The snapshot can only be updated by updating the pipeline, and jobs are parsed from the snapshot.
-
-Snapshot example:
-
-![snapshot](./image/snapshot.png)
-
-
-Step 3: Submit the pipeline for execution
-
-Call the interface to run it:
-
-![post_submit](./image/post_submit.png)
-
-Notes:
-
-1. The matrix parameters under the pipeline are combined as a Cartesian product, generating multiple tasks.
-
-2. The dependencies of a single job are the deduplicated union of workflow.need + job.depends + job.input. If a dependency cannot be found, it is skipped. Taking the figure above as an example, the final job dependencies are shown below:
-
-![yaml2jobs](./image/yaml2jobs.png)
-
-3. A single job is submitted only when all of its dependencies are satisfied; otherwise it is not submitted for execution.
-
-
-Step 4: Check the results
-
-The results of a single job are under the /result directory on ${HOST_IP}; find the corresponding test suite name and check the files under it for detailed logs.
-
+# EulerDevOps Native Development Service User Manual
+
+This document introduces the features of the EulerDevOps native development service from the perspective of the web UI.
+
+### 1. Login and Registration
+
+The EulerDevOps native development service uses a unified multi-endpoint user management system based on the openEuler community authentication service; a user identity consists of a community identity plus a cluster identity.
+
+- On the initial page, click the floating icon in the upper right corner, then click the login button.
+
+  ![登录](./image/登录.jpg)
+
+- Authenticate your identity through openEuler community authentication.
+
+  ![鉴权](./image/社区鉴权.jpg)
+
+- lab-z9 is the current public executor cluster. After community authentication, users need to register or bind a Compass-CI account on lab-z9.
+
+  - To bind an existing user, provide four required fields: Compass-CI account, Compass-CI username, Compass-CI email, and Compass-CI token.
+
+    ![登录](./image/cci帐户绑定.jpg)
+
+  - If you do not yet have a Compass-CI account on the lab-z9 cluster, you can also register a new user.
+
+    ![注册](./image/cci帐户注册.jpg)
+
+
+
+### 2. Creating a Pipeline
+
+The EulerDevOps native development service supports quickly creating pipelines from the pipeline YAML templates in the lkp-tests repository. Pipeline templates are grouped by their directory into different types so that users can quickly pick one for their specific scenario.
+
+- Click the "New pipeline" button (the button is centered when the page has no pipeline data, and in the upper left corner when it does).
+
+  ![新建](./image/新建流水线.jpg)
+
+- The initial type of a pipeline can be set by choosing "Public" or "Restricted". All users can view the content of a public pipeline but cannot operate on it; all interface permissions of a restricted pipeline are controlled. The user also needs to select a client of a repository type or a custom type, and the pipeline runs on that client (different clients offer different pipeline templates and selectable atomic tasks).
+
+- The details of a public pipeline can be viewed by any user, but operations other than queries require the pipeline maintainer's management permission.
+
+- A restricted pipeline is fully controlled by the pipeline maintainer.
+
+  ![选择客户端](./image/选择客户端.jpg)
+
+  ![选择模板](./image/选择模板.jpg)
+
+
+
+### 3. Orchestrating a Pipeline
+
+Pipelines can be orchestrated in two ways: UI orchestration on the web page, and YAML orchestration (editing the pipeline's descriptive text).
+
+- UI orchestration
+
+  Changes made through UI orchestration are also synchronized to the pipeline YAML; the two are essentially the same behavior. The pipeline web service can "translate" UI orchestration into YAML content changes, lowering the entry barrier for orchestrating pipelines.
+
+  ![UI编排](./image/UI编排.jpg)
+
+  - Setting pipeline variables
+
+    ![流水线变量](./image/配置流水线变量.jpg)
+
+  - Setting scheduled trigger conditions
+
+    ![定时](./image/配置定时触发条件.jpg)
+
+  - Setting webhook trigger conditions
+
+    ![webhook](./image/添加webhook触发条件.jpg)
+
+  - After adding a webhook trigger condition, configure the webhook in the repository. The figures below use gitee as an example: in the gitee repository, click Settings -> WebHooks -> Add WebHook.
+
+    ![webhook1](./image/仓库配置webhook1.jpg)
+
+  - For the URL, fill in the webhook URL provided on the pipeline service's webhook trigger configuration page, then choose "WebHook password" and fill in the password generated on that page.
+
+    ![webhook2](./image/仓库配置webhook2.jpg)
+
+  - Adding/removing atomic tasks (not yet available)
+
+  - Selecting atomic tasks and viewing their detailed descriptions (not yet available)
+
+  - Modifying atomic task parameters (not yet available)
+
+- YAML orchestration
+
+  - Orchestrating through YAML requires familiarity with the pipeline syntax. The pipeline supports parsing multiple syntax versions; the syntax version is declared with the "version" keyword, and currently only the v1.0 syntax is supported.
+
+  - For the v1.0 syntax, see: [EulerDevOps原生开发服务流水线语法v1.0](https://gitee.com/openeuler-customization/ods/blob/master/doc/grammar/v1/v1.0_grammar.md)
+
+  ![yaml](./image/YAML编排.jpg)
+
+- Changing the pipeline type, from public to restricted or from restricted to public
+
+  ![变更类型](./image/变更流水线类型.jpg)
+
+- Deleting the pipeline
+
+  ![删除](./image/删除流水线.jpg)
+
+
+
+### 4. Running a Pipeline
+
+There are three ways to run a pipeline: users can manually click the run or rerun button to trigger execution, or configure triggers during orchestration so that the pipeline is driven by webhook events or scheduled events.
+
+- Driven by webhook events
+- Driven by scheduled events
+
+- Manual runs can be triggered with either of the following two buttons:
+
+  ![运行1](./image/运行流水线1.jpg)
+
+  ![运行2](./image/运行流水线2.jpg)
+
+
+
+### 5. Pipeline Permission Settings
+
+Click the settings button to enter the pipeline's permission configuration page, where pipeline permissions are managed through member management, role assignment, and role permission settings.
+
+- Click Settings to enter member management.
+
+  ![点击设置](./image/设置流水线权限.jpg)
+
+  ![成员管理](./image/成员管理.jpg)
+
+- Pipeline members can be added via an invitation link (not yet available) or directly by username.
+
+  ![添加成员](./image/添加成员.jpg)
+
+- A member's role can be changed from the actions column of member management.
+
+  ![修改角色](./image/修改角色.jpg)
+
+- On the permission management tab, you can add custom roles or modify access permissions based on existing roles.
+
+  ![权限1](./image/修改角色权限1.jpg)
+
+  ![权限2](./image/修改角色权限2.jpg)
+
+
+
+### 6. Pipeline Run Details
+
+After a pipeline runs, click the highlighted pipeline name in the name column to enter the details page of its latest run, as shown below.
+
+![查看运行](./image/查看最新运行详情.jpg)
+
+![详情](./image/运行详情.jpg)
+
+On this page, users get a clear view of the pipeline's running state and the execution details of each task, and can control the run through the rerun, suspend, and cancel buttons.
+
+- View task details and inspect the dmesg log.
+
+  ![任务详情](./image/任务详情.jpg)
+
+- Retry a single task to try to resume the pipeline from the failed node.
+
+  ![任务重试](./image/任务重试.jpg)
+
+- View the run history; click the details button in the actions column of the history table to jump to the details page of a historical run.
+
+  ![历史运行](./image/历史运行.jpg)
+
+- Click the matrix overview button in the upper right corner to view, when the pipeline has a matrix, the real-time execution status of each parameter combination as an intuitive grid of tiles.
+
+  ![matrix](./image/matrix总览.jpg)
+
+
+
+### 7. One-Click Reproduction and Pair Debugging
+
+The native development pipeline provides one-click reproduction. On the run details page, users can click the one-click reproduction button on any task card and choose the standalone or pair mode to reproduce the task's execution environment, then access the terminal through the web console provided by the service or via public-key SSH (not yet available) to locate and debug problems.
+
+- Standalone reproduction means the user creates a reproduction room alone; as long as the room token is not shared with others, the user has exclusive use of the environment's web terminal.
+
+- Pair reproduction requires the user to enter the token of an existing reproduction room, joining someone else's room to share its web terminal for pair debugging.
+
+![选择复现方式](./image/选择复现方式.jpg)
+
+![结对选项](./image/结对复现.jpg)
+
+![结对复现](./image/结对复现调试.jpg)
+
diff --git a/docs/zh/docs/Open-Source-Software-Notice/openEuler-Open-Source-Software-Notice.zip b/docs/zh/docs/Open-Source-Software-Notice/openEuler-Open-Source-Software-Notice.zip index ab74b7f0aa515f45d656bca9e018d47f7159c3d2..ad250749c025bd799472f607c3cc78dabe9a333e 100644 Binary files a/docs/zh/docs/Open-Source-Software-Notice/openEuler-Open-Source-Software-Notice.zip and b/docs/zh/docs/Open-Source-Software-Notice/openEuler-Open-Source-Software-Notice.zip differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2661.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2661.png" new file mode 100644 index 0000000000000000000000000000000000000000..47530416ac9ec71da0d1e925e3132b0b2b785855 Binary files /dev/null and
"b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2661.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26610.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26610.png" new file mode 100644 index 0000000000000000000000000000000000000000..39a57121b446cdbf5b4331d15bb6360eb5496b69 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26610.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26611.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26611.png" new file mode 100644 index 0000000000000000000000000000000000000000..60b687eb558441caec7dc875677b27c86ce65025 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26611.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26612.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26612.png" new file mode 100644 index 0000000000000000000000000000000000000000..b48b1b06a4d993e56475d93684a0f330dcfdf979 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26612.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26613.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26613.png" new file mode 100644 index 0000000000000000000000000000000000000000..c11643e7ded48861f0ed8ec6b42e0b6133345df6 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26613.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26614.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26614.png" new file mode 100644 index 0000000000000000000000000000000000000000..b6865b5086e39119276bd5b5a6ded26549cc4f84 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26614.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26615.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26615.png" new file mode 100644 index 
0000000000000000000000000000000000000000..604eb1bfa61471130f26a8f7ffd1c0b3614b22d3 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26615.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26616.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26616.png" new file mode 100644 index 0000000000000000000000000000000000000000..912a5d5d99277969277b372581ca1ecc039019c4 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26616.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26617.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26617.png" new file mode 100644 index 0000000000000000000000000000000000000000..ca4ef0f0e6dff84b39f4dab70a6b84a0a40cfb04 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26617.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26618.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26618.png" new file mode 100644 index 0000000000000000000000000000000000000000..41db05c19217a7f913b902e548a85d0f818fd5fb Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26618.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26619.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26619.png" new file mode 100644 index 0000000000000000000000000000000000000000..9c973575f161d398a347960ecc0b07b4d465033a Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26619.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2662.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2662.png" new file mode 100644 index 0000000000000000000000000000000000000000..84fb9ab57756bb845d87e7d485a6a10ed2da280b Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2662.png" differ diff --git 
"a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26620.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26620.png" new file mode 100644 index 0000000000000000000000000000000000000000..b4df524d8de03adb72d85e20c79124fec981af5a Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\26620.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2663.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2663.png" new file mode 100644 index 0000000000000000000000000000000000000000..1c14ef5949d4db979061aba87962e6236207dfcd Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2663.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2664.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2664.png" new file mode 100644 index 0000000000000000000000000000000000000000..e8fcd86545f43145738da6ba02ae4e3e3f97d6b6 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2664.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2665.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2665.png" new file mode 100644 index 0000000000000000000000000000000000000000..9b1627384a188a831a1f2c629220bbbf2b102d86 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2665.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2666.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2666.png" new file mode 100644 index 0000000000000000000000000000000000000000..db62f93aaeb6c273984fb11854ee1585015131ea Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2666.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2667.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2667.png" new file mode 100644 index 0000000000000000000000000000000000000000..df8450690e4e7becd0c999fb5d2d3561a9b3d44b Binary files /dev/null 
and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2667.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2668.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2668.png" new file mode 100644 index 0000000000000000000000000000000000000000..b58d90081d2a8735bcd4af4f3a878badcab78ccd Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2668.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2669.png" "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2669.png" new file mode 100644 index 0000000000000000000000000000000000000000..daebc416e2f8700d4e57371c2f6d3a727f401e05 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/A\346\217\222\344\273\2669.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2661.png" "b/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2661.png" index 5c7eaa4cde3364c70ca6bff24c768edad986a59c..6a5398b3ede4a0939f415df176aed10d0c582506 100644 Binary files "a/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2661.png" and "b/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2661.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2662.png" "b/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2662.png" index 45437297fb46749b9f840f45e38cc3e5c4d0d595..4c51e269d22cda516e7a7c4191aa5a398773e1be 100644 Binary files "a/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2662.png" and "b/docs/zh/docs/PilotGo/figures/G\346\217\222\344\273\2662.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/P\346\217\222\344\273\2661.png" "b/docs/zh/docs/PilotGo/figures/P\346\217\222\344\273\2661.png" index f4a923729e62fb321931342ec56238b568dbf16e..bdb7a7c95da1562829cb12a445948f1bdcc8d7e5 100644 Binary files "a/docs/zh/docs/PilotGo/figures/P\346\217\222\344\273\2661.png" and "b/docs/zh/docs/PilotGo/figures/P\346\217\222\344\273\2661.png" differ diff --git 
"a/docs/zh/docs/PilotGo/figures/T\345\210\233\345\273\27216.png" "b/docs/zh/docs/PilotGo/figures/T\345\210\233\345\273\27216.png" new file mode 100644 index 0000000000000000000000000000000000000000..6170f1e4b3c72d7c5759f8604e5eb81fd4f0ba81 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\345\210\233\345\273\27216.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2661.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2661.png" new file mode 100644 index 0000000000000000000000000000000000000000..f8a60fe0a2c7ab78b7cb9e27d2ee3e2bb46b855c Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2661.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26610.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26610.png" new file mode 100644 index 0000000000000000000000000000000000000000..0737ac47dd75324fd32bc04d30efebc0c82d06e6 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26610.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26611.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26611.png" new file mode 100644 index 0000000000000000000000000000000000000000..e68b2924606bdacd45780df26ea3040512f41529 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26611.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26612.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26612.png" new file mode 100644 index 0000000000000000000000000000000000000000..434f5ae6f698cffe047c8d841e33d56b9524fb57 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26612.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26613.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26613.png" new file mode 100644 index 0000000000000000000000000000000000000000..52b832b784480c487bbfe522a2c288abc7688d63 Binary files 
/dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26613.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26614.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26614.png" new file mode 100644 index 0000000000000000000000000000000000000000..9f9d37bc22605aeca505ffa02cdb1fb4f77348c2 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26614.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26615.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26615.png" new file mode 100644 index 0000000000000000000000000000000000000000..9e2ae86c1eeb97b7a5497bd9088f397f12a1f1c3 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26615.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26616.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26616.png" new file mode 100644 index 0000000000000000000000000000000000000000..e11c688d9c1bf95b54006c48d34dd986e743d875 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26616.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26617.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26617.png" new file mode 100644 index 0000000000000000000000000000000000000000..a51bf0d3c26008efdf1611c57d9edd8566b4404b Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\26617.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2662.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2662.png" new file mode 100644 index 0000000000000000000000000000000000000000..642736945bc173fc0ee586d3dcced92b8b095b51 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2662.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2663.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2663.png" new file mode 100644 index 
0000000000000000000000000000000000000000..cacef3c82a2b26d782a93855bd90b4997877ba6b Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2663.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2664.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2664.png" new file mode 100644 index 0000000000000000000000000000000000000000..1b83df0d7778cac5296d2e9886a901a8292160a9 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2664.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2665.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2665.png" new file mode 100644 index 0000000000000000000000000000000000000000..7d444877c2dbc0f7ed83528f1847c45e3aa0aec5 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2665.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2666.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2666.png" new file mode 100644 index 0000000000000000000000000000000000000000..45ec5470710ecdf3430f4468b1dcb8716f8b8485 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2666.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2667.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2667.png" new file mode 100644 index 0000000000000000000000000000000000000000..8531a7cc0116c09e01b7ce0a46fd3c9b873fa254 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2667.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2668.png" "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2668.png" new file mode 100644 index 0000000000000000000000000000000000000000..c413b2ce7881ffcba3bcb38b766ed2c0961bbad7 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2668.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2669.png" 
"b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2669.png" new file mode 100644 index 0000000000000000000000000000000000000000..882213869392dd7e787e92f45b65386c71e8305d Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/T\346\217\222\344\273\2669.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2011.png" "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2011.png" index a51096f17e336fc0917bce7be08ff69ec2604562..94b11e9d73b09a6592418fd28895f14049aa3042 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2011.png" and "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2011.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2012.png" "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2012.png" index f26d9ddf85da2d5955ce8f9d338fd1bb036b1132..283a407e3c2c051893528306d485b9cb0df9f4df 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2012.png" and "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2012.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2013.png" "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2013.png" index b3ffd4507aab3a85b3ab8e775bc1ab4c1efcfda3..123213c25423d8eaa72f346f9531dcd7f0ff15af 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2013.png" and "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\345\257\206\347\240\2013.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2711.png" "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2711.png" index 4a127fafef22d62f326e38075173f53f244acfa7..6a703ed196cdc2b4658e317d5fa51330583977fa 100644 Binary 
files "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2711.png" and "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2711.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2712.png" "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2712.png" index 8a097306b1dbf7ce5c6cb14e9c84ff7f59079dfb..aedc97318d613a230f584d3cd7a39b38d30619e4 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2712.png" and "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2712.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2713.png" "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2713.png" index 1e517062c17505a2ec0905863934e5e0a5e47c36..28d6cc49987efe174b3ff8a53db25d472c35c334 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2713.png" and "b/docs/zh/docs/PilotGo/figures/\344\277\256\346\224\271\350\212\202\347\202\2713.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2411.png" "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2411.png" index ee14b990e8ab6cf0c71bef1a40cb74cd2919e2fc..c68c4411fd6c1babe7b2014c1dcbd0a60854b033 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2411.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2411.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2412.png" "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2412.png" index 1f5a1658552227a88cf07f592e048c4bc1005286..ef19f105e2490140151cb54ebffeebcc5aba87e1 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2412.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2412.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2413.png" "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2413.png" index 4066752952e177ca2bb14b61a86d44ff1efc11f6..f252f90886373a0378effb7a41ca2e2c7d3f4f60 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2413.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2413.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2414.png" "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2414.png" index ade3fb143ac6a0186985b63c5505afef9666e57e..0f0e95086a9082fc339e9dd930d727f652cea3d8 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2414.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\233\345\273\272\346\211\271\346\254\2414.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2411.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2411.png" index e360587420e42233933a9bb27ad31a62557374f0..5b463d0ebb4d5ee3de034e9bfcd668ec5dd20486 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2411.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2411.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2412.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2412.png" index 0efb93e8dd16f855b444d6a5891be38fdebe92c7..441278a946303bd01f93bddfcbd7b5f5c248033e 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2412.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2412.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2413.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2413.png" index 2263d7c359bc58451f9382693b98c15cae4fb273..4f5f66df31a36437f14e2bb82c8c95c1cecc977a 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2413.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\211\271\346\254\2413.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2501.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2501.png" index 74c10a8dee0fb08e4ac39d73c3389b9a2262c143..46b8df350f33a09d26250fc1394ff364c5240877 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2501.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2501.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2502.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2502.png" index d4e467dd0b6fbd9d13a928deebfa8cca1a515c61..987d5c751cbdcd7a931c8e790c77a8acd69761d6 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2502.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2502.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2503.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2503.png" index 1bb38a09498d5a0d8c96aef1ce7b39f8bbb43207..2690f8dd111fca999a6208ac6220fba27c8de7d8 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2503.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\346\234\272\345\231\2503.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2671.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2671.png" index c0599cd9d3679c2c16debcbf46b85b1328130104..aa44c85f2249df946761c8a230edc0289ea028f2 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2671.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2671.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2672.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2672.png" index 96a3636ed380608616fccb672017ef363108d529..1bed708ca71b6c69229697707c54be66d96803ac 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2672.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\347\224\250\346\210\2672.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2711.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2711.png" index e278954b5422dff1a59ca4acd37b601c9f0ad24e..b1fb1936d03477f1fa3f774382a30f423aae30be 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2711.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2711.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2712.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2712.png" index e739b14f7b60794065a9ec8a9b2478b2f0b37dd0..417f2c3ab074a0e962ea08e81b9511322851700f 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2712.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2712.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2713.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2713.png" index d8c8967d525a68515a7ce651f7d30169654bd784..ade96f981a8b1dcb8fa18e21a710ae5eba829024 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2713.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\212\202\347\202\2713.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2621.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2621.png" index cf3d51f7ab12f241f8a93223631406d0c1b99ab4..0173d0db11dd18d0454523ead716e5dcd80fc833 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2621.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2621.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2622.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2622.png" index b41055b466720578ca9282ff31589b6e147e8ada..e3d52203ec36234581a3064ff0931cb9c9cf7afe 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2622.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2622.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2623.png" "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2623.png" index 661ed75def31a49cbf6043493c1805d65c83a83b..966f04257fbd52ee6713d83350b251b8f7255567 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2623.png" and "b/docs/zh/docs/PilotGo/figures/\345\210\240\351\231\244\350\247\222\350\211\2623.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2201.png" "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2201.png" index e9344f19ded8c509b6ac1047d615d98f97dc4d12..6d46d6f496a3b43a501b6e9f806f5501069e4bb5 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2201.png" and "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2201.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2202.png" "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2202.png" index c04eb7c5c9f14f5de2bf5f223a8ffbba9cdd599f..901bed4daf0c9b7dbbe7bd6833dea6093e2b6e9a 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2202.png" and "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\346\235\203\351\231\2202.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2501.png" "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2501.png" index 23c2d754679c0a374d89c26596669e9bbbebf2f6..708de466507f8c07f29c0b23915b99b66e5ab308 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2501.png" and "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2501.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2502.png" "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2502.png" index 0efb1384611e7f5b4cb1370e626a238908567dbb..3c401d9c96c8c3cd4602540296143d1d3331d845 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2502.png" and "b/docs/zh/docs/PilotGo/figures/\345\217\230\346\233\264\351\203\250\351\227\2502.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2211.png" "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2211.png" index 387df3d4cd301fe677e663c6a919abf093efba87..82518a40fd982bff95ff7af1cd94b2f6970eb4c4 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2211.png" and "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2211.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2212.png" "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2212.png" index ca5e64cbf7d0aeabcececacea125585484e873ca..6dec37e1e56e9c0c56469e2dd81b350f287d25ff 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2212.png" and "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\344\270\213\345\217\2212.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2751.png" "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2751.png" index 4bc4ca6f620619fe10a81205a939535f83e772c2..a8f3dfd37854995e502ca77e04e794e380c8467f 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2751.png" and "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2751.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2752.png" "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2752.png" index 68467232ca5bd65a03eccc4fc3fb8a5e95529ddf..584a55ba3a87ff087d77de756e455c61276ed022 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2752.png" and "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\345\215\270\350\275\2752.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\346\223\215\344\275\2341.png" "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\346\223\215\344\275\2341.png" index 5cee721e3c0ce14f666a85cd3acb27b57684f077..0b25e8e5cee2f6d4597d2d244565ff827947b4d2 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\346\223\215\344\275\2341.png" and "b/docs/zh/docs/PilotGo/figures/\346\211\271\351\207\217\346\223\215\344\275\2341.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\346\237\245\347\234\213.png" "b/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\346\237\245\347\234\213.png" index d98ef2d084ccf737b7a69c168dac1f8e7ef6e49d..aef655d63baec93d740d8a1bb010715ee5f46b43 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\346\237\245\347\234\213.png" and "b/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\346\237\245\347\234\213.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\350\257\246\346\203\205.png" "b/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\350\257\246\346\203\205.png" new file mode 100644 index 0000000000000000000000000000000000000000..97864d9a760b2bc0acd6e82b2d072cf3647384a2 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/\346\227\245\345\277\227\350\257\246\346\203\205.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250.png" index a65f27145edee0d8e10259a808a49c997bdbbb81..2c866f8739a5757505aeb5d5ad1626a6c0e42179 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250.png" and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250.png" differ diff --git 
"a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\346\234\215\345\212\241\346\237\245\350\257\242.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\346\234\215\345\212\241\346\237\245\350\257\242.png" index 95cf112c8e05a31b1f92861f91d76417d23d807c..c9f00f1467981d26f162c45fc065cba7043e88c7 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\346\234\215\345\212\241\346\237\245\350\257\242.png" and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\346\234\215\345\212\241\346\237\245\350\257\242.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\224\250\346\210\267\344\277\241\346\201\257.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\224\250\346\210\267\344\277\241\346\201\257.png" index 7b371c41d42349d6e7aaf900444034e6ee72ec0f..e96ba36aa2bcbadd7b7f9bd23e451d6faa456bb9 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\224\250\346\210\267\344\277\241\346\201\257.png" and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\224\250\346\210\267\344\277\241\346\201\257.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\275\221\347\273\234\351\205\215\347\275\256.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\275\221\347\273\234\351\205\215\347\275\256.png" index 742c506ea550d649354a06010bb96b853bce02bf..6a3017159efe20948a8c1cdbd4d98b9ef482e775 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\275\221\347\273\234\351\205\215\347\275\256.png" and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\347\275\221\347\273\234\351\205\215\347\275\256.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\257\246\346\203\205.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\257\246\346\203\205.png" new file mode 100644 index 0000000000000000000000000000000000000000..8013f25410e592c4f10486b7d6214f6eba3717fd 
Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\257\246\346\203\205.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\215\270\350\275\275.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\215\270\350\275\275.png" index cc74a97dcf92ca3eb57b8cf7b2319e73cf10c099..d9299c2fa59898b7063049aa7ecb428eed4601b7 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\215\270\350\275\275.png" and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\215\270\350\275\275.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\256\211\350\243\205.png" "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\256\211\350\243\205.png" new file mode 100644 index 0000000000000000000000000000000000000000..f9726c1b6b879f63e618a2245354c595e4b17f55 Binary files /dev/null and "b/docs/zh/docs/PilotGo/figures/\346\234\272\345\231\250\350\275\257\344\273\266\345\214\205\345\256\211\350\243\205.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\246\202\350\247\210.png" "b/docs/zh/docs/PilotGo/figures/\346\246\202\350\247\210.png" index ca652711583c0c537df164621384e0cb251dac03..d32d9673426c3a3beb3b7b02ae2dd7ba0cee8671 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\246\202\350\247\210.png" and "b/docs/zh/docs/PilotGo/figures/\346\246\202\350\247\210.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2671.png" "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2671.png" index e5f5631e6ca19f8498fa2b030613b0a75d7168f1..04ed9a75aacb082a4c416b4d3309606b800c8d27 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2671.png" and "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2671.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2672.png" "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2672.png" index 017c47fdc9974c3a9ee5758c05512eb0b01a929c..e69cb0806e796f31a54a37ec0d9136e2ce74efab 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2672.png" and "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\347\224\250\346\210\2672.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2711.png" "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2711.png" index c7cb768fdd35d3c2a30e3f175157418e650f5c9a..f7554243b3009c7945ad1746f86a77b046573f2f 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2711.png" and "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2711.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2712.png" "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2712.png" index 45f82cb1d563356585b932aa1de6ae79b174b2eb..8ee25dd17e3252c21267231e4b6ad17860ceac6e 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2712.png" and "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\212\202\347\202\2712.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2621.png" "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2621.png" index a51db5c136e8d6baf61187d8882d4b02758cb056..5a98fc65e9f7e3dbdb312ed5951eb3767a5001f8 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2621.png" and "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2621.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2622.png" "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2622.png" index a352b27353c2513f55cad32d968b1095de96eb23..ba014915a2fec33734e7daee451c8eaf7738d6a1 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2622.png" and "b/docs/zh/docs/PilotGo/figures/\346\267\273\345\212\240\350\247\222\350\211\2622.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2451.png" "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2451.png" index 7b7c230d9942bd9fceaeb2fbb23b3e16255b2505..b538a40e8b89c49ff4ee4a26df7f390c8564dbd6 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2451.png" and "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2451.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2452.png" "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2452.png" index dad2779f6ddb6577a636fe8fb6050aeec69ee2ad..c725bbe4a5e738bafebeefbe0c90b2f555275cc8 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2452.png" and "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2452.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2453.png" "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2453.png" index 88d855f0e0f48d3da3523d59df9e2358fb49a92c..c731959721be7ecacd4f7f5e2d82486deecb997f 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2453.png" and "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\205\2453.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2721.png" "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2721.png" index 6198f25e96b6f782e042a1e1c36b0bef897ca064..fc395b3b31fe80d8d63caf644b8af92bccadb691 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2721.png" and "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2721.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2722.png" "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2722.png" index c55645090a3475c117b2e5805b42bad57a90dfd0..fd44f2600b62b30ac8abb125668ecc3e4f59cf5f 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2722.png" and "b/docs/zh/docs/PilotGo/figures/\347\224\250\346\210\267\345\257\274\345\207\2722.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2411.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2411.png" index 068b66d65a0f63fabd9f4cd78b46aafbbd1eb8b7..ddc0b69e0101881902ed3f12cf5f56cdc20dc32b 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2411.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2411.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2412.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2412.png" index b4485514201339dc8d3e59c466e57afdd7817c06..f5625966a81c44421eb1c2f3906aa7ce2746faf5 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2412.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2412.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2413.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2413.png" index a469a8798beecb882e5823132f442ee1eaf5cb21..27519a3d4f0becbc9b77e27897741547ae12e269 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2413.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\346\211\271\346\254\2413.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2671.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2671.png" index 36cdb73c8cffc40e7e9d6831691183cdfb481649..4df5e7583f4a9676c864dda18bb9d411c29862e8 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2671.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2671.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2672.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2672.png" index 7391fda93795f334f7674c98c811bf93919e99a0..0a36495c86e915bc3c8c9d23f422755aeea80f46 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2672.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\347\224\250\346\210\2672.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2621.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2621.png" index d752d16e201a493d71feee178f6a9ca4541df5ed..4ace76039f0cfe5a21d1ec53940819ed3ca31c5a 100644 Binary files 
"a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2621.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2621.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2622.png" "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2622.png" index 25c650b0393a73ba5b40f3409a760e420881dcfe..65b8bb7a4def07435da61d80f9485f9637a109e3 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2622.png" and "b/docs/zh/docs/PilotGo/figures/\347\274\226\350\276\221\350\247\222\350\211\2622.png" differ diff --git "a/docs/zh/docs/PilotGo/figures/\351\207\215\347\275\256\345\257\206\347\240\2011.png" "b/docs/zh/docs/PilotGo/figures/\351\207\215\347\275\256\345\257\206\347\240\2011.png" index 0f33a7a9476814caf942edb428b55a8aa31e3d91..bad1fce9f9742ee586ac3f8d9f61ae37b03b8779 100644 Binary files "a/docs/zh/docs/PilotGo/figures/\351\207\215\347\275\256\345\257\206\347\240\2011.png" and "b/docs/zh/docs/PilotGo/figures/\351\207\215\347\275\256\345\257\206\347\240\2011.png" differ diff --git "a/docs/zh/docs/PilotGo/\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/PilotGo/\344\275\277\347\224\250\346\211\213\345\206\214.md" index eea8287df907ea6d2b09a4ab9842b95a8cb10209..dc1fa1ecf470350a10011aad5ac0eea76ff25e53 100644 --- "a/docs/zh/docs/PilotGo/\344\275\277\347\224\250\346\211\213\345\206\214.md" +++ "b/docs/zh/docs/PilotGo/\344\275\277\347\224\250\346\211\213\345\206\214.md" @@ -45,7 +45,8 @@ PilotGo可以单机部署也可以采用集群式部署。安装之前先关闭 ### 2.1 首次登录 #### 2.1.1 用户登录页面 -用户登录页面如图所示,输入正确的用户名和密码登录系统。默认用户名为admin@123.com,默认密码为admin,首次登录之后建议先修改密码。![本地路径](./figures/登录.png) +用户登录页面如图所示,输入正确的用户名和密码登录系统。默认用户名为admin,默认密码为admin,首次登录之后建议先修改密码。![本地路径](./figures/登录.png) +登录成功显示概览页面。![本地路径](./figures/概览.png) ### 2.2 用户模块 @@ -95,6 +96,11 @@ PilotGo可以单机部署也可以采用集群式部署。安装之前先关闭 2. 点击页面的导出按钮;![本地路径](./figures/用户导出1.png) 3. 
浏览器显示下载进度,成功下载后打开xlsx文件查看信息。![本地路径](./figures/用户导出2.png) +#### 2.2.5 用户退出 +1. 具有该权限的用户成功登录,点击页面右上角的按钮; +2. 点击页面的确定按钮; +3. 退出到登录页面。![本地路径](./figures/登录.png) + ### 2.3 角色模块 #### 2.3.1 添加角色 @@ -108,14 +114,14 @@ PilotGo可以单机部署也可以采用集群式部署。安装之前先关闭 #### 2.3.2.1 修改角色信息 1. 具有该权限的用户成功登录,点击左侧导航栏中的角色管理; 2. 点击对应角色的编辑按钮; -3. 输入新的角色名和描述信息,并点击确定按钮;![本地路径](./figures/添加角色1.png) +3. 输入新的角色名和描述信息,并点击确定按钮;![本地路径](./figures/编辑角色1.png) 4. 页面弹框提示“角色信息修改成功”,并页面显示修改后的角色信息。![本地路径](./figures/编辑角色2.png) #### 2.3.2.2 修改角色权限 1. 具有该权限的用户成功登录,点击左侧导航栏中的角色管理; 2. 点击对应角色的变更按钮; -3. 选择相应的权限,点击重置按钮可以清空所选权限,并点击确定按钮;![本地路径](./figures/编辑角色1.png) -4. 页面弹框提示“角色权限变更成功”。![本地路径](./figures/编辑角色2.png) +3. 选择相应的权限,点击重置按钮可以清空所选权限,并点击确定按钮;![本地路径](./figures/变更权限1.png) +4. 页面弹框提示“角色权限变更成功”。![本地路径](./figures/变更权限2.png) ### 2.3.3 删除角色 1. 具有该权限的用户成功登录,点击左侧导航栏中的角色管理; @@ -131,126 +137,88 @@ PilotGo可以单机部署也可以采用集群式部署。安装之前先关闭 #### 2.4.2 删除部门节点 1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 在部门节点对应位置点击删除符号并点击确定;![本地路径](./figures/修改节点1.png)![本地路径](./figures/删除节点2.png) +2. 在部门节点对应位置点击删除符号并点击确定;![本地路径](./figures/删除节点1.png)![本地路径](./figures/删除节点2.png) 3. 页面弹框提示“删除成功”,并不显示删除节点的信息。![本地路径](./figures/删除节点3.png) -### 2.5 配置库模块 - -#### 2.5.1 添加 repo 配置文件 -1. 具有该权限的用户成功登录,点击左侧导航栏中的库配置文件; -2. 点击页面的新增按钮;![本地路径](./figures/创建文件1.png) -3. 输入文件名、文件类型、文件路径、描述和内容等信息,文件名必须以.repo结尾,文件路径必须正确,文件内容要符合repo文件的格式,并点击确定按钮;![本地路径](./figures/创建文件2.png) -4. 页面弹框提示“文件保存成功”;并显示新增的repo配置文件信息。![本地路径](./figures/创建文件3.png) - -#### 2.5.2 修改 repo 配置文件 -1. 具有该权限的用户成功登录,点击左侧导航栏中的库配置文件; -2. 找到要修改的repo文件,点击对应的编辑按钮;![本地路径](./figures/编辑文件1.png) -3. 输入修改后的文件名、文件类型、文件路径、描述和内容等信息,并点击确定按钮;![本地路径](./figures/编辑文件2.png) -4. 页面弹框提示“配置文件修改成功”;并显示修改后的repo配置文件信息。![本地路径](./figures/编辑文件3.png) - -#### 2.5.3 删除 repo 配置文件 -1. 具有该权限的用户成功登录,点击左侧导航栏中的库配置文件; -2. 选择要删除的文件,点击页面的删除按钮,并点击确定;![本地路径](./figures/删除角色1.png)![本地路径](./figures/删除角色2.png) -3. 页面弹框提示“存储的文件已从数据库删除”,且页面不显示删除的repo配置文件信息。![本地路径](./figures/文件删除3.png) - -#### 2.5.4 下发 repo 配置文件 -1. 具有该权限的用户成功登录,点击左侧导航栏中的库配置文件; -2. 
找到要下发的文件,点击页面的下发按钮,选择要下发的批次,并点击确定;![本地路径](./figures/文件下发1.png)![本地路径](./figures/文件下发2.png) -3. 页面弹框提示“配置文件下发成功”。![本地路径](./figures/文件下发3.png) - -#### 2.5.5 回滚 repo 配置文件历史版本 -1. 具有该权限的用户成功登录,点击左侧导航栏中的库配置文件; -2. 找到要回滚的文件,点击页面的历史版本按钮;![本地路径](./figures/文件历史版本.png) -3. 选择要回滚的版本,点击回滚按钮并点击确定;![本地路径](./figures/文件回滚1.png)![本地路径](./figures/文件回滚2.png) -4. 页面弹框提示“已回退到历史版本”,历史版本页面增加一条“-latest”记录。![本地路径](./figures/文件回滚3.png)![本地路径](./figures/文件回滚4.png) +#### 2.4.3 添加部门节点 +1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; +2. 在部门节点对应位置点击加号并点击确定;![本地路径](./figures/添加节点1.png) +3. 页面弹框提示“新建成功”,并显示添加节点的信息。![本地路径](./figures/添加节点2.png) -### 2.6 批次模块 +### 2.5 批次模块 -#### 2.6.1 创建批次 +#### 2.5.1 创建批次 1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和创建批次; 2. 点击机器所在的部门名字,在备选项中选择0个或多个机器ip(点击ip前面的方框),若选择一个或多个部门的所有机器可以点击部门列表的方框,并点击备选项中的部门名称,选择完成后点击向右的箭头;![本地路径](./figures/创建批次1.png) 3. 输入批次名称和描述,并点击创建按钮;![本地路径](./figures/创建批次2.png) 4. 页面弹框提示“批次入库成功”,并批次页面显示新创建的批次信息。![本地路径](./figures/创建批次3.png)![本地路径](./figures/创建批次4.png) -#### 2.6.2 修改批次 +#### 2.5.2 修改批次 1. 具有该权限的用户成功登录,点击左侧导航栏中的批次; 2. 点击对应批次的编辑按钮;![本地路径](./figures/编辑批次1.png) -3. 输入新的批次名称和备注信息,并点击确定按钮;![本地路径](./figures/编辑文件2.png) +3. 输入新的批次名称和备注信息,并点击确定按钮;![本地路径](./figures/编辑批次2.png) 4. 页面弹框提示“批次修改成功”,并显示修改后的批次信息。![本地路径](./figures/编辑批次3.png) -#### 2.6.3 删除批次 +#### 2.5.3 删除批次 1. 具有该权限的用户成功登录,点击左侧导航栏中的批次; 2. 选择要删除的批次,点击删除按钮并点击确定;![本地路径](./figures/删除批次1.png)![本地路径](./figures/删除批次2.png) 3. 页面弹框提示“批次删除成功”,并不显示删除批次的信息。![本地路径](./figures/删除批次3.png) -#### 2.6.4 批量安装软件包 +#### 2.5.4 批量安装软件包 1. 具有该权限的用户成功登录,点击左侧导航栏中的批次,并点击批次名称;![本地路径](./figures/批量操作1.png) 2. 点击右上角的rpm下发按钮,在搜索框输入软件包的名称,并点击下发按钮;![本地路径](./figures/批量下发1.png) 3. 页面弹框提示“软件包安装成功”,agent端可以查到下发的rpm包。![本地路径](./figures/批量下发2.png) -#### 2.6.5 批量卸载软件包 +#### 2.5.5 批量卸载软件包 1. 具有该权限的用户成功登录,点击左侧导航栏中的批次,并点击批次名称;![本地路径](./figures/批量操作1.png) 2. 点击右上角的rpm卸载按钮,在搜索框输入软件包的名称,并点击卸载按钮;![本地路径](./figures/批量卸载1.png) 3. 页面弹框提示“软件包卸载成功”,agent端无此软件包。![本地路径](./figures/批量卸载2.png) -### 2.7 机器模块 +### 2.6 机器模块 -#### 2.7.1 删除机器 +#### 2.6.1 删除机器 1. 
具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; 2. 选择要删除的机器,点击删除按钮并点击确定;![本地路径](./figures/删除机器1.png)![本地路径](./figures/删除机器2.png) 3. 页面弹框提示“机器删除成功”,并不显示删除机器的信息。![本地路径](./figures/删除机器3.png) -#### 2.7.2 变更机器部门 +#### 2.6.2 变更机器部门 1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; 2. 选择要变更部门的机器,点击变更部门按钮; 3. 核对变更部门机器ip的信息,选择新的部门,并点击确定;![本地路径](./figures/变更部门1.png) 4. 页面弹框提示“机器部门修改成功”,并显示变更后的信息。![本地路径](./figures/变更部门2.png) -#### 2.7.3 修改机器内核参数 +#### 2.6.3 安装软件包 1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击内核参数信息栏目;![本地路径](./figures/机器内核修改1.png) -3. 输入要查找的内核,点击修改,输入参数值并点击确定;![本地路径](./figures/机器内核修改2.png) -4. 页面显示修改进度,成功后显示100%。![本地路径](./figures/机器内核修改3.png) +2. 点击要查看信息的机器ip,并点击软件包信息栏目; +3. 在搜索框输入软件包的名称,并点击安装按钮; +4. 页面显示repo名称、repo地址信息,并页面显示软件包名、执行动作、结果等信息。![本地路径](./figures/机器软件包安装.png) -#### 2.7.4 启动机器服务 +#### 2.6.4 卸载软件包 1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击服务信息栏目; -3. 在搜索框输入要启动的服务名称,并点击启动按钮; -4. 页面显示软件包名、执行动作、执行结果进度条信息。![本地路径](./figures/机器服务启动.png) +2. 点击要查看信息的机器ip,并点击软件包信息栏目; +3. 在搜索框输入软件包的名称,并点击卸载按钮; +4. 页面显示repo名称、repo地址信息,并页面显示软件包名、执行动作、结果等信息。![本地路径](./figures/机器软件包卸载.png) -#### 2.7.5 重启机器服务 +#### 2.6.5 查看机器详情 1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击服务信息栏目; -3. 在搜索框输入要重启的服务名称,并点击重启按钮; -4. 页面显示软件包名、执行动作、执行结果进度条信息。![本地路径](./figures/机器服务重启.png) +2. 点击要查看信息的机器ip; +3. 页面显示机器的相关信息。![本地路径](./figures/机器详情.png)![本地路径](./figures/机器服务查询.png)![本地路径](./figures/机器网络配置.png)![本地路径](./figures/机器用户信息.png) -#### 2.7.6 停止机器服务 -1. 具有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击服务信息栏目; -3. 在搜索框输入要启动的服务名称,并点击停止按钮; -4. 页面显示软件包名、执行动作、执行结果进度条信息。![本地路径](./figures/机器服务停止.png) -#### 2.7.7 安装软件包 -1. 有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击软件包信息栏目; -3. 在搜索框输入软件包的名称,并点击安装按钮; -4. 页面显示repo名称、repo地址信息,并页面显示软件包名、执行动作、结果等信息。![本地路径](./figures/机器软件包安装2.png) +### 2.7 日志模块 -#### 2.7.8 卸载软件包 -1. 有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击软件包信息栏目; -3. 在搜索框输入软件包的名称,并点击卸载按钮; -4. 页面显示repo名称、repo地址信息,并页面显示软件包名、执行动作、结果等信息。![本地路径](./figures/机器软件包卸载.png) +#### 2.7.1 查看所有日志 +1. 
具有该权限的用户成功登录,点击左侧导航栏中的审计日志; +2. 页面展示日志信息。![本地路径](./figures/日志查看.png) -#### 2.7.9 连接机器终端 -1. 有该权限的用户成功登录,点击左侧导航栏中的系统和机器列表; -2. 点击要查看信息的机器ip,并点击终端信息栏目; -3. 输入ip地址和机器密码,点击连接按钮;![本地路径](./figures/机器终端1.png) -4. 页面显示终端窗口。![本地路径](./figures/机器终端.png) +#### 2.7.2 查看批处理日志详情 +1. 具有该权限的用户成功登录,点击左侧导航栏中的审计日志; +2. 点击折叠按钮; +3. 页面显示子日志的信息。![本地路径](./figures/日志详情.png) ## 3 PilotGo平台插件使用说明 -### 3.1 Grafana插件使用说明 +### 3.1 PilotGo-plugin-grafana插件使用说明 1. 在任意一台服务器上执行dnf install PilotGo-plugin-grafana grafana; 2. 将/opt/PilotGo/plugin/grafana/config.yaml文件中ip地址修改为本机真实ip,修改/etc/grafana/grafana.ini文件一下信息: @@ -266,16 +234,113 @@ PilotGo可以单机部署也可以采用集群式部署。安装之前先关闭 `systemctl start PilotGo-plugin-grafana` -4. 成功登录pilotgo平台,点击左侧导航栏中的插件管理,点击添加插件按钮,填写插件名称和服务地址,并点击确定;![本地路径](./figures/G插件1.png) +4. 成功登录pilotgo平台,点击左侧导航栏中的插件管理,点击添加插件按钮,填写插件名称和服务地址,机器地址例子:http://真实ip:9999/plugin/grafana,并点击确定;![本地路径](./figures/G插件1.png) 5. 页面增加一条插件管理数据,导航栏增加一个插件按钮。![本地路径](./figures/G插件2.png)![本地路径](./figures/G插件3.png) -### 3.2 Prometheus插件使用说明 +### 3.2 PilotGo-plugin-prometheus插件使用说明 1. 在任意一台服务器上执行dnf install PilotGo-plugin-prometheus; 2. 将/opt/PilotGo/plugin/prometheus/server/config.yml文件中ip地址修改为本机真实ip和mysql服务地址; 3. 重启服务,执行以下命令: -`systemctl start PilotGo-plugin-prometheusX` +`systemctl start PilotGo-plugin-prometheus` -4. 成功登录pilotgo平台,点击左侧导航栏中的插件管理,点击添加插件按钮,填写插件名称和服务地址,并点击确定;![本地路径](./figures/P插件1.png) +4. 成功登录pilotgo平台,点击左侧导航栏中的插件管理,点击添加插件按钮,填写插件名称和服务地址,机器地址例子:http://真实ip:8090/plugin/prometheus,并点击确定;![本地路径](./figures/P插件1.png) 5. 页面增加一条插件管理数据,导航栏增加一个插件按钮。![本地路径](./figures/P插件2.png)![本地路径](./figures/P插件3.png) -6. 在页面选择机器ip和监控时间,展示机器数据面板。![本地路径](./figures/P插件4.png) \ No newline at end of file +6. 在页面选择机器ip和监控时间,展示机器数据面板。![本地路径](./figures/P插件4.png) + +### 3.3 PilotGo-plugin-a-tune插件使用说明 +1. 在需要调优的机器上安装PilotGo-agent、atune,并执行dnf install PilotGo-plugin-a-tune; +2. 将/opt/PilotGo/plugin/a-tune/config.yml文件中ip地址修改为真实ip; +3. 重启服务,执行以下命令: + +`systemctl start PilotGo-plugin-a-tune` + +4. 
成功登录pilotgo平台,点击左侧导航栏中的插件管理,点击添加插件按钮,填写插件名称和服务地址,机器地址例子:http://真实ip:8099,并点击确定;![本地路径](./figures/A插件1.png)
+5. 页面增加一条插件管理数据,导航栏增加一个插件按钮。![本地路径](./figures/A插件2.png)
+
+#### 3.3.1 调优模板
+
+##### 3.3.1.1 添加调优模板
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和调优模板;
+2. 点击页面右上角的新增按钮,输入字段信息,并点击保存;![本地路径](./figures/A插件3.png)
+3. 页面显示新添加的调优模板信息,并点击操作栏目的详情按钮可以查看模板详情。![本地路径](./figures/A插件4.png)
+
+##### 3.3.1.2 修改调优模板
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和调优模板;
+2. 在页面上找到要修改的模板,并点击编辑按钮;![本地路径](./figures/A插件6.png)
+3. 修改信息后点击页面的保存按钮,刷新页面,页面显示修改后的调优模板信息。![本地路径](./figures/A插件7.png)![本地路径](./figures/A插件8.png)
+
+##### 3.3.1.3 删除调优模板
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和调优模板;
+2. 在页面上找到要删除的模板,选中对应模板,点击右上角的删除按钮,并点击确定;![本地路径](./figures/A插件9.png)
+3. 删除模板后刷新页面,页面不存在已删除的模板信息。![本地路径](./figures/A插件10.png)![本地路径](./figures/A插件5.png)
+
+#### 3.3.2 调优任务
+
+##### 3.3.2.1 添加调优任务
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和任务列表;
+2. 点击页面右上角的新增按钮,填写新增任务的信息,并点击保存按钮;![本地路径](./figures/A插件11.png)
+3. 刷新页面,页面显示新增加的任务信息,状态栏显示等待。![本地路径](./figures/A插件12.png)
+
+##### 3.3.2.2 重启单个任务
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和任务列表;
+2. 在页面上找到要重启的任务,点击对应的重启按钮;![本地路径](./figures/A插件13.png)
+3. 页面状态栏显示运行中,执行完成后状态变为完成。![本地路径](./figures/A插件14.png)![本地路径](./figures/A插件15.png)
+
+##### 3.3.2.3 查看单个任务详情
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和任务列表;
+2. 在页面上找到要查看的任务名称,点击对应的详情按钮;
+3. 页面显示单个任务的详细信息。![本地路径](./figures/A插件16.png)
+
+##### 3.3.2.4 删除调优任务
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-atune插件和任务列表;
+2. 在页面上找到要删除的任务列表,选中对应的任务,点击右上角的删除按钮,并点击确定;![本地路径](./figures/A插件17.png)![本地路径](./figures/A插件18.png)
+3. 删除任务后刷新页面,页面不存在已删除的任务信息。![本地路径](./figures/A插件19.png)
+
+### 3.4 PilotGo-plugin-topology插件使用说明
+1. 在任意一台服务器上执行dnf install PilotGo-plugin-topology-server;
+2. 将/opt/PilotGo/plugin/topology/server/config.yml文件中ip地址修改为真实ip,并配置java、neo4j、mysql、redis数据库等信息;
+3. 重启服务,执行以下命令:
+
+`systemctl start PilotGo-plugin-topology-server`
+
+4. 在需要topo展示的服务器上执行dnf install PilotGo-plugin-topology-agent;
+5.
将/opt/PilotGo/plugin/topology/agent/config.yml文件中ip地址修改为真实ip;
+6. 重启服务,执行以下命令:
+
+`systemctl start PilotGo-plugin-topology-agent`
+
+7. 成功登录pilotgo平台,点击左侧导航栏中的插件管理,点击添加插件按钮,填写插件名称和服务地址,机器地址例子:http://真实ip:9991,并点击确定;![本地路径](./figures/T插件1.png)
+8. 页面增加一条插件管理数据,导航栏增加一个插件按钮。![本地路径](./figures/T插件2.png)
+
+#### 3.4.1 添加自定义拓扑配置
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-topology插件和创建配置;
+2. 在页面上填写相关信息,并点击创建按钮;![本地路径](./figures/T插件3.png)
+3. 页面显示新创建的拓扑配置信息和拓扑图。![本地路径](./figures/T插件4.png)
+
+#### 3.4.2 修改自定义拓扑配置
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-topology插件和配置列表;
+2. 在页面上找到要修改的配置,点击编辑;![本地路径](./figures/T插件5.png)
+3. 修改配置信息后点击更新按钮,页面右半边将生成新的拓扑图。![本地路径](./figures/T插件6.png)
+
+#### 3.4.3 查看单机的全局拓扑图
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-topology插件和配置列表;
+2. 点击页面上的单机/多机按钮,选择要查看拓扑信息的机器,点击确定;![本地路径](./figures/T插件7.png)![本地路径](./figures/T插件8.png)
+3. 页面将展示单机的拓扑图。![本地路径](./figures/T插件9.png)
+
+#### 3.4.4 查看某个自定义配置的拓扑图和配置信息
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-topology插件和配置列表;
+2. 在页面上找到要查看的配置名称,点击对应的拓扑图按钮;![本地路径](./figures/T插件10.png)
+3. 页面将展示此配置的拓扑图。![本地路径](./figures/T插件11.png)
+4. 在页面上找到要查看的配置名称,点击对应的json配置按钮;![本地路径](./figures/T插件12.png)
+5. 页面将展示此配置的详细信息。![本地路径](./figures/T插件13.png)
+
+#### 3.4.5 删除自定义拓扑配置
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-topology插件和配置列表;
+2. 选择要删除的配置,点击删除按钮,再点击确定;![本地路径](./figures/T插件14.png)
+3. 页面将不展示已删除的配置信息。![本地路径](./figures/T插件15.png)
+
+#### 3.4.6 查看多机的全局拓扑图
+1. 具有该权限的用户成功登录,点击左侧导航栏中的plugin-topology插件和配置列表;
+2. 点击页面上的单机/多机按钮,选择多机,点击确定;![本地路径](./figures/T插件16.png)
+3.
页面将展示多机的拓扑图。![本地路径](./figures/T插件17.png) \ No newline at end of file diff --git "a/docs/zh/docs/Pin/\346\217\222\344\273\266\346\241\206\346\236\266\347\211\271\346\200\247\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/Pin/\346\217\222\344\273\266\346\241\206\346\236\266\347\211\271\346\200\247\347\224\250\346\210\267\346\214\207\345\215\227.md" index 41c8d1f1df8fcccdf4953b6f2df3fea552e339b7..a560e02dfe30118ea33c81a82b73f32027a68022 100755 --- "a/docs/zh/docs/Pin/\346\217\222\344\273\266\346\241\206\346\236\266\347\211\271\346\200\247\347\224\250\346\210\267\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/Pin/\346\217\222\344\273\266\346\241\206\346\236\266\347\211\271\346\200\247\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -1,6 +1,6 @@ # 安装与部署 ## 软件要求 -* 操作系统:openEuler 23.03 +* 操作系统:openEuler 24.03 ## 硬件要求 * x86_64架构 * ARM架构 diff --git a/docs/zh/docs/Quickstart/quick-start.md b/docs/zh/docs/Quickstart/quick-start.md index 378877aa2c351e2364f6501d755b9eaad71a9961..6dd8016d51ac87062df20241bd77af1a55488a66 100644 --- a/docs/zh/docs/Quickstart/quick-start.md +++ b/docs/zh/docs/Quickstart/quick-start.md @@ -1,42 +1,12 @@ # 快速入门 -本文档以TaiShan 200服务器上安装openEuler 21.09 为例,旨在指导用户快速地安装和使用openEuler操作系统,更详细的安装要求和安装方法请参考《[安装指南](./../Installation/installation.html)》。 +本文档以TaiShan 200服务器上安装 openEuler 为例,旨在指导用户快速地安装和使用openEuler操作系统,更详细的安装要求和安装方法请参考《[安装指南](./../Installation/installation.html)》。 ## 安装要求 - 硬件兼容支持 - 支持的服务器类型如[表1](#table14948632047)所示。 - - **表 1** 支持的服务器类型 - - - - - - - - - - - - - - - - -
-(此处被删除的是 HTML 格式的“**表 1** 支持的服务器类型”,原内容为:机架服务器/TaiShan 200/2280均衡型;机架服务器/FusionServer Pro 机架服务器/FusionServer Pro 2288H V5;说明:服务器要求配置Avago 3508 RAID控制卡和启用LOM-X722网卡。)
+ 支持的服务器类型请参考[兼容性列表](https://www.openeuler.org/zh/compatibility/)。 - 最小硬件要求 @@ -83,16 +53,16 @@ 1. 登录[openEuler社区](https://openeuler.org)网站。 2. 单击“下载”。 3. 单击“社区发行版”,显示版本列表。 -4. 在版本列表的“openEuler 22.03 LTS SP2”版本处单击“前往下载”按钮,进入openEuler 22.03_LTS_SP2版本下载列表。 +4. 在版本列表的“openEuler 24.03 LTS SP1”版本处单击“前往下载”按钮,进入版本下载列表。 5. 根据实际待安装环境的架构和场景选择需要下载的 openEuler 的发布包和校验文件。 1. 若为AArch64架构。 1. 单击“AArch64”。 - 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-22.03-LTS-SP2-aarch64-dvd.iso”下载到本地。 - 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-22.03-LTS-SP2-netinst-aarch64-dvd.iso”下载到本地。 + 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-24.03-LTS-SP1-aarch64-dvd.iso”下载到本地。 + 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-24.03-LTS-SP1-netinst-aarch64-dvd.iso”下载到本地。 2. 若为x86_64架构。 1. 单击“x86_64”。 - 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-22.03-LTS-SP2-x86_64-dvd.iso”下载到本地。 - 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-22.03-LTS-SP2-netinst-x86_64-dvd.iso ”下载到本地。 + 2. 若选择本地安装,选择“Offline Standard ISO”或者“Offline Everything ISO”对应的“立即下载”将发布包 “openEuler-24.03-LTS-SP1-x86_64-dvd.iso”下载到本地。 + 3. 若选择网络安装,选择“Network Install ISO”将发布包 “openEuler-24.03-LTS-SP1-netinst-x86_64-dvd.iso ”下载到本地。 > ![](./public_sys-resources/icon-note.gif) **说明:** > @@ -115,7 +85,7 @@ 在校验发布包完整性之前,需要准备如下文件: -iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso +iso文件:openEuler-24.03-LTS-SP1-aarch64-dvd.iso 校验文件:ISO对应完整性校验值,复制保存对应的ISO值 @@ -126,7 +96,7 @@ iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso 1. 
计算文件的sha256校验值。执行命令如下: ```sh - # sha256sum openEuler-22.03-LTS-SP2-aarch64-dvd.iso + # sha256sum openEuler-24.03-LTS-SP1-aarch64-dvd.iso ``` 命令执行完成后,输出校验值。 @@ -170,14 +140,14 @@ iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso > ![](./public_sys-resources/icon-note.gif) **说明:** > - > - 如果60秒内未按任何键,系统将从默认选项“Test this media & install openEuler 21.09”自动进入安装界面。 + > - 如果60秒内未按任何键,系统将从默认选项“Test this media & install openEuler ”自动进入安装界面。 > - 安装物理机时,如果使用键盘上下键无法选择启动选项,按“Enter”键无响应,可以单击BMC界面上的鼠标控制图标“![](./figures/zh-cn_image_0229420473.png)”,设置“键鼠复位”。 > **图 5** 安装引导界面 ![](./figures/Installation_wizard.png) -9. 在安装引导界面,按“Enter”,进入默认选项“Test this media & install openEuler 21.09”的图形化安装界面。 +9. 在安装引导界面,按“Enter”,进入默认选项“Test this media & install openEuler ”的图形化安装界面。 ## 安装 @@ -290,7 +260,7 @@ iso文件:openEuler-22.03-LTS-SP2-aarch64-dvd.iso ## 查看系统信息 -系统安装完成并重启后直接进入系统命令行登录界面,输入安装过程中设置的用户和密码,进入openEuler操作系统,查看如下系统信息。若需要进行系统管理和配置操作,请参考《[管理员指南](https://openeuler.org/zh/docs/21.09/docs/Administration/administration.html)》。 +系统安装完成并重启后直接进入系统命令行登录界面,输入安装过程中设置的用户和密码,进入openEuler操作系统,查看如下系统信息。若需要进行系统管理和配置操作,请参考《[管理员指南](https://openeuler.org/zh/docs/24.03_LTS_SP1/docs/Administration/administration.html)》。 - 查看系统信息,命令如下: diff --git "a/docs/zh/docs/ROS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/ROS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index 8991572536507318e54e511f9fefcd68385a87aa..6d97af121e79e193f8183a84ad8171c57e28d17a 100644 --- "a/docs/zh/docs/ROS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/ROS/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -1,16 +1,12 @@ # 安装与部署 -## 软件要求 - -* 操作系统:openEuler 22.03 LTS SP2 - ## 硬件要求 * x86_64架构、AArch64架构 ## 环境准备 -* 安装openEuler 22.03 LTS SP2系统,安装方法参考 《[安装指南](../Installation/installation.md)》。 +* 安装 openEuler 系统,安装方法参考 《[安装指南](../Installation/installation.md)》。 ## 1. 
ROS2

@@ -22,7 +18,7 @@

```shell
[root@openEuler ~]# yum install openeuler-ros
-[root@openEuler ~]# yum install ros-humble-ros-base ros-humble-xxx 例如安装小乌龟ros-humble-turtlesim
+[root@openEuler ~]# yum install ros-humble-*
```

2. 执行如下命令,查看安装是否成功。如果回显有对应软件包,表示安装成功。

```shell
[root@openEuler ~]# rpm -q ros-humble
```

+### 2. ros-noetic
+
+#### 1. 安装 ros-noetic
+
+1. 安装 ros-noetic 软件包
+
+```shell
+[root@openEuler ~]# yum install openeuler-ros
+[root@openEuler ~]# yum install ros-noetic-*
+```
+
+2. 执行如下命令,查看安装是否成功。如果回显有对应软件包,表示安装成功。
+
+```shell
+[root@openEuler ~]# rpm -q ros-noetic
+```
+
 #### 2. 测试 ros-humble

 ##### 运行小乌龟

diff --git "a/docs/zh/docs/ROS/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" "b/docs/zh/docs/ROS/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md"
index f43a30740d3910758e0b705a6772c2707a943801..5a41881407ba05eb31cc26068d821ebb4a23c0f3 100644
--- "a/docs/zh/docs/ROS/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md"
+++ "b/docs/zh/docs/ROS/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md"
@@ -4,7 +4,9 @@

 ![](./figures/problem.png)

-原因:出现该警告的原因在于环境变量中同时存在ROS1、ROS2,解决办法:修改环境变量,避免两个版本冲突。
+原因:出现该警告的原因在于环境变量中同时存在ROS1、ROS2。
+
+解决方法:修改环境变量,避免两个版本冲突。

```shell
[root@openEuler ~]# vim /opt/ros/humble/share/ros_environment/environment/1.ros_distro.sh

diff --git a/docs/zh/docs/Releasenotes/release_notes.md b/docs/zh/docs/Releasenotes/release_notes.md
index 1578f3aa8364f52da274a503bc815db8da894b99..0647b3ba561ff00e261b8a0bc46438905a919c19 100644
--- a/docs/zh/docs/Releasenotes/release_notes.md
+++ b/docs/zh/docs/Releasenotes/release_notes.md
@@ -1,3 +1,3 @@

 # 发行说明

-本文档是 openEuler 22.09 版本的发行说明。
+本文档是 openEuler 24.03-LTS-SP1 版本的发行说明。

diff --git
"a/docs/zh/docs/Releasenotes/\345\205\263\351\224\256\347\211\271\346\200\247.md" "b/docs/zh/docs/Releasenotes/\345\205\263\351\224\256\347\211\271\346\200\247.md" index 8cd6be1657ab976f5f9dbb28d1d8d95ff13c3fee..c11569c737f8cd8afd7e78f20d7875012aeb0a3e 100644 --- "a/docs/zh/docs/Releasenotes/\345\205\263\351\224\256\347\211\271\346\200\247.md" +++ "b/docs/zh/docs/Releasenotes/\345\205\263\351\224\256\347\211\271\346\200\247.md" @@ -1,299 +1,369 @@ -# 关键特性 - -## 异构通用内存管理框架(GMEM)特性 - -异构通用内存管理框架 GMEM (Generalized Memory Management),提供了异构内存互联的中心化管理机制,是面向 OS For AI 的终极内存管理解决方案。GMEM 革新了 Linux 内核中的内存管理架构,其中逻辑映射系统屏蔽了 CPU 和加速器地址访问差异,remote_pager内存消息交互框架提供了设备接入抽象层。在统一的地址空间下,GMEM可以在数据需要被访问或换页时,自动地迁移数据到OS或加速器端。GMEM API与Linux原生内存管理API保持统一,易用性强,性能与可移植性好。 - -- **逻辑映射系统**: 在内核中提供 GMEM 高层 API,允许加速器驱动直接获取内存管理功能,建立逻辑页表。逻辑页表将内存管理的高层逻辑与 CPU 的硬件相关层解耦,从而抽象出能让各类加速器复用的高层内存管理逻辑。 - -- **Remote Pager 内存消息交互框架**:实现可供主机和加速器设备交互的消息通道、进程管理、内存交换和内存预取等模块。通过 remote_pager 抽象层可以让第三方加速器很容易的接入 GMEM 系统,简化设备适配难度。 - -- **用户API**: 使用 OS 的 mmap 分配统一虚拟内存,GMEM 在 mmap 系统调用中新增分配统一虚拟内存标志(MMAP_PEER_SHARED)。同时 libgmem 用户态库提供内存预取语义 hmadvise 接口,协助用户优化加速器内存访问效率。 - -## 开源大模型原生支持(LLaMA和ChatGLM) - -llama.cpp 和 chatglm-cpp 是基于 C/C++ 实现的模型推理框架,通过模型量化等手段,支持用户可以在CPU机器上完成开源大模型的部署和使用。llama.cpp 支持多个英文开源大模型的部署,如 LLaMa/LLaMa2/Vicuna 等。chatglm-cpp 支持多个中文开源大模型的部署,如 ChatGLM-6B/ChatGLM2-6B/Baichuan-13B等。 - -- 基于 ggml 的 C/C++ 实现。 - -- 通过 int4/int8 量化、优化的 KV 缓存和并行计算等多种方式加速内存高效 CPU 推理。 - -## openEuler 6.4 内核中的特性 - -openEuler 23.09 基于 Linux Kernel 6.4 内核构建,在此基础上,同时吸收了社区高版本的有益特性及社区创新特性。 - -- **潮汐 affinity 调度特性**:感知业务负载动态调整业务 CPU 亲和性,当业务负载低时使用 prefered cpus 处理,增强资源的局部性;当业务负载高时,突破 preferred cpus 范围限制,通过增加CPU核的供给提高业务的 QoS。 - -- **CPU QoS优先级负载均衡特性**:在离线混部 CPU QoS 隔离增强, 支持多核 CPU QoS 负载均衡,进一步降低离线业务 QoS 干扰。 - -- **SMT 驱离优先级反转特性**:解决混部 SMT 驱离特性的优先级反转问题,减少离线任务对在线任务 QoS 的影响。 - -- **混部多优先级**:允许 cgroup 配置-2~2的cpu.qos_level,即多个优先级,使用 qos_level_weight 设置不同优先级权重,按照 CPU 的使用比例进行资源的划分,并提供唤醒抢占能力。 - -- **可编程调度**:基于 eBPF 
的可编程调度框架,支持内核调度器动态扩展调度策略,以满足不同负载的性能需求。 - -- **Numa Aware spinlock**:基于 MCS 自旋锁在锁传递算法上针对多 NUMA 系统优化,通过优先在本 NUMA 节点内传递,能大量减少跨 NUMA 的 Cache 同步和乒乓,从而提升锁的整体吞吐量,提升业务性能。 - -- **支持 TCP 压缩**:在 TCP 层对指定端口的数据进行压缩后再传输,收包侧把数据解压后再传给用户态,从而提升分布式场景节点间数据传输的效率。 - -- **内核热补丁**:主要针对内核的函数实现的 bug 进行免重启修复,原理主要在于如何完成动态函数替换,采用直接修改指令的方法,而非主线基于 ftrace 实现,在运行时直接跳转至新函数,无需经过查找中转,效率较高。 - -- **Sharepool 共享内存**:一种在多个进程之间共享数据的技术。它允许多个进程访问同一块内存区域,从而实现数据共享。 - -- **memcg 异步回收**:一种优化机制,它可以在系统负载较低的时候,异步地回收 memcg 中的内存,以避免在系统负载高峰期间出现内存回收的延迟和性能问题。 - -- **支持 filescgroup**:Cgroup files 子系统提供对系统中一组进程打开的文件数量(即句柄数)进行分组管理,相比于已有的 rlimit 方法,能更好的实现文件句柄数的资源控制(资源申请及释放、资源使用动态调整、实现分组控制等)。 - -- **cgroupv1 使能 cgroup writeback**:cgroup writeback 用于控制和管理文件系统缓存的写回行为,提供了一种灵活的方式来管理文件系统缓存的写回行为,以满足不同应用场景下的需求。主要功能包括:缓存写回控制、IO 优先级控制、写回策略调整等。 - -- **支持核挂死检测特性**:解决 PMU 停止计数导致 hardlockup 无法检测系统卡死的问题,利用核间 CPU 挂死检测机制,让每个 CPU 检测相邻 CPU 是否挂死,保障系统在部分 CPU 关中断挂死场景下能够自愈。 - -## 嵌入式 - -openEuler 23.09 Embedded 支持嵌入式虚拟化弹性底座,提供 Jailhouse 虚拟化方案、openAMP 轻量化混合部署方案,用户可以根据自己的使用场景选择最优的部署方案。同时支持 ROS humble 版本,集成 ros-core、rosbase、SLAM 等核心软件包,满足 ROS2 运行时要求。 - -- **南向生态**:openEuler Embedded Linux 当前主要支持 ARM64、x86-64 两种芯片架构,支持 RK3568、Hi3093、树莓派 4B、x86-64 工控机等具体硬件,23.09 版本新增支持 RK3399、RK3588 芯片。初步支持 ARM32、RISC-V 两种架构具体通过 QEMU 仿真来体现。 - -- **嵌入式弹性虚拟化底座**:openEuler Embedded 的融合弹性底座是为了在多核片上系统(SoC,System On Chip)上实现多个操作系统/运行时共同运行的一系列技术的集合,包含了裸金属、嵌入式虚拟化、轻量级容器、LibOS、可信执行环境(TEE)、异构等多种实现形态。 - -- **混合关键性部署框架**:构建在融合弹性底座之上,通过一套统一的框架屏蔽下层融合弹性底座形态的不同,从而实现 Linux 和其他 OS 运行时便捷地混合部署。 - -- **北向生态**:350+ 嵌入式领域常用软件包的构建;支持 ROS2 humble 版本,集成 ros-core、ros-base、SLAM 等核心包,并提供 ROS SDK,简化嵌入式 ROS 开发;基于 Linux 5.10 内核提供软实时能力,软实时中断响应时延微秒级;集成 OpenHarmony 的分布式软总线和 hichain 点对点认证模块,实现欧拉嵌入式设备之间互联互通、欧拉嵌入式设备和 OpenHarmony 设备之间互联互通。 - -- **硬实时系统(UniProton)**:是一款实时操作系统,具备极致的低时延和灵活的混合关键性部署特性,可以适用于工业控制场景,既支持微控制器 MCU,也支持算力强的多核 CPU。 - -## SysCare 热补丁能力 - -SysCare 是一个系统级热修复软件,为操作系统提供安全补丁和系统错误热修复能力,主机无需重新启动即可修复该系统问题。SysCare 将内核态热补丁技术与用户态热补丁技术进行融合统一,用户仅需聚焦在自己核心业务中,系统修复问题交予 SysCare 
进行处理。后期计划根据修复组件的不同,提供系统热升级技术,进一步解放运维用户提升运维效率。
-
-**支持容器内构建补丁**:
-
-- 通过使用 ebpf 技术监控编译器进程,实现无需创建字符设备、纯用户态化获取热补丁变化信息,并允许用户在多个不同容器内进行并行热补丁编译。
-
-- 用户可以通过安装不同 rpm 包(syscare-build-kmod 或 syscare-build-ebpf)来选择使用 ko 或者 ebpf 实现,syscare-build 进程将会自适应相应底层实现。
+# 关键特性
+
+## AI专项
+智能时代,操作系统需要面向AI不断演进。一方面,在操作系统开发、部署、运维全流程以AI加持,让操作系统更智能;另一方面,openEuler已支持Arm、x86、RISC-V等全部主流通用计算架构,在智能时代,openEuler也率先支持NVIDIA、昇腾等主流AI处理器,成为使能多样性算力的首选。
+- **OS for AI**:openEuler兼容NVIDIA、Ascend等主流算力平台的软件栈,为用户提供高效的开发运行环境。通过将不同AI算力平台的软件栈进行容器化封装,即可简化用户部署过程,提供开箱即用的体验。同时,openEuler也提供丰富的AI框架,方便大家快速在openEuler上使用AI能力。
+  - openEuler已兼容CANN、CUDA等硬件SDK,以及TensorFlow、PyTorch、MindSpore等相应的AI框架软件,支持AI应用在openEuler上高效开发与运行。
+  - openEuler AI软件栈容器化封装优化环境部署过程,并面向不同场景提供以下三类容器镜像。
+    1. SDK镜像:以openEuler为基础镜像,安装相应硬件平台的SDK,如Ascend平台的CANN或NVIDIA的CUDA软件。
+    2. AI框架镜像:以SDK镜像为基础,安装AI框架软件,如PyTorch或TensorFlow。此外,通过此部分镜像也可快速搭建AI分布式场景,如Ray等AI分布式框架。
+    3. 模型应用镜像:在AI框架镜像的基础上,包含完整的工具链和模型应用。
+
+  相关使用方式请参考[openEuler AI 容器镜像用户指南](https://forum.openeuler.org/t/topic/4189/4)。
+- **AI for OS**:当前,openEuler和AI深度结合,一方面使用基础大模型,基于大量openEuler操作系统的代码和数据,训练出openEuler Copilot System,初步实现代码辅助生成、问题智能分析、系统辅助运维等功能,让openEuler更智能。
+  - 智能问答:openEuler Copilot System智能问答平台目前支持web和智能shell两个入口。
+    1. 工作流调度:原子化智能体操作流程,通过采用“流”的组织形式,openEuler Copilot System允许用户将智能体的多个操作过程组合成一个内部有序、相互关联的多步骤“工作流”;即时数据处理,智能体在工作流的每个步骤中生成的数据和结果能够立即得到处理,并无缝传递到下一个步骤;智能交互,在面对模糊或复杂的用户指令时,openEuler Copilot System能主动询问用户,以澄清或获取更多信息。
+    2. 任务推荐:智能响应,openEuler Copilot System能够分析用户输入的语义信息;智能指引,openEuler Copilot System综合分析当前工作流的执行状况、功能需求以及关联任务等多维度数据,为用户量身定制最适宜的下一步操作建议。
+    3. RAG:openEuler Copilot System中的RAG技术能更好地适应多种文档格式和内容场景,在不为系统增加较大负担的情况下,增强问答服务体验。
+    4.
语料治理:语料治理是openEuler Copilot System中的RAG技术的基础能力之一,其通过片段相对关系提取、片段衍生物构建和OCR等方式将语料以合适形态入库,以增强用户查询命中期望文档的概率。
+
+  相关使用方式请参考[openEuler Copilot System 智能问答用户指南](https://gitee.com/openeuler/docs/tree/stable2-22.03_LTS_SP3/docs/zh/docs/AI/openEuler_Copilot_System/%E4%BD%BF%E7%94%A8%E6%8C%87%E5%8D%97)。
+  - 智能调优:openEuler Copilot System 智能调优功能目前支持智能shell入口。在上述功能入口,用户可通过与openEuler Copilot System进行自然语言交互,完成性能数据采集、系统性能分析、系统性能优化等作业,实现启发式调优。
+  - 智能诊断:
+    1. 巡检:调用Inspection Agent,对指定IP进行异常事件检测,为用户提供包含异常容器ID以及异常指标(cpu、memory等)的异常事件列表。
+    2. 定界:调用Demarcation Agent,对巡检结果中指定异常事件进行定界分析,输出导致该异常事件的根因指标TOP3。
+    3. 定位:调用Detection Agent,对定界结果中指定根因指标进行Profiling定位分析,为用户提供该根因指标异常的热点堆栈、热点系统时间、热点性能指标等信息。
+  - 智能容器镜像:openEuler Copilot System目前支持通过自然语言调用环境资源,在本地协助用户基于实际物理资源拉取容器镜像,并且建立适合算力设备调试的开发环境。当前版本支持三类容器,并且镜像源已同步在dockerhub发布,用户可手动拉取运行:
+    1. SDK层:仅封装使能AI硬件资源的组件库,例如:cuda、cann等。
+    2. SDK + 训练/推理框架:在SDK层的基础上加装tensorflow、pytorch等框架,例如:tensorflow2.15.0-cuda12.2.0、pytorch2.1.0.a1-cann7.0.RC1等。
+    3. SDK + 训练/推理框架 + 大模型:在第2类容器上选配几个模型进行封装,例如llama2-7b、chatglm2-13b等语言模型。
+## openEuler Embedded
+openEuler Embedded围绕以制造、机器人为代表的OT领域持续深耕,通过行业项目垂直打通,不断完善和丰富嵌入式系统软件栈和生态。openEuler发布面向嵌入式领域的版本openEuler 24.03 LTS SP1,构建了一个相对完整的综合嵌入式系统软件平台,在南北向生态、关键技术特性、基础设施、落地场景等方面都有显著的进步。未来openEuler Embedded将协同openEuler社区生态伙伴、用户、开发者,逐步扩展支持龙芯等新的芯片架构和更多的南向硬件,完善工业中间件、嵌入式AI、嵌入式边缘、仿真系统等能力,打造综合嵌入式系统软件平台解决方案。
+- **南向生态**:openEuler Embedded Linux当前主要支持ARM64、x86-64、ARM32、RISC-V等多种芯片架构,未来计划支持龙芯等架构。从24.03版本开始,南向支持大幅改善,已经支持树莓派、海思、瑞芯微、瑞萨、德州仪器、飞腾、赛昉、全志等厂商的芯片。openEuler 24.03 LTS SP1新增鲲鹏920支持。
+- **嵌入式弹性虚拟化底座**:openEuler Embedded的弹性虚拟化底座是为了在多核片上系统(SoC, System On Chip)上实现多个操作系统共同运行的一系列技术的集合,包含了裸金属、嵌入式虚拟化、轻量级容器、LibOS、可信执行环境(TEE)、异构部署等多种实现形态。
+- **混合关键性部署框架**:openEuler Embedded打造了构建在融合弹性底座之上的混合关键性部署框架,并命名为MICA(MIxed CriticAlity),旨在通过一套统一的框架屏蔽下层弹性底座形态的不同,从而实现Linux和其他OS运行时便捷地混合部署。依托硬件上的多核能力使得通用的Linux和专用的实时操作系统有效互补,从而达到全系统兼具两者的特点,并能够灵活开发、灵活部署。
+- **北向生态**:600+嵌入式领域常用软件包的构建;提供软实时能力,软实时中断响应时延微秒级;集成 OpenHarmony 的分布式软总线和hichain点对点认证模块,实现欧拉嵌入式设备之间互联互通、欧拉嵌入式设备和
OpenHarmony 设备之间互联互通;支持iSula容器,可以实现在嵌入式上部署openEuler或其他操作系统容器,简化应用移植和部署。支持生成嵌入式容器镜像,最小大小可到5MB,可以部署在其他支持容器的操作系统之上。
+- **UniProton硬实时系统**:UniProton 是一款实时操作系统,具备极致的低时延和灵活的混合关键性部署特性,可以适用于工业控制场景,既支持微控制器 MCU,也支持算力强的多核 CPU。目前关键能力如下:
+  1. 支持Cortex-M、ARM64、X86_64、riscv64架构,支持M4、RK3568、RK3588、X86_64、Hi3093、树莓派4B、鲲鹏920、昇腾310、全志D1s。
+  2. 支持树莓派4B、Hi3093、RK3588、X86_64设备上通过裸金属模式和openEuler Embedded Linux混合部署。
+  3. 支持通过gdb在openEuler Embedded Linux侧远程调试。
+## DevStation 开发者工作站支持
+openEuler首个面向开发者的开发者工作站DevStation正式发布,预装VS Code。DevStation将打通部署、编码、编译、构建、发布全流程:开发者可以方便地使用oeDeploy完成AI软件栈、云原生软件栈部署,使用oeDevPlugin插件一键拉取代码仓,一键使用AI4C编译器编译,一键调用EulerMaker,轻松使用DevStation进行软件开发;同时DevStation将会集成openEuler新型包管理体系EPKG,可以进行多环境、多版本安装,方便开发者在不同版本之间进行切换。
+- **开发者友好的集成环境**:发行版预装了广泛的开发工具和 IDE,如 VS Code系列等。支持多种编程语言,满足从前端、后端到全栈开发的需求。
+- **软件包管理与自动部署**:提供简单便捷的软件包管理工具,支持通过一键安装和更新多种开发环境。同时,内置 Docker、iSula 等容器技术,方便开发者进行应用容器化与自动化部署,提供新型包管理体系EPKG,支持多版本部署,大大降低开发者在安装不同开发工具时的使用门槛。
+- **图形化编程环境**:集成了图形化编程工具,降低了新手的编程门槛,同时也为高级开发者提供了可视化编程的强大功能。
+- **AI开发支持**:针对 AI 领域的开发者,预装了 TensorFlow、PyTorch 等机器学习框架,同时优化了硬件加速器(如 GPU、NPU)的支持,提供完整的 AI 模型开发与训练环境。同时,openEuler DevStation集成了openEuler Copilot System,提供AI助手服务,帮助用户解决大部分操作系统使用问题。
+- **调试与测试工具**:内置 GDB、CUnit、gtest、perf 等调试、测试与调优工具,帮助开发者快速调试和自动化测试,提升开发效率。
+- **版本控制和协作**:集成 Git、SVN 等版本控制工具,并支持多种远程协作工具,如 Slack、Mattermost 和 GitLab,使得团队开发和远程协作更加顺畅。
+- **安全与合规检查**:提供安全扫描和代码合规性检查工具,帮助开发者在开发阶段就能发现并修复潜在的安全漏洞和代码问题。
+## epkg新型软件包
+epkg是一款新型软件包,支持普通用户在操作系统中安装及使用。新的软件包格式相比现有软件包,主要解决多版本兼容性问题,用户可以在一个操作系统上通过简单的命令行安装不同版本的软件包,同时支持环境(environment)的创建/切换/使能等操作,通过环境来使用不同版本的软件包。目前epkg主要支持非服务类的软件包的安装和使用。
+- **多版本兼容**:支持普通用户安装,支持安装不同版本的软件包,不同版本的同一软件包安装不冲突。使能用户在同一个节点上,快速安装同一软件包的不同版本,实现多版本软件包的共存。
+- **环境管理**:支持环境的创建/切换/使能等操作,用户通过环境的切换,在环境中使用不同的channel,实现在不同的环境中使用不同版本的软件包。用户可以基于环境,快速实现软件包版本的切换。
+- **普通用户安装**:epkg支持普通用户安装软件包,普通用户能够自行创建环境,管理个人用户下的环境镜像,无需特权权限,降低软件包安装引起的安全风险。
+## GCC 14多版本编译工具链支持
+为了使能多样算力新特性,满足不同用户对不同硬件特性支持的需求,在 openEuler 24.03 LTS SP1 版本推出 openEuler GCC Toolset 14编译工具链,该工具链提供一个高于系统主 GCC 版本的副版本
GCC 14编译工具链,为用户提供了更加灵活且高效的编译环境选择。通过使用 openEuler GCC Toolset 14副版本编译工具链,用户可以轻松地在不同版本的 GCC 之间进行切换,以便充分利用新硬件特性,同时享受到 GCC 最新优化所带来的性能提升。 +为了与系统默认主版本 GCC 解耦,防止副版本 GCC 安装与主版本GCC 安装的依赖库产生冲突,openEuler GCC Toolset 14工具链的软件包名均以前缀“gcc-toolset-14-”开头,后接原有GCC软件包名。 +此外,为便于版本切换与管理,本方案引入SCL版本切换工具。SCL工具的核心就是会在/opt/openEuler/gcc-toolset-14 路径下提供一个enable脚本,通过注册将 gcc-toolset-14 的环境变量注册到 SCL 工具中,从而可以使用 SCL 工具启动一个新的 bash shell,此 bash shell 中的环境变量即为 enable 脚本中设置的副版本环境变量,从而实现主副版本 GCC 工具链的便捷切换。 +## 内核创新 +openEuler 24.03 LTS SP1基于 Linux Kernel 6.6内核构建,在此基础上,同时吸收了社区高版本的有益特性及社区创新特性。 +- **内存管理folio特性**:Linux内存管理基于page(页)转换到由folio(拉丁语 foliō,对开本)进行管理,相比page,folio可以由一个或多个page组成,采用struct folio参数的函数声明它将对整个(1个或者多个)页面进行操作,而不仅仅是PAGE_SIZE字节,从而移除不必要复合页转换,降低误用tail page问题;从内存管理效率上采用folio减少LRU链表数量,提升内存回收效率,另一方,一次分配更多连续内存减少page fault次数,一定程度降低内存碎片化;而在IO方面,可以加速大IO的读写效率,提升吞吐。全量支持匿名页、文件页的large folio,提供系统级别的开关控制,业务可以按需使用。对于ARM64架构,基于硬件contiguous bit技术(16个连续PTE只占一个 TLB entry),可以进一步降低系统TLB miss,从而提升整体系统性能。24.03 LTS SP1版本新增支持anonymous shmem分配mTHP与支持mTHP的lazyfree,进一步增加内存子系统对于large folio的支持;新增page cache分配mTHP的sysfs控制接口,提供系统级别的开关控制,业务可以按需使用。 -## GCC for openEuler - -GCC for openEuler 基线版本从 GCC 10.3 升级到 GCC 12.3 版本,支持自动反馈优化、软硬件协同、内存优化、SVE向量化、矢量化数学库等特性。 - -- GCC 版本升级到 12.3,默认语言标准从 14 升级到 C17/C++17 标准,支持 Armv9-a 架构,X86 的 AVX512 FP16 等更多硬件架构特性。 - -- 支持结构体优化,指令选择优化等,充分使能 ARM 架构的硬件特性,运行效率更高,在 SPEC CPU 2017 等基准测试中性能大幅优于上游社区的 GCC 10.3 版本。 - -- 支持自动反馈优化特性,实现应用层 MySQL 数据库等场景性能大幅提升。 - -## A-Ops智能运维 - -IT基础设施和应用产生的数据量快速增长(每年增长2~3倍),应用大数据和机器学习技术日趋成熟,驱动高效智能运维系统产生,助力企业降本增效。openEuler 智能运维提供智能运维基本框架,支持 CVE 管理、异常检测(数据库场景)等基础能力,支持快速排障和运维成本降低。 - -- **智能补丁管理**:支持补丁服务、内核热修复、智能补丁巡检、冷热补丁混合管理。 - -- **异常检测**:提供 MySQL、openGauss 业务场景中出现的网络 I/O 时延、丢包、中断等故障以及磁盘 I/O 高负载故障检测能力。 - -- **配置溯源**:支持集群配置收集和基线能力,实现配置可管可控。对整体集群实现配置检查,实时与基线进行对比,快速识别未经授权的配置变更,实现故障快速定位。 - -## A-Ops gala 特性 - -GALA 项目将全面支持 K8S 场景故障诊断,提供包括应用 drill-down 分析、微服务& DB 性能可观测、云原生网络监控、云原生性能 Profiling、进程性能诊断等特性,支撑 OS 五类问题(网络、磁盘、进程、内存、调度)分钟级诊断。 - -- **DDE服务器版本优化K8S环境易部署**:gala-gopher 提供 
daemonset 方式部署,每个 Work Node 部署一个 gala-gopher 实例;gala-spider、gala-anteater 以容器方式部署至 K8S 管理 Node。 - -- **应用drill-down分析**:提供云原生场景中亚健康问题的故障诊断能力,分钟级完成应用与云平台之间问题定界能力。 - -- **全栈监控**:提供面向应用的精细化监控能力,覆盖语言运行时(JVM)、GLIBC、系统调用、内核(TCP、I/O、调度等)等跨软件栈观测能力,实时查看系统资源对应用的影响。 - -- **全链路监控**:提供网络流拓扑(TCP、RPC)、软件部署拓扑信息,基于这些信息构建系统 3D 拓扑,精准查看应用依赖的资源范围,快速识别故障半径。 - -- **GALA因果型AI**:提供可视化根因推导能力,分钟级定界至资源节点。 - -- **微服务&DB性能可观测**:提供非侵入式的微服务、DB 访问性能可观测能力,包括 HTTP 1.x 访问性能可观测,性能包括吞吐量、时延、错误率等,支持 API 精细化可观测能力,以及 HTTP Trace 能力,便于查看异常 HTTP 请求过程。 - -- **PGSQL访问性能可观测**:性能包括吞吐量、时延、错误率等,支持基于 SQL 访问精细化观测能力,以及慢 SQL Trace 能力,便于查看慢 SQL 的具体 SQL 语句。 - -- **云原生应用性能Profiling**:提供非侵入、零修改的跨栈 profiling 分析工具,并能够对接 pyroscope 业界通用UI前端。 - -- **云原生网络监控**:针对 K8S 场景,提供 TCP、Socket、DNS 监控能力,具备更精细化网络监控能力。 - -- **GALA因果型AI**:提供可视化根因推导能力,分钟级定界至资源节点。 - -- **进程性能诊断**:针对云原生场景的中间件(比如 MySQL、Redis 等)提供进程级性能问题诊断能力,同时监控进程性能 KPI、进程相关系统层 Metrics(比如I/O、内存、TCP等),完成进程性能 KPI 异常检测以及影响该KPI的系统层 Metrics。 - -## sysMaster 特性 - -sysMaster 是一套超轻量、高可靠的服务管理程序集合,是对 1 号进程的全新实现,旨在改进传统的 init 守护进程。它使用 Rust 编写,具有故障监测、秒级自愈和快速启动等能力,从而提升操作系统可靠性和业务可用度。本次发布的 0.5.0 版本,支持在容器、虚机两种场景下,以 sysMaster 的方式管理系统中的服务。 - -- 支持devMaster组件,用于管理设备热插拔。 - -- 支持sysMaster热升级、热重启功能。 - -- 支持在虚机中以1号进程运行。 - -## utsudo 项目 - -utsudo 是一个采用 Rust 重构 Sudo 的项目,旨在提供一个更加高效、安全、灵活的提权工具,涉及的模块主要有通用工具、整体框架和功能插件等。 - -- **访问控制**:根据需求限制用户可以执行的命令,并规定所需的验证方式。 - -- **审计日志**:记录和追踪每个用户使用 utsudo 执行的命令和任务。 - -- **临时提权**:允许普通用户通过输入自己的密码,临时提升为超级用户执行特定的命令或任务。 - -- **灵活配置**:设置参数如命令别名、环境变量、执行参数等,以满足复杂的系统管理需求。 - -## utshell 项目 - -utshell 是一个延续了 bash 使用习惯的全新 shell,它能够与用户进行命令行交互,响应用户的操作去执行命令并给予反馈。并且能执行自动化脚本帮助运维。 - -- **命令执行**:执行部署在用户机器上的命令,并将执行的返回值反馈给用户。 - -- **批处理**:通过脚本完成自动任务执行。 - -- **作业控制**:能够将用户命令作为后台作业,从而实现多个命令同时执行。并对并行执行的任务进行管理和控制。 - -- **历史记录**:记录用户所输入的命令。 - -- **别名功能**:能够让用户对命令起一个自己喜欢的别名,从而个性化自己的操作功能。 - -## migration-tools 项目 - -migration-tools 是一款操作系统迁移软件,面向已部署业务应用于其他操作系统且具有国产化替换需求的用户,帮助其快速、平滑、稳定且安全地迁移至 openEuler 系操作系统。迁移软件的系统架构分为以下模块。 - -- **Server 模块**,迁移的软件的核心,采用 pythonflaskweb 框架研发,负责接收任务请求,同时处理相关执行指令并分发至各 
Agent。 - -- **Agent模块**,安装在待迁移的操作系统中,负责接收Server发出的任务请求,执行迁移等功能。 - -- **配置模块**,为 Server 模块和 Agent 模块提供配置文件的读取功能。 - -- **日志模块**,提供迁移的全部运行过程记录日志。 - -- **迁移评估模块**,提供迁移前的基础环境检测、软件包对比分析、ABI 兼容性检测等评估报告,为用户的迁移工作提供依据。 - -- **迁移功能模块**,提供一键迁移、迁移进度展示、迁移结果判断等功能。 - -## DDE组件 - -统信桌面环境(DDE)专注打磨产品交互、视觉设计,拥有桌面环境的核心技术,主要功能包含:登录锁屏、桌面及文件管理器、启动器、任务栏(DOCK)、窗口管理器、 控制中心等。由于界面美观、交互优雅、安全可靠、尊重隐私,一直是用户首选 桌面环境之一,用户可以使用它进行办公与娱乐,在工作中发挥创意和提高效率,和亲朋好友保持联系,轻松浏览网页、享受影音播放。 - -## Kmesh 项目 - -Kmesh 基于可编程内核,将服务治理下沉OS,实现高性能服务网格数据面,服务间通信时延对比业界方案提升5倍。 - -- 支持对接遵从 XDS 协议的网格控制面(如 istio)。 - -- **流量编排能力**,支持轮询等负载均衡策略;支持L4、L7路由规则;支持百分比灰度方式选择后端服务策略。 - -- **sockamp 网格加速能力**,以典型的 service mesh 场景为例,使能 sockmap 网格加速能力之后,业务容器和 envoy 容器之间的通信将被ebpf程序短接,通过缩短通信路径从而达到加速效果,对于同节点上 Pod 间通信也能通过 ebpf 程序进行加速。 - -## RISC-V 架构 QEMU 镜像 - -openEuler 23.09 版本中发布了官方支持的 RISC-V 架构的操作系统。该版本的操作系统底座旨在为上层应用程序提供基础支持,具备高度可定制性、灵活性和安全性。它为 RISC-V 架构的计算平台提供稳定、可靠的操作环境,方便用户进行上层应用的安装和验证,共同推动 RISC-V 架构下软件生态的丰富和质量的提升。 - -- 该操作系统底座的功能包括升级到 6.4.0 版本的内核,与主流架构保持一致。 - -- 提供稳定的基础系统底座,包括处理器管理、内存管理、任务调度、设备驱动等核心功能,以及常用的工具等。 - -## 动态完整性度量特性 - -DIM(Dynamic Integrity Measurement)动态完整性度量特性通过在程序运行时对内存中的关键数据(如代码段)进行度量,并将度量结果和基准值进行对比,确定内存数据是否被篡改,从而检测攻击行为,并采取应对措施。 - -- 支持度量用户态进程、内核模块、内核内存代码段数据。 - -- 支持将度量结果扩展至 TPM 2.0 芯片 PCR 寄存器,用于对接远程证明。 - -- 支持配置度量策略,支持度量策略签名校验。 - -- 支持工具生成并导入度量基线数据,支持基线数据签名校验。 - -- 支持配置国密 SM3 度量算法。 - -## Kuasar 统一容器运行时特性 - -Kuasar 是一款支持多种类型沙箱统一管理的容器运行时,可同时支持业界主流的多钟沙箱隔离技术,openEuler 基于 Kuasar 统一容器运行时并结合已有 openEuler 生态中 iSulad 容器引擎和 StratoVirt 虚拟化引擎技术,打造面向云原生场景轻量级全栈自研的安全容器极低底噪、极速启动的关键竞争力。 - -本次发布的 Kuasar 0.1.0 版本,支持 StratoVirt 类型轻量级虚拟机沙箱,支持通过 K8S+iSulad 创建 StratoVirt 类型的安全容器实例。 - -- 支持 iSulad 容器引擎对接 Kuasar 容器运行时,兼容 K8S 云原生生态。 - -- 支持基于 StratoVirt 类型轻量级虚拟机沙箱技术创建安全容器沙箱。 - -- 支持 StratoVirt 类型安全容器进行资源精准限制管理。 - -## sysBoost 项目 - -sysBoost 是一个为应用进行系统微架构优化的工具,优化涉及汇编指令、代码布局、数据布局、内存大页、系统调用等方面。 - -- **二进制文件合并**:目前只支持全静态合并场景,将应用与其依赖的动态库合并为一个二进制,并进行段级别的重排,将多个离散的代码段/数据段合并为一个,提升应用性能。 - -- **sysBoost 守护进程服务**:sysBoost 使用注册 systemd 
服务的方式使性能开箱最优,系统启动后,systemd将会拉起sysBoost守护进程,sysBoost 守护进程读取配置文件获取需要优化的二进制以及对应的优化方式。 - -- **rto二进制加载内核模块**:采用新增二进制加载模块的方法,在内核加载二进制时自动加载优化的二进制。 - -- **二进制代码段/数据段大页预加载**:sysBoost 提供大页预加载的功能,在二进制优化完成后立即将其内容以大页形式加载到内核中,在应用启动时将预加载的内容批量映射到用户态页表,减少应用的缺页中断和访存延迟,提升启动速度和运行效率。 - -## CTinspector 项目 - -CTinspector 是天翼云科技有限公司基于 ebpf 指令集自主创新研发的语言虚拟机运行框架。基于 CTinspector 运行框架可以快速拓展其应用实例用于诊断网络性能瓶颈点,诊断存储 I/O 处理的热点和负载均衡等,提高系统运行时诊断的稳定性和时效性。 - -- 采用一个 ebpf 指令集的语言虚拟机 Packet VM,它最小只有 256 字节,包含所有虚拟机应有的部件:寄存器,堆栈段,代码段,数据段,页表。 - -- Packet VM 支持自主的 migration,即 packet VM 内的代码可以调用 migrate kernel function,以将 packet VM 迁移至它自己指定的节点。 - -- Packet VM 同时支持断点续执行,即 packet VM 迁移至下一个节点后可以沿着上一个节点中断的位置继续执行下一条指令。 - -## CVE-ease 项目 - -CVE-ease 是天翼云自主创新开发的一个专注于CVE信息的平台,它搜集了多个安全平台发布的各种 CVE 信息,并通过邮件、微信、钉钉等多种渠道及时通知用户。CVE-ease 平台旨在帮助用户快速了解和应对系统中存在的漏洞,在提高系统安全性和稳定性的同时,用户可以通过 CVE-ease 平台查看 CVE 信息的详细内容,包括漏洞描述、影响范围、修复建议等,并根据自己的系统情况选择合适的修复方案。 - -目前 CVE-ease 主要包括以下功能: - -- CVE 信息动态获取和整合,实时跟踪多平台 CVE 披露信息,并进整合放入 CVE 数据库。 - -- CVE 信息提取和更新,对收集到的 CVE 信息提取关键信息并实时更新发生变更的 CVE。 - -- CVE 数据保存和管理,自动维护和管理 CVE 数据库。 - -- 历史 CVE 信息查看,通过交互方式查询各种条件的 CVE。 - -- CVE 信息实时播报,通过企业微信、钉钉、邮箱等方式实时播报历史CVE信息。 - -## PilotGo运维管理平台特性 - -PilotGo 是 openEuler 社区原生孵化的运维管理平台,采用插件式架构设计,功能模块轻量化组合、独立迭代演进,同时保证核心功能稳定;同时使用插件来增强平台功能,并打通不同运维组件之间的壁垒,实现了全局的状态感知及自动化流程。 - -PilotGo核心功能模块包括: - -- **用户管理**:支持按照组织结构分组管理,支持导入已有平台账号,迁移方便。 - -- **权限管理**:支持基于 RBAC 的权限管理,灵活可靠。 - -- **主机管理**:状态前端可视化、直接执行软件包管理、服务管理、内核参数调优、简单易操作。 - -- **批次管理**:支持运维操作并发执行,稳定高效。 - -- **日志审计**:跟踪记录用户及插件的变更操作,方便问题回溯及安全审计。 。 - -- **告警管理**:平台异常实时感知。 - -- **平台异常实时感知**:支持扩展平台功能,插件联动,自动化能力倍增,减少人工干预。 - -## CPDS 支持对容器 TOP 故障、亚健康检测的监测与识别 - -云原生技术的广泛应用,致使现代应用部署环境越来越复杂。容器架构提供了灵活性和便利性,但也带来了更多的监测和维护挑战。CPDS(容器故障检测系统)应运而生,旨在为容器化应用提供可靠性和稳定性的保障。 - -- **集群信息采集**:在宿主机上实现节点代理,采用 systemd、initv、ebpf 等技术,对容器关键服务进行监控,采集集群基础服务类数据;对节点网络、内核、磁盘 LVM 等相关信息进行监控,采集集群 OS 类异常数据;采用无侵入的方式在节点、容器内设置跨NS的代理,针对对应用状态、资源消耗情况、关键系统函数执行情况、IO 执行状态等执行异常进行监控,采集业务服务异常类数据。 - -- 
**集群异常检测**:处理各节点原始数据,基于异常规则对采集的原始数据进行异常检测,提取关键信息。同时基于异常规则对采集数据进行异常检测,后将检测结果数据和原始据进行在线上传,并同步进行持久化操作。 - -- **节点、业务容器故障/亚健康诊断**:基于异常检测数据,对节点、业务容器进行故障/亚健康诊断,将分析检测结果进行持久化存储,并提供 UI 层进行实时、历史的诊断数据查看。 - -## EulerMaker 构建系统 - -EulerMaker 构建系统是一款软件包构建系统,完成源码到二进制软件包的构建,并支持开发者通过搭积木方式,组装和定制出适合自己需求的场景化 OS。主要提供增量/全量构建,分层定制与镜像定制的能力。 - -- **增量/全量构建**:基于软件包变化,结合软件包依赖关系,分析影响范围,得到待构建软件包列表,按照依赖顺序并行下发构建任务。 - -- **构建依赖查询**:提供工程中软件包构建依赖表,支持筛选及统计软件包依赖及被依赖的软件包内容。 - -- **分层定制**:支持在构建工程中,通过选择与配置层模型,实现对软件包的patch,构建依赖,安装依赖,编译选项等内容的定制,完成针对软件包的场景化定制。 - -- **镜像定制**:支持开发者通过配置 repo 源,生成 iso、qcow2、容器等 OS 镜像,并支持对镜像进行软件包列表定制。 +- **MPTCP特性支持**:MPTCP协议诞生旨在突破传统 TCP 协议的单一路径传输瓶颈,允许应用程序使用多个网络路径进行并行数据传输。这一设计优化了网络硬件资源的利用效率,通过智能地将流量分配至不同传输路径,显著缓解了网络拥塞问题,从而提高数据传输的可靠性和吞吐量。 +目前,MPTCP 在下述网络场景中已经展现出了其优秀的性能: + 1. 网络通路的选择:在现有的网络通路中,根据延迟、带宽等指标评估,选择最优的通路。 + 2. 无缝切网:在不同类型网络之间切换时,数据传输不中断。 + 3. 数据分流:同时使用多个通道传输,对数据包进行分发实现并发传输,增加网络带宽。 + + 在实验环境中,采用MPTCP v1技术的RSYNC文件传输工具展现出了令人满意的效率提升。具体而言,传输1.3GB大小的文件时,传输时间由原来的114.83 s缩短至仅14.35s,平均传输速度由原来的11.08 MB/s提升至88.25 MB/s,可以极大程度的缩减文件传输时间。同时,实验模拟了传输过程中一条或多条路径突发故障而断开的场景,MPTCP在此种场景下可以将数据无缝切换至其他可用的数据通道,确保数据传输的连续性与完整性。 +在openEuler 24.03 LTS SP1中,已经完成了对linux主线内核6.9中MPTCP相关特性的全面移植与功能优化。 +- **ext4文件系统支持Large folio**:iozone性能总分可以提升80%,iomap框架回写流程支持批量映射block。支持ext4默认模式下批量申请block,大幅优化各类benchmark下ext4性能表现(华为贡献)。ext4 buffer io读写流程以及pagecache回写流程弃用老旧的buffer_head框架,切换至iomap框架,并通过iomap框架实现ext4支持large folio。24.03 LTS SP1版本新增对于block size < folio size场景的小buffered IO(<=4KB)的性能优化,性能提升20%。 +- **xcall/xint特性**:随着Linux内核的发展,系统调用成为性能瓶颈,尤其是在功能简单的调用中。AARCH64平台上的SYSCALL共享异常入口,包含安全检查等冗余流程。降低SYSCALL开销的方法包括业务前移和批量处理,但需业务适配。XCALL提供了一种无需业务代码感知的方案,通过优化SYSCALL处理,牺牲部分维测和安全功能来降低系统底噪,降低系统调用处理开销。 +内核为了使中断处理整体架构统一,将所有中断处理全部归一到内核通用中断处理框架中,同时随着内核版本演进,通用中断处理框架附加了很多与中断处理自身的功能关系不大的安全加固和维测特性,这导致中断处理的时延不确定性增大。xint通过提供一套精简的中断处理流程,来降低中断处理的时延和系统底噪。 +- **按需加载支持failover特性**:cachefiles在按需模式下,如果守护进程崩溃或被关闭,按需加载相关的读取和挂载将返回-EIO。所有挂载点必须要在重新拉起 daemon 后重新挂载后方可继续使用。这在公共云服务生产环境中发生时是无法接受的,这样的I/O错误将传播给云服务用户,可能会影响他们作业的执行,并危及系统的整体稳定性。cachefiles 
failover 特性避免了守护进程崩溃后重新挂载所有挂载点,只需快速重新拉起守护进程即可,用户和服务不会感知到守护进程崩溃。
+- **可编程调度特性**:支持可编程调度功能,为用户态提供可编程的调度接口,是CFS算法的扩展。用户可以根据实际的业务场景定制化调度策略,bypass原有的CFS调度算法。提供的功能包括支持标签机制、支持任务选核可编程、支持负载均衡可编程、支持任务选择可编程、支持任务抢占可编程以及kfunc接口函数。
+- **支持 SMC-D with loopback-ism 特性**:SMC-D(Shared Memory Communication over DMA)是一种兼容 socket 接口,基于共享内存,透明加速 TCP 通信的内核网络协议栈。SMC-D 早期只能用于 IBM z S390 架构机器,SMC-D with loopback-ism 技术通过创建虚拟设备 loopback-ism 模拟 ISM 功能,使得 SMC-D 可用于非 S390 架构机器,成为内核通用机制。SMC-D with loopback-ism适用于采用TCP协议进行OS内进程间或容器间通信的场景,通过旁路内核TCP/IP协议栈等方法,实现通信加速。结合使用smc-tools工具,可以通过LD_PRELOAD预加载动态库的方法实现TCP协议栈透明替换,无需更改原有应用程序。根据社区反馈结果,与原生TCP相比,SMC-D with loopback-ism能够将网络吞吐量提升40%以上。
+- **IMA RoT特性**:当前,Linux IMA(Integrity Measurement Architecture)子系统主要使用 TPM 芯片作为可信根(Root of Trust,RoT)设备,针对度量列表提供完整性证明,其在编码上也与 TPM 的操作紧耦合。而机密计算等新场景要求 IMA 可使用新型 RoT 设备,例如 openEuler 已支持的 VirtCCA。本特性为一套 IMA RoT 设备框架,在 IMA 子系统和 RoT 设备之间实现一个抽象层,既简化各类 RoT 设备对 IMA 子系统的适配,也方便用户和 IMA 子系统对各类 RoT 设备实施配置和操作。
+- **支持脚本类病毒程序防护**:目前勒索病毒主要是脚本类文件(如JSP文件),而当前内核防御非法入侵的IMA完整性保护技术,主要针对ELF类病毒文件。脚本类病毒文件通过解释器运行,能够绕开内核中的安全技术实施攻击。为了支持内核中的IMA完整性保护技术可以检查系统中间接执行的脚本类文件,通过系统调用execveat()新增执行检查的flags,查验其执行权限,并在检查中调用IMA完整性度量接口以实现对脚本类文件的完整性保护。经测试验证,目前已经支持脚本解释器主动调用execveat()系统调用函数并传入AT_CHECK参数对脚本文件进行可执行权限检查(包括IMA检查),只有当权限检查成功后,才可继续运行脚本文件。
+- **haltpoll特性**:haltpoll特性通过虚拟机guest vcpu在空闲时进行轮询的机制,避免vcpu唤醒时发送IPI中断,降低了中断发送和处理的开销,并且由于轮询时虚拟机不需要陷出,减少了陷入陷出的开销。该特性能够降低进程间通信时延,显著提升上下文切换效率,提升虚机性能。
+- **内核TCP/IP协议栈支持CAQM拥塞控制算法**:CAQM是一种主动队列管理算法,也是一种网络拥塞控制机制,主要运行于数据中心使用TCP传输数据的计算端侧节点和传输路径上的网侧交换机节点。通过网侧交换节点主动计算网络空闲带宽和最优带宽分配,端侧协议栈与网侧交换机协同工作,在高并发场景下获得网络交换机“零队列”拥塞控制效果和极低传输时延。CAQM算法通过在以太链路层增加拥塞控制标记字段,实现动态调整队列长度,减少延迟和丢包,提高网络资源的利用率。对于数据中心低延时通算场景,可极大减少延迟和丢包的发生,增强用户体验。在数据中心典型场景下,对比经典Cubic算法,关键指标提升:1)传输时延:CAQM vs Cubic 时延降低92.4%。2)带宽利用率:在交换机队列缓存占用降低90%的情况下,仍保持TCP传输带宽利用率逼近100%(99.97%)。CAQM算法使用说明:1)本算法需要端侧服务器和网侧交换机协同配合,故中间节点交换机需要支持CAQM协议(协议头识别,拥塞控制字段调整等);2)本算法通过内核编译宏(CONFIG_ETH_CAQM)控制,默认不使能,用户需通过打开编译宏,重新编译替换内核后,使能算法功能。
+## NestOS容器操作系统
+NestOS是在openEuler社区孵化的云底座操作系统,集成了rpm-ostree支持、ignition配置等技术。采用双根文件系统、原子化更新的设计思路,使用nestos-assembler快速集成构建,并针对K8S、OpenStack等平台进行适配,优化容器运行底噪,使系统具备十分便捷的集群组建能力,可以更安全地运行大规模的容器化工作负载。
+  - **开箱即用的容器平台**:NestOS集成适配了iSulad、Docker、Podman等主流容器引擎,为用户提供轻量级、定制化的云场景OS。
+  - **简单易用的配置过程**:NestOS通过ignition技术,可以以相同的配置方便地完成大批量集群节点的安装配置工作。
+  - **安全可靠的包管理**:NestOS使用rpm-ostree进行软件包管理,搭配openEuler软件包源,确保原子化更新的安全稳定状态。
+  - **友好可控的更新机制**:NestOS使用zincati提供自动更新服务,可实现节点自动更新与重新引导,实现集群节点有序升级而服务不中断。
+  - **紧密配合的双根文件系统**:NestOS采用双根文件系统的设计实现主备切换,确保NestOS运行期间的完整性与安全性。
+## SysCare特性增强
+SysCare是一个系统级热修复软件,为操作系统提供安全补丁和系统错误热修复能力,主机无需重新启动即可修复该系统问题。SysCare将内核态热补丁技术与用户态热补丁技术进行融合统一,用户仅需聚焦在自己核心业务中,系统修复问题交予SysCare进行处理。后期计划根据修复组件的不同,提供系统热升级技术,进一步解放运维用户提升运维效率。
+- **热补丁制作**:用户仅需输入目标软件的源码RPM包、调试信息RPM包与待打补丁的路径,无需对软件源码进行任何修改,即可生成对应的热补丁RPM包。
+- **热补丁生命周期管理**:SysCare提供一套完整的、傻瓜式补丁生命周期管理方式,旨在减少用户学习、使用成本,通过单条命令即可对热补丁进行管理。依托于RPM系统,SysCare构建出的热补丁依赖关系完整,热补丁分发、安装、更新与卸载流程均无需进行特殊处理,可直接集成放入软件仓repo。
+- **内核热补丁与用户态热补丁融合**:SysCare基于upatch和kpatch技术,覆盖应用、动态库、内核,自顶向下打通热补丁软件栈,提供用户无感知的全栈热修复能力。
+- **新增特性**:支持重启后按照用户操作顺序恢复ACCEPTED状态热补丁。
+## iSula支持NRI插件式扩展
+NRI(Node Resource Interface)是用于控制节点资源的公共接口,是CRI兼容的容器运行时插件扩展的通用框架。它为扩展插件提供了跟踪容器状态,并对其配置进行有限修改的基本机制,允许将用户的某些自定义逻辑插入到OCI兼容的运行时中,此逻辑可以对容器进行受控更改,或在容器生命周期的某些时间点执行 OCI 范围之外的额外操作。iSulad新增对NRI插件式扩展的支持,减少k8s场景下容器资源管理的维护成本,消除调度延迟,规范信息的一致性。
+NRI插件通过请求isula-rust-extension组件中启动的NRI runtime Service服务与iSulad建立连接后,可订阅Pod与Container的生命周期事件:
+1. 可订阅Pod生命周期事件,包括:creation、stopping和removal。
+2.
可订阅Container生命周期事件,包括creation、post-creation、starting、post-start、updating、post-update、stopping和removal。
+
+iSulad在接收到k8s下发的CRI请求后,会向所有订阅了对应生命周期事件的NRI插件发送请求,NRI插件可在请求中获得Pod与Container的元数据与资源信息。之后NRI插件可根据需求更新Pod与Container的资源配置,并将更新的信息传递给iSulad,iSulad将更新后的配置传递给容器运行时,使配置生效。
+## oeAware采集、调优插件等功能增强
+oeAware是在openEuler上实现低负载采集感知调优的框架,目标是动态感知系统行为后智能使能系统的调优特性。传统调优特性都以独立运行且静态打开关闭为主,oeAware将调优拆分为采集、感知和调优三层,每层通过订阅方式关联,各层采用插件式开发尽可能复用。
+oeAware 的每个插件都是按oeAware 标准接口开发的动态库,包含若干个实例,每个实例可以是一个独立的采集、感知或调优功能集,每个实例包含若干个topic,其中 topic 主要用于提供采集或者感知的数据结果,这些数据结果可供其他插件或者外部应用进行调优或分析。
+- SDK提供的接口可以实现订阅插件的topic,回调函数接收oeAware的数据,外部应用可以通过SDK开发定制化功能,例如完成集群各节点信息采集,分析本节点业务特征。
+- PMU信息采集插件:采集系统PMU性能记录。
+- Docker 信息采集插件:采集当前环境Docker的一些参数信息。
+- 系统信息采集插件:采集当前环境的内核参数、线程信息和一些资源信息(CPU、内存、IO、网络)等。
+- 线程感知插件:感知关键线程信息。
+- 评估插件:分析业务运行时系统的NUMA和网络信息,给用户推荐使用的调优方式。
+- 系统调优插件:(1)stealtask:优化CPU调度;(2)smc_tune(SMC-D):基于内核共享内存通信特性,提高网络吞吐,降低时延;(3)xcall_tune:跳过非关键流程的代码路径,优化 SYSCALL 的处理底噪。
+- Docker调优插件:利用cpuburst特性缓解突发负载下的CPU性能瓶颈。
+
+## KubeOS特性增强
+KubeOS是针对云原生场景而设计、轻量安全的云原生操作系统及运维工具,提供基于kubernetes的云原生操作系统统一运维能力。KubeOS设计了专为容器运行的云原生操作系统,通过根目录只读、仅包含容器运行所需组件、dm-verity安全加固,减少漏洞和攻击面,提升资源利用率和启动速度,提供云原生化的、轻量安全的操作系统。KubeOS支持使用kubernetes原生声明式API,统一对集群worker节点OS进行升级、配置和运维,从而降低云原生场景的运维难度,解决用户集群节点OS版本分裂、缺乏统一的OS运维管理方案的问题。
+KubeOS新增配置能力、定制化镜像制作能力和rootfs完整性保护dm-verity,如图所示,具体能力如下:
+- KubeOS支持集群参数统一配置,支持通过KubeOS统一settings配置,支持如下配置:
+(1)KubeOS支持limits.conf文件参数统一配置;
+(2)KubeOS支持containerd、kubelet等集群参数统一配置。
+- KubeOS支持镜像定制化,镜像制作时支持systemd服务、grub密码、系统盘分区、用户/用户组、文件、脚本和persist分区目录的自定义配置。
+- KubeOS支持静态完整性保护dm-verity,支持在虚拟机镜像制作时开启dm-verity,对rootfs进行完整性校验,并支持dm-verity开启时的升级和配置。
+## IMA特性增强
+内核完整性度量架构(IMA, Integrity Measurement Architecture)是Linux开源,且在业界被广泛使用的文件完整性保护技术。在实际应用场景中,IMA可以用来对系统运行的程序发起完整性检查,一方面检测应用程序篡改,另一方面通过白名单机制以保证只有经过认证(如签名或HMAC)的文件才可以被运行。
+目前运行在Linux操作系统上的应用程序可分为两种:
+- 二进制可执行程序:即符合ELF格式标准的程序文件,可直接通过exec/mmap系统调用运行。
+- 解释器类应用程序:即通过解释器间接运行的程序文件,如通过Bash/Python/Lua等解释器加载运行的脚本程序,或JVM运行的Java程序等。
+
+对于二进制可执行程序,IMA可以通过在exec/mmap系统调用的hook函数发起度量或校验流程,从而实现完整性保护。
+但是针对解释器类应用程序,现有的IMA机制无法进行有效保护。其原因是此类应用程序主要通过read系统调用由解释器加载并解析运行,而IMA无法将其和其他可变的文件(如配置文件、临时文件等)进行区分,一旦针对read系统调用配置开启IMA机制,则会将其他可变文件也纳入保护范围,而可变文件无法预先生成度量基线或校验凭据,从而导致完整性检查失败。
+因此本特性旨在对现有IMA机制进行增强,有效提升对解释器类应用程序的完整性保护能力。
+## 异构可信根
+典型的攻击手段往往伴随着信息系统真实性、完整性的破坏,目前业界的共识是通过硬件可信根对系统关键组件进行度量/验证,一旦检测到篡改或仿冒行为,就执行告警或拦截。
+当前业界主流是采用TPM作为信任根,结合完整性度量软件栈逐级构筑系统信任链,从而保证系统各组件的真实性和完整性。openEuler当前支持的完整性度量特性包括:度量启动、IMA文件度量、DIM内存度量等。
+openEuler 24.03 LTS SP1版本在内核的integrity子模块中实现了一套可信根框架,南向支持多种可信根驱动,北向提供统一度量接口,对接上层完整性保护软件栈,将完整性度量特性的硬件可信根支持范围从单TPM扩展为多元异构可信根。
+## secGear特性增强
+secGear远程证明统一框架是机密计算远程证明相关的关键组件,屏蔽不同TEE远程证明差异,提供Attestation Agent和Attestation Service两个组件,Agent供用户集成获取证明报告,对接证明服务;Service可独立部署,支持iTrustee、virtCCA远程证明报告的验证。
+
+远程证明统一框架聚焦机密计算相关功能,部署服务时需要的服务运维等相关能力由服务部署第三方提供。远程证明统一框架的关键技术如下:
+- 报告校验插件框架:支持运行时兼容iTrustee、virtCCA、CCA等不同TEE平台证明报告检验,支持扩展新的TEE报告检验插件。
+- 证书基线管理:支持对不同TEE类型的TCB/TA基线值管理及公钥证书管理,集中部署到服务端,对用户透明。
+- 策略管理:提供默认策略(易用)、用户定制策略(灵活)。
+- 身份令牌:支持对不同TEE签发身份令牌,由第三方信任背书,实现不同TEE类型相互认证。
+- 证明代理:支持对接证明服务/点对点互证,兼容TEE报告获取,身份令牌验证等,易集成,使用户聚焦业务。
+
+根据使用场景,支持点对点验证和证明服务验证两种模式。
+证明服务验证流程如下:
+(1)用户(普通节点或TEE)对TEE平台发起挑战。
+(2)TEE平台通过证明代理获取TEE证明报告,并返回给用户。
+(3)用户端证明代理将报告转发到远程证明服务。
+(4)远程证明服务完成报告校验,返回由第三方信任背书的统一格式身份令牌。
+(5)证明代理验证身份令牌,并解析得到证明报告校验结果。
+点对点验证流程(无证明服务)如下:
+(1)用户向TEE平台发起挑战,TEE平台返回证明报告给用户。
+(2)用户使用本地点对点TEE校验插件完成报告验证。
+注意:点对点验证和远程证明服务验证时的证明代理不同,在编译时可通过编译选项,决定编译有证明服务和点对点模式的证明代理。
+## 容器干扰检测,分钟级完成业务干扰源(CPU/IO)识别与干扰源发现
+gala-anteater是一款基于AI的操作系统灰度故障的异常检测平台,集成了多种异常检测算法,通过自动化模型预训练、线上模型的增量学习和模型更新,可以实现系统级故障发现和故障点上报。
+在线容器高密部署场景下,存在资源无序竞争现象,导致容器实例间相互干扰,使用gala-anteater可以分钟级完成业务干扰源(CPU/IO)识别与干扰源发现,辅助运维人员快速跟踪并解决问题,保障业务QoS。
+
+gala-anteater通过线上线下相结合,利用在线学习技术,实现模型的线下学习,线上更新,并应用于线上异常检测。
+1. Offline: 首先,利用线下历史KPI数据集,经过数据预处理、特征选择,得到训练集;然后,利用得到的训练集,对无监督神经网络模型(例如Variational Autoencoder)进行训练调优。最后,利用人工标注的测试集,选择最优模型。
+2.
Online: 将线下训练好的模型,部署到线上,然后利用线上真实的数据集,对模型进行在线训练以及参数调优,再利用训练好的模型,进行线上环境的实时异常检测。
+
+## A-Ops适配authHub统一用户鉴权
+authHub是一个基于oauth2协议开发的统一用户鉴权中心,A-Ops通过authHub管理应用注册,实现应用间的统一用户鉴权。
+authHub基于oauth2协议,实现统一用户鉴权中心,用户鉴权中心功能包括:
+- 应用管理:应用部署后需要在authHub应用管理界面进行注册和配置,用户可以使用注册的应用的功能。
+- 用户鉴权:在authHub管理的应用中,用户可以实现单点登录和单点登出。
+## 微服务性能问题分钟级定界/定位(TCP,IO,调度等)
+基于拓扑的根因推导技术提供了大规模集群情况下的故障检测、根因定位服务,新增云原生场景故障定界定位能力,特别是针对网络L7层协议,如HTTPS、PGSQL、MYSQL等。这项技术通过 gala-gopher、gala-spider 和 gala-anteater 实现,新增内容提供了更细粒度的故障定界能力,通过这种能力运维团队可以快速定位问题源头,从而提高系统的稳定性和可靠性。该服务主要功能如下:
+- 指标采集服务:gala-gopher 通过 ebpf 技术提供网络、IO相关的指标采集上报功能。
+- 集群拓扑构建服务:gala-spider 通过接受指标采集服务上报的数据信息,构建容器、进程级别的调用关系拓扑图。
+- 故障检测服务:gala-anteater 通过故障检测模型对应用上报的采集指标进行分类,判断是否发生异常。
+- 根因定位服务:gala-anteater 通过节点异常信息和拓扑图信息,定位导致此次异常的根因节点。
+## utsudo项目发布
+Sudo 是 Unix 和 Linux 操作系统中常用的工具之一,它允许用户在需要超级用户权限的情况下执行特定命令。然而,传统 Sudo 在安全性和可靠性方面存在一些缺陷,为此 utsudo 项目应运而生。
+utsudo 是一个采用 Rust 重构 Sudo 的项目,旨在提供一个更加高效、安全、灵活的提权工具,涉及的模块主要有通用工具、整体框架和功能插件等。
+
+**基本功能**
+- 访问控制:可以根据需求,限制用户可以执行的命令,并规定所需的验证方式。
+- 审计日志:可以记录和追踪每个用户使用 utsudo 执行的命令和任务。
+- 临时提权:允许普通用户通过输入自己的密码,临时提升为超级用户执行特定的命令或任务。
+- 灵活配置:可以设置参数如命令别名、环境变量、执行参数等,以满足复杂的系统管理需求。
+
+**增强功能**
+utsudo在openEuler 24.09版本中是0.0.2版本,当前版本主要功能有:
+- 提权流程:把普通用户执行命令的进程,提权为root权限。
+- 插件加载流程:实现了对插件配置文件的解析,以及对插件库的动态加载。
+## utshell 项目发布
+utshell 是一个延续了 bash 使用习惯的全新 shell,它能够与用户进行命令行交互,响应用户的操作去执行命令并给予反馈。并且能执行自动化脚本帮助运维。
+
+**基本功能**
+- 命令执行:可以执行部署在用户机器上的命令,并将执行的返回值反馈给用户。
+- 批处理:通过脚本完成自动任务执行。
+- 作业控制:能够将用户命令作为后台作业,从而实现多个命令同时执行。并对并行执行的任务进行管理和控制。
+- 历史记录:记录用户所输入的命令。
+- 别名功能:能够让用户对命令起一个自己喜欢的别名,从而个性化自己的操作功能。
+
+**增强功能**
+utshell在openEuler 24.09版本中是0.5版本,当前版本主要功能有:
+- 实现对shell脚本的解析。
+- 实现对第三方命令的执行。
+
+## GCC for openEuler
+GCC for openEuler基线版本已经从GCC 10.3升级到GCC 12.3版本,支持自动反馈优化、软硬件协同、内存优化、SVE向量化、矢量化数学库等特性。
+1. GCC版本升级到12.3,默认语言标准升级到C17/C++17标准,支持Armv9-a架构,X86的AVX512 FP16等更多硬件架构特性。
+2. 支持结构体优化,指令选择优化等,充分使能ARM架构的硬件特性,运行效率高,在SPEC CPU 2017等基准测试中性能大幅优于上游社区的GCC 10.3版本。
+3.
支持自动反馈优化特性,实现应用层MySQL数据库等场景性能大幅提升。 + +**功能描述** +- 支持ARM架构下SVE矢量化优化,在支持SVE指令的机器上启用此优化后能够提升程序运行的性能。 +- 支持内存布局优化,通过重新排布结构体成员的位置,使得频繁访问的结构体成员放置于连续的内存空间上,提升Cache的命中率,提升程序运行的性能。 +- 支持SLP转置优化,在循环拆分阶段增强对连续访存读的循环数据流分析能力,同时在SLP矢量化阶段,新增转置操作的分析处理,发掘更多的矢量化机会,提升性能。 +- 支持冗余成员消除优化,消除结构体中从不读取的结构体成员,同时删除冗余的写语句,缩小结构体占用内存大小,降低内存带宽压力,提升性能。 +- 支持数组比较优化,实现数组元素并行比较,提高执行效率。 +- 支持ARM架构下指令优化,增强ccmp指令适用场景,简化指令流水。 +- 支持if语句块拆分优化,增强程序间常量传播能力。 +- 增强SLP矢量优化,覆盖更多矢量化场景,提升性能。 +- 支持自动反馈优化,使用perf收集程序运行信息并解析,完成编译阶段和二进制阶段反馈优化,提升MySQL数据库等主流应用场景的性能。 + +## Gazelle特性增强 +Gazelle是一款高性能用户态协议栈。它基于DPDK在用户态直接读写网卡报文,共享大页内存传递报文,使用轻量级LwIP协议栈。能够大幅提高应用的网络I/O吞吐能力。专注于数据库网络性能加速,兼顾高性能与通用性。本次版本新增容器场景xdp部署模式及openGauss数据库tpcc支持,丰富用户态协议栈。 + +- 高性能(超轻量):基于 dpdk、lwip 实现高性能轻量协议栈能力。 +- 极致性能:基于区域大页划分、动态绑核、全路径零拷贝等技术,实现高线性度并发协议栈。 +- 硬件加速:支持 TSO/CSUM/GRO 等硬件卸载,打通软硬件垂直加速路径。 +- 通用性(posix 兼容):接口完全兼容 posix api,应用零修改,支持 udp 的 recvfrom 和 sendto 接口。 +- 通用网络模型:基于 fd 路由器、代理式唤醒等机制实现自适应网络模型调度,udp 多节点的组播模型,满足任意网络应用场景。 +- 易用性(即插即用):基于 LD_PRELOAD 实现业务免配套,真正实现零成本部署。 +- 易运维(运维工具):具备流量统计、指标日志、命令行等完整运维手段。 + +**新增特性** +- 支持基于ipvlan 的l2模式网卡上通过xdp的方式部署使用Gazelle。 +- 新增中断模式,可以支持在无流量或低流量场景下,lstack不再占满CPU核。 +- 优化pingpong模式的网络,在数据包进行pingpong收发时,优化报文的发送行为。 +- 新增对openGauss数据库的单机及单主备tpcc测试支持。 + +## virtCCA机密虚机 +virtCCA机密虚机特性基于鲲鹏920系列S-EL2能力,在TEE侧实现机密虚机能力,实现现有普通虚机中的软件栈无缝迁移到机密环境中。基于Arm CCA标准接口,在Trustzone固件基础上构建TEE虚拟化管理模块,实现机密虚机间的内存隔离、上下文管理、生命周期管理和页表管理等机制,支持客户应用无缝迁移到机密计算环境中。 +1. 设备直通: +设备直通是基于华为鲲鹏920新型号通过预埋在PCIE Root Complex里的PCIE保护组件,在PCIE总线上增加选通器,对CPU与外设间的通信进行选通,即对SMMU的Outbound Traffic和Inbound Traffic的控制流和数据流进行控制,以此保证整体数据链路的安全性。 +基于virtCCA PCIPC的设备直通能力,实现对PCIE设备的安全隔离和性能提升,存在以下优势: +(1)安全隔离 +TEE侧控制设备的访问权限,Host侧软件无法访问TEE侧设备; +(2)高性能 +机密设备直通,相比业界加解密方案,数据面无损耗; +(3)易用性 +兼容现有开源OS,无需修改开源OS内核驱动代码。 +2. 
国密硬件加速: +国密硬件加速是基于华为鲲鹏芯片,将KAE加速器能力复用到安全侧,并采用openEuler UADK用户态加速器框架,提供客户机密虚机内国密加速性能提升以及算法卸载的能力。 +## 海光CSV3支持 +海光第三代安全虚拟化技术(CSV3)在前二代技术的基础上继续增强,在CPU内部实现了虚拟机数据的安全隔离,禁止主机操作系统读写CSV3虚拟机内存,禁止主机操作系统读写虚拟机嵌套页表,保证了虚拟机数据的完整性,实现了CSV3虚拟机数据机密性和完整性的双重安全。 + +**安全内存隔离单元** +安全内存隔离单元是海光第三代安全虚拟化技术的专用硬件,是实现虚拟机数据完整性的硬件基础。该硬件集成于CPU内部,放置于CPU核心访问内存控制器的系统总线路径上。该硬件可获取CSV3虚拟机安全内存的信息,包括内存物理地址、虚拟机VMID及相关的权限等。CPU在访问内存时,访问请求先经过安全内存隔离单元做合法性检查,若访问允许,继续访问内存,否则访问请求被拒绝。 +以CSV3架构图为例,在CSV3虚拟机运行过程中读取或写入内存时,先经过页表翻译单元完成虚拟机物理地址(GPA, Guest Physical Address)到主机物理地址(HPA, Host Physical Address)的转换,再向地址总线发出内存访问请求。访问请求中除了包含读写命令、内存地址(HPA),还必须同时发出访问者的VM ID。 +当CPU读取内存数据时,若安全内存隔离单元判断内存读取请求者的VMID错误,返回固定模式的数据。当CPU写入内存数据时,若安全内存隔离单元判断内存写入请求者的VMID错误,丢弃写入请求。 + +**安全处理器** +安全内存隔离单元是海光第三代安全虚拟化保护虚拟机内存完整性的核心硬件,对此硬件单元的配置必须保证绝对安全,无法被主机操作系统修改。 +海光安全处理器是SoC内独立于主CPU之外的处理器,是CPU的信任根。安全处理器在CPU上电后,通过内置验签密钥验证固件的完整性,并加载执行。安全处理器具有独立硬件资源和封闭的运行环境,是CPU内最高安全等级硬件,管理整个CPU的安全。安全内存隔离单元的内容仅安全处理器有权限配置和管理,主机操作系统无权读写。 +在虚拟机启动时,主机操作系统向安全处理器发送请求,安全处理器对安全内存隔离单元做初始配置。虚拟机运行期间,安全处理器固件更新安全内存隔离单元。虚拟机退出后,安全处理器清除安全内存隔离单元的配置。安全处理器会检查主机发来的配置请求是否合法,主机向安全处理器发起的任何非法配置请求,都会被拒绝。 +虚拟机整个生命周期内,安全内存隔离单元都在安全处理器的管理控制之下,保证了其配置的安全性。 + +**Virtual Machine Control Block(VMCB)保护** +Virtual Machine Control Block(VMCB)是虚拟机的控制信息块,保存了虚拟机ID(VMID)、虚拟机页表基地址、虚拟机寄存器页面基地址等控制信息,主机操作系统通过修改虚拟机控制信息能够影响并更改虚拟机内存数据。 +CSV3增加了对虚拟机控制信息的保护,安全处理器负责创建VMCB,并配置于安全内存隔离单元的硬件保护之下。主机操作系统无法更改CSV3虚拟机VMCB的内容。 +为更好地与主机操作系统的软件配合,CSV3创建了真实VMCB与影子VMCB页面。主机操作系统创建影子VMCB,填入控制信息,传递给安全处理器。安全处理器创建真实VMCB页面,复制影子VMCB中除虚拟机ID、虚拟机页表基地址、虚拟机寄存器页面基地址等关键信息之外的控制信息,并自行添加关键控制信息。虚拟机使用真实VMCB页面启动和运行,阻止了主机操作系统对虚拟机VMCB的攻击。 + +## 密码套件openHiTLS +openHiTLS旨在通过提供轻量化、可裁剪的软件技术架构及丰富的国际主流及中国商用密码算法、协议,满足云计算、大数据、AI、金融等多样化行业的安全需求。它具备算法先进、性能卓越、安全可靠、开放架构及多语言平滑兼容等特点,为开发者提供了一套安全、可扩展的密码解决方案。通过社区共建与生态建设,openHiTLS推动密码安全标准在各行各业的加速落地,同时构建以openEuler为核心的安全开源生态,为用户带来更加安全、可靠的数字环境。 + +**支持主流的密码协议和算法** +支持国际主流及中国商用密码算法和协议,可根据场景需求选择合适的密码算法和协议。 +- 支持中国商用密码算法:SM2、SM3、SM4等 +- 支持国际主流算法:AES、RSA、(EC)DSA、(EC)DH、SHA3、HMAC等 +- 支持GB/T 38636-2020 TLCP标准,即双证书国密通信协议 +- 
支持TLS1.2、TLS1.3、DTLS1.2协议 + +**开放架构实现全场景密码应用覆盖** +通过开放架构、技术创新及全产业链的应用实践,向产业界提供一站式全场景覆盖。 +- 灵活南、北向接口:北向统一接口,行业应用快速接入;南向设备抽象,广泛地运行在各类业务系统 +- 多语言平滑兼容:统一接口层(FFI)提供多语言兼容的能力,一套密码套件支持多种语言应用 +- 广泛的产品应用实践:全产业链应用场景密码技术实践,确保密码套件在不同场景下的高性能、高安全和高可靠 + +**分层分级解耦、按需裁剪,实现密码套件轻量化** +加密内存成本无法回避,分层分级解耦实现密码算法软件极致成本。 +- 分层分级解耦:TLS、证书、算法功能分层解耦、算法抽象、调度管理、算法原语分级解耦、高内聚低耦合、按需组合 +- 高级抽象接口:提供高级抽象接口,避免算法裁剪引入对外接口变更,降低软件成本前提下,保持对外接口不变 +- 极致成本:按需裁剪,特性依赖关系自动管理,可实现PBKDF2 + AES算法BIN20K、堆内存1K、栈内存256字节 + +**密码算法敏捷架构,应对后量子迁移** +通过密码算法敏捷架构、技术创新实现应用快速迁移和先进算法的快速演进。 +- 统一北向接口:提供对算法标准化、可扩展的接口,避免因算法切换造成接口变动而导致上层业务应用大范围适配新接口 +- 算法插件化管理框架:实现对算法插件化管理,算法Provider层支持算法运行时动态加载,提供热加载算法的能力 +- 算法使用配置化:支持通过配置文件获取算法信息,可避免算法标识代码硬编码 +## AI集群慢节点快速发现(Fail-slow Detection) +AI集群在训练过程中不可避免会发生性能劣化,导致性能劣化的原因很多且复杂。现有方案是在发生性能劣化之后利用日志分析,但是从日志收集到问题定界根因诊断以及现网闭环问题需要长达3-4天之久。基于上述痛点问题,我们设计了一套在线慢节点定界方案,该方案能够实时在线观测系统关键指标,并基于模型和数据驱动的算法对观测数据进行实时分析给出劣慢节点的位置,便于系统自愈或者运维人员修复问题。 +基于分组的指标对比技术提供了AI集群训练场景下的慢节点/慢卡检测能力。这项技术通过 gala-anteater实现,新增内容包括配置文件、算法库、慢节点空间维度对比算法和慢节点时间维度对比算法,最终输出慢节点异常时间、异常指标以及对应的慢节点/慢卡IP,从而提高系统的稳定性和可靠性。该服务主要功能如下: +- 配置文件:主要包括待观测指标类型、指标算法配置参数以及数据接口,用于初始化慢节点检测算法。 +- 算法库:包括常用的时序异常检测算法SPOT算法、k-sigma算法,以及异常节点聚类算法和相似度度量算法。 +- 数据:包括指标数据、作业拓扑数据以及通信域数据,指标数据表示指标的时序序列,作业拓扑数据表示训练作业所用的节点信息,通信域数据表示节点通信的连接关系,包括数据并行、张量并行和流水线并行等。 +- 指标分组对比:包括组内空间异常节点筛选和单节点时间异常筛选。组内空间异常节点筛选根据异常聚类算法输出异常节点;单节点时间异常筛选根据单节点历史数据进行时序异常检测判断节点是否异常。 +## rubik在离线混部调度协同增强 +云数据中心资源利用率低(< 20%)是行业普遍存在的问题,提升资源利用率已经成为了一个重要的技术课题。将业务区分优先级混合部署(简称混部)运行是典型有效的资源利用率提升手段。然而,将多种类型业务混合部署能够显著提升集群资源利用率,也带来了共峰问题,会导致关键业务服务质量(QoS)受损。因此,如何在提升资源利用率之后,保障业务 QoS 不受损是技术上的关键挑战。 +rubik 是 openEuler提供的容器混部引擎,提供一套自适应的单机算力调优和服务质量保障机制,旨在保障关键业务服务质量不下降的前提下,提升节点资源利用率。 +- Cache 及内存带宽控制:支持对低优先级虚拟机的 LLC 和内存带宽进行限制,当前仅支持静态分配。 +- CPU 干扰控制:支持CPU时间片us级抢占及SMT干扰隔离,同时具有防优先级反转能力。 +- 内存资源抢占:支持在节点OOM时优先杀死离线业务,从而保证在线业务的服务质量。 +- memcg异步内存回收:支持限制混部时离线应用使用的总内存,并在在线内存使用量增加时动态压缩离线业务内存使用。 +- QuotaBurst柔性限流:支持关键在线业务被 CPU 限流时允许短时间突破 limit 限制,保障在线业务运行的服务质量。 +- PSI 指标观测增强:支持 cgroup v1 级别的压力信息统计,识别和量化资源竞争导致的业务中断风险,支撑用户实现硬件资源利用率提升。 +- 
IOCost限制业务IO权重:支持限制混部时离线业务的磁盘读写速率,防止离线业务争抢在线业务的磁盘带宽,从而提升在线业务服务质量。 +- CPI指标观测:支持通过观察CPI指标统计当前节点的压力,识别并驱逐离线业务以保证在线应用的服务质量。 + +此版本新增以下特性: +- 节点CPU/内存干扰控制和驱逐:支持通过观测当前节点的CPU和内存水位,在资源紧张情况下通过驱逐离线业务,保证节点水位安全。 + +## CFGO反馈优化特性增强 +日益膨胀的代码体积导致当前处理器前端瓶颈成为普遍问题,影响程序运行性能。编译器反馈优化技术可以有效解决此类问题。 +CFGO(Continuous Feature Guided Optimization)是GCC for openEuler和毕昇编译器的反馈优化技术名,指多模态(源代码、二进制)、全生命周期(编译、链接、链接后、运行时、OS、库)的持续反馈优化,主要包括以下两类优化技术: + +- 代码布局优化:通过基本块重排、函数重排、冷热分区等技术,优化目标程序的二进制布局,提升i-cache和i-TLB命中率。 +- 高级编译器优化:内联、循环展开、向量化、间接调用提升等编译优化技术受益于反馈信息,能够使编译器执行更精确的优化决策。 + +GCC CFGO反馈优化共包含三个子特性:CFGO-PGO、CFGO-CSPGO、CFGO-BOLT,通过依次使能这些特性可以缓解处理器前端瓶颈,提升程序运行时性能。为了进一步提升优化效果,建议CFGO系列优化与链接时优化搭配使用,即在CFGO-PGO、CFGO-CSPGO优化过程中增加-flto=auto编译选项。 + +- CFGO-PGO + +CFGO-PGO在传统PGO优化的基础上,利用AI4C对部分优化遍进行增强,主要包括inline、常量传播、去虚化等优化,从而进一步提升性能。 +- CFGO-CSPGO + +PGO的profile对上下文不敏感,可能导致次优的优化效果。通过在PGO后增加一次CFGO-CSPGO插桩优化流程,收集inline后的程序运行信息,从而为代码布局和寄存器优化等编译器优化遍提供更准确的执行信息,实现性能进一步提升。 +- CFGO-BOLT + +CFGO-BOLT在基线版本的基础上,新增aarch64架构软件插桩、inline优化支持等优化,最终进一步提升性能。 + +## AI4C编译选项调优和AI编译优化提升典型应用性能 +AI4C(AI for Compiler)代表AI辅助编译优化套件,是一个使用AI技术优化编译选项和优化遍关键决策的软件框架,旨在突破当前编译器领域的两个关键业务挑战: +1. 性能提升困难:传统编译器优化开发周期长,新的编译优化技术与已有编译优化过程难以兼容达到1+1>=2的效果,导致性能提升无法达到预期效果。 +2. 
调优效率低下:硬件架构或者软件业务场景变更,需要根据新的负载条件投入大量人力成本,重新调整编译优化的成本模型,导致调优时间长。 + +AI4Compiler框架提供编译选项自动调优和AI模型辅助编译优化两个主要模块。 +框架上层调度模块驱动中层编译器核心优化过程,通过不同编译器各自的适配模块调用底层AI模型和模型推理引擎,以优化特性相关数据和硬件架构参数作为模型输入特征运行模型推理,获得编译过程关键参数最佳决策,从而实现编译优化。 +- 编译自动调优(Autotuner) +AI4C的自动调优基于OpenTuner(2015 Ansel et al.)开发,通过插件驱动编译器采集优化特性相关参数信息,通过搜索算法调整关键决策参数(例如循环展开系数),通过插件注入编译过程修改决策,运行编译输出二进制获得反馈因子,迭代自动调优。 +(1)已集成一系列搜索算法,动态选择算法并共享搜索进度; +(2)支持用户配置yaml自定义搜索空间和扩展底层搜索算法; +(3)支持细粒度代码块调优与粗粒度编译选项自动调优; +(4)在coremark、dhrystone、Cbench等benchmark上获得3%~5%不等的收益。 +- AI辅助编译优化(ACPO) +ACPO提供全面的工具、库、算法,为编译器工程师提供简单易用的接口使用AI模型能力,替代或增强编译器中启发式优化决策算法。在编译器优化过程中,使用插件提取优化遍的输入结构化数据作为模型输入特征,getAdvice运行预训练模型获得决策系数,编译器使用模型决策结果替代部分启发式决策,获得更好的性能。 +(1)解耦编译器与AI模型和推理引擎,帮助算法开发者专注AI算法模型开发,简化模型应用成本,同时兼容多个编译器、模型、AI推理框架等主流产品,提供AI模型的热更新能力; +(2)实践落地IPA Loop Inline、RTL Loop Unroll等不同优化阶段和优化过程,获得相对显著的收益。 +## RPM国密签名支持 +根据国内相关安全技术标准,在某些应用场景中需要采用国密算法实现对重要可执行程序来源的真实性和完整性保护。openEuler当前采用RPM格式的软件包管理,软件包签名算法基于openPGP签名规范。openEuler 24.03 LTS SP1版本基于RPM包管理机制扩展对于SM2签名算法和SM3摘要算法的支持。 + +本特性主要基于RPM组件以及其调用的GnuPG2签名工具,在现有openPGP签名体系的基础上,进行国密算法使能。在RPM软件包签名场景,用户可调用gpg命令生成SM2签名私钥和证书,并调用rpmsign命令为指定的RPM包添加基于SM2+SM3算法的数字签名。在RPM软件包验签场景,用户可调用rpm命令导入验签证书,并通过校验RPM包的数字签名信息从而验证软件包的真实性和完整性。 +## oneAPI 框架支持 +Unified Acceleration Foundation(UXL)正在推动构建开放的异构加速软件框架的标准化。其中oneAPI作为初始项目的目标是提供一种跨行业、开放、基于标准的统一编程模型,并为异构加速器(如 CPU、GPU、FPGA 和专用加速器)提供统一的开发体验。oneAPI规范扩展了现有的开发者编程模型,通过并行编程语言(Data Parallel C++)或者一组加速软件库以及底层的硬件抽象接口(Level Zero)来支持跨架构的编程,从而支持多种加速硬件和处理器平台。为了提供兼容性并提高开发效率,oneAPI 规范基于行业标准,提供了多种开放的、跨平台和易用的开发者软件套件。 + +为了在openEuler上完整地支持oneAPI,我们从openEuler 24.03 LTS开始分别集成了oneAPI的开发环境Basekit和运行态环境的Runtime容器镜像。并从openEuler 24.03 LTS SP1开始,openEuler原生支持了oneAPI系列底层框架库的适配和集成,包括oneAPI运行所需的各类依赖库、英特尔的图形加速编译器、OpenCL runtime,以及支持不同平台(x86_64和aarch64)的底层硬件抽象层(Level-Zero)等。同时为了完整支持Data Parallel C++和基于加速库的API的编程模式,我们也对oneAPI官方提供的各类软件包在openEuler上做了相应的适配和验证的工作,这样我们可以方便地通过在openEuler中增加oneAPI的官方DNF/YUM源来安装和更新所有的oneAPI的运行依赖、开发工具和调试工具等。 +## OpenVINO 支持 
+OpenVINO是一套开源的AI工具套件和运行库,其能用于优化几乎各类主流框架的深度学习模型,并在各种Intel处理器和加速器以及其他硬件平台如ARM上以最佳性能进行部署并高效提供AI服务。我们从openEuler 24.03 LTS SP1开始,提供了OpenVINO原生的适配和集成,从而在openEuler上提供了完整的OpenVINO的计算能力。 + +OpenVINO的模型转化功能可以将已经使用流行框架(如 TensorFlow、PyTorch、ONNX和PaddlePaddle等)训练的模型进行转换和优化,并在各类Intel处理器和加速器或者ARM的硬件环境中进行部署,包括在本地、设备端、浏览器或云端等场景下提供服务能力。 +## 鲲鹏KAE加速器 +鲲鹏加速引擎KAE(Kunpeng Accelerator Engine)是基于鲲鹏920处理器提供的硬件加速器解决方案,包含了KAE加解密和KAEzip。KAE加解密和KAEzip分别用于加速SSL(Secure Sockets Layer) / TLS(Transport Layer Security)应用和数据压缩,可以显著降低处理器消耗,提高处理器效率。此外,鲲鹏加速引擎对应用层屏蔽了其内部细节,用户通过OpenSSL、zlib标准接口即可实现快速迁移现有业务。 +KAE加解密是鲲鹏加速引擎的加解密模块,使用鲲鹏加速引擎实现RSA/SM3/SM4/DH/MD5/AES算法,结合无损用户态驱动框架,提供高性能对称加解密、非对称加解密算法能力,兼容OpenSSL 1.1.1x系列版本,支持同步&异步机制。 +KAEzip是鲲鹏加速引擎的压缩模块,使用鲲鹏硬加速模块实现deflate算法,结合无损用户态驱动框架,提供高性能Gzip/zlib格式压缩接口。通过加速引擎可以实现不同场景下应用性能的提升,例如在分布式存储场景下,通过zlib加速库加速数据压缩和解压。 \ No newline at end of file diff --git "a/docs/zh/docs/Releasenotes/\345\267\262\344\277\256\345\244\215\351\227\256\351\242\230.md" "b/docs/zh/docs/Releasenotes/\345\267\262\344\277\256\345\244\215\351\227\256\351\242\230.md" index 0f1e2ca427bd1e631b4c6eec858e20829fba8fd0..8afa5eda6283d38830aaf5733e6a34db92060b13 100644 --- "a/docs/zh/docs/Releasenotes/\345\267\262\344\277\256\345\244\215\351\227\256\351\242\230.md" +++ "b/docs/zh/docs/Releasenotes/\345\267\262\344\277\256\345\244\215\351\227\256\351\242\230.md" @@ -2,14 +2,412 @@ 完整问题清单请参见[完整问题清单](https://gitee.com/organizations/src-openeuler/issues)。 -完整的内核提交记录请参见[提交记录](https://gitee.com/openeuler/kernel/commits/openEuler-21.03)。 +已修复问题请参见下表。 -已修复问题请参见[表1](#table249714911433)。 +**表 1** 修复问题列表 - -| ISSUE |问题描述 | -|:--- |:---- | -|[I5J302](https://gitee.com/open_euler/dashboard?issue_id=I5J302)|【安装冲突arm/x86_64】openEuler:22.09分支fwupd与dbxtool包安装冲突| -|[I5J36Q](https://gitee.com/open_euler/dashboard?issue_id=I5J36Q)|【安装冲突 arm/x86_64】python3-wrapt在22.09分支存在安装冲突| -|[I5J3K1](https://gitee.com/open_euler/dashboard?issue_id=I5J3K1)|【安装冲突arm/x86_64】openEuler:22.09分支mariadb与mysql包安装冲突| +|ISSUE 
ID|关联仓库|问题描述|ISSUE 路径| +|-|-|-|-| +|IA4E07|gtk-doc|【x86/arm】gtk-doc软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4E07| +|IA4E50|qt5-qtenginio|【x86/arm】qt5-qtenginio-help软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4E50| +|IA4V4R|mathjax|【x86/arm】mathjax各个子包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4V4R| +|IA4VRO|libcrystalhd|【x86/arm】crystalhd-firmware软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4VRO| +|IA4VRW|mecab-ipadic|【x86/arm】mecab-ipadic、mecab-ipadic-EUCJP软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4VRW| +|IA4VU8|python-cssutils|【x86/arm】python3-cssutils、python-cssutils-help软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4VU8| +|IA4W5L|raspberrypi-eeprom|【x86/arm】raspberrypi-eeprom软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4W5L| +|IA4W60|raspberrypi-firmware|【x86/arm】raspberrypi-firmware软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4W60| +|IA4WCX|shared-desktop-ontologies|【x86/arm】shared-desktop-ontologies、shared-desktop-ontologies-devel软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IA4WCX| +|IAGS0Z|umdk|【x86/arm】umdk-urma-bin、umdk-urma-compat-hns-libumdk-urma-devel、umdk-urma-lib、umdk-urma-tools软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IAGS0Z| +|IAGS5K|tidb|【x86/arm】tidb、tidb-debuginfo、tidb-debugsource 软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IAGS5K| +|IAGS6O|tidb|【x86/arm】tidb、tidb-debuginfo、tidb-debugsource软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IAGS6O| +|IAGWFV|aisleriot|【x86/arm】aisleriot、aisleriot-debuginfo、aisleriot-debugsource软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IAGWFV| +|IAGX7M|libmd|【x86/arm】软件包libmd、libmd-devel的license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IAGX7M| 
+|IAGXT0|plasma-workspace|【x86/arm】plasma-workspace-doc软件包license信息识别审阅|https://gitee.com/open_euler/dashboard?issue_id=IAGXT0| +|IB2D21|python-lxml|【EulerMaker】python-lxml 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2D21| +|IB2DUG|libclc|【EulerMaker】libclc 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2DUG| +|IB2ET1|gtk-doc|【EulerMaker】gtk-doc 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2ET1| +|IB2HEA|python-sybil|【EulerMaker】python-sybil在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2HEA| +|IB2HF9|python-h5py|【EulerMaker】python-h5py 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2HF9| +|IB2HIR|spirv-llvm-translator|【EulerMaker】spirv-llvm-translator 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2HIR| +|IB2HNW|nodejs-commander|【EulerMaker】nodejs-commander 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2HNW| +|IB2HOQ|gcc-cross|【EulerMaker】gcc-cross 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2HOQ| +|IB2HQ0|phonon|【EulerMaker】phonon 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2HQ0| +|IB2IHP|mold|【EulerMaker】mold 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2IHP| +|IB2IIT|intel-ipp-crypto-mb|【EulerMaker】intel-ipp-crypto-mb 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2IIT| +|IB2IKX|intel-ipsec-mb|【EulerMaker】intel-ipsec-mb 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2IKX| +|IB2IO5|intel-qatzip|【EulerMaker】intel-ipp-crypto-mb 在openEuler-24.03-LTS-SP1:everything 
中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2IO5| +|IB2IOK|intel-qatengine|【EulerMaker】intel-ipp-crypto-mb 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2IOK| +|IB2LOS|kata-containers|【EulerMaker】kata-containers 在openEuler-24.03-LTS-SP1:everything 中构建失败|https://gitee.com/open_euler/dashboard?issue_id=IB2LOS| +|IB2VQS|trace-cmd|【24.03-LTS-SP1-alpha】【arm】卸载trace-cmd软件包出现报错:Failed to disable unit: Unit file trace-cmd.service does not exist.|https://gitee.com/open_euler/dashboard?issue_id=IB2VQS| +|IB2W03|crash-trace-command|【24.03-LTS-SP1-alpha】【arm】卸载crash-trace-command软件包出现报错:Failed to disable unit: Unit file trace-cmd.service does not exist.|https://gitee.com/open_euler/dashboard?issue_id=IB2W03| +|IB3RFB|moby|【24.03-LTS-SP1-alpha】【arm/x86】安装docker后执行dnf update,出现依赖错误|https://gitee.com/open_euler/dashboard?issue_id=IB3RFB| +|IB3RUN|python-commonmark|【openEuler-24.03-LTS-SP1-alpha】【arm/x86】 nodejs-commonmark和python3-commonmark存在安装冲突|https://gitee.com/open_euler/dashboard?issue_id=IB3RUN| +|IB3S3F|mandoc|【openEuler-24.03-LTS-SP1-alpha】【arm/x86】 mandoc和groff-help和man-pages-help存在安装冲突|https://gitee.com/open_euler/dashboard?issue_id=IB3S3F| +|IB3SMV|native-turbo|【24.03-SP1-alpha】【autotest】native-turbo 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SMV| +|IB3SMW|lxcfs-tools|【24.03-SP1-alpha】【autotest】lxcfs-tools 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SMW| +|IB3SMY|python-zmq|【24.03-SP1-alpha】【autotest】python-zmq 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SMY| +|IB3SMZ|gcc|【24.03-SP1-alpha】【autotest】gcc 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SMZ| +|IB3SN1|python-inotify|【24.03-SP1-alpha】【autotest】python-inotify 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN1| 
+|IB3SN2|python-memcached|【24.03-SP1-alpha】【autotest】python-memcached 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN2| +|IB3SN3|i2c-tools|【24.03-SP1-alpha】【autotest】i2c-tools 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN3| +|IB3SN4|pytorch|【24.03-SP1-alpha】【autotest】pytorch 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN4| +|IB3SN5|secGear|【24.03-SP1-alpha】【autotest】secGear 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN5| +|IB3SN6|ghostscript|【24.03-SP1-alpha】【autotest】ghostscript 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN6| +|IB3SN7|qpdf|【24.03-SP1-alpha】【autotest】qpdf 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SN7| +|IB3SNA|check|【24.03-SP1-alpha】【autotest】check 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNA| +|IB3SNB|enchant2|【24.03-SP1-alpha】【autotest】enchant2 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNB| +|IB3SNC|python-XlsxWriter|【24.03-SP1-alpha】【autotest】python-XlsxWriter 包在24.03-LTS-SP1中相比24.03-LTS version版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNC| +|IB3SND|edk2|【24.03-SP1-alpha】【autotest】edk2 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SND| +|IB3SNF|kata-containers|【24.03-SP1-alpha】【autotest】kata-containers 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNF| +|IB3SNG|libstoragemgmt|【24.03-SP1-alpha】【autotest】libstoragemgmt 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNG| +|IB3SNH|python-pythran|【24.03-SP1-alpha】【autotest】python-pythran 包在24.03-LTS-SP1中相比24.03-LTS 
release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNH| +|IB3SNI|librdkafka|【24.03-SP1-alpha】【autotest】librdkafka 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNI| +|IB3SNJ|oec-hardware|【24.03-SP1-alpha】【autotest】oec-hardware 包在24.03-LTS-SP1中相比24.03-LTS version版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNJ| +|IB3SNL|python-hamcrest|【24.03-SP1-alpha】【autotest】python-hamcrest 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNL| +|IB3SNM|kmod-kvdo|【24.03-SP1-alpha】【autotest】kmod-kvdo 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNM| +|IB3SNO|sscg|【24.03-SP1-alpha】【autotest】sscg 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNO| +|IB3SNP|python-pydantic|【24.03-SP1-alpha】【autotest】python-pydantic 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNP| +|IB3SNQ|python-eventlet|【24.03-SP1-alpha】【autotest】python-eventlet 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNQ| +|IB3SNR|gstreamer1-plugins-bad-free|【24.03-SP1-alpha】【autotest】gstreamer1-plugins-bad-free 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNR| +|IB3SNS|pybind11|【24.03-SP1-alpha】【autotest】pybind11 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNS| +|IB3SNU|tftp|【24.03-SP1-alpha】【autotest】tftp 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNU| +|IB3SNV|lwip|【24.03-SP1-alpha】【autotest】lwip 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNV| +|IB3SNW|tpm2-tss|【24.03-SP1-alpha】【autotest】tpm2-tss 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNW| 
+|IB3SNX|python-certifi|【24.03-SP1-alpha】【autotest】python-certifi 包在24.03-LTS-SP1中相比24.03-LTS version版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNX| +|IB3SNZ|automake|【24.03-SP1-alpha】【autotest】automake 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SNZ| +|IB3SO1|python-jedi|【24.03-SP1-alpha】【autotest】python-jedi 包在24.03-LTS-SP1中相比24.03-LTS version版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO1| +|IB3SO2|lftp|【24.03-SP1-alpha】【autotest】lftp 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO2| +|IB3SO3|autoconf|【24.03-SP1-alpha】【autotest】autoconf 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO3| +|IB3SO4|memleax|【24.03-SP1-alpha】【autotest】memleax 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO4| +|IB3SO5|grub2|【24.03-SP1-alpha】【autotest】grub2 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO5| +|IB3SO6|jetty|【24.03-SP1-alpha】【autotest】jetty 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO6| +|IB3SO7|gala-gopher|【24.03-SP1-alpha】【autotest】gala-gopher 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO7| +|IB3SO8|kiwi|【24.03-SP1-alpha】【autotest】kiwi 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SO8| +|IB3SOA|inih|【24.03-SP1-alpha】【autotest】inih 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOA| +|IB3SOB|containerd|【24.03-SP1-alpha】【autotest】containerd 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOB| +|IB3SOC|openEuler-repos|【24.03-SP1-alpha】【autotest】openEuler-repos 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOC| +|IB3SOD|criu|【24.03-SP1-alpha】【autotest】criu 
包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOD| +|IB3SOE|python-filelock|【24.03-SP1-alpha】【autotest】python-filelock 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOE| +|IB3SOF|uadk_engine|【24.03-SP1-alpha】【autotest】uadk_engine 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOF| +|IB3SOG|autofdo|【24.03-SP1-alpha】【autotest】autofdo 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOG| +|IB3SOJ|dhcp|【24.03-SP1-alpha】【autotest】dhcp 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOJ| +|IB3SOL|python-iniparse|【24.03-SP1-alpha】【autotest】python-iniparse 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOL| +|IB3SOM|gazelle|【24.03-SP1-alpha】【autotest】gazelle 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOM| +|IB3SON|bcc|【24.03-SP1-alpha】【autotest】bcc 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SON| +|IB3SOO|ethtool|【24.03-SP1-alpha】【autotest】ethtool 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOO| +|IB3SOP|rootsh|【24.03-SP1-alpha】【autotest】rootsh 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOP| +|IB3SOQ|python-moto|【24.03-SP1-alpha】【autotest】python-moto 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOQ| +|IB3SOR|kylin-usb-creator|【24.03-SP1-alpha】【autotest】kylin-usb-creator 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOR| +|IB3SOS|dtkgui|【24.03-SP1-alpha】【autotest】dtkgui 包在24.03-LTS-SP1中相比24.03-LTS version版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOS| +|IB3SOT|kylin-burner|【24.03-SP1-alpha】【autotest】kylin-burner 
包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOT| +|IB3SOU|ukui-session-manager|【24.03-SP1-alpha】【autotest】ukui-session-manager 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOU| +|IB3SOV|ukui-menu|【24.03-SP1-alpha】【autotest】ukui-menu 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOV| +|IB3SOW|k3s-containerd|【24.03-SP1-alpha】【autotest】k3s-containerd 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOW| +|IB3SOX|ukui-greeter|【24.03-SP1-alpha】【autotest】ukui-greeter 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOX| +|IB3SOY|python-faust|【24.03-SP1-alpha】【autotest】python-faust 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SOY| +|IB3SP0|cri-tools|【24.03-SP1-alpha】【autotest】cri-tools 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP0| +|IB3SP2|ukui-bluetooth|【24.03-SP1-alpha】【autotest】ukui-bluetooth 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP2| +|IB3SP3|ukui-panel|【24.03-SP1-alpha】【autotest】ukui-panel 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP3| +|IB3SP5|distributed-build|【24.03-SP1-alpha】【autotest】distributed-build 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP5| +|IB3SP6|apptainer|【24.03-SP1-alpha】【autotest】apptainer 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP6| +|IB3SP7|ukui-system-monitor|【24.03-SP1-alpha】【autotest】ukui-system-monitor 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP7| +|IB3SP8|ukui-screensaver|【24.03-SP1-alpha】【autotest】ukui-screensaver 包在24.03-LTS-SP1中相比24.03-LTS 
release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SP8| +|IB3SPA|indicator-china-weather|【24.03-SP1-alpha】【autotest】indicator-china-weather 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPA| +|IB3SPC|peony-extensions|【24.03-SP1-alpha】【autotest】peony-extensions 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPC| +|IB3SPE|dde|【24.03-SP1-alpha】【autotest】dde 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPE| +|IB3SPF|kylin-recorder|【24.03-SP1-alpha】【autotest】kylin-recorder 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPF| +|IB3SPG|kiran-icon-theme|【24.03-SP1-alpha】【autotest】kiran-icon-theme 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPG| +|IB3SPH|python-blurb|【24.03-SP1-alpha】【autotest】python-blurb 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPH| +|IB3SPI|cri-o|【24.03-SP1-alpha】【autotest】cri-o 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPI| +|IB3SPK|distributed-build_lite|【24.03-SP1-alpha】【autotest】distributed-build_lite 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPK| +|IB3SPL|migration-tools|【24.03-SP1-alpha】【autotest】migration-tools 包在24.03-LTS-SP1中相比24.03-LTS version版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPL| +|IB3SPM|ukui-search|【24.03-SP1-alpha】【autotest】ukui-search 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPM| +|IB3SPN|ukui-control-center|【24.03-SP1-alpha】【autotest】ukui-control-center 包在24.03-LTS-SP1中相比24.03-LTS release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPN| +|IB3SPO|kylin-nm|【24.03-SP1-alpha】【autotest】kylin-nm 包在24.03-LTS-SP1中相比24.03-LTS 
release版本降级|https://gitee.com/open_euler/dashboard?issue_id=IB3SPO| +|IB3SPP|kylin-calculator|【24.03-SP1-alpha】【autotest】kylin-calculator package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPP| +|IB3SPQ|python-pytimeparse|【24.03-SP1-alpha】【autotest】python-pytimeparse package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPQ| +|IB3SPR|peony|【24.03-SP1-alpha】【autotest】peony package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPR| +|IB3SPS|ukui-clock|【24.03-SP1-alpha】【autotest】ukui-clock package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPS| +|IB3SPT|python-jose|【24.03-SP1-alpha】【autotest】python-jose package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPT| +|IB3SPU|python-pytest-metadata|【24.03-SP1-alpha】【autotest】python-pytest-metadata package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPU| +|IB3SPV|python-asgiref|【24.03-SP1-alpha】【autotest】python-asgiref package in 24.03-LTS-SP1 is a version downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPV| +|IB3SPW|kylin-scanner|【24.03-SP1-alpha】【autotest】kylin-scanner package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPW| +|IB3SPY|libkysdk-applications|【24.03-SP1-alpha】【autotest】libkysdk-applications package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SPY| +|IB3SQ0|kylin-music|【24.03-SP1-alpha】【autotest】kylin-music package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SQ0| +|IB3SQ1|ukui-notification-daemon|【24.03-SP1-alpha】【autotest】ukui-notification-daemon package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SQ1| +|IB3SQ2|k3s-selinux|【24.03-SP1-alpha】【autotest】k3s-selinux package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3SQ2| +|IB3TSL|python-parse-type|【openEuler-24.03-LTS-SP1-alpha】【arm/x86】Installation conflict between python3-parse_type and python3-parse-type|https://gitee.com/open_euler/dashboard?issue_id=IB3TSL| +|IB3U4K|python-fonttools|【24.03-SP1-alpha】【autotest】fonttools package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U4K| +|IB3U5L|openjfx11|【24.03-SP1-alpha】【autotest】openjfx package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U5L| +|IB3U5Z|etcd|【24.03-SP1-alpha】【autotest】etcd package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U5Z| +|IB3U66|KubeOS|【24.03-SP1-alpha】【autotest】KubeOS package in 24.03-LTS-SP1 is a version downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U66| +|IB3U6G|xdp-tools|【24.03-SP1-alpha】【autotest】xdp-tools package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U6G| +|IB3U6O|perl-Text-BibTeX|【24.03-SP1-alpha】【autotest】perl-Text-BibTeX package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U6O| +|IB3U6X|jboss-servlet-2.5-api|【24.03-SP1-alpha】【autotest】jboss-servlet-2.5-api package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U6X| +|IB3U77|openapi-schema-validator|【24.03-SP1-alpha】【autotest】openapi-schema-validator package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB3U77| +|IB44JW|moby|【openEuler-24.03-LTS-SP1-alpha】【arm/x86】Installation conflict between libnetwork and docker-engine-25.0.3-17.oe2403sp1|https://gitee.com/open_euler/dashboard?issue_id=IB44JW| +|IB47CH|netavark|【EulerMaker】netavark fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB47CH| +|IB47EL|tomcatjss|【EulerMaker】tomcatjss fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB47EL| +|IB47GD|firefox|【EulerMaker】firefox fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB47GD| +|IB47IK|osinfo-db-tools|【EulerMaker】osinfo-db-tools fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB47IK| +|IB4DCG|fwupd|【24.03_LTS_SP1_alpha】【x86/arm】fwupd is built with the insecure compile option RPATH|https://gitee.com/open_euler/dashboard?issue_id=IB4DCG| +|IB4VTI|commonlibrary_c_utils|【EulerMaker】commonlibrary_c_utils fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VTI| +|IB4VYF|communication_ipc|【EulerMaker】communication_ipc fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYF| +|IB4VYG|distributedhardware_device_manager|【EulerMaker】distributedhardware_device_manager fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYG| +|IB4VYH|nautilus-python|【EulerMaker】nautilus-python fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYH| +|IB4VYI|oceanbase-ce|【EulerMaker】oceanbase-ce fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYI| +|IB4VYJ|notification_eventhandler|【EulerMaker】notification_eventhandler fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYJ| +|IB4VYK|security_dataclassification|【EulerMaker】security_dataclassification fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYK| +|IB4VYL|security_device_auth|【EulerMaker】security_device_auth fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYL| +|IB4VYM|security_device_security_level|【EulerMaker】security_device_security_level fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYM| +|IB4VYN|security_huks|【EulerMaker】security_huks fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYN| 
+|IB4VYO|systemabilitymgr_safwk|【EulerMaker】systemabilitymgr_safwk fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYO| +|IB4VYP|systemabilitymgr_samgr|【EulerMaker】systemabilitymgr_samgr fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4VYP| +|IB4W1O|butane|【EulerMaker】butane fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1O| +|IB4W1P|cinnamon|【EulerMaker】cinnamon fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1P| +|IB4W1Q|dde-launcher|【EulerMaker】dde-launcher fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1Q| +|IB4W1R|dde-session-shell|【EulerMaker】dde-session-shell fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1R| +|IB4W1S|dpu-utilities|【EulerMaker】dpu-utilities fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1S| +|IB4W1U|dsoftbus|【EulerMaker】dsoftbus fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1U| +|IB4W1V|ffmpegthumbnailer|【EulerMaker】ffmpegthumbnailer fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1V| +|IB4W1W|kde-connect|【EulerMaker】kde-connect fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1W| +|IB4W1X|kf5-akonadi-mime|【EulerMaker】kf5-akonadi-mime fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1X| +|IB4W1Y|kf5-akonadi-server|【EulerMaker】kf5-akonadi-server fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1Y| +|IB4W1Z|kf5-bluez-qt|【EulerMaker】kf5-bluez-qt fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W1Z| +|IB4W20|kf5-kactivities|【EulerMaker】kf5-kactivities fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W20| +|IB4W21|kf5-kauth|【EulerMaker】kf5-kauth fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W21| +|IB4W22|kf5-kconfig|【EulerMaker】kf5-kconfig fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W22| +|IB4W23|kf5-khtml|【EulerMaker】kf5-khtml fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W23| +|IB4W24|kf5-kimap|【EulerMaker】kf5-kimap fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W24| +|IB4W25|kf5-kirigami2|【EulerMaker】kf5-kirigami2 fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W25| +|IB4W26|kf5-kjs|【EulerMaker】kf5-kjs fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W26| +|IB4W27|kf5-knotifications|【EulerMaker】kf5-knotifications fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W27| +|IB4W28|kf5-knotifyconfig|【EulerMaker】kf5-knotifyconfig fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W28| +|IB4W29|kf5-mailcommon|【EulerMaker】kf5-mailcommon fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W29| +|IB4W2A|kf5-solid|【EulerMaker】kf5-solid fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2A| +|IB4W2B|kf5-syndication|【EulerMaker】kf5-syndication fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2B| +|IB4W2C|Kmesh|【EulerMaker】Kmesh fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2C| +|IB4W2D|kuserfeedback|【EulerMaker】kuserfeedback fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2D| +|IB4W2E|kwin|【EulerMaker】kwin fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2E| 
+|IB4W2F|kylin-screenshot|【EulerMaker】kylin-screenshot fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2F| +|IB4W2G|libksysguard|【EulerMaker】libksysguard fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2G| +|IB4W2H|mate-power-manager|【EulerMaker】mate-power-manager fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2H| +|IB4W2I|octave|【EulerMaker】octave fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2I| +|IB4W2J|okular|【EulerMaker】okular fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2J| +|IB4W2K|ovirt-engine-ui-extensions|【EulerMaker】ovirt-engine-ui-extensions fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2K| +|IB4W2M|plasma-workspace|【EulerMaker】plasma-workspace fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2M| +|IB4W2N|polkit-kde|【EulerMaker】polkit-kde fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2N| +|IB4W2O|python-plum-py|【EulerMaker】python-plum-py fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2O| +|IB4W2P|python-typeguard|【EulerMaker】python-typeguard fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2P| +|IB4W2Q|qtav|【EulerMaker】qtav fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2Q| +|IB4W2R|zram-generator|【EulerMaker】zram-generator fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2R| +|IB4W2S|ukui-kwin|【EulerMaker】ukui-kwin fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB4W2S| +|IB4YJK|stratovirt|[openEuler-24.03-LTS-SP1-RC1][arm] Starting a VM with the image shipped in the repository fails|https://gitee.com/open_euler/dashboard?issue_id=IB4YJK| 
+|IB519M|CBS|【2024-1230】Simplify k8s deployment; documentation updated accordingly|https://gitee.com/open_euler/dashboard?issue_id=IB519M| +|IB54RM|hiviewdfx_hilog|【EulerMaker】hiviewdfx_hilog fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB54RM| +|IB5662|epkg_manager|【2024-1230】Installing an epkg package fails in the main environment|https://gitee.com/open_euler/dashboard?issue_id=IB5662| +|IB57E7|epkg_manager|【2024-1230】In EPKG global mode, the user should not keep the common environment|https://gitee.com/open_euler/dashboard?issue_id=IB57E7| +|IB57N7|epkg_manager|【2024-1230】Usability issues during epkg installation and use|https://gitee.com/open_euler/dashboard?issue_id=IB57N7| +|IB5NFO|sysmaster|【24.03 SP1】In a VM, after yum install sysmaster the system is unusable on reboot; reboot works again after uninstalling|https://gitee.com/open_euler/dashboard?issue_id=IB5NFO| +|IB5OPE|epkg_manager|【2024-1230】After a user-mode install as root, running init as a normal user exits abnormally|https://gitee.com/open_euler/dashboard?issue_id=IB5OPE| +|IB5R1U|kae_driver|【24.03-SP1-rc1】【autotest】kae_driver package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R1U| +|IB5R1W|selinux-policy|【24.03-SP1-rc1】【autotest】selinux-policy package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R1W| +|IB5R1X|python-mypy|【24.03-SP1-rc1】【autotest】python-mypy package in 24.03-LTS-SP1 is a version downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R1X| +|IB5R1Y|shadow|【24.03-SP1-rc1】【autotest】shadow package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R1Y| +|IB5R21|hadoop|【24.03-SP1-rc1】【autotest】hadoop package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R21| +|IB5R22|intel-sgx-ssl|【24.03-SP1-rc1】【autotest】intel-sgx-ssl package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R22| +|IB5R24|xorg-x11-drv-nouveau|【24.03-SP1-rc1】【autotest】xorg-x11-drv-nouveau package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R24| +|IB5R25|libnetfilter_queue|【24.03-SP1-rc1】【autotest】libnetfilter_queue package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R25| +|IB5R26|xorg-x11-xauth|【24.03-SP1-rc1】【autotest】xorg-x11-xauth package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R26| +|IB5R27|lsscsi|【24.03-SP1-rc1】【autotest】lsscsi package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R27| +|IB5R28|xorg-x11-drv-fbdev|【24.03-SP1-rc1】【autotest】xorg-x11-drv-fbdev package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R28| +|IB5R29|mock|【24.03-SP1-rc1】【autotest】mock package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R29| +|IB5R2A|qt6-qtquick3dphysics|【24.03-SP1-rc1】【autotest】qt6-qtquick3dphysics package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2A| +|IB5R2C|kata_integration|【24.03-SP1-rc1】【autotest】kata_integration package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2C| +|IB5R2D|lldpad|【24.03-SP1-rc1】【autotest】lldpad package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2D| +|IB5R2E|protobuf|【24.03-SP1-rc1】【autotest】protobuf package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2E| +|IB5R2F|spice-vdagent|【24.03-SP1-rc1】【autotest】spice-vdagent package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2F| +|IB5R2G|containernetworking-plugins|【24.03-SP1-rc1】【autotest】containernetworking-plugins package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2G| +|IB5R2I|libepoxy|【24.03-SP1-rc1】【autotest】libepoxy package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2I| +|IB5R2K|crash|【24.03-SP1-rc1】【autotest】crash package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2K| 
+|IB5R2N|libXtst|【24.03-SP1-rc1】【autotest】libXtst package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2N| +|IB5R2O|libteam|【24.03-SP1-rc1】【autotest】libteam package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2O| +|IB5R2P|ipmitool|【24.03-SP1-rc1】【autotest】ipmitool package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2P| +|IB5R2Q|musl|【24.03-SP1-rc1】【autotest】musl package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2Q| +|IB5R2R|xorg-x11-drv-evdev|【24.03-SP1-rc1】【autotest】xorg-x11-drv-evdev package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2R| +|IB5R2S|kexec-tools|【24.03-SP1-rc1】【autotest】kexec-tools package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2S| +|IB5R2W|supermin|【24.03-SP1-rc1】【autotest】supermin package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2W| +|IB5R2X|openjdk-1.8.0|【24.03-SP1-rc1】【autotest】openjdk-1.8.0 package in 24.03-LTS-SP1 is a version downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2X| +|IB5R2Y|openjfx8|【24.03-SP1-rc1】【autotest】openjfx8 package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2Y| +|IB5R2Z|curl|【24.03-SP1-rc1】【autotest】curl package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R2Z| +|IB5R31|kpatch|【24.03-SP1-rc1】【autotest】kpatch package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R31| +|IB5R32|rpmrebuild|【24.03-SP1-rc1】【autotest】rpmrebuild package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R32| +|IB5R33|golang|【24.03-SP1-rc1】【autotest】golang package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R33| +|IB5R35|open-isns|【24.03-SP1-rc1】【autotest】open-isns package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R35| +|IB5R36|thai-scalable-fonts|【24.03-SP1-rc1】【autotest】thai-scalable-fonts package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R36| +|IB5R37|keybinder3|【24.03-SP1-rc1】【autotest】keybinder3 package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R37| +|IB5R38|coreutils|【24.03-SP1-rc1】【autotest】coreutils package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R38| +|IB5R3B|lxc|【24.03-SP1-rc1】【autotest】lxc package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3B| +|IB5R3C|secpaver|【24.03-SP1-rc1】【autotest】secpaver package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3C| +|IB5R3F|dmidecode|【24.03-SP1-rc1】【autotest】dmidecode package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3F| +|IB5R3L|openssl|【24.03-SP1-rc1】【autotest】openssl package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3L| +|IB5R3O|iotop|【24.03-SP1-rc1】【autotest】iotop package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3O| +|IB5R3R|sssd|【24.03-SP1-rc1】【autotest】sssd package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3R| +|IB5R3T|xorg-x11-drv-v4l|【24.03-SP1-rc1】【autotest】xorg-x11-drv-v4l package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3T| +|IB5R3U|efivar|【24.03-SP1-rc1】【autotest】efivar package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3U| +|IB5R3V|lua|【24.03-SP1-rc1】【autotest】lua package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3V| +|IB5R3W|libXxf86vm|【24.03-SP1-rc1】【autotest】libXxf86vm package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3W| +|IB5R3X|cryfs|【24.03-SP1-rc1】【autotest】cryfs package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3X| +|IB5R3Y|libchardet|【24.03-SP1-rc1】【autotest】libchardet package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3Y| +|IB5R3Z|gnome-settings-daemon|【24.03-SP1-rc1】【autotest】gnome-settings-daemon package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R3Z| +|IB5R40|mate-control-center|【24.03-SP1-rc1】【autotest】mate-control-center package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R40| +|IB5R43|dtkcommon|【24.03-SP1-rc1】【autotest】dtkcommon package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R43| +|IB5R44|python-pyeclib|【24.03-SP1-rc1】【autotest】python-pyeclib package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R44| +|IB5R46|deepin-clone|【24.03-SP1-rc1】【autotest】deepin-clone package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R46| +|IB5R48|kubekey|【24.03-SP1-rc1】【autotest】kubekey package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R48| +|IB5R49|cppzmq|【24.03-SP1-rc1】【autotest】cppzmq package in 24.03-LTS-SP1 is a version downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R49| +|IB5R4A|pushgateway|【24.03-SP1-rc1】【autotest】pushgateway package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4A| +|IB5R4B|deepin-log-viewer|【24.03-SP1-rc1】【autotest】deepin-log-viewer package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4B| +|IB5R4C|qt5integration|【24.03-SP1-rc1】【autotest】qt5integration package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4C| +|IB5R4D|pigpio|【24.03-SP1-rc1】【autotest】pigpio package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4D| +|IB5R4E|dde-calendar|【24.03-SP1-rc1】【autotest】dde-calendar package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4E| +|IB5R4G|tidb|【24.03-SP1-rc1】【autotest】tidb package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4G| +|IB5R4I|python-jaeger-client|【24.03-SP1-rc1】【autotest】python-jaeger-client package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4I| +|IB5R4K|mate-menus|【24.03-SP1-rc1】【autotest】mate-menus package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4K| +|IB5R4M|deepin-compressor|【24.03-SP1-rc1】【autotest】deepin-compressor package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5R4M| +|IB5RII|A-Tune|【24.03-SP1-rc1】【autotest】A-Tune package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5RII| +|IB5RJK|cscope|【24.03-SP1-rc1】【autotest】cscope package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5RJK| +|IB5RKT|rubygem-actionpack|【24.03-SP1-rc1】【autotest】rubygem-actionpack package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB5RKT| +|IB5SQT|scap-security-guide|【openEuler-24.03-LTS-SP1】The content_rule_configure_dump_journald_log rule is problematic|https://gitee.com/open_euler/dashboard?issue_id=IB5SQT| +|IB5T8Q|bash|【openEuler-24.03-LTS-SP1】An ISO image built with bash 5.2.15-11 fails to boot|https://gitee.com/open_euler/dashboard?issue_id=IB5T8Q| +|IB5XUK|rubik|【openEuler-24.03-LTS-SP1-round1】While verifying the rubik cpuevict feature, rubik keeps printing error logs|https://gitee.com/open_euler/dashboard?issue_id=IB5XUK| +|IB5Z0F|hplip|【24.03-SP1-rc1】【arm/x86】Local rebuild of the hplip source package fails|https://gitee.com/open_euler/dashboard?issue_id=IB5Z0F| +|IB60DF|gcc|【EulerMaker】gcc-12.3.1-37 is missing header files, causing multiple packages to fail to build|https://gitee.com/open_euler/dashboard?issue_id=IB60DF| 
+|IB66JT|systemtap|【openEuler-24.03-LTS-SP1 rc1】【autotest】【arm】The stap-exporter.service service reports errors on startup|https://gitee.com/open_euler/dashboard?issue_id=IB66JT| +|IB6709|kylin-video|【EulerMaker】kylin-video fails to build in openEuler-24.03-LTS-SP1:epol|https://gitee.com/open_euler/dashboard?issue_id=IB6709| +|IB6DM2|oeAware-manager|Commands such as sar are unavailable because the sysstat dependency is missing|https://gitee.com/open_euler/dashboard?issue_id=IB6DM2| +|IB6Q5P|scap-security-guide|【openEuler-24.03-LTS-SP1】The Build and Test AIDE Database rule check fails|https://gitee.com/open_euler/dashboard?issue_id=IB6Q5P| +|IB6R4K|xorg-x11-server|【openEuler-24.03-sp1-rc2】【x86】UEFI-mode installation enters text mode instead of the graphical installer|https://gitee.com/open_euler/dashboard?issue_id=IB6R4K| +|IB6UPE|dde-control-center|【24.03-SP1-rc2】【autotest】dde-control-center package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB6UPE| +|IB6UQ2|kylin-video|【24.03-SP1-rc2】【autotest】kylin-video package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB6UQ2| +|IB6USO|kylin-screenshot|【24.03-SP1-rc2】【autotest】kylin-screenshot package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB6USO| +|IB6UU1|dde-dock|【24.03-SP1-rc2】【autotest】dde-dock package in 24.03-LTS-SP1 is a release downgrade from 24.03-LTS|https://gitee.com/open_euler/dashboard?issue_id=IB6UU1| +|IB6UV0|three-eight-nine-ds-base|[24.03-LTS-SP1-RC2] dsctl localhost cockpit open-firewall reports an error|https://gitee.com/open_euler/dashboard?issue_id=IB6UV0| +|IB6WYQ|python-commonmark|【openEuler-24.03-LTS-SP1-round2】【arm/x86】nodejs-commonmark and python3-commonmark both ship /usr/bin/commonmark, causing an installation conflict|https://gitee.com/open_euler/dashboard?issue_id=IB6WYQ| +|IB6XNG|spdk|【EulerMaker】spdk fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB6XNG| +|IB6XSB|libxcvt|【openEuler-24.03-LTS-SP1-round2】【arm/x86】xorg-x11-server-help and cvt both ship /usr/share/man/man1/cvt.1.gz, causing an installation conflict|https://gitee.com/open_euler/dashboard?issue_id=IB6XSB| 
+|IB6XT5|scap-security-guide|【openEuler-24.03-LTS-SP1】Scanning against the rule "disk space thresholds must be configured correctly" is problematic|https://gitee.com/open_euler/dashboard?issue_id=IB6XT5| +|IB6XVI|mandoc|【openEuler-24.03-LTS-SP1-round2】【arm/x86】Installation conflicts between mandoc and groff-help, and between mandoc and man-pages-help|https://gitee.com/open_euler/dashboard?issue_id=IB6XVI| +|IB6ZHB|gcc-cross|【24.03-SP1-rc2】【arm/x86】Local rebuild of the gcc-cross source package fails in the build phase: build.sh does not exist|https://gitee.com/open_euler/dashboard?issue_id=IB6ZHB| +|IB6ZKC|python-dns|【24.03-SP1-rc2】【arm/x86】Local rebuild of the python-dns source package fails in the check phase|https://gitee.com/open_euler/dashboard?issue_id=IB6ZKC| +|IB71JF|kata-containers|【24.03-SP1-rc2】【autotest】kata-containers binary package versions differ between the arm and x86 architectures|https://gitee.com/open_euler/dashboard?issue_id=IB71JF| +|IB71L7|three-eight-nine-ds-base|[24.03-LTS-SP1-RC2] dsidm -b "dc=example,dc=com" localhost role subtree-status dc=example,dc=com reports an error|https://gitee.com/open_euler/dashboard?issue_id=IB71L7| +|IB73JZ|clevis|[24.03-LTS-SP1] Issues with the tang command|https://gitee.com/open_euler/dashboard?issue_id=IB73JZ| +|IB746P|powerapi|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm】Calling the SetSmartGridLevel API prints "SetSmartGridState succeed"|https://gitee.com/open_euler/dashboard?issue_id=IB746P| +|IB74HE|autofdo|[24.03-LTS-SP1 RC2] dump_gcov reports an error when used with --gcov_version=2|https://gitee.com/open_euler/dashboard?issue_id=IB74HE| +|IB76HT|ods|【2024-1230】Offline login: the /api/modify-auth-code API succeeds in changing the password even when the new and old passwords are identical|https://gitee.com/open_euler/dashboard?issue_id=IB76HT| +|IB77RL|aops-hermes|[24.03-LTS-SP1-RC2][arm/x86] Host group name validation is incorrect when creating a host group|https://gitee.com/open_euler/dashboard?issue_id=IB77RL| +|IB7C0A|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager, oeawarectl -e smc_tune reports enable success, but the instance state remains smc_tune(available, close)|https://gitee.com/open_euler/dashboard?issue_id=IB7C0A| +|IB7E17|oeAware-manager|SDK permission issue and service hang|https://gitee.com/open_euler/dashboard?issue_id=IB7E17| +|IB7FDZ|aops-hermes|[24.03-LTS-SP1-RC2][arm/x86] In script management, clicking "New Script" after editing a script opens the "Modify Script" page|https://gitee.com/open_euler/dashboard?issue_id=IB7FDZ| 
+|IB7FSL|aops-zeus|[24.03-LTS-SP1-RC2][arm/x86] An operation name can be changed to a value that already exists|https://gitee.com/open_euler/dashboard?issue_id=IB7FSL| +|IB7GCE|eagle|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing eagle, the log file contains a typo that should read "service"|https://gitee.com/open_euler/dashboard?issue_id=IB7GCE| +|IB7HS0|nototools|[24.03-LTS-SP1 RC2][autotest] Some commands are unusable|https://gitee.com/open_euler/dashboard?issue_id=IB7HS0| +|IB7JLG|intel-compute-runtime|【EulerMaker】intel-compute-runtime fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB7JLG| +|IB7JNQ|mailman|【EulerMaker】mailman fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB7JNQ| +|IB7JPP|pipewire|【EulerMaker】pipewire fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB7JPP| +|IB7KB5|secpaver|【openEuler-24.03-LTS-SP1】When configuring the dim feature with the tool, baseline creation does not consider ShangMi (SM) algorithms|https://gitee.com/open_euler/dashboard?issue_id=IB7KB5| +|IB7KEB|iSulad|【EulerMaker】iSulad fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB7KEB| +|IB7KXY|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager, the output of oeawarectl -l libsystem_tune.so is missing a line break|https://gitee.com/open_euler/dashboard?issue_id=IB7KXY| +|IB7LC7|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager, running oeawarectl -e xcall_tune; oeawarectl -r libsystem_tune.so gives abnormal results|https://gitee.com/open_euler/dashboard?issue_id=IB7LC7| +|IB7LDI|k3s-selinux|【openEuler-24.03-LTS-SP1-round2】【arm/x86】Errors are reported while installing the k3s-selinux package|https://gitee.com/open_euler/dashboard?issue_id=IB7LDI| +|IB7LDQ|secpaver|【openEuler-24.03-LTS-SP1】When configuring the dim feature with the tool, the maximum value of the module parameter log_cap is not considered|https://gitee.com/open_euler/dashboard?issue_id=IB7LDQ| +|IB7MRU|setroubleshoot|【EulerMaker】setroubleshoot fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB7MRU| 
+|IB7NRV|secpaver|【openEuler-24.03-LTS-SP1】The tool's dim checks lack a check of the kernel static baseline|https://gitee.com/open_euler/dashboard?issue_id=IB7NRV| +|IB7P1W|oeAware-manager|command_collector lacks parameter validation|https://gitee.com/open_euler/dashboard?issue_id=IB7P1W| +|IB7PCT|aops-hermes|[24.03-LTS-SP1-RC2][arm/x86] Uploading a security advisory returns a 400 error|https://gitee.com/open_euler/dashboard?issue_id=IB7PCT| +|IB7PIH|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm】After installing oeAware-manager, oeawarectl -i numafast prints: /sbin/ldconfig: /usr/lib64/libkperf.so is not a symbolic link|https://gitee.com/open_euler/dashboard?issue_id=IB7PIH| +|IB7PNH|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager and running oeawarectl -r libdocker_collector.so, oeawarectl -e docker_cpu_burst (which depends on libdocker_collector.so) still reports success|https://gitee.com/open_euler/dashboard?issue_id=IB7PNH| +|IB7PXL|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager, oeawarectl --query-dep xcall_tune reports that the dependency graph was generated successfully, but the graph itself is abnormal|https://gitee.com/open_euler/dashboard?issue_id=IB7PXL| +|IB7Q47|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager, oeawarectl -e xcall_tune; oeawarectl --query-dep xcall_tune reports that the dependency graph was generated successfully, but the graph is missing dependencies|https://gitee.com/open_euler/dashboard?issue_id=IB7Q47| +|IB7RMQ|secpaver|【openEuler-24.03-LTS-SP1】When configuring the ima feature with the tool, the digest list must be imported manually but no prompt says so|https://gitee.com/open_euler/dashboard?issue_id=IB7RMQ| +|IB7SUK|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm/x86】After installing oeAware-manager with a malformed /etc/oeAware/config.yaml, the service restart log contains no error message|https://gitee.com/open_euler/dashboard?issue_id=IB7SUK| +|IB7SYE|oeAware-manager|【openEuler-24.03-LTS-SP1-rc2】【autotest】【arm】In a VM with oeAware-manager installed, oeawarectl -e pmu_spe_collector (an instance supported only on physical machines) reports enable success with state running|https://gitee.com/open_euler/dashboard?issue_id=IB7SYE| +|IB7VUC|secpaver|【openEuler-24.03-LTS-SP1】After enabling and then disabling ima with the tool, the SELinux labels of measured files are not restored|https://gitee.com/open_euler/dashboard?issue_id=IB7VUC| +|IB7YD4|aops-apollo|[24.03-LTS-SP1-RC2][arm/x86] On the task details page, parts of the task description are poorly readable|https://gitee.com/open_euler/dashboard?issue_id=IB7YD4| +|IB810U|secpaver|【openEuler-24.03-LTS-SP1】In the sec_conf tool, when choosing whether to install a package, selecting "not installed" while the package is absent from the environment should abort execution|https://gitee.com/open_euler/dashboard?issue_id=IB810U| +|IB81LE|gazelle|When the latest gazelle package was released for 22.03 SP4, the matching dpdk package was not updated|https://gitee.com/open_euler/dashboard?issue_id=IB81LE| +|IB82Z3|gazelle|【openGauss】Running gs_ctl switchover on a user-space standby aborts the database abnormally|https://gitee.com/open_euler/dashboard?issue_id=IB82Z3| +|IB83OJ|gazelle|【openGauss】Testing with tpcc after starting gaussdb in user space reports errors|https://gitee.com/open_euler/dashboard?issue_id=IB83OJ| +|IB89HL|gcc|[24.03-LTS-SP1 RC3][Angha] Compiling with -O3 -fif-split hits an ICE during GIMPLE pass: profile_estimate (at cfganal.cc:1587)|https://gitee.com/open_euler/dashboard?issue_id=IB89HL| +|IB89PH|gcc|[24.03-LTS-SP1 RC3][Angha] Compiling with -O3 -fif-split hits Segmentation fault: during GIMPLE pass: local-pure-const|https://gitee.com/open_euler/dashboard?issue_id=IB89PH| +|IB8HD7|prometheus|[24.03-LTS-SP1-RC3] promtool check metrics reports an error|https://gitee.com/open_euler/dashboard?issue_id=IB8HD7| +|IB8JZ2|AI4C|【24.03-LTS-SP1 RC3】Compilation hits an ICE: in get_attr_type|https://gitee.com/open_euler/dashboard?issue_id=IB8JZ2| +|IB8KDT|scap-security-guide|【openEuler-24.03-LTS-SP1】Checking and remediating the rule "weak password dictionary must be configured correctly" is problematic|https://gitee.com/open_euler/dashboard?issue_id=IB8KDT| +|IB8KYZ|gcc|[24.03-LTS-SP1 RC3] Binaries built with -O3 -fwhole-program -flto-partition=one -fipa-struct-reorg=2 core dump at run time|https://gitee.com/open_euler/dashboard?issue_id=IB8KYZ| +|IB8P6E|trafficserver|【openEuler-24.03-LTS-SP1 rc3】【autotest】The trafficserver.service service reports errors on startup|https://gitee.com/open_euler/dashboard?issue_id=IB8P6E| +|IB8PD4|python-blivet|[24.03-LTS-SP1] The installer shows an incorrect prompt when adding a partition larger than the available space|https://gitee.com/open_euler/dashboard?issue_id=IB8PD4| +|IB8Q55|ncbi-blast|【openEuler-24.03-LTS-SP1-round3】【arm/x86】File installation conflicts between ncbi-blast and bzip2-devel, and between ncbi-blast and proj-devel|https://gitee.com/open_euler/dashboard?issue_id=IB8Q55| +|IB8QB5|oeAware-manager|Lacks the ability to run multiple commands|https://gitee.com/open_euler/dashboard?issue_id=IB8QB5| 
+|IB8RF9|scap-security-guide|【openEuler-24.03-LTS-SP1】Remediation of the rule "normal users must not escalate to root via pkexec" should be kept consistent with the specification|https://gitee.com/open_euler/dashboard?issue_id=IB8RF9| +|IB8TGJ|gcc|[24.03-LTS-SP1 RC3] Compiling with -fcfgo-profile-generate without specifying a path hits an ICE: Segmentation fault|https://gitee.com/open_euler/dashboard?issue_id=IB8TGJ| +|IB8TSE|KubeOS|【24.03-lts-sp1】The image build script still returns 0 after reporting an error|https://gitee.com/open_euler/dashboard?issue_id=IB8TSE| +|IB8U1L|scap-security-guide|【openEuler-24.03-LTS-SP1】After remediating the rule "SSH MACs algorithms must be configured correctly", the sshd service fails to start|https://gitee.com/open_euler/dashboard?issue_id=IB8U1L| +|IB8U5R|secpaver|【openEuler-24.03-LTS-SP1】The tool's dim configuration checks do not verify by default that the dim_core and dim_monitor modules are loaded|https://gitee.com/open_euler/dashboard?issue_id=IB8U5R| +|IB8UDV|KubeOS|【24.03-lts-sp1】Delivering sysctl parameters fails when the configuration contains quotation marks|https://gitee.com/open_euler/dashboard?issue_id=IB8UDV| +|IB8UQO|scap-security-guide|【openEuler-24.03-LTS-SP1】Remediating the rule "passwords must not contain the account name" results in error|https://gitee.com/open_euler/dashboard?issue_id=IB8UQO| +|IB8W9N|criu|【EulerMaker】criu fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB8W9N| +|IB8X9N|dde|【openEuler 24.03 LTS SP1 rc3】The sound input page in Control Center freezes or crashes|https://gitee.com/open_euler/dashboard?issue_id=IB8X9N| +|IB8XDM|dde|【openEuler 24.03 LTS SP1 rc3】The Draw application window disappears after choosing to add text|https://gitee.com/open_euler/dashboard?issue_id=IB8XDM| +|IB8XSC|dde|【openEuler 24.03 LTS SP1 rc3】The X close button of the "Rygel Preferences" application cannot be clicked to close the application|https://gitee.com/open_euler/dashboard?issue_id=IB8XSC| +|IB8YID|secpaver|【openEuler-24.03-LTS-SP1】When configuring and checking the secure_boot feature, wget downloads of external certificates should limit retries and the timeout|https://gitee.com/open_euler/dashboard?issue_id=IB8YID| +|IB8ZOX|stratovirt|[openEuler-24.03-LTS-SP1-RC3][arm/x86] Restoring a VM from a snapshot fails|https://gitee.com/open_euler/dashboard?issue_id=IB8ZOX| +|IB90VA|secpaver|【openEuler-24.03-LTS-SP1】The tool's ima check only verifies whether the current environment supports ima, which is not as expected|https://gitee.com/open_euler/dashboard?issue_id=IB90VA| +|IB91QR|oeAware-manager|The client cannot output a report in evaluation mode|https://gitee.com/open_euler/dashboard?issue_id=IB91QR| +|IB91ZE|oeAware-manager|The interval field of PMU-related data is calculated inaccurately|https://gitee.com/open_euler/dashboard?issue_id=IB91ZE| +|IB96L3|scap-security-guide|【openEuler-24.03-LTS-SP1】After remediating the rule "at and cron must be configured correctly", some file permissions do not meet the requirements|https://gitee.com/open_euler/dashboard?issue_id=IB96L3| +|IB9EDA|dde|【openEuler 24.03 LTS SP1 rc3】An administrator user created in DDE cannot log in to the system under certain conditions|https://gitee.com/open_euler/dashboard?issue_id=IB9EDA| +|IB9EM9|KubeOS|【24.03-lts-sp1】Delivering kubelet/containerd and similar configurations has issues: 1. format-conversion problems can hang node configuration; 2. after setting only a value and then correcting it, the value never changes; 3. no setting log is printed when delivering pam.limits|https://gitee.com/open_euler/dashboard?issue_id=IB9EM9| +|IB9FF8|oeAware-manager|steal task is enabled differently across kernels; on 6.6, sched_steal_node_limit=[numa number] must be configured|https://gitee.com/open_euler/dashboard?issue_id=IB9FF8| +|IB9K6P|CBS|【2024-1230】EulerMaker one-click import for generating the corresponding update-version build project: the build_tag field in the provided default yaml file is wrong|https://gitee.com/open_euler/dashboard?issue_id=IB9K6P| +|IB9K7F|pesign|[24.03-LTS-SP1-RC3] Some pesign -i commands fail|https://gitee.com/open_euler/dashboard?issue_id=IB9K7F| +|IB9M1F||【openEuler-24.03-LTS-SP1-round3】【block】The kubeedge image needs to be replaced with k3s|https://gitee.com/open_euler/dashboard?issue_id=IB9M1F| +|IB9NT0|autoconf|【EulerMaker】The autoconf update causes multiple packages to fail to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB9NT0| +|IB9RFD|gazelle|【openGauss】Warm-up testing with tpcc prints a pile of fail logs|https://gitee.com/open_euler/dashboard?issue_id=IB9RFD| +|IB9RGW|gazelle|【openGauss】The first tpcc test run always hits "The connection attempt failed"|https://gitee.com/open_euler/dashboard?issue_id=IB9RGW| +|IB9RUF|KubeOS|【24.03-lts-sp1】When operation is misconfigured, the prompt has an extra layer of quotes; when imagetype and opstype are misconfigured, delivery fails immediately, which is inconsistent with previous behavior|https://gitee.com/open_euler/dashboard?issue_id=IB9RUF| +|IB9WSS|ao.space|【EulerMaker】ao.space fails to build in openEuler-24.03-LTS-SP1:everything|https://gitee.com/open_euler/dashboard?issue_id=IB9WSS| +|IB9XFN|secpaver|【openEuler-24.03-LTS-SP1】When configuring and checking the modsign feature, the grub-parameter modification step needs optimization|https://gitee.com/open_euler/dashboard?issue_id=IB9XFN| +|IBA5E4|gazelle|【openGauss】With gazelle running a gauss one-primary-two-standby setup, repeatedly switching primary and standby manually for about three hours hits "Cannot reserve memory" and the gauss process exits|https://gitee.com/open_euler/dashboard?issue_id=IBA5E4| +|IBA7CU|aops-hermes|[24.03-LTS-SP1-RC4][arm/x86] Parts of the CVE details page display incorrectly|https://gitee.com/open_euler/dashboard?issue_id=IBA7CU| +|IBA897|CBS|【2024-1230】Creating a project by yaml import with no git_tag/commit_id/spec_branch in the package configuration but build_tag set: right after creation the package configuration shows spec_branch, and after one build it shows the commit_id of spec_branch|https://gitee.com/open_euler/dashboard?issue_id=IBA897| +|IBA8N5|utshell|[24.03-LTS-SP1-RC4] utshell -c "echo {a..z}" reports Segmentation fault (core dumped)|https://gitee.com/open_euler/dashboard?issue_id=IBA8N5| +|IBA918|utshell|[24.03-LTS-SP1-RC4] utshell -c 'type -p pwd' returns code 255|https://gitee.com/open_euler/dashboard?issue_id=IBA918| +|IBAAW2|oeAware-manager|The SPE sampling period only increases and cannot be rolled back|https://gitee.com/open_euler/dashboard?issue_id=IBAAW2| +|IBADES|euler-copilot-rag|The Q&A-pair generation script only works on a single file or an entire directory|https://gitee.com/open_euler/dashboard?issue_id=IBADES| +|IBADFM|ods|【2024-1230】Release testing on EulerPipeline: a job on an x86 VM with os_version 20.03-LTS-SP4 times out|https://gitee.com/open_euler/dashboard?issue_id=IBADFM| +|IBAEW1|gnome-settings-daemon|【openEuler-24.03-LTS-SP1-round4】【x86】Uninstalling gnome-settings-daemon prints Failed messages|https://gitee.com/open_euler/dashboard?issue_id=IBAEW1| +|IBAFG5|scap-security-guide|【openEuler-24.03-LTS-SP1】Remediating the rule "UMASK must be configured correctly" does not match the specification|https://gitee.com/open_euler/dashboard?issue_id=IBAFG5| +|IBAGYK|ods|【2024-1230】Release testing on EulerPipeline: a job on an arm VM with os_version 20.03-LTS-SP4 times out|https://gitee.com/open_euler/dashboard?issue_id=IBAGYK| +|IBALNI|KubeOS|【24.03-lts-sp1】When the disk name in the PXE image configuration is wrong, installation reports insufficient disk space|https://gitee.com/open_euler/dashboard?issue_id=IBALNI| +|IBALTA|KubeOS|【24.03-lts-sp1】With [chroot_script] rm = true, the chroot script is not removed from the built image|https://gitee.com/open_euler/dashboard?issue_id=IBALTA| +|IBALZL|KubeOS|【24.03-lts-sp1】When building the admin image, COPY ./set-ssh-pub-key.sh ./hostshell /usr/local/bin in the dockerfile needs a trailing /|https://gitee.com/open_euler/dashboard?issue_id=IBALZL| +|IBAPM3|gazelle|【SDV】Starting the client immediately after the server starts fails to establish a TCP v6 connection|https://gitee.com/open_euler/dashboard?issue_id=IBAPM3| 
+|IBAPNP|gazelle|【SDV】开启单网卡后带gazelle启动,使用sysbench打流偶现coredump|https://gitee.com/open_euler/dashboard?issue_id=IBAPNP| +|IBARLD|gcc|[24.03-LTS-SP1 RC4][csmith]-O3 -fif-split编译报Segmentation fault: during GIMPLE pass:if-split|https://gitee.com/open_euler/dashboard?issue_id=IBARLD| +|IBAU9O|hadoop|【EulerMaker】hadoop在openEuler-24.03-LTS-SP1:everything构建失败|https://gitee.com/open_euler/dashboard?issue_id=IBAU9O| +|IBAWUP|sysSentry|rasdaemon插件缺少配置文件|https://gitee.com/open_euler/dashboard?issue_id=IBAWUP| +|IBAYOV|sysSentry|type=period, onstart=no时rasdaemon插件功能异常|https://gitee.com/open_euler/dashboard?issue_id=IBAYOV| +|IBAYW8|KubeOS|【24.03-lts-sp1】镜像制作参数存在部分值为空未校验|https://gitee.com/open_euler/dashboard?issue_id=IBAYW8| +|IBB1PX|gcc|[24.03-LTS-SP1 RC4]-flto -fipa-reorder-fields运行结果不一致|https://gitee.com/open_euler/dashboard?issue_id=IBB1PX| +|IBB1UK|gcc|[24.03-LTS-SP1 RC4]-flto -fipa-struct-reorg运行结果不一致|https://gitee.com/open_euler/dashboard?issue_id=IBB1UK| +|IBB67T|oncn-bwm|【EulerMaker】oncn-bwm 在openEuler-24.03-LTS-SP1:everything构建失败|https://gitee.com/open_euler/dashboard?issue_id=IBB67T| +|IBBP4Y|multipath-tools|构造两次网络故障,多路径未进入降级状态|https://gitee.com/open_euler/dashboard?issue_id=IBBP4Y| +|IBBPDD|rpcbind|pwck检查有报错:user 'rpc': directory '/var/lib/rpcbind' does not exist|https://gitee.com/open_euler/dashboard?issue_id=IBBPDD| +|IBBQ1S|secDetector|openEuler 24.03-LTS-SP1版本,文件探针失效|https://gitee.com/open_euler/dashboard?issue_id=IBBQ1S| +|IBBWT2|gcc|[24.03-LTS-SP1 RC5]-O3 -fif-split编译报ICE:during GIMPLE pass: profile_estimate(at cfganal.cc:1587)|https://gitee.com/open_euler/dashboard?issue_id=IBBWT2| +|IBBWT3|mariadb|【24.03 LTS SP1 auto-test】mysqladmin命令执行报错|https://gitee.com/open_euler/dashboard?issue_id=IBBWT3| +|IBBWTS|yocto-meta-openeuler|【24.03-LTS-SP1】kp920配置机器网卡时因没有驱动导致网卡不显示|https://gitee.com/open_euler/dashboard?issue_id=IBBWTS| +|IBBWXI|openGemini|【EulerMaker】openGemini 在openEuler-24.03-LTS-SP1:epol构建失败|https://gitee.com/open_euler/dashboard?issue_id=IBBWXI| 
+|IBC1X0|opengauss-server|【openEuler-24.03-LTS-SP1-round5】【arm/x86】 opengauss 由24.03-LTS升级24.03-LTS-SP1的版本 失败|https://gitee.com/open_euler/dashboard?issue_id=IBC1X0| +|IBC3CT|oeAware-manager|【openEuler-24.03-LTS-SP1-rc5】【arm】安装oeAware-manager,执行oeawarectl -e tune_numa_mem_access报错:can't connect to server!|https://gitee.com/open_euler/dashboard?issue_id=IBC3CT| \ No newline at end of file diff --git "a/docs/zh/docs/Releasenotes/\345\267\262\347\237\245\351\227\256\351\242\230.md" "b/docs/zh/docs/Releasenotes/\345\267\262\347\237\245\351\227\256\351\242\230.md" index 47c4f2d8b861b794e432f25261178779ebf2a4bb..1d79285d42fa44966dc7e06e6078d3810163f24b 100644 --- "a/docs/zh/docs/Releasenotes/\345\267\262\347\237\245\351\227\256\351\242\230.md" +++ "b/docs/zh/docs/Releasenotes/\345\267\262\347\237\245\351\227\256\351\242\230.md" @@ -1,10 +1,7 @@ # 已知问题 -| 序号 | 问题单号 | 问题简述 | 问题级别 | 影响分析 | 规避措施 | 历史发现场景 | -| ---- | ------- | -------- | -------- | ------- | -------- | --------- | -| 1 | [I5LZXD](https://gitee.com/src-openEuler/openldap/issues/I5LZXD) | openldap build problem in openEuler:22.09 | 次要 | 构建过程中,用例执行失败。为用例设计问题,影响可控,通过sleep的方式等待操作执行完成,在高负载下偶先失败 | skip相关用力,并持续跟踪上游社区解决 | | -| 2 | [I5NLZI](https://gitee.com/src-openEuler/dde/issues/I5NLZI) | 【openEuler 22.09 rc2】启动器中个别应用图标显示异常 | 次要 | 仅为DDE桌面启动器的图标显示异常,无功能影响,易用性问题整体影响可控 | 建议切换主题规避 | | -| 3 | [I5P5HM](https://gitee.com/src-openEuler/afterburn/issues/I5P5HM) | 【22.09_RC3_EPOL】【arm/x86】卸载afterburn提示Failed to stop afterburn-sshkeys@.service | 次要 | | | | -| 4 | [I5PQ3O](https://gitee.com/src-openEuler/openmpi/issues/I5PQ3O) | 【openEuler-22.09-RC3】ompi-clean -v -d参数执行报错 | 主要 | 该包为NestOS使用软件包,使用范围较为局限,默认为 NestOS 中的“core”用户启用,对服务器版本影响较小 | sig暂未提供规避手段 | | -| 5 | [I5Q2FE](https://gitee.com/src-openEuler/udisks2/issues/I5Q2FE) | udisks2 build problem in openEuler:22.09 | 次要 | 构建过程中,用例执行失败。环境未保留,长期本地构建未复现 | 持续跟踪社区构建成功率 | | -| 6 | [I5SJ0R](https://gitee.com/src-openEuler/podman/issues/I5SJ0R) | [22.09RC5 arm/x86]podman create 
--blkio-weight-device /dev/loop0:123:15 fedora ls 执行报错 | 次要 | blkio-weight为4.xx版本内核特性。5.10版本不支持 | 需跟进升级podman组件 | | +|ISSUE ID|关联仓库|问题描述|ISSUE 链接| +|-|-|-|-| +|IBA13Y|yocto-meta-openeuler|【2403LTSSP1】raspberrypi4-64镜像测试用例oe_test_nfs-utils_test_001失败|https://gitee.com/open_euler/dashboard?issue_id=IBA13Y| +|IBACHW|yocto-meta-openeuler|【2403LTSSP1】qemu-aarch64-mcs-ros镜像名称错误,并且micad启动失败|https://gitee.com/open_euler/dashboard?issue_id=IBACHW| +|IBANAZ|yocto-meta-openeuler|【2403LTSSP1】x86-64-hmi-mcs-ros-rt镜像mica启动失败|https://gitee.com/open_euler/dashboard?issue_id=IBANAZ| \ No newline at end of file diff --git "a/docs/zh/docs/Releasenotes/\346\263\225\345\276\213\345\243\260\346\230\216.md" "b/docs/zh/docs/Releasenotes/\346\263\225\345\276\213\345\243\260\346\230\216.md" index 440c626c2f65b5553291e3b309486e4d3c64acbd..e9af6a7a85f9a8c9540cb89c8b4fc089eec8adf1 100644 --- "a/docs/zh/docs/Releasenotes/\346\263\225\345\276\213\345\243\260\346\230\216.md" +++ "b/docs/zh/docs/Releasenotes/\346\263\225\345\276\213\345\243\260\346\230\216.md" @@ -1,6 +1,6 @@ # 法律声明 -**版权所有 © 2023 openEuler社区** +**版权所有 © 2024 openEuler社区** 您对“本文档”的复制、使用、修改及分发受知识共享\(Creative Commons\)署名—相同方式共享4.0国际公共许可协议\(以下简称“CC BY-SA 4.0”\)的约束。为了方便用户理解,您可以通过访问[https://creativecommons.org/licenses/by-sa/4.0/](https://creativecommons.org/licenses/by-sa/4.0/)了解CC BY-SA 4.0的概要 \(但不是替代\)。CC BY-SA 4.0的完整协议内容您可以访问如下网址获取:[https://creativecommons.org/licenses/by-sa/4.0/legalcode](https://creativecommons.org/licenses/by-sa/4.0/legalcode)。 diff --git "a/docs/zh/docs/Releasenotes/\347\224\250\346\210\267\351\241\273\347\237\245.md" "b/docs/zh/docs/Releasenotes/\347\224\250\346\210\267\351\241\273\347\237\245.md" index 19ad23cf5fda65d4a6057b33c9cdb5ce3691d424..1490add98610e003d485be3e02034d0680f86986 100644 --- "a/docs/zh/docs/Releasenotes/\347\224\250\346\210\267\351\241\273\347\237\245.md" +++ "b/docs/zh/docs/Releasenotes/\347\224\250\346\210\267\351\241\273\347\237\245.md" @@ -2,4 +2,4 @@ - openEuler版本号计数规则由openEuler 
x.x变更为以年月为版本号,以便用户了解版本发布时间,例如openEuler 21.03表示发布时间为2021年3月。 - [Python核心团队](https://www.python.org/dev/peps/pep-0373/#update)已经于2020年1月停止对Python 2的维护。2021年,openEuler 21.03版本仅修复Python 2的致命CVE。 -- 从openEuler 22.03-LTS版本开始,停止支持和维护Python 2,仅支持Python 3,请您切换并使用Python 3。 +- 从openEuler 22.03-LTS版本开始,停止支持和维护Python 2,仅支持Python 3,请您切换并使用Python 3。 diff --git "a/docs/zh/docs/Releasenotes/\347\263\273\347\273\237\345\256\211\350\243\205.md" "b/docs/zh/docs/Releasenotes/\347\263\273\347\273\237\345\256\211\350\243\205.md" index 2fb1285b37bce8d1b454e46eb306601498f6f4bf..13c6de0db001817495798599b59781622606ce5c 100644 --- "a/docs/zh/docs/Releasenotes/\347\263\273\347\273\237\345\256\211\350\243\205.md" +++ "b/docs/zh/docs/Releasenotes/\347\263\273\347\273\237\345\256\211\350\243\205.md" @@ -2,7 +2,7 @@ ## 发布件 -openEuler发布件包括[ISO发布包](https://www.openeuler.org/zh/download/archive/)、[虚拟机镜像](http://repo.openeuler.org/openEuler-22.09/virtual_machine_img/)、[容器镜像](http://repo.openeuler.org/openEuler-22.09/docker_img/)、[嵌入式镜像](http://repo.openeuler.org/openEuler-22.09/embedded_img/)和[repo源](http://repo.openeuler.org/openEuler-22.09/)。ISO发布包请参见[表1](#table8396719144315)。容器镜像清单参见[表3](#table1276911538154)。repo源方便在线使用,repo源目录请参见[表5](#table953512211576)。 +openEuler发布件包括[ISO发布包](https://www.openeuler.org/zh/download/archive/)、[虚拟机镜像](http://repo.openeuler.org/openEuler-24.03-LTS/virtual_machine_img/)、[容器镜像](http://repo.openeuler.org/openEuler-24.03-LTS/docker_img/)、[嵌入式镜像](http://repo.openeuler.org/openEuler-24.03-LTS/embedded_img/)和[repo源](http://repo.openeuler.org/openEuler-24.03-LTS/)。 **表 1** 发布ISO列表 @@ -14,57 +14,57 @@ openEuler发布件包括[ISO发布包](https://www.openeuler.org/zh/download/arc -

openEuler-22.09-aarch64-dvd.iso

+

openEuler-24.03-LTS-SP1-aarch64-dvd.iso

AArch64架构的基础安装ISO,包含了运行最小系统的核心组件

-

openEuler-22.09-everything-aarch64-dvd.iso

+

openEuler-24.03-LTS-SP1-everything-aarch64-dvd.iso

AArch64架构的全量安装ISO,包含了运行完整系统所需的全部组件

-

openEuler-22.09-everything-debug-aarch64-dvd.iso

+

openEuler-24.03-LTS-SP1-everything-debug-aarch64-dvd.iso

AArch64架构下openEuler的调试ISO,包含了调试所需的符号表信息

-

openEuler-22.09-x86_64-dvd.iso

+

openEuler-24.03-LTS-SP1-x86_64-dvd.iso

x86_64架构的基础安装ISO,包含了运行最小系统的核心组件

-

openEuler-22.09-everything-x86_64-dvd.iso

+

openEuler-24.03-LTS-SP1-everything-x86_64-dvd.iso

x86_64架构的全量安装ISO,包含了运行完整系统所需的全部组件

-

openEuler-22.09-everything-debuginfo-x86_64-dvd.iso

+

openEuler-24.03-LTS-SP1-everything-debug-x86_64-dvd.iso

x86_64架构下openEuler的调试ISO,包含了调试所需的符号表信息

-

openEuler-22.09-source-dvd.iso

+

openEuler-24.03-LTS-SP1-source-dvd.iso

openEuler源码ISO

-

openEuler-21.09-edge-aarch64-dvd.iso

+

openEuler-24.03-LTS-SP1-edge-aarch64-dvd.iso

AArch64架构的边缘ISO,包含了运行最小系统的核心组件

-

openEuler-21.09-edge-x86_64-dvd.iso

+

openEuler-24.03-LTS-SP1-edge-x86_64-dvd.iso

x86_64架构的边缘ISO,包含了运行最小系统的核心组件

-

openEuler-21.09-Desktop-aarch64-dvd.iso

+

openEuler-24.03-LTS-SP1-Desktop-aarch64-dvd.iso

AArch64架构的开发者桌面ISO,包含了运行开发桌面的最小软件集合

-

openEuler-21.09-Desktop-x86_64-dvd.iso

+

openEuler-24.03-LTS-SP1-Desktop-x86_64-dvd.iso

x86_64架构的开发者桌面ISO,包含了运行开发桌面的最小软件集合

@@ -82,12 +82,12 @@ openEuler发布件包括[ISO发布包](https://www.openeuler.org/zh/download/arc -

openEuler-22.09-aarch64.qcow2.xz

+

openEuler-24.03-LTS-SP1-aarch64.qcow2.xz

AArch64架构下openEuler虚拟机镜像

-

openEuler-22.09-x86_64.qcow2.xz

+

openEuler-24.03-LTS-SP1-x86_64.qcow2.xz

x86_64架构下openEuler虚拟机镜像

@@ -125,10 +125,10 @@ openEuler发布件包括[ISO发布包](https://www.openeuler.org/zh/download/arc | 名称 | 描述 | | -------------------------------------- | ------------------------------- | | arm64/aarch64-std/zImage | AArch64架构下支持qemu的内核镜像 | -| arm64/aarch64-std/\*toolchain-22.09.sh | AArch64架构下对应的开发编译链 | +| arm64/aarch64-std/\*toolchain-24.03.sh | AArch64架构下对应的开发编译链 | | arm64/aarch64-std/\*rootfs.cpio.gz | AArch64架构下支持qemu的文件系统 | | arm32/arm-std/zImage | Arm架构下支持qemu的内核镜像 | -| arm32/arm-std/\*toolchain-22.09.sh | Arm架构下对应的开发编译链 | +| arm32/arm-std/\*toolchain-24.03.sh | Arm架构下对应的开发编译链 | | arm32/arm-std/\*rootfs.cpio.gz | Arm架构下支持qemu的文件系统 | | source-list/manifest.xml | 构建使用的源码清单 | @@ -202,7 +202,7 @@ openEuler发布件包括[ISO发布包](https://www.openeuler.org/zh/download/arc ## 最小硬件要求 -安装 openEuler 22.09-LTS 所需的最小硬件要求如[表6](#zh-cn_topic_0182825778_tff48b99c9bf24b84bb602c53229e2541)所示。 +安装 openEuler 所需的最小硬件要求如[表6](#zh-cn_topic_0182825778_tff48b99c9bf24b84bb602c53229e2541)所示。 **表 6** 最小硬件要求 diff --git "a/docs/zh/docs/SecHarden/\345\256\211\345\205\250\351\205\215\347\275\256\345\212\240\345\233\272\345\267\245\345\205\267.md" "b/docs/zh/docs/SecHarden/\345\256\211\345\205\250\351\205\215\347\275\256\345\212\240\345\233\272\345\267\245\345\205\267.md" new file mode 100644 index 0000000000000000000000000000000000000000..b6aa1bc4612692cebadfe45c04e207b9b030a2e0 --- /dev/null +++ "b/docs/zh/docs/SecHarden/\345\256\211\345\205\250\351\205\215\347\275\256\345\212\240\345\233\272\345\267\245\345\205\267.md" @@ -0,0 +1,147 @@ +# 安全配置加固工具 + +## 前言 + +本文档为安全配置加固工具sec_conf基本介绍以及使用说明。 + +## sec_conf简介 + +### 背景和概述 +openEuler已支持多种安全特性,包括Linux原生安全特性和社区自研安全特性,但是存在特性分散,配置难度大,用户学习成本高等问题。同时对于一些具备拦截功能的安全特性(如IMA评估、安全启动、访问控制等),一旦用户配置错误,可能导致系统无法启动或无法正常运行。因此,sec_conf旨在实现自动化安全配置机制,用户可基于工具进行系统的安全检查和加固,以更好地推进openEuler安全特性在各应用场景的落地。 + +### 功能介绍 +sec_conf是一个帮助管理员配置openEuler安全特性(如IMA、DIM、secure boot等)的安全加固工具。用户可以输入配置信息,即需要实现的安全加固目标,生成相应的安全特性配置脚本。 + +目前sec_conf支持可配置的安全机制为IMA、DIM、secure boot、modsign,其他安全特性暂未支持。 + +## 
安装与部署 +### 安装依赖软件包 + +编译secPaver需要的软件有: +* make +* golang 1.11+ + +### 下载源码 +``` +git clone https://gitee.com/openeuler/secpaver.git -b sec_conf +``` + +### 编译安装 +``` +cd secpaver +make +``` +安装软件: +``` +make install +``` + +## 工程文件格式说明 + +sec_conf工程文件由策略配置文件、检查脚本模板文件、配置脚本模板文件组成。 + +### 策略配置文件 +策略配置文件保护DIM\IMA\安全启动\内核模块校验相关特性配置,采用yaml格式表示,各字段说明如下: + + + + + + + + + + + + + + + + + + + + + +
一级配置项二级配置项类型属性说明
nameN/Astringoptional配置文件命名
versionN/Astringoptional配置策略版本号
dimenablebooloptional打开/关闭DIM功能
measure_liststring arrayoptionalDIM需要度量的文件;用户态文件,需要指定绝对路径;内核模块,需要指定有效的内核模块名称;内核,需要指定为“kernel”
log_capintoptional度量日志最大条数,当记录的度量日志数量达到参数设置时,停止记录度量日志,默认值为100000
scheduleintoptional度量完一个进程/模块后调度的时间,单位毫秒,设置为0代表不调度,默认值为0
intervalintoptional自动度量周期,单位分钟,设置为0代表不设置自动度量,默认值为0
hashstringoptional度量哈希算法,默认值为sha256
core_pcrintoptional将dim_core度量结果扩展至TPM芯片的PCR寄存器,设置为0代表不扩展(注意需要与芯片实际的PCR编号保持一致),默认值为0
monitor_pcrintoptional将dim_monitor度量结果扩展至TPM芯片的PCR寄存器,设置为0代表不扩展(注意需要与芯片实际的PCR编号保持一致),默认值为0
signaturebooloptional是否启用策略文件和签名机制
auto_baselinebooloptional是否建立DIM基线,若为false,则需管理员手动生成基线
secure_bootenablebooloptional是否使能安全启动
anti_rollbackbooloptional打开/关闭安全启动防回滚策略
verbosebooloptional打开/关闭安全启动相关日志
modsignenablebooloptional是否使能内核模块校验特性
imameasure_liststring arrayoptionalIMA保护文件列表(需要指定绝对路径)
appraise_liststring arrayoptionalIMA评估文件列表(需要指定绝对路径)
+ +>![](./public_sys-resources/icon-note.gif) **说明:** +> 1. sec_conf.yaml文件必须放在/usr/share/secpaver/scripts/sec_conf/sec_conf.yaml,不可重命名。 +> 2. 参数类型需遵守上述表格要求。 +> 3. 相关配置若不存在,则使用默认值。 + +### 检查脚本模板、配置脚本模板文件 +模板文件实现利用go-tamplate引擎,将脚本文件与数据结合,生成最终的文本输出。 + +检查脚本模板统一放置/usr/share/secpaver/scripts/sec_conf/check/目录,该目录下存放DIM、IMA等特性的脚本模板,这些脚本模板不能单独执行,只能通过sec_conf生成最新的脚本去执行openEuler特性检查。 + +配置脚本模板统一放置/usr/share/secpaver/scripts/sec_conf/gen/目录,该目录下存放DIM、IMA等特性的脚本模板,这些脚本模板不能单独执行,只能通过sec_conf生成最新的脚本去执行openEuler特性配置。 + +>![](./public_sys-resources/icon-note.gif) **说明:** +> 1. 配置、检查模板文件不可修改,仅用于被sec_conf解析生成脚本。 + +## 安全配置命令行接口 +| 参数 | 功能介绍 | 命令格式 | +| :--------: | :--------: |:--------: | +|--help,-h|打印sec_conf命令行帮助信息|sec_conf -h| +|gen_check|生成安全配置检查脚本,并输出到命令行界面|sec_conf gen_check| +|gen_config|生成安全配置脚本,并输出到命令行界面|sec_conf gen_config| +|--output,-o|将生成的配置脚本输出到指定文件|sec_conf gen_config -o config.sh| + +## 使用说明 +配置yaml文件,示例: +``` +# cat /usr/share/secpaver/scripts/sec_conf/sec_conf.yaml +--- +name: "openEuler security configuration" +version: "1" +dim: + enable: true + measure_list: + - "/usr/bin/bash" + - "nf_nat" + - "kernel" + log_cap: 100000 + schedule: 0 + interval: 0 + hash: "sha256" + core_pcr: 11 + monitor_pcr: 12 + signature: true + auto_baseline: true +secure_boot: + enable: true + anti_rollback: true + verbose: true +modsign: + enable: true +ima: + measure_list: + - "/usr/bin/ls" + - "/usr/bin/cat" + - "/usr/bin/base64" + - "/usr/bin/base32" + appraise_list: + - "/usr/bin/base64" + - "/usr/bin/base32" + - "/usr/bin/sleep" + - "/usr/bin/date" +... 
+```
+生成特性配置脚本、检查脚本:
+```
+sec_conf gen_config -o ./config.sh
+sec_conf gen_check -o ./check.sh
+```
+执行配置脚本,并检查配置是否正确。若配置正确,则重启系统使配置生效:
+```
+sh ./config.sh -s
+sh ./check.sh -s
+reboot
+```
+
+重启后再次执行配置脚本,并检查配置是否正确,此时预期所有功能检查通过:
+```
+sh ./config.sh -s
+sh ./check.sh -s
+```
\ No newline at end of file
diff --git "a/docs/zh/docs/SecHarden/\346\223\215\344\275\234\347\263\273\347\273\237\345\212\240\345\233\272\346\246\202\350\277\260.md" "b/docs/zh/docs/SecHarden/\346\223\215\344\275\234\347\263\273\347\273\237\345\212\240\345\233\272\346\246\202\350\277\260.md"
index 7145e9610ac4806b4b95f78c5f18b9c3257ea705..9a54afbd0ae4b4d4a3e579809419b65ecfa28b98 100644
--- "a/docs/zh/docs/SecHarden/\346\223\215\344\275\234\347\263\273\347\273\237\345\212\240\345\233\272\346\246\202\350\277\260.md"
+++ "b/docs/zh/docs/SecHarden/\346\223\215\344\275\234\347\263\273\347\273\237\345\212\240\345\233\272\346\246\202\350\277\260.md"
@@ -27,7 +27,7 @@
 
 ### 加固方式
 
-用户可以通过手动修改加固配置或执行相关命令对系统进行加固,也可以通过加固工具批量修改加固项。openEuler的安全加固工具security tool以openEuler-security.service服务的形式运行。系统首次启动时会自动运行该服务去执行默认加固策略,且自动设置后续开机不启动该服务。
+用户可以通过手动修改加固配置或执行相关命令对系统进行加固,也可以通过加固工具批量修改加固项。openEuler的安全加固工具security tool以openEuler-security.service服务的形式运行。系统首次启动时会自动运行该服务去执行默认加固策略,服务运行后会将该服务自动设置为后续开机不启动。
 
 用户可以通过修改security.conf,使用安全加固工具实现个性化安全加固的效果。
diff --git "a/docs/zh/docs/ShangMi/RPM\347\255\276\345\220\215\351\252\214\347\255\276.md" "b/docs/zh/docs/ShangMi/RPM\347\255\276\345\220\215\351\252\214\347\255\276.md"
new file mode 100644
index 0000000000000000000000000000000000000000..dca4184fb1e9e91ca8f42de46780c9dab6f294e2
--- /dev/null
+++ "b/docs/zh/docs/ShangMi/RPM\347\255\276\345\220\215\351\252\214\347\255\276.md"
@@ -0,0 +1,85 @@
+# RPM验签
+
+## 概述
+
+openEuler当前采用RPM格式进行软件包管理,RPM签名遵循openPGP签名规范。openEuler-24.03-LTS-SP1版本发布的RPM软件在开源版本的基础上增加了对SM2/3算法的签名/验签功能支持。
+
+对如下软件包进行商密使能:
+- GnuPG:gpg命令行应用程序支持生成国密签名
+- RPM:支持调用gpg命令以及openSSL API实现国密签名生成/验证
+- openSSL:支持国密签名验证(开源已支持)
+
+## 前置条件
+
+1. 
openEuler操作系统安装的gnupg2、libgcrypt、rpm软件版本号需大于等于如下版本:
+
+   ```
+   $ rpm -qa libgcrypt
+   libgcrypt-1.10.2-3.oe2403sp1.x86_64
+
+   $ rpm -qa gnupg2
+   gnupg2-2.4.3-5.oe2403sp1.x86_64
+
+   $ rpm -qa rpm
+   rpm-4.18.2-20.oe2403sp1.x86_64
+   ```
+2. ecdsa的签名及验签仅支持sm2曲线。
+
+## 使用方法
+
+1. 生成密钥
+
+   方法1:
+   ```sh
+   $ gpg --full-generate-key --expert
+   ```
+   方法2:
+   ```sh
+   $ gpg --quick-generate-key <密钥标识> sm2p256v1
+   ```
+   过程中会要求输入密码,后续使用密钥或生成签名时需要输入该密码;若不输入密码直接按回车,则表示无密码。
+
+2. 导出证书
+
+   ```sh
+   $ gpg -o <证书路径> --export <密钥标识>
+   ```
+3. 打开sm3哈希算法和sm2算法的配置宏
+   ```
+   $ vim /usr/lib/rpm/macros
+   %_enable_sm2p256v1_sm3_algo 1
+   ```
+4. 将证书导入rpm数据库
+   ```sh
+   $ rpm --import <证书路径>
+   ```
+5. 编写签名所需的macro
+
+   ```
+   $ vim ~/.rpmmacros
+   %_signature gpg
+   %_gpg_path /root/.gnupg
+   %_gpg_name <密钥标识>
+   %_gpgbin /usr/bin/gpg2
+
+   %__gpg_sign_cmd %{shescape:%{__gpg}} \
+       gpg --no-verbose --no-armor --no-secmem-warning --passphrase-file /root/passwd \
+       %{?_gpg_digest_algo:--digest-algo=%{_gpg_digest_algo}} \
+       %{?_gpg_sign_cmd_extra_args} \
+       %{?_gpg_name:-u %{shescape:%{_gpg_name}}} \
+       -sbo %{shescape:%{?__signature_filename}} \
+       %{?__plaintext_filename:-- %{shescape:%{__plaintext_filename}}}
+   ```
+   其中%__gpg_sign_cmd在默认配置的基础上增加了--passphrase-file /root/passwd,passwd文件中保存的是密码;若步骤1未设置密码,则无需添加该参数。
+
+6. 生成RPM包签名
+   ```sh
+   $ rpmsign --addsign 
+   ```
+
+7. 验证RPM包签名
+   ```sh
+   $ rpm -Kv 
+   ```
+   如果输出中显示“Header V4 ECDSA/SM3 Signature”,并且显示“OK”,则说明签名验证成功。
diff --git "a/docs/zh/docs/ShangMi/SSH\345\215\217\350\256\256\346\240\210.md" "b/docs/zh/docs/ShangMi/SSH\345\215\217\350\256\256\346\240\210.md"
index 89f8d05685516bf567f8f18610755c5fe341a2c1..b906fe6938d251f6113acda89b8db99692000d91 100644
--- "a/docs/zh/docs/ShangMi/SSH\345\215\217\350\256\256\346\240\210.md"
+++ "b/docs/zh/docs/ShangMi/SSH\345\215\217\350\256\256\346\240\210.md"
@@ -32,20 +32,20 @@
 
 $ ssh-keygen -t sm2 -m PEM -f /etc/ssh/ssh_host_sm2_key
 
 $ cat /path/to/id_sm2.pub >> ~/.ssh/authorized_keys
 ```
 
-3. 
服务端修改/etc/ssh/sshd_config,配置支持商密算法登录。商密的配置项可选商密参数如下表:
+3. 修改配置文件,配置支持商密算法登录。服务端的配置文件路径为/etc/ssh/sshd_config,客户端配置文件路径为/etc/ssh/ssh_config。可配置的商密参数如下表:
 
 | 配置项含义 | 配置项参数 | 配置项参数的商密取值 |
 |---------------------|------------------------|---------------|
-| 主机密钥公钥认证密钥 (仅服务端可配) | HostKeyAlgorithms | /etc/ssh/ssh_host_sm2_key |
-| 主机密钥公钥认证算法 | HostKeyAlgorithms | sm2 |
-| 密钥交换算法 | KexAlgorithms | sm2-sm3 |
-| 对称加密算法 | Ciphers | sm4-ctr |
-| 完整性校验算法 | MACs | hmac-sm3 |
-| 用户公钥认证算法 | PubkeyAcceptedKeyTypes | sm2 |
-| 用户公钥认证密钥(仅客户端可配) | IdentityFile | ~/.ssh/id_sm2 |
-| 打印密钥指纹使用的哈希算法 | FingerprintHash | sm3 |
-
-4. 客户端配置商密算法完成登录。客户端可以使用命令行方式或者修改配置文件的方式使能商密算法套件。使用命令行登录方式如下:
+| 主机密钥公钥认证密钥 | HostKey | /etc/ssh/ssh_host_sm2_key |
+| 主机密钥公钥认证算法 | HostKeyAlgorithms | sm2 |
+| 密钥交换算法 | KexAlgorithms | sm2-sm3 |
+| 对称加密算法 | Ciphers | sm4-ctr |
+| 完整性校验算法 | MACs | hmac-sm3 |
+| 用户公钥认证算法 | PubkeyAcceptedKeyTypes | sm2 |
+| 用户公钥认证密钥 | IdentityFile | ~/.ssh/id_sm2 |
+| 打印密钥指纹使用的哈希算法 | FingerprintHash | sm3 |
+
+4. 
客户端配置商密算法完成登录。客户端可以使用命令行方式或者修改配置文件(默认配置文件路径为/etc/ssh/ssh_config)的方式使能商密算法套件。使用命令行登录方式如下: ``` ssh -o PreferredAuthentications=publickey -o HostKeyAlgorithms=sm2 -o PubkeyAcceptedKeyTypes=sm2 -o Ciphers=sm4-ctr -o MACs=hmac-sm3 -o KexAlgorithms=sm2-sm3 -i ~/.ssh/id_sm2 [remote-ip] diff --git "a/docs/zh/docs/ShangMi/TLCP\345\215\217\350\256\256\346\240\210.md" "b/docs/zh/docs/ShangMi/TLCP\345\215\217\350\256\256\346\240\210.md" index 899621ed4b29fd59c9b94913df9fe007e25ad634..06b846c821088154f57a50b5c2f262535d6d723c 100644 --- "a/docs/zh/docs/ShangMi/TLCP\345\215\217\350\256\256\346\240\210.md" +++ "b/docs/zh/docs/ShangMi/TLCP\345\215\217\350\256\256\346\240\210.md" @@ -2,7 +2,7 @@ ## 概述 -TLCP是指符合《GB/T38636 2020信息安全技术 传输层密码协议(TLCP)》的安全通信协议,其特点是采用加密证书/私钥和签名证书/私钥相分离的方式。openEuler 22.09版本之后发布的openSSL软件在开源版本的基础上增加了对商密TLCP协议的支持,提供了如下主要的功能: +TLCP是指符合《GB/T38636 2020信息安全技术 传输层密码协议(TLCP)》的安全通信协议,其特点是采用加密证书/私钥和签名证书/私钥相分离的方式。openEuler 发布的openSSL软件在开源版本的基础上增加了对商密TLCP协议的支持,提供了如下主要的功能: - 新增对TLCP商密双证书加载的支持; - 新增对ECC_SM4_CBC_SM3和ECDHE_SM4_CBC_SM3算法套件的支持; @@ -16,6 +16,7 @@ openEuler操作系统安装的openSSL软件版本号大于1.1.1m-4: $ rpm -qa openssl openssl-1.1.1m-6.oe2209.x86_64 ``` +注意:当前仅openssl 1.1.1支持TLCP协议栈,openssl 3.x版本暂未支持。 ## 如何使用 diff --git "a/docs/zh/docs/ShangMi/\345\256\211\345\205\250\345\220\257\345\212\250.md" "b/docs/zh/docs/ShangMi/\345\256\211\345\205\250\345\220\257\345\212\250.md" index bb16a42f41eb358c9165e303c9f7dcb2d3b9928e..d4cf5b791b34f52fc9b6e67cab01160e013d6395 100644 --- "a/docs/zh/docs/ShangMi/\345\256\211\345\205\250\345\220\257\345\212\250.md" +++ "b/docs/zh/docs/ShangMi/\345\256\211\345\205\250\345\220\257\345\212\250.md" @@ -30,7 +30,7 @@ crypto-policies-20200619-3.git781bbd4.oe2203.noarch 2. 下载openEuler shim组件源码,注意需要检查spec文件中的版本号大于15.6-7: ```shell -git clone https://gitee.com/src-openeuler/shim.git -b openEuler-22.03-LTS-SP1 --depth 1 +git clone https://gitee.com/src-openeuler/shim.git -b openEuler-{version} --depth 1 ``` 3. 
安装编译shim组件所需要的软件包: diff --git "a/docs/zh/docs/ShangMi/\346\226\207\344\273\266\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244.md" "b/docs/zh/docs/ShangMi/\346\226\207\344\273\266\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244.md" index c65eb80e5e2e1dc22a6ef7276b8a5fb65d25a296..ba5628edfb7c8125d436dee0f2960366979b539c 100644 --- "a/docs/zh/docs/ShangMi/\346\226\207\344\273\266\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244.md" +++ "b/docs/zh/docs/ShangMi/\346\226\207\344\273\266\345\256\214\346\225\264\346\200\247\344\277\235\346\212\244.md" @@ -7,46 +7,34 @@ IMA全称Integrity Measurement Architecture,是Linux内核提供的强制访 ### 前置条件 1. 准备openEuler内核编译环境,可参考: -2. 内核模块签名支持商密算法在openEuler 5.10内核支持,建议选取最新5.10内核源码进行编译; -3. 生成内核SM2根证书: - ```sh - # 生成证书配置文件(配置文件其他字段可按需定义) - # echo 'subjectKeyIdentifier=hash' > ca.cfg - # 生成SM2签名私钥 - # openssl ecparam -genkey -name SM2 -out ca.key - # 生成签名请求 - # openssl req -new -sm3 -key ca.key -out ca.csr - # 生成SM2证书 - # openssl x509 -req -days 3650 -extfile ca.cfg -signkey ca.key -in ca.csr -out ca.crt - ``` +2. 建议选取最新内核源码进行编译; -4. 生成IMA二级证书: +3. 生成IMA校验证书(仅评估模式涉及): - ```sh - # 创建证书配置文件 + ``` + # 生成证书配置文件(配置文件其他字段可按需定义) echo 'subjectKeyIdentifier=hash' > ima.cfg echo 'authorityKeyIdentifier=keyid,issuer' >> ima.cfg - # 生成私钥 - openssl ecparam -genkey -name SM2 -out ima.key + echo 'keyUsage=digitalSignature,nonRepudiation' >> ima.cfg + # 生成SM2签名私钥 + # openssl ecparam -genkey -name SM2 -out ima.key # 生成签名请求 - openssl req -new -sm3 -key ima.key -out ima.csr - # 基于一级证书生成二级证书 - openssl x509 -req -sm3 -CAcreateserial -CA ca.crt -CAkey ca.key -extfile ima.cfg -in ima.csr -out ima.crt - # 转换为DER格式 - openssl x509 -outform DER -in ima.crt -out x509_ima.der + # openssl req -new -sm3 -key ima.key -out ima.csr + # 生成SM2证书 + # openssl x509 -req -days 3650 -extfile ima.cfg -signkey ima.key -in ima.csr -out ima.crt ``` -5. 将根证书放置到内核源码目录,并修改内核编译选项CONFIG_SYSTEM_TRUSTED_KEYS,将指定证书编译到内核TRUSTED密钥中: +4. 
将根证书放置到内核源码目录,并修改内核编译选项CONFIG_SYSTEM_TRUSTED_KEYS,将指定证书编译到内核TRUSTED密钥中(仅评估模式涉及): ```sh - # cp /path/to/ca.crt . + # cp /path/to/ima.crt . # make openeuler_defconfig # cat .config | grep CONFIG_SYSTEM_TRUSTED_KEYS - CONFIG_SYSTEM_TRUSTED_KEYS="ca.crt" + CONFIG_SYSTEM_TRUSTED_KEYS="ima.crt" ``` -6. 编译并安装内核: +5. 编译并安装内核(仅评估模式涉及): ```sh make -j64 @@ -123,31 +111,6 @@ $ rpm -qa ima-evm-utils ima-evm-utils-1.3.2-4.oe2209.x86_64 ``` -生成IMA二级证书: - -```sh -# 创建证书配置文件 -# echo 'subjectKeyIdentifier=hash' > ima.cfg -# echo 'authorityKeyIdentifier=keyid,issuer' >> ima.cfg -# 生成私钥 -# openssl ecparam -genkey -name SM2 -out ima.key -# 生成签名请求 -# openssl req -new -sm3 -key ima.key -out ima.csr -# 基于一级证书生成二级证书 -# openssl x509 -req -sm3 -CAcreateserial -CA ca.crt -CAkey ca.key -extfile ima.cfg -in ima.csr -out ima.crt -# 转换为DER格式 -# openssl x509 -outform DER -in ima.crt -out x509_ima.der -``` - -将IMA证书放置在/etc/keys目录下,执行dracut重新制作initrd: - -```sh -# mkdir -p /etc/keys -# cp x509_ima.der /etc/keys -# echo 'install_items+=" /etc/keys/x509_ima.der "' >> /etc/dracut.conf -# dracut -f -``` - 对需要进行保护的文件执行签名操作,如此处对/usr/bin目录下所有root用户的可执行文件进行签名: ```sh @@ -212,7 +175,7 @@ cat /sys/kernel/security/ima/ascii_runtime_measurements ...... ``` -#### 配置SM2证书校验摘要列表 +#### 配置SM2证书校验摘要列表(评估模式) **前置条件:** @@ -228,60 +191,22 @@ digest-list-tools-0.3.95-10.oe2209.x86_64 **执行步骤**: -1. 生成IMA/EVM二级证书(需要为内核预置的商密根证书的子证书): +将IMA摘要列表使用IMA/EVM证书对应的私钥进行签名,签名后可被正常导入内核: - ```sh - # 创建证书配置文件 - # echo 'subjectKeyIdentifier=hash' > ima.cfg - # echo 'authorityKeyIdentifier=keyid,issuer' >> ima.cfg - # 生成私钥 - # openssl ecparam -genkey -name SM2 -out ima.key - # 生成签名请求 - # openssl req -new -sm3 -key ima.key -out ima.csr - # 基于一级证书生成二级证书 - # openssl x509 -req -sm3 -CAcreateserial -CA ca.crt -CAkey ca.key -extfile ima.cfg -in ima.csr -out ima.crt - # 转换为DER格式 - # openssl x509 -outform DER -in ima.crt -out x509_ima.der - # openssl x509 -outform DER -in ima.crt -out x509_evm.der - ``` - -2. 
将IMA/EVM证书放置在/etc/keys目录下,执行dracut重新制作initrd: - - ```sh - # mkdir -p /etc/keys - # cp x509_ima.der /etc/keys - # cp x509_evm.der /etc/keys - # echo 'install_items+=" /etc/keys/x509_ima.der /etc/keys/x509_evm.der "' >> /etc/dracut.conf - # dracut -f -e xattr - ``` - -3. 配置启动参数,开启IMA摘要列表功能,重启后可以检查证书被导入IMA/EVM密钥环: - - ```sh - # cat /proc/keys - ...... - 024dee5e I------ 1 perm 1f0f0000 0 0 keyring .evm: 1 - ...... - 3980807f I------ 1 perm 1f0f0000 0 0 keyring .ima: 1 - ...... - ``` - -4. 将IMA摘要列表使用IMA/EVM证书对应的私钥进行签名,签名后可被正常导入内核: - - ```sh - # 使用evmctl对摘要列表进行签名 - # evmctl ima_sign --key /path/to/ima.key -a sm3 0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 - # 检查签名后的扩展属性 - # getfattr -m - -d 0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 - file: 0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 - security.ima=0sAwIRNJFkBQBHMEUCIQCzdKVWdxw1hoVm9lgZB6sl+sxapptUFNjqHt5XZD87hgIgBMuZqBdrcNm7fXq/reQw7rzY/RN/UXPrIOxrVvpTouw= - security.selinux="unconfined_u:object_r:admin_home_t:s0" - # 将签名后的摘要列表文件导入内核 - # echo /root/tree/etc/ima/digest_lists/0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 > /sys/kernel/security/ima/digest_list_data - # 检查度量日志,可以看到摘要列表的导入记录 - # cat /sys/kernel/security/ima/ascii_runtime_measurements - 11 43b6981f84ba2725d05e91f19577cedb004adffb ima-sig sm3:b9430bbde2b7f30e935d91e29ab6778b6a825a2c3e5e7255895effb8747b7c1a /root/tree/etc/ima/digest_lists/0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 0302113491640500473045022100b374a556771c35868566f6581907ab25facc5aa69b5414d8ea1ede57643f3b86022004cb99a8176b70d9bb7d7abfade430eebcd8fd137f5173eb20ec6b56fa53a2ec - ``` +```sh +# 使用evmctl对摘要列表进行签名 +# evmctl ima_sign --key /path/to/ima.key -a sm3 0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 +# 检查签名后的扩展属性 +# getfattr -m - -d 0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 +file: 0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 
+security.ima=0sAwIRNJFkBQBHMEUCIQCzdKVWdxw1hoVm9lgZB6sl+sxapptUFNjqHt5XZD87hgIgBMuZqBdrcNm7fXq/reQw7rzY/RN/UXPrIOxrVvpTouw=
+security.selinux="unconfined_u:object_r:admin_home_t:s0"
+# 将签名后的摘要列表文件导入内核
+# echo /root/tree/etc/ima/digest_lists/0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 > /sys/kernel/security/ima/digest_list_data
+# 检查度量日志,可以看到摘要列表的导入记录
+# cat /sys/kernel/security/ima/ascii_runtime_measurements
+11 43b6981f84ba2725d05e91f19577cedb004adffb ima-sig sm3:b9430bbde2b7f30e935d91e29ab6778b6a825a2c3e5e7255895effb8747b7c1a /root/tree/etc/ima/digest_lists/0-metadata_list-compact-tree-1.8.0-2.oe2209.x86_64 0302113491640500473045022100b374a556771c35868566f6581907ab25facc5aa69b5414d8ea1ede57643f3b86022004cb99a8176b70d9bb7d7abfade430eebcd8fd137f5173eb20ec6b56fa53a2ec
+```
 
 **注意:**
 
@@ -306,6 +231,8 @@
 
    Cannon parse /etc/ima/digest_lists/0-metadata_list-rpm-......
    ```
 
+3. 当前openEuler 24.03内核暂不支持二级商密IMA证书(/etc/keys/x509_ima.der和/etc/keys/x509_evm.der)导入。
+
 ## 轻量入侵检测(AIDE)
 
 AIDE是一款轻量级的入侵检测工具,主要通过检测文件的完整性,以及时发现针对系统的恶意入侵行为。AIDE数据库能够使用sha256、sha512等哈希算法,用密文形式建立每个文件的校验码或散列号。openEuler提供的AIDE在开源软件的基础上新增了对SM3算法的支持。
diff --git "a/docs/zh/docs/ShangMi/\346\246\202\350\277\260.md" "b/docs/zh/docs/ShangMi/\346\246\202\350\277\260.md"
index 2b4ff12dc95ff2bc488afb751eff61e31b5b7531..bac4db358ed3a0b22b75db08f73025d209ab4d5a 100644
--- "a/docs/zh/docs/ShangMi/\346\246\202\350\277\260.md"
+++ "b/docs/zh/docs/ShangMi/\346\246\202\350\277\260.md"
@@ -1,4 +1,12 @@
 # 概述
+国产商用密码算法(后文简称商密)属于商用的、不涉及国家秘密的密码技术。密码算法是信息系统的安全技术基础,在国际上已经有广泛使用的RSA、AES、SHA256等密码算法。与之相对的,国内也有一系列自主研发的密码算法,可以覆盖主流的应用场景。其中在操作系统场景,应用相对广泛的算法是SM2/3/4:
+| 算法 | 是否公开 | 类型 | 应用场景 |
+|---|---|---|---|
+| SM2 | 是 | 非对称加解密算法 | 数字签名、密钥交换、加解密,广泛应用于PKI体系 |
+| SM3 | 是 | 杂凑算法(哈希算法) | 应用于完整性保护、单向加密等通用场景 |
+| SM4 | 是 | 对称加解密算法 | 数据加密存储、安全传输 |
+ 
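作为快速验证,可借助openssl命令行体验上表中的SM3/SM4算法(此处为演示性示例:假设环境中的openssl已启用国密算法支持,openEuler发布的openssl默认满足;示例中的密钥和IV仅为演示取值,请勿用于生产):

```shell
# SM3:计算标准测试向量 "abc" 的杂凑值
printf 'abc' | openssl dgst -sm3
# 预期摘要为 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0

# SM4:使用 128 位演示密钥对数据做对称加密,再解密还原
echo 'hello openEuler' > plain.txt
openssl enc -sm4-cbc -K 0123456789abcdeffedcba9876543210 \
    -iv 00000000000000000000000000000000 -in plain.txt -out cipher.bin
openssl enc -d -sm4-cbc -K 0123456789abcdeffedcba9876543210 \
    -iv 00000000000000000000000000000000 -in cipher.bin
# 解密输出应与 plain.txt 内容一致
```

SM2非对称密钥与证书的生成方法可参见“证书”章节中的openssl示例。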
+除此之外,还包括SM9、ZUC等公开算法,以及SM1、SM7等非公开算法。值得一提的是,所有已经公开的国产算法都已经纳入ISO/IEC标准,成为了被国际所认可的密码算法。围绕这些密码算法,我国制定并发布一系列密码技术规范和应用标准,如商密证书标准、TLCP协议栈等。它们共同构成了我国的商密标准体系,指导了国内的密码安全产业链的构建。
 
 openEuler操作系统商密支持旨在对操作系统的关键安全特性进行商密算法使能,并为上层应用提供商密算法库、证书、安全传输协议等密码服务。
@@ -15,4 +23,5 @@
 9. 内核模块签名/验签支持SM2证书;
 10. 内核KTLS支持SM4-CBC和SM4-GCM算法;
 11. 鲲鹏KAE加速引擎支持SM3/4算法加速;
-12. UEFI安全启动支持SM3摘要算法和SM2数字签名。
\ No newline at end of file
+12. UEFI安全启动支持SM3摘要算法和SM2数字签名;
+13. RPM支持国密SM2加解密算法+SM3摘要算法的签名及验签。
\ No newline at end of file
diff --git "a/docs/zh/docs/ShangMi/\347\243\201\347\233\230\345\212\240\345\257\206.md" "b/docs/zh/docs/ShangMi/\347\243\201\347\233\230\345\212\240\345\257\206.md"
index e708e6e488ae9f79634afdceba4b89e66645df07..3738629ad8116956c326ff99fc654ea9281eec3b 100644
--- "a/docs/zh/docs/ShangMi/\347\243\201\347\233\230\345\212\240\345\257\206.md"
+++ "b/docs/zh/docs/ShangMi/\347\243\201\347\233\230\345\212\240\345\257\206.md"
@@ -37,6 +37,7 @@
 
 ```
 # cryptsetup luksFormat /dev/sdd -c sm4-xts-plain64 --key-size 256 --hash sm3
+# cryptsetup luksOpen /dev/sdd crypt1
 ```
 
 b. plain模式
@@ -85,5 +86,5 @@ sdd 8:48 0 50G 0 disk
 
 关闭设备:
 
 ```
-# cryptsetup luksClose crypt1
+# cryptsetup close crypt1
 ```
\ No newline at end of file
diff --git "a/docs/zh/docs/ShangMi/\350\257\201\344\271\246.md" "b/docs/zh/docs/ShangMi/\350\257\201\344\271\246.md"
index 12d4ba5c104378351407e01d8eb23dcddbc1a2a8..f1ce7fdd9b6e18dac287e58544b75a1f131f15f7 100644
--- "a/docs/zh/docs/ShangMi/\350\257\201\344\271\246.md"
+++ "b/docs/zh/docs/ShangMi/\350\257\201\344\271\246.md"
@@ -45,7 +45,15 @@
 
 $ openssl x509 -text -in sm2.crt
 
 #### 使用x509命令(一般用于功能测试)
 
-1. 生成SM2签名私钥和签名请求:
+1. 生成CA私钥和证书:
+
+```
+$ openssl ecparam -genkey -name SM2 -out ca.key
+$ openssl req -new -sm3 -key ca.key -out ca.csr
+$ openssl x509 -req -days 3650 -signkey ca.key -in ca.csr -out ca.crt
+```
+
+2. 
生成二级签名私钥和签名请求: ``` $ openssl ecparam -genkey -name SM2 -out sm2.key diff --git "a/docs/zh/docs/StratoVirt/StratoVirt-VFIO\344\275\277\347\224\250\350\257\264\346\230\216.md" "b/docs/zh/docs/StratoVirt/StratoVirt-VFIO\344\275\277\347\224\250\350\257\264\346\230\216.md" index fc3b1afb4b84eb3004e13a3dd2c40f1fc4f114cb..948518e3f2b5499add5766989e89b13dfb76382b 100644 --- "a/docs/zh/docs/StratoVirt/StratoVirt-VFIO\344\275\277\347\224\250\350\257\264\346\230\216.md" +++ "b/docs/zh/docs/StratoVirt/StratoVirt-VFIO\344\275\277\347\224\250\350\257\264\346\230\216.md" @@ -84,7 +84,7 @@ 最后将该 PCI 设备重新绑定到 vfio-pci 驱动。 ```shell - lspci -ns 0000:03:00.0 |awk -F':| ' '{print 5" "6}' > /sys/bus/pci/drivers/vfio-pci/new_id + lspci -ns 0000:03:00.0 |awk -F':| ' '{print $5" "$6}' > /sys/bus/pci/drivers/vfio-pci/new_id ``` 将网卡绑定到 vfio-pci 驱动后,在主机上无法查询到对应网卡信息,只能查询到对应的 PCI 设备信息。 diff --git a/docs/zh/docs/StratoVirt/StratoVirtGuide.md b/docs/zh/docs/StratoVirt/StratoVirtGuide.md index f4a7fc9fb1c17a06728a1d5f91eecee87e4e654d..a30dd3ff222b84a3866aca5df1e25fb0480ef17c 100644 --- a/docs/zh/docs/StratoVirt/StratoVirtGuide.md +++ b/docs/zh/docs/StratoVirt/StratoVirtGuide.md @@ -1,4 +1,3 @@ # StratoVirt用户指南 本文档介绍Stratovirt虚拟化,并给出基于openEuler安装StratoVirt的方法,以及StratoVirt虚拟化的使用指导。让用户了解Stratovirt,并指导用户和管理员安装和使用StratoVirt。 - diff --git "a/docs/zh/docs/StratoVirt/StratoVirt\344\273\213\347\273\215.md" "b/docs/zh/docs/StratoVirt/StratoVirt\344\273\213\347\273\215.md" index 9842e25a699a15057e001e585e46da8d4a05ef60..d83c172b8ac2c919a9b334deab28d52afc8e4d03 100644 --- "a/docs/zh/docs/StratoVirt/StratoVirt\344\273\213\347\273\215.md" +++ "b/docs/zh/docs/StratoVirt/StratoVirt\344\273\213\347\273\215.md" @@ -1,13 +1,10 @@ # StratoVirt介绍 - ## 概述 StratoVirt是计算产业中面向云数据中心的企业级虚拟化平台,实现了一套架构统一支持虚拟机、容器、Serverless三种场景。StratoVirt在轻量低噪、软硬协同、Rust语言级安全等方面具备关键技术竞争优势。 StratoVirt在架构设计和接口上预留了组件化拼装的能力和接口,StratoVirt可以按需灵活组装高级特性直至演化到支持标准虚拟化,在特性需求、应用场景和轻快灵巧之间找到最佳的平衡点。 - - ## 架构说明 StratoVirt核心架构自顶向下分为三层: @@ -15,8 +12,8 @@ 
StratoVirt核心架构自顶向下分为三层: - 外部接口:兼容QMP(QEMU Monitor Protocol)协议,具有完备的OCI兼容能力,同时支持对接libvirt。 - BootLoader:轻量化场景下抛弃传统BIOS+GRUB的启动模式实现快速启动,同时标准虚拟化场景下支持UEFI启动。 - 模拟主板: - - microvm: 充分利用软硬协同能力,精简化设备模型,低时延资源伸缩能力。 - - 标准机型:提供ACPI表实现UEFI启动,支持添加virtio-pci以及VFIO直通设备等,极大提高虚拟机的I/O性能。 + - microvm: 充分利用软硬协同能力,精简化设备模型,低时延资源伸缩能力。 + - 标准机型:提供ACPI表实现UEFI启动,支持添加virtio-pci以及VFIO直通设备等,极大提高虚拟机的I/O性能。 整体架构视图如**图1**所示。 @@ -37,15 +34,15 @@ StratoVirt核心架构自顶向下分为三层: ## 实现 -#### 运行架构 +### 运行架构 - StratoVirt虚拟机是Linux中一个独立的进程。进程有三种线程:主线程、VCPU线程、I/O线程: - - 主线程是异步收集和处理来自外部模块(如VCPU线程)的事件的循环; - - 每个VCPU都有一个线程处理本VCPU的trap事件; - - 可以为I/O设备配置iothread提升I/O性能; + - 主线程是异步收集和处理来自外部模块(如VCPU线程)的事件的循环; + - 每个VCPU都有一个线程处理本VCPU的trap事件; + - 可以为I/O设备配置iothread提升I/O性能; -#### 约束 +### 约束 -- 仅支持Linux操作系统,推荐内核版本为4.19, 5.10; -- 虚拟机操作系统仅支持Linux,内核版本建议为4.19, 5.10; +- 仅支持Linux操作系统; +- 虚拟机操作系统仅支持Linux; - 最大支持254个CPU; diff --git "a/docs/zh/docs/StratoVirt/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" "b/docs/zh/docs/StratoVirt/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" index 5c17d7213dd925c78d18a73921501ad26119b99f..f4994ed79128e022967f0c1577fb9b7d7b7145b2 100644 --- "a/docs/zh/docs/StratoVirt/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" +++ "b/docs/zh/docs/StratoVirt/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" @@ -1,10 +1,8 @@ # 准备环境 - ## 使用说明 - StratoVirt仅支持运行于x86_64和AArch64处理器架构下并启动相同架构的Linux虚拟机。 -- 建议在 openEuler 22.03 LTS 版本编译、调测和部署该版本 StratoVirt。 - StratoVirt支持以非root权限运行。 ## 环境要求 @@ -15,8 +13,6 @@ - nmap工具 - Kernel镜像和rootfs镜像 - - ## 准备设备和工具 - StratoVirt运行需要实现mmio设备,所以运行之前确保存在设备`/dev/vhost-vsock` @@ -54,10 +50,10 @@ $ cd kernel ``` -2. 查看并切换kernel的版本到openEuler-22.03-LTS,参考命令如下: +2. 查看并切换kernel的版本到openEuler-24.03-LTS,参考命令如下: ``` - $ git checkout openEuler-22.03-LTS + $ git checkout openEuler-24.03-LTS ``` 3. 配置并编译Linux kernel。目前有两种方式可以生成配置文件:1. 
使用推荐配置([获取配置文件](https://gitee.com/openeuler/stratovirt/tree/master/docs/kernel_config)),将指定版本的推荐文件复制到kernel路径下并重命名为`.config`, 并执行命令`make olddefconfig`更新到最新的默认配置(否则后续编译可能有选项需要手动选择)。2. 通过以下命令进行交互,根据提示完成kernel配置,可能会提示缺少指定依赖,按照提示使用`yum install`命令进行安装。 @@ -78,9 +74,6 @@ $ make -j bzImage ``` - - ​ - ## 制作rootfs镜像 rootfs镜像是一种文件系统镜像,在StratoVirt启动时可以装载带有init的ext4格式的镜像。下面是制作ext4 rootfs镜像的简单方法。 diff --git "a/docs/zh/docs/StratoVirt/\345\257\271\346\216\245libvirt.md" "b/docs/zh/docs/StratoVirt/\345\257\271\346\216\245libvirt.md" index d59f4badae1e4b8763c8aa02d9d2bed6364a5980..7c9a39fbc20f10b8b78c74e3ef70ccc36b560009 100644 --- "a/docs/zh/docs/StratoVirt/\345\257\271\346\216\245libvirt.md" +++ "b/docs/zh/docs/StratoVirt/\345\257\271\346\216\245libvirt.md" @@ -187,7 +187,7 @@ StratoVirt 对接 libvirt 之前,需要先配置 XML 文件。本小节介绍 ###### 配置示例 -配置网络前请参考 [配置linux网桥](https://docs.openeuler.org/zh/docs/20.03_LTS_SP2/docs/Virtualization/%E5%87%86%E5%A4%87%E4%BD%BF%E7%94%A8%E7%8E%AF%E5%A2%83.html#%E5%87%86%E5%A4%87%E8%99%9A%E6%8B%9F%E6%9C%BA%E7%BD%91%E7%BB%9C),配置好 Linux 网桥。配置 mac 地址为:`de:ad:be:ef:00:01`,网桥为配置好的 br0 ,使用 virtio-net 设备,并将其挂载在 bus 为 2、slot 为 0,function 为 0 的 PCI 总线上,示例为: +配置网络前请参考 [配置linux网桥](https://docs.openeuler.org/zh/docs/24.03_LTS/docs/Virtualization/%E5%87%86%E5%A4%87%E4%BD%BF%E7%94%A8%E7%8E%AF%E5%A2%83.html#%E5%87%86%E5%A4%87%E8%99%9A%E6%8B%9F%E6%9C%BA%E7%BD%91%E7%BB%9C),配置好 Linux 网桥。配置 mac 地址为:`de:ad:be:ef:00:01`,网桥为配置好的 br0 ,使用 virtio-net 设备,并将其挂载在 bus 为 2、slot 为 0,function 为 0 的 PCI 总线上,示例为: ```xml diff --git "a/docs/zh/docs/SysCare/\344\275\277\347\224\250SysCare.md" "b/docs/zh/docs/SysCare/\344\275\277\347\224\250SysCare.md" index 4fbd5748f29d20eb8c60d01f49a18cad302a5c24..105076742569e17dbad1119ce96c6972c0900437 100644 --- "a/docs/zh/docs/SysCare/\344\275\277\347\224\250SysCare.md" +++ "b/docs/zh/docs/SysCare/\344\275\277\347\224\250SysCare.md" @@ -2,7 +2,7 @@ 本章介绍在openEuler中使用SysCare的方法。 ## 前提条件 -安装openEuler 23.09版本。 +安装openEuler 24.03 LTS SP1版本。 ## SysCare使用 本章节将介绍 
SysCare 的使用方法,包含热补丁制作及热补丁管理。 diff --git "a/docs/zh/docs/SysCare/\345\256\211\350\243\205SysCare.md" "b/docs/zh/docs/SysCare/\345\256\211\350\243\205SysCare.md" index de53088b8e7af2f6fc78c667de2e1c7221d36d1b..5394f0116019143b60bc3f4fbedbcf2f71b0eec5 100644 --- "a/docs/zh/docs/SysCare/\345\256\211\350\243\205SysCare.md" +++ "b/docs/zh/docs/SysCare/\345\256\211\350\243\205SysCare.md" @@ -9,7 +9,7 @@ * 100GB 硬盘 ### 前提条件 -安装openEuler 23.09版本。 +安装openEuler 24.03 LTS SP1版本。 ### 源码编译安装SysCare SysCare源码已经归档至代码仓,用户可自行下载并编译安装。 diff --git "a/docs/zh/docs/SystemOptimization/MySQL\346\200\247\350\203\275\350\260\203\344\274\230\346\214\207\345\215\227.md" "b/docs/zh/docs/SystemOptimization/MySQL\346\200\247\350\203\275\350\260\203\344\274\230\346\214\207\345\215\227.md" index c9b155884a311c945109acedc70a0b84dca5b0ef..8377e53b5fc243bb92c5c099f43daab24e9ca4d9 100644 --- "a/docs/zh/docs/SystemOptimization/MySQL\346\200\247\350\203\275\350\260\203\344\274\230\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/SystemOptimization/MySQL\346\200\247\350\203\275\350\260\203\344\274\230\346\214\207\345\215\227.md" @@ -110,7 +110,7 @@ numactl -C 0-90 -i 0-3 $mysql_path/bin/mysqld --defaults-file=/etc/my.cnf & #### 目的 -在高负载场景下,CPU利用率并不能达到100%,深入分析每个线程的调度轨迹发现内核在做负载均衡时,经常无法找到一个合适的进程来迁移,导致CPU在间断空闲负载均衡失败,空转浪费CPU资源,通过使能openEuler调度特性STEAL模式,可以进一步提高CPU利用率,从而有效提升系统性能。(**当前该特性仅在openEuler 20.03 SP2版本及之后版本支持**) +在高负载场景下,CPU利用率并不能达到100%,深入分析每个线程的调度轨迹发现内核在做负载均衡时,经常无法找到一个合适的进程来迁移,导致CPU在间断空闲负载均衡失败,空转浪费CPU资源,通过使能openEuler调度特性STEAL模式,可以进一步提高CPU利用率,从而有效提升系统性能。 #### 方法 diff --git "a/docs/zh/docs/TailorCustom/imageTailor\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/TailorCustom/imageTailor\344\275\277\347\224\250\346\214\207\345\215\227.md" index 72d3a4281f666f38958118d6149af570de0e6954..8b1984d32e137eb2de41f1cb37853e015139c9b2 100644 --- "a/docs/zh/docs/TailorCustom/imageTailor\344\275\277\347\224\250\346\214\207\345\215\227.md" +++ 
"b/docs/zh/docs/TailorCustom/imageTailor\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -15,16 +15,12 @@ ## 安装工具 -本节以 openEuler 22.03 LTS 版本 AArch64 架构为例,说明安装方法。 - ### 软硬件要求 安装和运行 imageTailor 需要满足以下软硬件要求: - 机器架构为 x86_64 或者 AArch64 -- 操作系统为 openEuler 22.03 LTS(该版本内核版本为 5.10,python 版本为 3.9,满足工具要求) - - 运行工具的机器根目录 '/' 需要 40 GB 以上空间 - python 版本 3.9 及以上 @@ -49,33 +45,30 @@ ```shell cd /root/temp - wget https://repo.openeuler.org/openEuler-22.03-LTS/ISO/aarch64/openEuler-22.03-LTS-everything-aarch64-dvd.iso - wget https://repo.openeuler.org/openEuler-22.03-LTS/ISO/aarch64/openEuler-22.03-LTS-everything-aarch64-dvd.iso.sha256sum + wget https://repo.openeuler.org/openEuler-{version}/ISO/aarch64/openEuler-{version}-everything-aarch64-dvd.iso + wget https://repo.openeuler.org/openEuler-{version}/ISO/aarch64/openEuler-{version}-everything-aarch64-dvd.iso.sha256sum ``` 2. 获取 sha256sum 校验文件中的校验值。 ```shell - cat openEuler-22.03-LTS-everything-aarch64-dvd.iso.sha256sum + cat openEuler-{version}-everything-aarch64-dvd.iso.sha256sum ``` 3. 计算 ISO 镜像文件的校验值。 ```shell - sha256sum openEuler-22.03-LTS-everything-aarch64-dvd.iso + sha256sum openEuler-{version}-everything-aarch64-dvd.iso ``` 4. 对比上述 sha256sum 文件的检验值和 ISO 镜像的校验值,如果两者相同,说明文件完整性检验成功。否则说明文件完整性被破坏,需要重新获取文件。 ### 安装 imageTailor -此处以 openEuler 22.03 LTS 版本的 AArch64 架构为例,介绍如何安装 imageTailor 工具。 - -1. 确认机器已经安装操作系统 openEuler 22.03 LTS( imageTailor 工具的运行环境)。 +1. 确认机器已经安装操作系统( imageTailor 工具的运行环境)。 ```shell - $ cat /etc/openEuler-release - openEuler release 22.03 LTS + # cat /etc/openEuler-release ``` 2. 创建文件 /etc/yum.repos.d/local.repo,配置对应 yum 源。配置内容参考如下,其中 baseurl 是用于挂载 ISO 镜像的目录: @@ -92,17 +85,17 @@ ```shell mkdir /root/imageTailor_mount - sudo mount -o loop /root/temp/openEuler-22.03-LTS-everything-aarch64-dvd.iso /root/imageTailor_mount/ + sudo mount -o loop /root/temp/openEuler-{version}-everything-aarch64-dvd.iso /root/imageTailor_mount/ ``` -4. 使 yum 源生效: +4. 使 yum 源生效。 ```shell yum clean all yum makecache ``` -5. 
使用 root 权限,安装 imageTailor 裁剪工具: +5. 使用 root 权限,安装 imageTailor 裁剪工具。 ```shell sudo yum install -y imageTailor @@ -184,7 +177,7 @@ imageTailor 工具安装完成后,工具包的目录结构如下: - 配置安全加固策略 - imageTailor 提供了默认地安全加固策略。用户可以根据业务需要,通过编辑 security_s.conf 对系统进行二次加固(仅在系统 ISO 镜像定制阶段),具体的操作方法请参见 《 [安全加固指南](https://docs.openeuler.org/zh/docs/22.03_LTS/docs/SecHarden/secHarden.html) 》。 + imageTailor 提供了默认地安全加固策略。用户可以根据业务需要,通过编辑 security_s.conf 对系统进行二次加固(仅在系统 ISO 镜像定制阶段),具体的操作方法请参见 《 [安全加固指南](https://docs.openeuler.org/zh/docs/24.03_LTS/docs/SecHarden/secHarden.html) 》。 - 制作操作系统 ISO 镜像 @@ -194,7 +187,7 @@ imageTailor 工具安装完成后,工具包的目录结构如下: 用户可以根据业务需要,将业务 RPM 包、自定义文件、驱动、命令和库文件打包至目标 ISO 镜像。 -#### 配置本地 repo 源 +#### 配置本地repo源 定制 ISO 操作系统镜像,必须在 /opt/imageTailor/repos/euler_base/ 目录配置 repo 源。本节主要介绍配置本地 repo 源的方法。 @@ -202,14 +195,14 @@ imageTailor 工具安装完成后,工具包的目录结构如下: ```shell cd /opt - wget https://repo.openeuler.org/openEuler-22.03-LTS/ISO/aarch64/openEuler-22.03-LTS-everything-aarch64-dvd.iso + wget https://repo.openeuler.org/openEuler-{version}/ISO/aarch64/openEuler-{version}-everything-aarch64-dvd.iso ``` 2. 创建挂载目录 /opt/openEuler_repo ,并挂载 ISO 到该目录 。 ```shell $ sudo mkdir -p /opt/openEuler_repo - $ sudo mount openEuler-22.03-LTS-everything-aarch64-dvd.iso /opt/openEuler_repo + $ sudo mount openEuler-{version}-everything-aarch64-dvd.iso /opt/openEuler_repo mount: /opt/openEuler_repo: WARNING: source write-protected, mounted read-only. ``` @@ -268,7 +261,7 @@ imageTailor 工具安装完成后,工具包的目录结构如下: > >- 下述 rpm.conf 和 cmd.conf 均在 /opt/imageTailor/custom/cfg_openEuler/ 目录下。 >- 下述 RPM 包裁剪粒度是指 sys_cut='no' 。裁剪粒度详情请参见 [配置主机参数](#配置主机参数) 。 ->- 若没有配置本地 repo 源,请参见 [配置本地 repo 源 ](#配置本地 repo 源)进行配置。 +>- 若没有配置本地 repo 源,请参见 [配置本地repo源 ](#配置本地repo源)进行配置。 > 1. 
确认 /opt/imageTailor/repos/euler_base/ 目录中是否包含需要添加的 RPM 包。 @@ -324,7 +317,7 @@ imageTailor 工具安装完成后,工具包的目录结构如下: - 添加库文件 ```shell - + @@ -461,7 +454,7 @@ openEuler 提供的默认配置如下,用户可以根据需要进行修改: > > - sys_cut='no' > -> 无论 sys_usrrpm_cut='no' 还是 sys_usrrpm_cut='yes' ,都为系统 RPM 包裁剪粒度,即imageTailor 会安装 repo 源中的 RPM 包和 usr_rpm 目录下的 RPM 包,但不会裁剪 RPM 包中的文件。即使用户不需要这些 RPM 包中的部分文件,imageTailor 也不会进行裁剪。 +> 无论 sys_usrrpm_cut='no' 还是 sys_usrrpm_cut='yes' ,都为系统 RPM 包裁剪粒度,即 imageTailor 会安装 repo 源中的 RPM 包和 usr_rpm 目录下的 RPM 包,但不会裁剪 RPM 包中的文件。即使用户不需要这些 RPM 包中的部分文件,imageTailor 也不会进行裁剪。 > > - sys_cut='yes' > @@ -619,14 +612,14 @@ hd0 /home max logical ext4 各参数含义如下: -- hd 磁盘号 +- 【hd 磁盘号】 磁盘的编号。请按照 hdx 的格式填写,x 指第 x 块盘。 >![](./public_sys-resources/icon-note.gif) **说明:** > >分区配置只在被安装机器的磁盘能被识别时才有效。 -- 挂载路径 +- 【挂载路径】 指定分区挂载的路径。用户既可以配置业务分区,也可以对默认配置中的系统分区进行调整。如果不挂载,则设置为 '-'。 >![](./public_sys-resources/icon-note.gif) **说明:** @@ -634,7 +627,7 @@ hd0 /home max logical ext4 >- 分区配置中必须有 '/' 挂载路径。其他的请用户自行调整。 >- 采用 UEFI 引导时,在 x86_64 的分区配置中必须有 '/boot' 挂载路径,在 AArch64 的分区配置中必须有 '/boot/efi' 挂载路径。 -- 分区大小 +- 【分区大小】 分区大小的取值有以下四种: - G/g:指定以 GB 为单位的分区大小,例如:2G。 @@ -643,21 +636,21 @@ hd0 /home max logical ext4 - MAX/max:指定将硬盘上剩余的空间全部用来创建一个分区。只能在最后一个分区配置该值。 >![](./public_sys-resources/icon-note.gif) **说明:** -> + > >- 分区大小不支持小数,如果是小数,请换算成其他单位,调整为整数的数值。例如:不能填写 1.5G,应填写为 1536M。 >- 分区大小取 MAX/max 值时,剩余分区大小不能超过支持文件系统类型的限制(默认文件系统类型 ext4,限制大小 16T)。 -- 分区类型 - 分区有以下三种: +- 【分区类型】 + 分区有以下三种类型: - 主分区: primary - 扩展分区:extended(该分区只需配置 hd 磁盘号即可) - 逻辑分区:logical -- 文件系统类型 +- 【文件系统类型】 目前支持的文件系统类型有:ext4、vfat -- 二次格式化标志位 +- 【二次格式化标志位】 可选配置,表示二次安装时是否格式化: - 是:yes @@ -685,13 +678,13 @@ STARTMODE="auto" 各参数含义请参见下表: -- | 参数名称 | 是否必配 | 参数值 | 说明 | - | :-------- | -------- | :------------------------------------------------ | :----------------------------------------------------------- | - | BOOTPROTO | 是 | none / static / dhcp | none:引导时不使用协议,不配地址
static:静态分配地址
dhcp:使用 DHCP 协议动态获取地址 | - | DEVICE | 是 | 如:eth1 | 网卡名称 | - | IPADDR | 是 | 如:192.168.11.100 | IP 地址
当 BOOTPROTO 参数为 static 时,该参数必配;其他情况下,该参数不用配置 | - | NETMASK | 是 | - | 子网掩码
当 BOOTPROTO 参数为 static 时,该参数必配;其他情况下,该参数不用配置 | - | STARTMODE | 是 | manual / auto / hotplug / ifplugd / nfsroot / off | 启用网卡的方法:
manual:用户在终端执行 ifup 命令启用网卡。
auto \ hotplug \ ifplug \ nfsroot:当 OS 识别到该网卡时,便启用该网卡。
off:任何情况下,网卡都无法被启用。
各参数更具体的说明请在制作 ISO 镜像的机器上执行 `man ifcfg` 命令查看。 | +| 参数名称 | 是否必配 | 参数值 | 说明 | +| :-------- | -------- | :------------------------------------------------ | :----------------------------------------------------------- | +| BOOTPROTO | 是 | none / static / dhcp | none:引导时不使用协议,不配地址
static:静态分配地址
dhcp:使用 DHCP 协议动态获取地址 | +| DEVICE | 是 | 如:eth1 | 网卡名称 | +| IPADDR | 是 | 如:192.168.11.100 | IP 地址
当 BOOTPROTO 参数为 static 时,该参数必配;其他情况下,该参数不用配置 | +| NETMASK | 是 | - | 子网掩码
当 BOOTPROTO 参数为 static 时,该参数必配;其他情况下,该参数不用配置 | +| STARTMODE | 是 | manual / auto / hotplug / ifplugd / nfsroot / off | 启用网卡的方法:
manual:用户在终端执行 ifup 命令启用网卡。
auto \ hotplug \ ifplugd \ nfsroot:当 OS 识别到该网卡时,便启用该网卡。
off:任何情况下,网卡都无法被启用。
各参数更具体的说明请在制作 ISO 镜像的机器上执行 `man ifcfg` 命令查看。 | #### 配置内核参数 @@ -759,11 +752,13 @@ mkdliso -p openEuler -c custom/cfg_openEuler [--minios yes|no|force] [--sec] [-h > > - mkdliso 所在的绝对路径中不能有空格,否则会导致制作 ISO 失败。 > - 制作 ISO 的环境中,umask 的值必须设置为 0022。 +> - 命令需在/opt/imageTailor目录中执行 1. 使用 root 权限,执行 mkdliso 命令,生成 ISO 镜像文件。参考命令如下: ```shell - # sudo /opt/imageTailor/mkdliso -p openEuler -c custom/cfg_openEuler --sec + # cd /opt/imageTailor/ + # sudo mkdliso -p openEuler -c custom/cfg_openEuler --sec ``` 命令执行完成后,制作出的新文件在 /opt/imageTailor/result/{日期} 目录下,包括 openEuler-aarch64.iso 和 openEuler-aarch64.iso.sha256 。 @@ -821,8 +816,7 @@ Pacific/ zone.tab 1. 检查制作 ISO 所在环境是否满足要求。 ``` shell - $ cat /etc/openEuler-release - openEuler release 22.03 LTS + # cat /etc/openEuler-release ``` 2. 确保根目录有 40 GB 以上空间。 @@ -850,9 +844,9 @@ Pacific/ zone.tab 4. 配置本地 repo 源。 ```shell - $ wget https://repo.openeuler.org/openEuler-22.03-LTS/ISO/aarch64/openEuler-22.03-LTS-everything-aarch64-dvd.iso + $ wget https://repo.openeuler.org/openEuler-{version}/ISO/aarch64/openEuler-{version}-everything-aarch64-dvd.iso $ sudo mkdir -p /opt/openEuler_repo - $ sudo mount openEuler-22.03-LTS-everything-aarch64-dvd.iso /opt/openEuler_repo + $ sudo mount openEuler-{version}-everything-aarch64-dvd.iso /opt/openEuler_repo mount: /opt/openEuler_repo: WARNING: source write-protected, mounted read-only. $ sudo rm -rf /opt/imageTailor/repos/euler_base && sudo mkdir -p /opt/imageTailor/repos/euler_base $ sudo cp -ar /opt/openEuler_repo/Packages/* /opt/imageTailor/repos/euler_base @@ -863,7 +857,7 @@ Pacific/ zone.tab $ cd /opt/imageTailor ``` -5. 修改 grub/root 密码 +5. 
修改 grub/root 密码。 以下 ${pwd} 的实际内容请参见 [配置初始密码](#配置初始密码) 章节生成并替换。 diff --git "a/docs/zh/docs/TailorCustom/isocut\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/zh/docs/TailorCustom/isocut\344\275\277\347\224\250\346\214\207\345\215\227.md" index 79d839523105d0e9415064c965a7f5b294ae2463..19790dae0f02539e6a3dd2faf173f487e76c02ea 100644 --- "a/docs/zh/docs/TailorCustom/isocut\344\275\277\347\224\250\346\214\207\345\215\227.md" +++ "b/docs/zh/docs/TailorCustom/isocut\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -23,33 +23,22 @@ openEuler 光盘镜像较大,下载、传输镜像很耗时。另外,使用 使用 openEuler 裁剪定制工具制作 ISO 所使用的机器需要满足如下软硬件要求: - CPU 架构为 AArch64 或者 x86_64 -- 操作系统为 openEuler 20.03 LTS SP3 - 建议预留 30 GB 以上的磁盘空间(用于运行裁剪定制工具和存放 ISO 镜像) ## 安装工具 -此处以 openEuler 20.03 LTS SP3 版本的 AArch64 架构为例,介绍 ISO 镜像裁剪定制工具的安装操作。 - -1. 确认机器已安装操作系统 openEuler 20.03 LTS SP3(镜像裁剪定制工具的运行环境)。 +1. 确认机器已安装操作系统。 ``` shell script - $ cat /etc/openEuler-release - openEuler release 20.03 (LTS-SP3) + # cat /etc/openEuler-release ``` 2. 下载对应架构的 ISO 镜像(必须是 everything 版本),并存放在任一目录(建议该目录磁盘空间大于 20 GB),此处假设存放在 /home/isocut_iso 目录。 - AArch64 架构的镜像下载链接为: - - https://repo.openeuler.org/openEuler-20.03-LTS-SP3/ISO/aarch64/openEuler-20.03-LTS-SP3-everything-aarch64-dvd.iso + AArch64 架构的镜像下载链接为:https://repo.openeuler.org/openEuler-{version}/ISO/aarch64/openEuler-{version}-everything-aarch64-dvd.iso - > **说明:** - > x86_64 架构的镜像下载链接为: - > - > https://repo.openeuler.org/openEuler-20.03-LTS-SP3/ISO/x86_64/openEuler-20.03-LTS-SP3-everything-x86_64-dvd.iso +3. 创建文件 /etc/yum.repos.d/local.repo,配置对应 yum 源。配置内容参考如下,其中 baseurl 是用于挂载 ISO 镜像的目录。 -3. 创建文件 /etc/yum.repos.d/local.repo,配置对应 yum 源。配置内容参考如下,其中 baseurl 是用于挂载 ISO 镜像的目录: - ``` shell script [local] name=local @@ -57,21 +46,21 @@ openEuler 光盘镜像较大,下载、传输镜像很耗时。另外,使用 gpgcheck=0 enabled=1 ``` - -4. 使用 root 权限,挂载光盘镜像到 /home/isocut_mount 目录(请与上述 repo 文件中配置的 baseurl 保持一致)作为 yum 源,参考命令如下: + +4. 
使用 root 权限,挂载光盘镜像到 /home/isocut_mount 目录(请与上述 repo 文件中配置的 baseurl 保持一致)作为 yum 源。参考命令如下: ```shell - sudo mount -o loop /home/isocut_iso/openEuler-20.03-LTS-SP3-everything-aarch64-dvd.iso /home/isocut_mount + sudo mount -o loop /home/isocut_iso/openEuler-24.03-LTS-everything-aarch64-dvd.iso /home/isocut_mount ``` -5. 使 yum 源生效: +5. 使 yum 源生效。 ```shell yum clean all yum makecache ``` -6. 使用 root 权限,安装镜像裁剪定制工具: +6. 使用 root 权限,安装镜像裁剪定制工具。 ```shell sudo yum install -y isocut @@ -165,7 +154,7 @@ isocut 为用户提供了 kickstart 模板,路径是 /etc/isocut/anaconda-ks.c rootpw --iscrypted ${pwd} ``` -这里给出设置 root 初始密码的方法(需使用 root 权限): +这里给出设置 root 初始密码的方法(需使用 root 权限)。 1. 添加用于生成密码的用户,此处假设 testUser。 @@ -173,7 +162,7 @@ rootpw --iscrypted ${pwd} $ sudo useradd testUser ``` -2. 设置 testUser 用户的密码。参考命令如下,根据提示设置密码: +2. 设置 testUser 用户的密码。参考命令如下,根据提示设置密码。 ``` shell script $ sudo passwd testUser @@ -190,14 +179,14 @@ rootpw --iscrypted ${pwd} testUser:***:19052:0:90:7:35:: ``` -4. 拷贝上述加密密码替换 /etc/isocut/anaconda-ks.cfg 中的 pwd 字段,如下所示(请用实际内容替换 *** ): +4. 拷贝上述加密密码替换 /etc/isocut/anaconda-ks.cfg 中的 pwd 字段,如下所示(请用实际内容替换 *** )。 ``` shell script rootpw --iscrypted *** ``` ###### 配置 grub2 初始密码 -/etc/isocut/anaconda-ks.cfg 文件中添加以下配置,配置 grub2 初始密码。其中 ${pwd} 需要替换成用户实际的加密密文: +/etc/isocut/anaconda-ks.cfg 文件中添加以下配置,配置 grub2 初始密码。其中 ${pwd} 需要替换成用户实际的加密密文。 ```shell %addon com_huawei_grub_safe --iscrypted --password='${pwd}' @@ -211,7 +200,7 @@ rootpw --iscrypted ${pwd} > > - 系统中需有 grub2-set-password 命令,若不存在,请提前安装该命令。 -1. 执行如下命令,根据提示设置 grub2 密码: +1. 执行如下命令,根据提示设置 grub2 密码。 ```shell $ sudo grub2-set-password -o ./ @@ -229,7 +218,7 @@ rootpw --iscrypted ${pwd} GRUB2_PASSWORD=grub.pbkdf2.sha512.*** ``` -3. 复制上述密文,并在 /etc/isocut/anaconda-ks.cfg 文件中增加如下配置: +3. 
复制上述密文,并在 /etc/isocut/anaconda-ks.cfg 文件中增加如下配置。 ```shell %addon com_huawei_grub_safe --iscrypted --password='grub.pbkdf2.sha512.***' @@ -274,7 +263,7 @@ kernel.aarch64 > >- 请不要修改或删除 /etc/isocut/rpmlist 文件中的默认配置项。 >- isocut 的所有操作需要使用 root 权限。 ->- 待裁剪的源镜像可以为基础镜像,也可以是 everything 版镜像,例子中以基础版镜像 openEuler-20.03-LTS-SP3-aarch64-dvd.iso 为例。 +>- 待裁剪的源镜像可以为基础镜像,也可以是 everything 版镜像,例子中以基础版镜像 openEuler-24.03-LTS-aarch64-dvd.iso 为例。 >- 例子中假设新生成的镜像名称为 new.iso,且存放在 /home/result 路径;运行工具的临时目录为 /home/temp;额外的 RPM 软件包存放在 /home/rpms 目录。 @@ -304,7 +293,7 @@ kernel.aarch64 **场景一**:新镜像的所有 RPM 包来自原有 ISO 镜像 ``` shell script - $ sudo isocut -t /home/temp /home/isocut_iso/openEuler-20.03-LTS-SP3-aarch64-dvd.iso /home/result/new.iso + $ sudo isocut -t /home/temp /home/isocut_iso/openEuler-24.03-LTS-aarch64-dvd.iso /home/result/new.iso Checking input ... Checking user ... Checking necessary tools ... @@ -327,12 +316,12 @@ kernel.aarch64 **场景二**:新镜像的 RPM 包除来自原有 ISO 镜像,还包含来自 /home/rpms 的额外软件包 ```shell - sudo isocut -t /home/temp -r /home/rpms /home/isocut_iso/openEuler-20.03-LTS-SP3-aarch64-dvd.iso /home/result/new.iso + sudo isocut -t /home/temp -r /home/rpms /home/isocut_iso/openEuler-24.03-LTS-aarch64-dvd.iso /home/result/new.iso ``` **场景三**:使用 kickstart 文件实现自动化安装,需要修改 /etc/isocut/anaconda-ks.cfg 文件 ```shell - sudo isocut -t /home/temp -k /etc/isocut/anaconda-ks.cfg /home/isocut_iso/openEuler-20.03-LTS-SP3-aarch64-dvd.iso /home/result/new.iso + sudo isocut -t /home/temp -k /etc/isocut/anaconda-ks.cfg /home/isocut_iso/openEuler-24.03-LTS-aarch64-dvd.iso /home/result/new.iso ``` @@ -365,9 +354,9 @@ kernel.aarch64 1. 增加缺少的包 - 1. 根据报错的提示整理缺少的 RPM 包列表 + 1. 根据报错的提示整理缺少的 RPM 包列表。 2. 将上述 RPM 包列表添加到配置文件 /etc/isocut/rpmlist 中。 - 3. 再次裁剪安装 iso 镜像 + 3. 
再次裁剪安装 iso 镜像。 以问题描述中的缺包情况为例,修改 rpmlist 配置文件如下: ```shell diff --git a/docs/zh/docs/Virtualization/LibcarePlus.md b/docs/zh/docs/Virtualization/LibcarePlus.md index 52f5fef43ad2e035748ac5ce8bd91f69cb7fac49..ce7cd7d736ef7cffba4a45508322d3192d136e92 100644 --- a/docs/zh/docs/Virtualization/LibcarePlus.md +++ b/docs/zh/docs/Virtualization/LibcarePlus.md @@ -6,6 +6,7 @@ - [安装 LibcarePlus](#安装-libcareplus) - [制作 LibcarePlus 热补丁](#制作-libcareplus-热补丁) - [应用 LibcarePlus 热补丁](#应用-libcareplus-热补丁) +- [使用 LibcarePlus 工具制作 qemu 热补丁](#使用-libcareplus-工具制作-qemu-热补丁) ## 概述 @@ -395,3 +396,98 @@ LibcarePlus 支持如下方式制作热补丁: Hello world! Hello world! ``` + +## 使用 LibcarePlus 工具制作 qemu 热补丁 + +制作方法如下: + +### 1.下载qemu制品仓代码,保持代码版本与openEuler环境中qemu版本一致 + ```shell +# 下载qemu源码并解压 +yum download --source qemu +rpm2cpio qemu-8.2.0-13.oe2403.src.rpm | cpio -id + ``` + +### 2.编译qemu制品仓代码 + +- 将解压后的qemu源码挪至/root/rpmbuild/SOURCES(由多个patch、一个qemu.spec、一个qemu-8.2.0.tar.xz组成) + +- 编译qemu.spec + + ```shell + rpmbuild -ba qemu.spec + ``` + +有两份成果物: +- /root/rpmbuild/BUILD/qemu-8.2.0中生成中间代码,为编译qemu对应代码。将代码拷贝到/home/abuild/rpmbuild/BUILD/qemu-8.2.0,编译环境的路径也会影响补丁地址的偏移。 +- /root/rpmbuild/RPMS/中生成qemu相关的rpm包。 + +### 3.制作热补丁所需的patch文件 + + 使用git format-patch指令制作patch即可。 + + ```shell + # cat 0001-hack-hmp-qtree-info.patch + From bb2f4e6fe43ca7b3d73026966ac3411b2d8342b9 Mon Sep 17 00:00:00 2001 + From: zhangsan + Date: Mon, 7 Mar 2022 20:53:41 +0800 + Subject: [PATCH 1/3] hack hmp qtree info + + --- + softmmu/qdev-monitor.c | 1 + + 1 file changed, 1 insertion(+) + + diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c + index 05e1d88d99..96fd596c2e 100644 + --- a/softmmu/qdev-monitor.c + +++ b/softmmu/qdev-monitor.c + @@ -833,6 +833,7 @@ static void qbus_print(Monitor *mon, BusState *bus, int indent) + + void hmp_info_qtree(Monitor *mon, const QDict *qdict) + { + + fprintf(stderr, "---------------you hack me---------------------"); + if (sysbus_get_default()) + qbus_print(mon, sysbus_get_default(), 0); + } + 
-- + 2.33.0 + + ``` + +### 4.配置/etc/libcare.conf + /etc/libcare.conf填上patch文件修改的函数,用于后续制作补丁时,过滤掉不相关的函数; + 当前修改内容如下: + + ```shell + # cat /etc/libcare.conf + hmp_info_qtree + ``` + +### 5.查看qemu buildID + + ```shell + # whereis qemu-kvm + qemu-kvm: /usr/bin/qemu-kvm /usr/libexec/qemu-kvm + # file /usr/libexec/qemu-kvm + /usr/libexec/qemu-kvm: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=68f4ec13e140d3a688f3e0fb93442b8c7a86be8b, for GNU/Linux 3.7.0, stripped + ``` +注:需保持制作热补丁的环境和制作qemu包环境一致,buildID可作为二者是否一致的判定标准。因用户无qemu版本的制作环境,故可以自行编包并安装,使用自编包中的/usr/libexec/qemu-kvm的buildID。 + +### 6.制作热补丁 + +在/home/abuild/rpmbuild/BUILD/qemu-8.2.0/build中执行热补丁制作指令,**注意是build目录!!!** + +```shell +# libcare-patch-make --clean -s ../ 0002-patch-hello-qdm.patch -i 0001 --buildid=68f4ec13e140d3a688f3e0fb93442b8c7a86be8b -j 64 +``` +参数说明: + +--clean 类似make clean + +-s ../ 指定源文件夹,这里是上层目录 + +-i 0001 热补丁id + +buildid=xxx 保持和系统中qemu-kvm buildid一致 + +-j 64 多线程编译 \ No newline at end of file diff --git a/docs/zh/docs/Virtualization/Skylark.md index 06a0a2eea9f79491c09c1456473d3cdeaaf23367..eb146e5250e577c2c649d5fa94d09bef17fb4f11 100644 --- a/docs/zh/docs/Virtualization/Skylark.md +++ b/docs/zh/docs/Virtualization/Skylark.md @@ -21,7 +21,7 @@ 将业务区分优先级混合部署(下文简称混部)是典型有效的资源利用率提升手段。业务可根据时延敏感性分为高优先级业务和低优先级业务。当高优先级业务和低优先级业务发生资源竞争时,需优先保障高优先级业务的资源供给。因此,业务混部的核心技术是资源隔离控制,主要涉及内核态基础资源隔离技术及用户态 QoS 控制技术。 -本文描述的对象为用户态 QoS 控制技术,由 openEuler Skylark 组件承载,首发于 openEuler 22.09 版本。在 Skylark 视角下,优先级粒度为虚拟机级别,即给虚拟机新增高低优先级属性,以虚拟机为粒度进行资源的隔离和控制。Skylark 是一种混部场景下的 QoS 感知的资源调度器,在保障高优先级虚拟机 QoS 前提下提升物理机资源利用率。 +本文描述的对象为用户态 QoS 控制技术,由 openEuler Skylark 组件承载。在 Skylark 视角下,优先级粒度为虚拟机级别,即给虚拟机新增高低优先级属性,以虚拟机为粒度进行资源的隔离和控制。Skylark 是一种混部场景下的 QoS 感知的资源调度器,在保障高优先级虚拟机 QoS 前提下提升物理机资源利用率。 在实际应用场景中如何更好地利用 Skylark 的高低优先级特性,请参考[最佳实践](#最佳实践)章节。 diff --git 
"a/docs/zh/docs/Virtualization/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" "b/docs/zh/docs/Virtualization/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" index bb05c54bfbb483023f205d6bb60545122c3209c4..aa3655d0398cde0ac0a9dbde86923b072d283d71 100644 --- "a/docs/zh/docs/Virtualization/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" +++ "b/docs/zh/docs/Virtualization/\345\207\206\345\244\207\344\275\277\347\224\250\347\216\257\345\242\203.md" @@ -13,7 +13,7 @@ 1. 使用root用户安装qemu-img软件包。 ```sh - # yum install -y qemu-img + yum install -y qemu-img ``` 2. 使用qemu-img工具的create命令,创建镜像文件,命令格式为: @@ -62,7 +62,7 @@ corrupt: false ``` -2. 修改镜像磁盘空间大小,命令如下,其中_imgFiLeName_为镜像名称,“+”和“-”分别表示需要增加或减小的镜像磁盘空间大小,单位为K、M、G、T,代表KiB、MiB、GiB、TiB。 +2. 修改镜像磁盘空间大小,命令如下,其中 _imgFiLeName_ 为镜像名称,“+”和“-”分别表示需要增加或减小的镜像磁盘空间大小,单位为K、M、G、T,代表KiB、MiB、GiB、TiB。 ```sh qemu-img resize [+|-] @@ -117,39 +117,39 @@ Linux网桥通常通过brctl工具管理,其对应的安装包为bridge-utils,安装命令如下: ```sh - # yum install -y bridge-utils + yum install -y bridge-utils ``` 2. 创建网桥br0。 ```sh - # brctl addbr br0 + brctl addbr br0 ``` 3. 将物理网卡eth0绑定到Linux网桥。 ```sh - # brctl addif br0 eth0 + brctl addif br0 eth0 ``` 4. eth0与网桥连接后,不再需要IP地址,将eth0的IP设置为0.0.0.0。 ```sh - # yum install -y net-tools - # ifconfig eth0 0.0.0.0 + yum install -y net-tools + ifconfig eth0 0.0.0.0 ``` 5. 设置br0的IP地址。 - 如果有DHCP服务器,可以通过dhclient设置动态IP地址。 ```sh - # dhclient br0 + dhclient br0 ``` - 如果没有DHCP服务器,给br0配置静态IP,例如设置静态IP为192.168.1.2,子网掩码为255.255.255.0。 ```sh - # ifconfig br0 192.168.1.2 netmask 255.255.255.0 + ifconfig br0 192.168.1.2 netmask 255.255.255.0 ``` ### 搭建Open vSwitch网桥 @@ -227,32 +227,32 @@ Open vSwitch网桥,具有更便捷的自动化编排能力。搭建Open vSwitc 1. 创建Open vSwitch网桥br0。 ```sh - # ovs-vsctl add-br br0 + ovs-vsctl add-br br0 ``` 2. 将物理网卡eth0添加到br0。 ```sh - # ovs-vsctl add-port br0 eth0 + ovs-vsctl add-port br0 eth0 ``` 3. 
eth0与网桥连接后,不再需要IP地址,将eth0的IP设置为0.0.0.0。 ```sh - # ifconfig eth0 0.0.0.0 + ifconfig eth0 0.0.0.0 ``` 4. 为OVS网桥br0分配IP。 - 如果有DHCP服务器,可以通过dhclient设置动态IP地址。 ```sh - # dhclient br0 + dhclient br0 ``` - 如果没有DHCP服务器,给br0配置静态IP,例如192.168.1.2。 ```sh - # ifconfig br0 192.168.1.2 + ifconfig br0 192.168.1.2 ``` ## 准备引导固件 @@ -272,13 +272,13 @@ Open vSwitch网桥,具有更便捷的自动化编排能力。搭建Open vSwitc 在AArch64架构下edk2的包名为edk2-aarch64 ```sh - # yum install -y edk2-aarch64 + yum install -y edk2-aarch64 ``` 在x86\_64架构下edk2的包名为edk2-ovmf ```sh - # yum install -y edk2-ovmf + yum install -y edk2-ovmf ``` 2. 查询edk软件是否安装成功,命令如下: @@ -331,13 +331,13 @@ openEuler虚拟化使用virsh管理虚拟机。如果希望在非root用户使 2. 将非root用户添加到libvirt用户组。 ```sh - # usermod -a -G libvirt userName + usermod -a -G libvirt userName ``` 3. 切换到非root用户。 ```sh - # su userName + su userName ``` 4. 配置非root用户的环境变量。使用vim打开~/.bashrc文件: @@ -360,6 +360,15 @@ openEuler虚拟化使用virsh管理虚拟机。如果希望在非root用户使 5. 在虚拟机XML配置文件中的domain根元素中添加如下内容,使qemu-kvm进程可以访问磁盘镜像文件。 + 执行如下命令查询当前运行的虚拟机 + ```sh + virsh list + ``` + 使用如下命令编辑对应虚拟机的XML,name参数为上一步查询到的需要修改的目标虚拟机名称 + ```sh + virsh edit + ``` + 添加如下配置,编辑后保存即可生效 ```xml ``` diff --git "a/docs/zh/docs/Virtualization/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" "b/docs/zh/docs/Virtualization/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" new file mode 100644 index 0000000000000000000000000000000000000000..a7901d1fdea9dcec650912f44c4f8d08569f0d72 --- /dev/null +++ "b/docs/zh/docs/Virtualization/\345\270\270\350\247\201\351\227\256\351\242\230\344\270\216\350\247\243\345\206\263\346\226\271\346\263\225.md" @@ -0,0 +1,17 @@ +# 常见问题与解决方法 + +## **问题1:使用libcareplus工具制作的qemu热补丁无法加载** + +原因:qemu版本和热补丁版本不一致。 + +解决方法:下载qemu对应版本的源码,同时需保持制作热补丁的环境和制作qemu包环境一致,buildID可作为二者是否一致的判定标准。因用户无qemu版本的制作环境,故可以 **自行编包并安装** ,使用自编包中的/usr/libexec/qemu-kvm的buildID。 + +## **问题2:使用libcareplus工具制作的热补丁已加载但未生效** + 
+原因:不支持死循环、不退出、递归的函数,不支持对初始化函数、inline 函数以及小于5字节的短函数。 + +解决方法:查看补丁所在函数是否在约束限制中。 + +## **问题3:使用kvmtop工具第一次显示的结果为间隔0.05秒的两次采样计算得到的结果,波动较大** + +此为开源top框架缺陷导致,暂无解决方案。 \ No newline at end of file diff --git "a/docs/zh/docs/Virtualization/\346\234\200\344\275\263\345\256\236\350\267\265.md" "b/docs/zh/docs/Virtualization/\346\234\200\344\275\263\345\256\236\350\267\265.md" index d4603782079867966f398f47415110e222681f7b..0f321d6f77a9f6585acf3893e21d09dc1725bcae 100644 --- "a/docs/zh/docs/Virtualization/\346\234\200\344\275\263\345\256\236\350\267\265.md" +++ "b/docs/zh/docs/Virtualization/\346\234\200\344\275\263\345\256\236\350\267\265.md" @@ -585,20 +585,17 @@ swtpm提供了一个可集成到虚拟化环境中的TPM仿真器(TPM1.2和TPM ```Conf ... - - ... + + ... - ... + ... ... ``` - >![](public_sys-resources/icon-note.gif) **说明:** - >目前,openEuler 20.09 版本 AArch64 架构上的虚拟机可信启动不支持 ACPI 特性,所以虚拟机请勿配置 ACPI 特性,否则启动虚拟机后无法识别 vTPM 设备。AArch64 架构在openEuler 22.03 LTS 之前的版本,tpm model 配置为 \。 - 2. 创建虚拟机。 ```Shell diff --git "a/docs/zh/docs/Virtualization/\347\256\241\347\220\206\347\263\273\347\273\237\350\265\204\346\272\220.md" "b/docs/zh/docs/Virtualization/\347\256\241\347\220\206\347\263\273\347\273\237\350\265\204\346\272\220.md" index 1021dd4c648947e2796d2bbd61d3ea9e59c9a52a..7128d26331ac8562ccf1307e668c504ca081d2d0 100644 --- "a/docs/zh/docs/Virtualization/\347\256\241\347\220\206\347\263\273\347\273\237\350\265\204\346\272\220.md" +++ "b/docs/zh/docs/Virtualization/\347\256\241\347\220\206\347\263\273\347\273\237\350\265\204\346\272\220.md" @@ -196,36 +196,40 @@ QEMU主进程绑定特性是将QEMU主进程绑定到特定的物理CPU范围内 以上命令把虚拟机_open__Euler__VM_的vCPU0绑定到物理CPU0、2、3上,即限制了vCPU0只在这三个物理CPU上调度。**这一绑定关系的调整不会立即生效,在虚拟机下一次启动后才生效,并持久生效**。 -### CPU热插 +### CPU热插拔 #### 概述 -在线增加(热插)虚拟机CPU是指在虚拟机处于运行状态下,为虚拟机热插CPU而不影响虚拟机正常运行的方案。当虚拟机内部业务压力不断增大,会出现所有CPU均处于较高负载的情形。为了不影响虚拟机内的正常业务运行,可以使用CPU热插功能(在不关闭虚拟机情况下增加虚拟机的CPU数目),提升虚拟机的计算能力。 
+在线增加或减少(热插拔)虚拟机CPU是指在虚拟机处于运行状态下,为虚拟机热插拔CPU而不影响虚拟机正常运行的方案。当虚拟机内部业务压力不断增大,会出现所有CPU均处于较高负载的情形。为了不影响虚拟机内的正常业务运行,可以使用CPU热插功能(在不关闭虚拟机情况下增加虚拟机的CPU数目),提升虚拟机的计算能力。当虚拟机内部业务压力下降时,可以使用CPU热拔功能(在不关闭虚拟机情况下减少虚拟机的CPU数目)去除多余的计算能力,降低业务成本。 + +注意:从2403版本开始,AArch64架构新增了CPU热拔功能,但实现上采用了新的主线社区方案,和之前版本的CPU热插协议不兼容。Guest版本和Host版本需匹配,即2403及将来版本的Guest需搭配2403及将来版本的Host,2403之前版本的Guest需搭配2403之前版本的Host,才能正常使用CPU热插(拔)功能。 #### 约束限制 -- 如果处理器为AArch64架构,创建虚拟机时指定的虚拟机芯片组类型\(machine\)需为virt-4.1或virt更高版本。如果处理器为x86\_64架构,创建虚拟机时指定的虚拟机芯片组类型\(machine\)需为pc-i440fx-1.5或pc更高版本。 -- 在配置Guest NUMA的场景中,必须把属于同一个socket的vcpu配置在同一vNode中,否则热插CPU后可能导致虚拟机softlockup,进而可能导致虚拟机panic。 -- 虚拟机在迁移、休眠唤醒、快照过程中均不支持CPU热插。 +- 如果处理器为AArch64架构,创建虚拟机时指定的虚拟机芯片组类型\(machine\)需为virt-4.2或virt更高版本。如果处理器为x86\_64架构,创建虚拟机时指定的虚拟机芯片组类型\(machine\)需为pc-i440fx-1.5或pc更高版本。 +- 对于AArch64架构虚拟机,初始启动时就存在的CPU不支持热插拔。 +- 在配置Guest NUMA的场景中,必须把属于同一个socket的vcpu配置在同一vNode中,否则热插拔CPU后可能导致虚拟机softlockup,进而可能导致虚拟机panic。 +- 虚拟机在迁移、休眠唤醒、快照过程中均不支持CPU热插拔。 - 虚拟机CPU热插是否自动上线取决于虚拟机操作系统自身逻辑,虚拟化层不保证热插CPU自动上线。 - CPU热插同时受限于Hypervisor和GuestOS支持的最大CPU数目。 - 虚拟机启动、关闭、重启过程中可能出现热插CPU失效的情况,但再次重启会生效。 -- 热插虚拟机CPU的时候,如果新增CPU数目不是虚拟机CPU拓扑配置项中Cores的整数倍,可能会导致虚拟机内部看到的CPU拓扑是混乱的,建议每次新增的CPU数目为Cores的整数倍。 -- 若需要热插CPU在线生效且在虚拟机重启后仍有效,virsh setvcpus接口中需要同时传入--config和--live选项, 将热插CPU动作持久化。 +- 虚拟机启动、关闭、重启过程中可能出现热拔CPU超时失败的情况,需等虚拟机回到正常运行状态重试。 +- 热插拔虚拟机CPU的时候,如果新增CPU数目不是虚拟机CPU拓扑配置项中Cores的整数倍,可能会导致虚拟机内部看到的CPU拓扑是混乱的,建议每次新增或减少的CPU数目为Cores的整数倍。 +- 若需要热插拔CPU在线生效且在虚拟机重启后仍有效,virsh setvcpus接口中需要同时传入--config和--live选项, 将热插拔CPU动作持久化。 #### 操作步骤 **一、配置虚拟机XML** -1. 使用CPU热插功能,需要在创建虚拟机时配置虚拟机当前的CPU数目、虚拟机所支持的最大CPU数目,以及虚拟机芯片组类型(对于AArch64架构,需为virt-4.1及以上版本。对于x86\_64架构,需为pc-i440fx-1.5及以上版本)。这里以AArch64架构虚拟机为例,配置模板如下: +1. 使用CPU热插拔功能,需要在创建虚拟机时配置虚拟机当前的CPU数目、虚拟机所支持的最大CPU数目,以及虚拟机芯片组类型(对于AArch64架构,需为virt-4.2及以上版本。对于x86\_64架构,需为pc-i440fx-1.5及以上版本)。这里以AArch64架构虚拟机为例,配置模板如下: ``` ... n - hvm + hvm ... 
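虚拟机 XML 中与 CPU 热插拔相关的关键配置可示意如下(一个最小示意片段,基于 libvirt 通用的 domain XML 语法;其中 CPU 数目、架构取值仅为示例):当前 CPU 数目由 vcpu 元素的 current 属性给出,最大 CPU 数目为 vcpu 元素的值,芯片组类型通过 os/type 的 machine 属性指定。

```xml
<domain type='kvm'>
  <!-- current 为虚拟机启动时的 CPU 数目,元素值 8 为支持的最大 CPU 数目(示例值) -->
  <vcpu placement='static' current='4'>8</vcpu>
  <os>
    <!-- AArch64 架构下 machine 需为 virt-4.2 及以上版本 -->
    <type arch='aarch64' machine='virt-4.2'>hvm</type>
  </os>
  <!-- 其余设备配置省略 -->
</domain>
```

按此配置创建虚拟机后,即可使用 virsh setvcpus 在线调整 CPU 数目。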
@@ -242,7 +246,7 @@ QEMU主进程绑定特性是将QEMU主进程绑定到特定的物理CPU范围内 …… 64 - hvm + hvm …… ``` @@ -283,6 +287,24 @@ QEMU主进程绑定特性是将QEMU主进程绑定到特定的物理CPU范围内 >- --live: 选项,选填。在线生效。 +**三、热拔CPU** + +利用virsh工具进行虚拟机CPU热拔操作。例如给虚拟机openEulerVM热拔CPU到4,参考命令如下: + + ``` + virsh setvcpus openEulerVM 4 --live + ``` + + >![](./public_sys-resources/icon-note.gif) **说明:** + >virsh setvcpus 进行虚拟机CPU热拔操作的格式如下: + >``` + >virsh setvcpus [--config] [--live] + >``` + >- domain: 参数,必填。指定虚拟机名称。 + >- count: 参数,必填。指定目标CPU数目,即热拔后虚拟机CPU数目。 + >- --config: 选项,选填。虚拟机下次启动时仍有效。 + >- --live: 选项,选填。在线生效。 + ## 管理虚拟内存 ### NUMA简介 diff --git "a/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\231\232\346\213\237\346\234\272\345\217\257\347\273\264\346\212\244\346\200\247.md" "b/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\231\232\346\213\237\346\234\272\345\217\257\347\273\264\346\212\244\346\200\247.md" index 5b90254e7e63a43aab3a553a2adacc879912463e..f6fcdbec031cd12c22f144e5cb37620207b1d844 100644 --- "a/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\231\232\346\213\237\346\234\272\345\217\257\347\273\264\346\212\244\346\200\247.md" +++ "b/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\231\232\346\213\237\346\234\272\345\217\257\347\273\264\346\212\244\346\200\247.md" @@ -15,9 +15,6 @@ NMI Watchdog是一种用来检测Linux出现hardlockup(硬死锁)的机制 ### 操作步骤 针对ARM架构虚拟机配置NMI Watchdog的操作步骤如下: -1. 在虚拟机的引导配置文件grub.cfg中添加如下参数:nmi_watchdog=1 pmu_nmi_enable hardlockup_cpu_freq=auto irqchip.gicv3_pseudo_nmi=1 disable_sdei_nmi_watchdog -2. 检查虚拟机内部PMU Watchdog是否加载成功,如果加载成功,内核dmesg日志打印类似如下内容 - - ``` - [2.1173222] NMI watchdog: CPU0 freq probed as 2399999942 HZ. - ``` \ No newline at end of file + +1. 在虚拟机的引导配置文件grub.cfg中添加如下参数:`nmi_watchdog=1 pmu_nmi_enable hardlockup_cpu_freq=auto irqchip.gicv3_pseudo_nmi=1 disable_sdei_nmi_watchdog hardlockup_enable=1` +2. 
检查虚拟机内部PMU Watchdog是否加载成功,如果加载成功,内核dmesg日志打印类似如下内容:`[2.1173222] NMI watchdog: CPU0 freq probed as 2399999942 HZ.` diff --git "a/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\256\276\345\244\207.md" "b/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\256\276\345\244\207.md" index c357f0943c245befeb58da3f9cb90d8c2fc8ddaa..f24962aca712e935259b892336dd9827de46df90 100644 --- "a/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\256\276\345\244\207.md" +++ "b/docs/zh/docs/Virtualization/\347\256\241\347\220\206\350\256\276\345\244\207.md" @@ -464,6 +464,153 @@ SR-IOV(Single Root I/O Virtualizaiton)是一种基于硬件的虚拟化解 >echo $VFNUMS > /sys/class/uacce/hisi_hpre-$hpre_num/device/sriov_numvfs >``` +### vDPA直通 + +#### 概述 + +vDPA直通是将host上的一个设备对接到vDPA框架,通过vhost-vdpa驱动对外呈现字符设备,并将该字符设备配置给虚拟机,供虚拟机使用的一种方式。vDPA直通的磁盘支持作为虚拟机的系统盘或数据盘使用,并支持数据盘热扩容。 + +vDPA直通提供了与VFIO直通持平的IO性能,同时提供了virtio设备的灵活性,可以支持vDPA直通设备热迁移。 + +配合SR-IOV方案,vDPA直通可以实现一个物理网卡(PF)虚拟成多个VF网卡,再将VF网卡对接到vDPA框架后,提供给虚拟机使用。 + +#### 操作方法 + +请使用root用户按照如下操作步骤配置vDPA设备直通 + +1. 创建及配置VF设备,详细流程参考SR-IOV直通中的第1-3步,以下述virtio-net设备为例(08:00.6和08:00.7为PF,其余为创建的VF): + + ```shell + # lspci | grep -i Eth | grep Virtio + 08:00.6 Ethernet controller: Virtio: Virtio network device + 08:00.7 Ethernet controller: Virtio: Virtio network device + 08:01.1 Ethernet controller: Virtio: Virtio network device + 08:01.2 Ethernet controller: Virtio: Virtio network device + 08:01.3 Ethernet controller: Virtio: Virtio network device + 08:01.4 Ethernet controller: Virtio: Virtio network device + 08:01.5 Ethernet controller: Virtio: Virtio network device + 08:01.6 Ethernet controller: Virtio: Virtio network device + 08:01.7 Ethernet controller: Virtio: Virtio network device + 08:02.0 Ethernet controller: Virtio: Virtio network device + 08:02.1 Ethernet controller: Virtio: Virtio network device + 08:02.2 Ethernet controller: Virtio: Virtio network device + ``` + +2. 
解绑VF驱动,并绑定对应硬件的厂商vdpa驱动 + + ```shell + echo 0000:08:01.1 > /sys/bus/pci/devices/0000\:08\:01.1/driver/unbind + echo 0000:08:01.2 > /sys/bus/pci/devices/0000\:08\:01.2/driver/unbind + echo 0000:08:01.3 > /sys/bus/pci/devices/0000\:08\:01.3/driver/unbind + echo 0000:08:01.4 > /sys/bus/pci/devices/0000\:08\:01.4/driver/unbind + echo 0000:08:01.5 > /sys/bus/pci/devices/0000\:08\:01.5/driver/unbind + echo -n "1af4 1000" > /sys/bus/pci/drivers/vender_vdpa/new_id + ``` + +3. 绑定vDPA设备后,可以通过vdpa命令查询vdpa管理设备列表 + + ```shell + # vdpa mgmtdev show + pci/0000:08:01.1: + supported_classes net + pci/0000:08:01.2: + supported_classes net + pci/0000:08:01.3: + supported_classes net + pci/0000:08:01.4: + supported_classes net + pci/0000:08:01.5: + supported_classes net + ``` + +4. 完成vdpa设备的创建后,创建vhost-vDPA设备 + + ```shell + vdpa dev add name vdpa0 mgmtdev pci/0000:08:01.1 + vdpa dev add name vdpa1 mgmtdev pci/0000:08:01.2 + vdpa dev add name vdpa2 mgmtdev pci/0000:08:01.3 + vdpa dev add name vdpa3 mgmtdev pci/0000:08:01.4 + vdpa dev add name vdpa4 mgmtdev pci/0000:08:01.5 + ``` + +5. 完成vhost-vDPA的设备创建后,可以通过vdpa命令查询vdpa设备列表;也可以通过libvirt命令查询环境的vhost-vDPA设备信息 + + ```shell + # vdpa dev show + vdpa0: type network mgmtdev pci/0000:08:01.1 vendor_id 6900 max_vqs 3 max_vq_size 256 + vdpa1: type network mgmtdev pci/0000:08:01.2 vendor_id 6900 max_vqs 3 max_vq_size 256 + vdpa2: type network mgmtdev pci/0000:08:01.3 vendor_id 6900 max_vqs 3 max_vq_size 256 + vdpa3: type network mgmtdev pci/0000:08:01.4 vendor_id 6900 max_vqs 3 max_vq_size 256 + vdpa4: type network mgmtdev pci/0000:08:01.5 vendor_id 6900 max_vqs 3 max_vq_size 256 + + # virsh nodedev-list vdpa + vdpa_vdpa0 + vdpa_vdpa1 + vdpa_vdpa2 + vdpa_vdpa3 + vdpa_vdpa4 + + # virsh nodedev-dumpxml vdpa_vdpa0 + + vdpa_vdpa0 + /sys/devices/pci0000:00/0000:00:0c.0/0000:08:01.1/vdpa0 + pci_0000_08_01_1 + + vhost_vdpa + + + /dev/vhost-vdpa-0 + + + ``` + +6. 
挂载vDPA设备到虚拟机中 + + 创建虚拟机时,在虚拟机配置文件中增加vDPA直通设备的配置项 + + ```xml + + + + + + ``` + + **表 4** vDPA配置选项说明 + + + + + + + + + + + + + + + + +
+    | 参数名 | 说明 | 取值 |
+    | --- | --- | --- |
+    | hostdev.source.dev | host上vhost-vdpa字符设备的路径。 | /dev/vhost-vdpa-x |
+ + >![](./public_sys-resources/icon-note.gif) **说明:** + >根据各硬件厂商的设计不同,创建/配置VF、绑定厂商vdpa驱动等流程如有差异,请以各厂商流程为准。 + + ## 管理虚拟机USB 为了方便在虚拟机内部使用USBkey设备、USB海量存储设备等USB设备,openEuler提供了USB设备直通的功能。用户可以通过USB直通和热插拔相关接口给虚拟机配置直通USB设备、或者在虚拟机处于运行的状态下热插/热拔USB设备。 diff --git "a/docs/zh/docs/astream/\345\256\211\350\243\205\344\270\216\344\275\277\347\224\250\346\226\271\346\263\225.md" "b/docs/zh/docs/astream/\345\256\211\350\243\205\344\270\216\344\275\277\347\224\250\346\226\271\346\263\225.md" index cd8c47239f6831cc9e900de423b27882d7a90f30..b2d41b6f4a9b0226440cc03459766a80529e096a 100644 --- "a/docs/zh/docs/astream/\345\256\211\350\243\205\344\270\216\344\275\277\347\224\250\346\226\271\346\263\225.md" +++ "b/docs/zh/docs/astream/\345\256\211\350\243\205\344\270\216\344\275\277\347\224\250\346\226\271\346\263\225.md" @@ -6,9 +6,9 @@ astream是一款延长磁盘使用寿命的工具。它基于Linux提供的inoti ## 安装 -配置openEuler 22.09 LTS的yum源,直接使用yum命令安装 +配置 openEuler 的yum源,直接使用yum命令安装 -``` +```shell yum install astream ``` @@ -28,7 +28,7 @@ yum install astream 如下示例为一个具体的MySQL的流分配规则文件。 -``` +```text ^/data/mysql/data/ib_logfile 2 ^/data/mysql/data/ibdata1$ 3 ^/data/mysql/data/undo 4 @@ -51,25 +51,25 @@ yum install astream 假设规则文件`stream_rule1.txt`和`stream_rule2.txt` 位于`/home`下,则 -- 监控单目录 +- 监控单目录 ```shell astream -i /data/mysql/data -r /home/stream_rule1.txt ``` -- 监控多目录 +- 监控多目录 本工具支持同时监控多个目录,即每个监控目录都需要传入与之匹配的流分配规则文件。 - 如下示例同时监控两个目录: + 如下示例同时监控两个目录: ```shell astream -i /data/mysql-1/data /data/mysql-2/data -r /home/stream_rule1.txt /home/stream_rule2.txt ``` 上述命令中监控以下两个目录: - - 目录1`/data/mysql-1/data`,对应的流分配规则文件为`/home/stream_rule1.txt`。 - - 目录2`/data/mysql-2/data`,对应的流分配规则文件为`/home/stream_rule2.txt`。 + - 目录1`/data/mysql-1/data`,对应的流分配规则文件为`/home/stream_rule1.txt`。 + - 目录2`/data/mysql-2/data`,对应的流分配规则文件为`/home/stream_rule2.txt`。 ## 命令行参数说明 @@ -103,4 +103,4 @@ astream [options] - 在启动astream守护进程后,不能删除受监控的目录然后创建相同目录,需要重新启动astream; - 规则文件中支持正则匹配一组文件,用户需具备一定的正则匹配知识。 -- 测试时的NVMe SSD磁盘实现了基于NVMe 1.3协议描述的多流功能。 \ No newline at end 
of file +- 测试时的NVMe SSD磁盘实现了基于NVMe 1.3协议描述的多流功能。 diff --git "a/docs/zh/docs/desktop/HA\347\232\204\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/desktop/HA\347\232\204\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index f5447752941f61652f409023ee834395f7957903..970233bb035086ad440fc63d153d9ad6de6c0be7 100644 --- "a/docs/zh/docs/desktop/HA\347\232\204\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/desktop/HA\347\232\204\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -21,7 +21,7 @@ ## 安装与部署 -- 环境准备:需要至少两台安装了openEuler 20.03 LTS SP2的物理机/虚拟机(现以两台为例),安装方法参考《[安装指南](../Installation/installation.md)》。 +- 环境准备:需要至少两台安装了 openEuler 的物理机/虚拟机(现以两台为例),安装方法参考《[安装指南](../Installation/installation.md)》。 ### 修改主机名称及/etc/hosts文件 @@ -49,24 +49,24 @@ ```conf [OS] name=OS -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/ +baseurl=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/ enabled=1 gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/RPM-GPG-KEY-openEuler +gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler [everything] name=everything -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/everything/$basearch/ +baseurl=http://repo.openeuler.org/openEuler-{version}/everything/$basearch/ enabled=1 gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/everything/$basearch/RPM-GPG-KEY-openEuler +gpgkey=http://repo.openeuler.org/openEuler-{version}/everything/$basearch/RPM-GPG-KEY-openEuler [EPOL] name=EPOL -baseurl=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/EPOL/$basearch/ +baseurl=http://repo.openeuler.org/openEuler-{version}/EPOL/$basearch/ enabled=1 gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-20.03-LTS-SP2/OS/$basearch/RPM-GPG-KEY-openEuler +gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler ``` ### 
安装HA软件包组件 diff --git a/docs/zh/docs/desktop/installha.md b/docs/zh/docs/desktop/installha.md index 133acdbb5745b1f3ebe2971b2df8978e51599595..cbef222f555d92008e63143c02df39186f302f97 100644 --- a/docs/zh/docs/desktop/installha.md +++ b/docs/zh/docs/desktop/installha.md @@ -6,7 +6,7 @@ ### 环境准备 -需要至少两台安装了openEuler 21.03 的物理机/虚拟机(现以两台为例),安装方法参考《[安装指南](../Installation/installation.md)》。 +需要至少两台安装了openEuler 的物理机/虚拟机(现以两台为例),安装方法参考《[安装指南](../Installation/installation.md)》。 ### 修改主机名称及/etc/hosts文件 @@ -34,24 +34,24 @@ ```text [OS] name=OS -baseurl=http://repo.openeuler.org/openEuler-21.03/OS/$basearch/ +baseurl=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/ enabled=1 gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-21.03/OS/$basearch/RPM-GPG-KEY-openEuler +gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler [everything] name=everything -baseurl=http://repo.openeuler.org/openEuler-21.03/everything/$basearch/ +baseurl=http://repo.openeuler.org/openEuler-{version}/everything/$basearch/ enabled=1 gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-21.03/everything/$basearch/RPM-GPG-KEY-openEuler +gpgkey=http://repo.openeuler.org/openEuler-{version}/everything/$basearch/RPM-GPG-KEY-openEuler [EPOL] name=EPOL -baseurl=http://repo.openeuler.org/openEuler-21.03/EPOL/$basearch/ +baseurl=http://repo.openeuler.org/openEuler-{version}/EPOL/$basearch/ enabled=1 gpgcheck=1 -gpgkey=http://repo.openeuler.org/openEuler-21.03/OS/$basearch/RPM-GPG-KEY-openEuler +gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler ``` ### 安装HA软件包组件 diff --git "a/docs/zh/docs/desktop/kiran\345\256\211\350\243\205\346\211\213\345\206\214.md" "b/docs/zh/docs/desktop/kiran\345\256\211\350\243\205\346\211\213\345\206\214.md" index d3076e1d4304d4706a66b32d15c234fea280046a..e608e6e455367dcf845d073904fa0a98cab05d64 100644 --- "a/docs/zh/docs/desktop/kiran\345\256\211\350\243\205\346\211\213\345\206\214.md" +++ 
"b/docs/zh/docs/desktop/kiran\345\256\211\350\243\205\346\211\213\345\206\214.md" @@ -8,7 +8,7 @@ kiran 桌面是湖南麒麟信安团队以用户和市场需求为导向,研 安装时建议使用 root 用户或者新建一个管理员用户。 -1.下载 openEuler 23.09 镜像并安装系统。 +1.下载 openEuler 镜像并安装系统。 2.更新软件源: diff --git a/docs/zh/docs/desktop/kubesphere.md b/docs/zh/docs/desktop/kubesphere.md index a3f6dd7e7ffc4030b1f0a6f47185dc4b0835848b..f241fc254c1818e503db2a3cc5e36db6d060c623 100644 --- a/docs/zh/docs/desktop/kubesphere.md +++ b/docs/zh/docs/desktop/kubesphere.md @@ -1,6 +1,6 @@ # KubeSphere 安装指南 -本文介绍如何在 openEuler 21.09 上安装和部署 Kubernetes 和 KubeSphere 集群。 +本文介绍如何在 openEuler 上安装和部署 Kubernetes 和 KubeSphere 集群。 ### 什么是 KubeSphere @@ -8,7 +8,7 @@ ### 前提条件 -您需要准备一台安装了 openEuler 21.09 的物理机/虚拟机,安装方法参考《[安装指南](../Installation/installation.md)》。 +您需要准备一台安装了 openEuler 的物理机/虚拟机,安装方法参考《[安装指南](../Installation/installation.md)》。 ### 软件安装 diff --git a/docs/zh/docs/oeAware/figures/dep.png b/docs/zh/docs/oeAware/figures/dep.png new file mode 100644 index 0000000000000000000000000000000000000000..493f118a9a822fa16f8c8375ba9261c1e10ac935 Binary files /dev/null and b/docs/zh/docs/oeAware/figures/dep.png differ diff --git "a/docs/zh/docs/oeAware/oeAware\347\224\250\346\210\267\346\214\207\345\215\227.md" "b/docs/zh/docs/oeAware/oeAware\347\224\250\346\210\267\346\214\207\345\215\227.md" new file mode 100644 index 0000000000000000000000000000000000000000..910a6c2e72a9abebabeebbde84f6123736862e42 --- /dev/null +++ "b/docs/zh/docs/oeAware/oeAware\347\224\250\346\210\267\346\214\207\345\215\227.md" @@ -0,0 +1,534 @@ +# oeAware用户指南 + +## 简介 + +oeAware是在openEuler上实现低负载采集感知调优的框架,目标是动态感知系统行为后智能使能系统的调优特性。传统调优特性都以独立运行且静态打开关闭为主,oeAware将调优拆分采集、感知和调优三层,每层通过订阅方式关联,各层采用插件式开发尽可能复用。 + +## 安装 + +配置openEuler的yum源,使用yum命令安装。在openEuler-22.03-LTS-SP4版本中会默认安装。 + +```shell +yum install oeAware-manager +``` + +## 使用方法 + +首先启动oeaware服务,然后通过`oeawarectl`命令进行使用。 + +### 服务启动 + +通过systemd服务启动。安装完成后会默认启动。 + +```shell +systemctl start oeaware +``` + +配置文件 + +配置文件路径:/etc/oeAware/config.yaml + +```yaml 
+log_path: /var/log/oeAware #日志存储路径 +log_level: 1 #日志等级 1:DEBUG 2:NFO 3:WARN 4:ERROR +enable_list: #默认使能插件 + - name: libtest.so #只配置插件,使能本插件的所有实例 + - name: libtest1.so #配置插件实例,使能配置的插件实例 + instances: + - instance1 + - instance2 + ... + ... +plugin_list: #可支持下载的包 + - name: test #名称需要唯一,如果重复取第一个配置 + description: hello world + url: https://gitee.com/openeuler/oeAware-manager/raw/master/README.md #url非空 + ... +``` + +修改配置文件后,通过以下命令重启服务。 + +```shell +systemctl restart oeaware +``` + +### 插件说明 + +**插件定义**:一个插件对应一个.so文件,插件分为采集插件、感知插件和调优插件。 + +**实例定义**:服务中的调度单位是实例,一个插件中包括多个实例。例如,一个采集插件包括多个采集项,每个采集项是一个实例。 + + +### 插件加载 + +服务会默认加载插件存储路径下的插件。 + +插件路径:/usr/lib64/oeAware-plugin/ + + +另外也可以通过手动加载的方式加载插件。 + +```shell +oeawarectl -l | --load <插件名> +``` + +示例 + +```shell +[root@localhost ~]# oeawarectl -l libthread_collect.so +Plugin loaded successfully. +``` + +失败返回错误说明。 + +### 插件卸载 + +```shell +oeawarectl -r <插件名> | --remove <插件名> +``` + +示例 + +```shell +[root@localhost ~]# oeawarectl -r libthread_collect.so +Plugin remove successfully. +``` + +失败返回错误说明。 + +### 插件查询 + +#### 查询插件状态信息 + +```shell +oeawarectl -q #查询系统中已经加载的所有插件 +oeawarectl --query <插件名> #查询指定插件 +``` + +示例 + +```shell +[root@localhost ~]# oeawarectl -q +Show plugins and instances status. 
+------------------------------------------------------------ +libsystem_tune.so + stealtask_tune(available, close, count: 0) + smc_tune(available, close, count: 0) + xcall_tune(available, close, count: 0) + seep_tune(available, close, count: 0) +libpmu.so + pmu_counting_collector(available, close, count: 0) + pmu_sampling_collector(available, close, count: 0) + pmu_spe_collector(available, close, count: 0) + pmu_uncore_collector(available, close, count: 0) +libdocker_tune.so + docker_cpu_burst(available, close, count: 0) +libthread_scenario.so + thread_scenario(available, close, count: 0) +libsystem_collector.so + thread_collector(available, close, count: 0) + kernel_config(available, close, count: 0) + command_collector(available, close, count: 0) +libdocker_collector.so + docker_collector(available, close, count: 0) +libub_tune.so + unixbench_tune(available, close, count: 0) +libanalysis_oeaware.so + analysis_aware(available, close, count: 0) +------------------------------------------------------------ +format: +[plugin] + [instance]([dependency status], [running status], [enable cnt]) +dependency status: available means satisfying dependency, otherwise unavailable. +running status: running means that instance is running, otherwise close. +enable cnt: number of instances enabled. 
+``` + +失败返回错误说明。 + +#### 查询运行实例订阅关系 + +```shell +oeawarectl -Q #查询所有运行实例的订阅关系图 +oeawarectl --query-dep= <插件实例> #查询运行实例订阅关系图 +``` + +在当前目录下生成dep.png,显示订阅关系。 + +实例未运行,不会显示订阅关系。 + +示例 + +```sh +oeawarectl -e thread_scenario +oeawarectl -Q +``` +![img](./figures/dep.png) + + +### 插件实例使能 + +#### 使能插件实例 + +```shell +oeawarectl -e | --enable <插件实例> +``` + +使能某个插件实例,会将其订阅的topic实例一起使能。 + +失败返回错误说明。 + +推荐使能插件列表: +- libsystem_tune.so:stealtask_tune,smc_tune,xcall_tune,seep_tune。 +- libub_tune.so:unixbench_tune。 +- libtune_numa.so:tune_numa_mem_access。 + +其他插件主要用来提供数据,可通过sdk获取插件数据。 +#### 关闭插件实例 + +```shell +oeawarectl -d | --disable <插件实例> +``` +关闭某个插件实例,会将其订阅的topic实例一起关闭。 + +失败返回错误说明。 + +### 插件下载安装 + +通过`--list`命令查询支持下载的rpm包和已安装的插件。 + +```shell +oeawarectl --list +``` + +查询结果如下。 + +```shell +Supported Packages: #可下载的包 +[name1] #config中配置的plugin_list +[name2] +... +Installed Plugins: #已安装的插件 +[name1] +[name2] +... +``` + +通过`--install`命令下载安装rpm包。 + +```shell +oeawarectl -i | --install #指定--list下查询得到的包名称(Supported Packages下的包) +``` + +失败返回错误说明。 +### 帮助 +通过`--help`查看帮助。 +```shell +usage: oeawarectl [options]... + options + -l|--load [plugin] load plugin. + -r|--remove [plugin] remove plugin from system. + -e|--enable [instance] enable the plugin instance. + -d|--disable [instance] disable the plugin instance. + -q query all plugins information. + --query [plugin] query the plugin information. + -Q query all instances dependencies. + --query-dep [instance] query the instance dependency. + --list the list of supported plugins. + -i|--install [plugin] install plugin from the list. + --help show this help message. 
+``` + +## 插件开发说明 + +### 基础数据结构 +```c++ +typedef struct { + char *instanceName; // 实例名称 + char *topicName; // 主题名称 + char *params; // 参数 +} CTopic; + +typedef struct { + CTopic topic; + unsigned long long len; // data数组的长度 + void **data; // 存储的数据 +} DataList; + +const int OK = 0; +const int FAILED = -1; + +typedef struct { + int code; // 成功返回OK,失败返回FAILED + char *payload; // 附带信息 +} Result; + +``` + +### 实例基类 + +```c++ +namespace oeaware { +// Instance type. +const int TUNE = 0b10000; +const int SCENARIO = 0b01000; +const int RUN_ONCE = 0b00010; +class Interface { +public: + virtual Result OpenTopic(const Topic &topic) = 0; + virtual void CloseTopic(const Topic &topic) = 0; + virtual void UpdateData(const DataList &dataList) = 0; + virtual Result Enable(const std::string ¶m = "") = 0; + virtual void Disable() = 0; + virtual void Run() = 0; +protected: + std::string name; + std::string version; + std::string description; + std::vector supportTopics; + int priority; + int type; + int period; +} +} +``` +实例开发继承实例基类,实现6个虚函数,并对类的7个属性赋值。 + +实例采用订阅发布模式,通过Subscribe获取数据,通过Publish接口发布数据。 + +### 属性说明 +| 属性 | 类型 | 说明 | +| --- | --- | --- | +| name | string | 实例名称 | +| version | string | 实例版本(预留) | +| description | string | 实例描述 | +| supportTopics | vector | 支持的topic | +| priority | int | 实例执行的优先级 (调优 > 感知 > 采集)| +| type | int | 实例类型,通过比特位标识,第二位表示单次执行实例,第三位表示采集实例,第四位表示感知实例,第5位表示调优实例| +| period | int | 实例执行周期,单位ms,period为10的倍数 | + +### 接口说明 +| 函数名 | 参数 | 返回值 | 说明 | +| --- | --- | --- | --- | +|Result OpenTopic(const Topic &topic) | topic:打开的主题 | | 打开对应的topic | +| void CloseTopic(const Topic &topic) | topic:关闭的主题| |关闭对应的topic | +| void UpdateData(const DataList &dataList) | dataList:订阅的数据 | | 当订阅topic时,被订阅的topic每周期会通过UpdateData更新数据 | +| Result Enable(const std::string ¶m = "") | param:预留 | | 使能本实例 | +| void Disable() | | | 关闭本实例 | +| void Run() | | | 每周期会执行run函数 | +### 实例示例 +```C++ +#include +#include + +class Test : public oeaware::Interface { +public: + Test() { + name = 
"TestA"; + version = "1.0"; + description = "this is a test plugin"; + supportTopics; + priority = 0; + type = 0; + period = 20; + } + oeaware::Result OpenTopic(const oeaware::Topic &topic) override { + return oeaware::Result(OK); + } + void CloseTopic(const oeaware::Topic &topic) override { + + } + void UpdateData(const DataList &dataList) override { + for (int i = 0; i < dataList.len; ++i) { + ThreadInfo *info = static_cast(dataList.data[i]); + INFO(logger, "pid: " << info->pid << ", name: " << info->name); + } + } + oeaware::Result Enable(const std::string ¶m = "") override { + Subscribe(oeaware::Topic{"thread_collector", "thread_collector", ""}); + return oeaware::Result(OK); + } + void Disable() override { + + } + void Run() override { + DataList dataList; + oeaware::SetDataListTopic(&dataList, "test", "test", ""); + dataList.len = 1; + dataList.data = new void* [1]; + dataList.data[0] = &pubData; + Publish(dataList); + } +private: + int pubData = 1; +}; + +extern "C" void GetInstance(std::vector> &interfaces) +{ + interfaces.emplace_back(std::make_shared()); +} +``` +## 内部插件 +### libpmu.so + +| 实例名称 | 架构 | 说明 | topic | +| --- | --- | --- | --- | +| pmu_counting_collector | aarch64 | 采集count相关事件 |cycles,net:netif_rx,L1-dcache-load-misses,L1-dcache-loads,L1-icache-load-misses,L1-icache-loads,branch-load-misses,branch-loads,dTLB-load-misses,dTLB-loads,iTLB-load-misses,iTLB-loads,cache-references,cache-misses,l2d_tlb_refill,l2d_cache_refill,l1d_tlb_refill,l1d_cache_refill,inst_retired,instructions | +| pmu_sampling_collector | aarch64 | 采集sample相关事件 | cycles, skb:skb_copy_datagram_iovec,net:napi_gro_receive_entry | +| pmu_spe_collector | aarch64 | 采集spe事件 | spe | +| pmu_uncore_collector | aarch64 | 采集uncore事件 | uncore | +#### 限制条件 +采集spe事件需要依赖硬件能力,此插件运行依赖 BIOS 的 SPE,运行前需要将 SPE 打开。 + +运行perf list | grep arm_spe查看是否已经开启SPE,如果开启,则有如下显示 +``` +arm_spe_0// [Kernel PMU event] +``` +如果没有开启,则按下述步骤开启。 + +检查BIOS配置项 MISC Config --> SPE 的状态, 如果状态为 Disable,则需要更改为 
Enable。如果找不到这个选项,可能是BIOS版本过低。 + +进入系统 vim /etc/grub2-efi.cfg,定位到内核版本对应的开机启动项,在末尾增加“kpti=off”。例如: +``` +linux /vmlinuz-4.19.90-2003.4.0.0036.oe1.aarch64 root=/dev/mapper/openeuler-root ro rd.lvm.lv=openeuler/root rd.lvm.lv=openeuler/swap video=VGA-1:640x480-32@60me rhgb quiet smmu.bypassdev=0x1000:0x17 smmu.bypassdev=0x1000:0x15 crashkernel=1024M,high video=efifb:off video=VGA-1:640x480-32@60me kpti=off +``` +按“ESC”,输入“:wq”,按“Enter”保存并退出。执行reboot命令重启服务器。 +### libsystem_collector.so +系统信息采集插件。 +| 实例名称 | 架构 | 说明 | topic | +| --- | --- | --- | --- | +| thread_collector | aarch64/x86 | 采集系统中的线程信息 | thread_collector | +| kernel_config | aarch64/x86| 采集内核相关参数,包括sysctl所有参数、lscpu、meminfo等 | get_kernel_config,get_cmd,set_kernel_config | +| command_collector | aarch64/x86 | 采集sysstat相关数据 | mpstat,iostat,vmstat,sar,pidstat | + +### libdocker_collector.so +docker信息采集插件。 +| 实例名称 | 架构 | 说明 | topic | +| --- | --- | --- | --- | +| docker_collector | aarch64/x86 | 采集docker相关信息 | docker_collector | +### libthread_scenario.so +线程感知插件。 +| 实例名称 | 架构 | 说明 | 订阅 | +| --- | --- | --- | --- | +| thread_scenario | aarch64/x86 | 通过配置文件获取对应线程信息 | thread_collector::thread_collector | +#### 配置文件 +thread_scenario.conf +``` +redis +fstime +fsbuffer +fsdisk +``` +### libanalysis_oeaware.so +| 实例名称 | 架构 | 说明 | 订阅 | +| --- | --- | --- | --- | +| analysis_aware | 分析当前环境的业务特征,并给出优化建议 | aarch64 | pmu_spe_collector::spe, pmu_counting_collector::net:netif_rx, pmu_sampling_collector::cycles, pmu_sampling_collector::skb:skb_copy_datagram_iovec, pmu_sampling_collector::net:napi_gro_receive_entry | +### libsystem_tune.so +系统调优插件。 +| 实例名称 | 架构 | 说明 | 订阅 | +| --- | --- | --- | --- | +| stealtask_tune | aarch64 | 高负载场景下,通过轻量级搜索算法,实现多核间快速负载均衡,最大化cpu资源利用率 | 无 | +| smc_tune | aarch64 | 使能smc加速,对使用tcp协议的连接无感加速 | 无 | +| xcall_tune | aarch64 | 通过减少系统调用底噪,提升系统性能 | thread_collector::thread_collector | +| seep_tune | aarch64 | 使能智能功耗模式,降低系统能耗 | 无 | +#### 配置文件 +xcall.yaml +``` yaml +redis: # 线程名称 + - xcall_1: 1 
#xcall_1表示xcall优化方式,目前只有xcall_1; 1表示需要优化系统调用号 +mysql: + - xcall_1: 1 +node: + - xcall_1: 1 +``` +#### 限制条件 + +xcall_tune依赖内核特性,需要开启FAST_SYSCALL编译内核,并且增加在cmdline里增加xcall字段。 + +### libub_tune.so +unixbench调优插件。 +| 实例名称 | 架构 | 说明 | 订阅 | +| --- | --- | --- | --- | +| unixbench_tune | aarch64/x86 | 通过减少远端内存访问,优化ub性能 | thread_collector::thread_collector | +### libdocker_tune.so + +| 实例名称 | 架构 | 说明 | 订阅 | +| --- | --- | --- | --- | +| docker_cpu_burst | aarch64 | 在出现突发负载时,CPUBurst可以为容器临时提供额外的CPU资源,缓解CPU限制带来的性能瓶颈 | pmu_counting_collector::cycles,docker_collector::docker_collector | +## 外部插件 +外部插件需要通过以下命令安装,例如安装numafast相关插件 +``` +oeawarectl -i numafast +``` +### libscenario_numa.so +| 实例名称 | 架构 | 说明 | 订阅 | topic | +| --- | --- | --- | --- | --- | +| scenario_numa | aarch64 | 感知当前环境跨NUMA访存比例,用于实例或sdk订阅(无法单独使能) | pmu_uncore_collector::uncore | system_score | +### libtune_numa.so +| 实例名称 | 架构 | 说明 | 订阅 | +| --- | --- | --- | --- | +| tune_numa_mem_access | aarch64 | 周期性迁移线程和内存,减少跨NUMA内存访问 | scenario_numa::system_score, pmu_spe_collector::spe, pmu_counting_collector::cycles | +## SDK使用说明 +```C +typedef int(*Callback)(const DataList *); +int OeInit(); // 初始化资源,与server建立链接 +int OeSubscribe(const CTopic *topic, Callback callback); // 订阅topic,异步执行callback +int OeUnsubscribe(const CTopic *topic); // 取消订阅topic +int OePublish(const DataList *dataList); // 发布数据到server +void OeClose(); // 释放资源 +``` +**示例** +```C +#include "oe_client.h" +#include "command_data.h" +int f(const DataList *dataList) +{ + int i = 0; + for (; i < dataList->len; i++) { + CommandData *data = (CommandData*)dataList->data[i]; + for (int j = 0; j < data->attrLen; ++j) { + printf("%s ", data->itemAttr[j]); + } + printf("\n"); + } + return 0; +} +int main() { + OeInit(); + CTopic topic = { + "command_collector", + "sar", + "-q 1", + }; + if (OeSubscribe(&topic, f) < 0) { + printf("failed\n"); + } else { + printf("success\n"); + } + sleep(10); + OeClose(); +} +``` +## 约束限制 + +### 功能约束 + 
+oeAware默认集成了arm的微架构采集libkperf模块,该模块同一时间只能有一个进程进行调用,如其他进程调用或者使用perf命令可能存在冲突。 + +### 操作约束 + +当前oeAware仅支持root组用户进行操作,sdk支持root组和oeaware组用户使用。 + +## 注意事项 + +oeAware的配置文件和插件用户组和权限有严格校验,不要对oeAware的相关文件进行权限和用户组进行修改。 + +权限说明: + +- 插件文件:440 + +- 客户端执行文件:750 + +- 服务端执行文件:750 + +- 服务配置文件:640 diff --git a/docs/zh/docs/oncn-bwm/overview.md b/docs/zh/docs/oncn-bwm/overview.md index 52d72ec7602e546f49b4c1e1cbd3d7d882aaf6d8..4f4b973a3187385fb9edf680cabf14e2268ede8d 100644 --- a/docs/zh/docs/oncn-bwm/overview.md +++ b/docs/zh/docs/oncn-bwm/overview.md @@ -2,99 +2,84 @@ ## 简介 -随着云计算、大数据、人工智能、5G、物联网等技术的迅速发展,数据中心的建设越来越重要。然而,数据中心的服务器资源利用率很低,造成了巨大的资源浪费。为了提高服务器资源利用率,oncn-bwm 应运而生。 +随着云计算、大数据、人工智能、5G、物联网等技术的迅速发展,数据中心的建设越来越重要。然而,数据中心的服务器资源利用率很低,造成了巨大的资源浪费。为了提高服务器资源利用率,oncn-bwm应运而生。 -oncn-bwm 是一款适用于离线业务混合部署场景的 Pod 带宽管理工具,它会根据 QoS 分级对节点内的网络资源进行合理调度,保障在线业务服务体验的同时,大幅提升节点整体的网络带宽利用率。 +oncn-bwm是一款适用于在、离线业务混合部署场景的Pod带宽管理工具,它会根据QoS分级对节点内的网络资源进行合理调度,保障在线业务服务体验的同时,大幅提升节点整体的网络带宽利用率。 -oncn-bwm 工具支持如下功能: +oncn-bwm工具支持如下功能: -- 使能/去除/查询 Pod 带宽管理 -- 设置 Pod 网络优先级 +- 使能/去除/查询Pod带宽管理 +- 设置Pod网络优先级 - 设置离线业务带宽范围和在线业务水线 - 内部统计信息查询 - - ## 安装 -安装 oncn-bwm 工具需要操作系统为 openEuler 22.09,在配置了 openEuler yum 源的机器直接使用 yum 命令安装,参考命令如下: - -```shell -# yum install oncn-bwm -``` - -此处介绍如何安装 oncn-bwm 工具。 - ### 环境要求 -* 操作系统:openEuler 22.09 - -### 安装步骤 +操作系统为openEuler-24.03-LTS,且配置了openEuler-24.03-LTS的yum源。 -安装 oncn-bwm 工具的操作步骤如下: +### 安装步骤 -1. 
配置openEuler的yum源,直接使用yum命令安装 +使用以下命令直接安装: - ``` - yum install oncn-bwm - ``` +```shell +yum install oncn-bwm +``` ## 使用方法 -oncn-bwm 工具提供了 `bwmcli` 命令行工具来使能 Pod 带宽管理或进行相关配置。`bwmcli` 命令的整体格式如下: +oncn-bwm工具提供了`bwmcli`命令行工具来使能Pod带宽管理或进行相关配置。`bwmcli`命令的整体格式如下: **bwmcli** < option(s) > > 说明: > -> 使用 `bwmcli` 命令需要 root 权限。 +> 使用`bwmcli`命令需要root权限。 > -> 仅支持节点上出方向(报文从节点内发往其他节点)的 Pod 带宽管理。 +> 仅支持节点上出方向(报文从节点内发往其他节点)的Pod带宽管理。 > -> 已设置 tc qdisc 规则的网卡,不支持使能 Pod 带宽管理。 +> 已设置tc qdisc规则的网卡,不支持使能Pod带宽管理。 > -> 升级 oncn-bwm 包不会影响升级前的使能状态;卸载 oncn-bwm 包会关闭所有网卡的 Pod 带宽管理。 - +> 升级oncn-bwm包不会影响升级前的使能状态;卸载oncn-bwm包会关闭所有网卡的Pod带宽管理。 ### 命令接口 -#### Pod 带宽管理 +#### Pod带宽管理 -**命令和功能** +##### 命令和功能 | 命令格式 | 功能 | | --------------------------- | ------------------------------------------------------------ | -| **bwmcli –e** <网卡名称> | 使能指定网卡的 Pod 带宽管理。 | -| **bwmcli -d** <网卡名称> | 去除指定网卡的 Pod 带宽管理。 | -| **bwmcli -p devs** | 查询节点所有网卡的 Pod 带宽管理。 | +| **bwmcli -e** <网卡名称> | 使能指定网卡的Pod带宽管理。 | +| **bwmcli -d** <网卡名称> | 去除指定网卡的Pod带宽管理。 | +| **bwmcli -p devs** | 查询节点所有网卡的Pod带宽管理。 | > 说明: > > - 不指定网卡名时,上述命令会对节点上的所有的网卡生效。 > -> - 执行 `bwmcli` 其他命令前需要开启 Pod 带宽管理。 - +> - 执行 `bwmcli` 其他命令前需要开启Pod带宽管理。 +##### 使用示例 -**使用示例** - -- 使能网卡 eth0 和 eth1 的 Pod 带宽管理 +- 使能网卡eth0和eth1的Pod带宽管理 ```shell - # bwmcli –e eth0 –e eth1 + # bwmcli -e eth0 -e eth1 enable eth0 success enable eth1 success ``` -- 取消网卡 eth0 和 eth1 的 Pod 带宽管理 +- 取消网卡eth0和eth1的Pod带宽管理 ```shell - # bwmcli –d eth0 –d eth1 + # bwmcli -d eth0 -d eth1 disable eth0 success disable eth1 success ``` -- 查询节点所有网卡的 Pod 带宽管理 +- 查询节点所有网卡的Pod带宽管理 ```shell # bwmcli -p devs @@ -105,57 +90,53 @@ oncn-bwm 工具提供了 `bwmcli` 命令行工具来使能 Pod 带宽管理或 lo : disabled ``` -#### Pod 网络优先级 +#### Pod网络优先级 -**命令和功能** +##### 命令和功能 | 命令格式 | 功能 | | ------------------------------------------------------------ | ------------------------------------------------------------ | -| **bwmcli –s** *path* | 设置 Pod 网络优先级。其中 *path* 为 Pod 对应的 cgroup 路径, *prio* 为优先级。*path* 取相对路径或者绝对路径均可。 *prio* 缺省值为 0,可选值为 0 和 -1,0 
标识为在线业务,-1 标识为离线业务。 | -| **bwmcli –p** *path* | 查询 Pod 网络优先级。 | +| **bwmcli -s** *path* ** | 设置Pod网络优先级。其中*path*为Pod对应的cgroup路径,*prio*为优先级。*path*取相对路径或者绝对路径均可。 *prio*默认值为0,可选值为0和-1,0标识为在线业务,-1标识为离线业务。 | +| **bwmcli -p** *path* | 查询Pod网络优先级。 | > 说明: > -> 支持在线或离线两种网络优先级,oncn-bwm 工具会按照网络优先级实时控制 Pod 的带宽,具体策略为:对于在线类型的 Pod ,不会限制其带宽;对于离线类型的 Pod ,会将其带宽限制在离线带宽范围内。 +> 支持在线或离线两种网络优先级,oncn-bwm工具会按照网络优先级实时控制Pod的带宽,具体策略为:对于在线类型的Pod,不会限制其带宽;对于离线类型的Pod,会将其带宽限制在离线带宽范围内。 -**使用示例** +##### 使用示例 -- 设置 cgroup 路径为 /sys/fs/cgroup/net_cls/test_online 的 Pod 的优先级为 0 +- 设置cgroup路径为/sys/fs/cgroup/net_cls/test_online的Pod的优先级为0 ```shell # bwmcli -s /sys/fs/cgroup/net_cls/test_online 0 set prio success ``` -- 查询 cgroup 路径为 /sys/fs/cgroup/net_cls/test_online 的 Pod 的优先级 +- 查询cgroup路径为/sys/fs/cgroup/net_cls/test_online的Pod的优先级 ```shell # bwmcli -p /sys/fs/cgroup/net_cls/test_online 0 ``` - - #### 离线业务带宽范围 -| 命令格式 | 功能 | -| ---------------------------------- | ------------------------------------------------------------ | -| **bwmcli –s bandwidth** | 设置一个主机/虚拟机的离线带宽。其中 low 表示最低带宽,high 表示最高带宽,其单位可取值为 kb/mb/gb ,有效范围为 [1mb, 9999gb]。 | -| **bwmcli –p bandwidth** | 查询设置一个主机/虚拟机的离线带宽。 | +| 命令格式 | 功能 | +| ------------------------------------ | ------------------------------------------------------------ | +| **bwmcli -s bandwidth** ** | 设置一个主机/虚拟机的离线带宽。其中*low*表示最低带宽,*high*表示最高带宽,其单位可取值为kb/mb/gb,有效范围为[1mb, 9999gb]。| +| **bwmcli -p bandwidth** | 查询设置一个主机/虚拟机的离线带宽。 | -> 说明: -> -> - 一个主机上所有使能 Pod 带宽管理的网卡在实现内部被当成一个整体看待,也就是共享设置的在线业务水线和离线业务带宽范围。 +> 说明: > -> - 使用 `bwmcli` 设置 Pod 带宽对此节点上所有离线业务生效,所有离线业务的总带宽不能超过离线业务带宽范围。在线业务没有网络带宽限制。 +> - 一个主机上所有使能Pod带宽管理的网卡在实现内部被当成一个整体看待,也就是共享设置的在线业务水线和离线业务带宽范围。 > -> - 离线业务带宽范围与在线业务水线共同完成离线业务带宽限制,当在线业务带宽低于设置的水线时,离线业务允许使用设置的最高带宽;当在线业务带宽高于设置的水线时,离线业务允许使用设置的最低带宽。 - - +> - 使用 `bwmcli` 设置Pod带宽对此节点上所有离线业务生效,所有离线业务的总带宽不能超过离线业务带宽范围。在线业务没有网络带宽限制。 +> +> - 离线业务带宽范围与在线业务水线共同完成离线业务带宽限制,当在线业务带宽低于设置的水线时:离线业务允许使用设置的最高带宽;当在线业务带宽高于设置的水线时,离线业务允许使用设置的最低带宽。 -**使用示例** +##### 使用示例 -- 设置离线带宽范围在 30mb 到 100mb 
+- 设置离线带宽范围在30mb到100mb ```shell # bwmcli -s bandwidth 30mb,100mb @@ -169,26 +150,23 @@ oncn-bwm 工具提供了 `bwmcli` 命令行工具来使能 Pod 带宽管理或 bandwidth is 31457280(B),104857600(B) ``` - - - #### 在线业务水线 -**命令和功能** +##### 命令和功能 | 命令格式 | 功能 | | ---------------------------------------------- | ------------------------------------------------------------ | -| **bwmcli –s waterline** | 设置一个主机/虚拟机的在线业务水线,其中 *val* 为水线值,单位可取值为 kb/mb/gb ,有效范围为 [20mb, 9999gb]。 | -| **bwmcli –p waterline** | 查询一个主机/虚拟机的在线业务水线。 | +| **bwmcli -s waterline** ** | 设置一个主机/虚拟机的在线业务水线,其中*val*为水线值,单位可取值为kb/mb/gb ,有效范围为[20mb, 9999gb]。 | +| **bwmcli -p waterline** | 查询一个主机/虚拟机的在线业务水线。 | > 说明: > > - 当一个主机上所有在线业务的总带宽高于水线时,会限制离线业务可以使用的带宽,反之当一个主机上所有在线业务的总带宽低于水线时,会提高离线业务可以使用的带宽。 -> - 判断在线业务的总带宽是否超过/低于设置的水线的时机:每 10 ms 判断一次,根据每个 10 ms 内统计的在线带宽是否高于水线来决定对离线业务采用的带宽限制。 +> - 判断在线业务的总带宽是否超过/低于设置的水线的时机:每10ms判断一次,根据每个10ms内统计的在线带宽是否高于水线来决定对离线业务采用的带宽限制。 -**使用示例** +##### 使用示例 -- 设置在线业务水线为 20mb +- 设置在线业务水线为20mb ```shell # bwmcli -s waterline 20mb @@ -202,31 +180,27 @@ oncn-bwm 工具提供了 `bwmcli` 命令行工具来使能 Pod 带宽管理或 waterline is 20971520(B) ``` - - #### 统计信息 -**命令和功能** +##### 命令和功能 | 命令格式 | 功能 | | ------------------- | ------------------ | -| **bwmcli –p stats** | 查询内部统计信息。 | - +| **bwmcli -p stats** | 查询内部统计信息。 | > 说明: > > - offline_target_bandwidth 表示离线业务目标带宽 > -> - online_pkts 表示开启 Pod 带宽管理后在线业务总包数 +> - online_pkts 表示开启Pod带宽管理后在线业务总包数 > -> - offline_pkts 表示开启 Pod 带宽管理后离线业务总包数 +> - offline_pkts 表示开启Pod带宽管理后离线业务总包数 > > - online_rate 表示当前在线业务速率 > > - offline_rate 表示当前离线业务速率 - -**使用示例** +##### 使用示例 查询内部统计信息 @@ -239,20 +213,28 @@ online_rate: 602 offline_rate: 0 ``` - - - - ### 典型使用案例 -完整配置一个节点上的 Pod 带宽管理可以按照如下步骤顺序操作: +完整配置一个节点上的Pod带宽管理可以按照如下步骤顺序操作: -``` -bwmcli -p devs # 查询系统当前网卡 Pod 带宽管理状态 -bwmcli -e eth0 # 使能 eth0 的网卡 Pod 带宽管理 -bwmcli -s /sys/fs/cgroup/net_cls/online 0 # 设置在线业务 Pod 的网络优先级为 0 -bwmcli -s /sys/fs/cgroup/net_cls/offline -1 # 设置离线业务 Pod 的网络优先级为 -1 +```shell +bwmcli -p devs # 查询系统当前网卡Pod带宽管理状态 +bwmcli -e eth0 # 使能eth0的网卡Pod带宽管理 
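+# 说明:以下cgroup路径仅为示例路径,假设net_cls子系统已挂载到/sys/fs/cgroup/net_cls;
+# 若对应的在线/离线分组目录尚不存在,可先创建后再设置优先级,例如:
+# mkdir -p /sys/fs/cgroup/net_cls/online /sys/fs/cgroup/net_cls/offline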
+bwmcli -s /sys/fs/cgroup/net_cls/online 0 # 设置在线业务Pod的网络优先级为0 +bwmcli -s /sys/fs/cgroup/net_cls/offline -1 # 设置离线业务Pod的网络优先级为-1 bwmcli -s bandwidth 20mb,1gb # 配置离线业务带宽范围 bwmcli -s waterline 30mb # 配置在线业务的水线 ``` +### 约束限制 + +1. 仅允许root用户执行bwmcli命令行。 +2. 本特性当前仅支持设置两档网络QoS优先级:离线和在线。 +3. 某个网卡上已经设置过tc qdisc规则的情况下,对此网卡使能网络QoS功能会失败。 +4. 网卡被插拔重新恢复后,原来设置的QoS规则会丢失,需要手动重新配置网络QoS功能。 +5. 用一条命令同时使能/去使能多张网卡的时候,如果中间有网卡执行失败,则终止对后面网卡的执行。 +6. 环境上开启SELinux的情况下,未对bwmcli程序配置SELinux策略可能导致部分命令(例如水线,带宽,优先级的设置或查询)失败,可在SELinux日志中确认。此情况可以通过关闭SELinux或对bwmcli程序配置SELinux策略解决。 +7. 升级包不会影响升级前的使能状态,卸载包会关闭对所有设备的使能。 +8. 网卡名仅支持数字、英文字母、中划线“-” 和下划线“_”这四类字符类型,包含其他字符类型的网卡不被识别。 +9. 实际使用过程中,带宽限速有可能造成协议栈内存积压,此时依赖传输层协议自行反压,对于udp等无反压机制的协议场景,可能出现丢包、ENOBUFS、限速有偏差等问题。 +10. 使用bwmcli使能某个网卡的网络Qos功能后,不能再使用tc命令修改该网卡的tc规则,否则可能会影响该网卡的网络Qos功能,导致功能异常。 diff --git "a/docs/zh/docs/ops_guide/\345\270\270\347\224\250\346\212\200\350\203\275.md" "b/docs/zh/docs/ops_guide/\345\270\270\347\224\250\346\212\200\350\203\275.md" index c90f6ff4bf173778baa5b29e18f9995232445172..6affb0446a02dba87efb2cb4ab2243ea61452c7f 100644 --- "a/docs/zh/docs/ops_guide/\345\270\270\347\224\250\346\212\200\350\203\275.md" +++ "b/docs/zh/docs/ops_guide/\345\270\270\347\224\250\346\212\200\350\203\275.md" @@ -352,7 +352,7 @@ rpm -q --changelog python-2.6.6 | grep -i "CVE-2019-9636" 默认情况下,当向Linux系统添加新的存储库时,GPG密钥将自动导入。同时,也可在RPM命令中添加**--import** 手动导入RPM GPG密钥,用于从存储库下载时检查包的完整性。 ```shell -rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-OpenEuler-22.03-LTS-SP2 +rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-OpenEuler-24.03-LTS-SP1 ``` 3. 
dnf命令 diff --git "a/docs/zh/docs/ops_guide/\346\225\205\351\232\234\345\272\224\346\200\245\345\244\204\347\220\206.md" "b/docs/zh/docs/ops_guide/\346\225\205\351\232\234\345\272\224\346\200\245\345\244\204\347\220\206.md" index 8e11dfc0f8eb8cbe7cf26375c6d8d51f9dbc0a54..9102679c9c39ed706da3513848501caaf31be6b5 100644 --- "a/docs/zh/docs/ops_guide/\346\225\205\351\232\234\345\272\224\346\200\245\345\244\204\347\220\206.md" +++ "b/docs/zh/docs/ops_guide/\346\225\205\351\232\234\345\272\224\346\200\245\345\244\204\347\220\206.md" @@ -78,18 +78,15 @@ echo 3 > /proc/sys/vm/drop_caches - 救援模式 - 挂载openEuler 22.03 LTS SP2镜像进入救援模式。 + 挂载 openEuler 镜像进入救援模式。 1. 选择Troubleshooting。 2. 选择Rescue a openEuler system。 3. 按提示操作进行。 1)Continue - 2)Read-only mount - 3)Skip to shell - 4)Quit(Reboot) - 单用户模式 diff --git a/docs/zh/docs/rubik/figures/iocost.PNG b/docs/zh/docs/rubik/figures/iocost.PNG new file mode 100644 index 0000000000000000000000000000000000000000..db9bf8c351e8b7047c5815c5779a98c406a62ccd Binary files /dev/null and b/docs/zh/docs/rubik/figures/iocost.PNG differ diff --git "a/docs/zh/docs/rubik/http\346\216\245\345\217\243\346\226\207\346\241\243.md" "b/docs/zh/docs/rubik/http\346\216\245\345\217\243\346\226\207\346\241\243.md" deleted file mode 100644 index 75bca4a21377e1cd452552922cb1da17e27520ed..0000000000000000000000000000000000000000 --- "a/docs/zh/docs/rubik/http\346\216\245\345\217\243\346\226\207\346\241\243.md" +++ /dev/null @@ -1,67 +0,0 @@ -# http接口 - -## 概述 - -rubik对外开放接口均为http接口,当前包括pod优先级设置/更新接口、rubik探活接口和rubik版本号查询接口。 - -## 接口介绍 - -### 设置、更新Pod优先级接口 - -rubik提供了设置或更新pod优先级的功能,外部可通过调用该接口发送pod相关信息,rubik根据接收到的pod信息对其设置优先级从而达到资源隔离的目的。接口调用格式为: - -```bash -HTTP POST /run/rubik/rubik.sock -{ - "Pods": { - "podaaa": { - "CgroupPath": "kubepods/burstable/podaaa", - "QosLevel": 0 - }, - "podbbb": { - "CgroupPath": "kubepods/burstable/podbbb", - "QosLevel": -1 - } - } -} -``` - -Pods 配置中为需要设置或更新优先级的 Pod 信息,每一个http请求至少需要指定配置1个 pod,每个 pod 必须指定CgroupPath 和 QosLevel,其含义如下: 
- -| 配置项 | 配置值类型 | 配置取值范围 | 配置含义 | -| ---------- | ---------- | ------------ | ------------------------------------------------------- | -| QosLevel | int | 0、-1 | pod优先级,0表示其为在线业务,-1表示其为离线业务 | -| CgroupPath | string | 相对路径 | 对应Pod的cgroup子路径(即其在cgroup子系统下的相对路径) | - -接口调用示例如下: - -```sh -curl -v -H "Accept: application/json" -H "Content-type: application/json" -X POST --data '{"Pods": {"podaaa": {"CgroupPath": "kubepods/burstable/podaaa","QosLevel": 0},"podbbb": {"CgroupPath": "kubepods/burstable/podbbb","QosLevel": -1}}}' --unix-socket /run/rubik/rubik.sock http://localhost/ -``` - -### 探活接口 - -rubik作为HTTP服务,提供探活接口用于帮助判断rubik是否处于运行状态。 - -接口形式:HTTP/GET /ping - -接口调用示例如下: - -```sh -curl -XGET --unix-socket /run/rubik/rubik.sock http://localhost/ping -``` - -若返回ok则代表rubik服务处于运行状态。 - -### 版本信息查询接口 - -rubik支持通过HTTP请求查询当前rubik的版本号。 - -接口形式:HTTP/GET /version - -接口调用示例如下: - -```sh -curl -XGET --unix-socket /run/rubik/rubik.sock http://localhost/version -{"Version":"0.0.1","Release":"1","Commit":"29910e6","BuildTime":"2021-05-12"} -``` diff --git a/docs/zh/docs/rubik/modules.md b/docs/zh/docs/rubik/modules.md new file mode 100644 index 0000000000000000000000000000000000000000..a6e651b0b889de33163f25a3bd094832ab4e5bf6 --- /dev/null +++ b/docs/zh/docs/rubik/modules.md @@ -0,0 +1,510 @@ +# 特性介绍 + +## preemption 绝对抢占 + +rubik支持业务优先级配置,针对在离线业务混合部署的场景,确保在线业务相对离线业务的资源抢占。目前仅支持CPU资源和内存资源。 + +使用该特性,用户需开启rubik的绝对抢占特性。 + +```yaml +... + "agent": { + "enabledFeatures": [ + "preemption" + ] + }, + "preemption": { + "resource": [ + "cpu", + "memory" + ] + } +... 
+``` + +配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#preemption)。 + +同时,用户需要在pod的yaml注解中增加`volcano.sh/preemptable`字段来指定业务优先级。业务优先级配置示例如下: + +```yaml +annotations: + volcano.sh/preemptable: true +``` + +> ![](./figures/icon-note.gif) **说明**: +> +> 在rubik中,所有特性均通过识别`volcano.sh/preemptable`注解作为业务在离线标志。true代表业务为离线业务,false代表业务为在线业务。 + +### CPU绝对抢占 + +针对在离线业务混合部署的场景,确保在线业务相对离线业务的CPU资源抢占。 + +**前置条件** + +- 内核支持针对cgroup的cpu优先级配置,cpu子系统存在接口`cpu.qos_level`。建议使用内核版本openEuler-22.03+。 + +**内核接口** + +- /sys/fs/cgroup/cpu 目录下容器的 cgroup 中,如`/sys/fs/cgroup/cpu/kubepods/burstable//`目录。 +- cpu.qos_level:开启 CPU 优先级配置,默认值为 0, 有效值为 0 和-1。 + - 0:标识为在线业务。 + - -1:标识为离线业务。 + +### 内存绝对抢占 + +针对在离线业务混合部署的场景,确保系统内存不足时优先杀死离线业务。 + +**前置条件** + +- 内核支持针对cgroup的memory优先级配置,memory子系统存在接口`memory.qos_level`。建议使用内核版本openEuler-22.03+。 +- 开启内存优先级支持: `echo 1 > /proc/sys/vm/memcg_qos_enable` + +**内核接口** + +- /proc/sys/vm/memcg_qos_enable:开启内存优先级特性,默认值为 0,有效值为 0 和 1。开启命令为:`echo 1 > /proc/sys/vm/memcg_qos_enable`。 + - 0:表示关闭特性。 + - 1:表示开启特性。 + +- /sys/fs/cgroup/memory 目录下容器的 cgroup 中,如`/sys/fs/cgroup/memory/kubepods/burstable//`目录 + - memory.qos_level:开启内存优先级配置,默认值为 0,有效值为 0 和-1。 + - 0:标识为在线业务。 + - -1:标识为离线业务。 + +## dynCache 访存带宽和LLC限制 + +rubik 支持业务的 Pod 访存带宽(memory bandwidth)和 LLC(Last Level Cache)限制,通过限制离线业务的访存带宽/LLC 使用,减少其对在线业务的干扰。 + +**前置条件**: + +- cache/访存限制功能仅支持物理机,不支持虚拟机。 + - X86 物理机,需要 OS 支持且开启 intel RDT 的 CAT 和 MBA 功能,内核启动项 cmdline 需要添加`rdt=l3cat,mba` + - ARM 物理机,需要 OS 支持且开启 mpam 功能,内核启动项需要添加`mpam=acpi`。 +- 由于内核限制,RDT mode 当前不支持 pseudo-locksetup 模式。 +- 用户需手动挂载目录`/sys/fs/resctrl`。 rubik 需要读取和设置`/sys/fs/resctrl` 目录下的文件,该目录需在 rubik 启动前挂载,且需保障在 rubik 运行过程中不被卸载。 +- rubik运行依赖SYS_ADMIN权限. 
设置主机`/sys/fs/resctrl` 目录下的文件需要 rubik 容器被赋有 SYS_ADMIN 权限。 +- rubik 需要获取业务容器进程在主机上的 pid,所以 rubik 容器需与主机共享 pid namespace。 + +**rubik rdt 控制组**: + +rubik 在 RDT resctrl 目录(默认为 /sys/fs/resctrl)下创建 5 个控制组,分别为 rubik_max、rubik_high、rubik_middle、rubik_low、rubik_dynamic。rubik 启动后,将水位线写入对应控制组的 schemata。其中,low、middle、high 的水位线可在 dynCache 中配置;max 控制组为默认最大值,dynamic 控制组初始水位线和 low 控制组一致。 + +**rubik dynamic 控制组**: + +当存在 level 为 dynamic 的离线 Pod 时,rubik 通过采集当前节点在线业务 Pod 的 cache miss 和 llc miss 指标,调整 rubik_dynamic 控制组的水位线,实现对 dynamic 控制组内离线应用 Pod 的动态控制。 + +### 为Pod设置访存带宽和LLC限制 + +rubik支持两种方式为业务Pod配置访存带宽和LLC控制组: + +- 全局方式 + 用户可在rubik的全局参数中配置`defaultLimitMode`字段,rubik会自动为离线业务Pod(即绝对抢占特性中的注解`volcano.sh/preemptable`)配置控制组。 + - 取值为`static`时,pod将被加入到`rubik_max`控制组。 + - 取值为`dynamic`时,pod将被加入到`rubik_dynamic`控制组。 + +- 手动指定 + 用户可手动通过为业务Pod增加注解`volcano.sh/cache-limit`设置其 cache level, 并被加入到指定的控制组中。如下列配置的pod将被加入rubik_low控制组: + + ```yaml + annotations: + volcano.sh/cache-limit: "low" + ``` + +> ![](./figures/icon-note.gif) **说明**: +> +> 1. cache限制只针对离线业务。 +> +> 2. 
手动指定注解优先于全局方式。即,若用户在rubik的全局参数中配置了`defaultLimitMode`字段,并且在业务 Pod yaml 中指定了cache level,则dynCache限制将以Pod yaml中的注解为准。 + +### dynCache 内核接口 + +- /sys/fs/resctrl: 在该目录下创建 5 个控制组目录,并修改其 schemata 和 tasks 文件。 + +### dynCache 配置详解 + +dynCache 功能相关的配置如下: + +```json +"agent": { + "enabledFeatures": [ + "dynCache" + ] +}, +"dynCache": { + "defaultLimitMode": "static", + "adjustInterval": 1000, + "perfDuration": 1000, + "l3Percent": { + "low": 20, + "mid": 30, + "high": 50 + }, + "memBandPercent": { + "low": 10, + "mid": 30, + "high": 50 + } +} +``` + +配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#dyncache)。 + +- l3Percent 和 memBandPercent: + + 通过 l3Percent 和 memBandPercent 配置 low, mid, high 控制组的水位线。 + + 比如当环境的`rdt bitmask=fffff, numa=2`时,rubik_low 的控制组将根据 l3Percent low=20 和 memBandPercent low=10 两个参数,将为/sys/fs/resctrl/rubik_low 控制组配置: + + ```bash + L3:0=f;1=f + MB:0=10;1=10 + ``` + +- defaultLimitMode: + + 如果离线 Pod 未指定`volcano.sh/cache-limit`注解,将根据 dynCache 的 defaultLimitMode 来决定 Pod 将被加入哪个控制组。 +- adjustInterval: + + dynCache 动态调整 rubik_dynamic 控制组的间隔时间,单位 ms,默认 1000ms。 +- perfDuration: + + dynCache 性能 perf 执行时长,单位 ms,默认 1000ms。 + +### dynCache 注意事项 + +- dynCache 仅针对离线 Pod,对在线业务不生效。 +- 若业务容器运行过程中被手动重启(容器 ID 不变但容器进程 PID 变化),针对该容器的 dynCache 无法生效。 +- 业务容器启动并已设置 dynCache 级别后,不支持对其限制级别进行修改。 +- 动态限制组的调控灵敏度受到 rubik 配置文件内 adjustInterval、perfDuration 值以及节点在线业务 Pod 数量的影响,每次调整(若干扰检测结果为需要调整)间隔在区间[adjustInterval+perfDuration, adjustInterval+perfDuration*Pod 数量]内波动,用户可根据灵敏度需求调整配置项。 + +## dynMemory 内存异步分级回收 + +rubik 中支持多种内存策略。针对不同场景使用不同的内存分配方案,以解决多场景内存分配。目前仅支持fssr策略。 + +### fssr 策略 + +fssr策略是基于内核 cgroup 的动态水位线快压制慢恢复策略。memory.high 是内核提供的 memcg 级的水位线接口,rubik 动态检测内存压力,动态调整离线应用的 memory.high 上限,实现对离线业务的内存压制,保障在线业务的服务质量。 + +其核心为: + +- rubik启动时计算预留内存,默认为总内存的10%,如果总内存的10%超过10G,则为10G。 +- 配置离线容器的cgroup级别水位线,内核提供`memory.high`和`memory.high_async_ratio`两个接口,分别配置cgroup的软上限和警戒水位线。启动rubik时默认配置`memory.high`为`total_memory`(总内存)`*`80%。 +- 获取剩余内存free_memory。 +- 
free_memory小于预留内存reserved_memory时降低离线的memory.high,每次降低总内存的10%,total_memory`*`10%。 +- 持续一分钟free_memory>2`*`reserved_memory时提高离线的memory.high,每次提升总内存的1%,total_memory`*`1%。 + +**内核接口** + +- memory.high + +### dynMemory 配置详解 + +rubik 提供 dynMemory 的指定策略,在`dynMemory`中 + +```json +"dynMemory": { + "policy": "fssr" +} +``` + +- policy 为 memory 的策略名称,支持 fssr 选项。 + +## 支持弹性限流 + +为有效解决由业务CPU限流导致QoS下降的问题,rubik容器提供了弹性限流功能,允许容器使用额外的CPU资源,从而保证业务的平稳运行。弹性限流方案包括内核态和用户态配置两种。二者不可同时使用。 + +用户态通过Linux内核提供的`CFS bandwidth control`能力实现,在保障整机负载水位安全稳定及不影响其他业务运行的前提下,通过双水位机制允许业务容器自适应调整CPU限制,缓解CPU资源瓶颈,提高业务的运行性能。 + +内核态通过Linux内核提供的`CPU burst`能力,允许容器短时间内突破其cpu使用限制。内核态配置需要用户手动设置和修改每个pod的burst值的大小,rubik不作自适应调整。 + +### quotaTurbo 用户态解决方案 + +用户手动为需要自适应调整CPU限额的业务Pod指定“volcano.sh/quota-turbo="true"”注解,(仅针对限额Pod生效,即yaml中指定CPULimit)。 +弹性限流用户态策略根据当前整机CPU负载和容器运行情况定时调整白名单容器的CPU quota,并在启停rubik时自动检验并恢复全部容器的quota值 (本节描述的CPU quota指容器当前的cpu.cfs_quota_us参数)。调整策略包括: + +1. 整机CPU负载低于警戒水位时,若白名单容器在当前周期受到CPU压制,则rubik按照压制情况缓慢提升容器CPU quota。单轮容器Quota提升总量最多不超过当前节点总CPU quota的1%。 +2. 整机CPU负载高于高水位时,若白名单容器在当前周期未受到CPU压制,则rubik依据水位慢速回调容器quota值。 +3. 整机CPU负载高于警戒水位时,若白名单容器当前Quota值超过配置值,则rubik快速回落所有容器CPU quota值,尽力保证负载低于警戒水位。 +4. 容器最大可调整CPU quota不超过2倍用户配置值(例如Pod yaml中指定CPUlimit参数),但不应小于用户配置值。 +5. 容器在60个同步间隔时间内的整体CPU利用率不得超过用户配置值。 +6. 
若节点在1分钟内整体 CPU 利用率超过10%,则本轮不提升容器配额。 + +**内核接口** + +/sys/fs/cgroup/cpu 目录下容器的 cgroup 中,如`/sys/fs/cgroup/cpu,cpuacct/kubepods/burstable//`目录,涉及下列文件中: + +- cpu.cfs_quota_us +- cpu.cfs_period_us +- cpu.stat + +#### quotaTurbo配置详解 + +quotaTurbo 功能相关的配置如下: + +```json +"agent": { + "enabledFeatures": [ + "quotaTurbo" + ] + }, +"quotaTurbo": { + "highWaterMark": 60, + "alarmWaterMark": 80, + "syncInterval": 100 +} +``` + +配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#quotaturbo)。 + +- highWaterMark是CPU负载的高水位值。 +- alarmWaterMark是CPU负载的警戒水位值。 +- syncInterval是触发容器quota值更新的间隔(单位:毫秒)。 + +用户手动为需要业务Pod指定`volcano.sh/quota-turbo="true"`注解。示例如下: + +```yaml +metadata: + annotations: + # true表示列入quotaturbo特性的白名单中 + volcano.sh/quota-turbo : "true" +``` + +### quotaBurst 内核态解决方案 + +quotaBurst通过配置容器的`cpu.cfs_burst_us`内核接口,允许容器在其 cpu 使用量低于 quota 时累积 cpu 资源,并在 cpu 使用量超过 quota 时,使用容器累积的 cpu 资源。 + +**内核接口** + +/sys/fs/cgroup/cpu 目录下容器的 cgroup 中,如`/sys/fs/cgroup/cpu/kubepods/burstable//`目录,注解的值将被写入下列文件中: + +- cpu.cfs_burst_us + +> ![](./figures/icon-note.gif) **说明**: +> +> 内核态通过内核接口cpu.cfs_burst_us实现。支持内核态配置需要确认cgroup的cpu子系统目录下存在cpu.cfs_burst_us文件,其值约束如下: +> +> 1. 当cpu.cfs_quota_us的值不为-1时,需满足cfs_burst_us + cfs_quota_us <= $2^{44}$-1 且 cfs_burst_us <= cfs_quota_us。 +> 2. 
当cpu.cfs_quota_us的值为-1时,CPU burst功能不生效,cfs_burst_us默认为0,不支持配置其他任何值。
+
+#### quotaBurst配置详解
+
+quotaBurst 功能相关的配置如下:
+
+```json
+"agent": {
+    "enabledFeatures": [
+        "quotaBurst"
+    ]
+}
+```
+
+用户手动为业务Pod指定`volcano.sh/quota-burst-time`注解,或者在 Pod 运行期间通过 kubectl annotate 进行动态的修改。
+
+- 创建时:在 yaml 文件中
+
+  ```yaml
+  metadata:
+    annotations:
+      # 默认单位是 microseconds
+      volcano.sh/quota-burst-time : "2000"
+  ```
+
+- 修改注解:可通过 kubectl annotate 动态修改,如:
+
+  ```bash
+  kubectl annotate --overwrite pods <podname> volcano.sh/quota-burst-time='3000'
+  ```
+
+### 约束限制
+
+- 用户态通过CFS bandwidth control调整cpu.cfs_period_us和cpu.cfs_quota_us参数实现CPU带宽控制。因此用户态约束如下:
+  - 禁止第三方更改CFS bandwidth control相关参数(包括但不限于cpu.cfs_quota_us、cpu.cfs_period_us等文件),以避免未知错误。
+  - 禁止与具有限制CPU资源功能的同类产品同时使用,否则导致用户态功能无法正常使用。
+  - 若用户监控CFS bandwidth control相关指标,使用本特性可能会破坏监测指标的一致性。
+- 内核态约束如下:
+  - 用户应使用k8s接口设置pod的burst值,禁止用户手动直接修改容器的cpu cgroup目录下的cpu.cfs_burst_us文件。
+- 禁止用户同时使能弹性限流用户态和内核态方案。
+
+## ioCost 支持iocost对IO权重控制
+
+为了有效解决由离线业务IO占用过高,导致在线业务QoS下降的问题,rubik容器提供了基于cgroup v1 iocost的IO权重控制功能。
+资料参见:
+[iocost内核相关功能介绍](https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#io:~:text=correct%20memory%20ownership.-,IO,-%C2%B6)。
+
+**前置条件**
+
+rubik 支持通过在 cgroup v1 下的 iocost 控制不同 Pod 的 io 权重分配。因此需要内核支持如下特性:
+
+- 内核支持 cgroup v1 blkcg iocost
+- 内核支持 cgroup v1 writeback
+
+即在 blkcg 根系统文件下存在`blkio.cost.qos`和`blkio.cost.model`两个文件接口。实现方式和接口说明可以访问 openEuler 内核文档。
+
+### ioCost实现说明
+
+![iocost implement](./figures/iocost.PNG)
+
+步骤如下:
+
+- 部署 rubik 时,rubik 解析配置并设置 iocost 相关参数。
+- rubik 注册检测事件到 k8s api-server。
+- Pod 被部署时将 Pod 配置信息等回调到 rubik。
+- rubik 解析 Pod 配置信息,并根据 qos level 配置 Pod iocost 权重。
+
+### ioCost配置说明
+
+```json
+"agent": {
+    "enabledFeatures": [
+        "ioCost"
+    ]
+},
+"ioCost": [{
+    "nodeName": "k8s-single",
+    "config": [
+        {
+            "dev": "sdb",
+            "enable": true,
+            "model": "linear",
+            "param": {
+                "rbps": 10000000,
+                "rseqiops": 10000000,
+                "rrandiops": 10000000,
+                "wbps": 10000000,
+                "wseqiops": 10000000,
+                "wrandiops": 
10000000
+            }
+        }
+    ]
+}]
+```
+
+配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#iocost)。
+
+> ![](./figures/icon-note.gif) **说明**:
+>
+> iocost linear 模型相关参数可以通过 iocost_coef_gen.py 脚本获取,可以从[此链接](https://github.com/torvalds/linux/blob/master/tools/cgroup/iocost_coef_gen.py)获得。
+
+## PSI 支持基于PSI指标的干扰检测
+
+rubik支持观察在线Pod的PSI指标判断当前在线业务的压力,并通过驱逐离线Pod、日志告警等手段预警。rubik以`some avg10`作为指标,它表示任一任务在10s内的平均阻塞时间占比。用户可按需选择对CPU、内存、IO资源进行监测,并设置相应阈值。若阻塞占比超过该阈值,则rubik按照一定策略驱逐离线Pod,释放相应资源。若在线Pod的CPU和内存利用率偏高,rubik会驱逐当前占用CPU资源/内存资源最多的离线业务。若离线业务I/O高,则会选择驱逐CPU资源占用最多的离线业务。
+
+在离线业务由注解`volcano.sh/preemptable="true"/"false"`标识。
+
+```yaml
+annotations:
+  volcano.sh/preemptable: true
+```
+
+**前置条件**
+
+rubik 依赖于 cgroup v1 下的 psi 特性。openEuler 22.03及以上版本支持psi cgroup v1接口。
+通过如下方法查看当前内核是否开启cgroup v1的psi接口:
+
+```bash
+cat /proc/cmdline | grep "psi=1 psi_v1=1"
+```
+
+若无,则为内核启动命令行新增参数:
+
+```bash
+# 查看内核版本号
+uname -a
+# 配置内核的boot文件
+grubby --update-kernel="$(grubby --default-kernel)" --args="psi=1 psi_v1=1"
+# 重启
+reboot
+```
+
+**内核接口**
+
+/sys/fs/cgroup/cpuacct 目录下容器的 cgroup 中,如`/sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/<PodUID>/<容器id>`目录,涉及下列文件中:
+
+- cpu.pressure
+- memory.pressure
+- io.pressure
+
+### psi配置说明
+
+```json
+"agent": {
+    "enabledFeatures": [
+        "psi"
+    ]
+},
+"psi": {
+    "interval": 10,
+    "resource": [
+        "cpu",
+        "memory",
+        "io"
+    ],
+    "avg10Threshold": 5.0
+}
+```
+
+配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#psi)。
+
+## CPU驱逐水位线控制
+
+rubik支持根据节点CPU利用率驱逐离线Pod,从而避免节点CPU资源过载。用户可以配置CPU驱逐水位线,rubik会统计指定窗口期间节点的平均CPU利用率。若窗口期内平均CPU利用率大于CPU驱逐水位线,则rubik驱逐资源利用率高且运行时间较短的离线Pod,释放相应资源。
+
+> ![](./figures/icon-note.gif) **说明**:
+>
+> 在离线业务由注解`volcano.sh/preemptable="true"/"false"`标识。
+>
+> ```yaml
+> annotations:
+>   volcano.sh/preemptable: true
+> ```
+
+**配置说明**
+
+```json
+{
+    "agent": {
+        "enabledFeatures": [
+            "cpuevict"
+        ]
+    },
+    "cpuevict": {
+        "threshold": 60,
+        "interval": 1,
+        "windows": 2,
+        "cooldown": 20
+    }
+}
+```
 
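+上述配置的水位线判定思路(窗口内平均 CPU 利用率与 threshold 比较)可用如下示意脚本理解。注意:这只是帮助理解参数含义的假设性示例,并非 rubik 的实现,`should_evict` 函数名、采样值及其单位均为本文虚构的简化假设:
+
+```shell
+#!/bin/bash
+# 示意:threshold=60 对应上述配置,表示窗口内平均 CPU 利用率(%)超过 60 时触发驱逐判定
+threshold=60
+
+# should_evict <各采样点利用率...>:计算平均利用率,高于水位线输出 evict,否则输出 keep
+should_evict() {
+    local sum=0 count=0 avg
+    for u in "$@"; do
+        sum=$((sum + u))
+        count=$((count + 1))
+    done
+    avg=$((sum / count))
+    if [ "$avg" -gt "$threshold" ]; then
+        echo "evict"
+    else
+        echo "keep"
+    fi
+}
+
+should_evict 50 55   # 平均 52,低于水位线,输出 keep
+should_evict 70 80   # 平均 75,高于水位线,输出 evict
+```
+
+实际的采样周期与窗口长度由 interval 与 windows 决定,cooldown 则限制两次驱逐之间的最小间隔。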
+配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#cpu驱逐水位线控制)。 + +## 内存驱逐水位线控制 + +rubik支持通过根据节点内存利用率驱逐离线Pod从而避免节点内存资源过载。用户可以配置内存驱逐水位线。若节点内存利用率大于内存驱逐水位线,则rubik则驱逐资源利用率高且运行时间较短的离线Pod,释放相应资源。 + +> ![](./figures/icon-note.gif) **说明**: +> +> 在离线业务由注解`volcano.sh/preemptable="true"/"false"`标识。 +> +> ```yaml +> annotations: +> volcano.sh/preemptable: true +> ``` + +**配置说明** + +```json +{ + "agent": { + "enabledFeatures": [ + "memoryevict" + ] + } + "memoryevict": { + "threshold": 60, + "interval": 1, + "cooldown": 4 + } +} +``` + +配置参数详见[配置文档](./%E9%85%8D%E7%BD%AE%E6%96%87%E6%A1%A3.md#内存驱逐水位线控制)。 diff --git a/docs/zh/docs/rubik/overview.md b/docs/zh/docs/rubik/overview.md index 52f70b7f583482bf51f21b8956771da0ada6b3d5..d9fca47da6166509ca7185fcc62a5052919eee7d 100644 --- a/docs/zh/docs/rubik/overview.md +++ b/docs/zh/docs/rubik/overview.md @@ -1,18 +1,27 @@ -# rubik使用指南 - -## 概述 - -服务器资源利用率低一直是业界公认的难题,随着云原生技术的发展,将在线(高优先级)、离线(低优先级)业务混合部署成为了当下提高资源利用率的有效手段。 - -rubik容器调度在业务混合部署的场景下,根据QoS分级,对资源进行合理调度,从而实现在保障在线业务服务质量的前提下,大幅提升资源利用率。 - -rubik当前支持如下特性: - -- pod CPU优先级的配置 -- pod memory优先级的配置 - -本文档适用于使用openEuler系统并希望了解和使用rubik的社区开发者、开源爱好者以及相关合作伙伴。使用人员需要具备以下经验和技能: - -* 熟悉Linux基本操作 -* 熟悉kubernetes和docker/iSulad基本操作 - +# rubik 使用指南 + +## 概述 + +如何改善服务器资源利用率低的现状一直是业界公认的难题,随着云原生技术的发展,将在线(高优先级)、离线(低优先级)业务混合部署成为了当下提高资源利用率的有效手段。 + +rubik 容器调度在业务混合部署的场景下,根据 QoS 分级,对资源进行合理调度,从而实现在保障在线业务服务质量的前提下,大幅提升资源利用率。 + +rubik 当前支持如下特性: + +- [preemption 绝对抢占](./modules.md#preemption-绝对抢占) + - [CPU绝对抢占](./modules.md#cpu绝对抢占) + - [内存绝对抢占](./modules.md#内存绝对抢占) +- [dynCache 访存带宽和LLC限制](./modules.md#dyncache-访存带宽和llc限制) +- [dynMemory 内存异步分级回收](./modules.md#dynmemory-内存异步分级回收) +- [支持弹性限流](./modules.md#支持弹性限流) + - [quotaBurst 支持弹性限流内核态解决方案](./modules.md#quotaburst-内核态解决方案) + - [quotaTurbo 支持弹性限流用户态解决方案](./modules.md#quotaturbo-用户态解决方案) +- [ioCost 支持iocost对IO权重控制](./modules.md#iocost-支持iocost对io权重控制) +- [PSI 支持基于PSI指标的干扰检测](./modules.md#psi-支持基于psi指标的干扰检测) +- [CPU驱逐水位线控制](./modules.md#cpu驱逐水位线控制) +- 
[内存驱逐水位线控制](./modules.md#内存驱逐水位线控制) + +本文档适用于使用 openEuler 系统并希望了解和使用 rubik 的社区开发者、开源爱好者以及相关合作伙伴。使用人员需要具备以下经验和技能: + +- 熟悉 Linux 基本操作 +- 熟悉 kubernetes 和 docker/iSulad 基本操作 diff --git "a/docs/zh/docs/rubik/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" "b/docs/zh/docs/rubik/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" index 44f26233aee1c0051e4c14615603f7de4f4f0c70..a6ceb3f6d06a29b576186a1c66f06516c06aebb4 100644 --- "a/docs/zh/docs/rubik/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" +++ "b/docs/zh/docs/rubik/\345\256\211\350\243\205\344\270\216\351\203\250\347\275\262.md" @@ -2,98 +2,98 @@ ## 概述 -本章节主要介绍rubik组件的安装以及部署方式。 +本章节主要介绍 rubik 组件的安装以及部署方式,以openEuler 24.03-LTS-SP1版本为例进行部署。 ## 软硬件要求 ### 硬件要求 -* 当前仅支持 x86、aarch64架构。 -* rubik磁盘使用需求:配额1GB及以上。 -* rubik内存使用需求:配额100MB及以上。 +* 当前仅支持 x86、aarch64 架构。 +* rubik 磁盘使用需求:配额 1GB 及以上。 +* rubik 内存使用需求:配额 100MB 及以上。 ### 软件要求 -* 操作系统:openEuler 22.03-LTS -* 内核:openEuler 22.03-LTS版本内核 +* 操作系统:openEuler 24.03-LTS-SP1。 +* 内核:openEuler 24.03-LTS-SP1 版本内核。 ### 环境准备 -* 安装 openEuler 系统,安装方法参考《[安装指南](../Installation/installation.md)》。 -* 安装并部署 kubernetes,安装及部署方法参考《Kubernetes 集群部署指南》。 -* 安装docker或isulad容器引擎,若采用isulad容器引擎,需同时安装isula-build容器镜像构建工具。 +* 安装 openEuler 系统。 +* 安装并部署 kubernetes。 +* 安装 docker 或 containerd 容器引擎。 -## 安装rubik +## 安装 rubik -rubik以k8s daemonSet形式部署在k8s的每一个节点上,故需要在每一个节点上使用以下步骤安装rubik rpm包。 +rubik 以`DaemonSet`形式部署在 k8s 的每一个节点上,故需要在每一个节点上使用以下步骤安装 rubik rpm 包。 -1. 配置 yum 源:openEuler 22.03-LTS 和 openEuler 22.03-LTS:EPOL(rubik组件当前仅在EPOL源中),参考如下: +1. 
配置 yum 源:rubik组件位于openEuler EPOL源中,以openEuler 24.03-LTS-SP1版本为例,参考如下: ``` - # openEuler 22.03-LTS 官方发布源 - name=openEuler22.03 - baseurl=https://repo.openeuler.org/openEuler-22.03-LTS/everything/$basearch/ + # openEuler 24.03-LTS-SP1 官方发布源 + name=openEuler24.03-LTS-SP1 + baseurl=https://repo.openeuler.org/openEuler-24.03-LTS-SP1/everything/$basearch/ enabled=1 gpgcheck=1 - gpgkey=https://repo.openeuler.org/openEuler-22.03-LTS/everything/$basearch/RPM-GPG-KEY-openEuler + gpgkey=https://repo.openeuler.org/openEuler-24.03-LTS-SP1/everything/$basearch/RPM-GPG-KEY-openEuler ``` ``` - # openEuler 22.03-LTS:Epol 官方发布源 - name=Epol - baseurl=https://repo.openeuler.org/openEuler-22.03-LTS/EPOL/$basearch/ + # openEuler 24.03-LTS-SP1:Epol 官方发布源 + name=openEuler24.03-LTS-SP1-Epol + baseurl=https://repo.openeuler.org/openEuler-24.03-LTS-SP1/EPOL/$basearch/ enabled=1 - gpgcheck=0 + gpgcheck=1 + gpgkey=https://repo.openeuler.org/openEuler-24.03-LTS-SP1/everything/$basearch/RPM-GPG-KEY-openEuler ``` -2. 使用root权限安装rubik: +2. 使用 root 权限安装 rubik: ```shell sudo yum install -y rubik ``` - -> ![](./figures/icon-note.gif)**说明**: +> ![](./figures/icon-note.gif) **说明**: > -> rubik工具相关文件会安装在/var/lib/rubik目录下 +> rubik 工具相关文件会安装在/var/lib/rubik 目录下。 -## 部署rubik +## 部署 rubik -rubik以容器形式运行在混合部署场景下的k8s集群中,用于对不同优先级业务进行资源隔离和限制,避免离线业务对在线业务产生干扰,在提高资源总体利用率的同时保障在线业务的服务质量。当前rubik支持对CPU、内存资源进行隔离和限制,需配合openEuler 22.03-LTS版本的内核使用。若用户想要开启内存优先级特性(即针对不同优先级业务实现内存资源的分级),需要通过设置/proc/sys/vm/memcg_qos_enable开关,有效值为0和1,其中0为缺省值表示关闭特性,1表示开启特性。 +rubik 以容器形式运行在混合部署场景下的 k8s 集群中,用于对不同优先级业务进行资源隔离和限制,避免离线业务对在线业务产生干扰,在提高资源总体利用率的同时保障在线业务的服务质量。当前 rubik 支持对 CPU、内存资源进行隔离和限制等特性,需配合 openEuler 24.03-LTS-SP1 版本的内核使用。若用户想要开启内存优先级特性(即针对不同优先级业务实现内存资源的分级),需要通过设置/proc/sys/vm/memcg_qos_enable 开关,有效值为 0 和 1,其中 0 为默认值表示关闭特性,1 表示开启特性。 ```bash sudo echo 1 > /proc/sys/vm/memcg_qos_enable ``` -### 部署rubik daemonset +### 部署 rubik daemonset -1. 
使用docker或isula-build容器引擎构建rubik镜像,由于rubik以daemonSet形式部署,故每一个节点都需要rubik镜像。用户可以在一个节点构建镜像后使用docker save/load功能将rubik镜像load到k8s的每一个节点,也可以在各节点上都构建一遍rubik镜像。以isula-build为例,参考命令如下: +1. 构建rubik镜像:使用`/var/lib/rubik/build_rubik_image.sh`脚本自动构建或者直接使用 docker容器引擎构建 rubik 镜像。由于 rubik 以 daemonSet 形式部署,故每一个节点都需要 rubik 镜像。用户可以在一个节点构建镜像后使用 docker save/load 功能将 rubik 镜像 load 到 k8s 的每一个节点,也可以在各节点上都构建一遍 rubik 镜像。以 docker 为例,其构建命令如下: ```sh -isula-build ctr-img build -f /var/lib/rubik/Dockerfile --tag rubik:0.1.0 . +docker build -f /var/lib/rubik/Dockerfile -t rubik:2.0.1-2 . ``` -2. 在k8s master节点,修改`/var/lib/rubik/rubik-daemonset.yaml`文件中的rubik镜像名,与上一步构建出来的镜像名保持一致。 +2. 在 k8s master 节点,修改`/var/lib/rubik/rubik-daemonset.yaml`文件中的 rubik 镜像名,与上一步构建出来的镜像名保持一致。 ```yaml ... containers: - name: rubik-agent - image: rubik:0.1.0 # 此处镜像名需与上一步构建的rubik镜像名一致 + image: rubik_image_name_and_tag # 此处镜像名需与上一步构建的 rubik 镜像名一致 imagePullPolicy: IfNotPresent ... ``` -3. 在k8s master节点,使用kubectl命令部署rubik daemonset,rubik会自动被部署在k8s的所有节点: +3. 在 k8s master 节点,使用 kubectl 命令部署 rubik daemonset,rubik 会自动被部署在 k8s 的所有节点: ```sh kubectl apply -f /var/lib/rubik/rubik-daemonset.yaml ``` -4. 使用`kubectl get pods -A`命令查看rubik是否已部署到集群每一个节点上(rubik-agent数量与节点数量相同且均为Running状态) +4. 使用`kubectl get pods -A`命令查看 rubik 是否已部署到集群每一个节点上(rubik-agent 数量与节点数量相同且均为 Running 状态): ```sh -[root@localhost rubik]# kubectl get pods -A +[root@localhost rubik]# kubectl get pods -A | grep rubik NAMESPACE NAME READY STATUS RESTARTS AGE ... 
kube-system rubik-agent-76ft6 1/1 Running 0 4s @@ -102,52 +102,37 @@ kube-system rubik-agent-76ft6 1/1 Running ## 常用配置说明 -通过以上方式部署的rubik将以默认配置启动,用户可以根据实际需要修改rubik配置,可通过修改rubik-daemonset.yaml文件中的config.json段落内容后重新部署rubik daemonset实现。 +通过以上方式部署的 rubik 将以默认配置启动,用户可以根据实际需要修改 rubik 配置,可通过修改 rubik-daemonset.yaml 文件中的 config.json 段落内容后重新部署 rubik daemonset 实现。以下介绍几个常见配置,其他配置详见 [配置文档](./配置文档.md)。 -本章介绍 config.json 的常用配置,以方便用户根据需要进行配置。 +### Pod 绝对抢占特性 -### 配置项说明 +用户在开启了 rubik 绝对抢占特性后,仅需在部署业务 Pod 时在 yaml 中通过 annotation 指定其优先级。部署后 rubik 会自动感知当前节点 Pod 的创建与更新,并根据用户配置的优先级设置 Pod 优先级。对于已经启动的或者更改注解的Pod, rubik 会自动更正Pod的优先级配置。 ```yaml -# 该部分配置内容位于rubik-daemonset.yaml文件中的config.json段落 -{ - "autoConfig": true, - "autoCheck": false, - "logDriver": "stdio", - "logDir": "/var/log/rubik", - "logSize": 1024, - "logLevel": "info", - "cgroupRoot": "/sys/fs/cgroup" -} +... + "agent": { + "enabledFeatures": [ + "preemption" + ] + }, + "preemption": { + "resource": [ + "cpu", + "memory" + ] + } +... ``` -| 配置项 | 配置值类型 | 配置取值范围 | 配置含义 | -| ---------- | ---------- | ------------------ | ------------------------------------------------------------ | -| autoConfig | bool | true、false | true:开启Pod自动感知功能。
false:关闭 Pod 自动感知功能。 | -| autoCheck | bool | true、false | true:开启 Pod 优先级校验功能。
false:关闭 Pod 优先级校验功能。 | -| logDriver | string | stdio、file | stdio:直接向标准输出打印日志,日志收集和转储由调度平台完成。
file:将文件打印到日志目录,路径由logDir指定。 | -| logDir | string | 绝对路径 | 指定日志存放的目录路径。 | -| logSize | int | [10,1048576] | 指定日志存储总大小,单位 MB,若日志总量达到上限则最早的日志会被丢弃。 | -| logLevel | string | error、info、debug | 指定日志级别。 | -| cgroupRoot | string | 绝对路径 | 指定 cgroup 挂载点。 | - -### Pod优先级自动配置 - -若在rubik config中配置autoConfig为true开启了Pod自动感知配置功能,用户仅需在部署业务pod时在yaml中通过annotation指定其优先级,部署后rubik会自动感知当前节点pod的创建与更新,并根据用户配置的优先级设置pod优先级。 - -### 依赖于kubelet的Pod优先级配置 - -由于Pod优先级自动配置依赖于来自api-server pod创建事件的通知,具有一定的延迟性,无法在进程启动之前及时完成Pod优先级的配置,导致业务性能可能存在抖动。用户可以关闭优先级自动配置选项,通过修改kubelet源码,在容器cgroup创建后、容器进程启动前调用rubik http接口配置pod优先级,http接口具体使用方法详见[http接口文档](./http接口文档.md) - -### 支持自动校对Pod优先级 - -rubik支持在启动时对当前节点Pod QoS优先级配置进行一致性校对,此处的一致性是指k8s集群中的配置和rubik对pod优先级的配置之间的一致性。该校对功能默认关闭,用户可以通过 autoCheck 选项控制是否开启。若开启该校对功能,启动或者重启 rubik 时,rubik会自动校验并更正当前节点pod优先级配置。 +> ![](./figures/icon-note.gif) **说明**: +> +> 优先级配置仅支持Pod由在线切换为离线,不允许由离线切换为在线。 -## 在离线业务配置示例 +## 在/离线业务配置示例 -rubik部署成功后,用户在部署实际业务时,可以根据以下配置示例对业务yaml文件进行修改,指定业务的在离线类型,rubik即可在业务部署后对其优先级进行配置,从而达到资源隔离的目的。 +rubik 部署成功后,用户在部署实际业务时,可以根据以下配置示例对业务 yaml 文件进行修改,指定业务的在离线类型,rubik 即可在业务部署后对其优先级进行配置,从而达到资源隔离的目的。 -以下为部署一个nginx在线业务的示例: +以下为部署一个 nginx 在线业务的示例: ```yaml apiVersion: v1 @@ -156,7 +141,7 @@ metadata: name: nginx namespace: qosexample annotations: - volcano.sh/preemptable: "false" # volcano.sh/preemptable为true代表业务为离线业务,false代表业务为在线业务,默认为false + volcano.sh/preemptable: "false" # volcano.sh/preemptable 为 true 代表业务为离线业务,false 代表业务为在线业务,默认为 false spec: containers: - name: nginx @@ -169,31 +154,3 @@ spec: memory: "200Mi" cpu: "1" ``` - -## 约束限制 - -- rubik接受HTTP请求并发量上限1000QPS,并发量超过上限则报错。 - -- rubik接受的单个请求中pod上限为100个,pod数量越界则报错。 - -- 每个k8s节点只能部署一个rubik,多个rubik会冲突。 - -- rubik不提供端口访问,只能通过socket通信。 - -- rubik只接收合法http请求路径及网络协议:http://localhost/(POST)、http://localhost/ping(GET)、http://localhost/version(GET)。各http请求的功能详见[http接口文档](./http接口文档.md)。 - -- rubik磁盘使用需求:配额1GB及以上。 - -- rubik内存使用需求:配额100MB及以上。 - -- 禁止将业务从低优先级(离线业务)往高优先级(在线业务)切换。如业务A先被设置为离线业务,接着请求设置为在线业务,rubik报错。 - -- 
容器挂载目录时,rubik本地套接字/run/rubik的目录权限需由业务侧保证最小权限700。 - -- rubik服务端可用时,单个请求超时时间为120s。如果rubik进程进入T(暂停状态或跟踪状态)、D状态(不可中断的睡眠状态),则服务端不可用,此时rubik服务不会响应任何请求。为了避免此情况的发生,请在客户端设置超时时间,避免无限等待。 - -- 使用混部后,原始的cgroup cpu share功能存在限制。具体表现为: - - 若当前CPU中同时有在线任务和离线任务运行,则离线任务的CPU share配置无法生效。 - - 若当前CPU中只有在线任务或只有离线任务,CPU share能生效。 \ No newline at end of file diff --git "a/docs/zh/docs/rubik/\346\267\267\351\203\250\351\232\224\347\246\273\347\244\272\344\276\213.md" "b/docs/zh/docs/rubik/\346\267\267\351\203\250\351\232\224\347\246\273\347\244\272\344\276\213.md" index 109179e5a80cf48f39365f3f5036f185a816340a..46ea8634236abe117cdf476e2727f3f2b077d6e4 100644 --- "a/docs/zh/docs/rubik/\346\267\267\351\203\250\351\232\224\347\246\273\347\244\272\344\276\213.md" +++ "b/docs/zh/docs/rubik/\346\267\267\351\203\250\351\232\224\347\246\273\347\244\272\344\276\213.md" @@ -1,233 +1,230 @@ -## 混部隔离示例 - -### 环境准备 - -查看内核是否支持混部隔离功能 - -```bash -# 查看/boot/config-系统配置是否开启混部隔离功能 -# 若CONFIG_QOS_SCHED=y则说明使能了混部隔离功能,例如: -cat /boot/config-5.10.0-60.18.0.50.oe2203.x86_64 | grep CONFIG_QOS -CONFIG_QOS_SCHED=y -``` - -安装docker容器引擎 - -```bash -yum install -y docker-engine -docker version -# 如下为docker version显示结果 -Client: - Version: 18.09.0 - EulerVersion: 18.09.0.300 - API version: 1.39 - Go version: go1.17.3 - Git commit: aa1eee8 - Built: Wed Mar 30 05:07:38 2022 - OS/Arch: linux/amd64 - Experimental: false - -Server: - Engine: - Version: 18.09.0 - EulerVersion: 18.09.0.300 - API version: 1.39 (minimum version 1.12) - Go version: go1.17.3 - Git commit: aa1eee8 - Built: Tue Mar 22 00:00:00 2022 - OS/Arch: linux/amd64 - Experimental: false -``` - -### 混部业务 - -**在线业务(clickhouse)** - -使用clickhouse-benchmark测试工具进行性能测试,统计出QPS/P50/P90/P99等相关性能指标,用法参考:https://clickhouse.com/docs/zh/operations/utilities/clickhouse-benchmark/ - -**离线业务(stress)** - -stress是一个CPU密集型测试工具,可以通过指定--cpu参数启动多个并发CPU密集型任务给系统环境加压 - -### 使用说明 - -1)启动一个clickhouse容器(在线业务)。 - -2)进入容器内执行clickhouse-benchmark命令,设置并发线程数为10个、查询10000次、查询总时间30s。 - 
-3)同时启动一个stress容器(离线业务),并发执行10个CPU密集型任务对环境进行加压。 - -4)clickhouse-benchmark执行完后输出一个性能测试报告。 - -混部隔离测试脚本(**test_demo.sh**)如下: - -```bash -#!/bin/bash - -with_offline=${1:-no_offline} -enable_isolation=${2:-no_isolation} -stress_num=${3:-10} -concurrency=10 -timeout=30 -output=/tmp/result.json -online_container= -offline_container= - -exec_sql="echo \"SELECT * FROM system.numbers LIMIT 10000000 OFFSET 10000000\" | clickhouse-benchmark -i 10000 -c $concurrency -t $timeout" - -function prepare() -{ - echo "Launch clickhouse container." - online_container=$(docker run -itd \ - -v /tmp:/tmp:rw \ - --ulimit nofile=262144:262144 \ - -p 34424:34424 \ - yandex/clickhouse-server) - - sleep 3 - echo "Clickhouse container lauched." -} - -function clickhouse() -{ - echo "Start clickhouse benchmark test." - docker exec $online_container bash -c "$exec_sql --json $output" - echo "Clickhouse benchmark test done." -} - -function stress() -{ - echo "Launch stress container." - offline_container=$(docker run -itd joedval/stress --cpu $stress_num) - echo "Stress container launched." - - if [ $enable_isolation == "enable_isolation" ]; then - echo "Set stress container qos level to -1." - echo -1 > /sys/fs/cgroup/cpu/docker/$offline_container/cpu.qos_level - fi -} - -function benchmark() -{ - if [ $with_offline == "with_offline" ]; then - stress - sleep 3 - fi - clickhouse - echo "Remove test containers." - docker rm -f $online_container - docker rm -f $offline_container - echo "Finish benchmark test for clickhouse(online) and stress(offline) colocation." 
- echo "===============================clickhouse benchmark==================================================" - cat $output - echo "===============================clickhouse benchmark==================================================" -} - -prepare -benchmark -``` - -### 测试结果 - -单独执行clickhouse在线业务 - -```bash -sh test_demo.sh no_offline no_isolation -``` - -得到在线业务的QoS(QPS/P50/P90/P99等指标)**基线数据**如下: - -```json -{ -"localhost:9000": { -"statistics": { -"QPS": 1.8853412284364512, -...... -}, -"query_time_percentiles": { -...... -"50": 0.484905256, -"60": 0.519641313, -"70": 0.570876148, -"80": 0.632544937, -"90": 0.728295525, -"95": 0.808700418, -"99": 0.873945121, -...... -} -} -} -``` - -启用stress离线业务,未开启混部隔离功能下,执行test_demo.sh测试脚本 - -```bash -# with_offline参数表示启用stress离线业务 -# no_isolation参数表示未开启混部隔离功能 -sh test_demo.sh with_offline no_isolation -``` - -**未开启混部隔离的情况下**,clickhouse业务QoS数据(QPS/P80/P90/P99等指标)如下: - -```json -{ -"localhost:9000": { -"statistics": { -"QPS": 0.9424028693636205, -...... -}, -"query_time_percentiles": { -...... -"50": 0.840476774, -"60": 1.304607373, -"70": 1.393591017, -"80": 1.41277543, -"90": 1.430316688, -"95": 1.457534764, -"99": 1.555646855, -...... -} -} -``` - -启用stress离线业务,开启混部隔离功能下,执行test_demo.sh测试脚本 - -```bash -# with_offline参数表示启用stress离线业务 -# enable_isolation参数表示开启混部隔离功能 -sh test_demo.sh with_offline enable_isolation -``` - -**开启混部隔离功能的情况下**,clickhouse业务QoS数据(QPS/P80/P90/P99等指标)如下: - -```json -{ -"localhost:9000": { -"statistics": { -"QPS": 1.8825798759270718, -...... -}, -"query_time_percentiles": { -...... -"50": 0.485725185, -"60": 0.512629901, -"70": 0.55656488, -"80": 0.636395956, -"90": 0.734695906, -"95": 0.804118275, -"99": 0.887807409, -...... 
-} -} -} -``` - -从上面的测试结果整理出一个表格如下: - -| 业务部署方式 | QPS | P50 | P90 | P99 | -| -------------------------------------- | ------------- | ------------- | ------------- | ------------- | -| 单独运行clickhouse在线业务(基线) | 1.885 | 0.485 | 0.728 | 0.874 | -| clickhouse+stress(未开启混部隔离功能) | 0.942(-50%) | 0.840(-42%) | 1.430(-49%) | 1.556(-44%) | -| clickhouse+stress(开启混部隔离功能) | 1.883(-0.11%) | 0.486(-0.21%) | 0.735(-0.96%) | 0.888(-1.58%) | - -在未开启混部隔离功能的情况下,在线业务clickhouse的QPS从1.9下降到0.9,同时业务的响应时延(P90)也从0.7s增大到1.4s,在线业务QoS下降了50%左右;而在开启混部隔离功能的情况下,不管是在线业务的QPS还是响应时延(P50/P90/P99)相比于基线值下降不到2%,在线业务QoS基本没有变化。 +# 混部隔离示例 + +## 环境准备 + +查看内核是否支持混部隔离功能。 + +```bash +# 查看/boot/config-系统配置是否开启混部隔离功能 +# 若 CONFIG_QOS_SCHED=y 则说明使能了混部隔离功能,例如: +cat /boot/config-5.10.0-60.18.0.50.oe2203.x86_64 | grep CONFIG_QOS +CONFIG_QOS_SCHED=y +``` + +安装 docker 容器引擎。 + +```bash +yum install -y docker-engine +docker version +# 如下为 docker version 显示结果 +Client: + Version: 18.09.0 + EulerVersion: 18.09.0.325 + API version: 1.39 + Go version: go1.17.3 + Git commit: ce4ae23 + Built: Mon Jun 26 12:56:54 2023 + OS/Arch: linux/arm64 + Experimental: false + +Server: + Engine: + Version: 18.09.0 + EulerVersion: 18.09.0.325 + API version: 1.39 (minimum version 1.12) + Go version: go1.17.3 + Git commit: ce4ae23 + Built: Mon Jun 26 12:56:10 2023 + OS/Arch: linux/arm64 + Experimental: false +``` + +## 混部业务 + +### 在线业务 (clickhouse) + +使用 clickhouse-benchmark 测试工具进行性能测试,统计出 QPS/P50/P90/P99 等相关性能指标,用法参考: + +### 离线业务 (stress) + +stress 是一个 CPU 密集型测试工具,可以通过指定--cpu 参数启动多个并发 CPU 密集型任务给系统环境加压。 + +## 使用说明 + +1. 启动一个 clickhouse 容器(在线业务)。 + +2. 进入容器内执行 clickhouse-benchmark 命令,设置并发线程数为 10 个、查询 10000 次、查询总时间 30s。 + +3. 同时启动一个 stress 容器(离线业务),并发执行 10 个 CPU 密集型任务对环境进行加压。 + +4. 
clickhouse-benchmark 执行完后输出一个性能测试报告。 + +混部隔离测试脚本 (**test_demo.sh**) 如下: + +```bash +#!/bin/bash + +with_offline=${1:-no_offline} +enable_isolation=${2:-no_isolation} +stress_num=${3:-10} +concurrency=10 +timeout=30 +output=/tmp/result.json +online_container= +offline_container= + +exec_sql="echo \"SELECT * FROM system.numbers LIMIT 10000000 OFFSET 10000000\" | clickhouse-benchmark -i 10000 -c $concurrency -t $timeout" + +function prepare() { + echo "Launch clickhouse container." + online_container=$(docker run -itd \ + -v /tmp:/tmp:rw \ + --ulimit nofile=262144:262144 \ + -p 34424:34424 \ + yandex/clickhouse-server) + + sleep 3 + echo "Clickhouse container launched." +} + +function clickhouse() { + echo "Start clickhouse benchmark test." + docker exec $online_container bash -c "$exec_sql --json $output" + echo "Clickhouse benchmark test done." +} + +function stress() { + echo "Launch stress container." + offline_container=$(docker run -itd joedval/stress --cpu $stress_num) + echo "Stress container launched." + + if [ $enable_isolation == "enable_isolation" ]; then + echo "Set stress container qos level to -1." + echo -1 > /sys/fs/cgroup/cpu/docker/$offline_container/cpu.qos_level + fi +} + +function benchmark() { + if [ $with_offline == "with_offline" ]; then + stress + sleep 3 + fi + clickhouse + echo "Remove test containers." + docker rm -f $online_container + docker rm -f $offline_container + echo "Finish benchmark test for clickhouse(online) and stress(offline) colocation." + echo "===============================clickhouse benchmark==================================================" + cat $output + echo "===============================clickhouse benchmark==================================================" +} + +prepare +benchmark +``` + +## 测试结果 + +单独执行 clickhouse 在线业务。 + +```bash +sh test_demo.sh no_offline no_isolation +``` + +得到在线业务的 QoS(QPS/P50/P90/P99 等指标)**基线数据**如下: + +```json +{ + "localhost:9000": { + "statistics": { + "QPS": 1.8853412284364512, + ......
+ } + }, + "query_time_percentiles": { + ...... + "50": 0.484905256, + "60": 0.519641313, + "70": 0.570876148, + "80": 0.632544937, + "90": 0.728295525, + "95": 0.808700418, + "99": 0.873945121, + ...... + } +} +``` + +启用 stress 离线业务,未开启混部隔离功能下,执行 test_demo.sh 测试脚本。 + +```bash +# with_offline 参数表示启用 stress 离线业务 +# no_isolation 参数表示未开启混部隔离功能 +sh test_demo.sh with_offline no_isolation +``` + +**未开启混部隔离的情况下**,clickhouse 业务 QoS 数据 (QPS/P80/P90/P99 等指标)如下: + +```json +{ + "localhost:9000": { + "statistics": { + "QPS": 0.9424028693636205, + ...... + } + }, + "query_time_percentiles": { + ...... + "50": 0.840476774, + "60": 1.304607373, + "70": 1.393591017, + "80": 1.41277543, + "90": 1.430316688, + "95": 1.457534764, + "99": 1.555646855, + ...... + } +} +``` + +启用 stress 离线业务,开启混部隔离功能下,执行 test_demo.sh 测试脚本。 + +```bash +# with_offline 参数表示启用 stress 离线业务 +# enable_isolation 参数表示开启混部隔离功能 +sh test_demo.sh with_offline enable_isolation +``` + +**开启混部隔离功能的情况下**,clickhouse 业务 QoS 数据 (QPS/P80/P90/P99 等指标)如下: + +```json +{ + "localhost:9000": { + "statistics": { + "QPS": 1.8825798759270718, + ...... + } + }, + "query_time_percentiles": { + ...... + "50": 0.485725185, + "60": 0.512629901, + "70": 0.55656488, + "80": 0.636395956, + "90": 0.734695906, + "95": 0.804118275, + "99": 0.887807409, + ...... 
+ } +} +``` + +从上面的测试结果整理出一个表格如下: + +| 业务部署方式 | QPS | P50 | P90 | P99 | +| -------------------------------------- | ------------- | ------------- | ------------- | ------------- | +| 单独运行 clickhouse 在线业务(基线) | 1.885 | 0.485 | 0.728 | 0.874 | +| clickhouse+stress(未开启混部隔离功能) | 0.942(-50%) | 0.840(-42%) | 1.430(-49%) | 1.556(-44%) | +| clickhouse+stress(开启混部隔离功能) | 1.883(-0.11%) | 0.486(-0.21%) | 0.735(-0.96%) | 0.888(-1.58%) | + +在未开启混部隔离功能的情况下,在线业务 clickhouse 的 QPS 从 1.9 下降到 0.9,同时业务的响应时延 (P90) 也从 0.7s 增大到 1.4s,在线业务 QoS 下降了 50% 左右;而在开启混部隔离功能的情况下,不管是在线业务的 QPS 还是响应时延 (P50/P90/P99) 相比于基线值下降不到 2%,在线业务 QoS 基本没有变化。 diff --git "a/docs/zh/docs/rubik/\351\205\215\347\275\256\346\226\207\346\241\243.md" "b/docs/zh/docs/rubik/\351\205\215\347\275\256\346\226\207\346\241\243.md" new file mode 100644 index 0000000000000000000000000000000000000000..001eb16809e987c4acde180f94285fd99d166766 --- /dev/null +++ "b/docs/zh/docs/rubik/\351\205\215\347\275\256\346\226\207\346\241\243.md" @@ -0,0 +1,220 @@ +# Rubik配置说明 + +rubik执行程序由Go语言实现,并编译为静态可执行文件,以便尽可能与系统依赖解耦。 + +## 命令 + +Rubik仅支持 使用`-v` 参数查询版本信息,不支持其他参数。 +版本信息输出示例如下所示,该信息中的内容和格式可能随着版本发生变化。 + +```bash +$ ./rubik -v +Version: 2.0.1 +Release: 2.oe2403sp1 +Go Version: go1.22.1 +Git Commit: bcaace8 +Built: 2024-12-10 +OS/Arch: linux/amd64 +``` + +## 配置 + +执行rubik二进制时,rubik首先会解析配置文件,配置文件的路径固定为`/var/lib/rubik/config.json`。 + +> ![](./figures/icon-note.gif) **说明**: +> +> 1. 为避免配置混乱,暂不支持指定其他路径。 +> 2. 
rubik支持以daemonset形式运行在kubernetes集群中。我们提供了yaml脚本(`hack/rubik-daemonset.yaml`),并定义了`ConfigMap`作为配置。因此,以daemonset形式运行rubik时,应修改`hack/rubik-daemonset.yaml`中的相应配置。 + +配置文件采用json格式,字段键采用驼峰命名规则,且首字母小写。 +配置文件示例内容如下: + +```json +{ + "agent": { + "logDriver": "stdio", + "logDir": "/var/log/rubik", + "logSize": 2048, + "logLevel": "info", + "cgroupRoot": "/sys/fs/cgroup", + "enabledFeatures": [ + "preemption", + "dynCache", + "ioLimit", + "ioCost", + "quotaBurst", + "quotaTurbo", + "psi", + "cpuevict", + "memoryevict" + ] + }, + "preemption": { + "resource": [ + "cpu", + "memory" + ] + }, + "quotaTurbo": { + "highWaterMark": 50, + "syncInterval": 100 + }, + "dynCache": { + "defaultLimitMode": "static", + "adjustInterval": 1000, + "perfDuration": 1000, + "l3Percent": { + "low": 20, + "mid": 30, + "high": 50 + }, + "memBandPercent": { + "low": 10, + "mid": 30, + "high": 50 + } + }, + "ioCost": [ + { + "nodeName": "k8s-single", + "config": [ + { + "dev": "sdb", + "enable": true, + "model": "linear", + "param": { + "rbps": 10000000, + "rseqiops": 10000000, + "rrandiops": 10000000, + "wbps": 10000000, + "wseqiops": 10000000, + "wrandiops": 10000000 + } + } + ] + } + ], + "psi": { + "interval": 10, + "resource": [ + "cpu", + "memory", + "io" + ], + "avg10Threshold": 5.0 + }, + "cpuevict": { + "threshold": 60, + "interval": 1, + "windows": 2, + "cooldown": 20 + }, + "memoryevict": { + "threshold": 60, + "interval": 1, + "cooldown": 4 + } +} +``` + +Rubik配置分为两类:通用配置和特性配置。通用配置由agent关键字标识,用于保存全局的配置。特性配置按服务类型区分,应用于各个子特性。特性配置必须在通用配置的`enabledFeatures`字段中声明方可使用。 + +### agent + +`agent`配置用于记录保存rubik运行的通用配置,例如日志、cgroup挂载点等信息。 +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| ------------------------- | ---------- | -------------------------------------- | --------------------------- | +| logDriver=stdio | string | 日志驱动,支持标准输出和文件 | stdio, file | +| logDir=/var/log/rubik | string | 日志保存目录 | 可读可写的目录 | +| logSize=1024 | int | 日志限额,单位MB,仅logDriver=file生效 | [10, $2^{20}$] | +| logLevel=info | string | 输出日志级别 | 
debug,info,warn,error | +| cgroupRoot=/sys/fs/cgroup | string | 系统cgroup挂载点路径 | 系统cgroup挂载点路径 | +| enabledFeatures=[] | string数组 | 需要使能的rubik特性列表 | rubik支持特性,参见特性介绍 | + +### preemption + +`preemption`字段用于标识绝对抢占特性配置。目前,Preemption特性支持CPU和内存的绝对抢占,用户可以按需配置该字段,单独或组合使用资源的绝对抢占。 + +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| --------------- | ---------- | -------------------------------- | ----------- | +| resource=[] | string数组 | 资源类型,声明何种资源需要被访问 | cpu, memory | + +### dynCache + +`dynCache`字段用于标识支持Pod访存带宽和LLC限制特性配置。`l3Percent`字段用于标识最后一级缓存(LLC)水位控制线,`memBandPercent`字段用于标识访存带宽(MB)水位控制线。 + +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| ----------------------- | ------ | ------------------ | --------------- | +| defaultLimitMode=static | string | dynCache的控制模式 | static, dynamic | +| adjustInterval=1000 | int | dynCache动态控制间隔时间,单位ms| [10, 10000] | +| perfDuration=1000 | int | dynCache性能perf执行时长,单位ms | [10, 10000] | +| l3Percent | map | dynCache控制中L3各级别对应水位(%)| / | +| .low=20 | int | L3 Cache低水位组控制线 | [10, 100] | +| .mid=30 | int | L3 Cache中水位组控制线 | [low, 100] | +| .high=50 | int | L3 Cache高水位组控制线 | [mid, 100]| +| memBandPercent | map | dynCache控制中MB各级别对应水位(%)|/| +| .low=10 | int | MB(访存带宽)低水位组控制线 | [10, 100]| +| .mid=30 | int | MB中水位组控制线 | [low, 100] | +| .high=50 | int | MB高水位组控制线 | [mid, 100] | + +### quotaTurbo + +`quotaTurbo`字段用于标识支持弹性限流技术(用户态)配置。 +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| ----------------- | ------ | -------------------------------- | -------------------- | +| highWaterMark=60 | int | CPU负载的高水位值 |\[0,警戒水位) | +| alarmWaterMark=80 | int | CPU负载的警戒水位 | (高水位,100\] | +| syncInterval=100 | int | 触发容器quota值更新的间隔(单位:毫秒) | [100,10000] | + +### ioCost + +`ioCost`字段用于标识支持iocost对IO权重控制特性配置。其类型为数组,数组中的每一个元素由节点名称`nodeName`和设备参数数组`config`组成。 +| 配置键 | 类型 | 描述 | 可选值 | +| ----------------- | ------ | -------------------------------- | -------------------- | +| nodeName | string | 节点名称 | kubernetes中节点名称 | +| config | 数组 | 单个设备的配置信息 | / | + +单个块设备配置`config`参数: +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| 
--------------- | ------ | --------------------------------------------- | -------------- | +| dev | string | 块设备名称,仅支持物理设备 | / | +| model | string | iocost模型名 | linear | +| param | / | 设备参数,根据不同模型有不同参数 | / | + +模型为linear时,`param`字段支持如下参数: +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| --------------- | ---- | ---- | ------ | +|rbps | int64 | 块设备最大读带宽 | (0, $2^{63}$) | +| rseqiops | int64 | 块设备最大顺序读iop | (0, $2^{63}$) | +| rrandiops | int64 | 块设备最大随机读iops | (0, $2^{63}$) | +| wbps | int64 | 块设备最大写带宽 | (0, $2^{63}$) | +| wseqiops | int64 | 块设备最大顺序写iops | (0, $2^{63}$) | +| wrandiops | int64 | 块设备最大随机写iops | (0, $2^{63}$) | + +### psi + +`psi`字段用于标识基于psi指标的干扰检测特性配置。目前,psi特性支持监测CPU、内存和I/O资源,用户可以按需配置该字段,单独或组合监测资源的PSI取值。 +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| --------------- | ---------- | -------------------------------- | ----------- | +| interval=10 |int|psi指标监测间隔(单位:秒)| [10,30]| +| resource=[] | string数组 | 资源类型,声明何种资源需要被访问 | cpu, memory, io | +| avg10Threshold=5.0 | float | psi some类型资源平均10s内的压制百分比阈值(单位:%),超过该阈值则驱逐离线业务 | [5.0,100]| + +### CPU驱逐水位线控制 + +`cpuevict`字段用于标识CPU驱逐水位线控制特性配置。该特性依照指定采样间隔采集节点CPU利用率,并统计指定窗口内的CPU平均利用率。若CPU平均利用率大于驱逐水位线,则驱逐离线Pod。一旦rubik驱逐离线Pod,则在冷却时间内不再驱逐Pod。 +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| --------------- | ---------- | -------------------------------- | ----------- | +| threshold=60 | int | 窗口期内平均CPU利用率的阈值(%),超过该阈值,则驱逐离线Pod | [1,99]| +| interval=1 | int | 节点CPU利用率采集间隔(s) | [1, 3600] | +| windows=2 | int | 节点平均CPU利用率的窗口时间(s)。窗口必须大于interval。若未设置windows,则windows设置为interval的两倍 | [1, 3600]| +| cooldown=20 | int | 冷却时间(s),两次驱逐之间至少需要间隔冷却时间 | [1, 9223372036854775806]| + +### 内存驱逐水位线控制 + +`memoryevict`字段用于标识内存驱逐水位线控制特性配置。该特性依照指定采样间隔采集节点内存利用率。若节点内存利用率大于驱逐水位线,则驱逐离线Pod。一旦rubik驱逐离线Pod,则在冷却时间内不再驱逐Pod。 +| 配置键[=默认值] | 类型 | 描述 | 可选值 | +| --------------- | ---------- | -------------------------------- | ----------- | +| threshold | int | 内存利用率的阈值(%),超过该阈值,则驱逐离线Pod。若不指定该值,则无法使用本功能。 | [1,99]| +| interval=1 | int | 节点CPU利用率采集间隔(s) | [1, 3600] | +| cooldown=4 | int | 
冷却时间(s),两次驱逐之间至少需要间隔冷却时间 | [1, 9223372036854775806]| \ No newline at end of file diff --git "a/docs/zh/docs/rubik/\351\231\204\345\275\225.md" "b/docs/zh/docs/rubik/\351\231\204\345\275\225.md" new file mode 100644 index 0000000000000000000000000000000000000000..098554e053449d01472e94a7bbe937594f99e6a2 --- /dev/null +++ "b/docs/zh/docs/rubik/\351\231\204\345\275\225.md" @@ -0,0 +1,252 @@ +# 附录 + +## DaemonSet 配置模板 + +```yaml +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: rubik +rules: + - apiGroups: [""] + resources: ["pods"] + verbs: ["list", "watch"] + - apiGroups: [""] + resources: ["pods/eviction"] + verbs: ["create"] +--- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: rubik +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: rubik +subjects: + - kind: ServiceAccount + name: rubik + namespace: kube-system +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: rubik + namespace: kube-system +--- +apiVersion: v1 +kind: ConfigMap +metadata: + name: rubik-config + namespace: kube-system +data: + config.json: | + { + "agent": { + "logDriver": "stdio", + "logDir": "/var/log/rubik", + "logSize": 1024, + "logLevel": "info", + "cgroupRoot": "/sys/fs/cgroup", + "enabledFeatures": [ + "preemption" + ] + }, + "preemption": { + "resource": [ + "cpu" + ] + } + } +--- +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: rubik-agent + namespace: kube-system + labels: + k8s-app: rubik-agent +spec: + selector: + matchLabels: + name: rubik-agent + template: + metadata: + namespace: kube-system + labels: + name: rubik-agent + spec: + serviceAccountName: rubik + hostPID: true + containers: + - name: rubik-agent + image: hub.oepkgs.net/cloudnative/rubik:latest + imagePullPolicy: IfNotPresent + env: + - name: RUBIK_NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + securityContext: + capabilities: + add: + - SYS_ADMIN + resources: + limits: + memory: 200Mi + 
requests: + cpu: 100m + memory: 200Mi + volumeMounts: + - name: rubiklog + mountPath: /var/log/rubik + readOnly: false + - name: runrubik + mountPath: /run/rubik + readOnly: false + - name: sysfs + mountPath: /sys/fs + readOnly: false + - name: devfs + mountPath: /dev + readOnly: false + - name: config-volume + mountPath: /var/lib/rubik + terminationGracePeriodSeconds: 30 + volumes: + - name: rubiklog + hostPath: + path: /var/log/rubik + - name: runrubik + hostPath: + path: /run/rubik + - name: sysfs + hostPath: + path: /sys/fs + - name: devfs + hostPath: + path: /dev + - name: config-volume + configMap: + name: rubik-config + items: + - key: config.json + path: config.json +``` + +## Dockerfile 模板 + +```dockefile +FROM scratch +COPY ./build/rubik /rubik +ENTRYPOINT ["/rubik"] +``` + +## 镜像构建脚本 + +```bash +#!/bin/bash +set -e + +CURRENT_DIR=$(cd "$(dirname "$0")" && pwd) +BINARY_NAME="rubik" + +RUBIK_FILE="${CURRENT_DIR}/build/rubik" +DOCKERFILE="${CURRENT_DIR}/Dockerfile" +YAML_FILE="${CURRENT_DIR}/rubik-daemonset.yaml" + +# Get version and release number of rubik binary +VERSION=$(${RUBIK_FILE} -v | grep ^Version | awk '{print $NF}') +RELEASE=$(${RUBIK_FILE} -v | grep ^Release | awk '{print $NF}') +IMG_TAG="${VERSION}-${RELEASE}" + +# Get rubik image name and tag +IMG_NAME_AND_TAG="${BINARY_NAME}:${IMG_TAG}" + +# Build container image for rubik +docker build -f "${DOCKERFILE}" -t "${IMG_NAME_AND_TAG}" "${CURRENT_DIR}" + +echo -e "\n" +# Check image existence +docker images | grep -E "REPOSITORY|${BINARY_NAME}" + +# Modify rubik-daemonset.yaml file, set rubik image name +sed -i "/image:/s/:.*/: ${IMG_NAME_AND_TAG}/" "${YAML_FILE}" +``` + +## 通信矩阵 + +- rubik 服务进程作为客户端通过 List/Watch 机制与 kubernetes API Server 进行通信,从而获取 Pod 等信息 + +|源IP|源端口|目的IP|目标端口|协议|端口说明|侦听端口是否可更改|认证方式| +|----|----|----|----|----|----|----|----| +|rubik所在节点机器|32768-61000|api-server所在服务器|443|tcp|kubernetes对外提供的访问资源的端口|不可更改|token| + +## 文件与权限 + +- rubik 所有的操作均需要使用 root 权限。 + +- 涉及文件及权限如下表所示: + 
+|文件路径|文件/文件夹权限|说明| +|----|----|----| +|/var/lib/rubik|750|rpm 安装完成后生成目录,存放 rubik 相关文件| +|/var/lib/rubik/build|550|存放 rubik 二进制文件的目录| +|/var/lib/rubik/build/rubik|550|rubik 二进制文件| +|/var/lib/rubik/rubik-daemonset.yaml|640|rubik daemon set 配置模板,供 k8s 部署使用| +|/var/lib/rubik/Dockerfile|640|Dockerfile 模板| +|/var/lib/rubik/build_rubik_image.sh|550|rubik 容器镜像构建脚本| +|/var/log/rubik|700|rubik 日志存放目录(需开启 logDriver=file 后使能)| +|/var/log/rubik/rubik.log*|600|rubik 日志文件| + +## 约束限制 + +### 规格 + +- 磁盘:1GB+ + +- 内存:100MB+ + +## 运行时 + +- 每个 k8s 节点只能部署一个 rubik,多个 rubik 会冲突 + +- rubik 不接收任何命令行参数,若添加参数启动会报错退出 + +- 如果 rubik 进程进入 T、D 状态,则服务端不可用,此时服务不会响应,需恢复异常状态之后才可继续使用 + +### Pod 优先级设置 + +- 禁止低优先级往高优先级切换。如业务 A 先被设置为低优先级(-1),接着设置为高优先级(0),rubik 报错 + +- 用户添加注解、修改注解、修改 Pod yaml 中的注解并重新 apply 等操作不会触发 Pod 重建。rubik 会通过 List/Watch 机制感知 Pod 注解变化情况 + +- 禁止将任务从在线组迁移到离线组后再迁移回在线组,此操作会导致该任务 QoS 异常 + +- 禁止将重要的系统服务和内核线程加入到离线组中,否则可能导致调度不及时,进而导致系统异常 + +- CPU 和 memory 的在线、离线配置需要统一,否则可能导致两个子系统的 QoS 冲突 + +- 使用混部后,原始的 CPU share 功能存在限制。具体表现为: + - 若当前 CPU 中同时存放在线任务和离线任务,则离线任务的 CPU share 无法生效 + - 若当前 CPU 中只有在线任务或只有离线任务,CPU share 能生效 + - 建议离线业务 Pod 优先级配置为 best effort + +- 用户态的优先级反转、smt、cache、numa 负载均衡、离线任务的负载均衡,当前不支持 + +### 其他 + +禁止用户直接手动修改 Pod 对应 cgroup 或 resctrl 参数,否则可能出现数据不一致情况。 + +- CPU cgroup 目录, 如:`/sys/fs/cgroup/cpu/kubepods/burstable//` + - cpu.qos_level + - cpu.cfs_burst_us + +- memory cgroup 目录,如:`/sys/fs/cgroup/memory/kubepods/burstable//` + - memory.qos_level + - memory.soft_limit_in_bytes + - memory.force_empty + - memory.limit_in_bytes + - memory.high + +- RDT 控制组目录,如:`/sys/fs/resctrl` diff --git a/docs/zh/docs/secDetector/public_sys-resources/icon-note.gif b/docs/zh/docs/secDetector/public_sys-resources/icon-note.gif new file mode 100644 index 0000000000000000000000000000000000000000..6314297e45c1de184204098efd4814d6dc8b1cda Binary files /dev/null and b/docs/zh/docs/secDetector/public_sys-resources/icon-note.gif differ diff --git a/docs/zh/docs/secDetector/secDetector.md 
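
前文的 DaemonSet 模板通过 ConfigMap 下发 config.json,且配置说明中要求该文件必须是合法 JSON、`enabledFeatures` 中只能声明 rubik 支持的特性。部署前可以先做一次快速校验,下面是一个示意脚本(其中内嵌的配置内容仅为演示,实际使用时应校验 `/var/lib/rubik/config.json` 或 ConfigMap 中的 config.json):

```shell
#!/bin/sh
# 示意:校验 rubik 配置是否为合法 JSON,且 enabledFeatures 均为 rubik 支持的特性。
# 此处用临时文件演示;实际可将 CONF 替换为 /var/lib/rubik/config.json。
CONF=$(mktemp)
cat > "$CONF" <<'JSON'
{
  "agent": { "enabledFeatures": [ "preemption" ] },
  "preemption": { "resource": [ "cpu", "memory" ] }
}
JSON

# rubik 启动时首先解析配置文件,语法错误会导致解析失败
python3 -m json.tool "$CONF" > /dev/null && echo "json: ok"

# 特性必须在 enabledFeatures 中声明,且属于配置说明列出的支持列表
python3 - "$CONF" <<'EOF'
import json, sys

cfg = json.load(open(sys.argv[1]))
known = {"preemption", "dynCache", "ioLimit", "ioCost", "quotaBurst",
         "quotaTurbo", "psi", "cpuevict", "memoryevict"}
enabled = set(cfg["agent"].get("enabledFeatures", []))
print("features: ok" if enabled <= known else
      "features: unknown %s" % sorted(enabled - known))
EOF
rm -f "$CONF"
```

将该检查放入 CI 或部署脚本,可以在 rubik-agent 拉起之前就发现配置拼写错误。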
b/docs/zh/docs/secDetector/secDetector.md new file mode 100644 index 0000000000000000000000000000000000000000..e20d3fc09effae2c13b7125bad2a0ba932716600 --- /dev/null +++ b/docs/zh/docs/secDetector/secDetector.md @@ -0,0 +1,5 @@ +# secDetector 使用指南 + +本文档介绍 openEuler 操作系统内构入侵检测系统 secDetector 的架构、特性、安装、开发指导、落地应用场景等,帮助用户快速了解并使用 secDetector。 + +本文档适用于使用 openEuler 系统并希望了解和使用 secDetector 的社区开发者、开源爱好者以及相关合作伙伴。使用人员需要具备基本的Linux操作系统知识。 diff --git "a/docs/zh/docs/secDetector/\344\275\277\347\224\250secDetector.md" "b/docs/zh/docs/secDetector/\344\275\277\347\224\250secDetector.md" new file mode 100644 index 0000000000000000000000000000000000000000..f54d519cf494f8115262b3787b92d1acc125781a --- /dev/null +++ "b/docs/zh/docs/secDetector/\344\275\277\347\224\250secDetector.md" @@ -0,0 +1,46 @@ +# 使用 secDetector + +secDetector 提供了 SDK(一个 so 动态库),用户可以在自己的应用程序中集成该动态链接库,从而通过简单的接口使用 secDetector。本章介绍其使用方法。 + +## 基本用法 + +用户按照指南《[安装secDetector](./安装secDetector.md)》安装完secDetector之后,libsecDetectorsdk.so、secDetector_sdk.h、secDetector_topic.h就已经被部署到系统用户库默认路径中。 + +1. 使用 C 或 C++ 开发的应用程序在确保 include 路径已包含相应目录后,可以首先在程序中引用这两个头文件。 + + ```c + #include <secDetector/secDetector_sdk.h> + #include <secDetector/secDetector_topic.h> + ``` + +2. 参考指南《[接口参考](./接口参考.md)》调用SDK提供的接口访问secDetector。 + + 1. 首先调用订阅接口secSub,订阅所需的主题。 + 2. 然后在独立线程中调用消息读取接口secReadFrom阻塞式地读取被订阅主题产生的信息。 + 3. 最后当不需要使用secDetector时,调用退订接口secUnsub。退订时请严格使用订阅时的返回值。 + +## 代码示例 + +可以参考secDetector代码仓上的示例代码,由 Python 语言编写。 + +1. 可以在如下链接中查看示例代码。 + + [examples/python · openEuler/secDetector (gitee.com)](https://gitee.com/openeuler/secDetector/tree/master/examples/python) + +2. 也可以下载后参考。 + +```shell
git clone https://gitee.com/openeuler/secDetector.git +``` + +## 规格与约束 + +1. 部分功能(如内存修改探针-安全开关)依赖硬件体系结构,因此在不同指令集架构上的表现并不相同。 +2. 从内核到用户态传输数据缓存空间为探针共享,缓冲区满会丢弃新采集的事件信息。缓存空间可配置范围为4~1024 MB,必须为2的幂。 +3. 服务进程secDetectord支持root用户运行,不支持多实例,非第一个运行的程序会退出。 +4. 用户订阅连接数限制为5个。 +5. 用户订阅后,读取消息时需要为消息读取接口提供一块缓冲区,超过缓冲区长度的消息将被截断。建议缓冲区长度不低于4096。 +6. 对于文件名、节点名之类的描述字符串都有一定的长度限制,过长可能会被截断。 +7. 应用程序单进程内不支持并行多连接 secDetectord 接收消息。只能一次订阅,单连接接收消息。去订阅后才能重新订阅。 +8. 
secDetectord 进程应当等待所有应用程序的连接中断即完全退订所有主题后,才可以关闭退出。 +9. 部分功能(如内存修改探针-安全开关)基于当前CPU状态。因此检测的基本功能是可以检测到当前CPU上的状态变化,其他CPU上的状态变化如果未能及时同步到当前CPU,则不会被检测到。 \ No newline at end of file diff --git "a/docs/zh/docs/secDetector/\345\256\211\350\243\205secDetector.md" "b/docs/zh/docs/secDetector/\345\256\211\350\243\205secDetector.md" new file mode 100644 index 0000000000000000000000000000000000000000..30d7923a20ba7c7f9d36f8886227bec64f04e93f --- /dev/null +++ "b/docs/zh/docs/secDetector/\345\256\211\350\243\205secDetector.md" @@ -0,0 +1,104 @@ +# 安装 secDetector + +## 软硬件要求 + +### 硬件要求 + +* 当前仅支持 x86_64、aarch64 架构处理器。 +* secDetector磁盘使用需求:配额1GB及以上。 +* secDetector内存使用需求:配额100MB及以上。 + +### 环境准备 + +安装 openEuler 系统,安装方法参考《[安装指南](../Installation/installation.md)》。 + +## 安装secDetector + +1. 配置openEuler yum源:openEuler 发布版本上已默认配置完成yum源,无需额外操作。特殊情况下请参考openEuler官方文档配置在线yum源或通过ISO挂载配置本地yum源。 + +2. 安装secDetector。 + + ```shell + #安装secDetector + sudo yum install secDetector + ``` + +> ![](./public_sys-resources/icon-note.gif)说明: +> +> 安装secDetector后在指定目录下可获得部署secDetector所需的相关文件: + +```shell +#secDetector的kerneldriver的核心框架 +/lib/modules/%{kernel_version}/extra/secDetector/secDetector_core.ko + +#secDetector的kerneldriver的功能组件 +/lib/modules/%{kernel_version}/extra/secDetector/secDetector_xxx.ko + +#secDetector的守护者进程文件 +/usr/bin/secDetectord + +#secDetector的SDK库文件 +/usr/lib64/secDetector/libsecDetectorsdk.so +/usr/include/secDetector/secDetector_sdk.h +/usr/include/secDetector/secDetector_topic.h +``` + +## 部署 secDetector + +secDetector的主体secDetectord是以系统服务的形式部署在系统中的,前台业务系统可以通过集成SDK来与之通信。由于secDetector的部分能力必须构建在内核之中,因此secDetectord的功能全集还依赖于其后台驱动的部署。 + +### 部署 kernel driver + +1. 
插入 kernel driver 的基础框架:secDetector_core.ko 是 kernel driver 的基础框架,要优先于其他内核模块进行部署。找到安装后的 secDetector_core.ko 目录,将其插入内核。参考命令如下: + + ```shell + sudo insmod secDetector_core.ko + ``` + + secDetector_core 支持一个命令行参数ringbuf_size。用户可以通过指定该参数的值来控制 kernel driver 与 用户态secDetectord之间数据通道的缓存空间尺寸。该参数可以被指定为4~1024中的一个整数,单位是MB。默认值是4,必须为2的幂。参考命令如下: + + ``` + sudo insmod secDetector_core.ko ringbuf_size=128 + ``` + + +2. 插入 kernel driver 的功能模块:secDetector的 kernel driver 采用模块化部署方式。用户可以选择基于框架部署满足需要的功能模块,也可以选择部署全部模块。参考命令如下: + + ```shell + sudo insmod secDetector_kmodule_baseline.ko + + sudo insmod secDetector_memory_corruption.ko + + sudo insmod secDetector_program_action.ko + + sudo insmod secDetector_xxx.ko + ``` + + - secDetector_kmodule_baseline.ko 提供了内核模块列表检测的能力,属于内存修改类探针; + - secDetector_memory_corruption.ko 提供了内存修改检测的能力,属于内存修改类探针; + - secDetector_program_action.ko 提供了程序行为检测的能力,属于程序行为类探针。 + +### 部署 usr driver 和 observer_agent + +当前用户态驱动 usr driver 和服务 observer_agent 已经都被集成到secDetectord中,参考命令如下: + +```shell +sudo ./secDetectord & +``` + +usr driver当前包含了文件操作类探针和进程管理类探针的能力。 + +secDetectord支持如下一些配置选项: + +``` +用法:secDetectord [选项] +secDetectord 默认会在后台运行,从探针中取得数据并转发给订阅者。 +选项: + -d 进入调试模式,进入前台运行,并且在控制台打印探针数据。 + -s 配置eBPF缓冲区大小,单位为Mb,默认为4; size可选范围为4~1024,且必须为2的幂次方。当前拥有2个独立的缓冲区。 + -t 支持配置订阅的事件,默认为所有事件。topic 是位图格式。例如 -t 0x60 同时订阅进程创建和进程退出事件。详细请查阅 include/secDetector_topic.h。 +``` + +### 部署SDK + +SDK的库文件默认已经被部署到系统库目录中,用户需要在自己的程序中引用SDK的头文件即可使用。 \ No newline at end of file diff --git "a/docs/zh/docs/secDetector/\346\216\245\345\217\243\345\217\202\350\200\203.md" "b/docs/zh/docs/secDetector/\346\216\245\345\217\243\345\217\202\350\200\203.md" new file mode 100644 index 0000000000000000000000000000000000000000..1235f7bf151a0db3b6cbb62f8495185df34c9a58 --- /dev/null +++ "b/docs/zh/docs/secDetector/\346\216\245\345\217\243\345\217\202\350\200\203.md" @@ -0,0 +1,83 @@ +# 接口说明 + +secDetector操作系统内构入侵检测系统对外提供SDK,这里给出用户开发应用程序所需的接口。SDK的接口设计非常简单,一共只有三个接口,两个头文件。 + +头文件: + +- 
secDetector/secDetector_sdk.h:包含接口定义 + +- secDetector/secDetector_topic.h:包含一些调用接口所需使用的预定义宏,如可选择性订阅的功能主题编号 + + +## secSub + +订阅 topic 接口 + +**功能**: + +订阅接口,应用程序通过输入不同 topic id,可以选择订阅不同的功能主题,比如文件打开类异常探针。secDetector 提供的诸功能主题对应的 topic id 的定义在 secDetector_topic.h 中可以查看。本订阅接口支持一次订阅多个主题,多个探针的 topic id 可以以位图的形式进行组合。 + +>![](./public_sys-resources/icon-note.gif) **说明:** +>由于一次订阅产生一个reader即信息读取器,所以应用程序应当在一次订阅接口调用中订阅所需的所有主题。这样就可以使用一个reader进行信息的采集。如果需要调整订阅的内容,可以退订之后再重新订阅。 + +**函数声明:** + +```c +void *secSub(const int topic); +``` + +**参数:** + +- topic:入参,需要订阅的主题集合 + +**返回值:** + +- NULL:订阅失败 +- NOT NULL:读取订阅主题相关信息的GRPC reader读取器 + +## secUnsub + +退订 topic 接口 + +**功能**: + +退订接口,应用程序通过输入订阅成功后获得的reader,完成主题的退订。退订后应用程序便不会收到相应主题的信息。系统中某主题如果没有任何应用程序订阅,则不会被执行。 + +**函数声明:** + +```c +void secUnsub(void *reader); +``` + +**参数:** + +- reader:入参,需要退订的信息读取器 + +**返回值:** + +- 无 + + +## secReadFrom + +已订阅主题的消息读取接口 + +**功能**: + +使用订阅接口对某些主题的订阅成功后,执行退订操作之前,可使用本接口接受 secDetector 发送的已订阅主题的消息。本接口是阻塞式的。应用程序建议使用独立的线程循环调用。当已订阅主题有消息时候,本函数才会被恢复执行。 + +**函数声明:** + +```c +void secReadFrom(void *reader, char *data, int data_len); +``` + +**参数:** + +- reader:入参,主题订阅成功后得到的消息读取器 +- data:出参,消息缓冲区,由应用程序提供的一段内存 +- data_len:入参,消息缓冲区的尺寸 + +**返回值:** + +- 无 diff --git "a/docs/zh/docs/secDetector/\350\256\244\350\257\206secDetector.md" "b/docs/zh/docs/secDetector/\350\256\244\350\257\206secDetector.md" new file mode 100644 index 0000000000000000000000000000000000000000..d9af9b9ecd03612f3f2024727cea62940a8fefa3 --- /dev/null +++ "b/docs/zh/docs/secDetector/\350\256\244\350\257\206secDetector.md" @@ -0,0 +1,98 @@ +# 认识secDetector + +## 简介 + +secDetector 是专为OS设计的内构入侵检测系统,旨在为关键信息基础设施提供入侵检测及响应能力,为第三方安全工具减少开发成本同时增强检测和响应能力。secDetector 基于ATT&CK攻击模式库建模提供更为丰富的安全事件原语,并且可以提供实时阻断和灵活调整的响应能力。 + +secDetector 作为一套灵活的OS内构入侵检测系统,有三种使用模式: + +1. 直接被系统用户开启用作一些基础异常事件的告警和处置。 +2. 被安全态势感知服务集成,补齐系统信息采集缺陷,用于APT等复杂的安全威胁分析和重点事件布控实时阻断。 +3. 
由安全从业人员或安全感知服务提供商二次开发,基于可拓展框架构建精确、高效、及时的入侵检测与响应能力。 + +## 软件架构 + +``` +||==APP===================================================================|| +|| || +|| ---------------------------- || +|| | SDK | || +|| ---------------------------- || +|| /^\ || +||==================================|=====================================|| + | + | + | +||==OBSERVER========================|=====================================|| +|| | || +|| ---------------------------- || +|| | service | || +|| ---------------------------- || +|| /^\ || +||==================================|=====================================|| + | +||==DRIVER================================================================|| +|| || +|| ---------------------------- || +|| | 8 types of cases | || +|| ---------------------------- || +|| || +||------------------------------------------------------------------------|| +|| core || +|| ------------- ---------------- ---------------- ----------------- || +|| | hook unit | | collect unit | | analyze unit | | response unit | || +|| ------------- ---------------- ---------------- ----------------- || +|| || +||========================================================================|| +``` + +secDetector在架构上分为四个部分:SDK、service、检测特性集合cases、检测框架core。 + +- SDK + + SDK是以一个用户态动态链接库lib的形态承载,被部署到需要使用secDetector入侵检测系统的安全感知业务中。SDK用于和secDetector入侵检测系统的service通讯,完成所需的工作(例如订阅,去订阅,读取现有消息等功能)。secDetector提供的异常信息被定义成不同的case,安全感知业务可以根据自身需求订阅。 + +- service + + service是以一个用户态服务应用的形态承载,向上管理、维护安全感知业务的case订阅信息,向下维护case的运行情况。框架core和检测特性集合case采集到的信息由service统一收集,按需转发给不同的安全感知业务。安全感知业务对于底层检测特性集合case和框架core的配置、管理的需求也由service进行统一的管理和转发。不同的安全感知业务可能会需求同样的case,service会统计出所有安全感知业务需求case的并集,向下层注册。 + +- 特性集合cases + + 
检测特性集合cases是一系列异常检测探针,根据异常信息的不同会有不同的形态,比如内核异常信息检测的每个探针会以内核模块ko的形态承载。一个case代表一个探针,一个探针往往是一类异常信息或者一类异常事件的信息。比如进程类探针会关注所有进程的创建、退出、属性修改等事件信息,内存修改类探针会收集内核模块列表和安全开关等信息。因此一个探针可能会包含对多个事件的监控,而这些对不同事件的监控逻辑可能无法部署在同一个执行流当中。我们使用工作流(workflow)的概念表示一个探针在同一个执行流中的工作,一个探针可以包含一个或者多个工作流。比如对于进程探针而言,进程创建检测和进程属性修改检测就是不同的工作流。 + +- 框架core + + 检测框架core是每一个case依赖的基础框架,提供case的管理和workflow所需的通用的基础功能单元。内核异常信息检测框架会以内核模块ko的形态承载。一个检测特性case可以将自己注册到框架中,或者从框架中去注册。框架还可以提供特定的交互接口以满足外部的动态请求。一个workflow被定义为有四类功能单元组成:事件发生器、信息采集器、事件分析器、响应单元。 + +特性集合cases和框架core合起来被称为driver。driver驱动提供了secDetector功能的最底层的系统级实现。 + +driver分为两类,kerneldriver 和 usrdriver。顾名思义,kerneldriver是部署在内核态中的,以内核模块的形式承载。usrdriver是部署在用户态中的,直接被部署为service中的一个模块。从逻辑上usrdriver是在service之下的,但是在运行中,为了降低通信成本,usrdriver被直接集成在service程序中。 + +## 能力特性 + +### 检测能力 + +| 特性 | 状态 | 发布版本 | +| ------------------------------ | ------ | ------------------------------------------------------------ | +| 检测框架 | 已实现 | 统一灵活可拓展高效的检测框架,支持不同类型的触发、收集、分析、响应单元 | +| 进程管理类探针 | 已实现 | 监控进程创建、退出、元数据修改等事件 | +| 文件操作类探针 | 已实现 | 监控文件创建、删除、读写、属性修改等事件 | +| 程序行为类探针(API调用) | 已实现 | 监控匿名管道创建、命令执行、ptrace系统调用等关键程序行为 | +| 内存修改类探针(内核关键数据) | 已实现 | 监控内核模块列表,硬件安全功能开关等内核关键数据 | + +### 响应能力 + +| 特性 | 状态 | 说明 | +| -------- | ------ | -------------------------------------------------- | +| 响应框架 | 已实现 | 统一的灵活可拓展的响应框架,支持不同类型的响应单元 | +| 告警上报 | 已实现 | 提供异常信息上报能力的响应单元 | + +### 服务能力 + +| 特性 | 状态 | 说明 | +| -------- | ------ | ------------------------------------------------------------ | +| 通信框架 | 已实现 | 应用程序使用gRPC和service进行通信。功能被封装在SDK的动态库中。 | +| 订阅管理 | 已实现 | 应用程序可以一次订阅,长期使用secDetector获取信息。secDetector会对订阅的应用程序进行管理,分发对应的被订阅主题的信息。 | +| 配置下发 | 已实现 | 服务可以通过参数对于特定的检测、阻断特性进行配置,从而实现过滤、调整等功能。目前未对应用程序开放。 | +| 即时检测 | 已实现 | secDetector提供的信息是实时的,准确的,一手的。 | + diff --git "a/docs/zh/docs/secGear/\345\256\211\350\243\205secGear.md" "b/docs/zh/docs/secGear/\345\256\211\350\243\205secGear.md" index 3b9e91810552745e64a71216b13b3c2f69f7a9ef..67950e202d940cb63def628d34eef436a461208c 100644 --- 
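
《安装 secDetector》中提到,secDetectord 的 `-t` 参数采用位图格式的 topic(例如 `-t 0x60` 同时订阅进程创建和进程退出事件),SDK 的 secSub 接口同样按位图组合多个主题。下面用一段自包含的 shell 片段演示位图的组合与判断(其中两个主题编号为假设值,实际编号请查阅 secDetector_topic.h):

```shell
#!/bin/sh
# 假设的主题编号(仅作演示,实际取值以 secDetector_topic.h 为准)
TOPIC_PROCESS_CREATE=$((0x20))
TOPIC_PROCESS_EXIT=$((0x40))

# 按位或组合出订阅位图:0x20 | 0x40 = 0x60,对应 secDetectord -t 0x60
topics=$(( TOPIC_PROCESS_CREATE | TOPIC_PROCESS_EXIT ))
printf 'subscribe bitmap: 0x%x\n' "$topics"

# 按位与判断某主题是否已包含在订阅集合中
if [ $(( topics & TOPIC_PROCESS_EXIT )) -ne 0 ]; then
    echo "process_exit subscribed"
fi
```

由于一次订阅只产生一个 reader,应用程序应当像这样把所有需要的主题先组合进同一个位图,再一次性传给 secSub。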
"a/docs/zh/docs/secGear/\345\256\211\350\243\205secGear.md" +++ "b/docs/zh/docs/secGear/\345\256\211\350\243\205secGear.md" @@ -20,30 +20,22 @@ > - 普通服务器无法仅通过升级BMC、BIOS、TEE OS固件实现TrustZone特性使能。 > - 带TrustZone特性的服务器出厂默认特性关闭,请参考BIOS设置使能服务器TrustZone特性。 -#### 操作系统 - -openEuler 20.03 LTS SP2及以上 - -openEuler 22.09 - -openEuler 22.03 LTS及以上 - -### 环境准备 +### 环境准备 - 参考鲲鹏官网[环境要求](https://www.hikunpeng.com/document/detail/zh/kunpengcctrustzone/fg-tz/kunpengtrustzone_04_0006.html)和[搭建步骤](https://www.hikunpeng.com/document/detail/zh/kunpengcctrustzone/fg-tz/kunpengtrustzone_04_0007.html)。 + 参考鲲鹏官网[环境要求](https://www.hikunpeng.com/document/detail/zh/kunpengcctrustzone/fg-tz/kunpengtrustzone_20_0018.html)和[搭建步骤](https://www.hikunpeng.com/document/detail/zh/kunpengcctrustzone/fg-tz/kunpengtrustzone_20_0019.html)。 -### 安装操作 +### 安装操作 -1. 配置openEuler yum源,在线yum源或通过ISO挂载配置本地yum源,配置在线源如下(仅以22.03-LTS举例,其他版本需要使用版本对应的yum源)。 +1. 配置openEuler yum源,在线yum源或通过ISO挂载配置本地yum源,配置在线源如下。 ```shell - vi openEuler.repo + vi /etc/yum.repos.d/openEuler.repo [osrepo] name=osrepo - baseurl=http://repo.openeuler.org/openEuler-22.03-LTS/everything/aarch64/ + baseurl=http://repo.openeuler.org/openEuler-{version}/everything/aarch64/ enabled=1 gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.03-LTS/everything/aarch64/RPM-GPG-KEY-openEuler + gpgkey=http://repo.openeuler.org/openEuler-{version}/everything/aarch64/RPM-GPG-KEY-openEuler ``` 2. 安装secGear @@ -72,30 +64,22 @@ 支持Intel SGX(Intel Software Guard Extensions) 特性的处理器。 -#### 操作系统 - -openEuler 20.03 LTS SP2及以上 - -openEuler 22.09 - -openEuler 22.03 LTS及以上 - ### 环境准备 购买支持Intel SGX特性设备,参考对应设备BIOS配置手册,开启SGX特性。 ### 安装操作 -1. 
配置openEuler yum源,在线yum源或通过ISO挂载配置本地yum源,配置在线源如下。 ```shell vi openEuler.repo [osrepo] name=osrepo - baseurl=http://repo.openeuler.org/openEuler-22.03-LTS/everything/x86_64/ + baseurl=http://repo.openeuler.org/openEuler{version}/everything/x86_64/ enabled=1 gpgcheck=1 - gpgkey=http://repo.openeuler.org/openEuler-22.03-LTS/everything/x86_64/RPM-GPG-KEY-openEuler + gpgkey=http://repo.openeuler.org/openEuler-{version}/everything/x86_64/RPM-GPG-KEY-openEuler ``` 2. 安装secGear diff --git "a/docs/zh/docs/secGear/\350\256\244\350\257\206secGear.md" "b/docs/zh/docs/secGear/\350\256\244\350\257\206secGear.md" index 165b3712ba0ebcecdec8b5a7116a3007a4dc50ce..c7bbd241a1c0184e8933891555f2c4719bd47f3a 100644 --- "a/docs/zh/docs/secGear/\350\256\244\350\257\206secGear.md" +++ "b/docs/zh/docs/secGear/\350\256\244\350\257\206secGear.md" @@ -42,6 +42,7 @@ secGear机密计算统一开发框架技术架构如图所示,主要包括三 uint32_t parameter_num; uint32_t workers_policy; uint32_t rollback_to_common; + cpu_set_t num_cores; } cc_sl_config_t; ``` @@ -55,6 +56,7 @@ secGear机密计算统一开发框架技术架构如图所示,主要包括三 | parameter_num | switchless函数支持的最大参数个数,该字段仅在ARM平台生效。
规格: ARM:最大值:16;最小值:0 | | workers_policy | switchless代理线程运行模式,该字段仅在ARM平台生效。
规格: ARM: WORKERS_POLICY_BUSY:代理线程一直占用CPU资源,无论是否有任务需要处理,适用于对性能要求极高且系统软硬件资源丰富的场景; WORKERS_POLICY_WAKEUP:代理线程仅在有任务时被唤醒,处理完任务后进入休眠,等待再次被新任务唤醒 | | rollback_to_common | 异步switchless调用失败时是否回退到普通调用,该字段仅在ARM平台生效。
规格: ARM:0:否,失败时仅返回相应错误码;其他:是,失败时回退到普通调用,此时返回普通调用的返回值 |
+ | num_cores | 用于设置安全侧线程绑核 规格: 最大值为当前环境CPU核数 |
+2. 定义EDL文件中接口时添加零切换标识transition_using_threads
@@ -104,6 +106,51 @@ secGear机密计算统一开发框架技术架构如图所示,主要包括三
 ##### 注意事项
 安全通道仅封装密钥协商过程、加解密接口,不建立网络连接,协商过程复用业务的网络连接。其中客户端和服务端的网络连接由业务建立和维护,在安全通道客户端和服务端初始化时传入消息发送钩子函数和网络连接指针。
 详见[安全通道样例](https://gitee.com/openeuler/secGear/tree/master/examples/secure_channel)。
+
+### 远程证明
+
+#### 客户痛点
+随着机密计算技术的发展,逐渐形成几大主流技术(如Arm Trustzone/CCA、Intel SGX/TDX、擎天Enclave、海光CSV等),产品解决方案中可能存在多种机密计算硬件,甚至不同TEE之间的协同,其中远程证明是任何一种机密计算技术信任链的重要一环,每种技术的远程证明报告格式及验证流程各有差异,用户对接不同的TEE,需要集成不同TEE证明报告的验证流程,增加了用户的集成负担,并且不利于扩展新的TEE类型。
+
+#### 解决方案
+secGear远程证明统一框架是机密计算远程证明相关的关键组件,屏蔽不同TEE远程证明差异,提供Attestation Agent和Attestation Service两个组件,Agent供用户集成获取证明报告,对接证明服务;Service可独立部署,支持iTrustee、virtCCA远程证明报告的验证。
+
+#### 功能描述
+远程证明统一框架聚焦机密计算相关功能,部署服务时需要的服务运维等相关能力由服务部署第三方提供。远程证明统一框架的关键技术如下:
+- 报告校验插件框架:支持运行时兼容iTrustee、virtCCA、CCA等不同TEE平台证明报告检验,支持扩展新的TEE报告检验插件。
+- 证书基线管理:支持对不同TEE类型的TCB/TA基线值管理及公钥证书管理,集中部署到服务端,对用户透明。
+- 策略管理:提供默认策略(易用)、用户定制策略(灵活)。
+- 身份令牌:支持对不同TEE签发身份令牌,由第三方信任背书,实现不同TEE类型相互认证。
+- 证明代理:支持对接证明服务/点对点互证,兼容TEE报告获取,身份令牌验证等,易集成,使用户聚焦业务。
+
+根据使用场景,支持点对点验证和证明服务验证两种模式。
+
+证明服务验证流程如下:
+
+1.用户(普通节点或TEE)对TEE平台发起挑战。
+
+2.TEE平台通过证明代理获取TEE证明报告,并返回给用户。
+
+3.用户端证明代理将报告转发到远程证明服务。
+
+4.远程证明服务完成报告校验,返回由第三方信任背书的统一格式身份令牌。
+
+5.证明代理验证身份令牌,并解析得到证明报告校验结果。
+
+6.得到通过的校验结果后,建立安全连接。
+
+点对点验证流程(无证明服务)如下:
+
+1.用户向TEE平台发起挑战,TEE平台返回证明报告给用户。
+
+2.用户使用本地点对点TEE校验插件完成报告验证。
+
+> ![](./public_sys-resources/icon-note.gif) **说明:**
+>
+> 点对点验证和远程证明服务验证时的证明代理不同,在编译时可通过编译选项,决定编译有证明服务模式还是点对点模式的证明代理。
+#### 应用场景
+在金融、AI等场景下,基于机密计算保护运行中的隐私数据安全时,远程证明是校验机密计算环境及应用合法性的技术手段,远程证明统一框架提供了易集成、易部署的组件,帮助用户快速使能机密计算远程证明能力。
+
 ## 缩略语
 
 | 缩略语 | 英文全名 | 中文解释 |
diff --git "a/docs/zh/docs/sysBoost/\344\275\277\347\224\250\346\226\271\346\263\225.md" "b/docs/zh/docs/sysBoost/\344\275\277\347\224\250\346\226\271\346\263\225.md"
index 266798cc7ae61473ee10ed4c92a91006e9121222..35f4da2e16621ec885dfbf55d51cf7067d832bbd 100644
--- "a/docs/zh/docs/sysBoost/\344\275\277\347\224\250\346\226\271\346\263\225.md"
+++ "b/docs/zh/docs/sysBoost/\344\275\277\347\224\250\346\226\271\346\263\225.md"
@@ -12,7 +12,7 @@
 ### 配置文件说明
 
 配置文件目录:/etc/sysboost.d/
 
-**表 1** 客户端yaml文件配置说明
+**表 1** 客户端toml文件配置说明
diff --git "a/docs/zh/docs/sysMaster/sysmaster\344\275\277\347\224\250\350\257\264\346\230\216.md" "b/docs/zh/docs/sysMaster/sysmaster\344\275\277\347\224\250\350\257\264\346\230\216.md"
index e49a236af6b7b9e91dda7dd9741962edaa8c4a9b..b39bef3db4da92c84ccb577a41a7130aac0a138b 100644
--- "a/docs/zh/docs/sysMaster/sysmaster\344\275\277\347\224\250\350\257\264\346\230\216.md"
+++ "b/docs/zh/docs/sysMaster/sysmaster\344\275\277\347\224\250\350\257\264\346\230\216.md"
@@ -5,8 +5,6 @@
 * 如何创建 `service`服务单元配置文件。
 * 如何管理单元服务,例如启动、停止、查看服务。
 
-更多可以查阅[官方手册](http://sysmaster.online/man/all/)。
-
 ## 创建单元配置文件
 
 用户可以在 `/usr/lib/sysmaster/system/`目录下创建单元配置文件。
@@ -76,7 +74,7 @@ WantedBy="multi-user.target"
 使用以下命令可以启动 `sshd`服务和运行 `ExecStart`所配置的命令。
 
 ```bash
-# sctl start sshd.service
+sctl start sshd.service
 ```
 
 ### 停止服务
@@ -84,7 +82,7 @@ WantedBy="multi-user.target"
 使用以下命令可以停止 `sshd`服务,杀死 `ExecStart`所运行的进程。
 
 ```bash
-# sctl stop sshd.service
+sctl stop sshd.service
 ```
 
 ### 重启服务
@@ -92,7 +90,7 @@ WantedBy="multi-user.target"
 使用以下命令可以重启 `sshd`服务,该命令会先停止后启动服务。
 
 ```bash
-# sctl restart sshd.service
+sctl restart sshd.service
 ```
 
 ### 查看服务状态
@@ -100,5 +98,5 @@ WantedBy="multi-user.target"
 使用以下命令可以查看服务 `sshd`运行状态,用户可以查看服务的状态来获取服务是否正常运行。
 
 ```bash
-# sctl status sshd.service
+sctl status sshd.service
 ```
diff --git "a/docs/zh/docs/sysmonitor/figures/sysmonitor\345\212\237\350\203\275\345\210\227\350\241\250.png" "b/docs/zh/docs/sysmonitor/figures/sysmonitor\345\212\237\350\203\275\345\210\227\350\241\250.png"
new file mode 100644
index 0000000000000000000000000000000000000000..701e925d66a8771774e1bb38fdf70edd982913bf
Binary files /dev/null and "b/docs/zh/docs/sysmonitor/figures/sysmonitor\345\212\237\350\203\275\345\210\227\350\241\250.png" differ
diff --git "a/docs/zh/docs/sysmonitor/sysmonitor-\344\275\277\347\224\250\346\211\213\345\206\214.md" "b/docs/zh/docs/sysmonitor/sysmonitor-\344\275\277\347\224\250\346\211\213\345\206\214.md"
new file mode 100644
index 0000000000000000000000000000000000000000..9494fb2c1a7b4b2f7fe565bbbc893e6a5634fb5b
--- /dev/null
+++ "b/docs/zh/docs/sysmonitor/sysmonitor-\344\275\277\347\224\250\346\211\213\345\206\214.md"
@@ -0,0 +1,795 @@
+# sysmonitor
+
+## 介绍
+
+System Monitor Daemon
+
+sysmonitor 负责监控 OS 系统运行过程中的异常,将监控到的异常记录到系统日志(`/var/log/sysmonitor.log`)中。sysmonitor 以服务的形式提供,可以通过 `systemctl start|stop|restart|reload sysmonitor` 启动、关闭、重启、重载服务。建议产品部署 sysmonitor 调测软件,便于定位系统异常问题。
+
+![](./figures/sysmonitor功能列表.png)
+
+### 注意事项
+
+- sysmonitor 不支持并发执行。
+- 各配置文件须合法配置,否则可能造成监控框架异常。
+- sysmonitor 服务操作和配置文件修改、日志查询需要 root 权限。root 用户具有系统最高权限,在使用 root 用户进行操作时,请严格按照操作指导进行操作,避免不规范操作造成系统管理及安全风险。
+
+### 配置总览
+
+sysmonitor 有一个主配置文件(`/etc/sysconfig/sysmonitor`),用于配置各监控项的监控周期、是否需要监控。配置项的=和"之间不能有空格,如`PROCESS_MONITOR="on"`。
+
+配置说明
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------------------------- | ------------------------------------------------------------ | -------- | -------------------------------------- |
+| PROCESS_MONITOR | 设定是否开启关键进程监控,on为开启,off为关闭 | 否 | on |
+| PROCESS_MONITOR_PERIOD | 设置关键进程监控的周期,单位秒 | 否 | 3s |
+| PROCESS_RECALL_PERIOD | 关键进程恢复失败后再次尝试拉起的周期,单位分,取值范围为1到1440之间的整数 | 否 | 1min |
+| PROCESS_RESTART_TIMEOUT | 关键进程服务异常恢复过程中的超时时间,单位秒,取值范围为30至300之间的整数 | 否 | 90s |
+| PROCESS_ALARM_SUPRESS_NUM | 设置关键进程监控配置使用告警命令上报告警时的告警抑制次数,取值范围为正整数 | 否 | 5 |
+| FILESYSTEM_MONITOR | 设定是否开启 ext3/ext4 文件系统监控,on 为开启,off 为关闭 | 否 | on |
+| DISK_MONITOR | 设定是否开启磁盘分区监控,on为开启,off 为关闭 | 否 | on |
+| DISK_MONITOR_PERIOD | 设定磁盘监控周期,单位秒 | 否 | 60s |
+| INODE_MONITOR | 设定是否开启磁盘 inode 监控,on 为开启,off 为关闭 | 否 | on |
+| INODE_MONITOR_PERIOD | 设定磁盘 inode 监控周期,单位秒 | 否 | 60s |
+| NETCARD_MONITOR | 设定是否开启网卡监控,on 为开启,off 为关闭 | 否 | on |
+| FILE_MONITOR | 设定是否开启文件监控,on为开启,off 为关闭 | 否 | on |
+| CPU_MONITOR | 设定是否开启 cpu 监控,on 为开启,off 为关闭 | 否 | on |
+| MEM_MONITOR | 设定是否开启内存监控,on 为开启,off 为关闭 | 否 | on |
+| PSCNT_MONITOR | 设定是否开启进程数监控,on为开启,off 为关闭 | 否 | on |
+| FDCNT_MONITOR | 设定是否开启 fd 总数监控,on 为开启,off 为关闭 | 否 | on |
+| CUSTOM_DAEMON_MONITOR | 用户自定义的 daemon 类型的监控项,on为开启,off为关闭 | 否 | on |
+| CUSTOM_PERIODIC_MONITOR | 用户自定义的 periodic 类型的监控项,on为开启,off 为关闭 | 否 | on |
+| IO_DELAY_MONITOR | 本地磁盘 IO 延时监控开关,on 为开启,off 为关闭 | 否 | off |
+| PROCESS_FD_NUM_MONITOR | 设定是否开启单个进程句柄数监控,on为开启,off 为关闭 | 否 | on |
+| PROCESS_MONITOR_DELAY | sysmonitor 启动时,是否等待所有的监控项都正常,on为等待,off为不等待 | 否 | on |
+| NET_RATE_LIMIT_BURST | 网卡监控路由信息打印抑制频率,即一秒内打印多少条日志,有效范围是 0-100 | 否 | 5 |
+| FD_MONITOR_LOG_PATH | 文件句柄监控日志文件 | 否 | /var/log/sysmonitor.log |
+| ZOMBIE_MONITOR | 僵尸进程监控开关 | 否 | off |
+| CHECK_THREAD_MONITOR | 内部线程自愈开关,on为开启,off为关闭;若不配置,默认为开启 | 否 | on |
+| CHECK_THREAD_FAILURE_NUM | 内部线程自愈的周期检查次数,范围为[2,10] | 否 | 3 |
+
+- 修改 `/etc/sysconfig/sysmonitor` 配置文件后,需要重启 sysmonitor 服务生效。
+- 配置文件中,如果某一项没有配置,默认该监控项开启。
+- 内部线程自愈开启后,当监控项子线程卡住且超过配置的周期检查次数时,会重启 sysmonitor 服务进行恢复,并重新加载配置;对于配置的关键进程监控和自定义监控,会重新拉起执行。如果对用户使用有影响,可以选择关闭该功能。
+
+### 命令参考
+
+- 启动监控服务
+
+``` shell
+systemctl start sysmonitor
+```
+
+- 关闭监控服务
+
+``` shell
+systemctl stop sysmonitor
+```
+
+- 重启监控服务
+
+``` shell
+systemctl restart sysmonitor
+```
+
+- 修改监控项的配置文件后,重载监控服务可使修改后的配置动态生效
+
+``` shell
+systemctl reload sysmonitor
+```
+
+### 监控日志
+
+在默认情况下,为了防止 sysmonitor.log 文件过大,提供了切分转储日志的机制。日志将被转储到磁盘目录下,这样就能够保持一定量的日志。
+
+配置文件为 `/etc/rsyslog.d/sysmonitor.conf`。因为增加了 rsyslog 配置文件,第一次安装 sysmonitor 后,需要重启 rsyslog 服务使 sysmonitor 日志配置生效。
+
+```
+$template sysmonitorformat,"%TIMESTAMP:::date-rfc3339%|%syslogseverity-text%|%msg%\n"
+
+$outchannel sysmonitor, /var/log/sysmonitor.log, 2097152, /usr/libexec/sysmonitor/sysmonitor_log_dump.sh
+if ($programname == 'sysmonitor' and $syslogseverity <= 6) then {
+:omfile:$sysmonitor;sysmonitorformat
+stop
+}
+
+if ($msg contains 'Time has been changed') then {
+:omfile:$sysmonitor;sysmonitorformat
+stop
+}
+
+if ($programname == 'sysmonitor' and $syslogseverity > 6) then {
+/dev/null
+stop
+}
+```
+
+## ext3/ext4 文件系统监控
+
+### 简介
+
+当文件系统出现故障时,会导致 IO 操作异常,从而引发操作系统一系列问题。通过文件系统故障检测及时发现故障,以便系统管理员或用户及时处理、修复问题。
+
+### 配置文件说明
+
+无
+
+### 异常日志
+
+对于增加了 errors=remount-ro 挂载选项的文件系统,如果监控到 ext3/ext4 文件系统故障,sysmonitor.log 中打印异常信息示例如下:
+
+```
+info|sysmonitor[127]: loop0 filesystem error. Remount filesystem read-only.
+```
+
+其他异常场景下,如果监控到 ext3/ext4 文件系统故障,sysmonitor.log 中打印异常信息示例如下:
+
+```
+info|sysmonitor[127]: fs_monitor_ext3_4: loop0 filesystem error. flag is 1879113728.
+```
+
+## 关键进程监控
+
+### 简介
+
+定期监控系统中关键进程,当系统内关键进程异常退出时,自动尝试恢复关键进程。如果恢复失败并需要告警,可上报告警。系统管理员能被及时告知进程异常退出事件,以及进程是否被恢复拉起。问题定位人员能从日志中定位进程异常退出的时间。
+
+### 配置文件说明
+
+配置目录为`/etc/sysmonitor/process`,每个进程或模块一个配置文件。
+
+```
+USER=root
+NAME=irqbalance
+RECOVER_COMMAND=systemctl restart irqbalance
+MONITOR_COMMAND=systemctl status irqbalance
+STOP_COMMAND=systemctl stop irqbalance
+```
+
+各配置项如下:
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ---------------------- | ------------------------------------------------------------ | -------- | --------------------------------------------------- |
+| NAME | 进程或模块名 | 是 | 无 |
+| RECOVER_COMMAND | 恢复命令 | 否 | 无 |
+| MONITOR_COMMAND | 监控命令。命令返回值为0视为进程正常,命令返回值大于0视为进程异常 | 否 | pgrep -f $(which xxx),"xxx"为NAME字段中配置的进程名 |
+| STOP_COMMAND | 停止命令 | 否 | 无 |
+| USER | 用户名。使用指定的用户执行监控、恢复、停止命令或脚本 | 否 | 如果配置项为空,则默认使用 root |
+| CHECK_AS_PARAM | 参数传递开关。开关设置为 on 时,在执行 RECOVER_COMMAND 命令时,会将 MONITOR_COMMAND 的返回值作为入参,传给 RECOVER_COMMAND 命令或脚本;开关为 off 或其他时,功能关闭 | 否 | 无 |
+| MONITOR_MODE | 监控模式。配置为 parallel 为并行监控,配置为 serial 为串行监控 | 否 | serial |
+| MONITOR_PERIOD | 并行监控的监控周期;监控模式配置为 serial 时,该配置项不生效 | 否 | 3 |
+| USE_CMD_ALARM | 告警模式。配置为 on 或 ON,则使用告警命令上报告警 | 否 | 无 |
+| ALARM_COMMAND | 上报告警命令 | 否 | 无 |
+| ALARM_RECOVER_COMMAND | 恢复告警命令 | 否 | 无 |
+
+- 修改关键进程监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- 恢复命令和监控命令不能阻塞,否则会造成关键进程监控线程异常。
+- 当恢复命令执行超过 90s 时,会调用停止命令终止进程。
+- 当恢复命令配置为空或不配置时,监控命令检查到关键进程异常时,不会尝试进行拉起。
+- 当关键进程异常,并且尝试拉起三次都不成功时,最终会按照全局配置文件中配置的 PROCESS_RECALL_PERIOD 周期进行拉起。
+- 当监控的进程不是 daemon 进程时,MONITOR_COMMAND 必配。
+- 若配置的关键服务在当前系统上不存在,则该监控不会生效,日志中会有相应提示;其他配置项出现致命性错误时,将使用默认配置,不报错。
+- 配置文件权限为 600,监控项建议为 systemd 中的 service 类型(如 MONITOR_COMMAND=systemctl status irqbalance);若监控的为进程,请确保 NAME 字段为绝对路径。
+- sysmonitor 重启(restart)、重载(reload)、退出(stop)都不会影响所监控的进程或服务。
+- 若 USE_CMD_ALARM 的配置为 on,ALARM_COMMAND、ALARM_RECOVER_COMMAND 的配置由用户保障。ALARM_COMMAND、ALARM_RECOVER_COMMAND 为空或没有配置,则不上报告警。
+- 对于用户自行配置的命令,如监控命令、恢复命令、停止命令、上报告警命令、恢复告警命令等,命令的安全性由用户保证。命令由 root 权限执行,建议脚本命令权限设置为仅供 root 使用,避免普通用户提权风险。
+- 配置监控命令的长度不大于 200,大于 200 时,添加进程监控失败。
+- 当恢复命令配置为 systemd 的重启服务命令时(如`RECOVER_COMMAND=systemctl restart irqbalance`),需注意是否与开源 systemd 恢复服务的机制冲突,否则可能会影响关键进程异常后的行为模式。
+- 由 sysmonitor 恢复拉起的进程将和 sysmonitor 服务在同一个 Cgroup 当中,无法单独进行资源限制,因此建议优先使用开源 systemd 机制进行恢复。
+
+### 异常日志
+
+- 配置 RECOVER_COMMAND
+
+  如果监控到进程或模块异常,/var/log/sysmonitor.log 中打印异常信息示例如下:
+
+  ```
+  info|sysmonitor[127]: irqbalance is abnormal, check cmd return 1, use "systemctl restart irqbalance" to recover
+  ```
+
+  如果监控到进程或模块恢复正常,/var/log/sysmonitor.log 中打印日志示例如下:
+
+  ```
+  info|sysmonitor[127]: irqbalance is recovered
+  ```
+
+- 不配置 RECOVER_COMMAND
+
+  如果监控到进程或模块异常,/var/log/sysmonitor.log 中打印异常信息示例如下:
+
+  ```
+  info|sysmonitor[127]: irqbalance is abnormal, check cmd return 1, recover cmd is null, will not recover
+  ```
+
+  如果监控到进程或模块恢复正常,/var/log/sysmonitor.log 中打印日志示例如下:
+
+  ```
+  info|sysmonitor[127]: irqbalance is recovered
+  ```
+
+## 文件监控
+
+### 简介
+
+系统关键文件被意外删除后,会导致系统运行异常甚至崩溃。通过文件监控可以及时获知系统中关键文件被删除或者有恶意文件被添加,以便管理员和用户及时获知并处理故障。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/file`。每个监控配置项为一行,监控配置项包含两个内容:监控文件(目录)和监控事件。监控文件(目录)是绝对路径,监控文件(目录)和监控事件中间由一个或多个空格隔开。
+
+配置文件支持在 `/etc/sysmonitor/file.d` 目录下增加文件监控项配置,配置方法与 `/etc/sysmonitor/file` 相同。
+
+- 由于日志长度限制,建议配置的文件和目录绝对路径长度小于 223。如果配置的监控对象绝对路径长度超过 223,可能会有日志打印不完整的现象出现。
+
+- 请用户自行确保监控文件路径正确,如果配置文件不存在或路径错误,则无法监控到该文件。
+
+- 由于系统路径长度限制,监控的文件或目录绝对路径长度必须小于 4096。
+
+- 支持监控目录和常规文件,/proc 和 /proc/*、/dev 和 /dev/*、/sys 和 /sys/*、管道文件、socket 文件等均不支持监控。
+
+- /var/log 和 /var/log/* 均只支持删除事件。
+
+- 当配置文件中存在多个相同路径的时候,以第一条合法配置为准,其他相同配置均不生效。在日志文件中可以查看到其他相同配置被忽略的提示。
+
+- 不支持对软链接配置监控;当配置硬链接文件的删除事件时,需删除该文件和它的全部硬链接才会打印文件删除事件。
+
+- 当文件添加监控成功及监控的事件发生时,监控日志打印的是配置文件中路径的绝对路径。
+
+- 目前暂不支持目录递归监控,只能监控配置文件中的目录,子目录不会监控。
+
+- 监控文件(目录)采用了位图的方式配置要监控的事件,对文件或目录进行监控的事件位图如下所示:
+
+```
+ -------------------------------
+ | 11~32 | 10 | 9 | 1~8 |
+ -------------------------------
+```
+
+事件位图每一位代表一个事件,第 n 位如果置 1,则表示监控第 n 位对应的事件;如果第 n 位置 0,则表示不监控第 n 位对应的事件。监控位图对应的 16 进制数,即是写到配置文件中的监控事件项。
+
+| 配置项 | 配置项说明 | 是否必配 |
+| ------ | ------------------ | -------- |
+| 1~8 | 保留 | 否 |
+| 9 | 文件、目录添加事件 | 是 |
+| 10 | 文件、目录删除事件 | 是 |
+| 11~32 | 保留 | 否 |
+
+- 修改文件监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在最多 60 秒后生效。
+- 监控事件需要严格遵守上述规则,如果配置有误,则无法监控;如果配置项中监控事件为空,则默认只监控删除事件,即 0x200。
+- 文件或目录删除后,只有当所有打开该文件的进程都停止后才会上报删除事件。
+- 监控的文件通过 vi、sed 等操作修改后,会在监控日志中打印 File "XXX" may have been changed。
+- 文件监控目前实现了对添加和删除事件的监控,即第 9 位和第 10 位有效,其他位为保留位,暂不生效。如果配置了保留位,监控日志会提示监控事件配置错误。
+
+**示例**
+
+配置对 /home 下子目录的增加和删除事件监控,低 12 位位图为:001100000000,则可以配置如下:
+
+```
+/home 0x300
+```
+
+配置对 /etc/ssh/sshd_config 文件的删除事件监控,低 12 位位图为:001000000000,则可以配置如下:
+
+```
+/etc/ssh/sshd_config 0x200
+```
+
+### 异常日志
+
+如果监控文件有配置的事件发生,/var/log/sysmonitor.log 中打印日志示例如下:
+
+```
+info|sysmonitor[127]: 1 events queued
+info|sysmonitor[127]: 1th events handled
+info|sysmonitor[127]: Subfile "111" under "/home" was added.
+```
+
+## 磁盘分区监控
+
+### 简介
+
+定期监控系统中挂载的磁盘分区空间,当磁盘分区使用率大于或等于用户设置的告警阈值时,记录磁盘空间告警。当磁盘分区使用率小于用户设置的告警恢复阈值时,记录磁盘空间恢复告警。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/disk`。
+
+```
+DISK="/var/log" ALARM="90" RESUME="80"
+DISK="/" ALARM="95" RESUME="85"
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------ | ---------------------- | -------- | ------ |
+| DISK | 磁盘挂载目录名 | 是 | 无 |
+| ALARM | 整数,磁盘空间告警阈值 | 否 | 90 |
+| RESUME | 整数,磁盘空间恢复阈值 | 否 | 80 |
+
+- 修改磁盘空间监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- 重复配置的挂载目录,最后一个配置项生效。
+- ALARM 值应该大于 RESUME 值。
+- 只能针对挂载点或被挂载的磁盘分区做监控。
+- 在 CPU 和 IO 高压场景下,df 命令执行超时,会导致磁盘利用率获取不到。
+- 当多个挂载点对应同一个磁盘分区时,以挂载点为准来上报告警。
+
+### 异常日志
+
+如果监控到磁盘空间告警,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+warning|sysmonitor[127]: report disk alarm, /var/log used:90% alarm:90%
+info|sysmonitor[127]: report disk recovered, /var/log used:4% resume:10%
+```
+
+## 网卡状态监控
+
+### 简介
+
+系统运行过程中可能出现人为原因或异常而导致网卡状态或 IP 发生改变,对网卡状态和 IP 变化进行监控,以便及时感知到异常并方便定位异常原因。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/network`。
+
+```
+#dev event
+eth1 UP
+```
+
+各配置项说明如下表:
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------ | ------------------------------------------------------------ | -------- | ------------------------------------------------- |
+| dev | 网卡名 | 是 | 无 |
+| event | 侦听事件,可取 UP(网卡 UP)、DOWN(网卡 DOWN)、NEWADDR(增加 ip 地址)、DELADDR(删除 ip 地址) | 否 | 若侦听事件为空,则 UP、DOWN、NEWADDR、DELADDR 都监控 |
+
+- 修改网卡监控的配置文件后,执行 `systemctl reload sysmonitor`,新的配置生效。
+- 不支持虚拟网卡 UP 和 DOWN 状态监控。
+- 请确保网卡监控的配置文件每行少于 4096 个字符,若超过 4096 个字符,会在监控日志中打印配置错误的提示信息。
+- 默认监控所有网卡的所有事件信息,即不配置任何网卡时,默认监控所有网卡的 UP、DOWN、NEWADDR、DELADDR 事件。
+- 如果配置网卡,不配置事件,则默认监控该网卡的所有事件。
+- 路由信息打印默认一秒五条,可通过 /etc/sysconfig/sysmonitor 的 NET_RATE_LIMIT_BURST 配置选项配置一秒钟打印路由信息的数量。
+
+### 异常日志
+
+如果监控到配置的网卡事件,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]: lo: ip[::1] prefixlen[128] is added, comm: (ostnamed)[1046], parent comm: systemd[1]
+info|sysmonitor[127]: lo: device is up, comm: (ostnamed)[1046], parent comm: systemd[1]
+```
+
+如果监控到路由事件,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[881]: Fib4 replace table=255 192.168.122.255/32, comm: daemon-init[1724], parent comm: systemd[1]
+info|sysmonitor[881]: Fib4 replace table=254 192.168.122.0/24, comm: daemon-init[1724], parent comm: systemd[1]
+info|sysmonitor[881]: Fib4 replace table=255 192.168.122.0/32, comm: daemon-init[1724], parent comm: systemd[1]
+info|sysmonitor[881]: Fib6 replace fe80::5054:ff:fef6:b73e/128, comm: kworker/1:3[209], parent comm: kthreadd[2]
+```
+
+## cpu 监控
+
+### 简介
+
+监控系统全局或指定域内 cpu 的占用情况,当 cpu 使用率超出用户设置的告警阈值时,执行用户配置的日志收集命令。
+
+### 配置文件说明
+
+配置文件为`/etc/sysmonitor/cpu`。
+
+当监控系统全局 cpu 时,配置文件示例如下:
+
+```
+# cpu usage alarm percent
+ALARM="90"
+
+# cpu usage alarm resume percent
+RESUME="80"
+
+# monitor period (second)
+MONITOR_PERIOD="60"
+
+# stat period (second)
+STAT_PERIOD="300"
+
+# command executed when cpu usage exceeds alarm percent
+REPORT_COMMAND=""
+```
+
+当监控系统指定域 cpu 时,配置文件示例如下:
+
+```
+# monitor period (second)
+MONITOR_PERIOD="60"
+
+# stat period (second)
+STAT_PERIOD="300"
+
+DOMAIN="0,1" ALARM="90" RESUME="80"
+DOMAIN="2,3" ALARM="50" RESUME="40"
+
+# command executed when cpu usage exceeds alarm percent
+REPORT_COMMAND=""
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| -------------- | ------------------------------------------------------------ | -------- | ------ |
+| ALARM | 大于0,cpu 使用率告警阈值 | 否 | 90 |
+| RESUME | 大于等于0,cpu 使用率恢复阈值 | 否 | 80 |
+| MONITOR_PERIOD | 监控周期(秒),取值大于0 | 否 | 60 |
+| STAT_PERIOD | 统计周期(秒),取值大于0 | 否 | 300 |
+| DOMAIN | 域内的 cpu 号,cpu 号均以十进制数字表示。可以通过列举方式指定,cpu 号之间通过逗号分隔,例如:1,2,3;也可以通过范围方式指定,格式 X-Y(X- 每个监控域单独一个配置项,每个项支持最多配置 256 个 cpu,域内以及域之间 cpu 号均不能重复 | 否 | 无 |
+| REPORT_COMMAND | cpu 使用率超过告警阈值后的日志收集命令 | 否 | 无 |
+
+- 修改 cpu 监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- ALARM 值应该大于 RESUME 值。
+- 当配置监控 cpu 域后,不再对系统全局 cpu 平均使用率进行监控,单独配置的 ALARM、RESUME 值不生效。
+- 如果某个监控域的配置存在非法,则整个 cpu 监控不执行。
+- DOMAIN 内配置的 cpu 必须全部处于在线工作状态,否则对该域的监控无法正常进行。
+- REPORT_COMMAND 项的命令不能包含 &、;、> 等不安全字符,且总长度不能超过 159 个字符,否则命令无法生效。
+- REPORT_COMMAND 项的命令安全性、有效性由用户自己保证,sysmonitor 只负责以 root 用户执行该命令。
+- REPORT_COMMAND 项的命令不能阻塞,当该命令执行时间超过 60s 后,sysmonitor 会强行终止执行。
+- 每轮监控即使有多个域 cpu 使用率超过阈值,REPORT_COMMAND 也仅会执行一次。
+
+### 异常日志
+
+如果监控到全局 cpu 使用率告警或恢复且配置了日志收集命令,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]: CPU usage alarm: 91.3%
+info|sysmonitor[127]: cpu monitor: execute REPORT_COMMAND[sysmoniotrcpu] sucessfully
+info|sysmonitor[127]: CPU usage resume 70.1%
+```
+
+如果监控到某个域的 cpu 平均使用率告警或恢复且配置了日志收集命令,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]: CPU 1,2,3 usage alarm: 91.3%
+info|sysmonitor[127]: cpu monitor: execute REPORT_COMMAND[sysmoniotrcpu] sucessfully
+info|sysmonitor[127]: CPU 1,2,3 usage resume 70.1%
+```
+
+## 内存监控
+
+### 简介
+
+监控系统内存占用情况,当内存使用率超出或低于阈值时,记录日志。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/memory`。
+
+```
+# memory usage alarm percent
+ALARM="90"
+
+# memory usage alarm resume percent
+RESUME="80"
+
+# monitor period(second)
+PERIOD="60"
+```
+
+### 配置项说明
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------ | ----------------------------- | -------- | ------ |
+| ALARM | 大于0,内存占用率告警阈值 | 否 | 90 |
+| RESUME | 大于等于0,内存占用率恢复阈值 | 否 | 80 |
+| PERIOD | 监控周期(秒),取值大于 0 | 否 | 60 |
+
+- 修改内存监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- ALARM 值应该大于 RESUME 值。
+- 取三个监控周期的内存占用平均值,作为是否上报发生告警或恢复告警的依据。
+
+### 异常日志
+
+如果监控到内存告警,sysmonitor 获取 `/proc/meminfo` 信息,打印到 `/var/log/sysmonitor.log` 中,信息如下:
+
+```
+info|sysmonitor[127]: memory usage alarm: 90%
+info|sysmonitor[127]:---------------show /proc/meminfo: ---------------
+info|sysmonitor[127]:MemTotal: 3496388 kB
+info|sysmonitor[127]:MemFree: 2738100 kB
+info|sysmonitor[127]:MemAvailable: 2901888 kB
+info|sysmonitor[127]:Buffers: 165064 kB
+info|sysmonitor[127]:Cached: 282360 kB
+info|sysmonitor[127]:SwapCached: 4492 kB
+......
+info|sysmonitor[127]:---------------show_memory_info end. ---------------
+```
+
+sysmonitor 有如下打印信息时,表示 sysmonitor 会调用 "echo m > /proc/sysrq-trigger" 命令导出内存分配的信息(可以在 /var/log/messages 中进行查看)。
+
+```
+info|sysmonitor[127]: sysrq show memory info in message.
+```
+
+告警恢复时,打印信息如下:
+
+```
+info|sysmonitor[127]: memory usage resume: 4.6%
+```
+
+## 进程数/线程数监控
+
+### 简介
+
+监控系统进程数目和线程数目,当进程总数或线程总数超出或低于阈值时,记录日志或上报告警。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/pscnt`。
+
+```
+# number of processes(include threads) when alarm occur
+ALARM="1600"
+
+# number of processes(include threads) when alarm resume
+RESUME="1500"
+
+# monitor period(second)
+PERIOD="60"
+
+# process count usage alarm percent
+ALARM_RATIO="90"
+
+# process count usage resume percent
+RESUME_RATIO="80"
+
+# print top process info with largest num of threads when threads alarm
+# (range: 0-1024, default: 10, monitor for thread off:0)
+SHOW_TOP_PROC_NUM="10"
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ----------------- | ------------------------------------------------------------ | -------- | ------ |
+| ALARM | 大于 0 的整数,进程总数告警阈值 | 否 | 1600 |
+| RESUME | 大于等于0的整数,进程总数恢复阈值 | 否 | 1500 |
+| PERIOD | 监控周期(秒),取值大于0 | 否 | 60 |
+| ALARM_RATIO | 大于0小于等于100的值,可以为小数。进程使用率告警阈值 | 否 | 90 |
+| RESUME_RATIO | 大于等于0小于100的值,可以为小数。进程使用率恢复阈值,必须比告警阈值小 | 否 | 80 |
+| SHOW_TOP_PROC_NUM | 线程告警时打印线程数量最大的 TOP 进程信息 | 否 | 10 |
+
+- 修改进程数监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- ALARM 值应该大于 RESUME 值。
+- 进程数告警产生阈值取 ALARM 值与 `/proc/sys/kernel/pid_max` 的 ALARM_RATIO 中的最大值,告警恢复阈值取 RESUME 值与 `/proc/sys/kernel/pid_max` 的 RESUME_RATIO 中的最大值。
+- 线程数告警产生阈值取 ALARM 值与 `/proc/sys/kernel/threads-max` 的 ALARM_RATIO 中的最大值,告警恢复阈值取 RESUME 值与 `/proc/sys/kernel/threads-max` 的 RESUME_RATIO 中的最大值。
+- SHOW_TOP_PROC_NUM 的取值范围为 0-1024,为 0 时,表示不启用线程监控;当设置值较大时,如 1024,在环境中产生线程告警,且告警阈值较高时,会有性能影响,建议设置为默认值 10 及更小值,若影响较大,建议设置为 0,不启动线程监控。
+- 线程监控是否启动,由 `/etc/sysconfig/sysmonitor` 中 `PSCNT_MONITOR` 项和 `/etc/sysmonitor/pscnt` 中 `SHOW_TOP_PROC_NUM` 项共同决定:
+  - `PSCNT_MONITOR` 为 on,且 `SHOW_TOP_PROC_NUM` 设置为合法值时,为启动。
+  - `PSCNT_MONITOR` 为 on,`SHOW_TOP_PROC_NUM` 为 0 时,为关闭。
+  - `PSCNT_MONITOR` 为 off 时,为关闭。
+- 进程数量告警时,增加打印系统句柄使用信息和内存信息(/proc/meminfo)。
+- 线程数量告警时,会记录线程总数信息、TOP 进程信息、当前环境进程数量信息、系统句柄数信息、内存信息(/proc/meminfo)。
+- 监控项监控周期到达前,若系统出现资源不足(如线程数超过系统最大线程数),则监控告警本身将由于资源受限无法正常运行,进而无法进行告警。
+
+### 异常日志
+
+如果监控到进程数告警,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]:---------------process count alarm start: ---------------
+info|sysmonitor[127]: process count alarm:1657
+info|sysmonitor[127]: process count alarm, show sys fd count: 2592
+info|sysmonitor[127]: process count alarm, show mem info
+info|sysmonitor[127]:---------------show /proc/meminfo: ---------------
+info|sysmonitor[127]:MemTotal: 3496388 kB
+info|sysmonitor[127]:MemFree: 2738100 kB
+info|sysmonitor[127]:MemAvailable: 2901888 kB
+info|sysmonitor[127]:Buffers: 165064 kB
+info|sysmonitor[127]:Cached: 282360 kB
+info|sysmonitor[127]:SwapCached: 4492 kB
+......
+info|sysmonitor[127]:---------------show_memory_info end. ---------------
+info|sysmonitor[127]:---------------process count alarm end: ---------------
+```
+
+如果监控到进程数恢复告警,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]: process count resume: 1200
+```
+
+如果监控到线程数告警,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]:---------------threads count alarm start: ---------------
+info|sysmonitor[127]:threads count alarm: 273
+info|sysmonitor[127]:open threads most 10 processes is [top1:pid=1756900,openthreadsnum=13,cmd=/usr/bin/sysmonitor --daemon]
+info|sysmonitor[127]:open threads most 10 processes is [top2:pid=3130,openthreadsnum=13,cmd=/usr/lib/gassproxy -D]
+.....
+info|sysmonitor[127]:---------------threads count alarm end. ---------------
+```
+
+## 系统句柄总数监控
+
+### 简介
+
+监控系统文件句柄(fd)数目,当系统文件句柄总数超过或低于阈值时,记录日志。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/sys_fd_conf`。
+
+```
+# system fd usage alarm percent
+SYS_FD_ALARM="80"
+# system fd usage alarm resume percent
+SYS_FD_RESUME="70"
+# monitor period (second)
+SYS_FD_PERIOD="600"
+```
+
+配置项说明:
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------------- | --------------------------------------------------------- | -------- | ------ |
+| SYS_FD_ALARM | 大于0小于100的整数,fd 总数与系统最大 fd 数百分比的告警阈值 | 否 | 80% |
+| SYS_FD_RESUME | 大于0小于100的整数,fd 总数与系统最大 fd 数百分比的恢复阈值 | 否 | 70% |
+| SYS_FD_PERIOD | 监控周期(秒),取值为100~86400 之间的整数 | 否 | 600 |
+
+- 修改 fd 总数监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- `SYS_FD_ALARM` 值应该大于 `SYS_FD_RESUME` 值,当配置非法时,会使用默认值,并打印日志。
+
+### 异常日志
+
+如果监控到 fd 总数告警,在监控日志中打印告警。`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]: sys fd count alarm: 259296
+```
+
+系统句柄使用告警时,会打印前三个使用句柄数最多的进程:
+
+```
+info|sysmonitor[127]:open fd most three processes is:[top1:pid=23233,openfdnum=5000,cmd=/home/openfile]
+info|sysmonitor[127]:open fd most three processes is:[top2:pid=23267,openfdnum=5000,cmd=/home/openfile]
+info|sysmonitor[127]:open fd most three processes is:[top3:pid=30144,openfdnum=5000,cmd=/home/openfile]
+```
+
+## 磁盘 inode 监控
+
+### 简介
+
+定期监控系统中挂载的磁盘分区 inode,当磁盘分区 inode 使用率大于或等于用户设置的告警阈值时,记录磁盘 inode 告警。发生告警后,当磁盘分区 inode 使用率小于用户设置的告警恢复阈值时,记录磁盘 inode 恢复告警。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/inode`。
+
+```
+DISK="/"
+DISK="/var/log"
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------ | ------------------------- | -------- | ------ |
+| DISK | 磁盘挂载目录名 | 是 | 无 |
+| ALARM | 整数,磁盘 inode 告警阈值 | 否 | 90 |
+| RESUME | 整数,磁盘 inode 恢复阈值 | 否 | 80 |
+
+- 修改磁盘 inode 监控的配置文件后,须执行 `systemctl reload sysmonitor`,新的配置在一个监控周期后生效。
+- 重复配置的挂载目录,最后一个配置项生效。
+- ALARM 值应该大于 RESUME 值。
+- 只能针对挂载点或被挂载的磁盘分区做监控。
+- 在 CPU 和 IO 高压场景下,df 命令执行超时,会导致磁盘 inode 利用率获取不到。
+- 当多个挂载点对应同一个磁盘分区时,以挂载点为准来上报告警。
+
+### 异常日志
+
+如果监控到磁盘 inode 告警,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[4570]:report disk inode alarm, /var/log used:90% alarm:90%
+info|sysmonitor[4570]:report disk inode recovered, /var/log used:79% alarm:80%
+```
+
+## 本地磁盘 io 延时监控
+
+### 简介
+
+每 5 秒读取一次本地磁盘 io 延时数据,每五分钟对该五分钟内的 60 组数据进行统计,如果其中多于 30 组(一半)数据大于配置的最大 IO 延时,则记录该磁盘的 IO 延时过大日志。
+
+### 配置文件说明
+
+配置文件为 `/etc/sysmonitor/iodelay`。
+
+```
+DELAY_VALUE="500"
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ----------- | -------------------- | -------- | ------ |
+| DELAY_VALUE | 磁盘 IO 延时的最大值 | 是 | 500 |
+
+### 异常日志
+
+如果监控到本地磁盘 IO 延时过大告警,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]:local disk sda IO delay is too large, I/O delay threshold is 70.
+info|sysmonitor[127]:disk is sda, io delay data: 71 72 75 87 99 29 78 ......
+```
+
+如果监控到本地磁盘 IO 延时告警恢复,`/var/log/sysmonitor.log` 中打印信息示例如下:
+
+```
+info|sysmonitor[127]:local disk sda IO delay is normal, I/O delay threshold is 70.
+info|sysmonitor[127]:disk is sda, io delay data: 11 22 35 8 9 29 38 ......
+```
+
+## 僵尸进程监控
+
+### 简介
+
+监控系统僵尸进程数量,大于告警阈值时,记录告警日志。当系统僵尸进程数小于恢复阈值时,告警恢复。
+
+### 配置文件说明
+
+配置文件为`/etc/sysmonitor/zombie`。
+
+```
+# Ceiling zombie process counts of alarm
+ALARM="500"
+
+# Floor zombie process counts of resume
+RESUME="400"
+
+# Periodic (second)
+PERIOD="600"
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| ------ | ------------------------------- | -------- | ------ |
+| ALARM | 大于0,僵尸进程个数告警阈值 | 否 | 500 |
+| RESUME | 大于等于0,僵尸进程个数恢复阈值 | 否 | 400 |
+| PERIOD | 监控周期(秒),取值大于0 | 否 | 60 |
+
+### 异常日志
+
+如果监控到僵尸进程个数告警,`/var/log/sysmonitor.log` 中打印信息如下:
+
+```
+info|sysmonitor[127]: zombie process count alarm: 600
+info|sysmonitor[127]: zombie process count resume: 100
+```
+
+## 自定义监控
+
+### 简介
+
+用户可以自定义监控项,监控框架读取配置文件内容,解析配置文件各监控属性,在监控框架里调用用户要执行的监控动作。监控模块仅提供监控框架,不感知用户在监控的内容以及如何监控,不负责上报告警。
+
+### 配置文件说明
+
+配置文件位于 `/etc/sysmonitor.d/` 路径下,每个进程或模块对应一个配置文件。
+
+```
+MONITOR_SWITCH="on"
+TYPE="periodic"
+EXECSTART="/usr/sbin/iomonitor_daemon"
+PERIOD="1800"
+```
+
+| 配置项 | 配置项说明 | 是否必配 | 默认值 |
+| -------------- | ------------------------------------------------------------ | --------------------- | ------ |
+| MONITOR_SWITCH | 监控开关 | 否 | off |
+| TYPE | 自定义监控项的类型,daemon 为后台运行,periodic 为周期运行 | 是 | 无 |
+| EXECSTART | 执行监控命令 | 是 | 无 |
+| ENVIROMENTFILE | 环境变量存放文件 | 否 | 无 |
+| PERIOD | 若 TYPE 为 periodic 类型,此为必配项,为自定义监控的周期,取值为大于0的整数 | periodic 类型为必配项 | 无 |
+
+- 配置文件名称、环境变量文件名称,加上绝对路径总长度不能超过 127 个字符。环境变量文件必须为绝对路径和实际路径,不能是软链接路径。
+- EXECSTART 项的命令总长度不能超过 159 个字符,关键字段配置不能有空格。
+- 周期性监控的执行命令不能超时,否则会对自定义监控框架产生影响。
+- 目前支持配置的环境变量最多为 256 个。
+- daemon 类型的自定义监控每间隔 10s 会统一查询是否有 reload 命令下发,或者是否有 daemon 进程异常退出;如果有 reload 命令下发,需要等待 10s 后才会重新加载新的配置;如果有 daemon 进程异常退出,需要等待 10s 才会重新拉起。
+- ENVIROMENTFILE 对应的文件中的内容发生变化,如新增环境变量,或环境变量的值发生变化,需要重启 sysmonitor 服务,新的环境变量才能生效。
+- `/etc/sysmonitor.d/` 目录下的配置文件权限建议为 600;EXECSTART 项中若只配置了执行文件,则执行文件的权限建议为 550。
+- daemon 进程异常退出后,sysmonitor 会重新加载该 daemon 进程的配置文件。
+
+### 异常日志
+
+如果 daemon 类型监控项异常退出,/var/log/sysmonitor.log 中会有如下记录:
+
+```
+info|sysmonitor[127]: custom daemon monitor: child process[11609] name unetwork_alarm exit code[127],[1] times.
+```
diff --git a/docs/zh/docs/thirdparty_migration/installha.md b/docs/zh/docs/thirdparty_migration/installha.md
index 7eac83928862655418141a0df1f49c4e117463e7..850a37913115ee696d301988056781248486f378 100644
--- a/docs/zh/docs/thirdparty_migration/installha.md
+++ b/docs/zh/docs/thirdparty_migration/installha.md
@@ -6,7 +6,7 @@
 ### 环境准备
 
-需要至少两台安装了openEuler 21.03 的物理机/虚拟机(现以两台为例),安装方法参考《[安装指南](../Installation/installation.md)》。
+需要至少两台安装了openEuler 24.03 的物理机/虚拟机(现以两台为例),安装方法参考《[安装指南](../Installation/installation.md)》。
 
 ### 修改主机名称及/etc/hosts文件
@@ -34,24 +34,24 @@
 ```Conf
 [OS]
 name=OS
-baseurl=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/
+baseurl=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/
 enabled=1
 gpgcheck=1
-gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler
+gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler
 
 [everything]
 name=everything
-baseurl=http://repo.openeuler.org/openEuler-23.09/everything/$basearch/
+baseurl=http://repo.openeuler.org/openEuler-{version}/everything/$basearch/
 enabled=1
 gpgcheck=1
-gpgkey=http://repo.openeuler.org/openEuler-23.09/everything/$basearch/RPM-GPG-KEY-openEuler
+gpgkey=http://repo.openeuler.org/openEuler-{version}/everything/$basearch/RPM-GPG-KEY-openEuler
 
 [EPOL]
 name=EPOL
-baseurl=http://repo.openeuler.org/openEuler-23.09/EPOL/$basearch/
+baseurl=http://repo.openeuler.org/openEuler-{version}/EPOL/$basearch/
 enabled=1
 gpgcheck=1
-gpgkey=http://repo.openeuler.org/openEuler-23.09/OS/$basearch/RPM-GPG-KEY-openEuler
+gpgkey=http://repo.openeuler.org/openEuler-{version}/OS/$basearch/RPM-GPG-KEY-openEuler
 ```
 
 ### 安装HA软件包组件
diff --git a/docs/zh/docs/thirdparty_migration/openstack.md b/docs/zh/docs/thirdparty_migration/openstack.md
index 05dd965ac5ea6eaa293805b3ab2ce0478d298e4e..12a6cbce8d096144d41d64509786a741dde07888 100644
--- a/docs/zh/docs/thirdparty_migration/openstack.md
+++ b/docs/zh/docs/thirdparty_migration/openstack.md
@@ -1,3 +1,3 @@
 # openEuler OpenStack
 
-openEuler OpenStack相关文档已迁移至[OpenStack SIG官网文档](https://openeuler.gitee.io/openstack/)。请访问链接获取详细信息。
+openEuler OpenStack相关文档已迁移至[OpenStack SIG官网文档](https://openstack-sig.readthedocs.io/zh/latest/)。请访问链接获取详细信息。
diff --git a/docs/zh/docs/userguide/pkgship.md b/docs/zh/docs/userguide/pkgship.md
index 9e9e5af6da36d6bdfc9a0ab969c3560d553aa7f2..0eb22bfbeb1f6d809ac6139203bc2d2cda8a5056 100644
--- a/docs/zh/docs/userguide/pkgship.md
+++ b/docs/zh/docs/userguide/pkgship.md
@@ -63,14 +63,14 @@ pkgship提供了公网地址
-> 说明:该软件支持在docker下运行。目前在openEuler21.09版本下,由于环境条件限制,创建docker时请使用--privileged参数,不使用--privileged参数将会导致软件启动失败,后续适配后将更新该文档。
+> 说明:该软件支持在docker下运行。
 
 **1、pkgship工具安装**
 
 工具安装可通过以下两种方式中的任意一种实现。
 
 - 方法一:通过dnf挂载repo源实现。
+    先使用dnf挂载pkgship软件所在repo源(具体方法可参考[应用开发指南](https://openeuler.org/zh/docs/24.03_LTS/docs/ApplicationDev/%E5%BC%80%E5%8F%91%E7%8E%AF%E5%A2%83%E5%87%86%E5%A4%87.html)),然后执行如下指令下载以及安装pkgship及其依赖。
 
 ```bash
 dnf install pkgship
 ```
@@ -194,14 +194,14 @@ database_port=9200
 conf.yaml 文件默认存放在 /etc/pkgship/ 路径下,pkgship会通过该配置读取要建立的数据库名称以及需要导入的sqlite文件,也支持配置sqlite文件所在的repo地址。conf.yaml 示例如下所示。
 
 ```yaml
-dbname: oe20.03 #数据库名称
-src_db_file: /etc/pkgship/repo/openEuler-20.03/src #源码包所在的本地路径
-bin_db_file: /etc/pkgship/repo/openEuler-20.03/bin #二进制包所在的本地路径
+dbname: oe{version} #数据库名称
+src_db_file: /etc/pkgship/repo/openEuler-{version}/src #源码包所在的本地路径
+bin_db_file: /etc/pkgship/repo/openEuler-{version}/bin #二进制包所在的本地路径
 priority: 1 #数据库优先级
 
-dbname: oe20.09
-src_db_file: https://repo.openeuler.org/openEuler-20.09/source #源码包所在的repo源
-bin_db_file: https://repo.openeuler.org/openEuler-20.09/everything/aarch64 #二进制包所在的repo源
+dbname: oe{version}
+src_db_file: https://repo.openeuler.org/openEuler-{version}/source #源码包所在的repo源
+bin_db_file: https://repo.openeuler.org/openEuler-{version}/everything/aarch64 #二进制包所在的repo源
 priority: 2
 ```
@@ -239,7 +239,7 @@
 pkgshipd stop 停止服务
 
 1. 数据库初始化。
 
-   > 使用场景:服务启动后,为了能查询对应的数据库(比如oe20.03,oe20.09)中的包信息及包依赖关系,需要将这些数据库通过createrepo生成的sqlite(分为源码库和二进制库)导入进服务内,生成对应的包信息json体然后插入Elasticsearch对应的数据库中。数据库名为根据conf.yaml中配置的dbname生成的dbname-source/binary。
+   > 使用场景:服务启动后,为了能查询对应的数据库(比如oe{version})中的包信息及包依赖关系,需要将这些数据库通过createrepo生成的sqlite(分为源码库和二进制库)导入进服务内,生成对应的包信息json体然后插入Elasticsearch对应的数据库中。数据库名为根据conf.yaml中配置的dbname生成的dbname-source/binary。
 
 ```bash
 pkgship init [-filepath path]
diff --git a/docs/zh/menu/index.md b/docs/zh/menu/index.md
index 2e38023f639f5a63ca7272079b91a7f1e4f72a44..2254870320f805d762f771d2d3870a8bbd47d0fc 100644
--- a/docs/zh/menu/index.md
+++ b/docs/zh/menu/index.md
@@ -6,7 +6,6 @@ headless: true
   - [简介]({{< relref "./docs/Releasenotes/简介.md" >}})
   - [用户须知]({{< relref "./docs/Releasenotes/用户须知.md" >}})
   - [帐号清单]({{< relref "./docs/Releasenotes/帐号清单.md" >}})
-  - [简介]({{< relref "./docs/Releasenotes/简介.md" >}})
   - [系统安装]({{< relref "./docs/Releasenotes/系统安装.md" >}})
   - [关键特性]({{< relref "./docs/Releasenotes/关键特性.md" >}})
   - [已知问题]({{< relref "./docs/Releasenotes/已知问题.md" >}})
@@ -23,17 +22,18 @@ headless: true
   - [安装方式介绍]({{< relref "./docs/Installation/安装方式介绍.md" >}})
   - [安装指导]({{< relref "./docs/Installation/安装指导.md" >}})
   - [使用kickstart自动化安装]({{< relref "./docs/Installation/使用kickstart自动化安装.md" >}})
-  - [FAQ]({{< relref "./docs/Installation/FAQ.md" >}})
+  - [常见问题与解决方法]({{< relref "./docs/Installation/常见问题与解决方法.md" >}})
   - [安装在树莓派]({{< relref "./docs/Installation/安装在树莓派.md" >}})
   - [安装准备]({{< relref "./docs/Installation/安装准备-1.md" >}})
   - [安装方式介绍]({{< relref "./docs/Installation/安装方式介绍-1.md" >}})
   - [安装指导]({{< relref "./docs/Installation/安装指导-1" >}})
-  - [FAQ]({{< relref "./docs/Installation/FAQ-1.md" >}})
+  - [常见问题与解决方法]({{< relref "./docs/Installation/常见问题与解决方法-1.md" >}})
   - [更多资源]({{< relref "./docs/Installation/更多资源.md" >}})
-  - [RISC-V安装指南]({{< relref "./docs/Installation/riscv.md" >}})
-  - [虚拟机安装]({{< relref "./docs/Installation/riscv_qemu.md" >}})
-  - [更多资源]({{< relref "./docs/Installation/riscv_more.md" >}})
-  - [升级指南]({{< relref "./docs/os_upgrade_and_downgrade/openEuler 22.03 LTS升降级指导.md" >}})
+  - [安装在RISC-V]({{< relref "./docs/Installation/安装在RISC-V.md" >}})
+  - [在QEMU上安装]({{< relref "./docs/Installation/RISC-V-QEMU.md" >}})
+  - [在PioneerBox上安装]({{< relref "./docs/Installation/RISC-V-Pioneer1.3.md" >}})
+  - [在LicheePi4A上安装]({{< relref "./docs/Installation/RISC-V-LicheePi4A.md" >}})
+  - [RISCV-OLK6.6同源版本指南]({{< relref "./docs/Installation/RISCV-OLK6.6同源版本指南.md" >}})
 - [系统管理](#)
   - [管理员指南]({{< relref "./docs/Administration/administration.md" >}})
   - [查看系统信息]({{< relref "./docs/Administration/查看系统信息.md" >}})
@@ -45,8 +45,8 @@ headless: true
   - [管理内存]({{< relref "./docs/Administration/overview.md" >}})
   - [etmem用户指南]({{< relref "./docs/Administration/memory-management.md" >}})
   - [GMEM用户指南]({{< relref "./docs/GMEM/认识GMEM" >}})
-  - [安装与部署]({{< relref "./docs/GMEM/安装与部署.md" >}})
-  - [使用方法]({{< relref "./docs/GMEM/使用说明.md" >}})
+    - [安装与部署]({{< relref "./docs/GMEM/安装与部署.md" >}})
+    - [使用方法]({{< relref "./docs/GMEM/使用说明.md" >}})
   - [配置网络]({{< relref "./docs/Administration/配置网络.md" >}})
   - [使用LVM管理硬盘]({{< relref "./docs/Administration/使用LVM管理硬盘.md" >}})
   - [使用KAE加速引擎]({{< relref "./docs/Administration/使用KAE加速引擎.md" >}})
@@ -56,7 +56,13 @@ headless: true
   - [搭建web服务器]({{< relref "./docs/Administration/搭建web服务器.md" >}})
   - [搭建数据库服务器]({{< relref "./docs/Administration/搭建数据库服务器.md" >}})
   - [可信计算]({{< relref "./docs/Administration/可信计算.md" >}})
-  - [FAQ]({{< relref "./docs/Administration/FAQ-54.md" >}})
+    - [内核完整性度量(IMA)]({{< relref "./docs/Administration/内核完整性度量(IMA).md" >}})
+    - [动态完整性度量(DIM)]({{< relref "./docs/Administration/动态完整性度量(DIM).md" >}})
+    - [远程证明(鲲鹏安全库)]({{< relref "./docs/Administration/远程证明(鲲鹏安全库).md" >}})
+    - [可信平台控制模块(TPCM)]({{< relref "./docs/Administration/可信平台控制模块(TPCM).md" >}})
+    - [解释器类应用程序安全防护]({{< relref "./docs/Administration/解释器类应用程序完整性保护用户文档.md" >}})
+    - [内核可信根框架]({{< relref "./docs/Administration/内核可信根框架用户文档.md" >}})
+  - [常见问题与解决方法]({{< relref "./docs/Administration/常见问题与解决方法.md" >}})
   - [运维指南]({{< relref "./docs/ops_guide/overview.md" >}})
   - [运维概述]({{< relref "./docs/ops_guide/运维概述.md" >}})
   - [系统资源与性能]({{< relref "./docs/ops_guide/系统资源与性能.md" >}})
@@ -80,14 +86,12 @@ headless: true
   - [Aops用户指南]({{< relref "./docs/A-Ops/overview.md" >}})
   - [AOps部署指南]({{< relref "./docs/A-Ops/AOps部署指南.md" >}})
   - [AOps智能定位框架使用手册]({{< relref "./docs/A-Ops/AOps智能定位框架使用手册.md" >}})
-  - [aops-agent部署指南]({{< relref "./docs/A-Ops/aops-agent部署指南.md" >}})
-  - [热补丁dnf插件使用手册]({{< relref "./docs/A-Ops/dnf插件命令指导手册.md" >}})
-  - [配置溯源服务使用手册]({{< relref "./docs/A-Ops/配置溯源服务使用手册.md" >}})
-  - [架构感知服务使用手册]({{< relref "./docs/A-Ops/架构感知服务使用手册.md" >}})
-  - [gala-gopher使用手册]({{< relref "./docs/A-Ops/gala-gopher使用手册.md" >}})
+  - [AOps漏洞管理模块使用手册]({{< relref "./docs/A-Ops/AOps漏洞管理模块使用手册.md" >}})
+  - [热补丁dnf插件使用手册]({{< relref "./docs/A-Ops/dnf插件命令使用手册.md" >}})
+  - [社区热补丁制作发布流程]({{< relref "./docs/A-Ops/社区热补丁制作发布流程.md" >}})
   - [gala-anteater使用手册]({{< relref "./docs/A-Ops/gala-anteater使用手册.md" >}})
+  - [gala-gopher使用手册]({{< relref "./docs/A-Ops/gala-gopher使用手册.md" >}})
   - [gala-spider使用手册]({{< relref "./docs/A-Ops/gala-spider使用手册.md" >}})
-  - [社区热补丁制作发布流程]({{< relref "./docs/A-Ops/社区热补丁制作发布流程.md" >}})
   - [内核热升级指南]({{< relref "./docs/KernelLiveUpgrade/KernelLiveUpgrade.md" >}})
   - [安装与部署]({{< relref "./docs/KernelLiveUpgrade/安装与部署.md" >}})
   - [使用方法]({{< relref "./docs/KernelLiveUpgrade/使用方法.md" >}})
@@ -98,6 +102,7 @@ headless: true
   - [使用SysCare]({{< relref "./docs/SysCare/使用SysCare.md" >}})
   - [约束限制]({{< relref "./docs/SysCare/约束限制.md" >}})
   - [常见问题与解决方法]({{< relref "./docs/SysCare/常见问题与解决方法.md" >}})
+  - [sysmonitor用户指南]({{< relref "./docs/sysmonitor/sysmonitor-使用手册.md" >}})
   - [HA 用户指南]({{< relref "./docs/thirdparty_migration/ha.md" >}})
   - [部署 HA]({{< relref "./docs/thirdparty_migration/installha.md" >}})
   - [HA 使用实例]({{< relref "./docs/thirdparty_migration/usecase.md" >}})
@@ -113,6 +118,7 @@ headless: true
   - [内核参数]({{< relref "./docs/SecHarden/内核参数.md" >}})
- [SELinux配置]({{< relref "./docs/SecHarden/SELinux配置.md" >}}) - [安全加固工具]({{< relref "./docs/SecHarden/安全加固工具.md" >}}) + - [安全配置加固工具]({{< relref "./docs/SecHarden/安全配置加固工具.md" >}}) - [附录]({{< relref "./docs/SecHarden/附录.md" >}}) - [secGear开发指南]({{< relref "./docs/secGear/secGear.md" >}}) - [认识secGear]({{< relref "./docs/secGear/认识secGear.md" >}}) @@ -121,6 +127,25 @@ headless: true - [开发secGear应用程序]({{< relref "./docs/secGear/开发secGear应用程序.md" >}}) - [CVE-ease设计指南]({{< relref "./docs/CVE-ease/CVE-ease设计介绍.md" >}}) - [CVE-ease介绍和安装说明]({{< relref "./docs/CVE-ease/CVE-ease介绍和安装说明.md" >}}) + - [证书签名]({{< relref "./docs/CertSignature/总体概述.md" >}}) + - [签名证书介绍]({{< relref "./docs/CertSignature/签名证书介绍.md" >}}) + - [安全启动]({{< relref "./docs/CertSignature/安全启动.md" >}}) + - [国密]({{< relref "./docs/ShangMi/概述.md" >}}) + - [磁盘加密]({{< relref "./docs/ShangMi/磁盘加密.md" >}}) + - [内核模块签名]({{< relref "./docs/ShangMi/内核模块签名.md" >}}) + - [算法库]({{< relref "./docs/ShangMi/算法库.md" >}}) + - [文件完整性保护]({{< relref "./docs/ShangMi/文件完整性保护.md" >}}) + - [用户身份鉴别]({{< relref "./docs/ShangMi/用户身份鉴别.md" >}}) + - [证书]({{< relref "./docs/ShangMi/证书.md" >}}) + - [安全启动]({{< relref "./docs/ShangMi/安全启动.md" >}}) + - [SSH协议栈]({{< relref "./docs/ShangMi/SSH协议栈.md" >}}) + - [TLCP协议栈]({{< relref "./docs/ShangMi/TLCP协议栈.md" >}}) + - [RPM支持国密签名验签]({{< relref "./docs/ShangMi/RPM签名验签.md" >}}) + - [secDetector用户指南]({{< relref "./docs/secDetector/secDetector.md" >}}) + - [认识secDetector]({{< relref "./docs/secDetector/认识secDetector.md" >}}) + - [安装与部署]({{< relref "./docs/secDetector/安装secDetector.md" >}}) + - [接口参考]({{< relref "./docs/secDetector/接口参考.md" >}}) + - [使用secDetector]({{< relref "./docs/secDetector/使用secDetector.md" >}}) - [性能](#) - [A-Tune用户指南]({{< relref "./docs/A-Tune/A-Tune.md" >}}) - [认识A-Tune]({{< relref "./docs/A-Tune/认识A-Tune.md" >}}) @@ -129,10 +154,11 @@ headless: true - [native-turbo特性]({{< relref "./docs/A-Tune/native-turbo.md" >}}) - [常见问题与解决方法]({{< relref "./docs/A-Tune/常见问题与解决方法.md" >}}) - 
[附录]({{< relref "./docs/A-Tune/附录.md" >}}) - - [sysBoost用户指南]({{< relref "./docs/sysBoost/sysBoost.md" >}}) + - [sysBoost用户指南]({{< relref "./docs/sysBoost/sysBoost.md" >}}) - [认识sysBoost]({{< relref "./docs/sysBoost/认识sysBoost.md" >}}) - [安装与部署]({{< relref "./docs/sysBoost/安装与部署.md" >}}) - [使用方法]({{< relref "./docs/sysBoost/使用方法.md" >}}) + - [oeAware用户指南]({{< relref "./docs/oeAware/oeAware用户指南.md" >}}) - [桌面](#) - [UKUI]({{< relref "./docs/desktop/ukui.md" >}}) - [安装 UKUI]({{< relref "./docs/desktop/安装UKUI.md" >}}) @@ -150,7 +176,7 @@ headless: true - [安装 Kiran]({{< relref "./docs/desktop/kiran安装手册.md" >}}) - [Kiran 用户指南]({{< relref "./docs/desktop/Kiran_userguide.md" >}}) - [嵌入式](#) - - [openEuler Embedded用户指南](https://openeuler.gitee.io/yocto-meta-openeuler/master/index.html) + - [openEuler Embedded用户指南](https://pages.openeuler.openatom.cn/embedded/docs/build/html/openEuler-24.03-LTS/index.html) - [UniProton用户指南]({{< relref "./docs/Embedded/UniProton/UniProton用户指南-概述.md" >}}) - [UniProton功能设计]({{< relref "./docs/Embedded/UniProton/UniProton功能设计.md" >}}) - [UniProton接口说明]({{< relref "./docs/Embedded/UniProton/UniProton接口说明.md" >}}) @@ -170,6 +196,7 @@ headless: true - [vmtop]({{< relref "./docs/Virtualization/vmtop.md" >}}) - [LibcarePlus]({{< relref "./docs/Virtualization/LibcarePlus.md" >}}) - [Skylark虚拟机混部]({{< relref "./docs/Virtualization/Skylark.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/Virtualization/常见问题与解决方法.md" >}}) - [附录]({{< relref "./docs/Virtualization/附录.md" >}}) - [StratoVirt用户指南]({{< relref "./docs/StratoVirt/StratoVirtGuide.md" >}}) - [StratoVirt介绍]({{< relref "./docs/StratoVirt/StratoVirt介绍.md" >}}) @@ -198,7 +225,8 @@ headless: true - [支持CNI网络]({{< relref "./docs/Container/支持CNI网络.md" >}}) - [容器资源管理]({{< relref "./docs/Container/容器资源管理.md" >}}) - [特权容器]({{< relref "./docs/Container/特权容器.md" >}}) - - [CRI接口]({{< relref "./docs/Container/CRI接口.md" >}}) + - [CRI-v1alpha2接口]({{< relref "./docs/Container/CRI-v1alpha2接口.md" >}}) + - [CRI-v1接口]({{< 
relref "./docs/Container/CRI-v1接口.md" >}}) - [镜像管理]({{< relref "./docs/Container/镜像管理.md" >}}) - [容器健康状态检查]({{< relref "./docs/Container/容器健康状态检查.md" >}}) - [查询信息]({{< relref "./docs/Container/查询信息.md" >}}) @@ -206,6 +234,9 @@ headless: true - [支持OCI hooks]({{< relref "./docs/Container/支持OCI-hooks.md" >}}) - [本地卷管理]({{< relref "./docs/Container/本地卷管理.md" >}}) - [iSulad shim v2 对接 StratoVirt]({{< relref "./docs/Container/iSula-shim-v2对接stratovirt.md" >}}) + - [iSulad支持cgroup v2]({{< relref "./docs/Container/iSulad支持cgroup v2.md" >}}) + - [iSulad支持CDI]({{< relref "./docs/Container/iSulad支持CDI.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/Container/isula常见问题与解决方法.md" >}}) - [附录]({{< relref "./docs/Container/附录.md" >}}) - [系统容器]({{< relref "./docs/Container/系统容器.md" >}}) - [安装指导]({{< relref "./docs/Container/安装指导.md" >}}) @@ -239,6 +270,9 @@ headless: true - [镜像管理]({{< relref "./docs/Container/镜像管理-4.md" >}}) - [统计信息]({{< relref "./docs/Container/统计信息-4.md" >}}) - [容器镜像构建]({{< relref "./docs/Container/isula-build构建工具.md" >}}) + - [使用指南]({{< relref "./docs/Container/isula-build使用指南.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/Container/isula-build常见问题与解决方法.md" >}}) + - [附录]({{< relref "./docs/Container/isula-build附录.md" >}}) - [Kuasar多沙箱容器运行时]({{< relref "./docs/Container/kuasar多沙箱运行时.md" >}}) - [安装与配置]({{< relref "./docs/Container/kuasar安装与配置.md" >}}) - [使用指南]({{< relref "./docs/Container/kuasar使用指南.md" >}}) @@ -261,13 +295,16 @@ headless: true - [部署集群]({{< relref "./docs/Kubernetes/eggo部署集群.md" >}}) - [拆除集群]({{< relref "./docs/Kubernetes/eggo拆除集群.md" >}}) - [运行测试pod]({{< relref "./docs/Kubernetes/运行测试pod.md" >}}) + - [基于containerd部署集群]({{< relref "./docs/Kubernetes/Kubernetes集群部署指南 - containerd.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/Kubernetes/常见问题与解决方法.md" >}}) - [云原生混合部署rubik用户指南]({{< relref "./docs/rubik/overview.md" >}}) - [安装与部署]({{< relref "./docs/rubik/安装与部署.md" >}}) - - [http接口文档]({{< relref "./docs/rubik/http接口文档.md" >}}) + - [特性介绍]({{< relref 
"./docs/rubik/modules.md" >}}) + - [配置文档]({{< relref "./docs/rubik/配置文档.md" >}}) - [混部隔离示例]({{< relref "./docs/rubik/混部隔离示例.md" >}}) - - [NestOS用户指南]({{< relref "./docs/NestOS/overview.md" >}}) - - [安装与部署]({{< relref "./docs/NestOS/安装与部署.md" >}}) - - [使用方法]({{< relref "./docs/NestOS/使用方法.md" >}}) + - [附录]({{< relref "./docs/rubik/附录.md" >}}) + - [NestOS云底座操作系统]({{< relref "./docs/NestOS/overview.md" >}}) + - [NestOS For Container用户指南]({{< relref "./docs/NestOS/NestOS For Container用户指南.md" >}}) - [功能特性描述]({{< relref "./docs/NestOS/功能特性描述.md" >}}) - [CTinspector用户指南]({{< relref "./docs/CTinspector/认识CTinspector.md" >}}) - [安装与部署]({{< relref "./docs/CTinspector/安装与部署.md" >}}) @@ -282,7 +319,7 @@ headless: true - [认识Kmesh]({{< relref "./docs/Kmesh/认识Kmesh.md" >}}) - [安装与部署]({{< relref "./docs/Kmesh/安装与部署.md" >}}) - [使用方法]({{< relref "./docs/Kmesh/使用方法.md" >}}) - - [常见问题与解决办法]({{< relref "./docs/Kmesh/常见问题与解决办法.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/Kmesh/常见问题与解决方法.md" >}}) - [附录]({{< relref "./docs/Kmesh/附录.md" >}}) - [边缘计算](#) - [KubeEdge部署指南]({{< relref "./docs/KubeEdge/overview.md" >}}) @@ -293,7 +330,7 @@ headless: true - [认识ROS]({{< relref "./docs/ROS/认识ROS.md" >}}) - [安装与部署]({{< relref "./docs/ROS/安装与部署.md" >}}) - [使用方法]({{< relref "./docs/ROS/使用方法.md" >}}) - - [常见问题与解决办法]({{< relref "./docs/ROS/常见问题与解决方法.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/ROS/常见问题与解决方法.md" >}}) - [附录]({{< relref "./docs/ROS/附录.md" >}}) - [openEuler DevKit](#) - [isocut 使用指南]({{< relref "./docs/TailorCustom/isocut使用指南.md" >}}) @@ -306,6 +343,34 @@ headless: true - [openEuler DevOps](#) - [patch-tracking]({{< relref "./docs/userguide/patch-tracking.md" >}}) - [pkgship]({{< relref "./docs/userguide/pkgship.md" >}}) + - [EulerMaker]({{< relref "./docs/EulerMaker/EulerMaker用户指南.md" >}}) + - [merge-configs]({{< relref "./docs/EulerMaker/merge-configs.md" >}}) + - [Ods Pipeline用户指南]({{< relref "./docs/Ods-Pipeline/ods pipeline用户指南.md" >}}) +- [AI](#) + - [openEuler Copilot System]({{< 
relref "./docs/AI/openEuler_Copilot_System/README.md" >}}) + - [使用指南](#) + - [网页端](#) + - [前言]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/线上服务/前言.md" >}}) + - [注册与登录]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/线上服务/注册与登录.md" >}}) + - [智能问答使用指南]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/线上服务/智能问答使用指南.md" >}}) + - [智能插件简介]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/线上服务/智能插件简介.md" >}}) + - [命令行客户端](#) + - [获取 API Key]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/命令行客户端/获取 API Key.md" >}}) + - [命令行助手使用指南]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/命令行客户端/命令行助手使用指南.md" >}}) + - [智能调优]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/命令行客户端/智能调优.md" >}}) + - [智能诊断]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/命令行客户端/智能诊断.md" >}}) + - [知识库管理](#) + - [witChainD 使用指南]({{< relref "./docs/AI/openEuler_Copilot_System/使用指南/知识库管理/witChainD使用指南.md" >}}) + - [部署指南](#) + - [网络环境下部署指南]({{< relref "./docs/AI/openEuler_Copilot_System/部署指南/网络环境下部署指南.md" >}}) + - [无网络环境下部署指南]({{< relref "./docs/AI/openEuler_Copilot_System/部署指南/无网络环境下部署指南.md" >}}) + - [本地资产库构建指南]({{< relref "./docs/AI/openEuler_Copilot_System/部署指南/本地资产库构建指南.md" >}}) + - [插件部署指南](#) + - [智能调优]({{< relref "./docs/AI/openEuler_Copilot_System/部署指南/插件部署指南/智能调优/插件—智能调优部署指南.md" >}}) + - [智能诊断]({{< relref "./docs/AI/openEuler_Copilot_System/部署指南/插件部署指南/智能诊断/插件—智能诊断部署指南.md" >}}) + - [AI容器栈]({{< relref "./docs/AI/openEuler_Copilot_System/部署指南/插件部署指南/AI容器栈/插件—AI容器栈部署指南.md" >}}) + - [AI大模型服务镜像使用指南]({{< relref "./docs/AI/AI大模型服务镜像使用指南.md" >}}) + - [AI容器镜像用户指南]({{< relref "./docs/AI/AI容器镜像用户指南.md" >}}) - [应用开发](#) - [应用开发指南]({{< relref "./docs/ApplicationDev/application-development.md" >}}) - [开发环境准备]({{< relref "./docs/ApplicationDev/开发环境准备.md" >}}) @@ -314,7 +379,10 @@ headless: true - [使用make编译]({{< relref "./docs/ApplicationDev/使用make编译.md" >}}) - [使用JDK编译]({{< relref "./docs/ApplicationDev/使用JDK编译.md" >}}) - [构建RPM包]({{< relref "./docs/ApplicationDev/构建RPM包.md" >}}) 
- - [FAQ]({{< relref "./docs/ApplicationDev/FAQ.md" >}}) - - [GCC用指南]({{< relref "./docs/GCC/overview.md" >}}) + - [常见问题与解决方法]({{< relref "./docs/ApplicationDev/常见问题与解决方法.md" >}}) + - [GCC用户指南]({{< relref "./docs/GCC/overview.md" >}}) - [内核反馈优化特性用户指南]({{< relref "./docs/GCC/内核反馈优化特性用户指南.md" >}}) - + - [AI4C用户使用指南]({{< relref "./docs/AI4C/AI4C用户使用指南.md" >}}) + - [FangTian视窗引擎]({{< relref "./docs/FangTian/overview.md" >}}) + - [FangTian环境配置]({{< relref "./docs/FangTian/FangTian环境配置.md" >}}) + - [FangTian支持Wayland应用及鸿蒙应用]({{< relref "./docs/FangTian/FangTian支持Wayland应用及鸿蒙应用.md" >}})
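A note for reviewers of the pkgship hunk near the top of this patch: the `pkgship init` usage scenario it touches (read a createrepo-generated sqlite repo database, build one JSON document per package, and insert the documents into an Elasticsearch index named `dbname-source` / `dbname-binary` per `conf.yaml`) can be sketched roughly as below. This is an illustration of the described flow, not pkgship's actual implementation: the column selection follows createrepo's `primary.sqlite` `packages` table, and the Elasticsearch insert is stubbed out (via the `insert` callback) so the sketch stays self-contained.

```python
# Hypothetical sketch of the "pkgship init" import flow described in the
# README hunk above: read a createrepo-generated sqlite database (source
# or binary repo), turn each package row into a JSON document, and hand
# the documents to an inserter targeting index "<dbname>-source" or
# "<dbname>-binary". Real pkgship code would bulk-post to Elasticsearch;
# here `insert` is a plain callback so the example has no ES dependency.
import json
import sqlite3


def load_packages(sqlite_path):
    """Yield one JSON-serializable dict per row of the packages table."""
    conn = sqlite3.connect(sqlite_path)
    conn.row_factory = sqlite3.Row  # rows become name-addressable
    try:
        for row in conn.execute(
            "SELECT name, version, release, summary FROM packages"
        ):
            yield dict(row)
    finally:
        conn.close()


def init_index(dbname, kind, sqlite_path, insert=print):
    """Import one repo database; kind is 'source' or 'binary'."""
    index = f"{dbname}-{kind}"          # e.g. "openEuler-source"
    for doc in load_packages(sqlite_path):
        insert(index, json.dumps(doc))  # stub for the Elasticsearch write
```

The split between `dbname-source` and `dbname-binary` mirrors the README's wording that createrepo produces separate source and binary sqlite databases, each imported into its own Elasticsearch index.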