diff --git a/docs/en/docs/A-Ops/sysTrace-usage-guide.md b/docs/en/docs/A-Ops/sysTrace-usage-guide.md
new file mode 100644
index 0000000000000000000000000000000000000000..fa9e13e4d8a52778f2c79c2a2b7776695cc93907
--- /dev/null
+++ b/docs/en/docs/A-Ops/sysTrace-usage-guide.md
@@ -0,0 +1,55 @@
+# sysTrace Usage Guide
+
+## Overview
+
+sysTrace is a software tool for AI training tasks. During AI training, failures can lead to wasted resources. The pain points are as follows:
+
+- Lack of performance monitoring and anomaly detection for AI training faults
+- Inability to perform end-to-end tracing of AI task slowdowns and faulty cards caused by host-bound issues
+
+sysTrace provides the following functions:
+
+- Collects the Python call stack at the torch_npu layer.
+- Collects memory usage information at the CANN layer to determine whether an HBM out-of-memory (OOM) fault occurs.
+- Collects dispatch and execution information of communication operators via the MindStudio Profiling Tool Interface (MSPTI) to determine whether operator-level latency occurs and, if so, to identify the slow card.
+
+## Installation
+
+sysTrace can be installed only on openEuler 22.03 SP4. On a host configured with the openEuler YUM repository, you can install it by running the **yum** command, as described below.
+
+### Environment Requirements
+
+- OS: openEuler 22.03 SP4
+- Ascend CANN 8.0 RC3 or later
+- libunwind 1.7 or later
+
+### Installation Procedure
+
+Configure the openEuler YUM repository and run the **yum** command to install sysTrace.
+
+```shell
+yum install sysTrace
+```
+
+## Usage
+
+### Collecting Data
+
+After the sysTrace software package is installed, load the dynamic libraries into the AI training task using **LD_PRELOAD**. (Note: The profiling overhead introduced by sysTrace depends on the AI training workload. It is recommended that the fluctuation of the training cost during actual testing remain below 0.5%. With larger fluctuations, the profiling overhead may increase and amplify the resource usage introduced by sysTrace, which is about 2%.)
+
+```shell
+LD_PRELOAD=/usr/local/lib/libunwind.so.8.2.0:/usr/local/lib/libunwind-aarch64.so.8.2.0:/home/ascend-toolkit-bak/ascend-toolkit/8.0.RC3.10/tools/mspti/lib64/libmspti.so:/usr/lib64/libsysTrace.so python ...
+```
+
+### Converting Data
+
+```shell
+# Convert memory data to the flame graph format. (Note: only data from card 0 can be converted.)
+python /usr/bin/sysTrace/convert_mem_to_flamegraph.py
+
+# Convert Python call stack data to the timeline format.
+python /usr/bin/sysTrace/convert_pytorch_to_timeline.py --output
+```
+
+### Display
+
+Upload the output data to a visualization tool and display the data.
diff --git a/docs/en/docs/Administration/FUSE-fastpath-feature-description-and-usage-guide.md b/docs/en/docs/Administration/FUSE-fastpath-feature-description-and-usage-guide.md
new file mode 100644
index 0000000000000000000000000000000000000000..7ff0756887efcaa44025f70e9424f7f8909aaba7
--- /dev/null
+++ b/docs/en/docs/Administration/FUSE-fastpath-feature-description-and-usage-guide.md
@@ -0,0 +1,107 @@
+# FUSE Fastpath Feature Description and Usage Guide
+
+## Overview
+
+Filesystem in Userspace (FUSE) allows non-root users to create file systems in user space. Developers, even without kernel module development experience, can implement new and tailored file systems without modifying kernel code. FUSE simplifies file system development, enhances functionality, and boosts security and stability. Yet, it significantly affects I/O performance, limiting its applications.
+To improve FUSE performance, openEuler introduces the fastpath feature. It accelerates FUSE in both single-thread and multi-thread scenarios through pre-created threads with CPU binding, shared memory, and fast process switching. Specifically, the following three technologies are provided:
+
+1. When a user-space file system is mounted, a dedicated FUSE daemon is created for each CPU and bound to that core. When a user process issues an I/O request, the FUSE daemon on the same CPU is directly invoked. This reduces the overhead of creating and destroying threads, while maximizing cache locality, since both the user process and its FUSE daemon often access the same memory. Thread-to-core binding mimics the behavior of per-CPU variables. In this way, user processes on different CPUs do not contend for shared locks when delivering I/O requests, improving parallelism.
+2. Shared memory is created between the FUSE kernel module and the FUSE daemon, and the FUSE header information is stored directly in the shared memory, avoiding the overhead of addressing and copying.
+3. Fast thread switching allows threads running on CPUs to be explicitly switched in the kernel, bypassing the conventional thread switching process and achieving higher CPU utilization.
+
+## Usage Mode
+
+### Environment Requirements
+
+- Hardware: AArch64 processor
+- Software: The FUSE fastpath feature is available only on openEuler 22.03 LTS SP4 and requires coordination between the kernel-space and user-space components. To enable the fastpath feature, the minimum kernel version required is 5.10.0-264.0.0. On the user-space side, only libfuse is supported, and the minimum fuse3 version required is 3.10.5-11. You can obtain the updated software packages from the openEuler update repository.
+
+### How to Use
+
+#### Dynamic Library Linking
+
+The user-space portion of the fastpath feature is provided as a libfuse dynamic library within the fuse3 software package. For a user-space file system built with libfuse, you can run the following command to link libfuse:
+
+```bash
+gcc program.o -L/path/to/lib -lfuse -o program
+```
+
+For a user-space file system that has already been compiled into a binary, you can run the following command to view the dynamic libraries linked to it:
+
+```bash
+ldd /path/to/binary
+```
+
+If the linked dynamic library is not the target version, you can use either of the following methods to switch to the desired version:
+
+- Method 1: Updating `/etc/ld.so.conf`
+
+  Open or create a configuration file, for example, `/etc/ld.so.conf.d/newlib.conf`, and add the library path.
+
+  ```bash
+  /path/to/lib
+  ```
+
+  Run the following command to update the cache:
+
+  ```bash
+  sudo ldconfig
+  ```
+
+- Method 2: Setting LD_LIBRARY_PATH
+
+  You can set the environment variable *LD_LIBRARY_PATH* to specify the library path. This method works without admin rights but lasts only for the current session.
+
+  ```bash
+  export LD_LIBRARY_PATH=/path/to/lib:$LD_LIBRARY_PATH
+  ```
+
+#### Adaptation and Enablement
+
+After confirming that the file system is linked to the correct version of the dynamic library, you can enable the fastpath feature by adding parameters when creating the FUSE session. The following three options are available:
+
+- use_fastpath: Enables the fastpath feature, including the pre-created threads with CPU binding, shared memory, and fast process switching.
+- no_interrupt: Disables handling of **interrupt** requests.
+- no_forget: Disables handling of **forget** requests.
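+
+The following is a minimal sketch of how these options might be passed when a low-level libfuse file system creates its session with the fastpath-enabled libfuse described above. Only the option names come from this guide; the empty operations structure, the file name, and the build command are illustrative assumptions rather than part of the feature.
+
+```c
+#define FUSE_USE_VERSION 31
+#include <fuse_lowlevel.h>
+#include <fuse_opt.h>
+
+/* Callbacks are omitted in this sketch; a real file system fills them in. */
+static const struct fuse_lowlevel_ops sketch_ops;
+
+int main(int argc, char *argv[])
+{
+    (void)argc;
+    struct fuse_args args = FUSE_ARGS_INIT(0, NULL);
+
+    /* argv[0] first, then the fastpath-related session options. */
+    if (fuse_opt_add_arg(&args, argv[0]) ||
+        fuse_opt_add_arg(&args, "-ouse_fastpath") ||
+        fuse_opt_add_arg(&args, "-ono_interrupt") ||
+        fuse_opt_add_arg(&args, "-ono_forget"))
+        return 1;
+
+    struct fuse_session *se = fuse_session_new(&args, &sketch_ops,
+                                               sizeof(sketch_ops), NULL);
+    if (se == NULL) {
+        fuse_opt_free_args(&args);
+        return 1;
+    }
+
+    /* A real file system would call fuse_session_mount(), fuse_session_loop(),
+       and fuse_session_unmount() here. */
+    fuse_session_destroy(se);
+    fuse_opt_free_args(&args);
+    return 0;
+}
+```
+
+Such a sketch can be compiled against the fuse3 package that provides the fastpath-enabled libfuse, for example with `gcc sketch.c $(pkg-config --cflags --libs fuse3) -o sketch`.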
+ +The no_forget and no_interrupt options can improve performance by skipping the handling of **interrupt** and **forget** requests. You can enable the options only when the user-space file system does not rely on these requests. Otherwise, problems may occur. + +Note that the parameters are added to the fuse_session_new function when creating the FUSE session using the libfuse, rather than added when running the user-space file system binary. The adaptation method varies according to the user-space file system implementation. The following uses the demo ([passthrough_hp](https://github.com/libfuse/libfuse/blob/fuse-3.10.5/example/passthrough_hp.cc)) delivered with libfuse as an example. The code needs to be adapted as follows: + +```diff +diff --git a/example/passthrough_hp.cc b/example/passthrough_hp.cc +index 872fc73..1f96820 100644 +--- a/example/passthrough_hp.cc ++++ b/example/passthrough_hp.cc +@@ -1146,7 +1146,10 @@ static cxxopts::ParseResult parse_options(int argc, char **argv) { + ("help", "Print help") + ("nocache", "Disable all caching") + ("nosplice", "Do not use splice(2) to transfer data") +- ("single", "Run single-threaded"); ++ ("single", "Run single-threaded") ++ ("nointerrupt", "Do not process interrupt request") ++ ("noforget", "Do not process forget request") ++ ("usefastpath", "use fastpath"); + + // FIXME: Find a better way to limit the try clause to just + // opt_parser.parse() (cf. https://github.com/jarro2783/cxxopts/issues/146) +@@ -1225,7 +1228,10 @@ int main(int argc, char *argv[]) { + if (fuse_opt_add_arg(&args, argv[0]) || + fuse_opt_add_arg(&args, "-o") || + fuse_opt_add_arg(&args, "default_permissions,fsname=hpps") || +- (options.count("debug-fuse") && fuse_opt_add_arg(&args, "-odebug"))) ++ (options.count("debug-fuse") && fuse_opt_add_arg(&args, "-odebug")) || ++ (options.count("nointerrupt") && fuse_opt_add_arg(&args, "-ono_interrupt")) || ++ (options.count("noforget") && fuse_opt_add_arg(&args, "-ono_forget")) || ++ (options.count("usefastpath") && fuse_opt_add_arg(&args, "-ouse_fastpath"))) + errx(3, "ERROR: Out of memory"); + + fuse_lowlevel_ops sfs_oper {}; +``` + +Mount Mode: + +```bash +passthrough_hp --usefastpath /path/to/src /path/to/mnt +``` + +After being passed to passthrough_hp, the parameter `--usefastpath` is parsed as the parameter `-ouse_fastpath` for creating a FUSE session. The session initialized with this parameter will enable the fastpath function during file system mounting and FUSE daemon creation. diff --git a/docs/en/docs/Gazelle/Gazelle.md b/docs/en/docs/Gazelle/Gazelle.md index 7b0debf1d63aea44f73cafeb60711d68be3cf2aa..e74445f860b0e32acc033e5994d6f66ec364df30 100644 --- a/docs/en/docs/Gazelle/Gazelle.md +++ b/docs/en/docs/Gazelle/Gazelle.md @@ -16,7 +16,7 @@ In the single-process scenario where the NIC supports multiple queues, use **lib ## Installation -Configure the Yum source of openEuler and run the`yum` command to install Gazelle. +Configure the Yum repository of openEuler and run the`yum` command to install Gazelle. ```sh yum install dpdk diff --git a/docs/en/docs/Virtualization/vm-live-migration.md b/docs/en/docs/Virtualization/vm-live-migration.md index 4570b39c0138936360283525eee42a04daeb9dcb..e6749556c168917293618dc9a8ea67b0f01e287c 100644 --- a/docs/en/docs/Virtualization/vm-live-migration.md +++ b/docs/en/docs/Virtualization/vm-live-migration.md @@ -12,11 +12,12 @@ When a VM is running on a physical machine, the physical machine may be overloaded or underloaded due to uneven resource allocation. 
 In addition, operations such as hardware replacement, software upgrade, networking adjustment, and troubleshooting are performed on the physical machine. Therefore, it is important to complete these operations without interrupting services. The VM live migration technology implements load balancing or the preceding operations on the premise of service continuity, improving user experience and work efficiency.
 
 VM live migration is to save the running status of the entire VM and quickly restore the VM to the original or even different hardware platforms. After the VM is restored, it can still run smoothly without any difference to users.
 
 Because the VM data can be stored on the current host or a shared remote storage device, openEuler supports shared and non-shared storage live migration.
 
-## Application Scenarios
+### Application Scenarios
 
 Shared and non-shared storage live migration applies to the following scenarios:
 
 - When a physical machine is faulty or overloaded, you can migrate the running VM to another physical machine to prevent service interruption and ensure normal service running.
 - When most physical machines are underloaded, migrate and integrate VMs to reduce the number of physical machines and improve resource utilization.
 - When the hardware of a physical server becomes a bottleneck, such as the CPU, memory, and hard disk, replace the hardware with better performance or add devices. However, you cannot stop the VM or stop services.
 - Server software upgrade, such as virtualization platform upgrade, allows the VM to be live migrated from the old virtualization platform to the new one.
@@ -26,7 +27,7 @@ Non-shared storage live migration can also be used in the following scenarios:
 - If a physical machine is faulty and the storage space is insufficient, migrate the running VM to another physical machine to prevent service interruption and ensure normal service running.
 - When the storage device of the physical machine is aged, the performance cannot support the current service data processing and becomes the bottleneck of the system performance. In this case, a storage device with higher performance needs to be used, but the VM cannot be shut down or stopped. The running VM needs to be migrated to a physical machine with better performance.
 
-## Precautions and Restrictions
+### Precautions and Restrictions
 
 - During the live migration, ensure that the network is in good condition. If the network is interrupted, live migration is suspended until the network is recovered. If the network connection times out, live migration fails.
 - During the migration, do not perform operations such as VM life cycle management and VM hardware device management.
@@ -36,6 +37,7 @@ Non-shared storage live migration can also be used in the following scenarios:
 - Only homogeneous live migration is supported. That is, the CPU models of the source and destination must be the same.
 - A VM can be successfully migrated across service network segments. However, network exceptions may occur after the VM is migrated to the destination. To prevent this problem, ensure that the service network segments to be migrated are the same.
- If the number of vCPUs on the source VM is greater than that on the destination physical machine, the VM performance will be affected after the migration. Ensure that the number of vCPUs on the destination physical machine is greater than or equal to that on the source VM. +- Passthrough live migration is supported only for KAE components. Precautions for live migration of non-shared storage: @@ -61,13 +63,13 @@ Procedure: For example, if the VM name is **openEulerVM** and the calculation time is 1s, run the following command: -```shell +```bash virsh qemu-monitor-command openEulerVM '{"execute":"calc-dirty-rate", "arguments": {"calc-time": 1}}' ``` After 1s, run the following command to query the dirty page change rate: -```shell +```bash virsh qemu-monitor-command openEulerVM '{"execute":"query-dirty-rate"}' ``` @@ -77,22 +79,22 @@ Before live migration, run the **virsh migrate-setmaxdowntime** command to spe For example, to set the maximum downtime of the VM named **openEulerVM** to **500 ms**, run the following command: -```shell -virsh migrate-setmaxdowntime openEulerVM 500 +```bash +# virsh migrate-setmaxdowntime openEulerVM 500 ``` In addition, you can run the **virsh migrate-setspeed** command to limit the bandwidth occupied by VM live migration. This prevents VM live migration from occupying too much bandwidth and affecting other VMs or services on the host. This operation is also optional for live migration. For example, to set the live migration bandwidth of the VM named **openEulerVM** to **500 Mbit/s**, run the following command: -```shell -virsh migrate-setspeed openEulerVM --bandwidth 500 +```bash +# virsh migrate-setspeed openEulerVM --bandwidth 500 ``` You can run the **migrate-getspeed** command to query the maximum bandwidth during VM live migration. -```shell -$ virsh migrate-getspeed openEulerVM +```bash +# virsh migrate-getspeed openEulerVM 500 ``` @@ -106,14 +108,14 @@ You can use migrate-set-parameters to set parameters related to live migration. For example, set the live migration algorithm of the VM named _openEulerVM_ to zstd and retain the default values for other parameters. -```shell -virsh qemu-monitor-command openeulerVM '{ "execute": "migrate-set-parameters", "arguments": {"compress-method": "zstd"}}' +```bash +# virsh qemu-monitor-command openeulerVM '{ "execute": "migrate-set-parameters", "arguments": {"compress-method": "zstd"}}' ``` You can run the query-migrate-parameters command to query parameters related to live migration. -```shell -$ virsh qemu-monitor-command openeulerVM '{ "execute": "query-migrate-parameters"}' --pretty +```bash +# virsh qemu-monitor-command openeulerVM '{ "execute": "query-migrate-parameters"}' --pretty { "return": { @@ -148,8 +150,8 @@ $ virsh qemu-monitor-command openeulerVM '{ "execute": "query-migrate-parameters 1. Check whether the storage device is shared. - ```shell - $ virsh domblklist + ```bash + # virsh domblklist Target Source -------------------------------------------- sda /dev/mapper/open_euleros_disk @@ -162,8 +164,8 @@ $ virsh qemu-monitor-command openeulerVM '{ "execute": "query-migrate-parameters For example, run the **virsh migrate** command to migrate VM **openEulerVM** to the destination host. - ```shell - virsh migrate --live --unsafe openEulerVM qemu+ssh:///system + ```bash + # virsh migrate --live --unsafe openEulerVM qemu+ssh:///system ``` **** indicates the IP address of the destination host. Before live migration, SSH authentication must be performed to obtain the source host management permission. 
@@ -197,14 +199,14 @@ $ virsh qemu-monitor-command openeulerVM '{ "execute": "query-migrate-parameters Before live migration, create a virtual disk file in the same disk directory on the destination host. Ensure that the disk format and size are the same. - ```shell - qemu-img create -f qcow2 /mnt/sdb/openeuler/openEulerVM.qcow2 20G + ```bash + # qemu-img create -f qcow2 /mnt/sdb/openeuler/openEulerVM.qcow2 20G ``` 2. Run the **virsh migrate** command on the source to perform live migration. During the migration, the storage is also migrated to the destination. - ```shell - $ virsh migrate --live --unsafe --copy-storage-all --migrate-disks sda \ + ```bash + # virsh migrate --live --unsafe --copy-storage-all --migrate-disks sda \ openEulerVM qemu+ssh:///system ``` @@ -232,13 +234,13 @@ $ virsh qemu-monitor-command openeulerVM '{ "execute": "query-migrate-parameters Single-channel live migration encryption transmission command -```shell +```bash virsh migrate --live --unsafe --tls --domain openEulerVM --desturi qemu+tcp:///system --migrateuri tcp:// ``` Encrypted transmission commands for multi-channel live migration -```shell +```bash virsh migrate --live --unsafe --parallel --tls --domain openEulerVM --desturi qemu+tcp:///system --migrateuri tcp:// ``` @@ -352,3 +354,31 @@ if conn is not None: 4. How to Use Enable KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE in the kernel. + +### Live Migration Operations (KAE Passthrough) + +1. Overview + + The Kunpeng Accelerator Engine (KAE) is a hardware acceleration solution based on Kunpeng 920 processors. It contains HPRE, SEC, and ZIP components for encryption and decryption, and compression and decompression. KAE can reduce processor consumption and boost processor efficiency. For KAE components, live migration of VMs with passthrough Virtual Function (VF) is supported. After enabling VF, and assigning it to a VM via passthrough, you can migrate the VM with VF to another server without interrupting services. To use this feature, you need to set the **migration** parameter of the KAE passthrough component to **on** in the **xml** file. + +2. Application Scenarios + + KAE components and live migration are required for VMs. + +3. Precautions + + Passthrough live migration is supported only for KAE components. + + The live migration performance is optimal if all KAE passthrough components are located on the same NUMA node. + +4. How to Use + + In the **xml** file, set the **migration** parameter of the KAE passthrough component to **on**. + + ```xml + + + +
+ + \ No newline at end of file diff --git a/docs/en/docs/oeAware/oeAware-user-guide.md b/docs/en/docs/oeAware/oeAware-user-guide.md new file mode 100644 index 0000000000000000000000000000000000000000..06e91e5b845d4318ad9f650926a5bc3e809f61a5 --- /dev/null +++ b/docs/en/docs/oeAware/oeAware-user-guide.md @@ -0,0 +1,744 @@ +# oeAware User Guide + +## Overview + +oeAware is a framework that provides low-load collection, sensing, and tuning upon detecting defined system behaviors on openEuler. The framework divides the tuning process into three layers: collection, sensing, and tuning. The three layers are developed as plugins and associated with each other through subscription, overcoming the limitations of traditional tuning features that run independently and are statically enabled or disabled. + +## Installation + +Configure the openEuler Yum repository and run the `yum` commands to install oeAware. oeAware is installed by default on openEuler 22.03 LTS SP4. + +```shell +yum install oeAware-manager +``` + +## Usage + +Start the oeaware service. Use the `oeawarectl` command to control the service. + +### Service initiation + +Run the `systemd` command to start the service. oeAware is started by default after the installation. + +```shell +systemctl start oeaware +``` + +### Configuration file + +Configuration file path: **/etc/oeAware/config.yaml** + +```yaml +log_path: /var/log/oeAware # Log storage path +log_level: 1 # Log levels. 1: DEBUG 2: NFO 3: WARN 4: ERROR +enable_list: # The plugin is enabled by default. + - name: libtest.so # Configure the plugin and enable all instances of the plugin. + - name: libtest1.so # Configure plugin instances and enable these plugin instances. + instances: + - instance1 + - instance2 + ... + ... +plugin_list: # Plugins you can download. + - name: test # The name must be unique. If duplicated, the first entry is used. + description: hello world + url: https://gitee.com/openeuler/oeAware-manager/raw/master/README.md # url must not be empty. + ... +``` + +After modifying the configuration file, run the following command to restart the service: + +```shell +systemctl restart oeaware +``` + +### Plugin Description + +**Plugin definition**: Each plugin corresponds to a .so file. Plugins are classified into collection plugins, sensing plugins, and tuning plugins. + +**Instance definition**: Instances are basic units of service scheduling. A plugin contains multiple instances. For example, a collection plugin includes multiple collection items, and each collection item is an instance. + +### Plugin Loading + +By default, the service loads the plugins from the plugin storage path. + +Plugin path: **/usr/lib64/oeAware-plugin/** + +You can also manually load the plugins. + +```shell +oeawarectl -l | --load +``` + +Example + +```shell +[root@localhost ~]# oeawarectl -l libthread_collect.so +Plugin loaded successfully. +``` + +If the operation fails, an error description is returned. + +### Plugin Unloading + +```shell +oeawarectl -r | --remove +``` + +Example + +```shell +[root@localhost ~]# oeawarectl -r libthread_collect.so +Plugin remove successfully. +``` + +If the operation fails, an error description is returned. + +### Plugin Query + +#### Querying the plugin status + +```shell +oeawarectl -q # Query all loaded plugins. +oeawarectl --query # Query a specified plugin. +``` + +Example + +```shell +Show plugins and instances status. 
+------------------------------------------------------------ +libpmu.so + pmu_counting_collector(available, close, count: 0) + pmu_sampling_collector(available, close, count: 0) + pmu_spe_collector(available, close, count: 0) + pmu_uncore_collector(available, close, count: 0) +libsystem_collector.so + thread_collector(available, close, count: 0) + kernel_config(available, close, count: 0) + command_collector(available, close, count: 0) + env_info_collector(available, close, count: 0) + net_interface_info(available, close, count: 0) +libdocker_collector.so + docker_collector(available, close, count: 0) +libub_tune.so + unixbench_tune(available, close, count: 0) +libsystem_tune.so + stealtask_tune(available, close, count: 0) + dynamic_smt_tune(available, close, count: 0) + smc_tune(available, close, count: 0) + xcall_tune(available, close, count: 0) + transparent_hugepage_tune(available, close, count: 0) + seep_tune(available, close, count: 0) + preload_tune(available, close, count: 0) + binary_tune(available, close, count: 0) + numa_sched_tune(available, close, count: 0) + realtime_tune(available, close, count: 0) + net_hard_irq_tune(available, close, count: 0) + multi_net_path_tune(available, close, count: 0) +libdocker_tune.so + docker_cpu_burst(available, close, count: 0) + docker_burst(available, close, count: 0) +libthread_scenario.so + thread_scenario(available, close, count: 0) +libanalysis_oeaware.so + hugepage_analysis(available, close, count: 0) + dynamic_smt_analysis(available, close, count: 0) + smc_d_analysis(available, close, count: 0) + xcall_analysis(available, close, count: 0) + net_hirq_analysis(available, close, count: 0) + numa_analysis(available, close, count: 0) + docker_coordination_burst_analysis(available, close, count: 0) + microarch_tidnocmp_analysis(available, close, count: 0) +------------------------------------------------------------ +format: +[plugin] + [instance]([dependency status], [running status], [enable cnt]) +dependency status: available means satisfying dependency, otherwise unavailable. +running status: running means that instance is running, otherwise close. +enable cnt: number of instances enabled. +``` + +If the operation fails, an error description is returned. + +#### Querying Tuning Instance Information + +```shell +oeawarectl --info +``` + +Displays the description information and running status of the tunning instance. + +#### Querying the Subscription Relationship of Running Instances + +```shell +oeawarectl -Q # Query the subscription relationship diagram of all running instances. +oeawarectl --query-dep= # Query the subscription relationship diagram of the running instances. +``` + +The **dep.png** file is generated in the current directory, showing the subscription relationship. + +The subscription relationship is displayed only when the instances are running. + +Example + +```sh +oeawarectl -e thread_scenario +oeawarectl -Q +``` + +![img](./figures/dep.png) + +### Plugin Instance Enablement + +#### Enabling a Plugin Instance + +```shell +oeawarectl -e | --enable +``` + +If a plugin instance is enabled, the topic instance subscribed by the plugin instance is also enabled. + +If the operation fails, an error description is returned. + +Run the following command to query the tuning instance: + +```sh +oeawarectl --info +``` + +Data is provided and analyzed by other plugins and can be obtained through an SDK. 
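+
+As a quick illustration, the following sketch uses the SDK that is described in the "SDK Instructions" section later in this guide to receive data from the thread_collector instance. The API calls and structures (OeInit, OeSubscribe, OeClose, CTopic, DataList) come from that section; the chosen topic and the printed fields are only examples.
+
+```c
+#include <stdio.h>
+#include <unistd.h>
+#include "oe_client.h"
+
+/* Called asynchronously for every update of the subscribed topic. */
+static int OnData(const DataList *dataList)
+{
+    printf("topic %s delivered %llu entries\n",
+           dataList->topic.topicName, dataList->len);
+    return 0;
+}
+
+int main(void)
+{
+    CTopic topic = {"thread_collector", "thread_collector", ""};
+
+    OeInit();                       /* connect to the oeaware service */
+    if (OeSubscribe(&topic, OnData) < 0) {
+        printf("subscribe failed\n");
+    }
+    sleep(10);                      /* receive callbacks for a while */
+    OeClose();                      /* release resources */
+    return 0;
+}
+```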
+ +Enabling an instance with parameters + +```sh +oeawarectl -e xcall_tune # -c [path] The optional parameter -c indicates that the configuration file in the path is used. +oeawarectl -e dynamic_smt_tune # -threshold [number] The optional parameter -threshold is used to set the CPU usage threshold. +oeawarectl -e multi_net_path_tune -ifname [name] # The mandatory parameter -ifname indicates the name of the affinity-bound NIC. The optional parameter -appname indicates the process to which the instance configured applies. The optional parameter -matchip indicates whether to enforce IP address match. By default, this is set to true, indicating that the system will automatically attempt to match the instance to an IP address. The optional parameter -mode indicates that modes 0 and 1 are supported. A specific NIC (supporting ntuple) is required. +oeawarectl -e docker_burst # The parameter -docker_id [id1,id2...] is used to set the Docker ID. The parameter -ratio is used to set the burst ratio. The default value is 20%. +``` + +#### Disabling a Plugin Instance + +```shell +oeawarectl -d | --disable +``` + +If a plugin instance is disabled, the topic instance subscribed by the plugin instance is also disabled. + +If the operation fails, an error description is returned. + +### Plugin Download and Installation + +Run the `--list` command to query the installed plugins and the RPM packages that can be downloaded. + +```shell +oeawarectl --list +``` + +The query result is as follows: + +```shell +Supported Packages: # Packages that can be downloaded +[name1] # A plugin listed in the plugin_list in config +[name2] +... +Installed Plugins: # Installed plugins +[name1] +[name2] +... +``` + +Run the `--install` command to download and install the RPM package. + +```shell +oeawarectl -i | --install # Specifies a package name that can be queried using --list (that is, a package listed under Supported Packages). +``` + +If the operation fails, an error description is returned. + +### Analysis mode + +```sh +oeawarectl analysis -h +usage: oeawarectl analysis [options]... + options + -t|--time set analysis duration in seconds(default 30s), range from 1 to 100. + -r|--realtime show real time report. + -v|--verbose show verbose information. + -h|--help show this help message. + --l1-miss-threshold set l1 tlbmiss threshold. + --l2-miss-threshold set l2 tlbmiss threshold. + --out-path set the path of the analysis report. + --dynamic-smt-threshold set dynamic smt cpu threshold. + --pid set the pid to be analyzed. + --numa-thread-threshold set numa sched thread creation threshold. + --smc-change-rate set smc connections change rate threshold. + --smc-localnet-flow set smc local net flow threshold. + --host-cpu-usage-threshold set host cpu usage threshold. + --docker-cpu-usage-threshold set docker cpu usage threshold. + +``` + +--l1-miss-threshold sets the threshold for l1-tlb-miss rate. If the miss rate goes beyond the threshold, huge pages are recommended. + +--l2-miss-threshold sets the threshold for l2-tlb-miss rate. If the miss rate goes beyond the threshold, huge pages are recommended. + +--out-path sets the output path of the analysis report. + +--dynamic-smt-threshold sets the CPU usage threshold for simultaneous multithreading (SMT). If the CPU usage falls below the threshold, SMT is recommended. + +--numa-thread-threshold sets the threshold for thread creation rate. If the number of threads created per second is greater than the threshold, NUMA tuning is recommended. 
+ +--smc-change-rate sets the threshold for the TCP connection status change rate. If the change rate is lower than the threshold, smc-d is recommended. + +--smc-localnet-flow sets the local network traffic threshold. If the traffic volume exceeds the threshold, smc-d is recommended. + +--host-cpu-usage-threshold sets the host CPU usage threshold. If the host CPU usage is lower than the threshold, docker burst is recommended. + +--docker-cpu-usage-threshold sets the docker CPU usage threshold. If the CPU usage of the docker container is higher than the threshold, docker burst is recommended. + +The threshold parameters are set through the configuration file or command line parameters. + +The configuration file is stored at **/etc/oeAware/analysis_config.yaml** + +```yaml +#default analysis config +timeout: 5 # Client wait timeout +dynamic_smt: + threshold: 40.0 # Value range:[0,100]. + +hugepage: + l1_miss_threshold: 5.0 # Value range:[0,100]. + l2_miss_threshold: 10.0 # Value range:[0,100]. + +numa_analysis: + thread_threshold: 200 # thread count threshold to use numa native schedule. Value must be a non-negative integer. + +smc_d_analysis: + change_rate: 0.1 # Value must be a non-negative number. + local_net_flow: 100 # MB/S + +docker_coordination_burst: + host_cpu_usage_threshold: 45 # Value range:[0,100]. + docker_cpu_usage_threshold: 95 # Value range:[0,100]. + +microarch_tidnocmp: + service_list: + - mysqld # Supported service. The default value is mysqld. + cpu_part: + - 0xd02 +xcall_analysis: + threshold: 5 # Kernel cpu usage, value range:[0,100]. + num: 5 # top num syscall , value must be a non-negative integer. + +``` + +Example + +Run the following command to generate the system analysis report: + +```sh +oeawarectl analysis -t 10 +``` + +The report includes three parts: + +- Data Analysis: analyzes the system performance data based on the system running status. +- Analysis Conclusion: provides the system analysis conclusion. +- Analysis Suggestion: provides the tuning suggestions. + +Note: + +- uncore_ops_num_per_second: indicates the number of memory access operations per second. A value greater than 2,000,000, indicates high memory access intensity. +- remote_access_ratio: indicates the remote access ratio. A ratio greater than 5% indicates a high proportion of the remote access. + +### Help + +Run the `--help` command for help information. + +```shell +usage: oeawarectl [options]... + options + analysis run analysis mode. + -l|--load [plugin] load plugin. + -r|--remove [plugin] remove plugin from system. + -e|--enable [instance] enable the plugin instance. + -d|--disable [instance] disable the plugin instance. + -q query all plugins information. + --query [plugin] query the plugin information. + -Q query all instances dependencies. + --query-dep [instance] query the instance dependency. + --list the list of supported plugins. + --info the list of InfoCmd plugins. + -i|--install [plugin] install plugin from the list. + --help show this help message. +``` + +## Plugin Development Description + +### Basic Data Structure + +```c++ +typedef struct { + char *instanceName; // Instance name + char *topicName; // Topic name + char *params; // Parameters +} CTopic; + +typedef struct { + CTopic topic; + unsigned long long len; // Length of the data array + void **data; // Stored data +} DataList; + +const int OK = 0; +const int FAILED = -1; + +typedef struct { + int code; // If the operation is successful, OK is returned. If the operation fails, FAILED is returned. 
+    char *payload; // Additional information
+} Result;
+
+```
+
+### Instance Base Class
+
+```c++
+namespace oeaware {
+// Instance type.
+const int TUNE = 0b10000;
+const int SCENARIO = 0b01000;
+const int RUN_ONCE = 0b00010;
+class Interface {
+public:
+    virtual Result OpenTopic(const Topic &topic) = 0;
+    virtual void CloseTopic(const Topic &topic) = 0;
+    virtual void UpdateData(const DataList &dataList) = 0;
+    virtual Result Enable(const std::string &param = "") = 0;
+    virtual void Disable() = 0;
+    virtual void Run() = 0;
+protected:
+    std::string name;
+    std::string version;
+    std::string description;
+    std::vector<Topic> supportTopics;
+    int priority;
+    int type;
+    int period;
+};
+}
+```
+
+Each instance is developed by inheriting from the instance base class, implementing the six virtual functions, and assigning values to the seven class attributes.
+
+The instance uses a publish-subscribe pattern, obtaining data through the Subscribe API and publishing data through the Publish API.
+
+### Attributes
+
+| Attribute | Type | Description |
+| --- | --- | --- |
+| name | string | Instance name. |
+| version | string | Instance version (reserved). |
+| description | string | Instance description. |
+| supportTopics | vector\<Topic\> | Supported topics. |
+| priority | int | Instance execution priority (tuning > awareness > collection). |
+| type | int | Instance type, which is identified by bits. The second bit indicates a single-execution instance, the third bit indicates a collection instance, the fourth bit indicates an awareness instance, and the fifth bit indicates a tuning instance. |
+| period | int | Instance execution period, in milliseconds. The value of period is a multiple of 10. |
+
+### APIs
+
+| Function | Parameter | Return Value | Description |
+| --- | --- | --- | --- |
+| Result OpenTopic(const Topic &topic) | topic: topic to be opened | | Opens the specified topic. |
+| void CloseTopic(const Topic &topic) | topic: topic to be closed | | Closes the specified topic. |
+| void UpdateData(const DataList &dataList) | dataList: subscribed data | | When a topic is subscribed to, the topic pushes its data through UpdateData in every period. |
+| Result Enable(const std::string &param = "") | param: reserved for future use | | Enables this instance. |
+| void Disable() | | | Disables the instance. |
+| void Run() | | | Executed in every period. |
+
+### Instance Example
+
+```c++
+#include
+#include
+
+class Test : public oeaware::Interface {
+public:
+    Test() {
+        name = "TestA";
+        version = "1.0";
+        description = "this is a test plugin";
+        supportTopics;
+        priority = 0;
+        type = 0;
+        period = 20;
+    }
+    oeaware::Result OpenTopic(const oeaware::Topic &topic) override {
+        return oeaware::Result(OK);
+    }
+    void CloseTopic(const oeaware::Topic &topic) override {
+    }
+    void UpdateData(const DataList &dataList) override {
+        for (int i = 0; i < dataList.len; ++i) {
+            ThreadInfo *info = static_cast<ThreadInfo *>(dataList.data[i]);
+            INFO(logger, "pid: " << info->pid << ", name: " << info->name);
+        }
+    }
+    oeaware::Result Enable(const std::string &param = "") override {
+        // Subscribe to the thread collection topic; its data arrives through UpdateData.
+        Subscribe(oeaware::Topic{"thread_collector", "thread_collector", ""});
+        return oeaware::Result(OK);
+    }
+    void Disable() override {
+    }
+    void Run() override {
+        // Publish one integer under the "test" topic in every period.
+        DataList dataList;
+        oeaware::SetDataListTopic(&dataList, "test", "test", "");
+        dataList.len = 1;
+        dataList.data = new void *[1];
+        dataList.data[0] = &pubData;
+        Publish(dataList);
+    }
+private:
+    int pubData = 1;
+};
+
+extern "C" void GetInstance(std::vector<std::shared_ptr<oeaware::Interface>> &interfaces)
+{
+    interfaces.emplace_back(std::make_shared<Test>());
+} +``` + +## Internal Plugin + +### libpmu.so + +| Instance Name| Architecture| Description| topic | +| --- | --- | --- | --- | +| pmu_counting_collector | aarch64 | Collect count events.|cycles,net:netif_rx,L1-dcache-load-misses,L1-dcache-loads,L1-icache-load-misses,L1-icache-loads,branch-load-misses,branch-loads,dTLB-load-misses,dTLB-loads,iTLB-load-misses,iTLB-loads,cache-references,cache-misses,l2d_tlb_refill,l2d_cache_refill,l1d_tlb_refill,l1d_cache_refill,inst_retired,instructions | +| pmu_sampling_collector | aarch64 | Collect sample events.| cycles, skb:skb_copy_datagram_iovec,net:napi_gro_receive_entry | +| pmu_spe_collector | aarch64 | Collecting spe events.| spe | +| pmu_uncore_collector | aarch64 | Collects uncore events.| uncore | + +#### Constraints + +The collection of spe events depends on the hardware capability. This plugin relies on the BIOS SPE feature. Before running the plugin, you need to enable the SPE. + +Run **perf list | grep arm_spe** to check whether SPE is enabled. If SPE is enabled, the following information is displayed: + +```sh +arm_spe_0// [Kernel PMU event] +``` + +If not, perform the following steps to enable it: + +Go to MISC Config --> SPE in the BIOS. If the SPE is set to Disable, switch it to Enable. If you cannot find this option, the BIOS version may be outdated. + +Enter the system **vim /boot/efi/EFI/ openEuler /grub.cfg**, locate the startup item corresponding to the kernel version, and add **kpti=off** to the end of the startup item. For example: + +```sh +linux /vmlinuz-4.19.90-2003.4.0.0036.oe1.aarch64 root=/dev/mapper/openeuler-root ro rd.lvm.lv=openeuler/root rd.lvm.lv=openeuler/swap video=VGA-1:640x480-32@60me rhgb quiet smmu.bypassdev=0x1000:0x17 smmu.bypassdev=0x1000:0x15 crashkernel=1024M,high video=efifb:off video=VGA-1:640x480-32@60me kpti=off +``` + +Press **Esc**, type **:wq**, and press **Enter** to save the change and exit. Run the **reboot** command to restart the server. 
+ +### libsystem_collector.so + +System information collection plugin + +| Instance name| Architecture| Description| topic | +| --- | --- | --- | --- | +| thread_collector | aarch64/x86 | Collect system thread information.| thread_collector | +| kernel_config | aarch64/x86| Collect kernel parameters, including all sysctl parameters, lscpu, and meminfo.| get_kernel_config,get_cmd,set_kernel_config | +| command_collector | aarch64/x86 | Collect sysstat data.| mpstat,iostat,vmstat,sar,pidstat | +| net_interface_info | aarch64/x86 | Collect network information.| base,driver,local_net_affinity,net_thread_que_data | +| env_info_collector | aarch64/x86 | Collect system information.| static,realtime,cpu_util | + +### libdocker_collector.so + +Docker information collection plugin + +| Instance name| Architecture| Description| topic | +| --- | --- | --- | --- | +| docker_collector | aarch64/x86 | Collect docker information.| docker_collector | + +### libthread_scenario.so + +Thread awareness plugin + +| Instance name| Architecture| Description| Subscription| +| --- | --- | --- | --- | +| thread_scenario | aarch64/x86 | Obtain the thread information from the configuration file.| thread_collector::thread_collector | + +#### Configuration file + +thread_scenario.conf + +```sh +redis +fstime +fsbuffer +fsdisk +``` + +### libanalysis_oeaware.so + +| Instance name| Architecture| Description| Subscription| +| --- | --- | --- | --- | +| hugepage_analysis | aarch64 | Analyze whether to recommend huge pages.| pmu_counting_collector::l1d_tlb,pmu_counting_collector::l1d_tlb_refill,pmu_counting_collector::l1i_tlb,pmu_counting_collector::l1i_tlb_refill,pmu_counting_collector::l2d_tlb,pmu_counting_collector::l2d_tlb_refill,pmu_counting_collector::l2i_tlb,pmu_counting_collector::l2i_tlb_refill | +| dynamic_smt_analysis | aarch64 | Analyze whether to recommend SMT.| env_info_collector::cpu_util | +| smc_d_analysis | aarch64 | Analyze whether to recommend SMC-D| | +| xcall_analysis | aarch64 | Analyze whether to recommend xcall.| env_info_collector::cpu_util,thread_collector::thread_collector | +| net_hirq_analysis | aarch64 | Analyze whether to recommend NIC interrupt tuning.| pmu_sampling_collector::net:napi_gro_receive_entry | +| numa_analysis | aarch64 | Analyze whether to recommend NUMA tuning.| pmu_counting_collector::sched:sched_process_fork,pmu_counting_collector::sched:sched_process_exit,pmu_uncore_collector::uncore | +| docker_coordination_burst_analysis | aarch64 | Analyze whether to recommend docker burst.| env_info_collector::cpu_util, pmu_sampling_collector::cycles,docker_collector::docker_collector | +| microarch_tidnocmp_analysis | aarch64 | Analyze whether to recommend microarchitecture tuning.| thread_collector::thread_collector | + +### libsystem_tune.so + +System tuning plug-in + +| Instance name| Architecture| Description| Subscription| +| --- | --- | --- | --- | +| stealtask_tune | aarch64 | In high-load scenarios, the lightweight search algorithm quickly balances loads across multiple cores, optimizing CPU efficiency.| None| +| smc_tune | aarch64 | Enable SMC acceleration, to provide transparent acceleration for TCP connections.| None| +| xcall_tune | aarch64 | Reduce system call noise to improve system performance.| thread_collector::thread_collector | +| seep_tune | aarch64 | Enable the intelligent power mode to reduce system power consumption.| None| +| transparent_hugepage_tune | aarch64/x86 | Enable transparent huge pages to reduce the tlb-miss rate.| None| +| preload_tune | aarch64 | Load 
dynamic libraries seamlessly.| None| +| dynamic_smt_tune | aarch64 | Enable SMT.| 无 | +| binary_tune | aarch64 | Optimize process scheduling in containers.| env_info_collector::static, env_info_collector::realtime, thread_collector::thread_collector, docker_collector::docker_collector| +| numa_sched_tune | arrch64 | Optimize thread scheduling through native NUMA scheduling.| None| +| net_hard_irq_tune | aarch64 | Perform NIC interrupt tunning to improve network program performance.| env_info_collector::static, env_info_collector::cpu_util, net_interface_info::base::operstate_up, net_interface_info::driver::operstate_up, pmu_sampling_collector::net:napi_gro_receive_entry, pmu_sampling_collector::skb:skb_copy_datagram_iovec, pmu_sampling_collector::cycles, env_info_collector::net_thread_que_data::thread_recv_que_cnt| +| multi_net_path_tune | aarch64 | Perform NIC multipath tunning to improve network program performance.| None| + +#### Configuration file + +xcall.yaml + +``` yaml +redis: # Thread name + - xcall_1: 1 # xcall_1 indicates the xcall tunning method. Currently, only xcall_1 is supported, where 1 indicates the system call to be optimized. + - xcall_2: 22 # Only epoll_pwait is supported. +... +``` + +preload.yaml + +Path: **/etc/oeAware/preload.yaml** + +```yaml +- appname: "" + so: "" +``` + +Run the **oeawarectl -e preload_tune** command to load the SO to the corresponding process based on the configuration file. + +#### Constraints + +xcall_tune depends on kernel features. You need to enable FAST_SYSCALL to compile the kernel and add the **xcall** field to cmdline. + +### libub_tune.so + +UnixBench tuning plugin + +| Instance name| Architecture| Description| Subscription| +| --- | --- | --- | --- | +| unixbench_tune | aarch64/x86 | Reduce remote memory access to optimize the UB performance.| thread_collector::thread_collector | + +### libdocker_tune.so + +| Instance name| Architecture| Description| Subscription| +| --- | --- | --- | --- | +| docker_cpu_burst | aarch64 | CPUBurst can temporarily provide additional CPU resources for containers to alleviate performance bottlenecks caused by CPU limits when burst loads occur.| pmu_counting_collector::cycles,docker_collector::docker_collector | + +## External Plugin + +You can use the following command to install an external plugin, for example, the numafast plugin. + +```sh +oeawarectl -i numafast +``` + +### libscenario_numa.so + +| Instance name| Architecture| Description| Subscription| topic | +| --- | --- | --- | --- | --- | +| scenario_numa | aarch64 | Obtains the cross-NUMA memory access ratio in the current environment. It is used by instances or SDKs through subscription (and cannot be enabled independently).| pmu_uncore_collector::uncore | system_score | + +### libtune_numa.so + +| Instance name| Architecture| Description| Subscription| +| --- | --- | --- | --- | +| tune_numa_mem_access | aarch64 | Periodically migrates threads and memory to reduce cross-NUMA memory access.| scenario_numa::system_score, pmu_spe_collector::spe, pmu_counting_collector::cycles | + +## SDK Instructions + +```C +typedef int(*Callback)(const DataList *); +int OeInit(); // Initialize resources and establish a connection with the server. +int OeSubscribe(const CTopic *topic, Callback callback); // Subscribe to a topic and execute the callback asynchronously. +int OeUnsubscribe(const CTopic *topic); // Unsubscribe from a topic. +int OePublish(const DataList *dataList); // Publish data to the server. +void OeClose(); // Release resources. 
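+
+/* Typical call order (see the example below): call OeInit() once to connect,
+   OeSubscribe() for each topic of interest, and OeUnsubscribe()/OeClose() when
+   the data is no longer needed. */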
+``` + +**Example** + +```C +#include "oe_client.h" +#include "command_data.h" +int f(const DataList *dataList) +{ + int i = 0; + for (; i < dataList->len; i++) { + CommandData *data = (CommandData*)dataList->data[i]; + for (int j = 0; j < data->attrLen; ++j) { + printf("%s ", data->itemAttr[j]); + } + printf("\n"); + } + return 0; +} +int main() { + OeInit(); + CTopic topic = { + "command_collector", + "sar", + "-q 1", + }; + if (OeSubscribe(&topic, f) < 0) { + printf("failed\n"); + } else { + printf("success\n"); + } + sleep(10); + OeClose(); +} +``` + +## Constraints + +### Function constraints + +By default, oeAware integrates the Arm microarchitecture profiling module libkperf. This module can only be accessed by one process at a time. If other processes or tools (such as perf) attempt to use it simultaneously, conflicts may occur. + +### Operation constraints + +oeAware only allows operations by users in the root group, while the SDK allows operations by users in both the root and oeaware groups. + +## Precautions + +oeAware performs strict validation on the configuration files, plugin user groups, and permissions. Do not modify the permissions or user group settings of any oeAware-related files. + +Permission description: + +- Plugin files: 440 + +- Client executable file: 750 + +- Server executable file: 750 + +- Service configuration file: 640