# microbench

- **Repository Path**: pomelocoder/microbench
- **Source**: https://gitee.com/tinylab/riscv-linux/tree/master/test/microbench
- **Default Branch**: master
- **Created**: 2023-09-21
- **Last Updated**: 2023-09-21

This benchmark measures the time cost of core CPU instructions and of core instruction combinations. Its goals are to:

1. Help developers understand notable code snippets in existing kernels and applications.
2. Reveal the shortcomings of a CPU design and guide the next-generation design.
3. Guide the direction of software optimization at the instruction level.

It is based on the Google [benchmark](https://github.com/google/benchmark.git) framework.

## Installation

Please install `make`, `gcc`, `g++` and `cmake` first.

If cpufreq is supported, make sure the CPU frequency is locked at a fixed level (base frequency or max frequency) before testing:

    $ sudo -s
    # echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

or

    # echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed

## Usage

Run it for x86_64.


    $ make logging
    benchmark/build/test/x86_64
    2022-03-21T22:55:56+08:00
    Running benchmark/build/test/x86_64
    Run on (3 X 1992 MHz CPU s)
    CPU Caches:
      L1 Data 32 KiB (x3)
      L1 Instruction 32 KiB (x3)
      L2 Unified 256 KiB (x3)
      L3 Unified 8192 KiB (x3)
    Load Average: 1.74, 1.13, 1.05
    ------------------------------------------------------------------
    Benchmark                         Time             CPU  Iterations
    ------------------------------------------------------------------
    BM_nop                        0.288 ns        0.277 ns  1000000000
    BM_ub                         0.974 ns        0.972 ns   708061514
    BM_bnez                       0.992 ns        0.991 ns   666291716
    BM_beqz                        1.03 ns         1.03 ns   666640065
    BM_load_bnez                  0.565 ns        0.563 ns  1000000000
    BM_load_beqz                  0.863 ns        0.859 ns   805747354
    BM_cache_miss_load_bnez        1.34 ns        0.334 ns  1000000000
    BM_cache_miss_load_beqz        2.31 ns        0.326 ns  1000000000

## Run it for the other architectures

First create a new `test/$(ARCH).cc`, for example:

    $ cp test/x86_64.cc test/aarch64.cc

Then refer to the target ISA specification and customize the instructions in `test/$(ARCH).cc`.

Finally, copy the whole microbench directory to the target machine (in this example, one with an AArch64 CPU) and run it:

    $ make

To log the results:

    $ make logging

To log the results when the product or CPU model cannot be fetched automatically:

    $ make logging PRODUCT=product-name CPUMODEL=cpu-name

If the detected CPU frequency is wrong, please correct it manually.

## Run with or without loop optimization

By default, the iterations optimization is enabled. To disable or enable it explicitly:

    // prevent iterations optimization
    $ make O=0

    // allow iterations optimization
    $ make O=1

## Static compiling

To run on another embedded system, compile it statically:

    $ make clean
    $ make STATIC=1

## Cross compiling

Cross compiling for a target architecture is straightforward:

    $ make clean
    $ make ARCH=riscv64 clean
    $ make ARCH=riscv64
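
## Converting results to cycles

The sample x86_64 results in the Usage section are reported in nanoseconds. When comparing machines that run at different clock frequencies, it can help to express them in clock cycles instead. The following is a minimal Python sketch and is not part of microbench itself. `parse_rows` and `ns_to_cycles` are hypothetical helper names, the regular expression assumes every row is reported in `ns`, and the 1992 MHz figure is taken from the `Run on (3 X 1992 MHz CPU s)` line of the sample log, which is only meaningful if the frequency was actually locked as described under Installation.

```python
import re

# Matches result rows such as "BM_nop   0.288 ns   0.277 ns   1000000000".
# Assumption: all times are printed in nanoseconds, as in the sample log.
ROW = re.compile(r"^(BM_\w+)\s+([\d.]+) ns\s+([\d.]+) ns\s+(\d+)$")


def parse_rows(text):
    """Return {benchmark name: CPU time in ns} for each result row in text."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            # Group 3 is the "CPU" column; group 2 would be wall-clock "Time".
            results[match.group(1)] = float(match.group(3))
    return results


def ns_to_cycles(ns, mhz):
    """Convert a duration in nanoseconds to clock cycles at mhz megahertz."""
    return ns * mhz / 1000.0


# Two rows copied from the sample x86_64 log in the Usage section.
SAMPLE = """\
BM_nop                        0.288 ns        0.277 ns  1000000000
BM_ub                         0.974 ns        0.972 ns   708061514
"""

if __name__ == "__main__":
    for name, ns in parse_rows(SAMPLE).items():
        print(f"{name}: {ns_to_cycles(ns, 1992):.2f} cycles")
```

The sketch reads the CPU column rather than the wall-clock Time column because the two diverge noticeably in the `BM_cache_miss_*` rows. Which column is more appropriate depends on what is being compared.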