# microbench

**Repository Path**: pomelocoder/microbench

## Basic Information

- **Project Name**: microbench
- **Description**: Designed to measure the time cost of the core CPU instructions or the core combined instructions, from https://gitee.com/tinylab/riscv-linux/tree/master/test/microbench
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 2
- **Created**: 2023-09-21
- **Last Updated**: 2023-09-21

## README


# microbench

This benchmark measures the time cost of core CPU instructions and of common combinations of those instructions.

Its goals are to:

1. help developers understand well-crafted kernel and application code snippets.
2. reveal the shortcomings of a CPU design and guide the next-generation design.
3. guide software optimization at the instruction level.

It is based on the Google [benchmark](https://github.com/google/benchmark.git) framework.

## Installation

Please install `make`, `gcc`, `g++` and `cmake` first.

If cpufreq is supported, lock the CPU frequency at a fixed level (base frequency or max frequency) before testing. The commands below configure `cpu0` only; repeat them for every CPU the benchmark may run on.

    $ sudo -s
    # echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    or
    # echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
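
The same governor setting can be applied to every CPU with a small loop. The following is a minimal sketch, not part of microbench: the helper name `set_governor` and its optional second argument (the sysfs CPU directory, defaulting to `/sys/devices/system/cpu`) are assumptions made for illustration.

```shell
# set_governor GOVERNOR [CPU_ROOT]
# Hypothetical helper (not part of microbench): writes GOVERNOR to the
# scaling_governor file of every cpuN directory found under CPU_ROOT.
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip the literal pattern when no CPU exposes cpufreq.
        [ -e "$f" ] || continue
        echo "$governor" > "$f" || return 1
    done
}
```

Run as root, `set_governor performance` would then cover all CPUs rather than `cpu0` alone.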

## Usage

Run it on x86_64 (sample output shown):

    $ make logging
    benchmark/build/test/x86_64
    2022-03-21T22:55:56+08:00
    Running benchmark/build/test/x86_64
    Run on (3 X 1992 MHz CPU s)
    CPU Caches:
      L1 Data 32 KiB (x3)
      L1 Instruction 32 KiB (x3)
      L2 Unified 256 KiB (x3)
      L3 Unified 8192 KiB (x3)
    Load Average: 1.74, 1.13, 1.05
    ------------------------------------------------------------------
    Benchmark			 Time		  CPU	Iterations
    ------------------------------------------------------------------
    BM_nop			     0.288 ns	     0.277 ns	1000000000
    BM_ub			     0.974 ns	     0.972 ns	 708061514
    BM_bnez			     0.992 ns	     0.991 ns	 666291716
    BM_beqz			      1.03 ns	      1.03 ns	 666640065
    BM_load_bnez		     0.565 ns	     0.563 ns	1000000000
    BM_load_beqz		     0.863 ns	     0.859 ns	 805747354
    BM_cache_miss_load_bnez	      1.34 ns	     0.334 ns	1000000000
    BM_cache_miss_load_beqz	      2.31 ns	     0.326 ns	1000000000
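
Because the results are reported in nanoseconds, it can help to convert them to CPU cycles when comparing machines. The following is a minimal sketch under the assumption that the CPU really ran at the frequency shown in the `Run on` line; the helper name `ns_to_cycles` is hypothetical and not part of microbench. It computes cycles = ns × MHz / 1000.

```shell
# ns_to_cycles NS MHZ
# Hypothetical helper: converts a time in nanoseconds to CPU cycles,
# assuming the CPU ran at a fixed MHZ for the whole measurement.
ns_to_cycles() {
    LC_ALL=C awk -v ns="$1" -v mhz="$2" \
        'BEGIN { printf "%.3f\n", ns * mhz / 1000 }'
}
```

For the `BM_nop` row above (0.288 ns at 1992 MHz), `ns_to_cycles 0.288 1992` prints `0.574`.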

## Run it for the other architectures

First, create a new `test/$(ARCH).cc`, for example:

    $ cp test/x86_64.cc test/aarch64.cc

Then, refer to the target ISA specification and customize the instructions in `test/$(ARCH).cc`.

Finally, copy the whole microbench directory to the target machine (one with an AArch64 CPU in this example) and run it:

    $ make

To log the results:

    $ make logging

To log the results when the product or CPU model cannot be fetched automatically:

    $ make logging PRODUCT=product-name CPUMODEL=cpu-name
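
One way to obtain a value for `CPUMODEL` by hand is to read it from `/proc/cpuinfo`. The following is a minimal sketch and not how the microbench Makefile detects the model: the helper name `cpu_model` is hypothetical, and it assumes a `model name` field, which x86 kernels provide but many other architectures do not.

```shell
# cpu_model [CPUINFO_FILE]
# Hypothetical helper: prints the first "model name" value found in
# CPUINFO_FILE (default /proc/cpuinfo), or "unknown" if there is none.
cpu_model() {
    cpuinfo=${1:-/proc/cpuinfo}
    model=$(sed -n 's/^model name[[:space:]]*:[[:space:]]*//p' \
        "$cpuinfo" 2>/dev/null | head -n 1)
    echo "${model:-unknown}"
}
```

The result could then be passed as `make logging CPUMODEL="$(cpu_model)"`; when it prints `unknown`, the name has to be supplied manually.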

If the detected CPU frequency is wrong, please correct it manually.

## Run with or without loop optimization

By default, the loop (iterations) optimization is enabled. To disable or enable it explicitly:

    // prevent iterations optimization
    $ make O=0

    // allow iterations optimization
    $ make O=1

## Static compiling

To run it on another system, such as an embedded one, compile it statically:

    $ make clean
    $ make STATIC=1

## Cross compiling

To cross compile for a target architecture, for example riscv64:

    $ make clean
    $ make ARCH=riscv64 clean
    $ make ARCH=riscv64