NMSIS Bench and Test Helper Functions

group NMSIS Bench and Test Related Helper Functions

Functions used for benchmarking and test suites.

The NMSIS benchmark and test helper functions help you benchmark code and assert pass/fail results in test cases.

To measure the CPU cycle cost of a process, use the BENCH_xxx macros defined here.

In a single C source file, include nmsis_bench.h, then place BENCH_DECLARE_VAR(); before calling any other BENCH_xxx macro. To start benchmarking, call BENCH_INIT(); only once in your source code, then place BENCH_START(proc_name); and BENCH_END(proc_name); before and after the process you want to measure. See <nuclei-sdk>/application/baremetal/demo_dsp for a usage example.
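The sequence described above can be sketched as follows. This is a minimal illustration, not code taken from the SDK: USE_NMSIS_BENCH, sum_upto and bench_sum_demo are hypothetical names, and the host-only stand-in macros are assumptions that exist solely so the sketch compiles without the real header. On a Nuclei target, define USE_NMSIS_BENCH so that nmsis_bench.h supplies the real macros.

```c
#include <stdio.h>

#ifdef USE_NMSIS_BENCH   /* hypothetical switch: define it when building for the target */
#include "nmsis_bench.h"
#else
/* Host-only stand-ins (assumptions, NOT the NMSIS definitions). */
#include <time.h>
#define BENCH_DECLARE_VAR() static unsigned long demo_start_tick
#define BENCH_INIT()        (demo_start_tick = 0)
#define BENCH_START(proc)   (demo_start_tick = (unsigned long)clock())
#define BENCH_END(proc)     printf("%s took %lu host clock ticks\n", #proc, \
                                   (unsigned long)clock() - demo_start_tick)
#endif

/* Declare the benchmark variables once per source file, before any other BENCH_xxx macro. */
BENCH_DECLARE_VAR();

/* Hypothetical workload: sum the integers 0..n-1. */
static unsigned long sum_upto(unsigned long n)
{
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; i++) {
        acc += i;
    }
    return acc;
}

/* Measure one call of the workload and return its result. */
unsigned long bench_sum_demo(unsigned long n)
{
    unsigned long result;

    BENCH_INIT();            /* call only once before measuring */
    BENCH_START(sum_upto);   /* placed right before the measured process */
    result = sum_upto(n);
    BENCH_END(sum_upto);     /* placed right after it; prints the cost */
    return result;
}
```

Calling bench_sum_demo(100) from main() runs one measurement and returns the workload result.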

To disable the benchmark calculation, place #define DISABLE_NMSIS_BENCH before including nmsis_bench.h.

In your C test source code, you can add NMSIS_TEST_PASS(); or NMSIS_TEST_FAIL(); to mark the test as passed or failed.

Defines

READ_CYCLE __get_rv_cycle

By default, READ_CYCLE reads the whole 64-bit value of the MCYCLE register.

When XLEN=32, reading the full 64-bit CYCLE register incurs additional overhead.

BENCH_XLEN_MODE skips reading the upper 32 bits, reducing the extra cycle cost and allowing more accurate measurements of small cycle counts.

NOTE: BENCH_XLEN_MODE is only applicable when the total cycle count does not exceed 2^32.
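The 2^32 limit can be illustrated with plain unsigned arithmetic. The helper below is a hypothetical sketch, not part of nmsis_bench.h: it shows that a difference of two 32-bit counter samples is still correct across a single wrap of the counter, but only while the true elapsed count stays below 2^32.

```c
#include <stdint.h>

/* Elapsed cycles computed from two 32-bit samples of the low half of the
 * cycle counter. Unsigned subtraction is performed modulo 2^32, so a wrap
 * between the two samples does not change the result. */
uint32_t elapsed_cycles32(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

For example, start = 0xFFFFFFF0 and end = 0x00000010 yields 0x20 cycles even though the counter wrapped. An interval of 2^32 + 0x20 cycles would yield the same value, which is why longer measurements need the full 64-bit read.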

BENCH_DECLARE_VAR()

Declare the variables required for benchmarking. It needs to be placed above all BENCH_xxx macros in each C source file that uses BENCH_xxx.

BENCH_INIT()

Initialize the benchmark environment. It needs to be called before other BENCH_xxx macros are called.

BENCH_RESET(proc) _bc_sumcyc = 0; _bc_usecyc = 0; _bc_lpcnt = 0; _bc_ercd = 0;

Reset the benchmark sum cycle, use cycle, loop count, and error code for proc.

BENCH_START(proc)

Start the benchmark for proc: record the start cycle and reset the error code.

BENCH_SAMPLE(proc)

Take a benchmark sample for proc: record the cycle cost from start to this sample and accumulate it into the sum cycle.

BENCH_END(proc)

Mark the end of the benchmark for proc, calculate the used cycles, and print them.

BENCH_STOP(proc) printf("CSV, %s, %lu\n", #proc, (unsigned long)_bc_sumcyc);

Mark the stop of the benchmark (start -> sample -> sample -> stop) and print the sum cycle of proc.

BENCH_STAT(proc) printf("STAT, %s, %lu, %lu\n", #proc, (unsigned long)_bc_lpcnt, (unsigned long)_bc_sumcyc);

Show the benchmark statistics in the format: STAT, proc, loopcnt, sumcyc.
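Because the STAT line has a fixed layout, a captured log can be post-processed on a host machine. The sketch below assumes the exact printf format shown for BENCH_STAT; parse_stat_line is a hypothetical helper and not part of NMSIS. It derives the average cycles per loop from one line.

```c
#include <stdio.h>

/* Parse one "STAT, proc, loopcnt, sumcyc" line.
 * proc must point to a buffer of at least 64 bytes.
 * On success, stores the average cycles per loop in *avg and returns 0.
 * Returns -1 if the line does not match or no loop was recorded. */
int parse_stat_line(const char *line, char *proc, unsigned long *avg)
{
    unsigned long loopcnt = 0, sumcyc = 0;

    if (sscanf(line, "STAT, %63[^,], %lu, %lu", proc, &loopcnt, &sumcyc) != 3) {
        return -1;
    }
    if (loopcnt == 0) {
        return -1;  /* no samples, so an average is undefined */
    }
    *avg = sumcyc / loopcnt;
    return 0;
}
```

Lines that start with a different tag, such as the CSV lines printed by BENCH_STOP, are rejected with -1.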

BENCH_GET_USECYC() (_bc_usecyc)

Get benchmark use cycle.

BENCH_GET_SUMCYC() (_bc_sumcyc)

Get benchmark sum cycle.

BENCH_GET_LPCNT() (_bc_lpcnt)

Get benchmark loop count.
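A sampling loop built from these macros might look like the sketch below. It is an illustration under assumptions, not the SDK's code: USE_NMSIS_BENCH, do_work and bench_sampling_demo are hypothetical names, and the host-only stand-ins for BENCH_DECLARE_VAR, BENCH_INIT, BENCH_START and BENCH_SAMPLE are guesses based on the descriptions above. Only BENCH_RESET, BENCH_STAT and the getters are copied from the definitions listed here.

```c
#include <stdio.h>

#ifdef USE_NMSIS_BENCH   /* hypothetical switch: define it when building for the target */
#include "nmsis_bench.h"
#else
/* Host-only stand-ins (assumptions, NOT the NMSIS definitions). */
#include <time.h>
#define BENCH_DECLARE_VAR() static unsigned long _bc_sttcyc, _bc_usecyc, _bc_sumcyc, _bc_lpcnt, _bc_ercd
#define BENCH_INIT()        ((void)0)
#define BENCH_START(proc)   _bc_ercd = 0; _bc_sttcyc = (unsigned long)clock();
#define BENCH_SAMPLE(proc)  _bc_usecyc = (unsigned long)clock() - _bc_sttcyc; \
                            _bc_sumcyc += _bc_usecyc; _bc_lpcnt += 1;
/* The definitions below are copied from the list above. */
#define BENCH_RESET(proc)   _bc_sumcyc = 0; _bc_usecyc = 0; _bc_lpcnt = 0; _bc_ercd = 0;
#define BENCH_STAT(proc)    printf("STAT, %s, %lu, %lu\n", #proc, (unsigned long)_bc_lpcnt, (unsigned long)_bc_sumcyc);
#define BENCH_GET_SUMCYC()  (_bc_sumcyc)
#define BENCH_GET_LPCNT()   (_bc_lpcnt)
#endif

BENCH_DECLARE_VAR();

/* Hypothetical workload; the volatile sink keeps the loop from being optimized away. */
static volatile unsigned long sink;
static void do_work(void)
{
    for (unsigned long i = 0; i < 1000; i++) {
        sink += i;
    }
}

/* Run the workload `loops` times, sampling each run, and return the loop count. */
unsigned long bench_sampling_demo(unsigned int loops)
{
    BENCH_INIT();
    BENCH_RESET(do_work);
    for (unsigned int i = 0; i < loops; i++) {
        BENCH_START(do_work);    /* record the start of this iteration */
        do_work();
        BENCH_SAMPLE(do_work);   /* accumulate this iteration's cost */
    }
    BENCH_STAT(do_work);

    unsigned long lpcnt = (unsigned long)BENCH_GET_LPCNT();
    if (lpcnt != 0) {
        printf("average per loop: %lu\n", (unsigned long)BENCH_GET_SUMCYC() / lpcnt);
    }
    return lpcnt;
}
```

The macros shown in this reference expand to several statements, so the sketch always uses them inside braces rather than as the unbraced body of an if or for.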

BENCH_ERROR(proc) _bc_ercd = 1;

Mark the benchmark for proc as having an error.

BENCH_STATUS(proc)

Show the status of the benchmark.

EVENT_SEL_INSTRUCTION_COMMIT 0
EVENT_SEL_MEMORY_ACCESS 1
EVENT_SEL_TYPE_0 0
EVENT_SEL_TYPE_1 1
EVENT_SEL_TYPE_2 2
EVENT_SEL_TYPE_3 3
EVENT_INSTRUCTION_COMMIT_CYCLE_COUNT 1
EVENT_INSTRUCTION_COMMIT_RETIRED_COUNT 2
EVENT_INSTRUCTION_COMMIT_INTEGER_LOAD 3
EVENT_INSTRUCTION_COMMIT_INTEGER_STORE 4
EVENT_INSTRUCTION_COMMIT_ATOMIC_MEMORY_OPERATION 5
EVENT_INSTRUCTION_COMMIT_SYSTEM 6
EVENT_INSTRUCTION_COMMIT_INTEGER_COMPUTATIONAL 7
EVENT_INSTRUCTION_COMMIT_CONDITIONAL_BRANCH 8
EVENT_INSTRUCTION_COMMIT_TAKEN_CONDITIONAL_BRANCH 9
EVENT_INSTRUCTION_COMMIT_JAL 10
EVENT_INSTRUCTION_COMMIT_JALR 11
EVENT_INSTRUCTION_COMMIT_RETURN 12
EVENT_INSTRUCTION_COMMIT_CONTROL_TRANSFER 13
EVENT_INSTRUCTION_COMMIT_FENCE_INSTRUCTION 14
EVENT_INSTRUCTION_COMMIT_INTEGER_MULTIPLICATION 15
EVENT_INSTRUCTION_COMMIT_INTEGER_DIVISION_REMAINDER 16
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_LOAD 17
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_STORE 18
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_ADDITION_SUBTRACTION 19
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_MULTIPLICATION 20
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_FUSED_MULTIPLY_ADD_SUB 21
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_DIVISION_OR_SQUARE_ROOT 22
EVENT_INSTRUCTION_COMMIT_OTHER_FLOATING_POINT_INSTRUCTION 23
EVENT_INSTRUCTION_COMMIT_CONDITIONAL_BRANCH_PREDICTION_FAIL 24
EVENT_INSTRUCTION_COMMIT_JALR_PREDICTION_FAIL 25
EVENT_INSTRUCTION_COMMIT_POP_PREDICTION_FAIL 26
EVENT_INSTRUCTION_COMMIT_FENCEI_INSTRUCTION 27
EVENT_INSTRUCTION_COMMIT_SFENCE_INSTRUCTION 28
EVENT_INSTRUCTION_COMMIT_ECALL_INSTRUCTION 29
EVENT_INSTRUCTION_COMMIT_EXCEPTION_INSTRUCTION 30
EVENT_INSTRUCTION_COMMIT_INTERRUPT_INSTRUCTION 31
EVENT_MEMORY_ACCESS_ICACHE_MISS 1
EVENT_MEMORY_ACCESS_DCACHE_MISS 2
EVENT_MEMORY_ACCESS_ITLB_MISS 3
EVENT_MEMORY_ACCESS_DTLB_MISS 4
EVENT_MEMORY_ACCESS_MAIN_DTLB_MISS 5
EVENT_MEMORY_ACCESS_MAIN_TLB_MISS 5
EVENT_MEMORY_ACCESS_L2_CACHE_ACCESS 8
EVENT_MEMORY_ACCESS_L2_CACHE_MISS 9
EVENT_MEMORY_ACCESS_MEMORY_BUS_REQUEST 10
EVENT_MEMORY_ACCESS_IFU_STALL_CYCLE 11
EVENT_MEMORY_ACCESS_EXU_STALL_CYCLE 12
EVENT_MEMORY_ACCESS_TIMER 13
EVENT_TYPE_0_CYCLE_COUNT 1
EVENT_TYPE_0_RETIRED_COUNT 2
EVENT_TYPE_0_INTEGER_LOAD 3
EVENT_TYPE_0_INTEGER_STORE 4
EVENT_TYPE_0_ATOMIC_MEMORY_OPERATION 5
EVENT_TYPE_0_SYSTEM 6
EVENT_TYPE_0_INTEGER_COMPUTATIONAL 7
EVENT_TYPE_0_CONDITIONAL_BRANCH 8
EVENT_TYPE_0_TAKEN_CONDITIONAL_BRANCH 9
EVENT_TYPE_0_JAL 10
EVENT_TYPE_0_JALR 11
EVENT_TYPE_0_RETURN 12
EVENT_TYPE_0_CONTROL_TRANSFER 13
EVENT_TYPE_0_FENCE_INSTRUCTION 14
EVENT_TYPE_0_INTEGER_MULTIPLICATION 15
EVENT_TYPE_0_INTEGER_DIVISION_REMAINDER 16
EVENT_TYPE_0_FLOATING_POINT_LOAD 17
EVENT_TYPE_0_FLOATING_POINT_STORE 18
EVENT_TYPE_0_FLOATING_POINT_ADDITION_SUBTRACTION 19
EVENT_TYPE_0_FLOATING_POINT_MULTIPLICATION 20
EVENT_TYPE_0_FLOATING_POINT_FUSED_MULTIPLY_ADD_SUB 21
EVENT_TYPE_0_FLOATING_POINT_DIVISION_OR_SQUARE_ROOT 22
EVENT_TYPE_0_OTHER_FLOATING_POINT_INSTRUCTION 23
EVENT_TYPE_0_CONDITIONAL_BRANCH_PREDICTION_FAIL 24
EVENT_TYPE_0_JALR_PREDICTION_FAIL 25
EVENT_TYPE_0_POP_PREDICTION_FAIL 26
EVENT_TYPE_0_FENCEI_INSTRUCTION 27
EVENT_TYPE_0_SFENCE_INSTRUCTION 28
EVENT_TYPE_0_ECALL_INSTRUCTION 29
EVENT_TYPE_0_EXCEPTION_INSTRUCTION 30
EVENT_TYPE_0_INTERRUPT_INSTRUCTION 31
EVENT_TYPE_1_ICACHE_READ_MISS 1
EVENT_TYPE_1_DCACHE_RW_MISS 2
EVENT_TYPE_1_ITLB_READ_MISS 3
EVENT_TYPE_1_DTLB_RW_MISS 4
EVENT_TYPE_1_MAIN_TLB_MISS 5
EVENT_TYPE_1_L2_CACHE_ACCESS 8
EVENT_TYPE_1_L2_CACHE_MISS 9
EVENT_TYPE_1_MEMORY_BUS_REQUEST 10
EVENT_TYPE_1_IFU_STALL_CYCLE 11
EVENT_TYPE_1_EXU_STALL_CYCLE 12
EVENT_TYPE_1_TIMER 13
EVENT_TYPE_2_BRANCH_INSTRUCTION_COMMIT 2
EVENT_TYPE_2_BRANCH_PREDICT_FAIL_COMMIT 3
EVENT_TYPE_3_DCACHE_READ 0
EVENT_TYPE_3_DCACHE_READ_MISS 1
EVENT_TYPE_3_DCACHE_WRITE 2
EVENT_TYPE_3_DCACHE_WRITE_MISS 3
EVENT_TYPE_3_DCACHE_PREFETCH 4
EVENT_TYPE_3_DCACHE_PREFETCH_MISS 5
EVENT_TYPE_3_ICACHE_READ 6
EVENT_TYPE_3_ICACHE_PREFETCH 8
EVENT_TYPE_3_ICACHE_PREFETCH_MISS 9
EVENT_TYPE_3_L2_CACHE_READ_HIT 10
EVENT_TYPE_3_L2_CACHE_READ_MISS 11
EVENT_TYPE_3_L2_CACHE_WRITE_HIT 12
EVENT_TYPE_3_L2_CACHE_WRITE_MISS 13
EVENT_TYPE_3_L2_CACHE_PREFETCH_HIT 14
EVENT_TYPE_3_L2_CACHE_PREFETCH_MISS 15
EVENT_TYPE_3_DTLB_READ 16
EVENT_TYPE_3_DTLB_READ_MISS 17
EVENT_TYPE_3_DTLB_WRITE 18
EVENT_TYPE_3_DTLB_WRITE_MISS 19
EVENT_TYPE_3_ITLB_READ 20
EVENT_TYPE_3_BTB_READ 22
EVENT_TYPE_3_BTB_READ_MISS 23
EVENT_TYPE_3_BTB_WRITE 24
EVENT_TYPE_3_BTB_WRITE_MISS 25
MSU_EVENT_ENABLE 0x0F
MEVENT_EN 0x08
SEVENT_EN 0x02
UEVENT_EN 0x01
READ_HPM_COUNTER __get_hpm_counter
HPM_DECLARE_VAR(idx)

Declare the variables required for benchmarking with high performance monitor counter idx. It needs to be placed above all HPM_xxx macros in each C source file that uses HPM_xxx.

HPM_SEL_ENABLE(ena) (ena << 28)
HPM_SEL_EVENT(sel, idx) ((sel) | (idx << 4))
HPM_EVENT(sel, idx, ena) (HPM_SEL_ENABLE(ena) | HPM_SEL_EVENT(sel, idx))

Construct an event value to be set (sel -> event_sel, idx -> event_idx, ena -> m/s/u_enable).
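The bit layout produced by HPM_EVENT can be checked with a few lines of C. The function below is a hypothetical equivalent written for illustration: it performs the same arithmetic as the three macros above, using unsigned operands, and the constants are copied from the Defines list.

```c
/* Values copied from the Defines list above. */
#define EVENT_SEL_INSTRUCTION_COMMIT          0
#define EVENT_SEL_MEMORY_ACCESS               1
#define EVENT_INSTRUCTION_COMMIT_CYCLE_COUNT  1
#define EVENT_MEMORY_ACCESS_DCACHE_MISS       2
#define MSU_EVENT_ENABLE                      0x0F
#define MEVENT_EN                             0x08

/* Same arithmetic as HPM_EVENT(sel, idx, ena):
 *   ena goes to bits 28 and up, idx to bits 4 and up, sel to the low bits. */
unsigned long hpm_event_value(unsigned long sel, unsigned long idx, unsigned long ena)
{
    return (ena << 28) | sel | (idx << 4);
}
```

Counting cycles in all of M/S/U mode gives hpm_event_value(EVENT_SEL_INSTRUCTION_COMMIT, EVENT_INSTRUCTION_COMMIT_CYCLE_COUNT, MSU_EVENT_ENABLE) = 0xF0000010, and counting data cache misses in M mode only gives 0x80000021. Note that the macros as listed do not parenthesize ena and idx, so it is safest to pass them plain constants or variables rather than compound expressions.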

HPM_INIT()

Initialize the high performance monitor environment. It needs to be called before other HPM_xxx macros are called.

HPM_RESET(idx, proc, event) __hpm_sumcyc##idx = 0; __hpm_lpcnt##idx = 0;

Reset the high performance benchmark for proc using the counter whose index is idx.

HPM_START(idx, proc, event)

Start the high performance benchmark for proc and record the starting hpm counter value.

HPM_SAMPLE(idx, proc, event)

Take a high performance benchmark sample for proc and add it to the sum counter.

HPM_END(idx, proc, event)

Mark the end of the high performance benchmark for proc and calculate the used hpm counter value.

HPM_STOP(idx, proc, event) printf("HPM%d:0x%x, %s, %lu\n", idx, event, #proc, (unsigned long)__hpm_sumcyc##idx);

Mark the stop of the hpm benchmark (start -> sample -> sample -> stop) and print the sum cycle of proc.

HPM_STAT(idx, proc, event) printf("STATHPM%d:0x%x, %s, %lu, %lu\n", idx, event, #proc, (unsigned long)__hpm_lpcnt##idx, (unsigned long)__hpm_sumcyc##idx);

Show the statistics of the hpm benchmark in the format: STATHPM<idx>:0x<event>, proc, loopcnt, sumcyc.
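When reading captured logs, the event value printed in a STATHPM line can be split back into its selector, event index and enable fields. The sketch below assumes the printf format shown for HPM_STAT and the shifts used by HPM_EVENT; the struct, the function name and the assumed field widths (4 bits for sel, bits 4 to 27 for the event index) are hypothetical.

```c
#include <stdio.h>

/* Fields recovered from one STATHPM line. */
struct hpm_stat {
    int counter;              /* HPM counter index */
    unsigned int event;       /* raw event value as printed */
    unsigned int sel;         /* event_sel: bits 0..3 (assumes sel < 16) */
    unsigned int event_idx;   /* event_idx: bits 4..27 (assumed width) */
    unsigned int enable;      /* m/s/u enable: bits 28..31 */
    char proc[64];            /* benchmarked process name */
    unsigned long loopcnt;
    unsigned long sumcyc;
};

/* Parse one line; returns 0 on success and -1 if the line does not match. */
int parse_hpm_stat(const char *line, struct hpm_stat *out)
{
    if (sscanf(line, "STATHPM%d:0x%x, %63[^,], %lu, %lu",
               &out->counter, &out->event, out->proc,
               &out->loopcnt, &out->sumcyc) != 5) {
        return -1;
    }
    out->sel = out->event & 0xFu;
    out->event_idx = (out->event >> 4) & 0xFFFFFFu;
    out->enable = (out->event >> 28) & 0xFu;
    return 0;
}
```

For the line "STATHPM3:0xf0000010, dot_prod, 4, 1000" this yields counter 3, sel 0, event index 1 and enable 0xF, which corresponds to the cycle count event enabled in all of M/S/U mode.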

HPM_GET_USECYC(idx) (__hpm_usecyc##idx)

Get hpm benchmark use cycle for counter idx.

HPM_GET_SUMCYC(idx) (__hpm_sumcyc##idx)

Get hpm benchmark sum cycle for counter idx.

HPM_GET_LPCNT(idx) (__hpm_lpcnt##idx)

Get hpm benchmark loop count for counter idx.

NMSIS_TEST_PASS() printf("\nNMSIS_TEST_PASS\n");

Mark test or application passed.

NMSIS_TEST_FAIL() printf("\nNMSIS_TEST_FAIL\n");

Mark test or application failed.
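A test can report its verdict with these two macros as sketched below. report_result is a hypothetical helper; the two fallback definitions are copied from the list above and are only used when nmsis_bench.h has not been included.

```c
#include <stdio.h>

#ifndef NMSIS_TEST_PASS
/* Fallback copies of the definitions shown above. */
#define NMSIS_TEST_PASS() printf("\nNMSIS_TEST_PASS\n");
#define NMSIS_TEST_FAIL() printf("\nNMSIS_TEST_FAIL\n");
#endif

/* Print the pass/fail marker for one comparison; returns 0 on pass, 1 on fail. */
int report_result(int got, int expected)
{
    if (got == expected) {
        NMSIS_TEST_PASS();
        return 0;
    } else {
        NMSIS_TEST_FAIL();
        return 1;
    }
}
```

Both definitions already end with a semicolon, so writing NMSIS_TEST_PASS(); as the unbraced body of an if that has an else branch would not compile. Keeping the calls inside braces, as above, avoids the problem.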

Functions

__STATIC_FORCEINLINE void __prepare_bench_env (void)

Prepare benchmark environment.

Prepare the environment required for benchmarking, such as turning on necessary units like the VPU and the cycle, instret, and HPM counters.