NMSIS Bench and Test Helper Functions
- group NMSIS Bench and Test Related Helper Functions
Functions used for benchmarking and for test suites.
NMSIS benchmark and test helper functions help you run benchmarks and assert pass/fail results in test cases.
To measure the CPU cycle cost of a process, use the BENCH_xxx macros defined here.
In each C source file that uses these macros, include nmsis_bench.h and place
BENCH_DECLARE_VAR();
before calling any other BENCH_xxx macro. To start benchmarking, call
BENCH_INIT();
exactly once in your source code, and then place
BENCH_START(proc_name);
and
BENCH_END(proc_name)
before and after the process you want to measure. See <nuclei-sdk>/application/baremetal/demo_dsp for an example of how to use them.
To disable the benchmark calculation, place
#define DISABLE_NMSIS_BENCH
before the include of nmsis_bench.h.
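The call order described above can be sketched as follows. This is a minimal illustration, not code taken from the SDK: `sum_to` and `bench_sum_demo` are hypothetical names, and the `#ifndef BENCH_START` section contains host-only stand-ins (including an assumed output format) so that the sketch also compiles where nmsis_bench.h is not available. On a Nuclei target the real header is used instead.

```c
#include <stdint.h>
#include <stdio.h>

/* Pull in the real header when it is on the include path (target build). */
#if defined(__has_include)
#  if __has_include("nmsis_bench.h")
#    include "nmsis_bench.h"
#  endif
#endif

#ifndef BENCH_START
/* Host-only stand-ins, NOT the real definitions: a fake counter that
 * advances by one on every read replaces the hardware cycle counter.
 * The printed format is an assumption made for this sketch. */
static uint64_t fake_cycle;
#define BENCH_DECLARE_VAR()  static uint64_t _bc_sttcyc, _bc_usecyc
#define BENCH_INIT()         do { fake_cycle = 0; } while (0)
#define BENCH_START(proc)    do { _bc_sttcyc = fake_cycle++; } while (0)
#define BENCH_END(proc)      do { _bc_usecyc = fake_cycle++ - _bc_sttcyc; \
                                  printf("%s used %lu cycles\n", #proc, \
                                         (unsigned long)_bc_usecyc); } while (0)
#endif

BENCH_DECLARE_VAR();  /* placed above every other BENCH_xxx macro in this file */

/* Hypothetical workload: sum of the integers 1..n. */
static uint32_t sum_to(uint32_t n)
{
    uint32_t acc = 0;
    for (uint32_t i = 1; i <= n; i++) {
        acc += i;
    }
    return acc;
}

/* Measure one call of sum_to() and return the workload result.
 * Call this once, since BENCH_INIT() is meant to run only once. */
uint32_t bench_sum_demo(void)
{
    uint32_t result;

    BENCH_INIT();          /* initialize the benchmark environment */
    BENCH_START(sum_to);   /* record the start cycle */
    result = sum_to(100);
    BENCH_END(sum_to);     /* compute and print the cycles used */
    return result;
}
```

Defining DISABLE_NMSIS_BENCH before the include, as described above, should leave the workload itself unchanged while removing the measurement.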
In your C test source code, you can add
NMSIS_TEST_PASS();
and
NMSIS_TEST_FAIL();
to mark the test as passed or failed.
Defines
-
READ_CYCLE __get_rv_cycle
When XLEN=32, reading the full 64-bit CYCLE register incurs additional overhead. BENCH_XLEN_MODE skips reading the upper 32 bits, reducing the extra cycle cost and allowing more accurate measurement of small cycle counts.
NOTE: BENCH_XLEN_MODE is only applicable when the total cycle count does not exceed 2^32.
Otherwise, the whole 64-bit value of the MCYCLE register is read.
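The 2^32 limit in the note above follows from unsigned 32-bit arithmetic: the difference between two snapshots of the lower 32 bits is still correct if the counter wraps once between them, but only while the true interval stays below 2^32 cycles. A small host-side sketch of that arithmetic (the function name `elapsed_cycles32` is hypothetical and not part of nmsis_bench.h):

```c
#include <stdint.h>

/* Elapsed cycles computed from two snapshots of the lower 32 bits of the
 * cycle counter. Unsigned subtraction is performed modulo 2^32, so the
 * result is correct even when the counter wrapped once between the reads,
 * provided the real interval is shorter than 2^32 cycles. */
uint32_t elapsed_cycles32(uint32_t start, uint32_t end)
{
    return end - start;
}
```

For example, a start snapshot of 0xFFFFFFF0 and an end snapshot of 0x00000010 give 0x20 cycles. A measurement longer than 2^32 cycles would silently lose the overflowed part, which is why the full 64-bit read is needed for long runs.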
-
BENCH_DECLARE_VAR()
Declare the variables required for benchmarking. Must be placed above all BENCH_xxx macros in each C source file that uses BENCH_xxx.
-
BENCH_INIT()
Initialize the benchmark environment. Must be called before any other BENCH_xxx macro is called.
-
BENCH_RESET(proc) _bc_sumcyc = 0; _bc_usecyc = 0; _bc_lpcnt = 0; _bc_ercd = 0;
Reset the benchmark sum cycle, use cycle, loop count, and error code for proc.
-
BENCH_START(proc)
Start the benchmark for proc: record the start cycle and reset the error code.
-
BENCH_SAMPLE(proc)
Take a benchmark sample for proc: record the cycles elapsed from start to this sample and add them to the sum cycle.
-
BENCH_END(proc)
Mark the end of the benchmark for proc, calculate the cycles used, and print the result.
-
BENCH_STOP(proc) printf("CSV, %s, %lu\n", #proc, (unsigned long)_bc_sumcyc);
Mark the stop of a sampled benchmark (start -> sample -> sample -> stop) and print the sum cycle of proc.
-
BENCH_STAT(proc) printf("STAT, %s, %lu, %lu\n", #proc, (unsigned long)_bc_lpcnt, (unsigned long)_bc_sumcyc);
Show statistics of benchmark, format: STAT, proc, loopcnt, sumcyc.
-
BENCH_GET_USECYC() (_bc_usecyc)
Get benchmark use cycle.
-
BENCH_GET_SUMCYC() (_bc_sumcyc)
Get benchmark sum cycle.
-
BENCH_GET_LPCNT() (_bc_lpcnt)
Get benchmark loop count.
-
BENCH_ERROR(proc) _bc_ercd = 1;
Mark the benchmark for proc as having an error.
-
BENCH_STATUS(proc)
Show the status of the benchmark.
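The sampled flow built from the BENCH_RESET, BENCH_START, BENCH_SAMPLE and BENCH_STOP entries above might be used as in the sketch below. It is an illustration under assumptions: `work` and `bench_rounds` are hypothetical names, and the `#ifndef BENCH_START` section holds host-only stand-ins. Of those, BENCH_RESET, BENCH_STOP and BENCH_GET_LPCNT copy the expansions listed in this reference, while BENCH_DECLARE_VAR, BENCH_START and BENCH_SAMPLE are guesses driven by a fake counter, including the assumption that each sample increments the loop count.

```c
#include <stdint.h>
#include <stdio.h>

/* Pull in the real header when it is on the include path (target build). */
#if defined(__has_include)
#  if __has_include("nmsis_bench.h")
#    include "nmsis_bench.h"
#  endif
#endif

#ifndef BENCH_START
/* Host-only stand-ins, NOT the real header. The fake counter advances
 * by 10 "cycles" on every read. */
static uint64_t fake_cycle;
#define READ_FAKE()          (fake_cycle += 10)
#define BENCH_DECLARE_VAR()  static uint64_t _bc_sttcyc, _bc_usecyc, _bc_sumcyc, _bc_lpcnt, _bc_ercd
#define BENCH_RESET(proc)    _bc_sumcyc = 0; _bc_usecyc = 0; _bc_lpcnt = 0; _bc_ercd = 0;
#define BENCH_START(proc)    _bc_ercd = 0; _bc_sttcyc = READ_FAKE();
#define BENCH_SAMPLE(proc)   _bc_usecyc = READ_FAKE() - _bc_sttcyc; \
                             _bc_sumcyc += _bc_usecyc; _bc_lpcnt += 1;
#define BENCH_STOP(proc)     printf("CSV, %s, %lu\n", #proc, (unsigned long)_bc_sumcyc);
#define BENCH_GET_LPCNT()    (_bc_lpcnt)
#endif

BENCH_DECLARE_VAR();  /* placed above every other BENCH_xxx macro in this file */

/* Hypothetical workload; the volatile sink keeps the loop from being removed. */
static volatile uint32_t sink;
static void work(uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        sink += i;
    }
}

/* Accumulate `rounds` measurements of work() and return the loop count
 * reported by the benchmark. On a real target, BENCH_INIT() is expected
 * to have been called once before this function. */
unsigned long bench_rounds(unsigned rounds)
{
    BENCH_RESET(work);             /* clear sum cycle, use cycle, loop count */
    for (unsigned r = 0; r < rounds; r++) {
        BENCH_START(work);         /* record the start cycle for this round */
        work(64);
        BENCH_SAMPLE(work);        /* add this round's cycles to the sum */
    }
    BENCH_STOP(work);              /* print the accumulated sum cycle */
    return (unsigned long)BENCH_GET_LPCNT();
}
```

The macros listed with a trailing semicolon expand to several statements, so the sketch always uses them inside braces rather than as the sole body of an unbraced `if` or `for`.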
-
EVENT_SEL_INSTRUCTION_COMMIT 0
-
EVENT_SEL_MEMORY_ACCESS 1
-
EVENT_SEL_TYPE_0 0
-
EVENT_SEL_TYPE_1 1
-
EVENT_SEL_TYPE_2 2
-
EVENT_SEL_TYPE_3 3
-
EVENT_INSTRUCTION_COMMIT_CYCLE_COUNT 1
-
EVENT_INSTRUCTION_COMMIT_RETIRED_COUNT 2
-
EVENT_INSTRUCTION_COMMIT_INTEGER_LOAD 3
-
EVENT_INSTRUCTION_COMMIT_INTEGER_STORE 4
-
EVENT_INSTRUCTION_COMMIT_ATOMIC_MEMORY_OPERATION 5
-
EVENT_INSTRUCTION_COMMIT_SYSTEM 6
-
EVENT_INSTRUCTION_COMMIT_INTEGER_COMPUTATIONAL 7
-
EVENT_INSTRUCTION_COMMIT_CONDITIONAL_BRANCH 8
-
EVENT_INSTRUCTION_COMMIT_TAKEN_CONDITIONAL_BRANCH 9
-
EVENT_INSTRUCTION_COMMIT_JAL 10
-
EVENT_INSTRUCTION_COMMIT_JALR 11
-
EVENT_INSTRUCTION_COMMIT_RETURN 12
-
EVENT_INSTRUCTION_COMMIT_CONTROL_TRANSFER 13
-
EVENT_INSTRUCTION_COMMIT_FENCE_INSTRUCTION 14
-
EVENT_INSTRUCTION_COMMIT_INTEGER_MULTIPLICATION 15
-
EVENT_INSTRUCTION_COMMIT_INTEGER_DIVISION_REMAINDER 16
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_LOAD 17
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_STORE 18
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_ADDITION_SUBTRACTION 19
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_MULTIPLICATION 20
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_FUSED_MULTIPLY_ADD_SUB 21
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_DIVISION_OR_SQUARE_ROOT 22
-
EVENT_INSTRUCTION_COMMIT_OTHER_FLOATING_POINT_INSTRUCTION 23
-
EVENT_INSTRUCTION_COMMIT_CONDITIONAL_BRANCH_PREDICTION_FAIL 24
-
EVENT_INSTRUCTION_COMMIT_JALR_PREDICTION_FAIL 25
-
EVENT_INSTRUCTION_COMMIT_POP_PREDICTION_FAIL 26
-
EVENT_INSTRUCTION_COMMIT_FENCEI_INSTRUCTION 27
-
EVENT_INSTRUCTION_COMMIT_SFENCE_INSTRUCTION 28
-
EVENT_INSTRUCTION_COMMIT_ECALL_INSTRUCTION 29
-
EVENT_INSTRUCTION_COMMIT_EXCEPTION_INSTRUCTION 30
-
EVENT_INSTRUCTION_COMMIT_INTERRUPT_INSTRUCTION 31
-
EVENT_MEMORY_ACCESS_ICACHE_MISS 1
-
EVENT_MEMORY_ACCESS_DCACHE_MISS 2
-
EVENT_MEMORY_ACCESS_ITLB_MISS 3
-
EVENT_MEMORY_ACCESS_DTLB_MISS 4
-
EVENT_MEMORY_ACCESS_MAIN_DTLB_MISS 5
-
EVENT_MEMORY_ACCESS_MAIN_TLB_MISS 5
-
EVENT_MEMORY_ACCESS_L2_CACHE_ACCESS 8
-
EVENT_MEMORY_ACCESS_L2_CACHE_MISS 9
-
EVENT_MEMORY_ACCESS_MEMORY_BUS_REQUEST 10
-
EVENT_MEMORY_ACCESS_IFU_STALL_CYCLE 11
-
EVENT_MEMORY_ACCESS_EXU_STALL_CYCLE 12
-
EVENT_MEMORY_ACCESS_TIMER 13
-
EVENT_TYPE_0_CYCLE_COUNT 1
-
EVENT_TYPE_0_RETIRED_COUNT 2
-
EVENT_TYPE_0_INTEGER_LOAD 3
-
EVENT_TYPE_0_INTEGER_STORE 4
-
EVENT_TYPE_0_ATOMIC_MEMORY_OPERATION 5
-
EVENT_TYPE_0_SYSTEM 6
-
EVENT_TYPE_0_INTEGER_COMPUTATIONAL 7
-
EVENT_TYPE_0_CONDITIONAL_BRANCH 8
-
EVENT_TYPE_0_TAKEN_CONDITIONAL_BRANCH 9
-
EVENT_TYPE_0_JAL 10
-
EVENT_TYPE_0_JALR 11
-
EVENT_TYPE_0_RETURN 12
-
EVENT_TYPE_0_CONTROL_TRANSFER 13
-
EVENT_TYPE_0_FENCE_INSTRUCTION 14
-
EVENT_TYPE_0_INTEGER_MULTIPLICATION 15
-
EVENT_TYPE_0_INTEGER_DIVISION_REMAINDER 16
-
EVENT_TYPE_0_FLOATING_POINT_LOAD 17
-
EVENT_TYPE_0_FLOATING_POINT_STORE 18
-
EVENT_TYPE_0_FLOATING_POINT_ADDITION_SUBTRACTION 19
-
EVENT_TYPE_0_FLOATING_POINT_MULTIPLICATION 20
-
EVENT_TYPE_0_FLOATING_POINT_FUSED_MULTIPLY_ADD_SUB 21
-
EVENT_TYPE_0_FLOATING_POINT_DIVISION_OR_SQUARE_ROOT 22
-
EVENT_TYPE_0_OTHER_FLOATING_POINT_INSTRUCTION 23
-
EVENT_TYPE_0_CONDITIONAL_BRANCH_PREDICTION_FAIL 24
-
EVENT_TYPE_0_JALR_PREDICTION_FAIL 25
-
EVENT_TYPE_0_POP_PREDICTION_FAIL 26
-
EVENT_TYPE_0_FENCEI_INSTRUCTION 27
-
EVENT_TYPE_0_SFENCE_INSTRUCTION 28
-
EVENT_TYPE_0_ECALL_INSTRUCTION 29
-
EVENT_TYPE_0_EXCEPTION_INSTRUCTION 30
-
EVENT_TYPE_0_INTERRUPT_INSTRUCTION 31
-
EVENT_TYPE_1_ICACHE_READ_MISS 1
-
EVENT_TYPE_1_DCACHE_RW_MISS 2
-
EVENT_TYPE_1_ITLB_READ_MISS 3
-
EVENT_TYPE_1_DTLB_RW_MISS 4
-
EVENT_TYPE_1_MAIN_TLB_MISS 5
-
EVENT_TYPE_1_L2_CACHE_ACCESS 8
-
EVENT_TYPE_1_L2_CACHE_MISS 9
-
EVENT_TYPE_1_MEMORY_BUS_REQUEST 10
-
EVENT_TYPE_1_IFU_STALL_CYCLE 11
-
EVENT_TYPE_1_EXU_STALL_CYCLE 12
-
EVENT_TYPE_1_TIMER 13
-
EVENT_TYPE_2_BRANCH_INSTRUCTION_COMMIT 2
-
EVENT_TYPE_2_BRANCH_PREDICT_FAIL_COMMIT 3
-
EVENT_TYPE_3_DCACHE_READ 0
-
EVENT_TYPE_3_DCACHE_READ_MISS 1
-
EVENT_TYPE_3_DCACHE_WRITE 2
-
EVENT_TYPE_3_DCACHE_WRITE_MISS 3
-
EVENT_TYPE_3_DCACHE_PREFETCH 4
-
EVENT_TYPE_3_DCACHE_PREFETCH_MISS 5
-
EVENT_TYPE_3_ICACHE_READ 6
-
EVENT_TYPE_3_ICACHE_PREFETCH 8
-
EVENT_TYPE_3_ICACHE_PREFETCH_MISS 9
-
EVENT_TYPE_3_L2_CACHE_READ_HIT 10
-
EVENT_TYPE_3_L2_CACHE_READ_MISS 11
-
EVENT_TYPE_3_L2_CACHE_WRITE_HIT 12
-
EVENT_TYPE_3_L2_CACHE_WRITE_MISS 13
-
EVENT_TYPE_3_L2_CACHE_PREFETCH_HIT 14
-
EVENT_TYPE_3_L2_CACHE_PREFETCH_MISS 15
-
EVENT_TYPE_3_DTLB_READ 16
-
EVENT_TYPE_3_DTLB_READ_MISS 17
-
EVENT_TYPE_3_DTLB_WRITE 18
-
EVENT_TYPE_3_DTLB_WRITE_MISS 19
-
EVENT_TYPE_3_ITLB_READ 20
-
EVENT_TYPE_3_BTB_READ 22
-
EVENT_TYPE_3_BTB_READ_MISS 23
-
EVENT_TYPE_3_BTB_WRITE 24
-
EVENT_TYPE_3_BTB_WRITE_MISS 25
-
MSU_EVENT_ENABLE 0x0F
-
MEVENT_EN 0x08
-
SEVENT_EN 0x02
-
UEVENT_EN 0x01
-
READ_HPM_COUNTER __get_hpm_counter
-
HPM_DECLARE_VAR(idx)
Declare the variables required for benchmarking with high performance monitor counter idx. Must be placed above all HPM_xxx macros in each C source file that uses HPM_xxx.
-
HPM_SEL_ENABLE(ena) (ena << 28)
-
HPM_SEL_EVENT(sel, idx) ((sel) | (idx << 4))
-
HPM_EVENT(sel, idx, ena) (HPM_SEL_ENABLE(ena) | HPM_SEL_EVENT(sel, idx))
Construct an event value to be set (sel -> event_sel, idx -> event_idx, ena -> m/s/u_enable).
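As the three macros above are written, sel occupies the low bits, idx is shifted left by 4, and ena is shifted left by 28. The sketch below evaluates one such value on a host. The function name `hpm_event_dcache_miss` is hypothetical, and the definitions inside the `#ifndef HPM_EVENT` guard are copied from this reference only so that the example builds without the header.

```c
#include <stdint.h>

/* Copied from the listing in this reference so the sketch is self-contained;
 * skipped automatically when nmsis_bench.h has already defined HPM_EVENT. */
#ifndef HPM_EVENT
#define EVENT_SEL_MEMORY_ACCESS           1
#define EVENT_MEMORY_ACCESS_DCACHE_MISS   2
#define MSU_EVENT_ENABLE                  0x0F
#define HPM_SEL_ENABLE(ena)               (ena << 28)
#define HPM_SEL_EVENT(sel, idx)           ((sel) | (idx << 4))
#define HPM_EVENT(sel, idx, ena)          (HPM_SEL_ENABLE(ena) | HPM_SEL_EVENT(sel, idx))
#endif

/* Event value selecting "memory access / D-cache miss", enabled in all modes.
 * The uint32_t casts keep the shifts in unsigned arithmetic: shifting the
 * plain int constant 0x0F left by 28 would overflow a 32-bit signed int. */
uint32_t hpm_event_dcache_miss(void)
{
    return HPM_EVENT((uint32_t)EVENT_SEL_MEMORY_ACCESS,
                     (uint32_t)EVENT_MEMORY_ACCESS_DCACHE_MISS,
                     (uint32_t)MSU_EVENT_ENABLE);
}
```

With these inputs the result is 0xF0000021: 0xF in the enable field, 2 in the event index field, and 1 in the selector field.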
-
HPM_INIT()
Initialize the high performance monitor environment. Must be called before any other HPM_xxx macro is called.
-
HPM_RESET(idx, proc, event) __hpm_sumcyc##idx = 0; __hpm_lpcnt##idx = 0;
Reset the high performance benchmark for proc, using the counter whose index is idx.
-
HPM_START(idx, proc, event)
Start the high performance benchmark for proc and record the starting HPM counter value.
-
HPM_SAMPLE(idx, proc, event)
Take a high performance benchmark sample for proc and add it to the sum counter.
-
HPM_END(idx, proc, event)
Mark the end of the high performance benchmark for proc and calculate the HPM counter value used.
-
HPM_STOP(idx, proc, event) printf("HPM%d:0x%x, %s, %lu\n", idx, event, #proc, (unsigned long)__hpm_sumcyc##idx);
Mark the stop of a sampled HPM benchmark (start -> sample -> sample -> stop) and print the sum cycle of proc.
-
HPM_STAT(idx, proc, event) printf("STATHPM%d:0x%x, %s, %lu, %lu\n", idx, event, #proc, (unsigned long)__hpm_lpcnt##idx, (unsigned long)__hpm_sumcyc##idx);
Show statistics of the HPM benchmark, format: STATHPM<idx>:<event>, proc, loopcnt, sumcyc.
-
HPM_GET_USECYC(idx) (__hpm_usecyc##idx)
Get hpm benchmark use cycle for counter idx.
-
HPM_GET_SUMCYC(idx) (__hpm_sumcyc##idx)
Get hpm benchmark sum cycle for counter idx.
-
HPM_GET_LPCNT(idx) (__hpm_lpcnt##idx)
Get hpm benchmark loop count for counter idx.
-
NMSIS_TEST_PASS() printf("\nNMSIS_TEST_PASS\n");
Mark test or application passed.
-
NMSIS_TEST_FAIL() printf("\nNMSIS_TEST_FAIL\n");
Mark test or application failed.
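A test can report its outcome with the two macros above as sketched here. `report_result` is a hypothetical helper, and the guarded definitions repeat the expansions listed in this reference only so that the example builds without nmsis_bench.h.

```c
#include <stdio.h>

/* Copied from the listing in this reference; skipped when the real header
 * has already defined the macros. */
#ifndef NMSIS_TEST_PASS
#define NMSIS_TEST_PASS()  printf("\nNMSIS_TEST_PASS\n");
#define NMSIS_TEST_FAIL()  printf("\nNMSIS_TEST_FAIL\n");
#endif

/* Compare a computed value against the expected one and print the matching
 * marker. Returns 0 on pass and 1 on fail, suitable as an exit code. */
int report_result(int computed, int expected)
{
    if (computed == expected) {
        NMSIS_TEST_PASS();
        return 0;
    }
    NMSIS_TEST_FAIL();
    return 1;
}
```

Because the listed expansions already end in a semicolon, writing `if (ok) NMSIS_TEST_PASS(); else NMSIS_TEST_FAIL();` without braces would not compile, so the sketch keeps each macro inside its own braced block or on its own statement line.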
Functions
- __STATIC_FORCEINLINE void __prepare_bench_env (void)
Prepare benchmark environment.
Prepares the environment required for benchmarking, such as turning on necessary units like the VPU and the cycle, instret, and HPM counters.