NMSIS Bench and Test Helper Functions
- group NMSIS_Core_Bench_Helpers
Functions used for benchmarking and test suites.
NMSIS benchmark and test helper functions are provided to help with benchmarking and with pass/fail assertions in test cases.
To measure the CPU cycle cost of a process, use the BENCH_xxx macros defined in this group.
In a single C source file, include nmsis_bench.h, then place BENCH_DECLARE_VAR(); before calling any other BENCH_xxx macro. To start benchmarking, call BENCH_INIT(); only once in your source code, then place BENCH_START(proc_name); and BENCH_END(proc_name); before and after the process you want to measure. Refer to <nuclei-sdk>/application/baremetal/demo_dsp for a usage example.
To disable the benchmark calculation, place #define DISABLE_NMSIS_BENCH before including nmsis_bench.h.
In C test source code, you can add NMSIS_TEST_PASS(); and NMSIS_TEST_FAIL(); to mark the test as passed or failed.
Defines
-
READ_CYCLE __get_rv_cycle
When XLEN=32, reading the full 64-bit CYCLE register incurs additional overhead. BENCH_XLEN_MODE skips reading the upper 32 bits, reducing the extra cycle cost and allowing more accurate measurement of small cycle counts.
NOTE: BENCH_XLEN_MODE is only applicable when the total cycle count does not exceed 2^32.
Without BENCH_XLEN_MODE, the whole 64-bit value of the MCYCLE register is read.
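The 2^32 limit follows from modular arithmetic. The sketch below is a host-side illustration, not code from nmsis_bench.h, and the helper names `elapsed32` and `elapsed64` are hypothetical. It shows that a 32-bit difference survives a single counter wrap but silently drops anything at or above 2^32 cycles.

```c
#include <stdint.h>

/* Hypothetical helper: elapsed cycles when only the low 32 bits are read.
 * Unsigned subtraction wraps modulo 2^32, so one counter rollover between
 * start and end still gives the right answer. */
uint32_t elapsed32(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: elapsed cycles from the full 64-bit counter. */
uint64_t elapsed64(uint64_t start, uint64_t end)
{
    return end - start;
}
```

If the measured region takes 2^32 + 5 cycles, `elapsed64` reports the full count while `elapsed32` on the truncated values reports only 5, which is why the 32-bit mode suits short measurements only.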
-
BENCH_DECLARE_VAR()
Declare the variables required for benchmarking. Must be placed above all BENCH_xxx macros in each C source file that uses them.
-
BENCH_INIT()
Initialize the benchmark environment. Must be called before any other BENCH_xxx macro.
-
BENCH_RESET(proc) _bc_sumcyc = 0; _bc_usecyc = 0; _bc_lpcnt = 0; _bc_ercd = 0;
Reset the benchmark sum cycle, use cycle, loop count and error code for proc.
-
BENCH_START(proc)
Start the benchmark for proc: record the start cycle and reset the error code.
-
BENCH_SAMPLE(proc)
Take a benchmark sample for proc: record the cycle cost from start to this sample and add it to the sum cycle.
-
BENCH_END(proc)
Mark the end of the benchmark for proc, calculate the used cycles, and print them.
-
BENCH_STOP(proc) printf("CSV, %s, %lu\n", #proc, (unsigned long)_bc_sumcyc);
Mark the stop of a benchmark that ran as start -> sample -> sample -> stop, and print the sum cycle of proc.
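Because BENCH_STOP prints a fixed `CSV, <proc>, <sumcyc>` line, the result can be recovered from a captured console log. The following is a minimal host-side sketch, assuming the line arrives unmodified; `parse_bench_csv` is a hypothetical helper and not part of NMSIS.

```c
#include <stdio.h>

/* Hypothetical helper (not part of NMSIS): parse one "CSV, <proc>, <sumcyc>"
 * line as printed by BENCH_STOP. Returns 1 on success, 0 otherwise.
 * name must have room for at least 64 bytes. */
int parse_bench_csv(const char *line, char *name, unsigned long *sumcyc)
{
    /* %63[^,] reads the proc name up to the next comma. */
    return sscanf(line, "CSV, %63[^,], %lu", name, sumcyc) == 2;
}
```

Lines with another prefix, such as the `STAT, ...` output of BENCH_STAT, do not match the format and are rejected.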
-
BENCH_STAT(proc) printf("STAT, %s, %lu, %lu\n", #proc, (unsigned long)_bc_lpcnt, (unsigned long)_bc_sumcyc);
Show statistics of benchmark, format: STAT, proc, loopcnt, sumcyc.
-
BENCH_GET_USECYC() (_bc_usecyc)
Get benchmark use cycle.
-
BENCH_GET_SUMCYC() (_bc_sumcyc)
Get benchmark sum cycle.
-
BENCH_GET_LPCNT() (_bc_lpcnt)
Get benchmark loop count.
-
BENCH_ERROR(proc) _bc_ercd = 1;
Mark the benchmark for proc as errored.
-
BENCH_STATUS(proc)
Show the status of the benchmark.
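The start -> sample -> sample -> stop flow described above can be modelled on a host. The sketch below is a model under stated assumptions, not the header's implementation: the struct and function names are hypothetical, the fields only mirror the `_bc_usecyc`, `_bc_sumcyc`, `_bc_lpcnt` and `_bc_ercd` variables named above, and it assumes that each sample also increments the loop count. Cycle values are passed in explicitly instead of being read from a counter.

```c
#include <stdint.h>

/* Hypothetical host-side model of the benchmark bookkeeping. */
typedef struct {
    uint64_t startcyc; /* cycle value recorded at start */
    uint64_t usecyc;   /* cost of the latest start -> sample interval */
    uint64_t sumcyc;   /* accumulated cost over all samples */
    uint64_t lpcnt;    /* number of samples taken (assumed behaviour) */
    int      ercd;     /* error flag, 0 means no error */
} bench_model_t;

/* Mirrors BENCH_RESET: clear the sums, loop count and error code. */
void bench_model_reset(bench_model_t *b)
{
    b->sumcyc = 0;
    b->usecyc = 0;
    b->lpcnt = 0;
    b->ercd = 0;
}

/* Mirrors BENCH_START: record the start cycle and clear the error code. */
void bench_model_start(bench_model_t *b, uint64_t now)
{
    b->startcyc = now;
    b->ercd = 0;
}

/* Mirrors BENCH_SAMPLE: add this interval to the running sum. */
void bench_model_sample(bench_model_t *b, uint64_t now)
{
    b->usecyc = now - b->startcyc;
    b->sumcyc += b->usecyc;
    b->lpcnt += 1;
}
```

With two intervals of 30 and 50 cycles, the model ends with a use cycle of 50, a sum cycle of 80 and a loop count of 2, which corresponds to what BENCH_GET_USECYC(), BENCH_GET_SUMCYC() and BENCH_GET_LPCNT() are described as returning.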
-
EVENT_SEL_INSTRUCTION_COMMIT 0
-
EVENT_SEL_MEMORY_ACCESS 1
-
EVENT_SEL_TYPE_0 0
-
EVENT_SEL_TYPE_1 1
-
EVENT_SEL_TYPE_2 2
-
EVENT_SEL_TYPE_3 3
-
EVENT_INSTRUCTION_COMMIT_CYCLE_COUNT 1
-
EVENT_INSTRUCTION_COMMIT_RETIRED_COUNT 2
-
EVENT_INSTRUCTION_COMMIT_INTEGER_LOAD 3
-
EVENT_INSTRUCTION_COMMIT_INTEGER_STORE 4
-
EVENT_INSTRUCTION_COMMIT_ATOMIC_MEMORY_OPERATION 5
-
EVENT_INSTRUCTION_COMMIT_SYSTEM 6
-
EVENT_INSTRUCTION_COMMIT_INTEGER_COMPUTATIONAL 7
-
EVENT_INSTRUCTION_COMMIT_CONDITIONAL_BRANCH 8
-
EVENT_INSTRUCTION_COMMIT_TAKEN_CONDITIONAL_BRANCH 9
-
EVENT_INSTRUCTION_COMMIT_JAL 10
-
EVENT_INSTRUCTION_COMMIT_JALR 11
-
EVENT_INSTRUCTION_COMMIT_RETURN 12
-
EVENT_INSTRUCTION_COMMIT_CONTROL_TRANSFER 13
-
EVENT_INSTRUCTION_COMMIT_FENCE_INSTRUCTION 14
-
EVENT_INSTRUCTION_COMMIT_INTEGER_MULTIPLICATION 15
-
EVENT_INSTRUCTION_COMMIT_INTEGER_DIVISION_REMAINDER 16
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_LOAD 17
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_STORE 18
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_ADDITION_SUBTRACTION 19
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_MULTIPLICATION 20
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_FUSED_MULTIPLY_ADD_SUB 21
-
EVENT_INSTRUCTION_COMMIT_FLOATING_POINT_DIVISION_OR_SQUARE_ROOT 22
-
EVENT_INSTRUCTION_COMMIT_OTHER_FLOATING_POINT_INSTRUCTION 23
-
EVENT_INSTRUCTION_COMMIT_CONDITIONAL_BRANCH_PREDICTION_FAIL 24
-
EVENT_INSTRUCTION_COMMIT_JALR_PREDICTION_FAIL 25
-
EVENT_INSTRUCTION_COMMIT_POP_PREDICTION_FAIL 26
-
EVENT_INSTRUCTION_COMMIT_FENCEI_INSTRUCTION 27
-
EVENT_INSTRUCTION_COMMIT_SFENCE_INSTRUCTION 28
-
EVENT_INSTRUCTION_COMMIT_ECALL_INSTRUCTION 29
-
EVENT_INSTRUCTION_COMMIT_EXCEPTION_INSTRUCTION 30
-
EVENT_INSTRUCTION_COMMIT_INTERRUPT_INSTRUCTION 31
-
EVENT_MEMORY_ACCESS_ICACHE_MISS 1
-
EVENT_MEMORY_ACCESS_DCACHE_MISS 2
-
EVENT_MEMORY_ACCESS_ITLB_MISS 3
-
EVENT_MEMORY_ACCESS_DTLB_MISS 4
-
EVENT_MEMORY_ACCESS_MAIN_DTLB_MISS 5
-
EVENT_MEMORY_ACCESS_MAIN_TLB_MISS 5
-
EVENT_MEMORY_ACCESS_L2_CACHE_ACCESS 8
-
EVENT_MEMORY_ACCESS_L2_CACHE_MISS 9
-
EVENT_MEMORY_ACCESS_MEMORY_BUS_REQUEST 10
-
EVENT_MEMORY_ACCESS_IFU_STALL_CYCLE 11
-
EVENT_MEMORY_ACCESS_EXU_STALL_CYCLE 12
-
EVENT_MEMORY_ACCESS_TIMER 13
-
EVENT_TYPE_0_CYCLE_COUNT 1
-
EVENT_TYPE_0_RETIRED_COUNT 2
-
EVENT_TYPE_0_INTEGER_LOAD 3
-
EVENT_TYPE_0_INTEGER_STORE 4
-
EVENT_TYPE_0_ATOMIC_MEMORY_OPERATION 5
-
EVENT_TYPE_0_SYSTEM 6
-
EVENT_TYPE_0_INTEGER_COMPUTATIONAL 7
-
EVENT_TYPE_0_CONDITIONAL_BRANCH 8
-
EVENT_TYPE_0_TAKEN_CONDITIONAL_BRANCH 9
-
EVENT_TYPE_0_JAL 10
-
EVENT_TYPE_0_JALR 11
-
EVENT_TYPE_0_RETURN 12
-
EVENT_TYPE_0_CONTROL_TRANSFER 13
-
EVENT_TYPE_0_FENCE_INSTRUCTION 14
-
EVENT_TYPE_0_INTEGER_MULTIPLICATION 15
-
EVENT_TYPE_0_INTEGER_DIVISION_REMAINDER 16
-
EVENT_TYPE_0_FLOATING_POINT_LOAD 17
-
EVENT_TYPE_0_FLOATING_POINT_STORE 18
-
EVENT_TYPE_0_FLOATING_POINT_ADDITION_SUBTRACTION 19
-
EVENT_TYPE_0_FLOATING_POINT_MULTIPLICATION 20
-
EVENT_TYPE_0_FLOATING_POINT_FUSED_MULTIPLY_ADD_SUB 21
-
EVENT_TYPE_0_FLOATING_POINT_DIVISION_OR_SQUARE_ROOT 22
-
EVENT_TYPE_0_OTHER_FLOATING_POINT_INSTRUCTION 23
-
EVENT_TYPE_0_CONDITIONAL_BRANCH_PREDICTION_FAIL 24
-
EVENT_TYPE_0_JALR_PREDICTION_FAIL 25
-
EVENT_TYPE_0_POP_PREDICTION_FAIL 26
-
EVENT_TYPE_0_FENCEI_INSTRUCTION 27
-
EVENT_TYPE_0_SFENCE_INSTRUCTION 28
-
EVENT_TYPE_0_ECALL_INSTRUCTION 29
-
EVENT_TYPE_0_EXCEPTION_INSTRUCTION 30
-
EVENT_TYPE_0_INTERRUPT_INSTRUCTION 31
-
EVENT_TYPE_1_ICACHE_READ_MISS 1
-
EVENT_TYPE_1_DCACHE_RW_MISS 2
-
EVENT_TYPE_1_ITLB_READ_MISS 3
-
EVENT_TYPE_1_DTLB_RW_MISS 4
-
EVENT_TYPE_1_MAIN_TLB_MISS 5
-
EVENT_TYPE_1_L2_CACHE_ACCESS 8
-
EVENT_TYPE_1_L2_CACHE_MISS 9
-
EVENT_TYPE_1_MEMORY_BUS_REQUEST 10
-
EVENT_TYPE_1_IFU_STALL_CYCLE 11
-
EVENT_TYPE_1_EXU_STALL_CYCLE 12
-
EVENT_TYPE_1_TIMER 13
-
EVENT_TYPE_2_BRANCH_INSTRUCTION_COMMIT 2
-
EVENT_TYPE_2_BRANCH_PREDICT_FAIL_COMMIT 3
-
EVENT_TYPE_3_DCACHE_READ 0
-
EVENT_TYPE_3_DCACHE_READ_MISS 1
-
EVENT_TYPE_3_DCACHE_WRITE 2
-
EVENT_TYPE_3_DCACHE_WRITE_MISS 3
-
EVENT_TYPE_3_DCACHE_PREFETCH 4
-
EVENT_TYPE_3_DCACHE_PREFETCH_MISS 5
-
EVENT_TYPE_3_ICACHE_READ 6
-
EVENT_TYPE_3_ICACHE_PREFETCH 8
-
EVENT_TYPE_3_ICACHE_PREFETCH_MISS 9
-
EVENT_TYPE_3_L2_CACHE_READ 10
-
EVENT_TYPE_3_L2_CACHE_READ_MISS 11
-
EVENT_TYPE_3_L2_CACHE_WRITE 12
-
EVENT_TYPE_3_L2_CACHE_WRITE_MISS 13
-
EVENT_TYPE_3_L2_CACHE_PREFETCH_HIT 14
-
EVENT_TYPE_3_L2_CACHE_PREFETCH_MISS 15
-
EVENT_TYPE_3_DTLB_READ 16
-
EVENT_TYPE_3_DTLB_READ_MISS 17
-
EVENT_TYPE_3_DTLB_WRITE 18
-
EVENT_TYPE_3_DTLB_WRITE_MISS 19
-
EVENT_TYPE_3_ITLB_READ 20
-
EVENT_TYPE_3_BTB_READ 22
-
EVENT_TYPE_3_BTB_READ_MISS 23
-
EVENT_TYPE_3_BTB_WRITE 24
-
EVENT_TYPE_3_BTB_WRITE_MISS 25
-
MSU_EVENT_ENABLE 0x0F
-
MEVENT_EN 0x08
-
SEVENT_EN 0x02
-
UEVENT_EN 0x01
-
READ_HPM_COUNTER __get_hpm_counter
-
HPM_DECLARE_VAR(idx)
Declare the variables required for benchmarking with high performance monitor counter idx. Must be placed above all HPM_xxx macros in each C source file that uses them.
-
HPM_SEL_ENABLE(ena) (ena << 28)
-
HPM_SEL_EVENT(sel, idx) ((sel) | (idx << 4))
-
HPM_EVENT(sel, idx, ena) (HPM_SEL_ENABLE(ena) | HPM_SEL_EVENT(sel, idx))
Construct an event value to be set (sel -> event_sel, idx -> event_idx, ena -> m/s/u_enable).
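As a worked example of the encoding, the sketch below recomputes the value that HPM_EVENT builds from the macro bodies listed above: the enable mask is shifted to bit 28, the event index is shifted to bit 4, and the selector is OR-ed into the low bits. `hpm_event_value` is a hypothetical helper written for illustration. It uses unsigned arithmetic because 0x0F shifted left by 28 does not fit in a signed 32-bit int.

```c
#include <stdint.h>

/* Host-side copies of values listed in this section. */
#define EVENT_SEL_INSTRUCTION_COMMIT          0
#define EVENT_INSTRUCTION_COMMIT_CYCLE_COUNT  1
#define MSU_EVENT_ENABLE                      0x0F

/* Hypothetical helper: same bit layout as HPM_EVENT(sel, idx, ena),
 * computed on uint32_t to avoid signed overflow in the shift. */
uint32_t hpm_event_value(uint32_t sel, uint32_t idx, uint32_t ena)
{
    return (ena << 28) | sel | (idx << 4);
}
```

For the instruction-commit selector, the cycle-count event and the MSU_EVENT_ENABLE mask, this yields 0xF0000010.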
-
HPM_INIT()
Initialize the high performance monitor environment. Must be called before any other HPM_xxx macro.
-
HPM_RESET(idx, proc, event) __hpm_sumcyc##idx = 0; __hpm_lpcnt##idx = 0;
Reset the high performance benchmark for proc using the counter whose index is idx.
-
HPM_START(idx, proc, event)
Start the high performance benchmark for proc and record the starting hpm counter value.
-
HPM_SAMPLE(idx, proc, event)
Take a high performance benchmark sample for proc and add it to the sum counter.
-
HPM_END(idx, proc, event)
Mark the end of the high performance benchmark for proc and calculate the used hpm counter value.
-
HPM_STOP(idx, proc, event) printf("HPM%d:0x%x, %s, %lu\n", idx, event, #proc, (unsigned long)__hpm_sumcyc##idx);
Mark the stop of an hpm benchmark that ran as start -> sample -> sample -> stop, and print the sum cycle of proc.
-
HPM_STAT(idx, proc, event) printf("STATHPM%d:0x%x, %s, %lu, %lu\n", idx, event, #proc, (unsigned long)__hpm_lpcnt##idx, (unsigned long)__hpm_sumcyc##idx);
Show statistics of hpm benchmark, format: STATHPM<idx>:<event>, proc, loopcnt, sumcyc.
-
HPM_GET_USECYC(idx) (__hpm_usecyc##idx)
Get hpm benchmark use cycle for counter idx.
-
HPM_GET_SUMCYC(idx) (__hpm_sumcyc##idx)
Get hpm benchmark sum cycle for counter idx.
-
HPM_GET_LPCNT(idx) (__hpm_lpcnt##idx)
Get hpm benchmark loop count for counter idx.
-
NMSIS_TEST_PASS() printf("\nNMSIS_TEST_PASS\n");
Mark test or application passed.
-
NMSIS_TEST_FAIL() printf("\nNMSIS_TEST_FAIL\n");
Mark test or application failed.
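A typical use is to compare computed results against a reference and print exactly one marker at the end. The sketch below is illustrative only: `report_result` and its arguments are hypothetical, and the two macros are defined locally from the bodies shown above so that the snippet also builds without nmsis_bench.h.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-ins, used only when nmsis_bench.h has not defined them. */
#ifndef NMSIS_TEST_PASS
#define NMSIS_TEST_PASS() printf("\nNMSIS_TEST_PASS\n");
#define NMSIS_TEST_FAIL() printf("\nNMSIS_TEST_FAIL\n");
#endif

/* Hypothetical helper: prints the PASS marker and returns 0 when every
 * element matches the reference, otherwise prints FAIL and returns 1. */
int report_result(const int *got, const int *ref, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (got[i] != ref[i]) {
            NMSIS_TEST_FAIL();
            return 1;
        }
    }
    NMSIS_TEST_PASS();
    return 0;
}
```

The macro bodies shown above already end with a semicolon, so it is safer to keep each call inside braces when it appears in an if/else branch.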
Functions
- __STATIC_FORCEINLINE void __prepare_bench_env (void)
Prepare benchmark environment.
Prepare the environment required for benchmarking, such as turning on necessary units like the VPU and the cycle, instret and hpm counters.