Nuclei Customized DSP Instructions

__STATIC_FORCEINLINE unsigned long long __RV_DKHM8(unsigned long long a, unsigned long long b)

__STATIC_FORCEINLINE unsigned long long __RV_DKHM16(unsigned long long a, unsigned long long b)

__STATIC_FORCEINLINE unsigned long long __RV_DKABS8(unsigned long long a)

__STATIC_FORCEINLINE unsigned long long __RV_DKABS16(unsigned long long a)

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA8(unsigned long long a, int b)

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA16(unsigned long long a, int b)

__STATIC_FORCEINLINE unsigned long long __RV_DKADD8(unsigned long long a, unsigned long long b)

__STATIC_FORCEINLINE unsigned long long __RV_DKADD16(unsigned long long a, unsigned long long b)

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB8(unsigned long long a, unsigned long long b)

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB16(unsigned long long a, unsigned long long b)

__STATIC_FORCEINLINE unsigned long __RV_EXPD80(unsigned long a)

__STATIC_FORCEINLINE unsigned long __RV_EXPD81(unsigned long a)

__STATIC_FORCEINLINE unsigned long __RV_EXPD82(unsigned long a)

__STATIC_FORCEINLINE unsigned long __RV_EXPD83(unsigned long a)

group
NMSIS_Core_DSP_Intrinsic_NUCLEI_CUSTOM
(RV32 only) Nuclei Customized DSP Instructions
These Nuclei customized DSP instructions are available only on RV32.
Functions

__STATIC_FORCEINLINE unsigned long long __RV_DKHM8(unsigned long long a, unsigned long long b)
DKHM8 (64-bit SIMD Signed Saturating Q7 Multiply)
Type: SIMD
Syntax:
DKHM8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pairs of registers
Purpose:
Do Q7xQ7 element multiplications simultaneously. The Q14 results are then reduced to Q7 numbers again.
Description:
For the DKHM8 instruction, multiply the top 8-bit Q7 content of 16-bit chunks in Rs1 with the top 8-bit Q7 content of 16-bit chunks in Rs2. At the same time, multiply the bottom 8-bit Q7 content of 16-bit chunks in Rs1 with the bottom 8-bit Q7 content of 16-bit chunks in Rs2. The Q14 results are then right-shifted by 7 bits and saturated into Q7 values. The Q7 results are then written into Rd. Saturation happens only when both Q7 inputs of a multiplication are 0x80: the result is saturated to 0x7F and the overflow flag OV is set.
Operations:
op1t = Rs1.B[x+1]; op2t = Rs2.B[x+1]; // top
op1b = Rs1.B[x]; op2b = Rs2.B[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x80 != aop || 0x80 != bop) {
    res = (aop s* bop) >> 7;
  } else {
    res = 0x7F;
    OV = 1;
  }
}
Rd.H[x/2] = concat(rest, resb);
for RV32: x=0,2,4,6
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: unsigned long long type of value stored in b
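The lane-wise behaviour described for DKHM8 can be sketched as a portable C model for checking results off-target. This is only an illustration: `dkhm8_ref` and its `ov` out-parameter are hypothetical names that are not part of NMSIS, and the real instruction sets the OV flag in hardware. The sketch assumes arithmetic right shift of negative values and two's-complement narrowing casts, as on GCC and Clang.

```c
#include <stdint.h>

/* Hypothetical reference model of DKHM8 (not the NMSIS implementation).
 * Each of the 8 byte lanes is treated as a Q7 value; *ov is set to 1
 * when a lane saturates and is otherwise left unchanged. */
static uint64_t dkhm8_ref(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int8_t ea = (int8_t)(a >> (8 * x));
        int8_t eb = (int8_t)(b >> (8 * x));
        int8_t res;
        if (ea == INT8_MIN && eb == INT8_MIN) {
            /* -1.0 * -1.0 does not fit in Q7: saturate to 0x7F */
            res = INT8_MAX;
            *ov = 1;
        } else {
            /* Q14 product reduced back to Q7 */
            res = (int8_t)((ea * eb) >> 7);
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```

For example, multiplying the lanes 0x40 by 0x40 (0.5 x 0.5) yields 0x20 (0.25), while 0x80 by 0x80 saturates to 0x7F.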

__STATIC_FORCEINLINE unsigned long long __RV_DKHM16(unsigned long long a, unsigned long long b)
DKHM16 (64-bit SIMD Signed Saturating Q15 Multiply)
Type: SIMD
Syntax:
DKHM16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pairs of registers
Purpose:
Do Q15xQ15 element multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.
Description:
For the DKHM16 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2. The Q30 results are then right-shifted by 15 bits and saturated into Q15 values. The Q15 results are then written into Rd. Saturation happens only when both Q15 inputs of a multiplication are 0x8000: the result is saturated to 0x7FFF and the overflow flag OV is set.
Operations:
op1t = Rs1.H[x+1]; op2t = Rs2.H[x+1]; // top
op1b = Rs1.H[x]; op2b = Rs2.H[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x8000 != aop || 0x8000 != bop) {
    res = (aop s* bop) >> 15;
  } else {
    res = 0x7FFF;
    OV = 1;
  }
}
Rd.W[x/2] = concat(rest, resb);
for RV32: x=0,2
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: unsigned long long type of value stored in b
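A comparable off-target sketch for DKHM16, operating on four Q15 halfword lanes. `dkhm16_ref` and the `ov` out-parameter are hypothetical and only stand in for the hardware OV flag; arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Hypothetical reference model of DKHM16 (not the NMSIS implementation).
 * Each of the 4 halfword lanes is treated as a Q15 value. */
static uint64_t dkhm16_ref(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int16_t ea = (int16_t)(a >> (16 * x));
        int16_t eb = (int16_t)(b >> (16 * x));
        int16_t res;
        if (ea == INT16_MIN && eb == INT16_MIN) {
            /* only 0x8000 * 0x8000 overflows Q15 */
            res = INT16_MAX;
            *ov = 1;
        } else {
            /* Q30 product reduced back to Q15 */
            res = (int16_t)(((int32_t)ea * eb) >> 15);
        }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```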

__STATIC_FORCEINLINE unsigned long long __RV_DKABS8(unsigned long long a)
DKABS8 (64-bit SIMD 8-bit Saturating Absolute)
Type: SIMD
Syntax:
DKABS8 Rd, Rs1 # Rd, Rs1 are all even/odd pairs of registers
Purpose:
Get the absolute value of 8-bit signed integer elements simultaneously.
Description:
This instruction calculates the absolute value of the 8-bit signed integer elements stored in Rs1 and writes the element results to Rd. If an input element is 0x80, this instruction generates 0x7f as the output and sets the OV bit to 1.
Operations:
src = Rs1.B[x];
if (src == 0x80) {
  src = 0x7f;
  OV = 1;
} else if (src[7] == 1) {
  src = -src;
}
Rd.B[x] = src;
for RV32: x=7...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
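As a rough illustration of the saturating absolute value, the following portable C model processes the eight byte lanes one by one. `dkabs8_ref` and `ov` are hypothetical names used only for this sketch; two's-complement narrowing casts are assumed.

```c
#include <stdint.h>

/* Hypothetical reference model of DKABS8 (not the NMSIS implementation). */
static uint64_t dkabs8_ref(uint64_t a, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int8_t src = (int8_t)(a >> (8 * x));
        if (src == INT8_MIN) {
            /* |-128| is not representable in 8 bits: saturate */
            src = INT8_MAX;
            *ov = 1;
        } else if (src < 0) {
            src = (int8_t)-src;
        }
        rd |= (uint64_t)(uint8_t)src << (8 * x);
    }
    return rd;
}
```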

__STATIC_FORCEINLINE unsigned long long __RV_DKABS16(unsigned long long a)
DKABS16 (64-bit SIMD 16-bit Saturating Absolute)
Type: SIMD
Syntax:
DKABS16 Rd, Rs1 # Rd, Rs1 are all even/odd pairs of registers
Purpose:
Get the absolute value of 16-bit signed integer elements simultaneously.
Description:
This instruction calculates the absolute value of the 16-bit signed integer elements stored in Rs1 and writes the element results to Rd. If an input element is 0x8000, this instruction generates 0x7fff as the output and sets the OV bit to 1.
Operations:
src = Rs1.H[x];
if (src == 0x8000) {
  src = 0x7fff;
  OV = 1;
} else if (src[15] == 1) {
  src = -src;
}
Rd.H[x] = src;
for RV32: x=3...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
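The 16-bit variant can be modelled the same way over four halfword lanes. `dkabs16_ref` and `ov` are hypothetical names for this sketch, which again assumes two's-complement narrowing casts.

```c
#include <stdint.h>

/* Hypothetical reference model of DKABS16 (not the NMSIS implementation). */
static uint64_t dkabs16_ref(uint64_t a, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int16_t src = (int16_t)(a >> (16 * x));
        if (src == INT16_MIN) {
            /* |-32768| is not representable in 16 bits: saturate */
            src = INT16_MAX;
            *ov = 1;
        } else if (src < 0) {
            src = (int16_t)-src;
        }
        rd |= (uint64_t)(uint16_t)src << (16 * x);
    }
    return rd;
}
```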

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA8(unsigned long long a, int b)
DKSLRA8 (64-bit SIMD 8-bit Shift Left Logical with Saturation or Shift Right Arithmetic)
Type: SIMD
Syntax:
DKSLRA8 Rd, Rs1, Rs2 # Rd, Rs1 are all even/odd pairs of registers
Purpose:
Do 8-bit element logical left (positive) or arithmetic right (negative) shift operations with Q7 saturation for the left shift.
Description:
The 8-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[3:0]. Rs2[3:0] is in the signed range of [-2^3, 2^3-1]. A positive Rs2[3:0] means logical left shift and a negative Rs2[3:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[3:0]. However, the behavior of Rs2[3:0]==-2^3 (0x8) is defined to be equivalent to the behavior of Rs2[3:0]==-(2^3-1) (0x9). The left-shifted results are saturated to the 8-bit signed integer range of [-2^7, 2^7-1]. If any saturation happens, this instruction sets the OV flag. The value of Rs2[31:4] will not affect this instruction.
Operations:
if (Rs2[3:0] < 0) {
  sa = -Rs2[3:0];
  sa = (sa == 8)? 7 : sa;
  Rd.B[x] = SE8(Rs1.B[x][7:sa]);
} else {
  sa = Rs2[2:0];
  res[(7+sa):0] = Rs1.B[x] <<(logic) sa;
  if (res > (2^7)-1) {
    res[7:0] = 0x7f; OV = 1;
  } else if (res < -2^7) {
    res[7:0] = 0x80; OV = 1;
  }
  Rd.B[x] = res[7:0];
}
for RV32: x=7...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: int type of value stored in b
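The signed shift-amount decoding is the subtle part of DKSLRA8, so a small portable model may help. It sign-extends the low 4 bits of `b`, maps -8 to a right shift by 7, and saturates left shifts. `dkslra8_ref` and `ov` are hypothetical names; arithmetic right shift of negative values is assumed, and the left shift is written as a multiplication to avoid shifting a negative value in C.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSLRA8 (not the NMSIS implementation). */
static uint64_t dkslra8_ref(uint64_t a, int b, int *ov)
{
    /* Sign-extend Rs2[3:0] into the range [-8, 7]; upper bits are ignored. */
    int amt = b & 0xF;
    if (amt & 0x8)
        amt -= 16;

    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int src = (int8_t)(a >> (8 * x));
        int res;
        if (amt < 0) {
            int sa = -amt;
            if (sa == 8)
                sa = 7;            /* 0x8 behaves like 0x9 */
            res = src >> sa;       /* arithmetic right shift */
        } else {
            res = src * (1 << amt);
            if (res > INT8_MAX) {
                res = INT8_MAX;
                *ov = 1;
            } else if (res < INT8_MIN) {
                res = INT8_MIN;
                *ov = 1;
            }
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```

With this model, a shift amount of 1 turns the lane 0x40 into the saturated value 0x7F, while a shift amount of -2 turns it into 0x10.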

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA16(unsigned long long a, int b)
DKSLRA16 (64-bit SIMD 16-bit Shift Left Logical with Saturation or Shift Right Arithmetic)
Type: SIMD
Syntax:
DKSLRA16 Rd, Rs1, Rs2 # Rd, Rs1 are all even/odd pairs of registers
Purpose:
Do 16-bit element logical left (positive) or arithmetic right (negative) shift operations with Q15 saturation for the left shift.
Description:
The 16-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[4:0]. Rs2[4:0] is in the signed range of [-2^4, 2^4-1]. A positive Rs2[4:0] means logical left shift and a negative Rs2[4:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[4:0]. However, the behavior of Rs2[4:0]==-2^4 (0x10) is defined to be equivalent to the behavior of Rs2[4:0]==-(2^4-1) (0x11). The left-shifted results are saturated to the 16-bit signed integer range of [-2^15, 2^15-1]. After the shift and saturation, the final results are written to Rd. If any saturation happens, this instruction sets the OV flag. The value of Rs2[31:5] will not affect this instruction.
Operations:
if (Rs2[4:0] < 0) {
  sa = -Rs2[4:0];
  sa = (sa == 16)? 15 : sa;
  Rd.H[x] = SE16(Rs1.H[x][15:sa]);
} else {
  sa = Rs2[3:0];
  res[(15+sa):0] = Rs1.H[x] <<(logic) sa;
  if (res > (2^15)-1) {
    res[15:0] = 0x7fff; OV = 1;
  } else if (res < -2^15) {
    res[15:0] = 0x8000; OV = 1;
  }
  Rd.H[x] = res[15:0];
}
for RV32: x=3...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: int type of value stored in b
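A sketch of the same decoding for DKSLRA16, where the shift amount is the sign-extended 5-bit field Rs2[4:0] and -16 is treated as a right shift by 15. `dkslra16_ref` and `ov` are hypothetical names; arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSLRA16 (not the NMSIS implementation). */
static uint64_t dkslra16_ref(uint64_t a, int b, int *ov)
{
    /* Sign-extend Rs2[4:0] into the range [-16, 15]; upper bits are ignored. */
    int amt = b & 0x1F;
    if (amt & 0x10)
        amt -= 32;

    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int32_t src = (int16_t)(a >> (16 * x));
        int32_t res;
        if (amt < 0) {
            int sa = -amt;
            if (sa == 16)
                sa = 15;           /* 0x10 behaves like 0x11 */
            res = src >> sa;       /* arithmetic right shift */
        } else {
            /* multiply instead of shifting a possibly negative value */
            res = src * ((int32_t)1 << amt);
            if (res > INT16_MAX) {
                res = INT16_MAX;
                *ov = 1;
            } else if (res < INT16_MIN) {
                res = INT16_MIN;
                *ov = 1;
            }
        }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```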

__STATIC_FORCEINLINE unsigned long long __RV_DKADD8(unsigned long long a, unsigned long long b)
DKADD8 (64-bit SIMD 8-bit Signed Saturating Addition)
Type: SIMD
Syntax:
DKADD8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pairs of registers
Purpose:
Do 8-bit signed integer element saturating additions simultaneously.
Description:
This instruction adds the 8-bit signed integer elements in Rs1 to the 8-bit signed integer elements in Rs2. If any of the results are beyond the Q7 number range (-2^7 <= Q7 <= 2^7-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.B[x] + Rs2.B[x];
if (res[x] > 127) {
  res[x] = 127; OV = 1;
} else if (res[x] < -128) {
  res[x] = -128; OV = 1;
}
Rd.B[x] = res[x];
for RV32: x=7...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: unsigned long long type of value stored in b
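For checking expected values off-target, the saturating byte addition can be sketched in portable C. `dkadd8_ref` and `ov` are hypothetical names, not NMSIS APIs; two's-complement narrowing casts are assumed.

```c
#include <stdint.h>

/* Hypothetical reference model of DKADD8 (not the NMSIS implementation). */
static uint64_t dkadd8_ref(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        /* widen to int so the sum cannot overflow before clamping */
        int res = (int8_t)(a >> (8 * x)) + (int8_t)(b >> (8 * x));
        if (res > INT8_MAX) {
            res = INT8_MAX;
            *ov = 1;
        } else if (res < INT8_MIN) {
            res = INT8_MIN;
            *ov = 1;
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```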

__STATIC_FORCEINLINE unsigned long long __RV_DKADD16(unsigned long long a, unsigned long long b)
DKADD16 (64-bit SIMD 16-bit Signed Saturating Addition)
Type: SIMD
Syntax:
DKADD16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pairs of registers
Purpose:
Do 16-bit signed integer element saturating additions simultaneously.
Description:
This instruction adds the 16-bit signed integer elements in Rs1 to the 16-bit signed integer elements in Rs2. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.H[x] + Rs2.H[x];
if (res[x] > 32767) {
  res[x] = 32767; OV = 1;
} else if (res[x] < -32768) {
  res[x] = -32768; OV = 1;
}
Rd.H[x] = res[x];
for RV32: x=3...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: unsigned long long type of value stored in b
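A halfword-lane sketch of the same saturating addition. `dkadd16_ref` and `ov` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical reference model of DKADD16 (not the NMSIS implementation). */
static uint64_t dkadd16_ref(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        /* 32-bit intermediate holds any sum of two 16-bit values */
        int32_t res = (int16_t)(a >> (16 * x)) + (int16_t)(b >> (16 * x));
        if (res > INT16_MAX) {
            res = INT16_MAX;
            *ov = 1;
        } else if (res < INT16_MIN) {
            res = INT16_MIN;
            *ov = 1;
        }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```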

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB8(unsigned long long a, unsigned long long b)
DKSUB8 (64-bit SIMD 8-bit Signed Saturating Subtraction)
Type: SIMD
Syntax:
DKSUB8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pairs of registers
Purpose:
Do 8-bit signed integer element saturating subtractions simultaneously.
Description:
This instruction subtracts the 8-bit signed integer elements in Rs2 from the 8-bit signed integer elements in Rs1. If any of the results are beyond the Q7 number range (-2^7 <= Q7 <= 2^7-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.B[x] - Rs2.B[x];
if (res[x] > (2^7)-1) {
  res[x] = (2^7)-1; OV = 1;
} else if (res[x] < -2^7) {
  res[x] = -2^7; OV = 1;
}
Rd.B[x] = res[x];
for RV32: x=7...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: unsigned long long type of value stored in b
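The subtraction can be modelled with the same clamp, keeping in mind that the elements of `b` (Rs2) are subtracted from those of `a` (Rs1). `dksub8_ref` and `ov` are hypothetical names for this sketch.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSUB8 (not the NMSIS implementation). */
static uint64_t dksub8_ref(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        /* Rs1 lane minus Rs2 lane, computed in int before clamping */
        int res = (int8_t)(a >> (8 * x)) - (int8_t)(b >> (8 * x));
        if (res > INT8_MAX) {
            res = INT8_MAX;
            *ov = 1;
        } else if (res < INT8_MIN) {
            res = INT8_MIN;
            *ov = 1;
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```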

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB16(unsigned long long a, unsigned long long b)
DKSUB16 (64-bit SIMD 16-bit Signed Saturating Subtraction)
Type: SIMD
Syntax:
DKSUB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pairs of registers
Purpose:
Do 16-bit signed integer element saturating subtractions simultaneously.
Description:
This instruction subtracts the 16-bit signed integer elements in Rs2 from the 16-bit signed integer elements in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.H[x] - Rs2.H[x];
if (res[x] > (2^15)-1) {
  res[x] = (2^15)-1; OV = 1;
} else if (res[x] < -2^15) {
  res[x] = -2^15; OV = 1;
}
Rd.H[x] = res[x];
for RV32: x=3...0
Return
value stored in unsigned long long type
Parameters
[in] a: unsigned long long type of value stored in a
[in] b: unsigned long long type of value stored in b
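A halfword-lane sketch of the saturating subtraction. `dksub16_ref` and `ov` are hypothetical names, not part of NMSIS.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSUB16 (not the NMSIS implementation). */
static uint64_t dksub16_ref(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        /* Rs1 lane minus Rs2 lane, computed in 32 bits before clamping */
        int32_t res = (int16_t)(a >> (16 * x)) - (int16_t)(b >> (16 * x));
        if (res > INT16_MAX) {
            res = INT16_MAX;
            *ov = 1;
        } else if (res < INT16_MIN) {
            res = INT16_MIN;
            *ov = 1;
        }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```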

__STATIC_FORCEINLINE unsigned long __RV_EXPD80(unsigned long a)
EXPD80 (Expand and Copy Byte 0 to 32-bit)
Type: DSP
Syntax:
EXPD80 Rd, Rs1
Purpose:
Copy byte 0 of a 32-bit chunk into all 4 bytes of a register.
Description:
Moves Rs1.B[0][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0]
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[0][7:0], Rs1.B[0][7:0], Rs1.B[0][7:0], Rs1.B[0][7:0]);
for RV32: x=0
Return
value stored in unsigned long type
Parameters
[in] a: unsigned long type of value stored in a
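The byte broadcast can be illustrated with a one-line portable model. `expd80_ref` is a hypothetical name; the sketch uses `uint32_t` on the assumption that `unsigned long` is 32 bits wide on RV32.

```c
#include <stdint.h>

/* Hypothetical reference model of EXPD80 (not the NMSIS implementation). */
static uint32_t expd80_ref(uint32_t a)
{
    uint32_t byte0 = a & 0xFFu;
    /* multiplying by 0x01010101 replicates the byte into all four positions */
    return byte0 * 0x01010101u;
}
```

For an input of 0x12345678 this model returns 0x78787878.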

__STATIC_FORCEINLINE unsigned long __RV_EXPD81(unsigned long a)
EXPD81 (Expand and Copy Byte 1 to 32-bit)
Type: DSP
Syntax:
EXPD81 Rd, Rs1
Purpose:
Copy byte 1 of a 32-bit chunk into all 4 bytes of a register.
Description:
Moves Rs1.B[1][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0]
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[1][7:0], Rs1.B[1][7:0], Rs1.B[1][7:0], Rs1.B[1][7:0]);
for RV32: x=0
Return
value stored in unsigned long type
Parameters
[in] a: unsigned long type of value stored in a

__STATIC_FORCEINLINE unsigned long __RV_EXPD82(unsigned long a)
EXPD82 (Expand and Copy Byte 2 to 32-bit)
Type: DSP
Syntax:
EXPD82 Rd, Rs1
Purpose:
Copy byte 2 of a 32-bit chunk into all 4 bytes of a register.
Description:
Moves Rs1.B[2][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0]
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[2][7:0], Rs1.B[2][7:0], Rs1.B[2][7:0], Rs1.B[2][7:0]);
for RV32: x=0
Return
value stored in unsigned long type
Parameters
[in] a: unsigned long type of value stored in a

__STATIC_FORCEINLINE unsigned long __RV_EXPD83(unsigned long a)
EXPD83 (Expand and Copy Byte 3 to 32-bit)
Type: DSP
Syntax:
EXPD83 Rd, Rs1
Purpose:
Copy byte 3 of a 32-bit chunk into all 4 bytes of a register.
Description:
Moves Rs1.B[3][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0]
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[3][7:0], Rs1.B[3][7:0], Rs1.B[3][7:0], Rs1.B[3][7:0]);
for RV32: x=0
Return
value stored in unsigned long type
Parameters
[in] a: unsigned long type of value stored in a
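The four EXPD8x instructions differ only in which source byte they replicate, so a single parameterized model can cover all of them when checking values off-target. `expd8n_ref` and its `n` argument are hypothetical and do not correspond to any NMSIS intrinsic; `uint32_t` is used on the assumption that `unsigned long` is 32 bits wide on RV32.

```c
#include <stdint.h>

/* Hypothetical model: n = 0..3 mirrors EXPD80..EXPD83 respectively. */
static uint32_t expd8n_ref(uint32_t a, unsigned n)
{
    /* select byte n of the source word (n is reduced modulo 4) */
    uint32_t byte = (a >> (8u * (n & 3u))) & 0xFFu;
    /* replicate it into all four byte positions */
    return byte * 0x01010101u;
}
```

With the input 0x12345678, `n` values 0 through 3 give 0x78787878, 0x56565656, 0x34343434 and 0x12121212.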
