Nuclei Customized Default DSP Instructions
 __STATIC_FORCEINLINE unsigned long __RV_EXPD80 (unsigned long a)
 __STATIC_FORCEINLINE unsigned long __RV_EXPD81 (unsigned long a)
 __STATIC_FORCEINLINE unsigned long __RV_EXPD82 (unsigned long a)
 __STATIC_FORCEINLINE unsigned long __RV_EXPD83 (unsigned long a)
 group NMSIS_Core_DSP_Intrinsic_NUCLEI_Default
(RV32 & RV64) Nuclei Customized DSP Instructions
These are Nuclei customized DSP instructions for both RV32 and RV64.
Functions
 __STATIC_FORCEINLINE unsigned long __RV_EXPD80 (unsigned long a)
EXPD80 (Expand and Copy Byte 0 to 32-bit (RV32) or 64-bit (RV64))
Type: DSP
Syntax:
EXPD80 Rd, Rs1
Purpose:
On RV32, copy one 8-bit value from a 32-bit chunk into all 4 bytes of a register. On RV64, copy one 8-bit value from a 64-bit chunk into all 8 bytes of a register.
Description:
Moves Rs1.B[0][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[0][7:0], Rs1.B[0][7:0], Rs1.B[0][7:0], Rs1.B[0][7:0]); for RV32: x=0
 Parameters
a – [in] unsigned long type of value stored in a
 Returns
value stored in unsigned long type
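The byte broadcast described above can be sketched in portable C. This is only a reference model for checking expected values on a host machine, not the NMSIS implementation; the names `ref_expd80_rv32` and `ref_expd80_rv64` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical host-side model of EXPD80 on a 32-bit register:
 * byte 0 of the input is replicated into all four byte lanes. */
static uint32_t ref_expd80_rv32(uint32_t a)
{
    uint32_t b0 = a & 0xFFu;        /* Rs1.B[0] */
    return b0 * 0x01010101u;        /* CONCAT(b0, b0, b0, b0) */
}

/* Same idea for a 64-bit register: byte 0 fills all eight lanes. */
static uint64_t ref_expd80_rv64(uint64_t a)
{
    uint64_t b0 = a & 0xFFu;
    return b0 * UINT64_C(0x0101010101010101);
}
```

Multiplying the byte by a constant with 0x01 in every lane is one common way to write the CONCAT in C; shifts and ORs would work equally well.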
 __STATIC_FORCEINLINE unsigned long __RV_EXPD81 (unsigned long a)
EXPD81 (Expand and Copy Byte 1 to 32-bit (RV32) or 64-bit (RV64))
Type: DSP
Syntax:
EXPD81 Rd, Rs1
Purpose:
Copy 8-bit data from 32-bit chunks into 4 bytes in a register.
Description:
Moves Rs1.B[1][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[1][7:0], Rs1.B[1][7:0], Rs1.B[1][7:0], Rs1.B[1][7:0]); for RV32: x=0
 Parameters
a – [in] unsigned long type of value stored in a
 Returns
value stored in unsigned long type
 __STATIC_FORCEINLINE unsigned long __RV_EXPD82 (unsigned long a)
EXPD82 (Expand and Copy Byte 2 to 32-bit (RV32) or 64-bit (RV64))
Type: DSP
Syntax:
EXPD82 Rd, Rs1
Purpose:
Copy 8-bit data from 32-bit chunks into 4 bytes in a register.
Description:
Moves Rs1.B[2][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[2][7:0], Rs1.B[2][7:0], Rs1.B[2][7:0], Rs1.B[2][7:0]); for RV32: x=0
 Parameters
a – [in] unsigned long type of value stored in a
 Returns
value stored in unsigned long type
 __STATIC_FORCEINLINE unsigned long __RV_EXPD83 (unsigned long a)
EXPD83 (Expand and Copy Byte 3 to 32-bit (RV32) or 64-bit (RV64))
Type: DSP
Syntax:
EXPD83 Rd, Rs1
Purpose:
Copy 8-bit data from 32-bit chunks into 4 bytes in a register.
Description:
Moves Rs1.B[3][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.B[3][7:0], Rs1.B[3][7:0], Rs1.B[3][7:0], Rs1.B[3][7:0]); for RV32: x=0
 Parameters
a – [in] unsigned long type of value stored in a
 Returns
value stored in unsigned long type
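The four EXPD8x instructions differ only in which source byte they broadcast. A minimal sketch of that relationship for the 32-bit case follows; the function name `ref_expd8n_rv32` and its byte-index parameter `n` are hypothetical and are not part of NMSIS.

```c
#include <stdint.h>

/* Hypothetical model covering EXPD80..EXPD83 on a 32-bit register.
 * n (0..3) selects which source byte Rs1.B[n] is broadcast. */
static uint32_t ref_expd8n_rv32(uint32_t a, unsigned n)
{
    uint32_t byte = (a >> (8u * (n & 3u))) & 0xFFu;  /* Rs1.B[n] */
    return byte * 0x01010101u;                       /* fill all 4 lanes */
}
```

With `a = 0x12345678`, `n = 3` models EXPD83 and yields `0x12121212`.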
Nuclei Customized N1/N2/N3 DSP Instructions
 __STATIC_FORCEINLINE unsigned long long __RV_DKHM8 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKHM16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKABS8 (unsigned long long a)
 __STATIC_FORCEINLINE unsigned long long __RV_DKABS16 (unsigned long long a)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSLRA8 (unsigned long long a, int b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSLRA16 (unsigned long long a, int b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKADD8 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKADD16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSUB8 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSUB16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKHMX8 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKHMX16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMMUL (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMMUL_U (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL_U (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKABS32 (unsigned long long a)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSLRA32 (unsigned long long a, int b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKADD32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSUB32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRADD16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSUB16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRADD32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSUB32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DMSR16 (unsigned long a, unsigned long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DMSR17 (unsigned long a, unsigned long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DMSR33 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DMXSR33 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long __RV_DREDAS16 (unsigned long long a)
 __STATIC_FORCEINLINE unsigned long __RV_DREDSA16 (unsigned long long a)
 __STATIC_FORCEINLINE int16_t __RV_DKCLIP64 (unsigned long long a)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMDA (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMXDA (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMDRS (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMXDS (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMBB32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMBB32_SRA14 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMBB32_SRA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMBT32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMBT32_SRA14 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMBT32_SRA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMTT32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMTT32_SRA14 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMTT32_SRA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBB32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBT32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTT32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTB32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTB16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBB16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBT16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTT16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSRA16 (unsigned long long a, unsigned long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DADD16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DADD32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMBB16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMBT16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMTT16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRSA16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRSA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRAS16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRAS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRAS16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRSA16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRSUB16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSTSA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSTAS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRSA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRAS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DCRSA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DCRAS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSTSA16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSTAS16 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DRSUB32 (unsigned long long a, unsigned long long b)

__RV_DSCLIP8(a, b)

__RV_DSCLIP16(a, b)

__RV_DSCLIP32(a, b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMAC (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMAC_U (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMSB (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMSB_U (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMADA (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMAXDA (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMADS (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMADRS (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMAXDS (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMSDA (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKMSXDA (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMAQA (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DSMAQA_SU (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DUMAQA (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMDA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMXDA32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMADA32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMAXDA32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMADS32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMADRS32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMAXDS32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMSDA32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMSXDA32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMDS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMDRS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMXDS32 (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALDA (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALXDA (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALDS (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALDRS (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALXDS (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMSLDA (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMSLXDA (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DDSMAQA (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DDSMAQASU (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DDUMAQA (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DSMA32_U (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DSMXS32_U (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DSMXA32_U (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DSMS32_U (unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DSMADA16 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DSMAXDA16 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE unsigned long long __RV_DKSMS32_U (unsigned long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long __RV_DMADA32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALBB (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALBT (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DSMALTT (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMABB32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMABT32 (long long t, unsigned long long a, unsigned long long b)
 __STATIC_FORCEINLINE long long __RV_DKMATT32 (long long t, unsigned long long a, unsigned long long b)
 group NMSIS_Core_DSP_Intrinsic_NUCLEI_N1
(RV32 only) Nuclei Customized N1 DSP Instructions
These are Nuclei customized N1 DSP instructions, available only on RV32.
Functions
 __STATIC_FORCEINLINE unsigned long long __RV_DKHM8 (unsigned long long a, unsigned long long b)
DKHM8 (64-bit SIMD Signed Saturating Q7 Multiply)
Type: SIMD
Syntax:
DKHM8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do Q7xQ7 element multiplications simultaneously. The Q14 results are then reduced to Q7 numbers again.
Description:
For the DKHM8 instruction, multiply the top 8-bit Q7 content of 16-bit chunks in Rs1 with the top 8-bit Q7 content of 16-bit chunks in Rs2. At the same time, multiply the bottom 8-bit Q7 content of 16-bit chunks in Rs1 with the bottom 8-bit Q7 content of 16-bit chunks in Rs2. The Q14 results are then right-shifted 7 bits and saturated into Q7 values. The Q7 results are then written into Rd. When both Q7 inputs of a multiplication are 0x80, saturation happens: the result is saturated to 0x7F and the overflow flag OV is set.
Operations:
op1t = Rs1.B[x+1]; op2t = Rs2.B[x+1]; // top
op1b = Rs1.B[x]; op2b = Rs2.B[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x80 != aop || 0x80 != bop) {
    res = (aop s* bop) >> 7;
  } else {
    res = 0x7F;
    OV = 1;
  }
}
Rd.H[x/2] = concat(rest, resb);
for RV32: x=0,2,4,6
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
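The Q7 multiply and its single saturating case can be modelled in portable C as below. This is a host-side sketch under stated assumptions, not the intrinsic itself: the names `ref_khm8_lane` and `ref_dkhm8` are hypothetical, the `ov` pointer stands in for the OV flag, and `>>` on a negative `int` is assumed to be an arithmetic shift (as on GCC and Clang).

```c
#include <stdint.h>

/* Hypothetical model of one Q7 x Q7 lane: (a * b) >> 7, with the only
 * overflowing case (0x80 * 0x80) clamped to 0x7F. */
static int8_t ref_khm8_lane(int8_t a, int8_t b, int *ov)
{
    if (a == INT8_MIN && b == INT8_MIN) {
        *ov = 1;                 /* stands in for the OV flag */
        return INT8_MAX;
    }
    /* Assumes arithmetic right shift for negative products. */
    return (int8_t)(((int)a * (int)b) >> 7);
}

/* Apply the lane model to all eight byte lanes of a 64-bit operand pair. */
static uint64_t ref_dkhm8(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int8_t ea = (int8_t)(uint8_t)(a >> (8 * x));
        int8_t eb = (int8_t)(uint8_t)(b >> (8 * x));
        uint8_t r = (uint8_t)ref_khm8_lane(ea, eb, ov);
        rd |= (uint64_t)r << (8 * x);
    }
    return rd;
}
```

Because top multiplies top and bottom multiplies bottom, the operation reduces to an independent multiply per byte lane, which is why the model loops over all eight bytes.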
 __STATIC_FORCEINLINE unsigned long long __RV_DKHM16 (unsigned long long a, unsigned long long b)
DKHM16 (64-bit SIMD Signed Saturating Q15 Multiply)
Type: SIMD
Syntax:
DKHM16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do Q15xQ15 element multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.
Description:
For the DKHM16 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2. The Q30 results are then right-shifted 15 bits and saturated into Q15 values. The Q15 results are then written into Rd. When both Q15 inputs of a multiplication are 0x8000, saturation happens: the result is saturated to 0x7FFF and the overflow flag OV is set.
Operations:
op1t = Rs1.H[x+1]; op2t = Rs2.H[x+1]; // top
op1b = Rs1.H[x]; op2b = Rs2.H[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x8000 != aop || 0x8000 != bop) {
    res = (aop s* bop) >> 15;
  } else {
    res = 0x7FFF;
    OV = 1;
  }
}
Rd.W[x/2] = concat(rest, resb);
for RV32: x=0,2
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
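A host-side sketch of the Q15 variant follows, useful for generating expected values in tests. The name `ref_dkhm16` is hypothetical, the `ov` pointer stands in for the OV flag, and the code assumes `>>` on a negative value is an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical model of DKHM16: per 16-bit lane, (a * b) >> 15 with
 * 0x8000 * 0x8000 saturated to 0x7FFF. */
static uint64_t ref_dkhm16(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int16_t ea = (int16_t)(uint16_t)(a >> (16 * x));
        int16_t eb = (int16_t)(uint16_t)(b >> (16 * x));
        int16_t r;
        if (ea == INT16_MIN && eb == INT16_MIN) {
            r = INT16_MAX;       /* the only case that overflows Q15 */
            *ov = 1;             /* stands in for the OV flag */
        } else {
            /* Assumes arithmetic right shift for negative products. */
            r = (int16_t)(((int32_t)ea * (int32_t)eb) >> 15);
        }
        rd |= (uint64_t)(uint16_t)r << (16 * x);
    }
    return rd;
}
```

For example, 0x4000 (0.5 in Q15) times 0x2000 (0.25) gives 0x1000 (0.125).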
 __STATIC_FORCEINLINE unsigned long long __RV_DKABS8 (unsigned long long a)
DKABS8 (64-bit SIMD 8-bit Saturating Absolute)
Type: SIMD
Syntax:
DKABS8 Rd, Rs1 # Rd, Rs1 are all even/odd pair of registers
Purpose:
Get the absolute value of 8-bit signed integer elements simultaneously.
Description:
This instruction calculates the absolute value of 8-bit signed integer elements stored in Rs1 and writes the element results to Rd. If the input number is 0x80, this instruction generates 0x7f as the output and sets the OV bit to 1.
Operations:
src = Rs1.B[x];
if (src == 0x80) {
  src = 0x7f;
  OV = 1;
} else if (src[7] == 1) {
  src = -src;
}
Rd.B[x] = src;
for RV32: x=7...0
 Parameters
a – [in] unsigned long long type of value stored in a
 Returns
value stored in unsigned long long type
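The saturating absolute value can be checked against a small portable model. The sketch below is an assumption-labelled reference, not NMSIS code: `ref_dkabs8` is a hypothetical name and `ov` stands in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical model of DKABS8: per-byte absolute value, with the
 * unrepresentable case |-128| saturated to 127. */
static uint64_t ref_dkabs8(uint64_t a, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int8_t src = (int8_t)(uint8_t)(a >> (8 * x));
        int8_t res;
        if (src == INT8_MIN) {
            res = INT8_MAX;      /* 0x80 -> 0x7f */
            *ov = 1;             /* stands in for the OV flag */
        } else if (src < 0) {
            res = (int8_t)-src;
        } else {
            res = src;
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```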
 __STATIC_FORCEINLINE unsigned long long __RV_DKABS16 (unsigned long long a)
DKABS16 (64-bit SIMD 16-bit Saturating Absolute)
Type: SIMD
Syntax:
DKABS16 Rd, Rs1 # Rd, Rs1 are all even/odd pair of registers
Purpose:
Get the absolute value of 16-bit signed integer elements simultaneously.
Description:
This instruction calculates the absolute value of 16-bit signed integer elements stored in Rs1 and writes the element results to Rd. If the input number is 0x8000, this instruction generates 0x7fff as the output and sets the OV bit to 1.
Operations:
src = Rs1.H[x];
if (src == 0x8000) {
  src = 0x7fff;
  OV = 1;
} else if (src[15] == 1) {
  src = -src;
}
Rd.H[x] = src;
for RV32: x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSLRA8 (unsigned long long a, int b)
DKSLRA8 (64-bit SIMD 8-bit Shift Left Logical with Saturation or Shift Right Arithmetic)
Type: SIMD
Syntax:
DKSLRA8 Rd, Rs1, Rs2 # Rd, Rs1 are all even/odd pair of registers
Purpose:
Do 8-bit element logical left (positive) or arithmetic right (negative) shift operations with Q7 saturation for the left shift.
Description:
The 8-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[3:0]. Rs2[3:0] is in the signed range of [-2^3, 2^3-1]. A positive Rs2[3:0] means logical left shift and a negative Rs2[3:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[3:0]. However, the behavior of Rs2[3:0]==-2^3 (0x8) is defined to be equivalent to the behavior of Rs2[3:0]==-(2^3-1) (0x9). The left-shifted results are saturated to the 8-bit signed integer range of [-2^7, 2^7-1]. If any saturation happens, this instruction sets the OV flag. The value of Rs2[31:4] will not affect this instruction.
Operations:
if (Rs2[3:0] < 0) {
  sa = -Rs2[3:0];
  sa = (sa == 8) ? 7 : sa;
  Rd.B[x] = SE8(Rs1.B[x][7:sa]);
} else {
  sa = Rs2[2:0];
  res[(7+sa):0] = Rs1.B[x] <<(logic) sa;
  if (res > (2^7)-1) {
    res[7:0] = 0x7f; OV = 1;
  } else if (res < -2^7) {
    res[7:0] = 0x80; OV = 1;
  }
  Rd.B[x] = res[7:0];
}
for RV32: x=7...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] int type of value stored in b
 Returns
value stored in unsigned long long type
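The sign-dependent shift is the least obvious part of this instruction, so a portable model may help. The sketch below is a host-side approximation under stated assumptions: `ref_dkslra8` is a hypothetical name, `ov` stands in for the OV flag, and `>>` on a negative `int` is assumed to be an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical model of DKSLRA8. Only the low 4 bits of b are used,
 * interpreted as a signed value in [-8, 7]. */
static uint64_t ref_dkslra8(uint64_t a, int b, int *ov)
{
    int amt = (int)((unsigned)b & 0xFu);   /* Rs2[3:0] */
    if (amt & 0x8) {
        amt -= 16;                         /* sign-extend the 4-bit field */
    }
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int src = (int8_t)(uint8_t)(a >> (8 * x));
        int res;
        if (amt < 0) {
            int sa = -amt;
            if (sa == 8) {
                sa = 7;                    /* -8 behaves like -7 */
            }
            res = src >> sa;               /* assumes arithmetic shift */
        } else {
            res = src * (1 << amt);        /* left shift without UB on negatives */
            if (res > INT8_MAX) {
                res = INT8_MAX; *ov = 1;   /* stands in for the OV flag */
            } else if (res < INT8_MIN) {
                res = INT8_MIN; *ov = 1;
            }
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```

With lanes 0x40 (64) and 0xF0 (-16), a shift of 1 saturates the first lane to 0x7F and doubles the second to 0xE0, while a shift of -2 divides both by four.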
 __STATIC_FORCEINLINE unsigned long long __RV_DKSLRA16 (unsigned long long a, int b)
DKSLRA16 (64-bit SIMD 16-bit Shift Left Logical with Saturation or Shift Right Arithmetic)
Type: SIMD
Syntax:
DKSLRA16 Rd, Rs1, Rs2 # Rd, Rs1 are all even/odd pair of registers
Purpose:
Do 16-bit element logical left (positive) or arithmetic right (negative) shift operations with Q15 saturation for the left shift.
Description:
The 16-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[4:0]. Rs2[4:0] is in the signed range of [-2^4, 2^4-1]. A positive Rs2[4:0] means logical left shift and a negative Rs2[4:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[4:0]. However, the behavior of Rs2[4:0]==-2^4 (0x10) is defined to be equivalent to the behavior of Rs2[4:0]==-(2^4-1) (0x11). The left-shifted results are saturated to the 16-bit signed integer range of [-2^15, 2^15-1]. After the shift, saturation, or rounding, the final results are written to Rd. If any saturation happens, this instruction sets the OV flag. The value of Rs2[31:5] will not affect this instruction.
Operations:
if (Rs2[4:0] < 0) {
  sa = -Rs2[4:0];
  sa = (sa == 16) ? 15 : sa;
  Rd.H[x] = SE16(Rs1.H[x][15:sa]);
} else {
  sa = Rs2[3:0];
  res[(15+sa):0] = Rs1.H[x] <<(logic) sa;
  if (res > (2^15)-1) {
    res[15:0] = 0x7fff; OV = 1;
  } else if (res < -2^15) {
    res[15:0] = 0x8000; OV = 1;
  }
  Rd.H[x] = res[15:0];
}
for RV32: x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] int type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKADD8 (unsigned long long a, unsigned long long b)
DKADD8 (64-bit SIMD 8-bit Signed Saturating Addition)
Type: SIMD
Syntax:
DKADD8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do 8-bit signed integer element saturating additions simultaneously.
Description:
This instruction adds the 8-bit signed integer elements in Rs1 with the 8-bit signed integer elements in Rs2. If any of the results are beyond the Q7 number range (-2^7 <= Q7 <= 2^7-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.B[x] + Rs2.B[x];
if (res[x] > 127) {
  res[x] = 127; OV = 1;
} else if (res[x] < -128) {
  res[x] = -128; OV = 1;
}
Rd.B[x] = res[x];
for RV32: x=7...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
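A portable sketch of the lane-wise saturating add is shown below for cross-checking results on a host. `ref_dkadd8` is a hypothetical name, and `ov` stands in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical model of DKADD8: add eight signed byte lanes, clamping
 * each sum to [-128, 127]. */
static uint64_t ref_dkadd8(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int ea = (int8_t)(uint8_t)(a >> (8 * x));
        int eb = (int8_t)(uint8_t)(b >> (8 * x));
        int res = ea + eb;               /* computed in a wider type */
        if (res > INT8_MAX) {
            res = INT8_MAX; *ov = 1;     /* stands in for the OV flag */
        } else if (res < INT8_MIN) {
            res = INT8_MIN; *ov = 1;
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```

Doing the addition in `int` keeps the intermediate sum exact, so the range check sees the true result before clamping.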
 __STATIC_FORCEINLINE unsigned long long __RV_DKADD16 (unsigned long long a, unsigned long long b)
DKADD16 (64-bit SIMD 16-bit Signed Saturating Addition)
Type: SIMD
Syntax:
DKADD16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do 16-bit signed integer element saturating additions simultaneously.
Description:
This instruction adds the 16-bit signed integer elements in Rs1 with the 16-bit signed integer elements in Rs2. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.H[x] + Rs2.H[x];
if (res[x] > 32767) {
  res[x] = 32767; OV = 1;
} else if (res[x] < -32768) {
  res[x] = -32768; OV = 1;
}
Rd.H[x] = res[x];
for RV32: x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSUB8 (unsigned long long a, unsigned long long b)
DKSUB8 (64-bit SIMD 8-bit Signed Saturating Subtraction)
Type: SIMD
Syntax:
DKSUB8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do 8-bit signed integer element saturating subtractions simultaneously.
Description:
This instruction subtracts the 8-bit signed integer elements in Rs2 from the 8-bit signed integer elements in Rs1. If any of the results are beyond the Q7 number range (-2^7 <= Q7 <= 2^7-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.B[x] - Rs2.B[x];
if (res[x] > (2^7)-1) {
  res[x] = (2^7)-1; OV = 1;
} else if (res[x] < -2^7) {
  res[x] = -2^7; OV = 1;
}
Rd.B[x] = res[x];
for RV32: x=7...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSUB16 (unsigned long long a, unsigned long long b)
DKSUB16 (64-bit SIMD 16-bit Signed Saturating Subtraction)
Type: SIMD
Syntax:
DKSUB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do 16-bit signed integer element saturating subtractions simultaneously.
Description:
This instruction subtracts the 16-bit signed integer elements in Rs2 from the 16-bit signed integer elements in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.H[x] - Rs2.H[x];
if (res[x] > (2^15)-1) {
  res[x] = (2^15)-1; OV = 1;
} else if (res[x] < -2^15) {
  res[x] = -2^15; OV = 1;
}
Rd.H[x] = res[x];
for RV32: x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
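The 16-bit saturating subtraction can be sketched the same way as the other lane-wise operations. This is a host-side model only; `ref_dksub16` is a hypothetical name and `ov` stands in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical model of DKSUB16: subtract four signed halfword lanes
 * (Rs1 - Rs2), clamping each difference to [-32768, 32767]. */
static uint64_t ref_dksub16(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int32_t ea = (int16_t)(uint16_t)(a >> (16 * x));
        int32_t eb = (int16_t)(uint16_t)(b >> (16 * x));
        int32_t res = ea - eb;           /* exact in 32 bits */
        if (res > INT16_MAX) {
            res = INT16_MAX; *ov = 1;    /* stands in for the OV flag */
        } else if (res < INT16_MIN) {
            res = INT16_MIN; *ov = 1;
        }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```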
 group NMSIS_Core_DSP_Intrinsic_NUCLEI_N2
(RV32 only) Nuclei Customized N2 DSP Instructions
These are Nuclei customized N2 DSP instructions, available only on RV32.
Defines

__RV_DSCLIP8(a, b)
DSCLIP8 (8-bit Signed Saturation and Clip)
Type: SIMD
Syntax:
DSCLIP8 Rd, Rs1, imm3u[2:0] # Rd, Rs1 are all even/odd pair of registers
Purpose:
Limit the 8-bit signed integer elements of a register into a signed range simultaneously.
Description:
This instruction limits the 8-bit signed integer elements stored in Rs1 into a signed integer range between -2^imm3u and 2^imm3u-1, and writes the limited results to Rd. For example, if imm3u is 3, the 8-bit input values are saturated between 7 and -8. If saturation is performed, the OV bit is set to 1.
Operations:
src = Rs1.B[x];
if (src > (2^imm3u)-1) {
  src = (2^imm3u)-1; OV = 1;
} else if (src < -2^imm3u) {
  src = -2^imm3u; OV = 1;
}
Rd.B[x] = src;
x=7...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
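The clip range for a given immediate can be checked with a short portable model. The sketch below is an assumption-labelled reference, not the NMSIS macro: `ref_dsclip8` is a hypothetical name, the immediate is passed as an ordinary argument, and `ov` stands in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical model of DSCLIP8: clamp each signed byte lane to
 * [-2^imm3u, 2^imm3u - 1]. */
static uint64_t ref_dsclip8(uint64_t a, unsigned imm3u, int *ov)
{
    int hi = (1 << (imm3u & 7u)) - 1;
    int lo = -(1 << (imm3u & 7u));
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int src = (int8_t)(uint8_t)(a >> (8 * x));
        if (src > hi) {
            src = hi; *ov = 1;           /* stands in for the OV flag */
        } else if (src < lo) {
            src = lo; *ov = 1;
        }
        rd |= (uint64_t)(uint8_t)src << (8 * x);
    }
    return rd;
}
```

With `imm3u = 3` the range is [-8, 7], matching the example in the description: 0x7F clips to 0x07 and 0x80 clips to 0xF8.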

__RV_DSCLIP16(a, b)
DSCLIP16 (16-bit Signed Saturation and Clip)
Type: SIMD
Syntax:
DSCLIP16 Rd, Rs1, imm4u[3:0] # Rd, Rs1 are all even/odd pair of registers
Purpose:
Limit the 16-bit signed integer elements of a register into a signed range simultaneously.
Description:
This instruction limits the 16-bit signed integer elements stored in Rs1 into a signed integer range between -2^imm4u and 2^imm4u-1, and writes the limited results to Rd. For example, if imm4u is 3, the 16-bit input values are saturated between 7 and -8. If saturation is performed, the OV bit is set to 1.
Operations:
src = Rs1.H[x];
if (src > (2^imm4u)-1) {
  src = (2^imm4u)-1; OV = 1;
} else if (src < -2^imm4u) {
  src = -2^imm4u; OV = 1;
}
Rd.H[x] = src;
x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type

__RV_DSCLIP32(a, b)
DSCLIP32 (32-bit Signed Saturation and Clip)
Type: SIMD
Syntax:
DSCLIP32 Rd, Rs1, imm5u[4:0] # Rd, Rs1 are all even/odd pair of registers
Purpose:
Limit the 32-bit signed integer elements of a register into a signed range simultaneously.
Description:
This instruction limits the 32-bit signed integer elements stored in Rs1 into a signed integer range between -2^imm5u and 2^imm5u-1, and writes the limited results to Rd. For example, if imm5u is 3, the 32-bit input values are saturated between 7 and -8. If saturation is performed, the OV bit is set to 1.
Operations:
src = Rs1.W[x];
if (src > (2^imm5u)-1) {
  src = (2^imm5u)-1; OV = 1;
} else if (src < -2^imm5u) {
  src = -2^imm5u; OV = 1;
}
Rd.W[x] = src;
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
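For the 32-bit variant the bounds no longer fit in a 32-bit `int` when the immediate is 31, so a model has to compute them in a wider type. The sketch below is a host-side reference under that assumption; `ref_dsclip32` is a hypothetical name and `ov` stands in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical model of DSCLIP32: clamp each signed word lane to
 * [-2^imm5u, 2^imm5u - 1]. Bounds are held in 64 bits so that
 * imm5u = 31 does not overflow. */
static uint64_t ref_dsclip32(uint64_t a, unsigned imm5u, int *ov)
{
    int64_t hi = (INT64_C(1) << (imm5u & 31u)) - 1;
    int64_t lo = -(INT64_C(1) << (imm5u & 31u));
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int64_t src = (int32_t)(uint32_t)(a >> (32 * x));
        if (src > hi) {
            src = hi; *ov = 1;           /* stands in for the OV flag */
        } else if (src < lo) {
            src = lo; *ov = 1;
        }
        rd |= (uint64_t)(uint32_t)src << (32 * x);
    }
    return rd;
}
```

In this model an immediate of 31 covers the full signed 32-bit range, so no lane is ever clipped.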
Functions
 __STATIC_FORCEINLINE unsigned long long __RV_DKHMX8 (unsigned long long a, unsigned long long b)
DKHMX8 (64-bit SIMD Signed Crossed Saturating Q7 Multiply)
Type: SIMD
Syntax:
DKHMX8 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do Q7xQ7 element crossed multiplications simultaneously. The Q14 results are then reduced to Q7 numbers again.
Description:
For the DKHMX8 instruction, multiply the top 8-bit Q7 content of 16-bit chunks in Rs1 with the bottom 8-bit Q7 content of 16-bit chunks in Rs2. At the same time, multiply the bottom 8-bit Q7 content of 16-bit chunks in Rs1 with the top 8-bit Q7 content of 16-bit chunks in Rs2. The Q14 results are then right-shifted 7 bits and saturated into Q7 values. The Q7 results are then written into Rd. When both Q7 inputs of a multiplication are 0x80, saturation happens: the result is saturated to 0x7F and the overflow flag OV is set.
Operations:
op1t = Rs1.B[x+1]; op2t = Rs2.B[x]; // top
op1b = Rs1.B[x]; op2b = Rs2.B[x+1]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x80 != aop || 0x80 != bop) {
    res = (aop s* bop) >> 7;
  } else {
    res = 0x7F;
    OV = 1;
  }
}
Rd.H[x/2] = concat(rest, resb);
for RV32: x=0,2,4,6
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
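The crossed pairing is easy to get backwards, so a portable model is sketched below. It is a host-side reference only: the names `ref_khmx8_lane` and `ref_dkhmx8` are hypothetical, `ov` stands in for the OV flag, and `>>` on a negative `int` is assumed to be an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical Q7 x Q7 lane multiply with 0x80 * 0x80 saturated to 0x7F. */
static int8_t ref_khmx8_lane(int8_t a, int8_t b, int *ov)
{
    if (a == INT8_MIN && b == INT8_MIN) {
        *ov = 1;                 /* stands in for the OV flag */
        return INT8_MAX;
    }
    return (int8_t)(((int)a * (int)b) >> 7);  /* assumes arithmetic shift */
}

/* Hypothetical model of DKHMX8: within each 16-bit chunk, the top byte of
 * a meets the bottom byte of b, and the bottom byte of a meets the top
 * byte of b. */
static uint64_t ref_dkhmx8(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x += 2) {
        int8_t a_bot = (int8_t)(uint8_t)(a >> (8 * x));
        int8_t a_top = (int8_t)(uint8_t)(a >> (8 * (x + 1)));
        int8_t b_bot = (int8_t)(uint8_t)(b >> (8 * x));
        int8_t b_top = (int8_t)(uint8_t)(b >> (8 * (x + 1)));
        uint8_t rest = (uint8_t)ref_khmx8_lane(a_top, b_bot, ov);
        uint8_t resb = (uint8_t)ref_khmx8_lane(a_bot, b_top, ov);
        rd |= (uint64_t)resb << (8 * x);
        rd |= (uint64_t)rest << (8 * (x + 1));
    }
    return rd;
}
```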
 __STATIC_FORCEINLINE unsigned long long __RV_DKHMX16 (unsigned long long a, unsigned long long b)
DKHMX16 (64-bit SIMD Signed Crossed Saturating Q15 Multiply)
Type: SIMD
Syntax:
DKHMX16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose:
Do Q15xQ15 element crossed multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.
Description:
For the DKHMX16 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2. The Q30 results are then right-shifted 15 bits and saturated into Q15 values. The Q15 results are then written into Rd. When both Q15 inputs of a multiplication are 0x8000, saturation happens: the result is saturated to 0x7FFF and the overflow flag OV is set.
Operations:
op1t = Rs1.H[x+1]; op2t = Rs2.H[x]; // top
op1b = Rs1.H[x]; op2b = Rs2.H[x+1]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x8000 != aop || 0x8000 != bop) {
    res = (aop s* bop) >> 15;
  } else {
    res = 0x7FFF;
    OV = 1;
  }
}
Rd.W[x/2] = concat(rest, resb);
for RV32: x=0,2
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
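The crossed Q15 multiply described above can be modelled in portable C for host-side checking. This is only a sketch: `q15_mul_sat` and `ref_dkhmx16` are hypothetical helper names, not part of NMSIS, and the OV flag is modelled as an `int` passed by pointer rather than a CSR bit.

```c
#include <stdint.h>

/* Hypothetical helper: saturating Q15 x Q15 -> Q15 multiply.
 * Sets *ov to 1 only when both inputs are 0x8000 (-1.0).
 * Assumes arithmetic right shift of negative values and modulo
 * narrowing conversions, as GCC and Clang provide. */
static int16_t q15_mul_sat(int16_t a, int16_t b, int *ov)
{
    if (a == INT16_MIN && b == INT16_MIN) {
        *ov = 1;
        return INT16_MAX;                      /* 0x7FFF */
    }
    return (int16_t)(((int32_t)a * (int32_t)b) >> 15);
}

/* Hypothetical reference model of DKHMX16 on 64-bit operands. */
static uint64_t ref_dkhmx16(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x += 2) {
        int16_t a_bot = (int16_t)(uint16_t)(a >> (16 * x));
        int16_t a_top = (int16_t)(uint16_t)(a >> (16 * (x + 1)));
        int16_t b_bot = (int16_t)(uint16_t)(b >> (16 * x));
        int16_t b_top = (int16_t)(uint16_t)(b >> (16 * (x + 1)));
        uint16_t res_t = (uint16_t)q15_mul_sat(a_top, b_bot, ov); /* top x bottom */
        uint16_t res_b = (uint16_t)q15_mul_sat(a_bot, b_top, ov); /* bottom x top */
        rd |= (uint64_t)res_b << (16 * x);
        rd |= (uint64_t)res_t << (16 * (x + 1));
    }
    return rd;
}
```

Comparing such a model against the intrinsic on target hardware is one way to confirm the lane ordering before relying on it.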
 __STATIC_FORCEINLINE unsigned long long __RV_DSMMUL (unsigned long long a, unsigned long long b)
DSMMUL (64bit MSW 32x32 Signed Multiply)
Type: SIMD
Syntax:
DSMMUL Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element signed multiplications simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the 32bit elements of Rs1 with the 32bit elements of Rs2 and writes the most significant 32bit multiplication results to the corresponding 32bit elements of Rd. The 32bit elements of Rs1 and Rs2 are treated as signed integers. The .u form of the instruction rounds up the most significant 32bit of the 64bit multiplication results by adding a 1 to bit 31 of the results.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = (aop s* bop)[63:32];
}
Rd = concat(rest, resb);
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
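As an illustration of the "most significant word" behaviour, the sketch below keeps bits [63:32] of each signed 32x32 product. The function name `ref_dsmmul` is hypothetical and the code is a host-side model, not the NMSIS implementation.

```c
#include <stdint.h>

/* Hypothetical reference model of DSMMUL: per 32-bit lane, keep bits
 * [63:32] of the signed 64-bit product (no rounding). */
static uint64_t ref_dsmmul(uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        /* Reinterpret each lane as a signed 32-bit integer. */
        int32_t op1 = (int32_t)(uint32_t)(a >> (32 * x));
        int32_t op2 = (int32_t)(uint32_t)(b >> (32 * x));
        int64_t prod = (int64_t)op1 * (int64_t)op2;
        /* Shifting the unsigned bit pattern avoids a signed right shift. */
        uint32_t msw = (uint32_t)((uint64_t)prod >> 32);
        rd |= (uint64_t)msw << (32 * x);
    }
    return rd;
}
```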
 __STATIC_FORCEINLINE unsigned long long __RV_DSMMUL_U (unsigned long long a, unsigned long long b)
DSMMUL.U (64bit MSW 32x32 Unsigned Multiply)
Type: SIMD
Syntax:
DSMMUL.U Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element unsigned multiplications simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the 32bit elements of Rs1 with the 32bit elements of Rs2 and writes the most significant 32bit multiplication results to the corresponding 32bit elements of Rd. The 32bit elements of Rs1 and Rs2 are treated as unsigned integers. The .u form of the instruction rounds up the most significant 32bit of the 64bit multiplication results by adding a 1 to bit 31 of the results.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = RUND(aop u* bop)[63:32];
}
Rd = concat(rest, resb);
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL (unsigned long long a, unsigned long long b)
DKWMMUL (64bit MSW 32x32 Signed Multiply & Double)
Type: SIMD
Syntax:
DKWMMUL Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element signed multiplications simultaneously and double. The results are written into Rd.
Description
:
This instruction multiplies the 32bit elements of Rs1 with the 32bit elements of Rs2. It then shifts the multiplication results one bit to the left and takes the most significant 32bit results. If the shifted result is greater than 2^31-1, it is saturated to 2^31-1 and the OV flag is set to 1. The final element result is written to Rd. The 32bit elements of Rs1 and Rs2 are treated as signed integers. The .u form of the instruction additionally rounds up the 64bit multiplication results by adding a 1 to bit 30 before the shift and saturation operations.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = sat.q31((aop s* bop) << 1)[63:32];
}
Rd = concat(rest, resb);
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
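A minimal host-side sketch of the multiply, double, and saturate sequence follows. The names `kwmmul_lane` and `ref_dkwmmul` are hypothetical, and OV is modelled as an `int` flag. The sketch relies on the observation that the doubled product can only exceed the Q31 range when both inputs are 0x8000_0000.

```c
#include <stdint.h>

/* Hypothetical per-lane helper: sat.q31((a * b) << 1)[63:32]. */
static int32_t kwmmul_lane(int32_t a, int32_t b, int *ov)
{
    if (a == INT32_MIN && b == INT32_MIN) {   /* only overflowing case */
        *ov = 1;
        return INT32_MAX;
    }
    int64_t prod = (int64_t)a * (int64_t)b;
    /* Bits [63:32] of (prod << 1) are bits [62:31] of prod. */
    return (int32_t)(uint32_t)((uint64_t)prod >> 31);
}

/* Hypothetical reference model of DKWMMUL on two 32-bit lanes. */
static uint64_t ref_dkwmmul(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int32_t op1 = (int32_t)(uint32_t)(a >> (32 * x));
        int32_t op2 = (int32_t)(uint32_t)(b >> (32 * x));
        rd |= (uint64_t)(uint32_t)kwmmul_lane(op1, op2, ov) << (32 * x);
    }
    return rd;
}
```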
 __STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL_U (unsigned long long a, unsigned long long b)
DKWMMUL.U (64bit MSW 32x32 Unsigned Multiply & Double)
Type: SIMD
Syntax:
DKWMMUL.U Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element unsigned multiplications simultaneously and double. The results are written into Rd.
Description
:
This instruction multiplies the 32bit elements of Rs1 with the 32bit elements of Rs2. It then shifts the multiplication results one bit to the left and takes the most significant 32bit results. If the shifted result is greater than 2^31-1, it is saturated to 2^31-1 and the OV flag is set to 1. The final element result is written to Rd. The 32bit elements of Rs1 and Rs2 are treated as signed integers. The .u form of the instruction additionally rounds up the 64bit multiplication results by adding a 1 to bit 30 before the shift and saturation operations.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = sat.q31(RUND(aop u* bop) << 1)[63:32];
}
Rd = concat(rest, resb);
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKABS32 (unsigned long long a)
DKABS32 (64bit SIMD 32bit Saturating Absolute)
Type: SIMD
Syntax:
DKABS32 Rd, Rs1 # Rd, Rs1 are all even/odd pair of registers
Purpose
:
Get the absolute value of 32bit signed integer elements simultaneously.
Description
:
This instruction calculates the absolute value of 32bit signed integer elements stored in Rs1 and writes the element results to Rd. If the input number is 0x8000_0000, this instruction generates 0x7fff_ffff as the output and sets the OV bit to 1.
Operations:
src = Rs1.W[x];
if (src == 0x8000_0000) {
  src = 0x7fff_ffff;
  OV = 1;
} else if (src[31] == 1) {
  src = -src;
}
Rd.W[x] = src;
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
 Returns
value stored in unsigned long long type
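The saturating absolute value can be sketched per lane as below. `ref_dkabs32` is a hypothetical name for a host-side model, with OV passed as an `int` flag.

```c
#include <stdint.h>

/* Hypothetical reference model of DKABS32: |x| per 32-bit lane, with
 * 0x8000_0000 saturating to 0x7fff_ffff and setting the OV flag. */
static uint64_t ref_dkabs32(uint64_t a, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t src = (uint32_t)(a >> (32 * x));
        if (src == 0x80000000u) {
            src = 0x7fffffffu;
            *ov = 1;
        } else if (src & 0x80000000u) {
            src = 0u - src;                 /* two's-complement negate */
        }
        rd |= (uint64_t)src << (32 * x);
    }
    return rd;
}
```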
 __STATIC_FORCEINLINE unsigned long long __RV_DKSLRA32 (unsigned long long a, int b)
DKSLRA32 (64bit SIMD 32bit Shift Left Logical with Saturation or Shift Right Arithmetic)
Type: SIMD
Syntax:
DKSLRA32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit elements logical left (positive) or arithmetic right (negative) shift operation with Q31 saturation for the left shift.
Description
:
The 32bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[5:0]. Rs2[5:0] is in the signed range of [-2^5, 2^5-1]. A positive Rs2[5:0] means logical left shift and a negative Rs2[5:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[5:0]. However, the behavior of Rs2[5:0]==-2^5 (0x20) is defined to be equivalent to the behavior of Rs2[5:0]==-(2^5-1) (0x21).
Operations:
if (Rs2[5:0] < 0) {
  sa = -Rs2[5:0];
  sa = (sa == 32)? 31 : sa;
  Rd.W[x] = SE32(Rs1.W[x][31:sa]);
} else {
  sa = Rs2[4:0];
  res[(31+sa):0] = Rs1.W[x] <<(logic) sa;
  if (res > (2^31)-1) {
    res[31:0] = 0x7fff_ffff; OV = 1;
  } else if (res < -2^31) {
    res[31:0] = 0x8000_0000; OV = 1;
  }
  Rd.W[x] = res[31:0];
}
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] int type of value stored in b
 Returns
value stored in unsigned long long type
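The sign-dependent shift can be easier to follow as ordinary C. The sketch below is a hypothetical host-side model (`ref_dkslra32` is not an NMSIS function). It decodes the low six bits of `b` as a signed amount, shifts right for negative values, and shifts left with Q31 saturation otherwise.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSLRA32.
 * Assumes arithmetic right shift of negative values (GCC/Clang behaviour). */
static uint64_t ref_dkslra32(uint64_t a, int b, int *ov)
{
    int sh = b & 0x3f;                 /* Rs2[5:0] */
    if (sh & 0x20) {
        sh -= 64;                      /* sign-extend: range -32..31 */
    }
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int32_t src = (int32_t)(uint32_t)(a >> (32 * x));
        int64_t res;
        if (sh < 0) {
            int sa = -sh;
            if (sa == 32) {
                sa = 31;               /* -32 behaves like -31 */
            }
            res = src >> sa;
        } else {
            /* Multiply instead of shifting so negative inputs are well defined. */
            res = (int64_t)src * ((int64_t)1 << sh);
            if (res > INT32_MAX) {
                res = INT32_MAX;
                *ov = 1;
            } else if (res < INT32_MIN) {
                res = INT32_MIN;
                *ov = 1;
            }
        }
        rd |= (uint64_t)(uint32_t)(int32_t)res << (32 * x);
    }
    return rd;
}
```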
 __STATIC_FORCEINLINE unsigned long long __RV_DKADD32 (unsigned long long a, unsigned long long b)
DKADD32 (64bit SIMD 32bit Signed Saturating Addition)
Type: SIMD
Syntax:
DKADD32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element saturating additions simultaneously.
Description
:
This instruction adds the 32bit signed integer elements in Rs1 with the 32bit signed integer elements in Rs2. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.W[x] + Rs2.W[x];
if (res[x] > 0x7fff_ffff) {
  res[x] = 0x7fff_ffff; OV = 1;
} else if (res[x] < 0x8000_0000) {
  res[x] = 0x8000_0000; OV = 1;
}
Rd.W[x] = res[x];
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSUB32 (unsigned long long a, unsigned long long b)
DKSUB32 (64bit SIMD 32bit Signed Saturating Subtraction)
Type: SIMD
Syntax:
DKSUB32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element saturating subtractions simultaneously.
Description
:
This instruction subtracts the 32bit signed integer elements in Rs2 from the 32bit signed integer elements in Rs1. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.
Operations:
res[x] = Rs1.W[x] - Rs2.W[x];
if (res[x] > (2^31)-1) {
  res[x] = (2^31)-1; OV = 1;
} else if (res[x] < -2^31) {
  res[x] = -2^31; OV = 1;
}
Rd.W[x] = res[x];
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
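The saturating 32bit addition and subtraction can be modelled by computing each lane in 64 bits and clamping. The helper names below (`sat_q31`, `lane32`, `ref_dkadd32`, `ref_dksub32`) are hypothetical and the OV flag is an `int` passed by pointer. This is a sketch for host-side comparison only.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 64-bit value to the Q31 range. */
static int32_t sat_q31(int64_t v, int *ov)
{
    if (v > INT32_MAX) { *ov = 1; return INT32_MAX; }
    if (v < INT32_MIN) { *ov = 1; return INT32_MIN; }
    return (int32_t)v;
}

/* Hypothetical helper: extract lane x (0 or 1) as a signed 32-bit value. */
static int32_t lane32(uint64_t v, int x)
{
    return (int32_t)(uint32_t)(v >> (32 * x));
}

/* Hypothetical reference model of DKADD32. */
static uint64_t ref_dkadd32(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int64_t sum = (int64_t)lane32(a, x) + lane32(b, x);
        rd |= (uint64_t)(uint32_t)sat_q31(sum, ov) << (32 * x);
    }
    return rd;
}

/* Hypothetical reference model of DKSUB32. */
static uint64_t ref_dksub32(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int64_t diff = (int64_t)lane32(a, x) - lane32(b, x);
        rd |= (uint64_t)(uint32_t)sat_q31(diff, ov) << (32 * x);
    }
    return rd;
}
```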
 __STATIC_FORCEINLINE unsigned long long __RV_DRADD16 (unsigned long long a, unsigned long long b)
DRADD16 (64bit SIMD 16bit Halving Signed Addition)
Type: SIMD
Syntax:
DRADD16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element additions simultaneously. The results are halved to avoid overflow or saturation.
Description
:
This instruction adds the 16bit signed integer elements in Rs1 with the 16bit signed integer elements in Rs2. The results are first arithmetically rightshifted by 1 bit and then written to Rd.
Operations:
Rd.H[x] = [(Rs1.H[x]) + (Rs2.H[x])] s>> 1; x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
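Because the sum is halved before it is narrowed, the 16bit result cannot overflow. The sketch below models this per lane. `ref_dradd16` is a hypothetical name, and the halving is written as a floor division so that it does not depend on how the compiler shifts negative values.

```c
#include <stdint.h>

/* Hypothetical reference model of DRADD16: (a + b) s>> 1 per 16-bit lane. */
static uint64_t ref_dradd16(uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int32_t s = (int16_t)(uint16_t)(a >> (16 * x))
                  + (int16_t)(uint16_t)(b >> (16 * x));
        /* floor(s / 2), which matches an arithmetic right shift by one. */
        int32_t h = (s >= 0) ? s / 2 : -((-s + 1) / 2);
        rd |= (uint64_t)(uint16_t)h << (16 * x);
    }
    return rd;
}
```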
 __STATIC_FORCEINLINE unsigned long long __RV_DSUB16 (unsigned long long a, unsigned long long b)
DSUB16 (64bit SIMD 16bit Halving Signed Subtraction)
Type: SIMD
Syntax:
DSUB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit integer element subtractions simultaneously.
Description
:
This instruction subtracts the 16bit signed integer elements in Rs2 from the 16bit signed integer elements in Rs1. The results are written to Rd.
Operations:
Rd.H[x] = [(Rs1.H[x]) - (Rs2.H[x])]; x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DRADD32 (unsigned long long a, unsigned long long b)
DRADD32 (64bit SIMD 32bit Halving Signed Addition)
Type: SIMD
Syntax:
DRADD32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element additions simultaneously. The results are halved to avoid overflow or saturation.
Description
:
This instruction adds the 32bit signed integer elements in Rs1 with the 32bit signed integer elements in Rs2. The results are first arithmetically rightshifted by 1 bit and then written to Rd.
Operations:
Rd.W[x] = [(Rs1.W[x]) + (Rs2.W[x])] s>> 1; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSUB32 (unsigned long long a, unsigned long long b)
DSUB32 (64bit SIMD 32bit Halving Signed Subtraction)
Type: SIMD
Syntax:
DSUB32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit integer element subtractions simultaneously.
Description
:
This instruction subtracts the 32bit signed integer elements in Rs2 from the 32bit signed integer elements in Rs1. The results are written to Rd.
Operations:
Rd.W[x] = [(Rs1.W[x]) - (Rs2.W[x])]; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DMSR16 (unsigned long a, unsigned long b)
DMSR16 (Signed Multiply Halfs with Right Shift 16bit and Cross Multiply Halfs with Right Shift 16bit)
Type: SIMD
Syntax:
DMSR16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do two signed 16bit multiplications and two cross multiplications from the 16bit elements of two registers; each multiplication is followed by a right shift operation.
Description
:
For the DMSR16 instruction, multiply the top 16bit Q15 content of 32bit chunks in Rs1 with the top 16bit Q15 content of 32bit chunks in Rs2, and multiply the bottom 16bit Q15 content of 32bit chunks in Rs1 with the bottom 16bit Q15 content of 32bit chunks in Rs2. At the same time, multiply the top 16bit Q15 content of 32bit chunks in Rs1 with the bottom 16bit Q15 content of 32bit chunks in Rs2 and multiply the bottom 16bit Q15 content of 32bit chunks in Rs1 with the top 16bit Q15 content of 32bit chunks in Rs2. The Q31 results are then right-shifted by 16 bits and clipped to Q15 values. The Q15 results are then written into Rd.
Operations:
Rd.H[0] = (Rs1.H[0] s* Rs2.H[0]) s>> 16
Rd.H[1] = (Rs1.H[1] s* Rs2.H[1]) s>> 16
Rd.H[2] = (Rs1.H[1] s* Rs2.H[0]) s>> 16
Rd.H[3] = (Rs1.H[0] s* Rs2.H[1]) s>> 16
 Parameters
a – [in] unsigned long type of value stored in a
b – [in] unsigned long type of value stored in b
 Returns
value stored in unsigned long long type
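The four products and their placement in Rd can be checked against a small host-side model. In this sketch `ref_dmsr16` is a hypothetical name and the two operands are taken as 32bit values, matching the `unsigned long` parameters on RV32. After a shift by 16 the 32bit product always fits in 16 bits, so no extra clipping step appears in the code.

```c
#include <stdint.h>

/* Hypothetical reference model of DMSR16: two straight and two crossed
 * 16x16 signed products, each reduced to bits [31:16]. */
static uint64_t ref_dmsr16(uint32_t a, uint32_t b)
{
    int16_t a0 = (int16_t)(uint16_t)a;
    int16_t a1 = (int16_t)(uint16_t)(a >> 16);
    int16_t b0 = (int16_t)(uint16_t)b;
    int16_t b1 = (int16_t)(uint16_t)(b >> 16);
    int32_t p[4] = {
        (int32_t)a0 * b0,   /* Rd.H[0]: bottom x bottom */
        (int32_t)a1 * b1,   /* Rd.H[1]: top x top       */
        (int32_t)a1 * b0,   /* Rd.H[2]: top x bottom    */
        (int32_t)a0 * b1    /* Rd.H[3]: bottom x top    */
    };
    uint64_t rd = 0;
    for (int i = 0; i < 4; i++) {
        /* Bits [31:16] of the product equal the product s>> 16. */
        uint16_t h = (uint16_t)((uint32_t)p[i] >> 16);
        rd |= (uint64_t)h << (16 * i);
    }
    return rd;
}
```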
 __STATIC_FORCEINLINE unsigned long long __RV_DMSR17 (unsigned long a, unsigned long b)
DMSR17 (Signed Multiply Halfs with Right Shift 17bit and Cross Multiply Halfs with Right Shift 17bit)
Type: SIMD
Syntax:
DMSR17 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do two signed 16bit multiplications and two cross multiplications from the 16bit elements of two registers; each multiplication is followed by a right shift operation.
Description
:
For the DMSR17 instruction, multiply the top 16bit Q15 content of 32bit chunks in Rs1 with the top 16bit Q15 content of 32bit chunks in Rs2, and multiply the bottom 16bit Q15 content of 32bit chunks in Rs1 with the bottom 16bit Q15 content of 32bit chunks in Rs2. At the same time, multiply the top 16bit Q15 content of 32bit chunks in Rs1 with the bottom 16bit Q15 content of 32bit chunks in Rs2 and multiply the bottom 16bit Q15 content of 32bit chunks in Rs1 with the top 16bit Q15 content of 32bit chunks in Rs2. The Q31 results are then right-shifted by 17 bits and clipped to Q15 values. The Q15 results are then written into Rd.
Operations:
Rd.H[0] = (Rs1.H[0] s* Rs2.H[0]) s>> 17
Rd.H[1] = (Rs1.H[1] s* Rs2.H[1]) s>> 17
Rd.H[2] = (Rs1.H[1] s* Rs2.H[0]) s>> 17
Rd.H[3] = (Rs1.H[0] s* Rs2.H[1]) s>> 17
 Parameters
a – [in] unsigned long type of value stored in a
b – [in] unsigned long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DMSR33 (unsigned long long a, unsigned long long b)
DMSR33 (Signed Multiply with Right Shift 33bit and Cross Multiply with Right Shift 33bit)
Type: SIMD
Syntax:
DMSR33 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do two signed 32bit multiplications from the 32bit elements of two registers; each multiplication is followed by a right shift operation.
Description
:
For the DMSR33 instruction, multiply the top 32bit Q31 content of 64bit chunks in Rs1 with the top 32bit Q31 content of 64bit chunks in Rs2. At the same time, multiply the bottom 32bit Q31 content of 64bit chunks in Rs1 with the bottom 32bit Q31 content of 64bit chunks in Rs2. The Q63 results are then right-shifted by 33 bits and clipped to Q31 values. The Q31 results are then written into Rd.
Operations:
Rd.W[0] = (Rs1.W[0] s* Rs2.W[0]) s>> 33
Rd.W[1] = (Rs1.W[1] s* Rs2.W[1]) s>> 33
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DMXSR33 (unsigned long long a, unsigned long long b)
DMXSR33 (Signed Multiply with Right Shift 33bit and Cross Multiply with Right Shift 33bit)
Type: SIMD
Syntax:
DMXSR33 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do two signed 32bit cross multiplications from the 32bit elements of two registers; each multiplication is followed by a right shift operation.
Description
:
For the DMXSR33 instruction, multiply the top 32bit Q31 content of 64bit chunks in Rs1 with the bottom 32bit Q31 content of 64bit chunks in Rs2. At the same time, multiply the bottom 32bit Q31 content of 64bit chunks in Rs1 with the top 32bit Q31 content of 64bit chunks in Rs2. The Q63 results are then right-shifted by 33 bits and clipped to Q31 values. The Q31 results are then written into Rd.
Operations:
Rd.W[0] = (Rs1.W[0] s* Rs2.W[1]) s>> 33
Rd.W[1] = (Rs1.W[1] s* Rs2.W[0]) s>> 33
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long __RV_DREDAS16 (unsigned long long a)
DREDAS16 (Reduced Addition and Reduced Subtraction)
Type: SIMD
Syntax:
DREDAS16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do halfs reduced subtraction and halfs reduced addition from a register. The result is written to Rd.
Description
:
For the DREDAS16 instruction, subtract the top 16bit Q15 element from the bottom 16bit Q15 element of the bottom 32bit Q31 content of 64bit chunks in Rs1. At the same time, add the top 16bit Q15 element to the bottom 16bit Q15 element of the top 32bit Q31 content of 64bit chunks in Rs1. The two Q15 results are then written into Rd.
Operations:
Rd.H[0] = Rs1.H[0] - Rs1.H[1]
Rd.H[1] = Rs1.H[2] + Rs1.H[3]
 Parameters
a – [in] unsigned long long type of value stored in a
 Returns
value stored in unsigned long type
 __STATIC_FORCEINLINE unsigned long __RV_DREDSA16 (unsigned long long a)
DREDSA16 (Reduced Subtraction and Reduced Addition)
Type: SIMD
Syntax:
DREDSA16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do halfs reduced subtraction and halfs reduced addition from a register. The result is written to Rd.
Description
:
For the DREDSA16 instruction, add the top 16bit Q15 element to the bottom 16bit Q15 element of the bottom 32bit Q31 content of 64bit chunks in Rs1. At the same time, subtract the top 16bit Q15 element from the bottom 16bit Q15 element of the top 32bit Q31 content of 64bit chunks in Rs1. The two Q15 results are then written into Rd.
Operations:
Rd.H[0] = Rs1.H[0] + Rs1.H[1]
Rd.H[1] = Rs1.H[2] - Rs1.H[3]
 Parameters
a – [in] unsigned long long type of value stored in a
 Returns
value stored in unsigned long type
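The two reduction instructions differ only in which half gets the addition. The sketch below models both. The names `ref_dredas16` and `ref_dredsa16` are hypothetical, and since the text does not describe saturation, the model assumes plain 16bit wrap-around arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: extract 16-bit lane x (0..3) of a 64-bit value. */
static uint16_t half16(uint64_t v, int x)
{
    return (uint16_t)(v >> (16 * x));
}

/* Hypothetical model of DREDAS16: Rd.H[0] = H[0] - H[1], Rd.H[1] = H[2] + H[3].
 * Assumption: results wrap modulo 2^16 (no saturation). */
static uint32_t ref_dredas16(uint64_t a)
{
    uint16_t lo = (uint16_t)(half16(a, 0) - half16(a, 1));
    uint16_t hi = (uint16_t)(half16(a, 2) + half16(a, 3));
    return ((uint32_t)hi << 16) | lo;
}

/* Hypothetical model of DREDSA16: Rd.H[0] = H[0] + H[1], Rd.H[1] = H[2] - H[3].
 * Same wrap-around assumption as above. */
static uint32_t ref_dredsa16(uint64_t a)
{
    uint16_t lo = (uint16_t)(half16(a, 0) + half16(a, 1));
    uint16_t hi = (uint16_t)(half16(a, 2) - half16(a, 3));
    return ((uint32_t)hi << 16) | lo;
}
```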
 __STATIC_FORCEINLINE int16_t __RV_DKCLIP64 (unsigned long long a)
DKCLIP64 (64bit Clipped to 16bit Saturation Value)
Type: SIMD
Syntax:
DKCLIP64 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Shift the 64bit input arithmetically right by 15 bits, convert the result to a 32bit int, then saturate it to the 16bit Q15 range.
Description
:
For the DKCLIP64 instruction, shift the input 15 bits to the right and convert the result to a 32bit int type. The value is then saturated to the range between -2^15 and 2^15-1 and converted to a 16bit Q15 type. The final result is written to Rd.
Operations:
const int32_t max = (int32_t)((1U << 15U) - 1U);
const int32_t min = -1 - max;
int32_t val = (int32_t)(Rs1 s>> 15);
if (val > max) {
  Rd = max;
} else if (val < min) {
  Rd = min;
} else {
  Rd = (int16_t)val;
}
 Parameters
a – [in] unsigned long long type of value stored in a
 Returns
value stored in int16_t type
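The clip operation maps directly onto C. The sketch below is a hypothetical host-side model (`ref_dkclip64` is not an NMSIS function) that follows the same shift, narrow, and clamp order as the operation listing.

```c
#include <stdint.h>

/* Hypothetical reference model of DKCLIP64.
 * Assumes arithmetic right shift of negative values and modulo narrowing
 * conversions, as GCC and Clang provide. */
static int16_t ref_dkclip64(uint64_t a)
{
    const int32_t max = (int32_t)((1U << 15U) - 1U);   /*  32767 */
    const int32_t min = -1 - max;                      /* -32768 */
    int64_t shifted = (int64_t)a >> 15;
    int32_t val = (int32_t)shifted;    /* keep the low 32 bits, as in the listing */
    if (val > max) {
        return (int16_t)max;
    }
    if (val < min) {
        return (int16_t)min;
    }
    return (int16_t)val;
}
```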
 __STATIC_FORCEINLINE unsigned long long __RV_DKMDA (unsigned long long a, unsigned long long b)
DKMDA (Signed Multiply Two Halfs and Add)
Type: SIMD
Syntax:
DKMDA Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do two signed 16bit multiplications from the 32bit elements of two registers; and then adds the two 32bit results together. The addition result may be saturated.
Description
:
This instruction multiplies the bottom 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2 and then adds the result to the result of multiplying the top 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2. The addition result is checked for saturation. If saturation happens, the result is saturated to 2^31-1. The final results are written to Rd. The 16bit contents are treated as signed integers.
Operations:
if (Rs1.W[x] != 0x80008000) or (Rs2.W[x] != 0x80008000) {
  Rd.W[x] = (Rs1.W[x].H[1] * Rs2.W[x].H[1]) + (Rs1.W[x].H[0] * Rs2.W[x].H[0]);
} else {
  Rd.W[x] = 0x7fffffff;
  OV = 1;
}
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
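The only input combination that overflows the 32bit sum is when both 32bit elements are 0x80008000, which is why the operation tests for exactly that pattern. A minimal host-side sketch follows. `ref_dkmda` is a hypothetical name and OV is modelled as an `int` flag.

```c
#include <stdint.h>

/* Hypothetical reference model of DKMDA: top*top + bottom*bottom per
 * 32-bit element, saturating to 0x7fffffff in the single overflow case. */
static uint64_t ref_dkmda(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t wa = (uint32_t)(a >> (32 * x));
        uint32_t wb = (uint32_t)(b >> (32 * x));
        uint32_t res;
        if (wa == 0x80008000u && wb == 0x80008000u) {
            res = 0x7fffffffu;
            *ov = 1;
        } else {
            int16_t a0 = (int16_t)(uint16_t)wa, a1 = (int16_t)(uint16_t)(wa >> 16);
            int16_t b0 = (int16_t)(uint16_t)wb, b1 = (int16_t)(uint16_t)(wb >> 16);
            /* Accumulate in 64 bits, then keep the low 32 bits. */
            int64_t sum = (int64_t)a1 * b1 + (int64_t)a0 * b0;
            res = (uint32_t)sum;
        }
        rd |= (uint64_t)res << (32 * x);
    }
    return rd;
}
```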
 __STATIC_FORCEINLINE unsigned long long __RV_DKMXDA (unsigned long long a, unsigned long long b)
DKMXDA (Signed Crossed Multiply Two Halfs and Add)
Type: SIMD
Syntax:
DKMXDA Rd, Rs1, Rs2
Purpose
:
Do two signed 16bit multiplications from the 32bit elements of two registers; and then adds the two 32bit results together. The addition result may be saturated.
DKMXDA: top*bottom + top*bottom (per 32bit element)
Description
:
This instruction multiplies the bottom 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2 and then adds the result to the result of multiplying the top 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2. The addition result is checked for saturation. If saturation happens, the result is saturated to 2^31-1. The final results are written to Rd. The 16bit contents are treated as signed integers.
Operations:
if (Rs1.W[x] != 0x80008000) or (Rs2.W[x] != 0x80008000) {
  Rd.W[x] = (Rs1.W[x].H[1] * Rs2.W[x].H[0]) + (Rs1.W[x].H[0] * Rs2.W[x].H[1]);
} else {
  Rd.W[x] = 0x7fffffff;
  OV = 1;
}
x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSMDRS (unsigned long long a, unsigned long long b)
DSMDRS (Signed Multiply Two Halfs and Reverse Subtract)
Type: SIMD
Syntax:
DSMDRS Rd, Rs1, Rs2
Purpose
:
Do two signed 16bit multiplications from the 32bit elements of two registers; and then perform a subtraction operation between the two 32bit results.
DSMDRS: bottom*bottom - top*top (per 32bit element)
Description
:
This instruction multiplies the top 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2 and then subtracts the result from the result of multiplying the bottom 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2. The subtraction result is written to the corresponding 32bit element of Rd (The 16bit contents of multiplication are treated as signed integers).
Operations:
Rd.W[x] = (Rs1.W[x].H[0] * Rs2.W[x].H[0]) - (Rs1.W[x].H[1] * Rs2.W[x].H[1]); x = 1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSMXDS (unsigned long long a, unsigned long long b)
DSMXDS (Signed Crossed Multiply Two Halfs and Subtract)
Type: SIMD
Syntax:
DSMXDS Rd, Rs1, Rs2
Purpose
:
Do two signed 16bit multiplications from the 32bit elements of two registers; and then perform a subtraction operation between the two 32bit results.
DSMXDS: top*bottom - bottom*top (per 32bit element)
Description
:
This instruction multiplies the bottom 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2 and then subtracts the result from the result of multiplying the top 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2. The subtraction result is written to the corresponding 32bit element of Rd. The 16bit contents of multiplication are treated as signed integers.
Operations:
Rd.W[x] = (Rs1.W[x].H[1] * Rs2.W[x].H[0]) - (Rs1.W[x].H[0] * Rs2.W[x].H[1]); x = 1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE long long __RV_DSMBB32 (unsigned long long a, unsigned long long b)
DSMBB32 (Signed Multiply Bottom Word & Bottom Word)
Type: SIMD
Syntax:
DSMBB32 Rd, Rs1, Rs2
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register and write the 64bit result to a third register.
DSMBB32: bottom*bottom
Description
:
This instruction multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2. The 64bit multiplication result is written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[0] * Rs2.W[0]); Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMBB32_SRA14 (unsigned long long a, unsigned long long b)
DSMBB32.sra14 (Signed Multiply Bottom Word & Bottom Word with Right Shift 14)
Type: SIMD
Syntax:
DSMBB32.sra14 Rd, Rs1, Rs2
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register, then right shift by 14 bits, and finally write the 64bit result to a third register.
DSMBB32.sra14: bottom*bottom s>> 14
Description
:
This instruction multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2. The 64bit multiplication result is written to Rd after right shift 14bit. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[0] * Rs2.W[0]) s>> 14; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMBB32_SRA32 (unsigned long long a, unsigned long long b)
DSMBB32.sra32 (Signed Multiply Bottom Word & Bottom Word with Right Shift 32)
Type: SIMD
Syntax:
DSMBB32.sra32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register, then right shift by 32 bits, and finally write the 64bit result to a third register.
DSMBB32.sra32: bottom*bottom s>> 32
Description
:
This instruction multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2. The 64bit multiplication result is written to Rd after right shift 32bit. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[0] * Rs2.W[0]) s>> 32; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMBT32 (unsigned long long a, unsigned long long b)
DSMBT32 (Signed Multiply Bottom Word & Top Word)
Type: SIMD
Syntax:
DSMBT32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register and write the 64bit result to a third register.
DSMBT32: bottom*top
Description
:
This instruction multiplies the bottom 32bit element of Rs1 with the top 32bit element of Rs2. The 64bit multiplication result is written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[0] * Rs2.W[1]); Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMBT32_SRA14 (unsigned long long a, unsigned long long b)
DSMBT32.sra14 (Signed Multiply Bottom Word & Top Word with Right Shift 14)
Type: SIMD
Syntax:
DSMBT32.sra14 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register, then right shift by 14 bits, and finally write the 64bit result to a third register.
DSMBT32.sra14: bottom*top s>> 14
Description
:
This instruction multiplies the bottom 32bit element of Rs1 with the top 32bit element of Rs2. The 64bit multiplication result is written to Rd after right shift 14bit. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[0] * Rs2.W[1]) s>> 14; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMBT32_SRA32 (unsigned long long a, unsigned long long b)
DSMBT32.sra32 (Signed Multiply Bottom Word & Top Word with Right Shift 32)
Type: SIMD
Syntax:
DSMBT32.sra32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register, then right shift by 32 bits, and finally write the 64bit result to a third register.
DSMBT32.sra32: bottom*top s>> 32
Description
:
This instruction multiplies the bottom 32bit element of Rs1 with the top 32bit element of Rs2. The 64bit multiplication result is written to Rd after right shift 32bit. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[0] * Rs2.W[1]) s>> 32; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMTT32 (unsigned long long a, unsigned long long b)
DSMTT32 (Signed Multiply Top Word & Top Word)
Type: SIMD
Syntax:
DSMTT32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register and write the 64bit result to a third register.
DSMTT32: top*top
Description
:
This instruction multiplies the top 32bit element of Rs1 with the top 32bit element of Rs2. The 64bit multiplication result is written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = Rs1.W[1] * Rs2.W[1]; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMTT32_SRA14 (unsigned long long a, unsigned long long b)
DSMTT32.sra14 (Signed Multiply Top Word & Top Word with Right Shift 14bit)
Type: SIMD
Syntax:
DSMTT32.sra14 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register, then right shift by 14 bits, and finally write the 64bit result to a third register.
DSMTT32.sra14: top*top s>> 14
Description
:
This instruction multiplies the top 32bit element of Rs1 with the top 32bit element of Rs2. The 64bit multiplication result is written to Rd after right shift 14bit. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[1] * Rs2.W[1]) s>> 14; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMTT32_SRA32 (unsigned long long a, unsigned long long b)
DSMTT32.sra32 (Signed Multiply Top Word & Top Word with Right Shift 32bit)
Type: SIMD
Syntax:
DSMTT32.sra32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 32bit element of a register with the signed 32bit element of another register, arithmetically right shift the product by 32 bits, and write the 64bit result to a third register.
DSMTT32.sra32: top*top s>> 32
Description
:
This instruction multiplies the top 32bit element of Rs1 with the top 32bit element of Rs2. The 64bit multiplication result is arithmetically right-shifted by 32 bits and then written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = (Rs1.W[1] * Rs2.W[1]) s>> 32; Rd = res;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
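The three DSMTT32 variants differ only in the shift applied to the 64bit product. As a rough illustration, the following portable C sketch models that behaviour in software. The names ref_sra64 and ref_dsmtt32 are hypothetical helpers for this example and are not part of NMSIS; the sketch does not call the intrinsics themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift that does not rely on
 * implementation-defined behaviour for negative operands. */
static int64_t ref_sra64(int64_t v, unsigned n)
{
    assert(n < 64);
    return v < 0 ? ~(~v >> n) : v >> n;
}

/* Software model (an assumption, not the NMSIS implementation):
 * signed top word of a times signed top word of b, then an arithmetic
 * right shift by `shift` bits (0 for DSMTT32, 14 for .sra14, 32 for .sra32). */
static int64_t ref_dsmtt32(uint64_t a, uint64_t b, unsigned shift)
{
    int32_t a_top = (int32_t)(uint32_t)(a >> 32);
    int32_t b_top = (int32_t)(uint32_t)(b >> 32);
    return ref_sra64((int64_t)a_top * (int64_t)b_top, shift);
}
```

With a.W[1] = 3 and b.W[1] = -2 the unshifted model returns -6; the bottom words are ignored.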
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBB32 (unsigned long long a, unsigned long long b)
DPKBB32 (Pack Two 32bit Data from Both Bottom Half)
Type: SIMD
Syntax:
DPKBB32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 32bit data from 64bit chunks in two registers.
DPKBB32: bottom.bottom
Description
:
This instruction moves Rs1.W[0] to Rd.W[1] and moves Rs2.W[0] to Rd.W[0].
Operations:
Rd = CONCAT(Rs1.W[0], Rs2.W[0]);
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBT32 (unsigned long long a, unsigned long long b)
DPKBT32 (Pack Two 32bit Data from Bottom and Top Half)
Type: SIMD
Syntax:
DPKBT32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 32bit data from 64bit chunks in two registers.
DPKBT32: bottom.top
Description
:
This instruction moves Rs1.W[0] to Rd.W[1] and moves Rs2.W[1] to Rd.W[0].
Operations:
Rd = CONCAT(Rs1.W[0], Rs2.W[1]);
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTT32 (unsigned long long a, unsigned long long b)
DPKTT32 (Pack Two 32bit Data from Both Top Half)
Type: SIMD
Syntax:
DPKTT32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 32bit data from 64bit chunks in two registers.
DPKTT32: top.top
Description
:
This instruction moves Rs1.W[1] to Rd.W[1] and moves Rs2.W[1] to Rd.W[0].
Operations:
Rd = CONCAT(Rs1.W[1], Rs2.W[1]);
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTB32 (unsigned long long a, unsigned long long b)
DPKTB32 (Pack Two 32bit Data from Top and Bottom Half)
Type: SIMD
Syntax:
DPKTB32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 32bit data from 64bit chunks in two registers.
DPKTB32: top.bottom
Description
:
This instruction moves Rs1.W[1] to Rd.W[1] and moves Rs2.W[0] to Rd.W[0].
Operations:
Rd = CONCAT(Rs1.W[1], Rs2.W[0]);
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
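The four DPKxx32 variants can be summarised as one selection rule: pick a word from Rs1 for the upper half of Rd and a word from Rs2 for the lower half. The sketch below is a software model of that rule; ref_dpk32 and its index parameters are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Software model (an assumption, not the NMSIS implementation):
 * Rd = CONCAT(a.W[a_idx], b.W[b_idx]).
 * Index 0 selects the bottom word, index 1 the top word. */
static uint64_t ref_dpk32(uint64_t a, uint64_t b, unsigned a_idx, unsigned b_idx)
{
    assert(a_idx < 2 && b_idx < 2);
    uint32_t hi = (uint32_t)(a >> (32 * a_idx));
    uint32_t lo = (uint32_t)(b >> (32 * b_idx));
    return ((uint64_t)hi << 32) | lo;
}
```

In this model DPKBB32 corresponds to indices (0, 0), DPKBT32 to (0, 1), DPKTT32 to (1, 1) and DPKTB32 to (1, 0).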
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTB16 (unsigned long long a, unsigned long long b)
DPKTB16 (Pack Two 16bit Data from Top and Bottom Half)
Type: SIMD
Syntax:
DPKTB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 16bit data from 32bit chunks in two registers.
DPKTB16: top.bottom
Description
:
This instruction moves Rs1.W[x][31:16] to Rd.W[x][31:16] and moves Rs2.W[x][15:0] to Rd.W[x][15:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.W[x][31:16], Rs2.W[x][15:0]); x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBB16 (unsigned long long a, unsigned long long b)
DPKBB16 (Pack Two 16bit Data from Both Bottom Half)
Type: SIMD
Syntax:
DPKBB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 16bit data from 32bit chunks in two registers.
DPKBB16: bottom.bottom
Description
:
This instruction moves Rs1.W[x][15:0] to Rd.W[x][31:16] and moves Rs2.W[x][15:0] to Rd.W[x][15:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.W[x][15:0], Rs2.W[x][15:0]); x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DPKBT16 (unsigned long long a, unsigned long long b)
DPKBT16 (Pack Two 16bit Data from Bottom and Top Half)
Type: SIMD
Syntax:
DPKBT16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 16bit data from 32bit chunks in two registers.
DPKBT16: bottom.top
Description
:
This instruction moves Rs1.W[x][15:0] to Rd.W[x][31:16] and moves Rs2.W[x][31:16] to Rd.W[x][15:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.W[x][15:0], Rs2.W[x][31:16]); x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DPKTT16 (unsigned long long a, unsigned long long b)
DPKTT16 (Pack Two 16bit Data from Both Top Half)
Type: SIMD
Syntax:
DPKTT16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Pack 16bit data from 32bit chunks in two registers.
DPKTT16: top.top
Description
:
This instruction moves Rs1.W[x][31:16] to Rd.W[x][31:16] and moves Rs2.W[x][31:16] to Rd.W[x][15:0].
Operations:
Rd.W[x][31:0] = CONCAT(Rs1.W[x][31:16], Rs2.W[x][31:16]); x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
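The DPKxx16 variants apply the same pack rule independently to each 32bit chunk. The following sketch models this in portable C; ref_dpk16 and its index parameters are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Software model (an assumption, not the NMSIS implementation):
 * for each 32bit chunk x, Rd.W[x] = CONCAT(a.W[x].H[a_idx], b.W[x].H[b_idx]).
 * Index 0 selects bits [15:0], index 1 selects bits [31:16]. */
static uint64_t ref_dpk16(uint64_t a, uint64_t b, unsigned a_idx, unsigned b_idx)
{
    assert(a_idx < 2 && b_idx < 2);
    uint64_t rd = 0;
    for (unsigned x = 0; x < 2; x++) {
        uint32_t aw = (uint32_t)(a >> (32 * x));
        uint32_t bw = (uint32_t)(b >> (32 * x));
        uint32_t hi = (aw >> (16 * a_idx)) & 0xFFFFu;
        uint32_t lo = (bw >> (16 * b_idx)) & 0xFFFFu;
        rd |= (uint64_t)((hi << 16) | lo) << (32 * x);
    }
    return rd;
}
```

In this model DPKTB16 corresponds to indices (1, 0), DPKBB16 to (0, 0), DPKBT16 to (0, 1) and DPKTT16 to (1, 1).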
 __STATIC_FORCEINLINE unsigned long long __RV_DSRA16 (unsigned long long a, unsigned long b)
DSRA16 (SIMD 16bit Shift Right Arithmetic)
Type: SIMD
Syntax:
DSRA16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit element arithmetic right shift operations simultaneously. The shift amount is a variable from a GPR.
Description
:
The 16bit data elements in Rs1 are right-shifted arithmetically, that is, the vacated high-order bits are filled with the sign bit of each data element. The shift amount is specified by the low-order 4 bits of the value in the Rs2 register. The results are written to Rd.
Operations:
sa = Rs2[3:0]; if (sa != 0) { Rd.H[x] = SE16(Rs1.H[x][15:sa]); } else { Rd = Rs1; } x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long type of value stored in b
 Returns
value stored in unsigned long long type
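The per-lane arithmetic shift can be modelled in portable C as shown in the sketch below. The function name ref_dsra16 is hypothetical; the code is a software model of the Operations line, not the intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Software model (an assumption, not the NMSIS implementation):
 * arithmetic right shift of four 16bit lanes by b[3:0]. */
static uint64_t ref_dsra16(uint64_t a, unsigned long b)
{
    unsigned sa = (unsigned)(b & 0xFu);   /* only the low-order 4 bits are used */
    assert(sa < 16);
    uint64_t rd = 0;
    for (unsigned x = 0; x < 4; x++) {
        uint16_t h = (uint16_t)(a >> (16 * x));
        uint16_t r = (uint16_t)(h >> sa);
        if (h & 0x8000u)                  /* negative lane: fill with sign bits */
            r |= (uint16_t)~(0xFFFFu >> sa);
        rd |= (uint64_t)r << (16 * x);
    }
    return rd;
}
```

A shift amount of 0 returns the input unchanged, matching the else branch of the Operations line.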
 __STATIC_FORCEINLINE unsigned long long __RV_DADD16 (unsigned long long a, unsigned long long b)
DADD16 (16bit Addition)
Type: SIMD
Syntax:
DADD16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit integer element additions simultaneously.
Description
:
This instruction adds the 16bit unsigned integer elements in Rs1 with the 16bit unsigned integer elements in Rs2. And the results are written to Rd.
Operations:
Rd.H[x] = Rs1.H[x] + Rs2.H[x]; x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DADD32 (unsigned long long a, unsigned long long b)
DADD32 (32bit Addition)
Type: SIMD
Syntax:
DADD32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit integer element additions simultaneously.
Description
:
This instruction adds the 32bit integer elements in Rs1 with the 32bit integer elements in Rs2, and then writes the 32bit element results to Rd.
Operations:
Rd.W[x] = Rs1.W[x] + Rs2.W[x]; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
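DADD16 and DADD32 perform the same lane-wise addition at different element widths. The sketch below models both, assuming that a carry out of a lane is discarded (the Operations lines show no saturation). The name ref_dadd and the lane_bits parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Software model (an assumption, not the NMSIS implementation):
 * lane-wise addition with wraparound. lane_bits is 16 for DADD16
 * and 32 for DADD32. */
static uint64_t ref_dadd(uint64_t a, uint64_t b, unsigned lane_bits)
{
    assert(lane_bits == 16 || lane_bits == 32);
    uint64_t mask = (lane_bits == 16) ? 0xFFFFull : 0xFFFFFFFFull;
    uint64_t rd = 0;
    for (unsigned sh = 0; sh < 64; sh += lane_bits) {
        uint64_t sum = ((a >> sh) & mask) + ((b >> sh) & mask);
        rd |= (sum & mask) << sh;   /* carry out of the lane is dropped */
    }
    return rd;
}
```

Because each lane is masked before it is written back, an overflow in one lane never propagates into its neighbour, which is the property that distinguishes these operations from a plain 64bit add.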
 __STATIC_FORCEINLINE unsigned long long __RV_DSMBB16 (unsigned long long a, unsigned long long b)
DSMBB16 (Signed Multiply Bottom Half & Bottom Half)
Type: SIMD
Syntax:
DSMBB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 16bit content of the 32bit elements of a register with the signed 16bit content of the 32bit elements of another register and write the result to a third register.
DSMBB16: W[x].bottom*W[x].bottom
Description
:
The DSMBB16 instruction multiplies the bottom 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2. The multiplication results are written to Rd. The 16bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
Rd.W[x] = Rs1.W[x].H[0] * Rs2.W[x].H[0]; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSMBT16 (unsigned long long a, unsigned long long b)
DSMBT16 (Signed Multiply Bottom Half & Top Half)
Type: SIMD
Syntax:
DSMBT16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 16bit content of the 32bit elements of a register with the signed 16bit content of the 32bit elements of another register and write the result to a third register.
DSMBT16: W[x].bottom *W[x].top
Description
:
The DSMBT16 instruction multiplies the bottom 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2. The multiplication results are written to Rd. The 16bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
Rd.W[x] = Rs1.W[x].H[0] * Rs2.W[x].H[1]; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSMTT16 (unsigned long long a, unsigned long long b)
DSMTT16 (Signed Multiply Top Half & Top Half)
Type: SIMD
Syntax:
DSMTT16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Multiply the signed 16bit content of the 32bit elements of a register with the signed 16bit content of the 32bit elements of another register and write the result to a third register.
DSMTT16: W[x].top * W[x].top
Description
:
The DSMTT16 instruction multiplies the top 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2. The multiplication results are written to Rd. The 16bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
Rd.W[x] = Rs1.W[x].H[1] * Rs2.W[x].H[1]; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
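The DSMxx16 variants select one signed halfword from each 32bit chunk of both operands and store the 32bit product in the matching chunk of Rd. A software model might look like the following; ref_dsm16 and its index parameters are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Software model (an assumption, not the NMSIS implementation):
 * Rd.W[x] = a.W[x].H[a_idx] * b.W[x].H[b_idx], operands treated as signed. */
static uint64_t ref_dsm16(uint64_t a, uint64_t b, unsigned a_idx, unsigned b_idx)
{
    assert(a_idx < 2 && b_idx < 2);
    uint64_t rd = 0;
    for (unsigned x = 0; x < 2; x++) {
        uint32_t aw = (uint32_t)(a >> (32 * x));
        uint32_t bw = (uint32_t)(b >> (32 * x));
        int16_t ah = (int16_t)(uint16_t)(aw >> (16 * a_idx));
        int16_t bh = (int16_t)(uint16_t)(bw >> (16 * b_idx));
        int32_t prod = (int32_t)ah * (int32_t)bh;   /* always fits in 32 bits */
        rd |= (uint64_t)(uint32_t)prod << (32 * x);
    }
    return rd;
}
```

In this model DSMBB16 corresponds to indices (0, 0), DSMBT16 to (0, 1) and DSMTT16 to (1, 1).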
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRSA16 (unsigned long long a, unsigned long long b)
DRCRSA16 (16bit Signed Halving Cross Subtraction & Addition)
Type: SIMD
Syntax:
DRCRSA16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element subtraction and 16bit signed integer element addition in a 32bit chunk simultaneously. Operands are from crossed positions in 32bit chunks. The results are halved to avoid overflow or saturation.
Description
:
This instruction subtracts the 16bit signed integer in [15:0] of 32bit chunks in Rs2 from the 16bit signed integer in [31:16] of 32bit chunks in Rs1, and adds the 16bit signed integer in [31:16] of 32bit chunks in Rs2 with the 16bit signed integer in [15:0] of 32bit chunks in Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [31:16] of 32bit chunks in Rd for subtraction and [15:0] of 32bit chunks in Rd for addition.
Operations:
Rd.W[x][31:16] = (Rs1.W[x][31:16] - Rs2.W[x][15:0]) s>> 1; Rd.W[x][15:0] = (Rs1.W[x][15:0] + Rs2.W[x][31:16]) s>> 1; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRSA32 (unsigned long long a, unsigned long long b)
DRCRSA32 (32bit Signed Halving Cross Subtraction & Addition)
Type: SIMD
Syntax:
DRCRSA32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element subtraction and 32bit signed integer element addition in a 64bit chunk simultaneously. Operands are from crossed 32bit elements. The results are halved to avoid overflow or saturation.
Description
:
This instruction subtracts the 32bit signed integer element in [31:0] of Rs2 from the 32bit signed integer element in [63:32] of Rs1, and adds the 32bit signed integer element in [63:32] of Rs2 with the 32bit signed integer element in [31:0] of Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [63:32] of Rd for subtraction and [31:0] of Rd for addition.
Operations:
Rd.W[1] = (Rs1.W[1] - Rs2.W[0]) s>> 1; Rd.W[0] = (Rs1.W[0] + Rs2.W[1]) s>> 1;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRAS16 (unsigned long long a, unsigned long long b)
DRCRAS16 (16bit Signed Halving Cross Addition & Subtraction)
Type: SIMD
Syntax:
DRCRAS16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element addition and 16bit signed integer element subtraction in a 32bit chunk simultaneously. Operands are from crossed positions in 32bit chunks. The results are halved to avoid overflow or saturation.
Description
:
This instruction adds the 16bit signed integer in [31:16] of 32bit chunks in Rs1 with the 16bit signed integer in [15:0] of 32bit chunks in Rs2, and subtracts the 16bit signed integer in [31:16] of 32bit chunks in Rs2 from the 16bit signed integer in [15:0] of 32bit chunks in Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [31:16] of 32bit chunks in Rd for addition and [15:0] of 32bit chunks in Rd for subtraction.
Operations:
Rd.W[x][31:16] = (Rs1.W[x][31:16] + Rs2.W[x][15:0]) s>> 1; Rd.W[x][15:0] = (Rs1.W[x][15:0] - Rs2.W[x][31:16]) s>> 1; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DRCRAS32 (unsigned long long a, unsigned long long b)
DRCRAS32 (32bit Signed Halving Cross Addition & Subtraction)
Type: SIMD
Syntax:
DRCRAS32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element addition and 32bit signed integer element subtraction in a 64bit chunk simultaneously. Operands are from crossed 32bit elements. The results are halved to avoid overflow or saturation.
Description
:
This instruction adds the 32bit signed integer element in [63:32] of Rs1 with the 32bit signed integer element in [31:0] of Rs2, and subtracts the 32bit signed integer element in [63:32] of Rs2 from the 32bit signed integer element in [31:0] of Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [63:32] of Rd for addition and [31:0] of Rd for subtraction.
Operations:
Rd.W[1] = (Rs1.W[1] + Rs2.W[0]) s>> 1; Rd.W[0] = (Rs1.W[0] - Rs2.W[1]) s>> 1;
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
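The halving cross operations compute each sum or difference at full precision and then shift right by one bit, so the result always fits back into the lane. The sketch below models the 32bit variants; ref_half32, ref_drcrsa32 and ref_drcras32 are hypothetical names, and the code is a software model rather than the intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: floor(v / 2), i.e. an arithmetic right shift by 1,
 * returned as a 32bit lane. */
static uint32_t ref_half32(int64_t v)
{
    int64_t h = v < 0 ? ~(~v >> 1) : v >> 1;
    assert(h >= INT32_MIN && h <= INT32_MAX);  /* halving keeps the lane in range */
    return (uint32_t)h;
}

/* Software model (an assumption): top = (a.W[1] - b.W[0]) s>> 1,
 * bottom = (a.W[0] + b.W[1]) s>> 1. */
static uint64_t ref_drcrsa32(uint64_t a, uint64_t b)
{
    int64_t a1 = (int32_t)(uint32_t)(a >> 32), a0 = (int32_t)(uint32_t)a;
    int64_t b1 = (int32_t)(uint32_t)(b >> 32), b0 = (int32_t)(uint32_t)b;
    return ((uint64_t)ref_half32(a1 - b0) << 32) | ref_half32(a0 + b1);
}

/* Software model (an assumption): top = (a.W[1] + b.W[0]) s>> 1,
 * bottom = (a.W[0] - b.W[1]) s>> 1. */
static uint64_t ref_drcras32(uint64_t a, uint64_t b)
{
    int64_t a1 = (int32_t)(uint32_t)(a >> 32), a0 = (int32_t)(uint32_t)a;
    int64_t b1 = (int32_t)(uint32_t)(b >> 32), b0 = (int32_t)(uint32_t)b;
    return ((uint64_t)ref_half32(a1 + b0) << 32) | ref_half32(a0 - b1);
}
```

The assertion in ref_half32 documents why no saturation logic is needed: the intermediate value spans at most 33 bits, so halving brings it back into the signed 32bit range.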
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRAS16 (unsigned long long a, unsigned long long b)
DKCRAS16 (16bit Signed Saturating Cross Addition & Subtraction)
Type: SIMD
Syntax:
DKCRAS16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element saturating addition and 16bit signed integer element saturating subtraction in a 32bit chunk simultaneously. Operands are from crossed positions in 32bit chunks.
Description
:
This instruction adds the 16bit signed integer element in [31:16] of 32bit chunks in Rs1 with the 16bit signed integer element in [15:0] of 32bit chunks in Rs2; at the same time, it subtracts the 16bit signed integer element in [31:16] of 32bit chunks in Rs2 from the 16bit signed integer element in [15:0] of 32bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32bit chunks in Rd for addition and [15:0] of 32bit chunks in Rd for subtraction.
Operations:
res1 = Rs1.W[x][31:16] + Rs2.W[x][15:0]; res2 = Rs1.W[x][15:0] - Rs2.W[x][31:16]; for (res in [res1, res2]) { if (res > (2^15)-1) { res = (2^15)-1; OV = 1; } else if (res < -2^15) { res = -2^15; OV = 1; } } Rd.W[x][31:16] = res1; Rd.W[x][15:0] = res2; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRSA16 (unsigned long long a, unsigned long long b)
DKCRSA16 (16bit Signed Saturating Cross Subtraction & Addition)
Type: SIMD
Syntax:
DKCRSA16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element saturating subtraction and 16bit signed integer element saturating addition in a 32bit chunk simultaneously. Operands are from crossed positions in 32bit chunks.
Description
:
This instruction subtracts the 16bit signed integer element in [15:0] of 32bit chunks in Rs2 from the 16bit signed integer element in [31:16] of 32bit chunks in Rs1; at the same time, it adds the 16bit signed integer element in [31:16] of 32bit chunks in Rs2 with the 16bit signed integer element in [15:0] of 32bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32bit chunks in Rd for subtraction and [15:0] of 32bit chunks in Rd for addition.
Operations:
res1 = Rs1.W[x][31:16] - Rs2.W[x][15:0]; res2 = Rs1.W[x][15:0] + Rs2.W[x][31:16]; for (res in [res1, res2]) { if (res > (2^15)-1) { res = (2^15)-1; OV = 1; } else if (res < -2^15) { res = -2^15; OV = 1; } } Rd.W[x][31:16] = res1; Rd.W[x][15:0] = res2; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
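The saturating cross operations can be illustrated with a small software model. The sketch below follows the operand order given in the Description text of DKCRAS16 and DKCRSA16 (addition on the top halfword for DKCRAS16, subtraction on the top halfword for DKCRSA16). The names ref_ov16, ref_sat_q15 and ref_dkcr16, and the add_top selector, are hypothetical; the global flag only stands in for the OV bit.

```c
#include <assert.h>
#include <stdint.h>

static int ref_ov16;   /* hypothetical stand-in for the OV flag */

/* Clamp a 17bit intermediate result to the Q15 range and record overflow. */
static uint16_t ref_sat_q15(int32_t v)
{
    assert(v >= -65536 && v <= 65535);   /* sum/difference of two Q15 values */
    if (v > INT16_MAX) { ref_ov16 = 1; return (uint16_t)INT16_MAX; }
    if (v < INT16_MIN) { ref_ov16 = 1; return (uint16_t)INT16_MIN; }
    return (uint16_t)v;
}

/* Software model (an assumption, not the NMSIS implementation).
 * add_top = 1 models DKCRAS16, add_top = 0 models DKCRSA16. */
static uint64_t ref_dkcr16(uint64_t a, uint64_t b, int add_top)
{
    uint64_t rd = 0;
    for (unsigned x = 0; x < 2; x++) {
        uint32_t aw = (uint32_t)(a >> (32 * x)), bw = (uint32_t)(b >> (32 * x));
        int32_t at = (int16_t)(uint16_t)(aw >> 16), ab = (int16_t)(uint16_t)aw;
        int32_t bt = (int16_t)(uint16_t)(bw >> 16), bb = (int16_t)(uint16_t)bw;
        uint16_t top = ref_sat_q15(add_top ? at + bb : at - bb);
        uint16_t bot = ref_sat_q15(add_top ? ab - bt : ab + bt);
        rd |= (uint64_t)(((uint32_t)top << 16) | bot) << (32 * x);
    }
    return rd;
}
```

The flag is sticky in this model: once a lane saturates it stays set until the caller clears it, which mirrors how an overflow status bit is typically used.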
 __STATIC_FORCEINLINE unsigned long long __RV_DRSUB16 (unsigned long long a, unsigned long long b)
DRSUB16 (16bit Signed Halving Subtraction)
Type: SIMD
Syntax:
DRSUB16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element subtractions simultaneously. The results are halved to avoid overflow or saturation.
Description
:
This instruction subtracts the 16bit signed integer elements in Rs2 from the 16bit signed integer elements in Rs1. The results are first arithmetically right-shifted by 1 bit and then written to Rd.
Operations:
Rd.H[x] = (Rs1.H[x] - Rs2.H[x]) s>> 1; x=3...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSTSA32 (unsigned long long a, unsigned long long b)
DSTSA32 (32bit Straight Subtraction & Addition)
Type: SIMD
Syntax:
DSTSA32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit integer element subtraction and 32bit integer element addition in a 64bit chunk simultaneously. Operands are from corresponding 32bit elements.
Description
:
This instruction subtracts the 32bit integer element in [63:32] of Rs2 from the 32bit integer element in [63:32] of Rs1, and writes the result to [63:32] of Rd; at the same time, it adds the 32bit integer element in [31:0] of Rs1 with the 32bit integer element in [31:0] of Rs2, and writes the result to [31:0] of Rd.
Operations:
Rd.W[1] = Rs1.W[1] - Rs2.W[1]; Rd.W[0] = Rs1.W[0] + Rs2.W[0];
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSTAS32 (unsigned long long a, unsigned long long b)
DSTAS32 (32bit Straight Addition & Subtraction)
Type: SIMD
Syntax:
DSTAS32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit integer element addition and 32bit integer element subtraction in a 64bit chunk simultaneously. Operands are from corresponding 32bit elements.
Description
:
This instruction adds the 32bit integer element in [63:32] of Rs1 with the 32bit integer element in [63:32] of Rs2, and writes the result to [63:32] of Rd; at the same time, it subtracts the 32bit integer element in [31:0] of Rs2 from the 32bit integer element in [31:0] of Rs1, and writes the result to [31:0] of Rd.
Operations:
Rd.W[1] = Rs1.W[1] + Rs2.W[1]; Rd.W[0] = Rs1.W[0] - Rs2.W[0];
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
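The straight variants pair elements at the same position rather than crossed positions. A minimal software model, assuming results wrap modulo 2^32 because no saturation is described, could look like this; ref_dst32 and add_top are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Software model (an assumption, not the NMSIS implementation).
 * add_top = 0 models DSTSA32 (top: subtract, bottom: add);
 * add_top = 1 models DSTAS32 (top: add, bottom: subtract). */
static uint64_t ref_dst32(uint64_t a, uint64_t b, int add_top)
{
    assert(add_top == 0 || add_top == 1);
    uint32_t a1 = (uint32_t)(a >> 32), a0 = (uint32_t)a;
    uint32_t b1 = (uint32_t)(b >> 32), b0 = (uint32_t)b;
    uint32_t top = add_top ? a1 + b1 : a1 - b1;   /* wraps modulo 2^32 */
    uint32_t bot = add_top ? a0 - b0 : a0 + b0;
    return ((uint64_t)top << 32) | bot;
}
```

Unsigned arithmetic is used so that wraparound is well defined in C; the bit patterns are the same as for two's complement signed addition and subtraction.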
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRSA32 (unsigned long long a, unsigned long long b)
DKCRSA32 (32bit Signed Saturating Cross Subtraction & Addition)
Type: SIMD
Syntax:
DKCRSA32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element saturating subtraction and 32bit signed integer element saturating addition in a 64bit chunk simultaneously. Operands are from crossed 32bit elements.
Description
:
This instruction subtracts the 32bit integer element in [31:0] of Rs2 from the 32bit integer element in [63:32] of Rs1; at the same time, it adds the 32bit integer element in [31:0] of Rs1 with the 32bit integer element in [63:32] of Rs2. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [63:32] of Rd for subtraction and [31:0] of Rd for addition.
Operations:
res[1] = Rs1.W[1] - Rs2.W[0]; res[0] = Rs1.W[0] + Rs2.W[1]; if (res[x] > (2^31)-1) { res[x] = (2^31)-1; OV = 1; } else if (res[x] < -2^31) { res[x] = -2^31; OV = 1; } Rd.W[1] = res[1]; Rd.W[0] = res[0];
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKCRAS32 (unsigned long long a, unsigned long long b)
DKCRAS32 (32bit Signed Saturating Cross Addition & Subtraction)
Type: SIMD
Syntax:
DKCRAS32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element saturating addition and 32bit signed integer element saturating subtraction in a 64bit chunk simultaneously. Operands are from crossed 32bit elements.
Description
:
This instruction adds the 32bit integer element in [63:32] of Rs1 with the 32bit integer element in [31:0] of Rs2; at the same time, it subtracts the 32bit integer element in [63:32] of Rs2 from the 32bit integer element in [31:0] of Rs1. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [63:32] of Rd for addition and [31:0] of Rd for subtraction.
Operations:
res[1] = Rs1.W[1] + Rs2.W[0]; res[0] = Rs1.W[0] - Rs2.W[1]; if (res[x] > (2^31)-1) { res[x] = (2^31)-1; OV = 1; } else if (res[x] < -2^31) { res[x] = -2^31; OV = 1; } Rd.W[1] = res[1]; Rd.W[0] = res[0];
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
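For the 32bit saturating cross operations, the intermediate result needs 33 bits before it is clamped to Q31. The following sketch models this in portable C. The names ref_ov32, ref_sat_q31 and ref_dkcr32, and the add_top selector, are hypothetical; the global flag only stands in for the OV bit.

```c
#include <assert.h>
#include <stdint.h>

static int ref_ov32;   /* hypothetical stand-in for the OV flag */

/* Clamp a 33bit intermediate result to the Q31 range and record overflow. */
static uint32_t ref_sat_q31(int64_t v)
{
    assert(v >= -4294967296LL && v <= 4294967295LL);
    if (v > INT32_MAX) { ref_ov32 = 1; v = INT32_MAX; }
    if (v < INT32_MIN) { ref_ov32 = 1; v = INT32_MIN; }
    return (uint32_t)v;
}

/* Software model (an assumption, not the NMSIS implementation).
 * add_top = 0 models DKCRSA32: top = a.W[1] - b.W[0], bottom = a.W[0] + b.W[1].
 * add_top = 1 models DKCRAS32: top = a.W[1] + b.W[0], bottom = a.W[0] - b.W[1]. */
static uint64_t ref_dkcr32(uint64_t a, uint64_t b, int add_top)
{
    int64_t a1 = (int32_t)(uint32_t)(a >> 32), a0 = (int32_t)(uint32_t)a;
    int64_t b1 = (int32_t)(uint32_t)(b >> 32), b0 = (int32_t)(uint32_t)b;
    uint32_t top = ref_sat_q31(add_top ? a1 + b0 : a1 - b0);
    uint32_t bot = ref_sat_q31(add_top ? a0 - b1 : a0 + b1);
    return ((uint64_t)top << 32) | bot;
}
```

Widening to int64_t before the add or subtract avoids signed overflow in C, so the clamp sees the exact mathematical result.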
 __STATIC_FORCEINLINE unsigned long long __RV_DCRSA32 (unsigned long long a, unsigned long long b)
DCRSA32 (32bit Cross Subtraction & Addition)
Type: SIMD
Syntax:
DCRSA32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit integer element subtraction and 32bit integer element addition in a 64bit chunk simultaneously. Operands are from crossed 32bit elements.
Description
:
This instruction subtracts the 32bit integer element in [31:0] of Rs2 from the 32bit integer element in [63:32] of Rs1, and writes the result to [63:32] of Rd; at the same time, it adds the 32bit integer element in [31:0] of Rs1 with the 32bit integer element in [63:32] of Rs2, and writes the result to [31:0] of Rd.
Operations:
Rd.W[1] = Rs1.W[1] - Rs2.W[0]; Rd.W[0] = Rs1.W[0] + Rs2.W[1];
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DCRAS32 (unsigned long long a, unsigned long long b)
DCRAS32 (32bit Cross Addition & Subtraction)
Type: SIMD
Syntax:
DCRAS32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit integer element addition and 32bit integer element subtraction in a 64bit chunk simultaneously. Operands are from crossed 32bit elements.
Description
:
This instruction adds the 32bit integer element in [63:32] of Rs1 with the 32bit integer element in [31:0] of Rs2, and writes the result to [63:32] of Rd; at the same time, it subtracts the 32bit integer element in [63:32] of Rs2 from the 32bit integer element in [31:0] of Rs1, and writes the result to [31:0] of Rd.
Operations:
Rd.W[1] = Rs1.W[1] + Rs2.W[0]; Rd.W[0] = Rs1.W[0] - Rs2.W[1];
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSTSA16 (unsigned long long a, unsigned long long b)
DKSTSA16 (16bit Signed Saturating Straight Subtraction & Addition)
Type: SIMD
Syntax:
DKSTSA16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element saturating subtraction and 16bit signed integer element saturating addition in a 32bit chunk simultaneously. Operands are from corresponding positions in 32bit chunks.
Description
:
This instruction subtracts the 16bit signed integer element in [31:16] of 32bit chunks in Rs2 from the 16bit signed integer element in [31:16] of 32bit chunks in Rs1; at the same time, it adds the 16bit signed integer element in [15:0] of 32bit chunks in Rs2 with the 16bit signed integer element in [15:0] of 32bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32bit chunks in Rd for subtraction and [15:0] of 32bit chunks in Rd for addition.
Operations:
res1 = Rs1.W[x][31:16] - Rs2.W[x][31:16]; res2 = Rs1.W[x][15:0] + Rs2.W[x][15:0]; for (res in [res1, res2]) { if (res > (2^15)-1) { res = (2^15)-1; OV = 1; } else if (res < -2^15) { res = -2^15; OV = 1; } } Rd.W[x][31:16] = res1; Rd.W[x][15:0] = res2; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSTAS16 (unsigned long long a, unsigned long long b)
DKSTAS16 (16bit Signed Saturating Straight Addition & Subtraction)
Type: SIMD
Syntax:
DKSTAS16 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 16bit signed integer element saturating addition and 16bit signed integer element saturating subtraction in a 32bit chunk simultaneously. Operands are from corresponding positions in 32bit chunks.
Description
:
This instruction adds the 16bit signed integer element in [31:16] of 32bit chunks in Rs1 with the 16bit signed integer element in [31:16] of 32bit chunks in Rs2; at the same time, it subtracts the 16bit signed integer element in [15:0] of 32bit chunks in Rs2 from the 16bit signed integer element in [15:0] of 32bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32bit chunks in Rd for addition and [15:0] of 32bit chunks in Rd for subtraction.
Operations:
res1 = Rs1.W[x][31:16] + Rs2.W[x][31:16]; res2 = Rs1.W[x][15:0] - Rs2.W[x][15:0]; for (res in [res1, res2]) { if (res > (2^15)-1) { res = (2^15)-1; OV = 1; } else if (res < -2^15) { res = -2^15; OV = 1; } } Rd.W[x][31:16] = res1; Rd.W[x][15:0] = res2; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DRSUB32 (unsigned long long a, unsigned long long b)
DRSUB32 (32bit Signed Halving Subtraction)
Type: SIMD
Syntax:
DRSUB32 Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do 32bit signed integer element subtractions simultaneously. The results are halved to avoid overflow or saturation.
Description
:
This instruction subtracts the 32bit signed integer elements in Rs2 from the 32bit signed integer elements in Rs1. The results are first arithmetically right-shifted by 1 bit and then written to Rd.
Operations:
Rd.W[x] = (Rs1.W[x] - Rs2.W[x]) s>> 1; x=1...0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type

__RV_DSCLIP8(a, b)
 group NMSIS_Core_DSP_Intrinsic_NUCLEI_N3
(RV32 only) Nuclei Customized N3 DSP Instructions
These are Nuclei customized N3 DSP instructions, available on RV32 only.
Functions
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMAC (unsigned long long t, unsigned long long a, unsigned long long b)
DKMMAC (64bit MSW 32x32 Signed Multiply and Saturating Add)
Type: SIMD
Syntax:
DKMMAC Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element signed multiplications and saturating addition simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the signed 32bit elements of Rs1 with the signed 32bit elements of Rs2 and adds the most significant 32bit multiplication results with the signed 32bit elements of Rd. If the addition result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32bit of the 64bit multiplication results by adding a 1 to bit 31 of the results.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  res = sat.q31(dop + (aop s* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMAC_U (unsigned long long t, unsigned long long a, unsigned long long b)
DKMMACU (64bit MSW 32x32 Unsigned Multiply and Saturating Add)
Type: SIMD
Syntax:
DKMMACU Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element unsigned multiplications and saturating addition simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the signed 32bit elements of Rs1 with the signed 32bit elements of Rs2 and adds the most significant 32bit multiplication results with the signed 32bit elements of Rd. If the addition result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32bit of the 64bit multiplication results by adding a 1 to bit 31 of the results.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  res = sat.q31(dop + RUND(aop u* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMSB (unsigned long long t, unsigned long long a, unsigned long long b)
DKMMSB (64bit MSW 32x32 Signed Multiply and Saturating Sub)
Type: SIMD
Syntax:
DKMMSB Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element signed multiplications and saturating subtraction simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the signed 32bit elements of Rs1 with the signed 32bit elements of Rs2 and subtracts the most significant 32bit multiplication results from the signed 32bit elements of Rd. If the subtraction result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32bit of the 64bit multiplication results by adding a 1 to bit 31 of the results.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  res = sat.q31(dop - (aop s* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMMSB_U (unsigned long long t, unsigned long long a, unsigned long long b)
DKMMSBU (64bit MSW 32x32 Unsigned Multiply and Saturating Sub)
Type: SIMD
Syntax:
DKMMSBU Rd, Rs1, Rs2 # Rd, Rs1, Rs2 are all even/odd pair of registers
Purpose
:
Do MSW 32x32 element unsigned multiplications and saturating subtraction simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the signed 32bit elements of Rs1 with the signed 32bit elements of Rs2 and subtracts the most significant 32bit multiplication results from the signed 32bit elements of Rd. If the subtraction result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32bit of the 64bit multiplication results by adding a 1 to bit 31 of the results.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  res = sat.q31(dop - (aop u* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMADA (unsigned long long t, unsigned long long a, unsigned long long b)
DKMADA (Saturating Signed Multiply Two Halfs and Two Adds)
Type: DSP
Syntax:
DKMADA Rd, Rs1, Rs2
Purpose
:
Do two 16x16 with 32bit signed double addition simultaneously. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of 32bit elements in Rs1 with the bottom 16bit content of 32bit elements in Rs2 and then adds the result to the result of multiplying the top 16bit content of 32bit elements in Rs1 with the top 16bit content of 32bit elements in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop + mul1 + mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
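The two-lane multiply-accumulate described for DKMADA can be modelled in portable C as follows. This is a sketch for illustration only: `sat_q31`, `kmada_lane` and `dkmada_model` are hypothetical names, the OV flag is not modelled, and the casts rely on wrapping unsigned-to-signed conversion (implementation-defined in C, but the usual behaviour).

```c
#include <stdint.h>

/* Saturate a wide signed value to the Q31 range [-2^31, 2^31-1]. */
static int32_t sat_q31(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* One 32bit lane: acc + top*top + bottom*bottom on signed halfwords. */
static int32_t kmada_lane(int32_t acc, uint32_t a, uint32_t b)
{
    int16_t a_hi = (int16_t)(a >> 16), a_lo = (int16_t)(a & 0xFFFFu);
    int16_t b_hi = (int16_t)(b >> 16), b_lo = (int16_t)(b & 0xFFFFu);
    int64_t mul1 = (int32_t)a_hi * b_hi;
    int64_t mul2 = (int32_t)a_lo * b_lo;
    return sat_q31((int64_t)acc + mul1 + mul2);
}

/* Hypothetical software model of DKMADA on packed 64bit operands. */
static uint64_t dkmada_model(uint64_t t, uint64_t a, uint64_t b)
{
    int32_t rest = kmada_lane((int32_t)(uint32_t)(t >> 32),
                              (uint32_t)(a >> 32), (uint32_t)(b >> 32));
    int32_t resb = kmada_lane((int32_t)(uint32_t)t,
                              (uint32_t)a, (uint32_t)b);
    return ((uint64_t)(uint32_t)rest << 32) | (uint32_t)resb;
}
```

The cross and subtracting variants in this group differ only in which halfwords are paired and in the signs applied to the two products, so the lane helper is the only part that would change.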
 __STATIC_FORCEINLINE unsigned long long __RV_DKMAXDA (unsigned long long t, unsigned long long a, unsigned long long b)
DKMAXDA (Two Cross 16x16 with 32bit Signed Double Add)
Type: DSP
Syntax:
DKMAXDA Rd, Rs1, Rs2
Purpose
:
Do two cross 16x16 with 32bit signed double addition simultaneously. The results are written into Rd.
Description
:
It multiplies the top 16bit content of 32bit elements in Rs1 with the bottom 16bit content of 32bit elements in Rs2 and then adds the result to the result of multiplying the bottom 16bit content of 32bit elements in Rs1 with the top 16bit content of 32bit elements in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[0];
  mul2 = aop.H[0] s* bop.H[1];
  res = sat.q31(dop + mul1 + mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMADS (unsigned long long t, unsigned long long a, unsigned long long b)
DKMADS (Two 16x16 with 32bit Signed Add and Sub)
Type: DSP
Syntax:
DKMADS Rd, Rs1, Rs2
Purpose
:
Do two 16x16 with 32bit signed addition and subtraction simultaneously. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of 32bit elements in Rs1 with the bottom 16bit content of 32bit elements in Rs2 and then subtracts the result from the result of multiplying the top 16bit content of 32bit elements in Rs1 with the top 16bit content of 32bit elements in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop + mul1 - mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMADRS (unsigned long long t, unsigned long long a, unsigned long long b)
DKMADRS (Two 16x16 with 32bit Signed Add and Reversed Sub)
Type: DSP
Syntax:
DKMADRS Rd, Rs1, Rs2
Purpose
:
Do two 16x16 with 32bit signed addition and reversed subtraction simultaneously. The results are written into Rd.
Description
:
It multiplies the top 16bit content of 32bit elements in Rs1 with the top 16bit content of 32bit elements in Rs2 and then subtracts the result from the result of multiplying the bottom 16bit content of 32bit elements in Rs1 with the bottom 16bit content of 32bit elements in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop - mul1 + mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMAXDS (unsigned long long t, unsigned long long a, unsigned long long b)
DKMAXDS (Saturating Signed Crossed Multiply Two Halfs & Subtract & Add)
Type: DSP
Syntax:
DKMAXDS Rd, Rs1, Rs2
Purpose
:
Do two cross 16x16 with 32bit signed addition and subtraction simultaneously. The results are written into Rd.
Description
:
Do two signed 16bit multiplications from 32bit elements in two registers; and then perform a subtraction operation between the two 32bit results. Then add the subtraction result to the corresponding 32bit elements in a third register. The addition result may be saturated.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[0];
  mul2 = aop.H[0] s* bop.H[1];
  res = sat.q31(dop + mul1 - mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMSDA (unsigned long long t, unsigned long long a, unsigned long long b)
DKMSDA (Two 16x16 with 32bit Signed Double Sub)
Type: DSP
Syntax:
DKMSDA Rd, Rs1, Rs2
Purpose
:
Do two 16x16 with 32bit signed double subtraction simultaneously. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2 and multiplies the top 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop - mul1 - mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKMSXDA (unsigned long long t, unsigned long long a, unsigned long long b)
DKMSXDA (Two Cross 16x16 with 32bit Signed Double Sub)
Type: DSP
Syntax:
DKMSXDA Rd, Rs1, Rs2
Purpose
:
Do two cross 16x16 with 32bit signed double subtraction simultaneously. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2 and multiplies the top 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[0];
  mul2 = aop.H[0] s* bop.H[1];
  res = sat.q31(dop - mul1 - mul2);
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DSMAQA (unsigned long long t, unsigned long long a, unsigned long long b)
DSMAQA (Four Signed 8x8 with 32bit Signed Add)
Type: DSP
Syntax:
DSMAQA Rd, Rs1, Rs2
Purpose
:
Do four signed 8x8 with 32bit signed addition simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the four signed 8bit elements of 32bit chunks of Rs1 with the four signed 8bit elements of 32bit chunks of Rs2 and then adds the four results together with the signed content of the corresponding 32bit chunks of Rd. The final results are written back to the corresponding 32bit chunks in Rd.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  m0 = aop.B[0] s* bop.B[0];
  m1 = aop.B[1] s* bop.B[1];
  m2 = aop.B[2] s* bop.B[2];
  m3 = aop.B[3] s* bop.B[3];
  res = dop + m0 + m1 + m2 + m3;
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
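The four-way 8x8 dot product with accumulation described for DSMAQA can be sketched in portable C. The names `smaqa_lane` and `dsmaqa_model` are hypothetical. Since the text mentions no saturation for this instruction, the sketch assumes the 32bit accumulation wraps modulo 2^32.

```c
#include <stdint.h>

/* One 32bit lane: acc plus the sum of four signed byte products. */
static uint32_t smaqa_lane(uint32_t acc, uint32_t a, uint32_t b)
{
    int32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int8_t ab = (int8_t)(uint8_t)(a >> (8 * i));
        int8_t bb = (int8_t)(uint8_t)(b >> (8 * i));
        sum += (int32_t)ab * bb;
    }
    /* Unsigned addition wraps; this is an assumption of the sketch. */
    return acc + (uint32_t)sum;
}

/* Hypothetical software model of DSMAQA on packed 64bit operands. */
static uint64_t dsmaqa_model(uint64_t t, uint64_t a, uint64_t b)
{
    uint32_t rest = smaqa_lane((uint32_t)(t >> 32),
                               (uint32_t)(a >> 32), (uint32_t)(b >> 32));
    uint32_t resb = smaqa_lane((uint32_t)t, (uint32_t)a, (uint32_t)b);
    return ((uint64_t)rest << 32) | resb;
}
```

Each lane is an independent four-element dot product, which is the typical inner step of an int8 convolution or matrix multiply.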
 __STATIC_FORCEINLINE unsigned long long __RV_DSMAQA_SU (unsigned long long t, unsigned long long a, unsigned long long b)
DSMAQASU (Four Signed 8 x Unsigned 8 with 32bit Signed Add)
Type: DSP
Syntax:
DSMAQASU Rd, Rs1, Rs2
Purpose
:
Do four signed 8 x unsigned 8 multiplications with 32bit signed addition simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the four signed 8bit elements of 32bit chunks of Rs1 with the four unsigned 8bit elements of 32bit chunks of Rs2 and then adds the four results together with the signed content of the corresponding 32bit chunks of Rd. The final results are written back to the corresponding 32bit chunks in Rd.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  m0 = aop.B[0] su* bop.B[0];
  m1 = aop.B[1] su* bop.B[1];
  m2 = aop.B[2] su* bop.B[2];
  m3 = aop.B[3] su* bop.B[3];
  res = dop + m0 + m1 + m2 + m3;
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE unsigned long long __RV_DUMAQA (unsigned long long t, unsigned long long a, unsigned long long b)
DUMAQA (Four Unsigned 8x8 with 32bit Unsigned Add)
Type: DSP
Syntax:
DUMAQA Rd, Rs1, Rs2
Purpose
:
Do four unsigned 8x8 with 32bit unsigned addition simultaneously. The results are written into Rd.
Description
:
This instruction multiplies the four unsigned 8bit elements of 32bit chunks of Rs1 with the four unsigned 8bit elements of 32bit chunks of Rs2 and then adds the four results together with the unsigned content of the corresponding 32bit chunks of Rd. The final results are written back to the corresponding 32bit chunks in Rd.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x]; // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  m0 = aop.B[0] u* bop.B[0];
  m1 = aop.B[1] u* bop.B[1];
  m2 = aop.B[2] u* bop.B[2];
  m3 = aop.B[3] u* bop.B[3];
  res = dop + m0 + m1 + m2 + m3;
}
Rd = concat(rest, resb);
x=0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE long long __RV_DKMDA32 (unsigned long long a, unsigned long long b)
DKMDA32 (Two Signed 32x32 with 64bit Saturation Add)
Type: DSP
Syntax:
DKMDA32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications and add the signed multiplication results with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2 and then adds the result to the result of multiplying the top 32bit element of Rs1 with the top 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(t0 + t1);
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMXDA32 (unsigned long long a, unsigned long long b)
DKMXDA32 (Two Cross Signed 32x32 with 64bit Saturation Add)
Type: DSP
Syntax:
DKMXDA32 Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 multiplications and add the signed multiplication results with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the top 32bit element of Rs2 and then adds the result to the result of multiplying the top 32bit element of Rs1 with the bottom 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = sat.q63(t01 + t10);
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
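The Q63 saturation in DKMXDA32 only matters in one corner case, which a small portable C model makes visible. The names `sat_add_q63` and `dkmxda32_model` are hypothetical and the OV flag is not modelled; the sketch assumes wrapping unsigned-to-signed conversion when unpacking the operands.

```c
#include <stdint.h>

/* Add two signed 64bit values, clamping to [INT64_MIN, INT64_MAX]. */
static int64_t sat_add_q63(int64_t x, int64_t y)
{
    if (y > 0 && x > INT64_MAX - y) return INT64_MAX;
    if (y < 0 && x < INT64_MIN - y) return INT64_MIN;
    return x + y;
}

/* Hypothetical software model of DKMXDA32:
 * sat.q63(a.bottom * b.top + a.top * b.bottom). */
static int64_t dkmxda32_model(uint64_t a, uint64_t b)
{
    int32_t a_top = (int32_t)(uint32_t)(a >> 32);
    int32_t a_bot = (int32_t)(uint32_t)a;
    int32_t b_top = (int32_t)(uint32_t)(b >> 32);
    int32_t b_bot = (int32_t)(uint32_t)b;
    int64_t t01 = (int64_t)a_bot * b_top;
    int64_t t10 = (int64_t)a_top * b_bot;
    return sat_add_q63(t01, t10);
}
```

Each product fits in 64 bits, so the sum can only leave the Q63 range when all four elements are -2^31, giving 2^62 + 2^62 = 2^63, which is clamped to 2^63-1.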
 __STATIC_FORCEINLINE long long __RV_DKMADA32 (long long t, unsigned long long a, unsigned long long b)
DKMADA32 (Two Signed 32x32 with 64bit Saturation Add)
Type: DSP
Syntax:
DKMADA32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications and add the signed multiplication results and a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2 and then adds the result to the result of multiplying the top 32bit element of Rs1 with the top 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd + t0 + t1);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMAXDA32 (long long t, unsigned long long a, unsigned long long b)
DKMAXDA32 (Two Cross Signed 32x32 with 64bit Saturation Add)
Type: DSP
Syntax:
DKMAXDA32 Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 multiplications and add the signed multiplication results and a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the top 32bit element in Rs1 with the bottom 32bit element in Rs2 and then adds the result to the result of multiplying the bottom 32bit element in Rs1 with the top 32bit element in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = sat.q63(Rd + t01 + t10);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMADS32 (long long t, unsigned long long a, unsigned long long b)
DKMADS32 (Two Signed 32x32 with 64bit Saturation Add and Sub)
Type: DSP
Syntax:
DKMADS32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications, add the top signed multiplication result, subtract the bottom signed multiplication result, and add a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element in Rs1 with the bottom 32bit element in Rs2 and then subtracts the result from the result of multiplying the top 32bit element in Rs1 with the top 32bit element in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd - t0 + t1);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMADRS32 (long long t, unsigned long long a, unsigned long long b)
DKMADRS32 (Two Signed 32x32 with 64bit Saturation Reversed Add and Sub)
Type: DSP
Syntax:
DKMADRS32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications, subtract the top signed multiplication result, add the bottom signed multiplication result, and add a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the top 32bit element in Rs1 with the top 32bit element in Rs2 and then subtracts the result from the result of multiplying the bottom 32bit element in Rs1 with the bottom 32bit element in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd + t0 - t1);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMAXDS32 (long long t, unsigned long long a, unsigned long long b)
DKMAXDS32 (Two Cross Signed 32x32 with 64bit Saturation Add and Sub)
Type: DSP
Syntax:
DKMAXDS32 Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 multiplications, add the top signed multiplication result, subtract the bottom signed multiplication result, and add a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element in Rs1 with the top 32bit element in Rs2 and then subtracts the result from the result of multiplying the top 32bit element in Rs1 with the bottom 32bit element in Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = sat.q63(Rd - t01 + t10);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMSDA32 (long long t, unsigned long long a, unsigned long long b)
DKMSDA32 (Two Signed 32x32 with 64bit Saturation Sub)
Type: DSP
Syntax:
DKMSDA32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications and subtract both the top and the bottom signed multiplication results from a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2 and multiplies the top 32bit element of Rs1 with the top 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd - t0 - t1);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMSXDA32 (long long t, unsigned long long a, unsigned long long b)
DKMSXDA32 (Two Cross Signed 32x32 with 64bit Saturation Sub)
Type: DSP
Syntax:
DKMSXDA32 Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 multiplications and subtract both signed multiplication results from a third register with Q63 saturation. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the top 32bit element of Rs2 and multiplies the top 32bit element of Rs1 with the bottom 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2t;
t1 = op1t s* op2b;
Rd = sat.q63(Rd - t0 - t1);
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMDS32 (unsigned long long a, unsigned long long b)
DSMDS32 (Two Signed 32x32 with 64bit Sub)
Type: DSP
Syntax:
DSMDS32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications and subtract the bottom signed multiplication result from the top signed multiplication result. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2 and then subtracts the result from the result of multiplying the top 32bit element of Rs1 with the top 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = t1 - t0;
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
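Following the Description of DSMDS32 above (top product minus bottom product), a portable C model might look like the sketch below. The name `dsmds32_model` is hypothetical, and the unpacking casts assume wrapping unsigned-to-signed conversion.

```c
#include <stdint.h>

/* Hypothetical software model of DSMDS32:
 * (a.top * b.top) - (a.bottom * b.bottom) as a signed 64bit value. */
static int64_t dsmds32_model(uint64_t a, uint64_t b)
{
    int32_t a_top = (int32_t)(uint32_t)(a >> 32);
    int32_t a_bot = (int32_t)(uint32_t)a;
    int32_t b_top = (int32_t)(uint32_t)(b >> 32);
    int32_t b_bot = (int32_t)(uint32_t)b;
    int64_t t0 = (int64_t)a_bot * b_bot; /* bottom product */
    int64_t t1 = (int64_t)a_top * b_top; /* top product */
    /* A signed 32x32 product lies in [-(2^62 - 2^31), 2^62], so the
     * difference of two such products always fits in 64 bits. */
    return t1 - t0;
}
```

This range argument is why the instruction needs no saturation step, unlike the accumulating DKM* variants.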
 __STATIC_FORCEINLINE long long __RV_DSMDRS32 (unsigned long long a, unsigned long long b)
DSMDRS32 (Two Signed 32x32 with 64bit Reversed Sub)
Type: DSP
Syntax:
DSMDRS32 Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 multiplications and subtract the top signed multiplication result from the bottom signed multiplication result. The results are written into Rd.
Description
:
It multiplies the top 32bit element of Rs1 with the top 32bit element of Rs2 and then subtracts the result from the result of multiplying the bottom 32bit element of Rs1 with the bottom 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = t0 - t1;
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMXDS32 (unsigned long long a, unsigned long long b)
DSMXDS32 (Two Cross Signed 32x32 with 64bit Sub)
Type: DSP
Syntax:
DSMXDS32 Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 multiplications and subtract the bottom signed multiplication result from the top signed multiplication result. The results are written into Rd.
Description
:
It multiplies the bottom 32bit element of Rs1 with the top 32bit element of Rs2 and then subtracts the result from the result of multiplying the top 32bit element of Rs1 with the bottom 32bit element of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = t10 - t01;
x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMALDA (long long t, unsigned long long a, unsigned long long b)
DSMALDA (Four Signed 16x16 with 64bit Add)
Type: DSP
Syntax:
DSMALDA Rd, Rs1, Rs2
Purpose
:
Do four signed 16x16 multiplications and add the signed multiplication results and a third register. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of Rs1 with the bottom 16bit content of Rs2 and then adds the result to the result of multiplying the top 16bit content of Rs1 with the top 16bit content of Rs2 with unlimited precision.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[0] s* op2b.H[0];
m1 = op1b.H[1] s* op2b.H[1];
m2 = op1t.H[0] s* op2t.H[0];
m3 = op1t.H[1] s* op2t.H[1];
Rd = Rd + m0 + m1 + m2 + m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
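The DSMALDA operation amounts to a four-element signed 16bit dot product added to a 64bit accumulator. A minimal portable C sketch, under the assumption that the 64bit accumulation wraps (the text specifies no saturation), could be written as follows; `dsmalda_model` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical software model of DSMALDA:
 * t + sum over the four halfword positions of a.H[i] * b.H[i]. */
static int64_t dsmalda_model(int64_t t, uint64_t a, uint64_t b)
{
    int64_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int16_t ah = (int16_t)(uint16_t)(a >> (16 * i));
        int16_t bh = (int16_t)(uint16_t)(b >> (16 * i));
        sum += (int32_t)ah * bh; /* each product fits in 32 bits */
    }
    /* Add through unsigned arithmetic so overflow wraps instead of
     * being undefined behaviour; wrapping is an assumption here. */
    return (int64_t)((uint64_t)t + (uint64_t)sum);
}
```

Chaining calls with the previous result passed as `t` gives a running 64bit accumulation over a Q15 vector pair.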
 __STATIC_FORCEINLINE long long __RV_DSMALXDA (long long t, unsigned long long a, unsigned long long b)
DSMALXDA (Four Cross Signed 16x16 with 64bit Add)
Type: DSP
Syntax:
DSMALXDA Rd, Rs1, Rs2
Purpose
:
Do four cross signed 16x16 multiplications and add the signed multiplication results and a third register. The results are written into Rd.
Description
:
It multiplies the top 16bit content of Rs1 with the bottom 16bit content of Rs2 and then adds the result to the result of multiplying the bottom 16bit content of Rs1 with the top 16bit content of Rs2 with unlimited precision.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[0] s* op2b.H[1];
m1 = op1b.H[1] s* op2b.H[0];
m2 = op1t.H[0] s* op2t.H[1];
m3 = op1t.H[1] s* op2t.H[0];
Rd = Rd + m0 + m1 + m2 + m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMALDS (long long t, unsigned long long a, unsigned long long b)
DSMALDS (Four Signed 16x16 with 64bit Add and Sub)
Type: DSP
Syntax:
DSMALDS Rd, Rs1, Rs2
Purpose
:
Do four signed 16x16 multiplications, then add the top and subtract the bottom signed multiplication results of each 32bit element, together with a third register. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of Rs1 with the bottom 16bit content of Rs2 and then subtracts the result from the result of multiplying the top 16bit content of Rs1 with the top 16bit content of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[1] s* op2b.H[1];
m1 = op1b.H[0] s* op2b.H[0];
m2 = op1t.H[1] s* op2t.H[1];
m3 = op1t.H[0] s* op2t.H[0];
Rd = Rd + m0 - m1 + m2 - m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
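The sign pattern of DSMALDS (top product added, bottom product subtracted, per 32bit element) is easy to get wrong, so a small portable C model may help when checking results. `dsmalds_model` is a hypothetical name, and wrapping 64bit accumulation is assumed because the text specifies no saturation.

```c
#include <stdint.h>

/* Hypothetical software model of DSMALDS: for each 32bit word w of
 * a and b, add (top half * top half) - (bottom half * bottom half)
 * to the 64bit accumulator t. */
static int64_t dsmalds_model(int64_t t, uint64_t a, uint64_t b)
{
    int64_t delta = 0;
    for (int w = 0; w < 2; w++) {
        uint32_t aw = (uint32_t)(a >> (32 * w));
        uint32_t bw = (uint32_t)(b >> (32 * w));
        int32_t top = (int32_t)(int16_t)(aw >> 16) * (int16_t)(bw >> 16);
        int32_t bot = (int32_t)(int16_t)(aw & 0xFFFFu) * (int16_t)(bw & 0xFFFFu);
        delta += (int64_t)top - bot;
    }
    /* Unsigned addition wraps; this is an assumption of the sketch. */
    return (int64_t)((uint64_t)t + (uint64_t)delta);
}
```

With complex numbers packed as (imaginary, real) halfword pairs, this top-minus-bottom pattern is the kind of step used in complex multiply-accumulate kernels.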
 __STATIC_FORCEINLINE long long __RV_DSMALDRS (long long t, unsigned long long a, unsigned long long b)
DSMALDRS (Four Signed 16x16 with 64bit Add and Reversed Sub)
Type: DSP
Syntax:
DSMALDRS Rd, Rs1, Rs2
Purpose
:
Do four signed 16x16 multiplications, then add the bottom and subtract the top signed multiplication results of each 32bit element (reversed subtraction), together with a third register. The results are written into Rd.
Description
:
It multiplies the top 16bit content of Rs1 with the top 16bit content of Rs2 and then subtracts the result from the result of multiplying the bottom 16bit content of Rs1 with the bottom 16bit content of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[0] s* op2b.H[0];
m1 = op1b.H[1] s* op2b.H[1];
m2 = op1t.H[0] s* op2t.H[0];
m3 = op1t.H[1] s* op2t.H[1];
Rd = Rd + m0 - m1 + m2 - m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMALXDS (long long t, unsigned long long a, unsigned long long b)
DSMALXDS (Four Cross Signed 16x16 with 64bit Add and Sub)
Type: DSP
Syntax:
DSMALXDS Rd, Rs1, Rs2
Purpose
:
Do four cross signed 16x16 and add and subtraction signed multiplication results and a third register. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of Rs1 with the top 16bit content of Rs2 and then subtracts the result from the result of multiplying the top 16bit content of Rs1 with the bottom 16bit content of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[1] s* op2b.H[0];
m1 = op1b.H[0] s* op2b.H[1];
m2 = op1t.H[1] s* op2t.H[0];
m3 = op1t.H[0] s* op2t.H[1];
Rd = Rd + m0 - m1 + m2 - m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
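A minimal host-side sketch of the DSMALXDS cross multiply and subtract follows. The names `dsmalxds_model` and `dsmalxds_model_example` are hypothetical, and the sketch assumes that each 32bit word contributes (top of Rs1 * bottom of Rs2) minus (bottom of Rs1 * top of Rs2), as the description states, with no overflow of the 64bit accumulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DSMALXDS (not the NMSIS intrinsic). */
int64_t dsmalxds_model(int64_t t, uint64_t a, uint64_t b) {
    for (int w = 0; w < 2; w++) {
        /* Signed 16bit halves of 32bit word w; casts truncate then sign-extend. */
        int64_t a_bot = (int16_t)(a >> (32 * w));
        int64_t a_top = (int16_t)(a >> (32 * w + 16));
        int64_t b_bot = (int16_t)(b >> (32 * w));
        int64_t b_top = (int16_t)(b >> (32 * w + 16));
        t += a_top * b_bot - a_bot * b_top;
    }
    return t;
}

/* Halfwords of a are 4,3,2,1 and of b are 8,7,6,5 (top to bottom). */
void dsmalxds_model_example(void) {
    /* 100 + (2*5 - 1*6) + (4*7 - 3*8) = 108 */
    assert(dsmalxds_model(100, 0x0004000300020001ULL, 0x0008000700060005ULL) == 108);
}
```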
 __STATIC_FORCEINLINE long long __RV_DSMSLDA (long long t, unsigned long long a, unsigned long long b)
DSMSLDA (Four Signed 16x16 with 64bit Sub)
Type: DSP
Syntax:
DSMSLDA Rd, Rs1, Rs2
Purpose
:
Do four signed 16x16 multiplications and subtract the four products from a third register. The results are written into Rd.
Description
:
It multiplies the bottom 16bit content of Rs1 with the bottom 16bit content of Rs2 and multiplies the top 16bit content of Rs1 with the top 16bit content of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[0] s* op2b.H[0];
m1 = op1b.H[1] s* op2b.H[1];
m2 = op1t.H[0] s* op2t.H[0];
m3 = op1t.H[1] s* op2t.H[1];
Rd = Rd - m0 - m1 - m2 - m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
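The DSMSLDA accumulation can be illustrated with a small reference model in plain C. The function names below are hypothetical and the sketch assumes that all four straight (bottom*bottom, top*top) products are subtracted from the accumulator without overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DSMSLDA (not the NMSIS intrinsic):
   subtracts the four signed 16x16 products of matching halfwords from t. */
int64_t dsmslda_model(int64_t t, uint64_t a, uint64_t b) {
    for (int i = 0; i < 4; i++) {
        int64_t x = (int16_t)(a >> (16 * i));  /* signed halfword i of a */
        int64_t y = (int16_t)(b >> (16 * i));  /* signed halfword i of b */
        t -= x * y;
    }
    return t;
}

/* Halfwords of a are 4,3,2,1 and of b are 8,7,6,5 (top to bottom). */
void dsmslda_model_example(void) {
    /* 100 - (1*5 + 2*6 + 3*7 + 4*8) = 30 */
    assert(dsmslda_model(100, 0x0004000300020001ULL, 0x0008000700060005ULL) == 30);
}
```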
 __STATIC_FORCEINLINE long long __RV_DSMSLXDA (long long t, unsigned long long a, unsigned long long b)
DSMSLXDA (Four Cross Signed 16x16 with 64bit Sub)
Type: DSP
Syntax:
DSMSLXDA Rd, Rs1, Rs2
Purpose
:
Do four cross signed 16x16 multiplications and subtract the four products from a third register. The results are written into Rd.
Description
:
It multiplies the top 16bit content of Rs1 with the bottom 16bit content of Rs2 and multiplies the bottom 16bit content of Rs1 with the top 16bit content of Rs2.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.H[0] s* op2b.H[1];
m1 = op1b.H[1] s* op2b.H[0];
m2 = op1t.H[0] s* op2t.H[1];
m3 = op1t.H[1] s* op2t.H[0];
Rd = Rd - m0 - m1 - m2 - m3;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DDSMAQA (long long t, unsigned long long a, unsigned long long b)
DDSMAQA (Eight Signed 8x8 with 64bit Add)
Type: DSP
Syntax:
DDSMAQA Rd, Rs1, Rs2
Purpose
:
Do eight signed 8x8 and add signed multiplication results and a third register. The results are written into Rd.
Description
:
Do eight signed 8bit multiplications from eight 8bit chunks of two registers; and then adds the eight 16bit results and the content of 64bit chunks of a third register.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.B[0] s* op2b.B[0];
m1 = op1b.B[1] s* op2b.B[1];
m2 = op1b.B[2] s* op2b.B[2];
m3 = op1b.B[3] s* op2b.B[3];
m4 = op1t.B[0] s* op2t.B[0];
m5 = op1t.B[1] s* op2t.B[1];
m6 = op1t.B[2] s* op2t.B[2];
m7 = op1t.B[3] s* op2t.B[3];
s0 = m0 + m1 + m2 + m3;
s1 = m4 + m5 + m6 + m7;
Rd = Rd + s0 + s1;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
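As an illustration of the DDSMAQA dot product, the following plain C sketch accumulates eight signed 8x8 products into a 64bit value. `ddsmaqa_model` and its example function are hypothetical names, and the sketch assumes the accumulator does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DDSMAQA (not the NMSIS intrinsic):
   t plus the sum of eight signed byte-by-byte products of a and b. */
int64_t ddsmaqa_model(int64_t t, uint64_t a, uint64_t b) {
    for (int i = 0; i < 8; i++) {
        int8_t x = (int8_t)(a >> (8 * i));  /* signed byte i of a */
        int8_t y = (int8_t)(b >> (8 * i));  /* signed byte i of b */
        t += (int64_t)x * y;
    }
    return t;
}

void ddsmaqa_model_example(void) {
    /* Bytes of a are 1..8, bytes of b are all 1: 10 + (1+2+...+8) = 46 */
    assert(ddsmaqa_model(10, 0x0807060504030201ULL, 0x0101010101010101ULL) == 46);
    /* 0x80 is -128 as a signed byte: (-128) * (-128) = 16384 */
    assert(ddsmaqa_model(0, 0x80, 0x80) == 16384);
}
```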
 __STATIC_FORCEINLINE long long __RV_DDSMAQASU (long long t, unsigned long long a, unsigned long long b)
DDSMAQASU (Eight Signed 8 x Unsigned 8 with 64bit Add)
Type: DSP
Syntax:
DDSMAQASU Rd, Rs1, Rs2
Purpose
:
Do eight signed 8 x unsigned 8 and add signed multiplication results and a third register. The results are written into Rd.
Description
:
Do eight signed 8bit x unsigned 8bit multiplications from eight 8bit chunks of two registers; and then adds the eight 16bit results and the content of 64bit chunks of a third register.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.B[0] su* op2b.B[0];
m1 = op1b.B[1] su* op2b.B[1];
m2 = op1b.B[2] su* op2b.B[2];
m3 = op1b.B[3] su* op2b.B[3];
m4 = op1t.B[0] su* op2t.B[0];
m5 = op1t.B[1] su* op2t.B[1];
m6 = op1t.B[2] su* op2t.B[2];
m7 = op1t.B[3] su* op2t.B[3];
s0 = m0 + m1 + m2 + m3;
s1 = m4 + m5 + m6 + m7;
Rd = Rd + s0 + s1;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
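The mixed-sign multiply of DDSMAQASU can be sketched in plain C as below. The sketch assumes that `su*` treats the bytes of Rs1 as signed and the bytes of Rs2 as unsigned, which is an assumption made here rather than something this entry spells out; `ddsmaqasu_model` and its example function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DDSMAQASU (not the NMSIS intrinsic).
   Assumption: bytes of a are signed, bytes of b are unsigned. */
int64_t ddsmaqasu_model(int64_t t, uint64_t a, uint64_t b) {
    for (int i = 0; i < 8; i++) {
        int8_t  x = (int8_t)(a >> (8 * i));   /* signed byte i of a */
        uint8_t y = (uint8_t)(b >> (8 * i));  /* unsigned byte i of b */
        t += (int64_t)x * y;
    }
    return t;
}

void ddsmaqasu_model_example(void) {
    /* 0xFF is -1 when signed and 255 when unsigned: -1 * 255 = -255 */
    assert(ddsmaqasu_model(0, 0xFF, 0xFF) == -255);
    /* 127 * 128 = 16256 */
    assert(ddsmaqasu_model(0, 0x7F, 0x80) == 16256);
}
```

With the same inputs, an all-signed multiply would give 1 for the first case, which is one quick way to tell the two variants apart in a test.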
 __STATIC_FORCEINLINE long long __RV_DDUMAQA (long long t, unsigned long long a, unsigned long long b)
DDUMAQA (Eight Unsigned 8x8 with 64bit Unsigned Add)
Type: DSP
Syntax:
DDUMAQA Rd, Rs1, Rs2
Purpose
:
Do eight unsigned 8x8 and add unsigned multiplication results and a third register. The results are written into Rd.
Description
:
Do eight unsigned 8bit multiplications from eight 8bit chunks of two registers; and then adds the eight 16bit results and the content of 64bit chunks of a third register.
Operations:
op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
m0 = op1b.B[0] u* op2b.B[0];
m1 = op1b.B[1] u* op2b.B[1];
m2 = op1b.B[2] u* op2b.B[2];
m3 = op1b.B[3] u* op2b.B[3];
m4 = op1t.B[0] u* op2t.B[0];
m5 = op1t.B[1] u* op2t.B[1];
m6 = op1t.B[2] u* op2t.B[2];
m7 = op1t.B[3] u* op2t.B[3];
s0 = m0 + m1 + m2 + m3;
s1 = m4 + m5 + m6 + m7;
Rd = Rd + s0 + s1;
x=0
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long __RV_DSMA32_U (unsigned long long a, unsigned long long b)
DSMA32.u (64bit SIMD 32bit Signed Multiply Addition With Rounding and Clip)
Type: DSP
Syntax:
DSMA32.u Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 and add signed multiplication results with Rounding, then right shift 32bit and clip q63 to q31. The result is written to Rd.
Description
:
For the DSMA32.u instruction, multiply the top 32bit Q31 content of 64bit chunks in Rs1 with the top 32bit Q31 content of 64bit chunks in Rs2. At the same time, multiply the bottom 32bit Q31 content of 64bit chunks in Rs1 with the bottom 32bit Q31 content of 64bit chunks in Rs2. Then add the two products, apply the additional rounding operation, shift the data right by 32bit, and clip the 64bit data into 32bit. The result is written to Rd.
Operations:
Rd = (q31_t)((Rs1.W[x] s* Rs2.W[x] + Rs1.W[x + 1] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32); x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
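The rounding step of DSMA32.u (add 0x80000000, then shift right by 32) can be seen in a short host-side sketch. `dsma32_u_model` and its example function are hypothetical names. The sketch assumes an arithmetic right shift for negative values, as GCC and Clang provide, and that the sum of the two products stays within 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DSMA32.u (not the NMSIS intrinsic).
   The casts to int32_t rely on the usual two's complement truncation. */
int32_t dsma32_u_model(uint64_t a, uint64_t b) {
    int64_t lo = (int64_t)(int32_t)a * (int32_t)b;                  /* W[0] * W[0] */
    int64_t hi = (int64_t)(int32_t)(a >> 32) * (int32_t)(b >> 32);  /* W[1] * W[1] */
    int64_t sum = lo + hi + 0x80000000LL;                           /* round to nearest */
    return (int32_t)(sum >> 32);
}

void dsma32_u_model_example(void) {
    /* All four words are 0x40000000: 2^60 + 2^60 = 2^61, shifted right by 32 is 2^29. */
    assert(dsma32_u_model(0x4000000040000000ULL, 0x4000000040000000ULL) == 0x20000000);
    /* 2^16 * 2^15 = 2^31 is exactly half of the output step, so rounding yields 1. */
    assert(dsma32_u_model(0x00010000ULL, 0x00008000ULL) == 1);
}
```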
 __STATIC_FORCEINLINE long __RV_DSMXS32_U (unsigned long long a, unsigned long long b)
DSMXS32.u (64bit SIMD 32bit Signed Multiply Cross Subtraction With Rounding and Clip)
Type: DSP
Syntax:
DSMXS32.u Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 and sub signed multiplication results with Rounding, then right shift 32bit and clip q63 to q31. The result is written to Rd.
Description
:
For the DSMXS32.u instruction, multiply the top 32bit Q31 content of 64bit chunks in Rs1 with the bottom 32bit Q31 content of 64bit chunks in Rs2. At the same time, multiply the bottom 32bit Q31 content of 64bit chunks in Rs1 with the top 32bit Q31 content of 64bit chunks in Rs2. Then subtract the second product from the first, apply the additional rounding operation, shift the data right by 32bit, and clip the 64bit data into 32bit. The result is written to Rd.
Operations:
Rd = (q31_t)((Rs1.W[x + 1] s* Rs2.W[x] - Rs1.W[x] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32); x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
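For DSMXS32.u, the order of the cross subtraction is easy to get wrong, so a small sketch may help. It assumes the result is (top of Rs1 * bottom of Rs2) minus (bottom of Rs1 * top of Rs2), as the description states, plus the rounding constant, with an arithmetic right shift. `dsmxs32_u_model` and its example function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DSMXS32.u (not the NMSIS intrinsic). */
int32_t dsmxs32_u_model(uint64_t a, uint64_t b) {
    int64_t a0 = (int32_t)a, a1 = (int32_t)(a >> 32);  /* bottom, top word of a */
    int64_t b0 = (int32_t)b, b1 = (int32_t)(b >> 32);  /* bottom, top word of b */
    int64_t diff = a1 * b0 - a0 * b1 + 0x80000000LL;   /* cross subtract, then round */
    return (int32_t)(diff >> 32);
}

void dsmxs32_u_model_example(void) {
    /* a = {top 2^30, bottom 2^29}, b = {top 2^30, bottom 2^30}:
       2^60 - 2^59 = 2^59, shifted right by 32 is 2^27. */
    assert(dsmxs32_u_model(0x4000000020000000ULL, 0x4000000040000000ULL) == 134217728);
    /* Swapping the words of a flips the sign of the difference. */
    assert(dsmxs32_u_model(0x2000000040000000ULL, 0x4000000040000000ULL) == -134217728);
}
```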
 __STATIC_FORCEINLINE long __RV_DSMXA32_U (unsigned long long a, unsigned long long b)
DSMXA32.u (64bit SIMD 32bit Signed Cross Multiply Addition with Rounding and Clip)
Type: DSP
Syntax:
DSMXA32.u Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 and add signed multiplication results with Rounding, then right shift 32bit and clip q63 to q31. The result is written to Rd.
Description
:
For the DSMXA32.u instruction, multiply the top 32bit Q31 content of 64bit chunks in Rs1 with the bottom 32bit Q31 content of 64bit chunks in Rs2. At the same time, multiply the bottom 32bit Q31 content of 64bit chunks in Rs1 with the top 32bit Q31 content of 64bit chunks in Rs2. Then add the two products, apply the additional rounding operation, shift the data right by 32bit, and clip the 64bit data into 32bit. The result is written to Rd.
Operations:
Rd = (q31_t)((Rs1.W[x + 1] s* Rs2.W[x] + Rs1.W[x] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32); x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
 __STATIC_FORCEINLINE long __RV_DSMS32_U (unsigned long long a, unsigned long long b)
DSMS32.u (64bit SIMD 32bit Signed Multiply Subtraction with Rounding and Clip)
Type: DSP
Syntax:
DSMS32.u Rd, Rs1, Rs2
Purpose
:
Do two signed 32x32 and sub signed multiplication results with Rounding, then right shift 32bit and clip q63 to q31. The result is written to Rd.
Description
:
For the DSMS32.u instruction, multiply the bottom 32bit Q31 content of 64bit chunks in Rs1 with the bottom 32bit Q31 content of 64bit chunks in Rs2. At the same time, multiply the top 32bit Q31 content of 64bit chunks in Rs1 with the top 32bit Q31 content of 64bit chunks in Rs2. Then subtract the second product from the first, apply the additional rounding operation, shift the data right by 32bit, and clip the 64bit data into 32bit. The result is written to Rd.
Operations:
Rd = (q31_t)((Rs1.W[x] s* Rs2.W[x] - Rs1.W[x + 1] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32); x=0
 Parameters
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
 __STATIC_FORCEINLINE long __RV_DSMADA16 (long long t, unsigned long long a, unsigned long long b)
DSMADA16 (Signed Multiply Two Halfs and Two Adds 32bit)
Type: SIMD
Syntax:
DSMADA16 Rd, Rs1, Rs2
Purpose
:
Do two signed 16bit multiplications of two 32bit registers; and then adds the 32bit results and the 32bit value of an even/odd pair of registers together.
DSMADA16: rt pair+ top*top + bottom*bottom
Description
:
This instruction multiplies each 16bit content of the 32bit elements of Rs1 with the corresponding 16bit content of the 32bit elements of Rs2. The result is added to the 32bit value of an even/odd pair of registers specified by Rd(4,1). The 32bit addition result is written back to the register pair. The 16bit values of Rs1 and Rs2, and the 32bit value of the register pair are treated as signed integers.
Operations:
Mres0[0][31:0] = (Rs1.W[0].H[0] * Rs2.W[0].H[0]); Mres1[0][31:0] = (Rs1.W[0].H[1] * Rs2.W[0].H[1]); Mres0[1][31:0] = (Rs1.W[1].H[0] * Rs2.W[1].H[0]); Mres1[1][31:0] = (Rs1.W[1].H[1] * Rs2.W[1].H[1]); Rd.W = Rd.W + SE32(Mres0[0][31:0]) + SE32(Mres1[0][31:0]) + SE32(Mres0[1][31:0]) + SE32(Mres1[1][31:0]);
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
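A rough host-side sketch of the DSMADA16 accumulation is given below. Several points are assumptions made for illustration only: that `Rd.W` refers to the low 32 bits of `t`, and that the 32bit sum wraps modulo 2^32. `dsmada16_model` and its example function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DSMADA16 (not the NMSIS intrinsic).
   Assumptions: the accumulator is the low word of t; the sum wraps. */
int32_t dsmada16_model(int64_t t, uint64_t a, uint64_t b) {
    uint32_t acc = (uint32_t)t;
    for (int i = 0; i < 4; i++) {
        int32_t x = (int16_t)(a >> (16 * i));  /* signed halfword i of a */
        int32_t y = (int16_t)(b >> (16 * i));  /* signed halfword i of b */
        acc += (uint32_t)(x * y);              /* product fits in 32 bits */
    }
    return (int32_t)acc;
}

/* Halfwords of a are 4,3,2,1 and of b are 8,7,6,5 (top to bottom). */
void dsmada16_model_example(void) {
    /* 100 + 1*5 + 2*6 + 3*7 + 4*8 = 170 */
    assert(dsmada16_model(100, 0x0004000300020001ULL, 0x0008000700060005ULL) == 170);
}
```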
 __STATIC_FORCEINLINE long __RV_DSMAXDA16 (long long t, unsigned long long a, unsigned long long b)
DSMAXDA16 (Signed Crossed Multiply Two Halfs and Two Adds 32bit)
Type: SIMD
Syntax:
DSMAXDA16 Rd, Rs1, Rs2
Purpose
:
Do two signed 16bit multiplications of two 32bit registers; and then adds the 32bit results and the 32bit value of an even/odd pair of registers together.
DSMAXDA16: rt pair + top*bottom + bottom*top (all 32bit elements)
Description
:
This instruction cross-multiplies the top 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2 and then adds the result to the result of multiplying the bottom 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2 with unlimited precision. The result is added to the 64bit value of an even/odd pair of registers specified by Rd(4,1). The 64bit addition result is clipped to a 32bit result.
Operations:
Mres0[0][31:0] = (Rs1.W[0].H[0] * Rs2.W[0].H[1]); Mres1[0][31:0] = (Rs1.W[0].H[1] * Rs2.W[0].H[0]); Mres0[1][31:0] = (Rs1.W[1].H[0] * Rs2.W[1].H[1]); Mres1[1][31:0] = (Rs1.W[1].H[1] * Rs2.W[1].H[0]); Rd.W = Rd.W + SE32(Mres0[0][31:0]) + SE32(Mres1[0][31:0]) + SE32(Mres0[1][31:0]) + SE32(Mres1[1][31:0]);
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
 __STATIC_FORCEINLINE unsigned long long __RV_DKSMS32_U (unsigned long long t, unsigned long long a, unsigned long long b)
DKSMS32.u (Two Signed Multiply Shiftclip and Saturation with Rounding)
Type: SIMD
Syntax:
DKSMS32.u Rd, Rs1, Rs2
Purpose
:
Computes saturated multiplication of two pairs of q31 type with shifted rounding.
Description
:
Compute the product of Rs1 and Rs2 of type q31_t, take bits [47:16] of the resulting 64bit product as a 32bit number with rounding (1 is added at bit 15 before the lowest bit is dropped), and finally saturate the result after rounding.
Operations:
Mres[x][63:0] = Rs1.W[x] s* Rs2.W[x]; Round[x][32:0] = Mres[x][47:15] + 1; Rd.W[x] = sat.31(Rd.W[x] + Round[x][32:1]); x=1...0
 Parameters
t – [in] unsigned long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in unsigned long long type
 __STATIC_FORCEINLINE long __RV_DMADA32 (long long t, unsigned long long a, unsigned long long b)
DMADA32 (Two Cross Signed 32x32 with 64bit Add and Clip to 32bit)
Type: SIMD
Syntax:
DMADA32 Rd, Rs1, Rs2
Purpose
:
Do two cross signed 32x32 multiplications and add the signed multiplication results to q63, then clip the q63 result to q31. The final results are written into Rd.
Description
:
For the DMADA32 instruction, it multiplies the top 32bit element in Rs1 with the bottom 32bit element in Rs2 and then adds the result to the result of multiplying the bottom 32bit element in Rs1 with the top 32bit element in Rs2, then clips the q63 result to q31.
Operations:
res = (q31_t)((((q63_t) Rd.w[0] << 32) + (q63_t)Rs1.w[0] s* Rs2.w[1] + (q63_t)Rs1.w[1] s* Rs2.w[0]) s>> 32); rd = res;
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long type
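The DMADA32 operation line above can be transcribed into plain C for host-side checking. This is a sketch only: `dmada32_model` and its example function are hypothetical names, the low word of `t` is taken as `Rd.w[0]`, an arithmetic right shift is assumed, and the 64bit intermediate is assumed not to overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DMADA32 (not the NMSIS intrinsic). */
int32_t dmada32_model(int64_t t, uint64_t a, uint64_t b) {
    int64_t a0 = (int32_t)a, a1 = (int32_t)(a >> 32);  /* bottom, top word of a */
    int64_t b0 = (int32_t)b, b1 = (int32_t)(b >> 32);  /* bottom, top word of b */
    /* Rd.w[0] << 32, done on the unsigned value to avoid shifting a negative. */
    int64_t acc = (int64_t)((uint64_t)t << 32);
    acc += a0 * b1 + a1 * b0;                          /* the two cross products */
    return (int32_t)(acc >> 32);
}

void dmada32_model_example(void) {
    /* All words 0x40000000: 2^60 + 2^60 = 2^61, so the result is 3 + 2^29. */
    assert(dmada32_model(3, 0x4000000040000000ULL, 0x4000000040000000ULL) == 536870915);
    /* Only the cross terms contribute: bottom(a) = 2^16, top(b) = 2^16. */
    assert(dmada32_model(0, 0x0000000000010000ULL, 0x0001000000000000ULL) == 1);
}
```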
 __STATIC_FORCEINLINE long long __RV_DSMALBB (long long t, unsigned long long a, unsigned long long b)
DSMALBB (Signed Multiply Bottom Halfs & Add 64bit)
Type: SIMD
Syntax:
DSMALBB Rd, Rs1, Rs2
Purpose
:
Multiply the signed 16bit content of the 32bit elements of a register with the 16bit content of the corresponding 32bit elements of another register and add the results with a 64bit value of an even/odd pair of registers. The addition result is written back to the register pair.
DSMALBB: rt pair + bottom*bottom (all 32bit elements)
Description
:
For the DSMALBB instruction, it multiplies the bottom 16bit content of the 32bit elements of Rs1 with the bottom 16bit content of the 32bit elements of Rs2. The multiplication results are added with the 64bit value of Rd. The 64bit addition result is written back to Rd.
Operations:
Mres[0][31:0] = Rs1.W[0].H[0] * Rs2.W[0].H[0]; Mres[1][31:0] = Rs1.W[1].H[0] * Rs2.W[1].H[0]; Rd = Rd + SE64(Mres[0][31:0]) + SE64(Mres[1][31:0]);
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMALBT (long long t, unsigned long long a, unsigned long long b)
DSMALBT (Signed Multiply Bottom Half & Top Half & Add 64bit)
Type: SIMD
Syntax:
DSMALBT Rd, Rs1, Rs2
Purpose
:
Multiply the signed 16bit content of the 32bit elements of a register with the 16bit content of the corresponding 32bit elements of another register and add the results with a 64bit value of an even/odd pair of registers. The addition result is written back to the register pair.
DSMALBT: rt pair + bottom*top (all 32bit elements)
Description
:
For the DSMALBT instruction, it multiplies the bottom 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2. The multiplication results are added with the 64bit value of Rd. The 64bit addition result is written back to Rd. The 16bit values of Rs1 and Rs2, and the 64bit value of Rd are treated as signed integers.
Operations:
Mres[0][31:0] = Rs1.W[0].H[0] * Rs2.W[0].H[1]; Mres[1][31:0] = Rs1.W[1].H[0] * Rs2.W[1].H[1]; Rd = Rd + SE64(Mres[0][31:0]) + SE64(Mres[1][31:0]);
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DSMALTT (long long t, unsigned long long a, unsigned long long b)
DSMALTT (Signed Multiply Top Half & Add 64bit)
Type: SIMD
Syntax:
DSMALTT Rd, Rs1, Rs2
Purpose
:
Multiply the signed 16bit content of the 32bit elements of a register with the 16bit content of the corresponding 32bit elements of another register and add the results with a 64bit value of an even/odd pair of registers. The addition result is written back to the register pair.
DSMALTT: rt pair + top*top (all 32bit elements)
Description
:
For the DSMALTT instruction, it multiplies the top 16bit content of the 32bit elements of Rs1 with the top 16bit content of the 32bit elements of Rs2. The multiplication results are added with the 64bit value of Rd. The 64bit addition result is written back to Rd. The 16bit values of Rs1 and Rs2, and the 64bit value of Rd are treated as signed integers.
Operations:
Mres[0][31:0] = Rs1.W[0].H[1] * Rs2.W[0].H[1]; Mres[1][31:0] = Rs1.W[1].H[1] * Rs2.W[1].H[1]; Rd = Rd + SE64(Mres[0][31:0]) + SE64(Mres[1][31:0]);
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
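DSMALBB, DSMALBT and DSMALTT differ only in which 16bit half of each 32bit element is multiplied, so one parameterized sketch can cover all three. `dsmal_model` and its example function are hypothetical names, the half selectors `ha` and `hb` are an illustration device rather than instruction operands, and the sketch assumes the 64bit accumulator does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DSMALBB/DSMALBT/DSMALTT (not NMSIS).
   ha and hb pick the half of each 32bit word of a and b:
   0 = bottom, 1 = top. BB = (0,0), BT = (0,1), TT = (1,1). */
int64_t dsmal_model(int64_t t, uint64_t a, uint64_t b, int ha, int hb) {
    for (int w = 0; w < 2; w++) {
        int16_t x = (int16_t)(a >> (32 * w + 16 * ha));
        int16_t y = (int16_t)(b >> (32 * w + 16 * hb));
        t += (int64_t)x * y;  /* sign-extended 32bit product added to 64bit t */
    }
    return t;
}

/* Halfwords of a are 4,3,2,1 and of b are 8,7,6,5 (top to bottom). */
void dsmal_model_example(void) {
    const uint64_t a = 0x0004000300020001ULL, b = 0x0008000700060005ULL;
    assert(dsmal_model(100, a, b, 0, 0) == 126);  /* BB: 1*5 + 3*7 */
    assert(dsmal_model(100, a, b, 0, 1) == 130);  /* BT: 1*6 + 3*8 */
    assert(dsmal_model(100, a, b, 1, 1) == 144);  /* TT: 2*6 + 4*8 */
}
```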
 __STATIC_FORCEINLINE long long __RV_DKMABB32 (long long t, unsigned long long a, unsigned long long b)
DKMABB32 (Saturating Signed Multiply Bottom Words & Add)
Type: SIMD
Syntax:
DKMABB32 Rd, Rs1, Rs2
Purpose
:
Multiply the signed 32bit element in a register with the 32bit element in another register and add the result to the content of 64bit data in the third register. The addition result may be saturated and is written to the third register.
DKMABB32: rd + bottom*bottom
Description
:
For the DKMABB32 instruction, it multiplies the bottom 32bit element in Rs1 with the bottom 32bit element in Rs2. The multiplication result is added to the content of 64bit data in Rd. If the addition result is beyond the Q63 number range (-2^63 <= Q63 <= 2^63-1), it is saturated to the range and the OV bit is set to 1. The result after saturation is written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = Rd + (Rs1.W[0] * Rs2.W[0]); if (res > (2^63)-1) { res = (2^63)-1; OV = 1; } else if (res < -2^63) { res = -2^63; OV = 1; } Rd = res;
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
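The Q63 saturation described for DKMABB32 can be sketched in portable C by testing for overflow before the addition is performed. `dkmabb32_model` and its example function are hypothetical names, and the `ov` pointer is only a stand-in for the OV bit, which the real instruction keeps in a status register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of DKMABB32 (not the NMSIS intrinsic).
   *ov is set to 1 on saturation and left unchanged otherwise. */
int64_t dkmabb32_model(int64_t t, uint64_t a, uint64_t b, int *ov) {
    /* Signed product of the bottom 32bit words; it always fits in 64 bits. */
    int64_t p = (int64_t)(int32_t)a * (int32_t)b;
    if (p > 0 && t > INT64_MAX - p) { *ov = 1; return INT64_MAX; }
    if (p < 0 && t < INT64_MIN - p) { *ov = 1; return INT64_MIN; }
    return t + p;
}

void dkmabb32_model_example(void) {
    int ov = 0;
    assert(dkmabb32_model(10, 3, 4, &ov) == 22 && ov == 0);
    /* Adding 1*1 to the largest Q63 value saturates and raises the flag. */
    assert(dkmabb32_model(INT64_MAX, 1, 1, &ov) == INT64_MAX && ov == 1);
}
```

The same check applies to DKMABT32 and DKMATT32 if the bottom words are replaced by the corresponding top words.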
 __STATIC_FORCEINLINE long long __RV_DKMABT32 (long long t, unsigned long long a, unsigned long long b)
DKMABT32 (Saturating Signed Multiply Bottom & Top Words & Add)
Type: SIMD
Syntax:
DKMABT32 Rd, Rs1, Rs2
Purpose
:
Multiply the signed 32bit element in a register with the 32bit element in another register and add the result to the content of 64bit data in the third register. The addition result may be saturated and is written to the third register.
DKMABT32: rd + bottom*top
Description
:
For the DKMABT32 instruction, it multiplies the bottom 32bit element in Rs1 with the top 32bit element in Rs2. The multiplication result is added to the content of 64bit data in Rd. If the addition result is beyond the Q63 number range (-2^63 <= Q63 <= 2^63-1), it is saturated to the range and the OV bit is set to 1. The result after saturation is written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = Rd + (Rs1.W[0] * Rs2.W[1]); if (res > (2^63)-1) { res = (2^63)-1; OV = 1; } else if (res < -2^63) { res = -2^63; OV = 1; } Rd = res;
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type
 __STATIC_FORCEINLINE long long __RV_DKMATT32 (long long t, unsigned long long a, unsigned long long b)
DKMATT32 (Saturating Signed Multiply Top Words & Add)
Type: SIMD
Syntax:
DKMATT32 Rd, Rs1, Rs2
Purpose
:
Multiply the signed 32bit element in a register with the 32bit element in another register and add the result to the content of 64bit data in the third register. The addition result may be saturated and is written to the third register.
DKMATT32: rd + top*top
Description
:
For the DKMATT32 instruction, it multiplies the top 32bit element in Rs1 with the top 32bit element in Rs2. The multiplication result is added to the content of 64bit data in Rd. If the addition result is beyond the Q63 number range (-2^63 <= Q63 <= 2^63-1), it is saturated to the range and the OV bit is set to 1. The result after saturation is written to Rd. The 32bit contents of Rs1 and Rs2 are treated as signed integers.
Operations:
res = Rd + (Rs1.W[1] * Rs2.W[1]); if (res > (2^63)-1) { res = (2^63)-1; OV = 1; } else if (res < -2^63) { res = -2^63; OV = 1; } Rd = res;
 Parameters
t – [in] long long type of value stored in t
a – [in] unsigned long long type of value stored in a
b – [in] unsigned long long type of value stored in b
 Returns
value stored in long long type