Nuclei Customized Default DSP Instructions

__STATIC_FORCEINLINE unsigned long __RV_EXPD80 (unsigned long a)
__STATIC_FORCEINLINE unsigned long __RV_EXPD81 (unsigned long a)
__STATIC_FORCEINLINE unsigned long __RV_EXPD82 (unsigned long a)
__STATIC_FORCEINLINE unsigned long __RV_EXPD83 (unsigned long a)
group NMSIS_Core_DSP_Intrinsic_NUCLEI_Default

(RV32 & RV64)Nuclei Customized DSP Instructions

These are Nuclei customized DSP instructions for both RV32 and RV64.

Functions

__STATIC_FORCEINLINE unsigned long __RV_EXPD80 (unsigned long a)

EXPD80 (Expand and Copy Byte 0 to 32bit(when rv32) or 64bit(when rv64))

Type: DSP

Syntax:

EXPD80 Rd, Rs1

Purpose:

On RV32, copy 8-bit data from 32-bit chunks into 4 bytes in a register. On RV64, copy 8-bit data from 64-bit chunks into 8 bytes in a register.

Description:

Moves Rs1.B[0][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.B[0][7:0], Rs1.B[0][7:0], Rs1.B[0][7:0], Rs1.B[0][7:0]);
for RV32: x=0

Parameters

a[in] unsigned long type of value stored in a

Returns

value stored in unsigned long type
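
The byte replication described above can be sketched in portable C. This is a minimal reference model for the RV32 case only, not the NMSIS implementation; the name expd8_model and the byte-index parameter n are assumptions made for illustration (n = 0 corresponds to EXPD80).

```c
#include <stdint.h>

/* Hypothetical reference model, not part of NMSIS: replicate byte n (0..3)
 * of a 32-bit source into all four bytes of the result (the RV32 case). */
static uint32_t expd8_model(uint32_t a, unsigned n)
{
    uint32_t byte = (a >> (8u * (n & 3u))) & 0xFFu; /* Rs1.B[n][7:0] */
    return byte * 0x01010101u;                      /* four copies of the byte */
}
```

With a = 0x12345678 and n = 0, the model returns 0x78787878.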

__STATIC_FORCEINLINE unsigned long __RV_EXPD81 (unsigned long a)

EXPD81 (Expand and Copy Byte 1 to 32bit(rv32) or 64bit(when rv64))

Type: DSP

Syntax:

EXPD81 Rd, Rs1

Purpose:

Copy 8-bit data from 32-bit chunks into 4 bytes in a register.

Description:

Moves Rs1.B[1][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.B[1][7:0], Rs1.B[1][7:0], Rs1.B[1][7:0], Rs1.B[1][7:0]);
for RV32: x=0

Parameters

a[in] unsigned long type of value stored in a

Returns

value stored in unsigned long type

__STATIC_FORCEINLINE unsigned long __RV_EXPD82 (unsigned long a)

EXPD82 (Expand and Copy Byte 2 to 32bit(rv32) or 64bit(when rv64))

Type: DSP

Syntax:

EXPD82 Rd, Rs1

Purpose:

Copy 8-bit data from 32-bit chunks into 4 bytes in a register.

Description:

Moves Rs1.B[2][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.B[2][7:0], Rs1.B[2][7:0], Rs1.B[2][7:0], Rs1.B[2][7:0]);
for RV32: x=0

Parameters

a[in] unsigned long type of value stored in a

Returns

value stored in unsigned long type

__STATIC_FORCEINLINE unsigned long __RV_EXPD83 (unsigned long a)

EXPD83 (Expand and Copy Byte 3 to 32bit(rv32) or 64bit(when rv64))

Type: DSP

Syntax:

EXPD83 Rd, Rs1

Purpose:

Copy 8-bit data from 32-bit chunks into 4 bytes in a register.

Description:

Moves Rs1.B[3][7:0] to Rd.B[0][7:0], Rd.B[1][7:0], Rd.B[2][7:0], Rd.B[3][7:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.B[3][7:0], Rs1.B[3][7:0], Rs1.B[3][7:0], Rs1.B[3][7:0]);
for RV32: x=0

Parameters

a[in] unsigned long type of value stored in a

Returns

value stored in unsigned long type

Nuclei Customized N1/N2/N3 DSP Instructions

__STATIC_FORCEINLINE unsigned long long __RV_DKHM8 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKHM16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKABS8 (unsigned long long a)
__STATIC_FORCEINLINE unsigned long long __RV_DKABS16 (unsigned long long a)
__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA8 (unsigned long long a, int b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA16 (unsigned long long a, int b)
__STATIC_FORCEINLINE unsigned long long __RV_DKADD8 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKADD16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSUB8 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSUB16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKHMX8 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKHMX16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMMUL (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMMUL_U (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL_U (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKABS32 (unsigned long long a)
__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA32 (unsigned long long a, int b)
__STATIC_FORCEINLINE unsigned long long __RV_DKADD32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSUB32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRADD16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSUB16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRADD32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSUB32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DMSR16 (unsigned long a, unsigned long b)
__STATIC_FORCEINLINE unsigned long long __RV_DMSR17 (unsigned long a, unsigned long b)
__STATIC_FORCEINLINE unsigned long long __RV_DMSR33 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DMXSR33 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long __RV_DREDAS16 (unsigned long long a)
__STATIC_FORCEINLINE unsigned long __RV_DREDSA16 (unsigned long long a)
__STATIC_FORCEINLINE int16_t __RV_DKCLIP64 (unsigned long long a)
__STATIC_FORCEINLINE unsigned long long __RV_DKMDA (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMXDA (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMDRS (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMXDS (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMBB32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMBB32_SRA14 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMBB32_SRA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMBT32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMBT32_SRA14 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMBT32_SRA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMTT32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMTT32_SRA14 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMTT32_SRA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKBB32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKBT32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKTT32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKTB32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKTB16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKBB16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKBT16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DPKTT16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSRA16 (unsigned long long a, unsigned long b)
__STATIC_FORCEINLINE unsigned long long __RV_DADD16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DADD32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMBB16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMBT16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMTT16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRCRSA16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRCRSA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRCRAS16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRCRAS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKCRAS16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKCRSA16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRSUB16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSTSA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSTAS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKCRSA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKCRAS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DCRSA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DCRAS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSTSA16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSTAS16 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DRSUB32 (unsigned long long a, unsigned long long b)
__RV_DSCLIP8(a, b)
__RV_DSCLIP16(a, b)
__RV_DSCLIP32(a, b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMMAC (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMMAC_U (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMMSB (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMMSB_U (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMADA (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMAXDA (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMADS (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMADRS (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMAXDS (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMSDA (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKMSXDA (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMAQA (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DSMAQA_SU (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DUMAQA (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMDA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMXDA32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMADA32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMAXDA32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMADS32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMADRS32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMAXDS32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMSDA32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMSXDA32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMDS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMDRS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMXDS32 (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALDA (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALXDA (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALDS (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALDRS (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALXDS (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMSLDA (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMSLXDA (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DDSMAQA (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DDSMAQASU (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DDUMAQA (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DSMA32_U (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DSMXS32_U (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DSMXA32_U (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DSMS32_U (unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DSMADA16 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DSMAXDA16 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE unsigned long long __RV_DKSMS32_U (unsigned long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long __RV_DMADA32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALBB (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALBT (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DSMALTT (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMABB32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMABT32 (long long t, unsigned long long a, unsigned long long b)
__STATIC_FORCEINLINE long long __RV_DKMATT32 (long long t, unsigned long long a, unsigned long long b)
group NMSIS_Core_DSP_Intrinsic_NUCLEI_N1

(RV32 only)Nuclei Customized N1 DSP Instructions

These are Nuclei customized DSP N1 instructions, available only on RV32.

Functions

__STATIC_FORCEINLINE unsigned long long __RV_DKHM8 (unsigned long long a, unsigned long long b)

DKHM8 (64-bit SIMD Signed Saturating Q7 Multiply)

Type: SIMD

Syntax:

DKHM8 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do Q7xQ7 element multiplications simultaneously. The Q14 results are then reduced to Q7 numbers again.

Description:

For the DKHM8 instruction, multiply the top 8-bit Q7 content of 16-bit chunks in Rs1 with the top 8-bit Q7 content of 16-bit chunks in Rs2. At the same time, multiply the bottom 8-bit Q7 content of 16-bit chunks in Rs1 with the bottom 8-bit Q7 content of 16-bit chunks in Rs2.

The Q14 results are then right-shifted by 7 bits and saturated into Q7 values. The Q7 results are then written into Rd. When both Q7 inputs of a multiplication are 0x80, saturation happens: the result is saturated to 0x7F and the overflow flag OV is set.

Operations:

op1t = Rs1.B[x+1]; op2t = Rs2.B[x+1]; // top
op1b = Rs1.B[x]; op2b = Rs2.B[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x80 != aop | 0x80 != bop) {
    res = (aop s* bop) >> 7;
  } else {
    res= 0x7F;
    OV = 1;
  }
}
Rd.H[x/2] = concat(rest, resb);
for RV32, x=0,2,4,6

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
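
The DKHM8 operation above can be modelled in portable C as follows. This is a sketch for checking expected values on a host machine, not the NMSIS implementation; the names q7_at and dkhm8_model and the ov out-parameter (standing in for the OV flag) are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: read byte x of r as a signed 8-bit value. */
static int q7_at(uint64_t r, int x)
{
    int v = (int)((r >> (8 * x)) & 0xFFu);
    return v >= 128 ? v - 256 : v;
}

/* Hypothetical reference model of DKHM8; *ov is set to 1 on saturation
 * and left unchanged otherwise. */
static uint64_t dkhm8_model(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int ea = q7_at(a, x), eb = q7_at(b, x), res;
        if (ea == -128 && eb == -128) {
            res = 127;       /* 0x80 * 0x80 does not fit in Q7 */
            *ov = 1;
        } else {
            int p = ea * eb; /* Q14 product */
            /* Arithmetic right shift by 7 (floor), written portably. */
            res = p >= 0 ? p >> 7 : -((-p + 127) >> 7);
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```

For example, 0x40 times 0x40 (0.5 times 0.5 in Q7) yields 0x20 (0.25), while 0x80 times 0x80 saturates to 0x7F and sets ov.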

__STATIC_FORCEINLINE unsigned long long __RV_DKHM16 (unsigned long long a, unsigned long long b)

DKHM16 (64-bit SIMD Signed Saturating Q15 Multiply)

Type: SIMD

Syntax:

DKHM16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do Q15xQ15 element multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.

Description:

For the DKHM16 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2.

The Q30 results are then right-shifted by 15 bits and saturated into Q15 values. The Q15 results are then written into Rd. When both Q15 inputs of a multiplication are 0x8000, saturation happens: the result is saturated to 0x7FFF and the overflow flag OV is set.

Operations:

op1t = Rs1.H[x+1]; op2t = Rs2.H[x+1]; // top
op1b = Rs1.H[x]; op2b = Rs2.H[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x8000 != aop | 0x8000 != bop) {
    res = (aop s* bop) >> 15;
  } else {
    res= 0x7FFF;
    OV = 1;
  }
}
Rd.W[x/2] = concat(rest, resb);
for RV32: x=0, 2

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
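
A host-side sketch of the DKHM16 operation, under the same caveats: the names q15_at and dkhm16_model are hypothetical, and the ov out-parameter is an assumed stand-in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical helper: read halfword x of r as a signed 16-bit value. */
static int q15_at(uint64_t r, int x)
{
    int v = (int)((r >> (16 * x)) & 0xFFFFu);
    return v >= 32768 ? v - 65536 : v;
}

/* Hypothetical reference model of DKHM16; *ov is set to 1 on saturation
 * and left unchanged otherwise. */
static uint64_t dkhm16_model(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int32_t ea = q15_at(a, x), eb = q15_at(b, x), res;
        if (ea == -32768 && eb == -32768) {
            res = 32767;         /* 0x8000 * 0x8000 does not fit in Q15 */
            *ov = 1;
        } else {
            int32_t p = ea * eb; /* Q30 product, fits in 32 bits here */
            /* Arithmetic right shift by 15 (floor), written portably. */
            res = p >= 0 ? p >> 15 : -((-p + 32767) >> 15);
        }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```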

__STATIC_FORCEINLINE unsigned long long __RV_DKABS8 (unsigned long long a)

DKABS8 (64-bit SIMD 8-bit Saturating Absolute)

Type: SIMD

Syntax:

DKABS8 Rd, Rs1
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Get the absolute value of 8-bit signed integer elements simultaneously.

Description:

This instruction calculates the absolute value of 8-bit signed integer elements stored in Rs1 and writes the element results to Rd. If the input number is 0x80, this instruction generates 0x7f as the output and sets the OV bit to 1.

Operations:

src = Rs1.B[x];
if (src == 0x80) {
  src = 0x7f;
  OV = 1;
} else if (src[7] == 1) {
  src = -src;
}
Rd.B[x] = src;
for RV32: x=7...0

Parameters

a[in] unsigned long long type of value stored in a

Returns

value stored in unsigned long long type
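
The saturating absolute value can be modelled byte by byte in plain C. The sketch below is an illustration only; dkabs8_model and its ov out-parameter are assumed names, not NMSIS APIs.

```c
#include <stdint.h>

/* Hypothetical reference model of DKABS8; *ov is set to 1 when an element
 * equals 0x80 and is left unchanged otherwise. */
static uint64_t dkabs8_model(uint64_t a, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        uint8_t src = (uint8_t)(a >> (8 * x));
        if (src == 0x80u) {
            src = 0x7Fu;                 /* |-128| saturates to 127 */
            *ov = 1;
        } else if (src & 0x80u) {
            src = (uint8_t)(0u - src);   /* two's complement negate */
        }
        rd |= (uint64_t)src << (8 * x);
    }
    return rd;
}
```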

__STATIC_FORCEINLINE unsigned long long __RV_DKABS16 (unsigned long long a)

DKABS16 (64-bit SIMD 16-bit Saturating Absolute)

Type: SIMD

Syntax:

DKABS16 Rd, Rs1
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Get the absolute value of 16-bit signed integer elements simultaneously.

Description:

This instruction calculates the absolute value of 16-bit signed integer elements stored in Rs1 and writes the element results to Rd. If the input number is 0x8000, this instruction generates 0x7fff as the output and sets the OV bit to 1.

Operations:

src = Rs1.H[x];
if (src == 0x8000) {
  src = 0x7fff;
  OV = 1;
} else if (src[15] == 1) {
  src = -src;
}
Rd.H[x] = src;
for RV32: x=3...0

Parameters

a[in] unsigned long long type of value stored in a

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA8 (unsigned long long a, int b)

DKSLRA8 (64-bit SIMD 8-bit Shift Left Logical with Saturation or Shift Right Arithmetic)

Type: SIMD

Syntax:

DKSLRA8 Rd, Rs1, Rs2
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Do 8-bit elements logical left (positive) or arithmetic right (negative) shift operation with Q7 saturation for the left shift.

Description:

The 8-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[3:0]. Rs2[3:0] is in the signed range of [-2^3, 2^3-1]. A positive Rs2[3:0] means logical left shift and a negative Rs2[3:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[3:0]. However, the behavior of Rs2[3:0]==-2^3 (0x8) is defined to be equivalent to the behavior of Rs2[3:0]==-(2^3-1) (0x9). The left-shifted results are saturated to the 8-bit signed integer range of [-2^7, 2^7-1]. If any saturation happens, this instruction sets the OV flag. The value of Rs2[31:4] will not affect this instruction.

Operations:

if (Rs2[3:0] < 0) {
  sa = -Rs2[3:0];
  sa = (sa == 8)? 7 : sa;
  Rd.B[x] = SE8(Rs1.B[x][7:sa]);
} else {
  sa = Rs2[2:0];
  res[(7+sa):0] = Rs1.B[x] <<(logic) sa;
  if (res > (2^7)-1) {
    res[7:0] = 0x7f; OV = 1;
  } else if (res < -2^7) {
    res[7:0] = 0x80; OV = 1;
  }
  Rd.B[x] = res[7:0];
}
for RV32: x=7...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] int type of value stored in b

Returns

value stored in unsigned long long type
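
The signed shift amount is the subtle part of DKSLRA8, so a host-side model may help. The following is a sketch under stated assumptions: dkslra8_model is a hypothetical name, ov stands in for the OV flag, and only the low four bits of b are used, as in the description above.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSLRA8; *ov is set to 1 on saturation
 * and left unchanged otherwise. */
static uint64_t dkslra8_model(uint64_t a, int b, int *ov)
{
    int sh = b & 0xF;            /* Rs2[3:0] */
    if (sh >= 8) sh -= 16;       /* sign-extend into [-8, 7] */
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int v = (int)((a >> (8 * x)) & 0xFFu);
        if (v >= 128) v -= 256;  /* element as signed 8-bit */
        int res;
        if (sh < 0) {
            int sa = -sh;
            if (sa == 8) sa = 7; /* -8 behaves like -7 */
            /* Arithmetic right shift (floor), written portably. */
            res = v >= 0 ? v >> sa : -((-v + (1 << sa) - 1) >> sa);
        } else {
            res = v * (1 << sh); /* left shift, computed without overflow */
            if (res > 127) { res = 127; *ov = 1; }
            else if (res < -128) { res = -128; *ov = 1; }
        }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```

With b = 1, the element 0x40 (64) doubles to 128, which saturates to 0x7F and sets ov, while 0xC0 (-64) becomes 0x80 (-128) without saturation.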

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA16 (unsigned long long a, int b)

DKSLRA16 (64-bit SIMD 16-bit Shift Left Logical with Saturation or Shift Right Arithmetic)

Type: SIMD

Syntax:

DKSLRA16 Rd, Rs1, Rs2
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Do 16-bit elements logical left (positive) or arithmetic right (negative) shift operation with Q15 saturation for the left shift.

Description:

The 16-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[4:0]. Rs2[4:0] is in the signed range of [-2^4, 2^4-1]. A positive Rs2[4:0] means logical left shift and a negative Rs2[4:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[4:0]. However, the behavior of Rs2[4:0]==-2^4 (0x10) is defined to be equivalent to the behavior of Rs2[4:0]==-(2^4-1) (0x11). The left-shifted results are saturated to the 16-bit signed integer range of [-2^15, 2^15-1]. After the shift, saturation, or rounding, the final results are written to Rd. If any saturation happens, this instruction sets the OV flag. The value of Rs2[31:5] will not affect this instruction.

Operations:

if (Rs2[4:0] < 0) {
  sa = -Rs2[4:0];
  sa = (sa == 16)? 15 : sa;
  Rd.H[x] = SE16(Rs1.H[x][15:sa]);
} else {
  sa = Rs2[3:0];
  res[(15+sa):0] = Rs1.H[x] <<(logic) sa;
  if (res > (2^15)-1) {
    res[15:0] = 0x7fff; OV = 1;
  } else if (res < -2^15) {
    res[15:0] = 0x8000; OV = 1;
  }
  Rd.H[x] = res[15:0];
}
for RV32: x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] int type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKADD8 (unsigned long long a, unsigned long long b)

DKADD8 (64-bit SIMD 8-bit Signed Saturating Addition)

Type: SIMD

Syntax:

DKADD8 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do 8-bit signed integer element saturating additions simultaneously.

Description:

This instruction adds the 8-bit signed integer elements in Rs1 with the 8-bit signed integer elements in Rs2. If any of the results are beyond the Q7 number range (-2^7 <= Q7 <= 2^7-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.

Operations:

res[x] = Rs1.B[x] + Rs2.B[x];
if (res[x] > 127) {
  res[x] = 127;
  OV = 1;
} else if (res[x] < -128) {
  res[x] = -128;
  OV = 1;
}
Rd.B[x] = res[x];
for RV32: x=7...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
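
The per-element saturating addition can be sketched on a host as below. The function name dkadd8_model and the ov out-parameter are assumptions for illustration; this is not the NMSIS implementation.

```c
#include <stdint.h>

/* Hypothetical reference model of DKADD8; *ov is set to 1 on saturation
 * and left unchanged otherwise. */
static uint64_t dkadd8_model(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int va = (int)((a >> (8 * x)) & 0xFFu);
        int vb = (int)((b >> (8 * x)) & 0xFFu);
        if (va >= 128) va -= 256;   /* interpret as signed 8-bit */
        if (vb >= 128) vb -= 256;
        int res = va + vb;
        if (res > 127) { res = 127; *ov = 1; }
        else if (res < -128) { res = -128; *ov = 1; }
        rd |= (uint64_t)(uint8_t)res << (8 * x);
    }
    return rd;
}
```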

__STATIC_FORCEINLINE unsigned long long __RV_DKADD16 (unsigned long long a, unsigned long long b)

DKADD16 (64-bit SIMD 16-bit Signed Saturating Addition)

Type: SIMD

Syntax:

DKADD16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do 16-bit signed integer element saturating additions simultaneously.

Description:

This instruction adds the 16-bit signed integer elements in Rs1 with the 16-bit signed integer elements in Rs2. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.

Operations:

res[x] = Rs1.H[x] + Rs2.H[x];
if (res[x] > 32767) {
  res[x] = 32767;
  OV = 1;
} else if (res[x] < -32768) {
  res[x] = -32768;
  OV = 1;
}
Rd.H[x] = res[x];
for RV32: x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB8 (unsigned long long a, unsigned long long b)

DKSUB8 (64-bit SIMD 8-bit Signed Saturating Subtraction)

Type: SIMD

Syntax:

DKSUB8 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do 8-bit signed integer element saturating subtractions simultaneously.

Description:

This instruction subtracts the 8-bit signed integer elements in Rs2 from the 8-bit signed integer elements in Rs1. If any of the results are beyond the Q7 number range (-2^7 <= Q7 <= 2^7-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.

Operations:

res[x] = Rs1.B[x] - Rs2.B[x];
if (res[x] > (2^7)-1) {
  res[x] = (2^7)-1;
  OV = 1;
} else if (res[x] < -2^7) {
  res[x] = -2^7;
  OV = 1;
}
Rd.B[x] = res[x];
for RV32: x=7...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB16 (unsigned long long a, unsigned long long b)

DKSUB16 (64-bit SIMD 16-bit Signed Saturating Subtraction)

Type: SIMD

Syntax:

DKSUB16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do 16-bit signed integer element saturating subtractions simultaneously.

Description:

This instruction subtracts the 16-bit signed integer elements in Rs2 from the 16-bit signed integer elements in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.

Operations:

res[x] = Rs1.H[x] - Rs2.H[x];
if (res[x] > (2^15)-1) {
  res[x] = (2^15)-1;
  OV = 1;
} else if (res[x] < -2^15) {
  res[x] = -2^15;
  OV = 1;
}
Rd.H[x] = res[x];
for RV32: x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
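
A host-side sketch of the 16-bit saturating subtraction follows. The name dksub16_model and the ov out-parameter are hypothetical and serve only to illustrate the Operations pseudo-code above.

```c
#include <stdint.h>

/* Hypothetical reference model of DKSUB16; *ov is set to 1 on saturation
 * and left unchanged otherwise. */
static uint64_t dksub16_model(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int32_t va = (int32_t)((a >> (16 * x)) & 0xFFFFu);
        int32_t vb = (int32_t)((b >> (16 * x)) & 0xFFFFu);
        if (va >= 32768) va -= 65536;   /* interpret as signed 16-bit */
        if (vb >= 32768) vb -= 65536;
        int32_t res = va - vb;          /* Rs1 element minus Rs2 element */
        if (res > 32767) { res = 32767; *ov = 1; }
        else if (res < -32768) { res = -32768; *ov = 1; }
        rd |= (uint64_t)(uint16_t)res << (16 * x);
    }
    return rd;
}
```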

group NMSIS_Core_DSP_Intrinsic_NUCLEI_N2

(RV32 only)Nuclei Customized N2 DSP Instructions

These are Nuclei customized DSP N2 instructions, available only on RV32.

Defines

__RV_DSCLIP8(a, b)

DSCLIP8 (8-bit Signed Saturation and Clip)

Type: SIMD

Syntax:

DSCLIP8 Rd, Rs1, imm3u[2:0]
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Limit the 8-bit signed integer elements of a register into a signed range simultaneously.

Description:

This instruction limits the 8-bit signed integer elements stored in Rs1 into a signed integer range between -2^imm3u and 2^imm3u-1, and writes the limited results to Rd. For example, if imm3u is 3, the 8-bit input values are saturated between -8 and 7. If saturation is performed, the OV bit is set to 1.

Operations:

src = Rs1.B[x];
if (src > (2^imm3u)-1) {
  src = (2^imm3u)-1;
  OV = 1;
} else if (src < -2^imm3u) {
  src = -2^imm3u;
  OV = 1;
}
Rd.B[x] = src
x=7...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
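
The clipping rule can be checked on a host with a small model. This is only a sketch: dsclip8_model is a hypothetical name, the immediate imm3u is modelled as an ordinary argument masked to three bits, and ov stands in for the OV flag.

```c
#include <stdint.h>

/* Hypothetical reference model of DSCLIP8; *ov is set to 1 when any
 * element is clipped and left unchanged otherwise. */
static uint64_t dsclip8_model(uint64_t a, unsigned imm3u, int *ov)
{
    int hi = (1 << (imm3u & 7u)) - 1;   /*  2^imm3u - 1 */
    int lo = -(1 << (imm3u & 7u));      /* -2^imm3u     */
    uint64_t rd = 0;
    for (int x = 0; x < 8; x++) {
        int src = (int)((a >> (8 * x)) & 0xFFu);
        if (src >= 128) src -= 256;     /* interpret as signed 8-bit */
        if (src > hi) { src = hi; *ov = 1; }
        else if (src < lo) { src = lo; *ov = 1; }
        rd |= (uint64_t)(uint8_t)src << (8 * x);
    }
    return rd;
}
```

With imm3u = 3, the element 0x7F (127) is clipped to 0x07 and 0x80 (-128) to 0xF8 (-8); with imm3u = 7 the range covers every 8-bit value, so nothing is clipped.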

__RV_DSCLIP16(a, b)

DSCLIP16 (16-bit Signed Saturation and Clip)

Type: SIMD

Syntax:

DSCLIP16 Rd, Rs1, imm4u[3:0]
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Limit the 16-bit signed integer elements of a register into a signed range simultaneously.

Description:

This instruction limits the 16-bit signed integer elements stored in Rs1 into a signed integer range between -2^imm4u and 2^imm4u-1, and writes the limited results to Rd. For example, if imm4u is 3, the 16-bit input values are saturated between -8 and 7. If saturation is performed, the OV bit is set to 1.

Operations:

src = Rs1.H[x];
if (src > (2^imm4u)-1) {
  src = (2^imm4u)-1;
  OV = 1;
} else if (src < -2^imm4u) {
  src = -2^imm4u;
  OV = 1;
}
Rd.H[x] = src
x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__RV_DSCLIP32(a, b)

DSCLIP32 (32-bit Signed Saturation and Clip)

Type: SIMD

Syntax:

DSCLIP32 Rd, Rs1, imm5u[4:0]
# Rd, Rs1 are all even/odd pairs of registers

Purpose:

Limit the 32-bit signed integer elements of a register into a signed range simultaneously.

Description:

This instruction limits the 32-bit signed integer elements stored in Rs1 into a signed integer range between -2^imm5u and 2^imm5u-1, and writes the limited results to Rd. For example, if imm5u is 3, the 32-bit input values are saturated between -8 and 7. If saturation is performed, the OV bit is set to 1.

Operations:

src = Rs1.W[x];
if (src > (2^imm5u)-1) {
  src = (2^imm5u)-1;
  OV = 1;
} else if (src < -2^imm5u) {
  src = -2^imm5u;
  OV = 1;
}
Rd.W[x] = src
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

Functions

__STATIC_FORCEINLINE unsigned long long __RV_DKHMX8 (unsigned long long a, unsigned long long b)

DKHMX8 (64-bit SIMD Signed Crossed Saturating Q7 Multiply)

Type: SIMD

Syntax:

DKHMX8 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do Q7xQ7 element crossed multiplications simultaneously. The Q14 results are then reduced to Q7 numbers again.

Description:

For the DKHMX8 instruction, multiply the top 8-bit Q7 content of 16-bit chunks in Rs1 with the bottom 8-bit Q7 content of 16-bit chunks in Rs2. At the same time, multiply the bottom 8-bit Q7 content of 16-bit chunks in Rs1 with the top 8-bit Q7 content of 16-bit chunks in Rs2.

The Q14 results are then right-shifted by 7 bits and saturated into Q7 values. The Q7 results are then written into Rd. When both Q7 inputs of a multiplication are 0x80, saturation happens: the result is saturated to 0x7F and the overflow flag OV is set.

Operations:

op1t = Rs1.B[x+1]; op2t = Rs2.B[x]; // top
op1b = Rs1.B[x]; op2b = Rs2.B[x+1]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x80 != aop | 0x80 != bop) {
    res = (aop s* bop) >> 7;
  } else {
    res= 0x7F;
    OV = 1;
  }
}
Rd.H[x/2] = concat(rest, resb);
for RV32, x=0,2,4,6

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKHMX16 (unsigned long long a, unsigned long long b)

DKHMX16 (64-bit SIMD Signed Crossed Saturating Q15 Multiply)

Type: SIMD

Syntax:

DKHMX16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pairs of registers

Purpose:

Do Q15xQ15 element crossed multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.

Description:

For the DKHMX16 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2.

The Q30 results are then right-shifted by 15 bits and saturated into Q15 values. The Q15 results are then written into Rd. When both Q15 inputs of a multiplication are 0x8000, saturation happens: the result is saturated to 0x7FFF and the overflow flag OV is set.

Operations:

op1t = Rs1.H[x+1]; op2t = Rs2.H[x]; // top
op1b = Rs1.H[x]; op2b = Rs2.H[x+1]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  if (0x8000 != aop | 0x8000 != bop) {
    res = (aop s* bop) >> 15;
  } else {
    res= 0x7FFF;
    OV = 1;
  }
}
Rd.W[x/2] = concat(rest, resb);
for RV32, x=0,2

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSMMUL (unsigned long long a, unsigned long long b)

DSMMUL (64-bit MSW 32x32 Signed Multiply)

Type: SIMD

Syntax:

DSMMUL Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element signed multiplications simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the 32-bit elements of Rs1 with the 32-bit elements of Rs2 and writes the most significant 32-bit multiplication results to the corresponding 32-bit elements of Rd. The 32-bit elements of Rs1 and Rs2 are treated as signed integers. The .u form of the instruction rounds up the most significant 32-bit of the 64-bit multiplication results by adding a 1 to bit 31 of the results.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = (aop s* bop)[63:32];
}
Rd = concat(rest, resb);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
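
The "most significant word" selection described above can be sketched in portable C as follows. This is only a software model under the assumption that the Operations pseudo-code is complete; `dsmmul_model` is a hypothetical name, not an NMSIS function.

```c
#include <stdint.h>

/* Hypothetical model of DSMMUL: keep bits [63:32] of each signed 32x32 product. */
static uint64_t dsmmul_model(uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int32_t aop = (int32_t)(uint32_t)(a >> (32 * x));
        int32_t bop = (int32_t)(uint32_t)(b >> (32 * x));
        int64_t prod = (int64_t)aop * (int64_t)bop;
        /* Reinterpret as unsigned so the shift is well defined for negative products. */
        uint32_t res = (uint32_t)((uint64_t)prod >> 32);
        rd |= (uint64_t)res << (32 * x);
    }
    return rd;
}
```

For instance, a top pair of 0x40000000 and 8 gives a product of 2^33, whose upper word is 2, while a bottom pair of -1 and 1 gives an upper word of 0xFFFFFFFF.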

__STATIC_FORCEINLINE unsigned long long __RV_DSMMUL_U (unsigned long long a, unsigned long long b)

DSMMULU (64-bit MSW 32x32 Unsigned Multiply)

Type: SIMD

Syntax:

DSMMUL.U Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element unsigned multiplications simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the 32-bit elements of Rs1 with the 32-bit elements of Rs2 and writes the most significant 32-bit multiplication results to the corresponding 32-bit elements of Rd. The 32-bit elements of Rs1 and Rs2 are treated as unsigned integers. The .u form of the instruction rounds up the most significant 32-bit of the 64-bit multiplication results by adding a 1 to bit 31 of the results.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = RUND(aop u* bop)[63:32];
}
Rd = concat(rest, resb);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL (unsigned long long a, unsigned long long b)

DKWMMUL (64-bit MSW 32x32 Signed Multiply & Double)

Type: SIMD

Syntax:

DKWMMUL Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element signed multiplications simultaneously and double. The results are written into Rd.

Description

:

This instruction multiplies the 32-bit elements of Rs1 with the 32-bit elements of Rs2. It then shifts the multiplication results one bit to the left and takes the most significant 32-bit results. If the shifted result is greater than 2^31-1, it is saturated to 2^31-1 and the OV flag is set to 1. The final element result is written to Rd. The 32-bit elements of Rs1 and Rs2 are treated as signed integers. The .u form of the instruction additionally rounds up the 64-bit multiplication results by adding a 1 to bit 30 before the shift and saturation operations.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
    res = sat.q31((aop s* bop) << 1)[63:32];
}
Rd = concat(rest, resb);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
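
A minimal software sketch of the multiply, double and saturate sequence is shown below. It relies on the observation that doubling a signed 32x32 product can only overflow when both inputs are 0x80000000. The function names are hypothetical, and OV is modeled as a caller-supplied `int`.

```c
#include <stdint.h>

/* Hypothetical helper: one 32x32 multiply-and-double lane; *ov stands in for OV. */
static uint32_t kwmmul_word(int32_t a, int32_t b, int *ov)
{
    if (a == INT32_MIN && b == INT32_MIN) {  /* the only case where doubling overflows */
        *ov = 1;
        return 0x7FFFFFFFu;
    }
    int64_t prod = (int64_t)a * (int64_t)b;
    /* Bits [63:32] of (prod << 1) are bits [62:31] of prod. */
    return (uint32_t)((uint64_t)prod >> 31);
}

/* Hypothetical model of DKWMMUL over both 32-bit elements. */
static uint64_t dkwmmul_model(uint64_t a, uint64_t b, int *ov)
{
    uint32_t lo = kwmmul_word((int32_t)(uint32_t)a, (int32_t)(uint32_t)b, ov);
    uint32_t hi = kwmmul_word((int32_t)(uint32_t)(a >> 32),
                              (int32_t)(uint32_t)(b >> 32), ov);
    return ((uint64_t)hi << 32) | lo;
}
```

Read as Q31 values, 0x40000000 is 0.5, so the lane 0.5 * 0.5 yields 0x20000000 (0.25).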

__STATIC_FORCEINLINE unsigned long long __RV_DKWMMUL_U (unsigned long long a, unsigned long long b)

DKWMMULU (64-bit MSW 32x32 Unsigned Multiply & Double)

Type: SIMD

Syntax:

DKWMMUL.U Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element unsigned multiplications simultaneously and double. The results are written into Rd.

Description

:

This instruction multiplies the 32-bit elements of Rs1 with the 32-bit elements of Rs2. It then shifts the multiplication results one bit to the left and takes the most significant 32-bit results. If the shifted result is greater than 2^31-1, it is saturated to 2^31-1 and the OV flag is set to 1. The final element result is written to Rd. The 32-bit elements of Rs1 and Rs2 are treated as signed integers. The .u form of the instruction additionally rounds up the 64-bit multiplication results by adding a 1 to bit 30 before the shift and saturation operations.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
for ((aop,bop,res) in [(op1t,op2t,rest), (op1b,op2b,resb)]) {
  res = sat.q31(RUND(aop u* bop) << 1)[63:32];
}
Rd = concat(rest, resb);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKABS32 (unsigned long long a)

DKABS32 (64-bit SIMD 32-bit Saturating Absolute)

Type: SIMD

Syntax:

DKABS32 Rd, Rs1
# Rd, Rs1 are all even/odd pair of registers

Purpose

:

Get the absolute value of 32-bit signed integer elements simultaneously.

Description

:

This instruction calculates the absolute value of 32-bit signed integer elements stored in Rs1 and writes the element results to Rd. If the input number is 0x8000_0000, this instruction generates 0x7fff_ffff as the output and sets the OV bit to 1.

Operations:

src = Rs1.W[x];
if (src == 0x8000_0000) {
  src = 0x7fff_ffff;
  OV = 1;
} else if (src[31] == 1) {
  src = -src;
}
Rd.W[x] = src;
x=1...0

Parameters

a[in] unsigned long long type of value stored in a

Returns

value stored in unsigned long long type
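
The saturating absolute value can be modeled in plain C as in the sketch below. `dkabs32_model` is a hypothetical name, and the OV bit is represented by an `int` that the caller owns.

```c
#include <stdint.h>

/* Hypothetical model of DKABS32: per-word absolute value with saturation. */
static uint64_t dkabs32_model(uint64_t a, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int32_t src = (int32_t)(uint32_t)(a >> (32 * x));
        if (src == INT32_MIN) {          /* 0x8000_0000 has no positive counterpart */
            src = INT32_MAX;
            *ov = 1;
        } else if (src < 0) {
            src = -src;
        }
        rd |= (uint64_t)(uint32_t)src << (32 * x);
    }
    return rd;
}
```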

__STATIC_FORCEINLINE unsigned long long __RV_DKSLRA32 (unsigned long long a, int b)

DKSLRA32 (64-bit SIMD 32-bit Shift Left Logical with Saturation or Shift Right Arithmetic)

Type: SIMD

Syntax:

DKSLRA32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit elements logical left (positive) or arithmetic right (negative) shift operation with Q31 saturation for the left shift.

Description

:

The 32-bit data elements of Rs1 are left-shifted logically or right-shifted arithmetically based on the value of Rs2[5:0]. Rs2[5:0] is in the signed range of [-2^5, 2^5-1]. A positive Rs2[5:0] means logical left shift and a negative Rs2[5:0] means arithmetic right shift. The shift amount is the absolute value of Rs2[5:0]. However, the behavior of Rs2[5:0]==-2^5 (0x20) is defined to be equivalent to the behavior of Rs2[5:0]==-(2^5-1) (0x21). Left-shifted results are saturated to the range [-2^31, 2^31-1], and the OV flag is set when saturation happens.

Operations:

if (Rs2[5:0] < 0) {
  sa = -Rs2[5:0];
  sa = (sa == 32)? 31 : sa;
  Rd.W[x] = SE32(Rs1.W[x][31:sa]);
} else {
  sa = Rs2[4:0];
  res[(31+sa):0] = Rs1.W[x] <<(logic) sa;
  if (res > (2^31)-1) {
    res[31:0] = 0x7fff_ffff; OV = 1;
  } else if (res < -2^31) {
    res[31:0] = 0x8000_0000; OV = 1;
  }
  Rd.W[x] = res[31:0];
}
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] int type of value stored in b

Returns

value stored in unsigned long long type
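
To make the sign-dependent shift direction concrete, here is a small software sketch of the behavior described above. The name `dkslra32_model` is hypothetical, OV is modeled as a caller-supplied `int`, and the right shift assumes the compiler shifts negative values arithmetically (true for GCC and Clang).

```c
#include <stdint.h>

/* Hypothetical model of DKSLRA32; only the low 6 bits of b are used. */
static uint64_t dkslra32_model(uint64_t a, int b, int *ov)
{
    int sh = b & 0x3F;                 /* Rs2[5:0] */
    if (sh >= 32) {
        sh -= 64;                      /* sign-extend to [-32, 31] */
    }
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int32_t src = (int32_t)(uint32_t)(a >> (32 * x));
        int64_t res;
        if (sh < 0) {
            int sa = (-sh == 32) ? 31 : -sh;
            res = src >> sa;           /* arithmetic right shift */
        } else {
            /* Multiply instead of shifting so negative inputs stay well defined. */
            res = (int64_t)src * ((int64_t)1 << sh);
            if (res > INT32_MAX) {
                res = INT32_MAX;
                *ov = 1;
            } else if (res < INT32_MIN) {
                res = INT32_MIN;
                *ov = 1;
            }
        }
        rd |= (uint64_t)(uint32_t)(int32_t)res << (32 * x);
    }
    return rd;
}
```

Passing b = 2 shifts both words left by two with saturation, b = -2 shifts them right by two, and b = -32 behaves like b = -31.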

__STATIC_FORCEINLINE unsigned long long __RV_DKADD32 (unsigned long long a, unsigned long long b)

DKADD32 (64-bit SIMD 32-bit Signed Saturating Addition)

Type: SIMD

Syntax:

DKADD32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element saturating additions simultaneously.

Description

:

This instruction adds the 32-bit signed integer elements in Rs1 with the 32-bit signed integer elements in Rs2. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.

Operations:

res[x] = Rs1.W[x] + Rs2.W[x];
if (res[x] > 0x7fff_ffff) {
  res[x] = 0x7fff_ffff;
  OV = 1;
} else if (res[x] < 0x8000_0000) {
  res[x] = 0x8000_0000;
  OV = 1;
}
Rd.W[x] = res[x];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
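
The saturating addition can be checked against a simple software model such as the one below. It widens each lane to 64 bits before clamping. `dkadd32_model` is a hypothetical name and OV is modeled as a caller-supplied `int`.

```c
#include <stdint.h>

/* Hypothetical model of DKADD32: per-word signed addition clamped to the Q31 range. */
static uint64_t dkadd32_model(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int64_t res = (int64_t)(int32_t)(uint32_t)(a >> (32 * x))
                    + (int64_t)(int32_t)(uint32_t)(b >> (32 * x));
        if (res > INT32_MAX) {
            res = INT32_MAX;
            *ov = 1;
        } else if (res < INT32_MIN) {
            res = INT32_MIN;
            *ov = 1;
        }
        rd |= (uint64_t)(uint32_t)(int32_t)res << (32 * x);
    }
    return rd;
}
```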

__STATIC_FORCEINLINE unsigned long long __RV_DKSUB32 (unsigned long long a, unsigned long long b)

DKSUB32 (64-bit SIMD 32-bit Signed Saturating Subtraction)

Type: SIMD

Syntax:

DKSUB32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element saturating subtractions simultaneously.

Description

:

This instruction subtracts the 32-bit signed integer elements in Rs2 from the 32-bit signed integer elements in Rs1. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to Rd.

Operations:

res[x] = Rs1.W[x] - Rs2.W[x];
if (res[x] > (2^31)-1) {
  res[x] = (2^31)-1;
  OV = 1;
} else if (res[x] < -2^31) {
  res[x] = -2^31;
  OV = 1;
}
Rd.W[x] = res[x];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DRADD16 (unsigned long long a, unsigned long long b)

DRADD16 (64-bit SIMD 16-bit Halving Signed Addition)

Type: SIMD

Syntax:

DRADD16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element additions simultaneously. The results are halved to avoid overflow or saturation.

Description

:

This instruction adds the 16-bit signed integer elements in Rs1 with the 16-bit signed integer elements in Rs2. The results are first arithmetically right-shifted by 1 bit and then written to Rd.

Operations:

Rd.H[x] = [(Rs1.H[x]) + (Rs2.H[x])] s>> 1;
x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
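
The reason halving avoids overflow is that the 17-bit sum of two 16-bit values always fits in 16 bits after a one-bit right shift. The sketch below models this in C. `dradd16_model` is a hypothetical name, and the code assumes an arithmetic right shift on negative values.

```c
#include <stdint.h>

/* Hypothetical model of DRADD16: per-halfword signed add, then arithmetic shift by 1. */
static uint64_t dradd16_model(uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 4; x++) {
        int32_t sum = (int32_t)(int16_t)(uint16_t)(a >> (16 * x))
                    + (int32_t)(int16_t)(uint16_t)(b >> (16 * x));
        int16_t half = (int16_t)(sum >> 1);   /* always within the int16_t range */
        rd |= (uint64_t)(uint16_t)half << (16 * x);
    }
    return rd;
}
```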

__STATIC_FORCEINLINE unsigned long long __RV_DSUB16 (unsigned long long a, unsigned long long b)

DSUB16 (64-bit SIMD 16-bit Signed Subtraction)

Type: SIMD

Syntax:

DSUB16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit integer element subtractions simultaneously.

Description

:

This instruction subtracts the 16-bit signed integer elements in Rs2 from the 16-bit signed integer elements in Rs1. The results are written to Rd.

Operations:

Rd.H[x] = [(Rs1.H[x]) - (Rs2.H[x])] ;
x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DRADD32 (unsigned long long a, unsigned long long b)

DRADD32 (64-bit SIMD 32-bit Halving Signed Addition)

Type: SIMD

Syntax:

DRADD32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element additions simultaneously. The results are halved to avoid overflow or saturation.

Description

:

This instruction adds the 32-bit signed integer elements in Rs1 with the 32-bit signed integer elements in Rs2. The results are first arithmetically right-shifted by 1 bit and then written to Rd.

Operations:

Rd.W[x] = [(Rs1.W[x]) + (Rs2.W[x])] s>> 1;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSUB32 (unsigned long long a, unsigned long long b)

DSUB32 (64-bit SIMD 32-bit Signed Subtraction)

Type: SIMD

Syntax:

DSUB32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit integer element subtractions simultaneously.

Description

:

This instruction subtracts the 32-bit signed integer elements in Rs2 from the 32-bit signed integer elements in Rs1. The results are written to Rd.

Operations:

Rd.W[x] = [(Rs1.W[x]) - (Rs2.W[x])];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DMSR16 (unsigned long a, unsigned long b)

DMSR16 (Signed Multiply Halfs with Right Shift 16-bit and Cross Multiply Halfs with Right Shift 16-bit)

Type: SIMD

Syntax:

DMSR16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do two signed 16-bit multiplications and two signed 16-bit cross multiplications on the 16-bit elements of two registers; each product is then right-shifted.

Description

:

For the DMSR16 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2, and multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2, and multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2. The 32-bit products are then right-shifted by 16 bits and clipped to Q15 values. The Q15 results are then written into Rd.

Operations:

Rd.H[0] = (Rs1.H[0] s* Rs2.H[0]) s>> 16
Rd.H[1] = (Rs1.H[1] s* Rs2.H[1]) s>> 16
Rd.H[2] = (Rs1.H[1] s* Rs2.H[0]) s>> 16
Rd.H[3] = (Rs1.H[0] s* Rs2.H[1]) s>> 16

Parameters
  • a[in] unsigned long type of value stored in a

  • b[in] unsigned long type of value stored in b

Returns

value stored in unsigned long long type
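
The lane layout of the four products is easy to get wrong, so the following sketch mirrors the Operations listing in portable C. `dmsr16_model` is a hypothetical name, the 32-bit inputs are assumed to correspond to `unsigned long` on RV32, and an arithmetic right shift on negative values is assumed. Because a 16x16 product shifted right by 16 always fits in 16 bits, the model needs no explicit clipping step.

```c
#include <stdint.h>

/* Hypothetical model of DMSR16: two straight and two crossed 16x16 products, each >> 16. */
static uint64_t dmsr16_model(uint32_t a, uint32_t b)
{
    int16_t a0 = (int16_t)(uint16_t)a;
    int16_t a1 = (int16_t)(uint16_t)(a >> 16);
    int16_t b0 = (int16_t)(uint16_t)b;
    int16_t b1 = (int16_t)(uint16_t)(b >> 16);

    uint16_t h0 = (uint16_t)(((int32_t)a0 * b0) >> 16);  /* bottom x bottom */
    uint16_t h1 = (uint16_t)(((int32_t)a1 * b1) >> 16);  /* top x top */
    uint16_t h2 = (uint16_t)(((int32_t)a1 * b0) >> 16);  /* top x bottom */
    uint16_t h3 = (uint16_t)(((int32_t)a0 * b1) >> 16);  /* bottom x top */

    return ((uint64_t)h3 << 48) | ((uint64_t)h2 << 32) |
           ((uint64_t)h1 << 16) | (uint64_t)h0;
}
```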

__STATIC_FORCEINLINE unsigned long long __RV_DMSR17 (unsigned long a, unsigned long b)

DMSR17 (Signed Multiply Halfs with Right Shift 17-bit and Cross Multiply Halfs with Right Shift 17-bit)

Type: SIMD

Syntax:

DMSR17 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do two signed 16-bit multiplications and two signed 16-bit cross multiplications on the 16-bit elements of two registers; each product is then right-shifted.

Description

:

For the DMSR17 instruction, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2, and multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2. At the same time, multiply the top 16-bit Q15 content of 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of 32-bit chunks in Rs2, and multiply the bottom 16-bit Q15 content of 32-bit chunks in Rs1 with the top 16-bit Q15 content of 32-bit chunks in Rs2. The 32-bit products are then right-shifted by 17 bits and clipped to Q15 values. The Q15 results are then written into Rd.

Operations:

Rd.H[0] = (Rs1.H[0] s* Rs2.H[0]) s>> 17
Rd.H[1] = (Rs1.H[1] s* Rs2.H[1]) s>> 17
Rd.H[2] = (Rs1.H[1] s* Rs2.H[0]) s>> 17
Rd.H[3] = (Rs1.H[0] s* Rs2.H[1]) s>> 17

Parameters
  • a[in] unsigned long type of value stored in a

  • b[in] unsigned long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DMSR33 (unsigned long long a, unsigned long long b)

DMSR33 (Signed Multiply with Right Shift 33-bit)

Type: SIMD

Syntax:

DMSR33 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do two signed 32-bit multiplications on the 32-bit elements of two registers; each product is then right-shifted.

Description

:

For the DMSR33 instruction, multiply the top 32-bit Q31 content of 64-bit chunks in Rs1 with the top 32-bit Q31 content of 64-bit chunks in Rs2. At the same time, multiply the bottom 32-bit Q31 content of 64-bit chunks in Rs1 with the bottom 32-bit Q31 content of 64-bit chunks in Rs2. The 64-bit products are then right-shifted by 33 bits and clipped to Q31 values. The Q31 results are then written into Rd.

Operations:

Rd.W[0] = (Rs1.W[0] s* Rs2.W[0]) s>> 33
Rd.W[1] = (Rs1.W[1] s* Rs2.W[1]) s>> 33

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DMXSR33 (unsigned long long a, unsigned long long b)

DMXSR33 (Signed Cross Multiply with Right Shift 33-bit)

Type: SIMD

Syntax:

DMXSR33 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do two signed 32-bit cross multiplications on the 32-bit elements of two registers; each product is then right-shifted.

Description

:

For the DMXSR33 instruction, multiply the top 32-bit Q31 content of 64-bit chunks in Rs1 with the bottom 32-bit Q31 content of 64-bit chunks in Rs2. At the same time, multiply the bottom 32-bit Q31 content of 64-bit chunks in Rs1 with the top 32-bit Q31 content of 64-bit chunks in Rs2. The 64-bit products are then right-shifted by 33 bits and clipped to Q31 values. The Q31 results are then written into Rd.

Operations:

Rd.W[0] = (Rs1.W[0] s* Rs2.W[1]) s>> 33
Rd.W[1] = (Rs1.W[1] s* Rs2.W[0]) s>> 33

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
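
A short software sketch of the crossed products follows, written directly from the Operations listing. `dmxsr33_model` is a hypothetical name, and the code assumes an arithmetic right shift on negative 64-bit values. A 32x32 product shifted right by 33 always fits in 32 bits, so no clipping branch appears in the model.

```c
#include <stdint.h>

/* Hypothetical model of DMXSR33: crossed signed 32x32 products, each >> 33. */
static uint64_t dmxsr33_model(uint64_t a, uint64_t b)
{
    int32_t a0 = (int32_t)(uint32_t)a;
    int32_t a1 = (int32_t)(uint32_t)(a >> 32);
    int32_t b0 = (int32_t)(uint32_t)b;
    int32_t b1 = (int32_t)(uint32_t)(b >> 32);

    uint32_t w0 = (uint32_t)(((int64_t)a0 * b1) >> 33);  /* Rs1.W[0] x Rs2.W[1] */
    uint32_t w1 = (uint32_t)(((int64_t)a1 * b0) >> 33);  /* Rs1.W[1] x Rs2.W[0] */

    return ((uint64_t)w1 << 32) | w0;
}
```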

__STATIC_FORCEINLINE unsigned long __RV_DREDAS16 (unsigned long long a)

DREDAS16 (Reduced Addition and Reduced Subtraction)

Type: SIMD

Syntax:

DREDAS16 Rd, Rs1
# Rd, Rs1 are all even/odd pair of registers

Purpose

:

Do halfs reduced subtraction and halfs reduced addition from a register. The result is written to Rd.

Description

:

For the DREDAS16 instruction, subtract the top 16-bit Q15 element from the bottom 16-bit Q15 element of the bottom 32-bit Q31 content of 64-bit chunks in Rs1. At the same time, add the top 16-bit Q15 element to the bottom 16-bit Q15 element of the top 32-bit Q31 content of 64-bit chunks in Rs1. The two Q15 results are then written into Rd.

Operations:

Rd.H[0] = Rs1.H[0] - Rs1.H[1]
Rd.H[1] = Rs1.H[2] + Rs1.H[3]

Parameters

a[in] unsigned long long type of value stored in a

Returns

value stored in unsigned long type
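
The reduction can be sketched in C as below. `dredas16_model` is a hypothetical name, the 32-bit return type is assumed to correspond to `unsigned long` on RV32, and since the Operations listing shows no saturation, the model simply wraps each result to 16 bits.

```c
#include <stdint.h>

/* Hypothetical model of DREDAS16: subtract within the low word, add within the high word. */
static uint32_t dredas16_model(uint64_t a)
{
    int32_t h0 = (int16_t)(uint16_t)a;
    int32_t h1 = (int16_t)(uint16_t)(a >> 16);
    int32_t h2 = (int16_t)(uint16_t)(a >> 32);
    int32_t h3 = (int16_t)(uint16_t)(a >> 48);

    uint16_t lo = (uint16_t)(h0 - h1);   /* Rd.H[0] */
    uint16_t hi = (uint16_t)(h2 + h3);   /* Rd.H[1] */

    return ((uint32_t)hi << 16) | lo;
}
```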

__STATIC_FORCEINLINE unsigned long __RV_DREDSA16 (unsigned long long a)

DREDSA16 (Reduced Subtraction and Reduced Addition)

Type: SIMD

Syntax:

DREDSA16 Rd, Rs1
# Rd, Rs1 are all even/odd pair of registers

Purpose

:

Do halfs reduced subtraction and halfs reduced addition from a register. The result is written to Rd.

Description

:

For the DREDSA16 instruction, add the top 16-bit Q15 element to the bottom 16-bit Q15 element of the bottom 32-bit Q31 content of 64-bit chunks in Rs1. At the same time, subtract the top 16-bit Q15 element from the bottom 16-bit Q15 element of the top 32-bit Q31 content of 64-bit chunks in Rs1. The two Q15 results are then written into Rd.

Operations:

Rd.H[0] = Rs1.H[0] + Rs1.H[1]
Rd.H[1] = Rs1.H[2] - Rs1.H[3]

Parameters

a[in] unsigned long long type of value stored in a

Returns

value stored in unsigned long type

__STATIC_FORCEINLINE int16_t __RV_DKCLIP64 (unsigned long long a)

DKCLIP64 (64-bit Clipped to 16-bit Saturation Value)

Type: SIMD

Syntax:

DKCLIP64 Rd, Rs1
# Rd, Rs1 are all even/odd pair of registers

Purpose

:

Arithmetically right-shift the 64-bit input by 15 bits, convert the result to a 32-bit int, then saturate it to the 16-bit Q15 range.

Description

:

For the DKCLIP64 instruction, shift the input 15 bits to the right and convert the result to a 32-bit int type. The value is then saturated to the range -2^15 to 2^15-1 and converted to the 16-bit Q15 type. The final result is written to Rd.

Operations:

const int32_t max = (int32_t)((1U << 15U) - 1U);
const int32_t min = -1 - max ;
int32_t val = (int32_t)(Rs1 s>> 15);
if (val > max) {
  Rd = max;
} else if (val < min) {
  Rd = min;
} else {
  Rd = (int16_t)val;
}

Parameters

a[in] unsigned long long type of value stored in a

Returns

value stored in int16_t type
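
The Operations listing is already close to C, and a self-contained version might look like the sketch below. `dkclip64_model` is a hypothetical name. The code assumes an arithmetic right shift on negative 64-bit values and the usual wrapping conversion from 64-bit to 32-bit integers, both of which hold on GCC and Clang.

```c
#include <stdint.h>

/* Hypothetical model of DKCLIP64: shift right by 15, narrow to 32 bits, clamp to Q15. */
static int16_t dkclip64_model(uint64_t a)
{
    const int32_t max = INT16_MAX;   /*  2^15 - 1 */
    const int32_t min = INT16_MIN;   /* -2^15     */
    int32_t val = (int32_t)((int64_t)a >> 15);

    if (val > max) {
        return (int16_t)max;
    }
    if (val < min) {
        return (int16_t)min;
    }
    return (int16_t)val;
}
```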

__STATIC_FORCEINLINE unsigned long long __RV_DKMDA (unsigned long long a, unsigned long long b)

DKMDA (Signed Multiply Two Halfs and Add)

Type: SIMD

Syntax:

DKMDA Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do two signed 16-bit multiplications from the 32-bit elements of two registers, and then add the two 32-bit results together. The addition result may be saturated.

Description

:

This instruction multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2 and then adds the result to the result of multiplying the top 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2. The addition result is checked for saturation. If saturation happens, the result is saturated to 2^31-1 and the OV flag is set to 1. The final results are written to Rd. The 16-bit contents are treated as signed integers.

Operations:

if (Rs1.W[x] != 0x80008000) or (Rs2.W[x] != 0x80008000){
  Rd.W[x] = (Rs1.W[x].H[1] * Rs2.W[x].H[1]) + (Rs1.W[x].H[0] * Rs2.W[x].H[0]);
} else {
  Rd.W[x] = 0x7fffffff;
  OV = 1;
}
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
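
The single saturation case in the Operations above (both 32-bit elements equal to 0x80008000) can be seen in the following software sketch. `dkmda_model` is a hypothetical name and OV is modeled as a caller-supplied `int`. For every other input pair the sum of the two products fits in a signed 32-bit value.

```c
#include <stdint.h>

/* Hypothetical model of DKMDA: top*top + bottom*bottom per 32-bit element. */
static uint64_t dkmda_model(uint64_t a, uint64_t b, int *ov)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t wa = (uint32_t)(a >> (32 * x));
        uint32_t wb = (uint32_t)(b >> (32 * x));
        uint32_t res;
        if (wa == 0x80008000u && wb == 0x80008000u) {
            res = 0x7FFFFFFFu;           /* 2^30 + 2^30 does not fit in 32 bits */
            *ov = 1;
        } else {
            int32_t a0 = (int16_t)(uint16_t)wa;
            int32_t a1 = (int16_t)(uint16_t)(wa >> 16);
            int32_t b0 = (int16_t)(uint16_t)wb;
            int32_t b1 = (int16_t)(uint16_t)(wb >> 16);
            res = (uint32_t)(a1 * b1 + a0 * b0);
        }
        rd |= (uint64_t)res << (32 * x);
    }
    return rd;
}
```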

__STATIC_FORCEINLINE unsigned long long __RV_DKMXDA (unsigned long long a, unsigned long long b)

DKMXDA (Signed Crossed Multiply Two Halfs and Add)

Type: SIMD

Syntax:

DKMXDA Rd, Rs1, Rs2

Purpose

:

Do two signed 16-bit multiplications from the 32-bit elements of two registers, and then add the two 32-bit results together. The addition result may be saturated.

  • DKMXDA: top*bottom + bottom*top (per 32-bit element)

Description

:

This instruction multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2 and then adds the result to the result of multiplying the top 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2. The addition result is checked for saturation. If saturation happens, the result is saturated to 2^31-1 and the OV flag is set to 1. The final results are written to Rd. The 16-bit contents are treated as signed integers.

Operations:

if (Rs1.W[x] != 0x80008000) or (Rs2.W[x] != 0x80008000){
  Rd.W[x] = (Rs1.W[x].H[1] * Rs2.W[x].H[0]) + (Rs1.W[x].H[0] * Rs2.W[x].H[1]);
} else {
  Rd.W[x] = 0x7fffffff;
  OV = 1;
}
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSMDRS (unsigned long long a, unsigned long long b)

DSMDRS (Signed Multiply Two Halfs and Reverse Subtract)

Type: SIMD

Syntax:

DSMDRS Rd, Rs1, Rs2

Purpose

:

Do two signed 16-bit multiplications from the 32-bit elements of two registers; and then perform a subtraction operation between the two 32-bit results.

  • DSMDRS: bottom*bottom - top*top (per 32-bit element)

Description

:

This instruction multiplies the top 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2 and then subtracts the result from the result of multiplying the bottom 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2. The subtraction result is written to the corresponding 32-bit element of Rd (The 16-bit contents of multiplication are treated as signed integers).

Operations:

Rd.W[x] = (Rs1.W[x].H[0] * Rs2.W[x].H[0]) - (Rs1.W[x].H[1] * Rs2.W[x].H[1]);
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSMXDS (unsigned long long a, unsigned long long b)

DSMXDS (Signed Crossed Multiply Two Halfs and Subtract)

Type: SIMD

Syntax:

DSMXDS Rd, Rs1, Rs2

Purpose

:

Do two signed 16-bit multiplications from the 32-bit elements of two registers; and then perform a subtraction operation between the two 32-bit results.

  • DSMXDS: top*bottom - bottom*top (per 32-bit element)

Description

:

This instruction multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2 and then subtracts the result from the result of multiplying the top 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2. The subtraction result is written to the corresponding 32-bit element of Rd. The 16-bit contents of multiplication are treated as signed integers.

Operations:

Rd.W[x] = (Rs1.W[x].H[1] * Rs2.W[x].H[0]) - (Rs1.W[x].H[0] * Rs2.W[x].H[1]);
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
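
The crossed subtraction can be modeled in software as in the sketch below. `dsmxds_model` is a hypothetical name. The difference of two signed 16x16 products always fits in 32 bits, which is consistent with the Operations listing showing no saturation step.

```c
#include <stdint.h>

/* Hypothetical model of DSMXDS: top*bottom - bottom*top per 32-bit element. */
static uint64_t dsmxds_model(uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t wa = (uint32_t)(a >> (32 * x));
        uint32_t wb = (uint32_t)(b >> (32 * x));
        int32_t a0 = (int16_t)(uint16_t)wa;
        int32_t a1 = (int16_t)(uint16_t)(wa >> 16);
        int32_t b0 = (int16_t)(uint16_t)wb;
        int32_t b1 = (int16_t)(uint16_t)(wb >> 16);
        uint32_t res = (uint32_t)(a1 * b0 - a0 * b1);
        rd |= (uint64_t)res << (32 * x);
    }
    return rd;
}
```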

__STATIC_FORCEINLINE long long __RV_DSMBB32 (unsigned long long a, unsigned long long b)

DSMBB32 (Signed Multiply Bottom Word & Bottom Word)

Type: SIMD

Syntax:

DSMBB32 Rd, Rs1, Rs2

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register and write the 64-bit result to a third register.

  • DSMBB32: bottom*bottom

Description

:

This instruction multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2. The 64-bit multiplication result is written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[0] * Rs2.W[0]);
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMBB32_SRA14 (unsigned long long a, unsigned long long b)

DSMBB32.sra14 (Signed Multiply Bottom Word & Bottom Word with Right Shift 14)

Type: SIMD

Syntax:

DSMBB32.sra14 Rd, Rs1, Rs2

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register, then right shift by 14 bits, and finally write the 64-bit result to a third register.

  • DSMBB32.sra14: bottom*bottom s>> 14

Description

:

This instruction multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2. The 64-bit multiplication result is written to Rd after right shift 14-bit. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[0] * Rs2.W[0]) s>> 14;
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMBB32_SRA32 (unsigned long long a, unsigned long long b)

DSMBB32.sra32 (Signed Multiply Bottom Word & Bottom Word with Right Shift 32)

Type: SIMD

Syntax:

DSMBB32.sra32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register, then right shift by 32 bits, and finally write the 64-bit result to a third register.

  • DSMBB32.sra32: bottom*bottom s>> 32

Description

:

This instruction multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2. The 64-bit multiplication result is written to Rd after right shift 32-bit. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[0] * Rs2.W[0]) s>> 32;
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMBT32 (unsigned long long a, unsigned long long b)

DSMBT32 (Signed Multiply Bottom Word & Top Word)

Type: SIMD

Syntax:

DSMBT32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register and write the 64-bit result to a third register.

  • DSMBT32: bottom*top

Description

:

This instruction multiplies the bottom 32-bit element of Rs1 with the top 32-bit element of Rs2. The 64-bit multiplication result is written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[0] * Rs2.W[1]);
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMBT32_SRA14 (unsigned long long a, unsigned long long b)

DSMBT32.sra14 (Signed Multiply Bottom Word & Top Word with Right Shift 14)

Type: SIMD

Syntax:

DSMBT32.sra14 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register, then right shift by 14 bits, and finally write the 64-bit result to a third register.

  • DSMBT32.sra14: bottom*top s>> 14

Description

:

This instruction multiplies the bottom 32-bit element of Rs1 with the top 32-bit element of Rs2. The 64-bit multiplication result is written to Rd after right shift 14-bit. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[0] * Rs2.W[1]) s>> 14;
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMBT32_SRA32 (unsigned long long a, unsigned long long b)

DSMBT32.sra32 (Signed Multiply Bottom Word & Top Word with Right Shift 32)

Type: SIMD

Syntax:

DSMBT32.sra32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register, then right shift by 32 bits, and finally write the 64-bit result to a third register.

  • DSMBT32.sra32: bottom*top s>> 32

Description

:

This instruction multiplies the bottom 32-bit element of Rs1 with the top 32-bit element of Rs2. The 64-bit multiplication result is written to Rd after right shift 32-bit. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[0] * Rs2.W[1]) s>> 32;
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
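
The three DSMBT32 variants differ only in the final shift, which the following sketch makes explicit with a shift parameter. `dsmbt32_model` and its `sra` argument are hypothetical (the real intrinsics are separate functions), and the code assumes an arithmetic right shift on negative 64-bit values.

```c
#include <stdint.h>

/* Hypothetical model of DSMBT32 and its shifted forms: (Rs1.W[0] * Rs2.W[1]) s>> sra.
 * Pass sra = 0, 14 or 32 to mirror DSMBT32, DSMBT32.sra14 or DSMBT32.sra32. */
static int64_t dsmbt32_model(uint64_t a, uint64_t b, unsigned sra)
{
    int32_t a_bot = (int32_t)(uint32_t)a;           /* bottom word of Rs1 */
    int32_t b_top = (int32_t)(uint32_t)(b >> 32);   /* top word of Rs2 */
    int64_t prod = (int64_t)a_bot * (int64_t)b_top;
    return prod >> sra;
}
```

With a bottom word of -65536 and a top word of 65536 the product is -2^32, so the three variants return -4294967296, -262144 and -1 respectively.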

__STATIC_FORCEINLINE long long __RV_DSMTT32 (unsigned long long a, unsigned long long b)

DSMTT32 (Signed Multiply Top Word & Top Word)

Type: SIMD

Syntax:

DSMTT32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register and write the 64-bit result to a third register.

  • DSMTT32: top*top

Description

:

This instruction multiplies the top 32-bit element of Rs1 with the top 32-bit element of Rs2. The 64-bit multiplication result is written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = Rs1.W[1] * Rs2.W[1];
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMTT32_SRA14 (unsigned long long a, unsigned long long b)

DSMTT32.sra14 (Signed Multiply Top Word & Top Word with Right Shift 14-bit)

Type: SIMD

Syntax:

DSMTT32.sra14 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register, arithmetically right-shift the product by 14 bits, and write the 64-bit result to a third register.

  • DSMTT32.sra14: top*top s>> 14

Description

:

This instruction multiplies the top 32-bit element of Rs1 with the top 32-bit element of Rs2. The 64-bit multiplication result is arithmetically right-shifted by 14 bits and then written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[1] * Rs2.W[1]) s>> 14;
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMTT32_SRA32 (unsigned long long a, unsigned long long b)

DSMTT32.sra32 (Signed Multiply Top Word & Top Word with Right Shift 32-bit)

Type: SIMD

Syntax:

DSMTT32.sra32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 32-bit element of a register with the signed 32-bit element of another register, arithmetically right-shift the product by 32 bits, and write the 64-bit result to a third register.

  • DSMTT32.sra32: top*top s>> 32

Description

:

This instruction multiplies the top 32-bit element of Rs1 with the top 32-bit element of Rs2. The 64-bit multiplication result is arithmetically right-shifted by 32 bits and then written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = (Rs1.W[1] * Rs2.W[1]) s>> 32;
Rd = res;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
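The top*top multiply with a 32-bit shift keeps only the most significant word of the 64-bit product, sign-extended to 64 bits. Below is a small sketch of that behaviour in portable C; `ref_dsmtt32_sra32` is a hypothetical name, and the code assumes an arithmetic `>>` on negative `int64_t` values (as GCC and Clang provide).

```c
#include <stdint.h>

/* Hypothetical reference model, not the NMSIS implementation.
 * Assumption: >> on a negative int64_t shifts arithmetically. */
static int64_t ref_dsmtt32_sra32(uint64_t a, uint64_t b)
{
    int32_t top_a = (int32_t)(uint32_t)(a >> 32);  /* Rs1.W[1] */
    int32_t top_b = (int32_t)(uint32_t)(b >> 32);  /* Rs2.W[1] */
    int64_t product = (int64_t)top_a * (int64_t)top_b;
    return product >> 32;                          /* s>> 32 */
}
```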

__STATIC_FORCEINLINE unsigned long long __RV_DPKBB32 (unsigned long long a, unsigned long long b)

DPKBB32 (Pack Two 32-bit Data from Both Bottom Half)

Type: SIMD

Syntax:

DPKBB32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 32-bit data from 64-bit chunks in two registers.

  • DPKBB32: bottom.bottom

Description

:

This instruction moves Rs1.W[0] to Rd.W[1] and moves Rs2.W[0] to Rd.W[0].

Operations:

Rd = CONCAT(Rs1.W[0], Rs2.W[0]);

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DPKBT32 (unsigned long long a, unsigned long long b)

DPKBT32 (Pack Two 32-bit Data from Bottom and Top Half)

Type: SIMD

Syntax:

DPKBT32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 32-bit data from 64-bit chunks in two registers.

  • DPKBT32: bottom.top

Description

:

This instruction moves Rs1.W[0] to Rd.W[1] and moves Rs2.W[1] to Rd.W[0].

Operations:

Rd = CONCAT(Rs1.W[0], Rs2.W[1]);

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DPKTT32 (unsigned long long a, unsigned long long b)

DPKTT32 (Pack Two 32-bit Data from Both Top Half)

Type: SIMD

Syntax:

DPKTT32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 32-bit data from 64-bit chunks in two registers.

  • DPKTT32: top.top

Description

:

This instruction moves Rs1.W[1] to Rd.W[1] and moves Rs2.W[1] to Rd.W[0].

Operations:

Rd = CONCAT(Rs1.W[1], Rs2.W[1]);

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DPKTB32 (unsigned long long a, unsigned long long b)

DPKTB32 (Pack Two 32-bit Data from Top and Bottom Half)

Type: SIMD

Syntax:

DPKTB32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 32-bit data from 64-bit chunks in two registers.

  • DPKTB32: top.bottom

Description

:

This instruction moves Rs1.W[1] to Rd.W[1] and moves Rs2.W[0] to Rd.W[0].

Operations:

Rd = CONCAT(Rs1.W[1], Rs2.W[0]);

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
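The four 32-bit pack variants differ only in which word of each source is selected. As a rough sketch of the CONCAT operations above, the portable C below models all four; the `ref_*` functions and the helpers `lo32`, `hi32`, and `concat32` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers: select a word, or join two words into 64 bits. */
static uint32_t lo32(uint64_t v) { return (uint32_t)v; }          /* W[0] */
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }  /* W[1] */
static uint64_t concat32(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Reference models: Rd = CONCAT(word of Rs1, word of Rs2). */
static uint64_t ref_dpkbb32(uint64_t a, uint64_t b) { return concat32(lo32(a), lo32(b)); }
static uint64_t ref_dpkbt32(uint64_t a, uint64_t b) { return concat32(lo32(a), hi32(b)); }
static uint64_t ref_dpktt32(uint64_t a, uint64_t b) { return concat32(hi32(a), hi32(b)); }
static uint64_t ref_dpktb32(uint64_t a, uint64_t b) { return concat32(hi32(a), lo32(b)); }
```

In every variant the word taken from Rs1 lands in Rd.W[1] and the word taken from Rs2 lands in Rd.W[0].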

__STATIC_FORCEINLINE unsigned long long __RV_DPKTB16 (unsigned long long a, unsigned long long b)

DPKTB16 (Pack Two 16-bit Data from Top and Bottom Half)

Type: SIMD

Syntax:

DPKTB16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 16-bit data from 32-bit chunks in two registers.

  • DPKTB16: top.bottom

Description

:

This instruction moves Rs1.W[x][31:16] to Rd.W[x][31:16] and moves Rs2.W[x][15:0] to Rd.W[x][15:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.W[x][31:16], Rs2.W[x][15:0]);
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DPKBB16 (unsigned long long a, unsigned long long b)

DPKBB16 (Pack Two 16-bit Data from Both Bottom Half)

Type: SIMD

Syntax:

DPKBB16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 16-bit data from 32-bit chunks in two registers.

  • DPKBB16: bottom.bottom

Description

:

This instruction moves Rs1.W[x][15:0] to Rd.W[x][31:16] and moves Rs2.W[x][15:0] to Rd.W[x][15:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.W[x][15:0], Rs2.W[x][15:0]);
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DPKBT16 (unsigned long long a, unsigned long long b)

DPKBT16 (Pack Two 16-bit Data from Bottom and Top Half)

Type: SIMD

Syntax:

DPKBT16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 16-bit data from 32-bit chunks in two registers.

  • DPKBT16: bottom.top

Description

:

This instruction moves Rs1.W[x][15:0] to Rd.W[x][31:16] and moves Rs2.W[x][31:16] to Rd.W[x][15:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.W[x][15:0], Rs2.W[x][31:16]);
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DPKTT16 (unsigned long long a, unsigned long long b)

DPKTT16 (Pack Two 16-bit Data from Both Top Half)

Type: SIMD

Syntax:

DPKTT16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Pack 16-bit data from 32-bit chunks in two registers.

  • DPKTT16: top.top

Description

:

This instruction moves Rs1.W[x][31:16] to Rd.W[x][31:16] and moves Rs2.W[x][31:16] to Rd.W[x][15:0].

Operations:

Rd.W[x][31:0] = CONCAT(Rs1.W[x][31:16], Rs2.W[x][31:16]);
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
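The 16-bit pack variants apply the same selection inside each 32-bit word of the 64-bit operands. A minimal sketch using masks and shifts follows; the `ref_*` names and the two mask macros are hypothetical and only serve this illustration.

```c
#include <stdint.h>

/* Hypothetical masks selecting the top or bottom half of both 32-bit words. */
#define TOP_HALVES    0xFFFF0000FFFF0000ULL
#define BOTTOM_HALVES 0x0000FFFF0000FFFFULL

/* Rd.W[x] = CONCAT(half of Rs1.W[x], half of Rs2.W[x]) for x = 1, 0. */
static uint64_t ref_dpkbb16(uint64_t a, uint64_t b)
{
    return ((a << 16) & TOP_HALVES) | (b & BOTTOM_HALVES);
}
static uint64_t ref_dpkbt16(uint64_t a, uint64_t b)
{
    return ((a << 16) & TOP_HALVES) | ((b >> 16) & BOTTOM_HALVES);
}
static uint64_t ref_dpktt16(uint64_t a, uint64_t b)
{
    return (a & TOP_HALVES) | ((b >> 16) & BOTTOM_HALVES);
}
static uint64_t ref_dpktb16(uint64_t a, uint64_t b)
{
    return (a & TOP_HALVES) | (b & BOTTOM_HALVES);
}
```

The masks discard any bits that a 64-bit shift carries across the boundary between the two 32-bit words, so each word is packed independently.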

__STATIC_FORCEINLINE unsigned long long __RV_DSRA16 (unsigned long long a, unsigned long b)

DSRA16 (SIMD 16-bit Shift Right Arithmetic)

Type: SIMD

Syntax:

DSRA16 Rd, Rs1, Rs2
# Rd, Rs1 are all even/odd pair of registers

Purpose

:

Do 16-bit element arithmetic right shift operations simultaneously. The shift amount is a variable from a GPR.

Description

:

The 16-bit data elements in Rs1 are right-shifted arithmetically, that is, the vacated high-order bits are filled with the sign bit of each data element. The shift amount is specified by the low-order 4 bits of the value in the Rs2 register. The results are written to Rd.

Operations:

sa = Rs2[3:0];
if (sa != 0)
{
Rd.H[x] = SE16(Rs1.H[x][15:sa]);
} else {
Rd = Rs1;
}
x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long type of value stored in b

Returns

value stored in unsigned long long type
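To make the lane-wise shift concrete, here is a hedged sketch in portable C that walks the four 16-bit elements and shifts each one by `Rs2[3:0]`. `ref_dsra16` is a hypothetical name; the code assumes `>>` on negative signed values is an arithmetic shift (true for GCC and Clang).

```c
#include <stdint.h>

/* Hypothetical reference model, not the NMSIS implementation.
 * Assumption: >> on a negative signed value shifts arithmetically. */
static uint64_t ref_dsra16(uint64_t a, unsigned long b)
{
    unsigned sa = (unsigned)(b & 0xF);   /* only Rs2[3:0] is used */
    uint64_t result = 0;
    for (int x = 0; x < 4; x++) {
        int16_t lane = (int16_t)(uint16_t)(a >> (16 * x));  /* Rs1.H[x] */
        int16_t shifted = (int16_t)(lane >> sa);            /* sign bits fill in */
        result |= (uint64_t)(uint16_t)shifted << (16 * x);  /* Rd.H[x] */
    }
    return result;
}
```

A shift amount of 0 leaves every lane unchanged, which matches the `else` branch of the operation above.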

__STATIC_FORCEINLINE unsigned long long __RV_DADD16 (unsigned long long a, unsigned long long b)

DADD16 (16-bit Addition)

Type: SIMD

Syntax:

DADD16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit integer element additions simultaneously.

Description

:

This instruction adds the 16-bit integer elements in Rs1 with the 16-bit integer elements in Rs2. The results are written to Rd.

Operations:

Rd.H[x] = Rs1.H[x] + Rs2.H[x];
x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DADD32 (unsigned long long a, unsigned long long b)

DADD32 (32-bit Addition)

Type: SIMD

Syntax:

DADD32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit integer element additions simultaneously.

Description

:

This instruction adds the 32-bit integer elements in Rs1 with the 32-bit integer elements in Rs2, and then writes the 32-bit element results to Rd.

Operations:

Rd.W[x] = Rs1.W[x] + Rs2.W[x];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
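The element-wise additions differ from a plain 64-bit add in that no carry crosses a lane boundary. The sketch below models DADD16 and DADD32 in portable C under that reading of the operations; `ref_dadd16` and `ref_dadd32` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical reference models: each lane wraps on its own. */
static uint64_t ref_dadd16(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int x = 0; x < 4; x++) {
        uint16_t sum = (uint16_t)((uint16_t)(a >> (16 * x)) +
                                  (uint16_t)(b >> (16 * x)));  /* wraps mod 2^16 */
        result |= (uint64_t)sum << (16 * x);
    }
    return result;
}

static uint64_t ref_dadd32(uint64_t a, uint64_t b)
{
    uint32_t lo = (uint32_t)a + (uint32_t)b;                   /* wraps mod 2^32 */
    uint32_t hi = (uint32_t)(a >> 32) + (uint32_t)(b >> 32);
    return ((uint64_t)hi << 32) | lo;
}
```

For example, adding 1 to a lane holding 0xFFFF gives 0 in that lane and leaves the neighbouring lane untouched.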

__STATIC_FORCEINLINE unsigned long long __RV_DSMBB16 (unsigned long long a, unsigned long long b)

DSMBB16 (Signed Multiply Bottom Half & Bottom Half)

Type: SIMD

Syntax:

DSMBB16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 16-bit content of the 32-bit elements of a register with the signed 16-bit content of the 32-bit elements of another register and write the result to a third register.

  • DSMBB16: W[x].bottom*W[x].bottom

Description

:

The DSMBB16 instruction multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2. The multiplication results are written to Rd. The 16-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

Rd.W[x] = Rs1.W[x].H[0] * Rs2.W[x].H[0];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSMBT16 (unsigned long long a, unsigned long long b)

DSMBT16 (Signed Multiply Bottom Half & Top Half)

Type: SIMD

Syntax:

DSMBT16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 16-bit content of the 32-bit elements of a register with the signed 16-bit content of the 32-bit elements of another register and write the result to a third register.

  • DSMBT16: W[x].bottom*W[x].top

Description

:

The DSMBT16 instruction multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2. The multiplication results are written to Rd. The 16-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

Rd.W[x] = Rs1.W[x].H[0] * Rs2.W[x].H[1];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSMTT16 (unsigned long long a, unsigned long long b)

DSMTT16 (Signed Multiply Top Half & Top Half)

Type: SIMD

Syntax:

DSMTT16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Multiply the signed 16-bit content of the 32-bit elements of a register with the signed 16-bit content of the 32-bit elements of another register and write the result to a third register.

  • DSMTT16: W[x].top * W[x].top

Description

:

The DSMTT16 instruction multiplies the top 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2. The multiplication results are written to Rd. The 16-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

Rd.W[x] = Rs1.W[x].H[1] * Rs2.W[x].H[1];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
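The three 16x16 multiplies share one pattern: pick a half from each 32-bit word of both operands, multiply as signed values, and store the 32-bit product in the same word of the result. A possible host-side sketch follows; `half16`, `mul_halves`, and the `ref_*` functions are hypothetical names, not NMSIS APIs.

```c
#include <stdint.h>

/* Hypothetical helper: signed half h (0 = bottom, 1 = top) of word w. */
static int16_t half16(uint64_t v, int w, int h)
{
    return (int16_t)(uint16_t)(v >> (32 * w + 16 * h));
}

/* Rd.W[x] = Rs1.W[x].H[ha] * Rs2.W[x].H[hb] for x = 1, 0. */
static uint64_t mul_halves(uint64_t a, int ha, uint64_t b, int hb)
{
    uint64_t result = 0;
    for (int x = 0; x < 2; x++) {
        /* A 16x16 signed product always fits in 32 bits. */
        int32_t p = (int32_t)half16(a, x, ha) * (int32_t)half16(b, x, hb);
        result |= (uint64_t)(uint32_t)p << (32 * x);
    }
    return result;
}

static uint64_t ref_dsmbb16(uint64_t a, uint64_t b) { return mul_halves(a, 0, b, 0); }
static uint64_t ref_dsmbt16(uint64_t a, uint64_t b) { return mul_halves(a, 0, b, 1); }
static uint64_t ref_dsmtt16(uint64_t a, uint64_t b) { return mul_halves(a, 1, b, 1); }
```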

__STATIC_FORCEINLINE unsigned long long __RV_DRCRSA16 (unsigned long long a, unsigned long long b)

DRCRSA16 (16-bit Signed Halving Cross Subtraction & Addition)

Type: SIMD

Syntax:

DRCRSA16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element subtraction and 16-bit signed integer element addition in a 32-bit chunk simultaneously. Operands are from crossed positions in 32-bit chunks. The results are halved to avoid overflow or saturation.

Description

:

This instruction subtracts the 16-bit signed integer in [15:0] of 32-bit chunks in Rs2 from the 16-bit signed integer in [31:16] of 32-bit chunks in Rs1, and adds the 16-bit signed integer in [31:16] of 32-bit chunks in Rs2 with the 16-bit signed integer in [15:0] of 32-bit chunks in Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [31:16] of 32-bit chunks in Rd for subtraction and [15:0] of 32-bit chunks in Rd for addition.

Operations:

Rd.W[x][31:16] = (Rs1.W[x][31:16] - Rs2.W[x][15:0]) s>> 1;
Rd.W[x][15:0] = (Rs1.W[x][15:0] + Rs2.W[x][31:16]) s>> 1;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DRCRSA32 (unsigned long long a, unsigned long long b)

DRCRSA32 (32-bit Signed Halving Cross Subtraction & Addition)

Type: SIMD

Syntax:

DRCRSA32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element subtraction and 32-bit signed integer element addition in a 64-bit chunk simultaneously. Operands are from crossed 32-bit elements. The results are halved to avoid overflow or saturation.

Description

:

This instruction subtracts the 32-bit signed integer element in [31:0] of Rs2 from the 32-bit signed integer element in [63:32] of Rs1, and adds the 32-bit signed integer element in [63:32] of Rs2 with the 32-bit signed integer element in [31:0] of Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [63:32] of Rd for subtraction and [31:0] of Rd for addition.

Operations:

Rd.W[1] = (Rs1.W[1] - Rs2.W[0]) s>> 1;
Rd.W[0] = (Rs1.W[0] + Rs2.W[1]) s>> 1;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DRCRAS16 (unsigned long long a, unsigned long long b)

DRCRAS16 (16-bit Signed Halving Cross Addition & Subtraction)

Type: SIMD

Syntax:

DRCRAS16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element addition and 16-bit signed integer element subtraction in a 32-bit chunk simultaneously. Operands are from crossed positions in 32-bit chunks. The results are halved to avoid overflow or saturation.

Description

:

This instruction adds the 16-bit signed integer in [31:16] of 32-bit chunks in Rs1 with the 16-bit signed integer in [15:0] of 32-bit chunks in Rs2, and subtracts the 16-bit signed integer in [31:16] of 32-bit chunks in Rs2 from the 16-bit signed integer in [15:0] of 32-bit chunks in Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [31:16] of 32-bit chunks in Rd for addition and [15:0] of 32-bit chunks in Rd for subtraction.

Operations:

Rd.W[x][31:16] = (Rs1.W[x][31:16] + Rs2.W[x][15:0]) s>> 1;
Rd.W[x][15:0] = (Rs1.W[x][15:0] - Rs2.W[x][31:16]) s>> 1;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DRCRAS32 (unsigned long long a, unsigned long long b)

DRCRAS32 (32-bit Signed Halving Cross Addition & Subtraction)

Type: SIMD

Syntax:

DRCRAS32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element addition and 32-bit signed integer element subtraction in a 64-bit chunk simultaneously. Operands are from crossed 32-bit elements. The results are halved to avoid overflow or saturation.

Description

:

This instruction adds the 32-bit signed integer element in [63:32] of Rs1 with the 32-bit signed integer element in [31:0] of Rs2, and subtracts the 32-bit signed integer element in [63:32] of Rs2 from the 32-bit signed integer element in [31:0] of Rs1. The element results are first arithmetically right-shifted by 1 bit and then written to [63:32] of Rd for addition and [31:0] of Rd for subtraction.

Operations:

Rd.W[1] = (Rs1.W[1] + Rs2.W[0]) s>> 1;
Rd.W[0] = (Rs1.W[0] - Rs2.W[1]) s>> 1;

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
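Halving works because the sum or difference of two 32-bit signed values always fits in 33 bits, so shifting right by one brings it back into 32 bits without saturation. The sketch below models DRCRAS32 and DRCRSA32 with 64-bit intermediates; all names are hypothetical, and an arithmetic `>>` on negative `int64_t` values is assumed (as with GCC and Clang).

```c
#include <stdint.h>

/* Hypothetical helpers: sign-extended word i, and halving of a 33-bit value.
 * Assumption: >> on a negative int64_t shifts arithmetically. */
static int64_t word32(uint64_t v, int i) { return (int32_t)(uint32_t)(v >> (32 * i)); }
static uint32_t halve(int64_t v)         { return (uint32_t)(v >> 1); }

/* Add into W[1], subtract into W[0], operands crossed. */
static uint64_t ref_drcras32(uint64_t a, uint64_t b)
{
    uint32_t hi = halve(word32(a, 1) + word32(b, 0));
    uint32_t lo = halve(word32(a, 0) - word32(b, 1));
    return ((uint64_t)hi << 32) | lo;
}

/* Subtract into W[1], add into W[0], operands crossed. */
static uint64_t ref_drcrsa32(uint64_t a, uint64_t b)
{
    uint32_t hi = halve(word32(a, 1) - word32(b, 0));
    uint32_t lo = halve(word32(a, 0) + word32(b, 1));
    return ((uint64_t)hi << 32) | lo;
}
```

Note that the arithmetic shift rounds toward negative infinity, so an odd negative intermediate such as -3 halves to -2.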

__STATIC_FORCEINLINE unsigned long long __RV_DKCRAS16 (unsigned long long a, unsigned long long b)

DKCRAS16 (16-bit Signed Saturating Cross Addition & Subtraction)

Type: SIMD

Syntax:

DKCRAS16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element saturating addition and 16-bit signed integer element saturating subtraction in a 32-bit chunk simultaneously. Operands are from crossed positions in 32-bit chunks.

Description

:

This instruction adds the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs1 with the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs2; at the same time, it subtracts the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs2 from the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32-bit chunks in Rd for addition and [15:0] of 32-bit chunks in Rd for subtraction.

Operations:

res1 = Rs1.W[x][31:16] + Rs2.W[x][15:0];
res2 = Rs1.W[x][15:0] - Rs2.W[x][31:16];
for (res in [res1, res2]) {
  if (res > (2^15)-1) {
    res = (2^15)-1;
    OV = 1;
  } else if (res < -2^15) {
    res = -2^15;
    OV = 1;
  }
}
Rd.W[x][31:16] = res1;
Rd.W[x][15:0] = res2;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKCRSA16 (unsigned long long a, unsigned long long b)

DKCRSA16 (16-bit Signed Saturating Cross Subtraction & Addition)

Type: SIMD

Syntax:

DKCRSA16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element saturating subtraction and 16-bit signed integer element saturating addition in a 32-bit chunk simultaneously. Operands are from crossed positions in 32-bit chunks.

Description

:

This instruction subtracts the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs2 from the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs1; at the same time, it adds the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs2 with the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32-bit chunks in Rd for subtraction and [15:0] of 32-bit chunks in Rd for addition.

Operations:

res1 = Rs1.W[x][31:16] - Rs2.W[x][15:0];
res2 = Rs1.W[x][15:0] + Rs2.W[x][31:16];
for (res in [res1, res2]) {
  if (res > (2^15)-1) {
    res = (2^15)-1;
    OV = 1;
  } else if (res < -2^15) {
    res = -2^15;
    OV = 1;
  }
}
Rd.W[x][31:16] = res1;
Rd.W[x][15:0] = res2;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
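The saturating cross operations can be sketched on a host as follows, taking the instruction names and Description text as the guide (CRAS adds into [31:16] and subtracts into [15:0]; CRSA does the opposite). The global `ov_flag` is a hypothetical stand-in for the OV status bit, and the `ref_*` and helper names are likewise assumptions made for this example.

```c
#include <stdint.h>

static int ov_flag;  /* hypothetical stand-in for the OV status bit */

/* Clamp to the Q15 range and record overflow. */
static uint16_t sat_q15(int32_t v)
{
    if (v > INT16_MAX)      { v = INT16_MAX; ov_flag = 1; }
    else if (v < INT16_MIN) { v = INT16_MIN; ov_flag = 1; }
    return (uint16_t)(int16_t)v;
}

/* Sign-extended 16-bit lane (0..3) of a 64-bit operand. */
static int32_t h16(uint64_t v, int lane) { return (int16_t)(uint16_t)(v >> (16 * lane)); }

static uint64_t ref_dkcras16(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int x = 0; x < 2; x++) {
        uint16_t top = sat_q15(h16(a, 2 * x + 1) + h16(b, 2 * x));  /* add */
        uint16_t bot = sat_q15(h16(a, 2 * x) - h16(b, 2 * x + 1));  /* sub */
        result |= (((uint64_t)top << 16) | bot) << (32 * x);
    }
    return result;
}

static uint64_t ref_dkcrsa16(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int x = 0; x < 2; x++) {
        uint16_t top = sat_q15(h16(a, 2 * x + 1) - h16(b, 2 * x));  /* sub */
        uint16_t bot = sat_q15(h16(a, 2 * x) + h16(b, 2 * x + 1));  /* add */
        result |= (((uint64_t)top << 16) | bot) << (32 * x);
    }
    return result;
}
```

In this model the flag is sticky: it is set when any lane saturates and is never cleared by the operation itself, so a caller would reset `ov_flag` before a sequence it wants to check.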

__STATIC_FORCEINLINE unsigned long long __RV_DRSUB16 (unsigned long long a, unsigned long long b)

DRSUB16 (16-bit Signed Halving Subtraction)

Type: SIMD

Syntax:

DRSUB16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element subtractions simultaneously. The results are halved to avoid overflow or saturation.

Description

:

This instruction subtracts the 16-bit signed integer elements in Rs2 from the 16-bit signed integer elements in Rs1. The results are first arithmetically right-shifted by 1 bit and then written to Rd.

Operations:

Rd.H[x] = (Rs1.H[x] - Rs2.H[x]) s>> 1;
x=3...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
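As a small illustration of the halving subtraction, the sketch below computes each 16-bit difference in a wider type and then shifts right by one, so the extreme case 32767 - (-32768) still fits. `ref_drsub16` is a hypothetical name, and an arithmetic `>>` on negative values is assumed (as with GCC and Clang).

```c
#include <stdint.h>

/* Hypothetical reference model, not the NMSIS implementation.
 * Assumption: >> on a negative int32_t shifts arithmetically. */
static uint64_t ref_drsub16(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int x = 0; x < 4; x++) {
        int32_t lhs = (int16_t)(uint16_t)(a >> (16 * x));  /* Rs1.H[x] */
        int32_t rhs = (int16_t)(uint16_t)(b >> (16 * x));  /* Rs2.H[x] */
        uint16_t halved = (uint16_t)((lhs - rhs) >> 1);    /* 17-bit diff, halved */
        result |= (uint64_t)halved << (16 * x);
    }
    return result;
}
```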

__STATIC_FORCEINLINE unsigned long long __RV_DSTSA32 (unsigned long long a, unsigned long long b)

DSTSA32 (32-bit Straight Subtraction & Addition)

Type: SIMD

Syntax:

DSTSA32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit integer element subtraction and 32-bit integer element addition in a 64-bit chunk simultaneously. Operands are from corresponding 32-bit elements.

Description

:

This instruction subtracts the 32-bit integer element in [63:32] of Rs2 from the 32-bit integer element in [63:32] of Rs1, and writes the result to [63:32] of Rd; at the same time, it adds the 32-bit integer element in [31:0] of Rs1 with the 32-bit integer element in [31:0] of Rs2, and writes the result to [31:0] of Rd.

Operations:

Rd.W[1] = Rs1.W[1] - Rs2.W[1];
Rd.W[0] = Rs1.W[0] + Rs2.W[0];

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSTAS32 (unsigned long long a, unsigned long long b)

DSTAS32 (SIMD 32-bit Straight Addition & Subtraction)

Type: SIMD

Syntax:

DSTAS32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit integer element addition and 32-bit integer element subtraction in a 64-bit chunk simultaneously. Operands are from corresponding 32-bit elements.

Description

:

This instruction adds the 32-bit integer element in [63:32] of Rs1 with the 32-bit integer element in [63:32] of Rs2, and writes the result to [63:32] of Rd; at the same time, it subtracts the 32-bit integer element in [31:0] of Rs2 from the 32-bit integer element in [31:0] of Rs1, and writes the result to [31:0] of Rd.

Operations:

Rd.W[1] = Rs1.W[1] + Rs2.W[1];
Rd.W[0] = Rs1.W[0] - Rs2.W[0];

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
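The two straight variants operate on corresponding words and wrap on overflow. A minimal portable C sketch, with hypothetical `ref_dstsa32` and `ref_dstas32` names, could look like this:

```c
#include <stdint.h>

/* Hypothetical reference models; unsigned arithmetic wraps mod 2^32. */
static uint64_t ref_dstsa32(uint64_t a, uint64_t b)
{
    uint32_t hi = (uint32_t)(a >> 32) - (uint32_t)(b >> 32);  /* W[1]: subtract */
    uint32_t lo = (uint32_t)a + (uint32_t)b;                  /* W[0]: add */
    return ((uint64_t)hi << 32) | lo;
}

static uint64_t ref_dstas32(uint64_t a, uint64_t b)
{
    uint32_t hi = (uint32_t)(a >> 32) + (uint32_t)(b >> 32);  /* W[1]: add */
    uint32_t lo = (uint32_t)a - (uint32_t)b;                  /* W[0]: subtract */
    return ((uint64_t)hi << 32) | lo;
}
```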

__STATIC_FORCEINLINE unsigned long long __RV_DKCRSA32 (unsigned long long a, unsigned long long b)

DKCRSA32 (32-bit Signed Saturating Cross Subtraction & Addition)

Type: SIMD

Syntax:

DKCRSA32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element saturating subtraction and 32-bit signed integer element saturating addition in a 64-bit chunk simultaneously. Operands are from crossed 32-bit elements.

Description

:

This instruction subtracts the 32-bit integer element in [31:0] of Rs2 from the 32-bit integer element in [63:32] of Rs1; at the same time, it adds the 32-bit integer element in [31:0] of Rs1 with the 32-bit integer element in [63:32] of Rs2. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [63:32] of Rd for subtraction and [31:0] of Rd for addition.

Operations:

res[1] = Rs1.W[1] - Rs2.W[0];
res[0] = Rs1.W[0] + Rs2.W[1];
if (res[x] > (2^31)-1) {
  res[x] = (2^31)-1;
  OV = 1;
} else if (res[x] < -2^31) {
  res[x] = -2^31;
  OV = 1;
}
Rd.W[1] = res[1];
Rd.W[0] = res[0];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKCRAS32 (unsigned long long a, unsigned long long b)

DKCRAS32 (32-bit Signed Saturating Cross Addition & Subtraction)

Type: SIMD

Syntax:

DKCRAS32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element saturating addition and 32-bit signed integer element saturating subtraction in a 64-bit chunk simultaneously. Operands are from crossed 32-bit elements.

Description

:

This instruction adds the 32-bit integer element in [63:32] of Rs1 with the 32-bit integer element in [31:0] of Rs2; at the same time, it subtracts the 32-bit integer element in [63:32] of Rs2 from the 32-bit integer element in [31:0] of Rs1. If any of the results are beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [63:32] of Rd for addition and [31:0] of Rd for subtraction.

Operations:

res[1] = Rs1.W[1] + Rs2.W[0];
res[0] = Rs1.W[0] - Rs2.W[1];
if (res[x] > (2^31)-1) {
  res[x] = (2^31)-1;
  OV = 1;
} else if (res[x] < -2^31) {
  res[x] = -2^31;
  OV = 1;
}
Rd.W[1] = res[1];
Rd.W[0] = res[0];
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
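For the 32-bit saturating cross operations, computing in 64 bits and clamping to the Q31 range gives a compact host-side model. The sketch follows the instruction names (CRAS adds into W[1], CRSA subtracts into W[1]); `ov_flag` is a hypothetical stand-in for the OV bit, and the other names are also assumptions for this example.

```c
#include <stdint.h>

static int ov_flag;  /* hypothetical stand-in for the OV status bit */

/* Clamp to the Q31 range and record overflow. */
static uint32_t sat_q31(int64_t v)
{
    if (v > INT32_MAX)      { v = INT32_MAX; ov_flag = 1; }
    else if (v < INT32_MIN) { v = INT32_MIN; ov_flag = 1; }
    return (uint32_t)(int32_t)v;
}

/* Sign-extended 32-bit word i (0 or 1) of a 64-bit operand. */
static int64_t w32(uint64_t v, int i) { return (int32_t)(uint32_t)(v >> (32 * i)); }

static uint64_t ref_dkcras32(uint64_t a, uint64_t b)
{
    uint32_t hi = sat_q31(w32(a, 1) + w32(b, 0));
    uint32_t lo = sat_q31(w32(a, 0) - w32(b, 1));
    return ((uint64_t)hi << 32) | lo;
}

static uint64_t ref_dkcrsa32(uint64_t a, uint64_t b)
{
    uint32_t hi = sat_q31(w32(a, 1) - w32(b, 0));
    uint32_t lo = sat_q31(w32(a, 0) + w32(b, 1));
    return ((uint64_t)hi << 32) | lo;
}
```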

__STATIC_FORCEINLINE unsigned long long __RV_DCRSA32 (unsigned long long a, unsigned long long b)

DCRSA32 (32-bit Cross Subtraction & Addition)

Type: SIMD

Syntax:

DCRSA32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit integer element subtraction and 32-bit integer element addition in a 64-bit chunk simultaneously. Operands are from crossed 32-bit elements.

Description

:

This instruction subtracts the 32-bit integer element in [31:0] of Rs2 from the 32-bit integer element in [63:32] of Rs1, and writes the result to [63:32] of Rd; at the same time, it adds the 32-bit integer element in [31:0] of Rs1 with the 32-bit integer element in [63:32] of Rs2, and writes the result to [31:0] of Rd.

Operations:

Rd.W[1] = Rs1.W[1] - Rs2.W[0];
Rd.W[0] = Rs1.W[0] + Rs2.W[1];

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DCRAS32 (unsigned long long a, unsigned long long b)

DCRAS32 (32-bit Cross Addition & Subtraction)

Type: SIMD

Syntax:

DCRAS32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit integer element addition and 32-bit integer element subtraction in a 64-bit chunk simultaneously. Operands are from crossed 32-bit elements.

Description

:

This instruction adds the 32-bit integer element in [63:32] of Rs1 with the 32-bit integer element in [31:0] of Rs2, and writes the result to [63:32] of Rd; at the same time, it subtracts the 32-bit integer element in [63:32] of Rs2 from the 32-bit integer element in [31:0] of Rs1, and writes the result to [31:0] of Rd.

Operations:

Rd.W[1] = Rs1.W[1] + Rs2.W[0];
Rd.W[0] = Rs1.W[0] - Rs2.W[1];

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
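The non-saturating cross variants differ from the straight ones only in which word of Rs2 feeds each result word. Going by the instruction names (CRSA: subtract into W[1], add into W[0]; CRAS: the opposite), a wrapping model can be sketched as below. The `ref_*` names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical reference models; unsigned arithmetic wraps mod 2^32. */
static uint64_t ref_dcrsa32(uint64_t a, uint64_t b)
{
    uint32_t hi = (uint32_t)(a >> 32) - (uint32_t)b;   /* Rs1.W[1] - Rs2.W[0] */
    uint32_t lo = (uint32_t)a + (uint32_t)(b >> 32);   /* Rs1.W[0] + Rs2.W[1] */
    return ((uint64_t)hi << 32) | lo;
}

static uint64_t ref_dcras32(uint64_t a, uint64_t b)
{
    uint32_t hi = (uint32_t)(a >> 32) + (uint32_t)b;   /* Rs1.W[1] + Rs2.W[0] */
    uint32_t lo = (uint32_t)a - (uint32_t)(b >> 32);   /* Rs1.W[0] - Rs2.W[1] */
    return ((uint64_t)hi << 32) | lo;
}
```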

__STATIC_FORCEINLINE unsigned long long __RV_DKSTSA16 (unsigned long long a, unsigned long long b)

DKSTSA16 (16-bit Signed Saturating Straight Subtraction & Addition)

Type: SIMD

Syntax:

DKSTSA16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element saturating subtraction and 16-bit signed integer element saturating addition in a 32-bit chunk simultaneously. Operands are from corresponding positions in 32-bit chunks.

Description

:

This instruction subtracts the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs2 from the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs1; at the same time, it adds the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs2 with the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32-bit chunks in Rd for subtraction and [15:0] of 32-bit chunks in Rd for addition.

Operations:

res1 = Rs1.W[x][31:16] - Rs2.W[x][31:16];
res2 = Rs1.W[x][15:0] + Rs2.W[x][15:0];
for (res in [res1, res2]) {
  if (res > (2^15)-1) {
    res = (2^15)-1;
    OV = 1;
  } else if (res < -2^15) {
    res = -2^15;
    OV = 1;
  }
}
Rd.W[x][31:16] = res1;
Rd.W[x][15:0] = res2;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKSTAS16 (unsigned long long a, unsigned long long b)

DKSTAS16 (16-bit Signed Saturating Straight Addition & Subtraction)

Type: SIMD

Syntax:

DKSTAS16 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 16-bit signed integer element saturating addition and 16-bit signed integer element saturating subtraction in a 32-bit chunk simultaneously. Operands are from corresponding positions in 32-bit chunks.

Description

:

This instruction adds the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs1 with the 16-bit signed integer element in [31:16] of 32-bit chunks in Rs2; at the same time, it subtracts the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs2 from the 16-bit signed integer element in [15:0] of 32-bit chunks in Rs1. If any of the results are beyond the Q15 number range (-2^15 <= Q15 <= 2^15-1), they are saturated to the range and the OV bit is set to 1. The saturated results are written to [31:16] of 32-bit chunks in Rd for addition and [15:0] of 32-bit chunks in Rd for subtraction.

Operations:

res1 = Rs1.W[x][31:16] + Rs2.W[x][31:16];
res2 = Rs1.W[x][15:0] - Rs2.W[x][15:0];
for (res in [res1, res2]) {
  if (res > (2^15)-1) {
    res = (2^15)-1;
    OV = 1;
  } else if (res < -2^15) {
    res = -2^15;
    OV = 1;
  }
}
Rd.W[x][31:16] = res1;
Rd.W[x][15:0] = res2;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DRSUB32 (unsigned long long a, unsigned long long b)

DRSUB32 (32-bit Signed Halving Subtraction)

Type: SIMD

Syntax:

DRSUB32 Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do 32-bit signed integer element subtractions simultaneously. The results are halved to avoid overflow or saturation.

Description

:

This instruction subtracts the 32-bit signed integer elements in Rs2 from the 32-bit signed integer elements in Rs1. The results are first arithmetically right-shifted by 1 bit and then written to Rd.

Operations:

Rd.W[x] = (Rs1.W[x] - Rs2.W[x]) s>> 1;
x=1...0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
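The halving step is what keeps the 33-bit difference inside a 32-bit element. A minimal host-side sketch in portable C, assuming W[1] sits in bits [63:32] of the 64-bit operand; `drsub32_model` is a hypothetical name, not an NMSIS function.

```c
#include <stdint.h>

/* Hypothetical host-side model of DRSUB32; not the NMSIS intrinsic itself. */
static uint64_t drsub32_model(uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int64_t ea = (int32_t)(uint32_t)(a >> (32 * x));
        int64_t eb = (int32_t)(uint32_t)(b >> (32 * x));
        int64_t d = ea - eb;          /* 33-bit difference, held exactly in 64 bits */
        /* floor(d / 2), which is what an arithmetic right shift by 1 produces */
        int64_t h = (d >= 0) ? d / 2 : -((-d + 1) / 2);
        rd |= (uint64_t)(uint32_t)h << (32 * x);
    }
    return rd;
}
```

The extreme case (2^31-1) - (-2^31) = 2^32-1 halves to 2^31-1, so the result always fits without saturation.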

group NMSIS_Core_DSP_Intrinsic_NUCLEI_N3

(RV32 only)Nuclei Customized N3 DSP Instructions

These are Nuclei customized N3 DSP instructions, available only on RV32.

Functions

__STATIC_FORCEINLINE unsigned long long __RV_DKMMAC (unsigned long long t, unsigned long long a, unsigned long long b)

DKMMAC (64-bit MSW 32x32 Signed Multiply and Saturating Add)

Type: SIMD

Syntax:

DKMMAC Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element signed multiplications and saturating addition simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the signed 32-bit elements of Rs1 with the signed 32-bit elements of Rs2 and adds the most significant 32-bit multiplication results with the signed 32-bit elements of Rd. If the addition result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32-bit of the 64-bit multiplication results by adding a 1 to bit 31 of the results.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
   res = sat.q31(dop + (aop s* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
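To make the "most significant word" step concrete, here is a minimal host-side sketch of the DKMMAC pseudocode in portable C. It assumes W[1] occupies bits [63:32] of each 64-bit operand and does not model the OV bit. The names `sat_q31` and `dkmmac_model` are hypothetical.

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate to the Q31 range [-2^31, 2^31-1]. */
static int32_t sat_q31(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Hypothetical host-side model of DKMMAC (OV flag not modelled). */
static uint64_t dkmmac_model(uint64_t t, uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        int64_t ea = (int32_t)(uint32_t)(a >> (32 * x));
        int64_t eb = (int32_t)(uint32_t)(b >> (32 * x));
        int64_t ed = (int32_t)(uint32_t)(t >> (32 * x));
        int64_t prod = ea * eb;                       /* full 64-bit signed product */
        /* bits [63:32] of the product, reinterpreted as a signed word */
        int64_t msw = (int32_t)(uint32_t)((uint64_t)prod >> 32);
        rd |= (uint64_t)(uint32_t)sat_q31(ed + msw) << (32 * x);
    }
    return rd;
}
```

In the first test case below, the lower element saturates (2^31-1 plus 2^28), while the upper element computes 5 + (-1) = 4 exactly.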

__STATIC_FORCEINLINE unsigned long long __RV_DKMMAC_U (unsigned long long t, unsigned long long a, unsigned long long b)

DKMMACU (64-bit MSW 32x32 Unsigned Multiply and Saturating Add)

Type: SIMD

Syntax:

DKMMACU Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element unsigned multiplications and saturating addition simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the signed 32-bit elements of Rs1 with the signed 32-bit elements of Rs2 and adds the most significant 32-bit multiplication results with the signed 32-bit elements of Rd. If the addition result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32-bit of the 64-bit multiplication results by adding a 1 to bit 31 of the results.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  res = sat.q31(dop + RUND(aop u* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMMSB (unsigned long long t, unsigned long long a, unsigned long long b)

DKMMSB (64-bit MSW 32x32 Signed Multiply and Saturating Sub)

Type: SIMD

Syntax:

DKMMSB Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element signed multiplications and saturating subtraction simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the signed 32-bit elements of Rs1 with the signed 32-bit elements of Rs2 and subtracts the most significant 32-bit multiplication results from the signed 32-bit elements of Rd. If the subtraction result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32-bit of the 64-bit multiplication results by adding a 1 to bit 31 of the results.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
   res = sat.q31(dop - (aop s* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMMSB_U (unsigned long long t, unsigned long long a, unsigned long long b)

DKMMSBU (64-bit MSW 32x32 Unsigned Multiply and Saturating Sub)

Type: SIMD

Syntax:

DKMMSBU Rd, Rs1, Rs2
# Rd, Rs1, Rs2 are all even/odd pair of registers

Purpose

:

Do MSW 32x32 element unsigned multiplications and saturating subtraction simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the signed 32-bit elements of Rs1 with the signed 32-bit elements of Rs2 and subtracts the most significant 32-bit multiplication results from the signed 32-bit elements of Rd. If the subtraction result is beyond the Q31 number range (-2^31 <= Q31 <= 2^31-1), it is saturated to the range and the OV bit is set to 1. The results after saturation are written to Rd. The .u form of the instruction additionally rounds up the most significant 32-bit of the 64-bit multiplication results by adding a 1 to bit 31 of the results.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom
for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
   res = sat.q31(dop - (aop u* bop)[63:32]);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMADA (unsigned long long t, unsigned long long a, unsigned long long b)

DKMADA (Saturating Signed Multiply Two Halfs and Two Adds)

Type: DSP

Syntax:

DKMADA Rd, Rs1, Rs2

Purpose

:

Do two 16x16 with 32-bit signed double addition simultaneously. The results are written into Rd.

Description

:

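It multiplies the bottom 16-bit content of 32-bit elements in Rs1 with the bottom 16-bit content of 32-bit elements in Rs2 and then adds the result to the result of multiplying the top 16-bit content of 32-bit elements in Rs1 with the top 16-bit content of 32-bit elements in Rs2. The sum of the two products is added to the corresponding 32-bit element of Rd, and the result is saturated to the Q31 range before it is written back to Rd.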

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop + mul1 + mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
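The DKMADA pseudocode amounts to a two-term 16-bit dot product accumulated into each 32-bit element. A minimal host-side sketch in portable C, assuming W[1] is held in bits [63:32]; the helpers `h16`, `sat_q31` and `dkmada_model` are hypothetical names, and the OV bit is not modelled.

```c
#include <stdint.h>

/* Sign-extend halfword i (0 = bottom, 1 = top) of a 32-bit word. */
static int64_t h16(uint32_t w, int i)
{
    return (int16_t)(uint16_t)(w >> (16 * i));
}

/* Clamp a 64-bit intermediate to the Q31 range [-2^31, 2^31-1]. */
static int32_t sat_q31(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Hypothetical host-side model of DKMADA (OV flag not modelled). */
static uint64_t dkmada_model(uint64_t t, uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t wa = (uint32_t)(a >> (32 * x));
        uint32_t wb = (uint32_t)(b >> (32 * x));
        int64_t  ed = (int32_t)(uint32_t)(t >> (32 * x));
        int64_t mul1 = h16(wa, 1) * h16(wb, 1);   /* top x top */
        int64_t mul2 = h16(wa, 0) * h16(wb, 0);   /* bottom x bottom */
        rd |= (uint64_t)(uint32_t)sat_q31(ed + mul1 + mul2) << (32 * x);
    }
    return rd;
}
```

The upper chunk in the test shows the only way two products alone can saturate: (-2^15)^2 + (-2^15)^2 = 2^31, one past the Q31 maximum.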

__STATIC_FORCEINLINE unsigned long long __RV_DKMAXDA (unsigned long long t, unsigned long long a, unsigned long long b)

DKMAXDA (Two Cross 16x16 with 32-bit Signed Double Add)

Type: DSP

Syntax:

DKMAXDA Rd, Rs1, Rs2

Purpose

:

Do two cross 16x16 with 32-bit signed double addition simultaneously. The results are written into Rd.

Description

:

It multiplies the top 16-bit content of 32-bit elements in Rs1 with the bottom 16-bit content of 32-bit elements in Rs2 and then adds the result to the result of multiplying the bottom 16-bit content of 32-bit elements in Rs1 with the top 16-bit content of 32-bit elements in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[0];
  mul2 = aop.H[0] s* bop.H[1];
  res = sat.q31(dop + mul1 + mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMADS (unsigned long long t, unsigned long long a, unsigned long long b)

DKMADS (Two 16x16 with 32-bit Signed Add and Sub)

Type: DSP

Syntax:

DKMADS Rd, Rs1, Rs2

Purpose

:

Do two 16x16 with 32-bit signed addition and subtraction simultaneously. The results are written into Rd.

Description

:

It multiplies the bottom 16-bit content of 32-bit elements in Rs1 with the bottom 16-bit content of 32-bit elements in Rs2 and then subtracts the result from the result of multiplying the top 16-bit content of 32-bit elements in Rs1 with the top 16-bit content of 32-bit elements in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop + mul1 - mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMADRS (unsigned long long t, unsigned long long a, unsigned long long b)

DKMADRS (Two 16x16 with 32-bit Signed Add and Reversed Sub)

Type: DSP

Syntax:

DKMADRS Rd, Rs1, Rs2

Purpose

:

Do two 16x16 with 32-bit signed addition and reversed subtraction simultaneously. The results are written into Rd.

Description

:

It multiplies the top 16-bit content of 32-bit elements in Rs1 with the top 16-bit content of 32-bit elements in Rs2 and then subtracts the result from the result of multiplying the bottom 16-bit content of 32-bit elements in Rs1 with the bottom 16-bit content of 32-bit elements in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop - mul1 + mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMAXDS (unsigned long long t, unsigned long long a, unsigned long long b)

DKMAXDS (Saturating Signed Crossed Multiply Two Halfs & Subtract & Add)

Type: DSP

Syntax:

DKMAXDS Rd, Rs1, Rs2

Purpose

:

Do two cross 16x16 with 32-bit signed addition and subtraction simultaneously. The results are written into Rd.

Description

:

Do two signed 16-bit multiplications from 32-bit elements in two registers; and then perform a subtraction operation between the two 32-bit results. Then add the subtraction result to the corresponding 32-bit elements in a third register. The addition result may be saturated.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[0];
  mul2 = aop.H[0] s* bop.H[1];
  res = sat.q31(dop + mul1 - mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
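The crossed halfword pairing in DKMAXDS is easy to get backwards, so a small reference model can help when checking results. This is a minimal sketch in portable C that follows the Operations pseudocode, assuming W[1] is held in bits [63:32]; `h16`, `sat_q31` and `dkmaxds_model` are hypothetical names, and the OV bit is not modelled.

```c
#include <stdint.h>

/* Sign-extend halfword i (0 = bottom, 1 = top) of a 32-bit word. */
static int64_t h16(uint32_t w, int i)
{
    return (int16_t)(uint16_t)(w >> (16 * i));
}

/* Clamp a 64-bit intermediate to the Q31 range [-2^31, 2^31-1]. */
static int32_t sat_q31(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Hypothetical host-side model of DKMAXDS (OV flag not modelled). */
static uint64_t dkmaxds_model(uint64_t t, uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t wa = (uint32_t)(a >> (32 * x));
        uint32_t wb = (uint32_t)(b >> (32 * x));
        int64_t  ed = (int32_t)(uint32_t)(t >> (32 * x));
        int64_t mul1 = h16(wa, 1) * h16(wb, 0);   /* top of a x bottom of b */
        int64_t mul2 = h16(wa, 0) * h16(wb, 1);   /* bottom of a x top of b */
        rd |= (uint64_t)(uint32_t)sat_q31(ed + mul1 - mul2) << (32 * x);
    }
    return rd;
}
```

For the lower chunk of the test, 10 + 2*5 - 3*4 = 8; the upper chunk overflows the Q31 range and is clamped to 0x7FFFFFFF.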

__STATIC_FORCEINLINE unsigned long long __RV_DKMSDA (unsigned long long t, unsigned long long a, unsigned long long b)

DKMSDA (Two 16x16 with 32-bit Signed Double Sub)

Type: DSP

Syntax:

DKMSDA Rd, Rs1, Rs2

Purpose

:

Do two 16x16 with 32-bit signed double subtraction simultaneously. The results are written into Rd.

Description

:

It multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2 and multiplies the top 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2. Both products are then subtracted from the corresponding 32-bit elements of Rd, and the results are saturated to the Q31 range.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[1];
  mul2 = aop.H[0] s* bop.H[0];
  res = sat.q31(dop - mul1 - mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DKMSXDA (unsigned long long t, unsigned long long a, unsigned long long b)

DKMSXDA (Two Cross 16x16 with 32-bit Signed Double Sub)

Type: DSP

Syntax:

DKMSXDA Rd, Rs1, Rs2

Purpose

:

Do two cross 16x16 with 32-bit signed double subtraction simultaneously. The results are written into Rd.

Description

:

It multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2 and multiplies the top 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2. Both products are then subtracted from the corresponding 32-bit elements of Rd, and the results are saturated to the Q31 range.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  mul1 = aop.H[1] s* bop.H[0];
  mul2 = aop.H[0] s* bop.H[1];
  res = sat.q31(dop - mul1 - mul2);
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DSMAQA (unsigned long long t, unsigned long long a, unsigned long long b)

DSMAQA (Four Signed 8x8 with 32-bit Signed Add)

Type: DSP

Syntax:

DSMAQA Rd, Rs1, Rs2

Purpose

:

Do four signed 8x8 with 32-bit signed addition simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the four signed 8-bit elements of 32-bit chunks of Rs1 with the four signed 8-bit elements of 32-bit chunks of Rs2 and then adds the four results together with the signed content of the corresponding 32-bit chunks of Rd. The final results are written back to the corresponding 32-bit chunks in Rd.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  m0 = aop.B[0] s* bop.B[0];
  m1 = aop.B[1] s* bop.B[1];
  m2 = aop.B[2] s* bop.B[2];
  m3 = aop.B[3] s* bop.B[3];
  res = dop + m0 + m1 + m2 + m3;
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
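DSMAQA is effectively a four-element signed byte dot product per 32-bit chunk. The following minimal sketch in portable C mirrors the pseudocode, assuming W[1] is held in bits [63:32]. Because the pseudocode shows no saturation, the model assumes the 32-bit accumulation wraps on overflow; that is an assumption, not something the text states. `dsmaqa_model` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical host-side model of DSMAQA. The 32-bit accumulation is
 * assumed to wrap, since the pseudocode applies no saturation. */
static uint64_t dsmaqa_model(uint64_t t, uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (int x = 0; x < 2; x++) {
        uint32_t wa  = (uint32_t)(a >> (32 * x));
        uint32_t wb  = (uint32_t)(b >> (32 * x));
        uint32_t acc = (uint32_t)(t >> (32 * x));
        for (int i = 0; i < 4; i++) {
            int32_t ba = (int8_t)(uint8_t)(wa >> (8 * i));   /* signed byte i of a */
            int32_t bb = (int8_t)(uint8_t)(wb >> (8 * i));   /* signed byte i of b */
            acc += (uint32_t)(ba * bb);   /* unsigned add gives two's-complement wrap */
        }
        rd |= (uint64_t)acc << (32 * x);
    }
    return rd;
}
```

In the test, the lower chunk accumulates 5 + (4 + 3 + 2 + 1) = 15 and the upper chunk yields (-1) * 127 = -127, stored as 0xFFFFFF81.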

__STATIC_FORCEINLINE unsigned long long __RV_DSMAQA_SU (unsigned long long t, unsigned long long a, unsigned long long b)

DSMAQASU (Four Signed 8 x Unsigned 8 with 32-bit Signed Add)

Type: DSP

Syntax:

DSMAQASU Rd, Rs1, Rs2

Purpose

:

Do four signed 8 x unsigned 8 with 32-bit signed addition simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the four signed 8-bit elements of 32-bit chunks of Rs1 with the four unsigned 8-bit elements of 32-bit chunks of Rs2 and then adds the four results together with the signed content of the corresponding 32-bit chunks of Rd. The final results are written back to the corresponding 32-bit chunks in Rd.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  m0 = aop.B[0] su* bop.B[0];
  m1 = aop.B[1] su* bop.B[1];
  m2 = aop.B[2] su* bop.B[2];
  m3 = aop.B[3] su* bop.B[3];
  res = dop + m0 + m1 + m2 + m3;
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE unsigned long long __RV_DUMAQA (unsigned long long t, unsigned long long a, unsigned long long b)

DUMAQA (Four Unsigned 8x8 with 32-bit Unsigned Add)

Type: DSP

Syntax:

DUMAQA Rd, Rs1, Rs2

Purpose

:

Do four unsigned 8x8 with 32-bit unsigned addition simultaneously. The results are written into Rd.

Description

:

This instruction multiplies the four unsigned 8-bit elements of 32-bit chunks of Rs1 with the four unsigned 8-bit elements of 32-bit chunks of Rs2 and then adds the four results together with the unsigned content of the corresponding 32-bit chunks of Rd. The final results are written back to the corresponding 32-bit chunks in Rd.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; op3t = Rd.W[x+1] // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; op3b = Rd.W[x] // bottom

for ((aop,bop,dop,res) in [(op1t,op2t,op3t,rest), (op1b,op2b,op3b,resb)]) {
  m0 = aop.B[0] u* bop.B[0];
  m1 = aop.B[1] u* bop.B[1];
  m2 = aop.B[2] u* bop.B[2];
  m3 = aop.B[3] u* bop.B[3];
  res = dop + m0 + m1 + m2 + m3;
}
Rd = concat(rest, resb);
x=0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type

__STATIC_FORCEINLINE long long __RV_DKMDA32 (unsigned long long a, unsigned long long b)

DKMDA32 (Two Signed 32x32 with 64-bit Saturation Add)

Type: DSP

Syntax:

DKMDA32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications and add the two products with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2 and then adds the result to the result of multiplying the top 32-bit element of Rs1 with the top 32-bit element of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(t0 + t1);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
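As described in the Purpose text, DKMDA32 sums two signed 32x32 products with Q63 saturation. The minimal sketch below, in portable C, shows that this saturation can trigger in only one situation. It assumes the top element is in bits [63:32]; `dkmda32_model` is a hypothetical name and the OV bit is not modelled.

```c
#include <stdint.h>

/* Hypothetical host-side model of DKMDA32 (OV flag not modelled). */
static int64_t dkmda32_model(uint64_t a, uint64_t b)
{
    int64_t a_lo = (int32_t)(uint32_t)a, a_hi = (int32_t)(uint32_t)(a >> 32);
    int64_t b_lo = (int32_t)(uint32_t)b, b_hi = (int32_t)(uint32_t)(b >> 32);
    int64_t t0 = a_lo * b_lo;        /* bottom x bottom */
    int64_t t1 = a_hi * b_hi;        /* top x top */
    /* Each product lies in [-2^62 + 2^31, 2^62], so the sum can only leave
     * the Q63 range upwards, when both products are exactly 2^62. */
    if (t1 > 0 && t0 > INT64_MAX - t1) return INT64_MAX;
    return t0 + t1;
}
```

Only the input where all four 32-bit elements equal -2^31 reaches the saturated value; every other input is summed exactly.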

__STATIC_FORCEINLINE long long __RV_DKMXDA32 (unsigned long long a, unsigned long long b)

DKMXDA32 (Two Cross Signed 32x32 with 64-bit Saturation Add)

Type: DSP

Syntax:

DKMXDA32 Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications and add the two products with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element of Rs1 with the top 32-bit element of Rs2 and then adds the result to the result of multiplying the top 32-bit element of Rs1 with the bottom 32-bit element of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = sat.q63(t01 + t10);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMADA32 (long long t, unsigned long long a, unsigned long long b)

DKMADA32 (Two Signed 32x32 with 64-bit Saturation Add)

Type: DSP

Syntax:

DKMADA32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications and add the two products and a third register with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2 and then adds the result to the result of multiplying the top 32-bit element of Rs1 with the top 32-bit element of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2b;
t10 = op1t s* op2t;
Rd = sat.q63(Rd + t01 + t10);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMAXDA32 (long long t, unsigned long long a, unsigned long long b)

DKMAXDA32 (Two Cross Signed 32x32 with 64-bit Saturation Add)

Type: DSP

Syntax:

DKMAXDA32 Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications and add the two products and a third register with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the top 32-bit element in Rs1 with the bottom 32-bit element in Rs2 and then adds the result to the result of multiplying the bottom 32-bit element in Rs1 with the top 32-bit element in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = sat.q63(Rd + t01 + t10);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
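The accumulating 32x32 forms add three 64-bit terms before saturating, so a model has to keep more than 64 bits of intermediate precision. The minimal sketch below, in portable C, assumes the sum Rd + t01 + t10 is formed exactly and saturated once at the end, which is how the pseudocode reads but is still an assumption about the hardware. `sat_add3_q63` and `dkmaxda32_model` are hypothetical names; the OV bit is not modelled.

```c
#include <stdint.h>

/* Add three signed 64-bit values exactly (as a 128-bit hi:lo pair) and
 * saturate the result to the Q63 range once, at the end. */
static int64_t sat_add3_q63(int64_t p, int64_t q, int64_t r)
{
    const int64_t v[3] = { p, q, r };
    uint64_t lo = 0;
    int64_t  hi = 0;
    for (int i = 0; i < 3; i++) {
        uint64_t s = lo + (uint64_t)v[i];
        hi += (v[i] < 0 ? -1 : 0) + (s < lo ? 1 : 0);   /* sign extension plus carry */
        lo = s;
    }
    if (hi == 0 && (lo >> 63) == 0) return (int64_t)lo;            /* fits, non-negative */
    if (hi == -1 && (lo >> 63) == 1) return -(int64_t)(~lo) - 1;   /* fits, negative */
    return (hi < 0) ? INT64_MIN : INT64_MAX;
}

/* Hypothetical host-side model of DKMAXDA32 (OV flag not modelled). */
static int64_t dkmaxda32_model(int64_t t, uint64_t a, uint64_t b)
{
    int64_t a_lo = (int32_t)(uint32_t)a, a_hi = (int32_t)(uint32_t)(a >> 32);
    int64_t b_lo = (int32_t)(uint32_t)b, b_hi = (int32_t)(uint32_t)(b >> 32);
    int64_t t01 = a_lo * b_hi;       /* bottom of a x top of b */
    int64_t t10 = a_hi * b_lo;       /* top of a x bottom of b */
    return sat_add3_q63(t, t01, t10);
}
```

The third test case is the reason for the wide intermediate: t = -2^63 plus two products of 2^62 sums to exactly 0, even though adding the two products first would overflow 64 bits.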

__STATIC_FORCEINLINE long long __RV_DKMADS32 (long long t, unsigned long long a, unsigned long long b)

DKMADS32 (Two Signed 32x32 with 64-bit Saturation Add and Sub)

Type: DSP

Syntax:

DKMADS32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications, add the top product, subtract the bottom product, and add a third register with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element in Rs1 with the bottom 32-bit element in Rs2 and then subtracts the result from the result of multiplying the top 32-bit element in Rs1 with the top 32-bit element in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd - t0 + t1);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMADRS32 (long long t, unsigned long long a, unsigned long long b)

DKMADRS32 (Two Signed 32x32 with 64-bit Saturation Reversed Add and Sub)

Type: DSP

Syntax:

DKMADRS32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications, subtract the top product, add the bottom product, and add a third register with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the top 32-bit element in Rs1 with the top 32-bit element in Rs2 and then subtracts the result from the result of multiplying the bottom 32-bit element in Rs1 with the bottom 32-bit element in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom
t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd + t0 - t1);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMAXDS32 (long long t, unsigned long long a, unsigned long long b)

DKMAXDS32 (Two Cross Signed 32x32 with 64-bit Saturation Add and Sub)

Type: DSP

Syntax:

DKMAXDS32 Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications, add the product of the top element of Rs1 and the bottom element of Rs2, subtract the product of the bottom element of Rs1 and the top element of Rs2, and add a third register with Q63 saturation. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element in Rs1 with the top 32-bit element in Rs2 and then subtracts the result from the result of multiplying the top 32-bit element in Rs1 with the bottom 32-bit element in Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = sat.q63(Rd - t01 + t10);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMSDA32 (long long t, unsigned long long a, unsigned long long b)

DKMSDA32 (Two Signed 32x32 with 64-bit Saturation Sub)

Type: DSP

Syntax:

DKMSDA32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications and subtract both products from a third register with Q63 saturation. The results are written into Rd.

Description

:

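It multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2 and multiplies the top 32-bit element of Rs1 with the top 32-bit element of Rs2. Both products are then subtracted from the 64-bit value in Rd, and the result is saturated to the Q63 range.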

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = sat.q63(Rd - t0 - t1);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMSXDA32 (long long t, unsigned long long a, unsigned long long b)

DKMSXDA32 (Two Cross Signed 32x32 with 64-bit Saturation Sub)

Type: DSP

Syntax:

DKMSXDA32 Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications and subtract both products from a third register with Q63 saturation. The results are written into Rd.

Description

:

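It multiplies the bottom 32-bit element of Rs1 with the top 32-bit element of Rs2 and multiplies the top 32-bit element of Rs1 with the bottom 32-bit element of Rs2. Both products are then subtracted from the 64-bit value in Rd, and the result is saturated to the Q63 range.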

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t0 = op1b s* op2t;
t1 = op1t s* op2b;
Rd = sat.q63(Rd - t0 - t1);
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMDS32 (unsigned long long a, unsigned long long b)

DSMDS32 (Two Signed 32x32 with 64-bit Sub)

Type: DSP

Syntax:

DSMDS32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications and subtract the bottom product from the top product. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2 and then subtracts the result from the result of multiplying the top 32-bit element of Rs1 with the top 32-bit element of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = t1 - t0;
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
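Unlike the DKM* forms, DSMDS32 has no saturation step, and a short range argument shows why none is needed. The minimal sketch below, in portable C, follows the Description text (bottom x bottom subtracted from top x top) and assumes the top element is in bits [63:32]. `dsmds32_model` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical host-side model of DSMDS32, following the Description text. */
static int64_t dsmds32_model(uint64_t a, uint64_t b)
{
    int64_t a_lo = (int32_t)(uint32_t)a, a_hi = (int32_t)(uint32_t)(a >> 32);
    int64_t b_lo = (int32_t)(uint32_t)b, b_hi = (int32_t)(uint32_t)(b >> 32);
    int64_t t0 = a_lo * b_lo;        /* bottom x bottom */
    int64_t t1 = a_hi * b_hi;        /* top x top */
    /* Both products lie in [-2^62 + 2^31, 2^62], so the difference lies in
     * [-2^63 + 2^31, 2^63 - 2^31] and always fits in 64 bits. */
    return t1 - t0;
}
```

The second test case hits the upper bound of that range, 2^63 - 2^31.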

__STATIC_FORCEINLINE long long __RV_DSMDRS32 (unsigned long long a, unsigned long long b)

DSMDRS32 (Two Signed 32x32 with 64-bit Reversed Sub)

Type: DSP

Syntax:

DSMDRS32 Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications and subtract the top product from the bottom product. The results are written into Rd.

Description

:

It multiplies the top 32-bit element of Rs1 with the top 32-bit element of Rs2 and then subtracts the result from the result of multiplying the bottom 32-bit element of Rs1 with the bottom 32-bit element of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t0 = op1b s* op2b;
t1 = op1t s* op2t;
Rd = t0 - t1;
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMXDS32 (unsigned long long a, unsigned long long b)

DSMXDS32 (Two Cross Signed 32x32 with 64-bit Sub)

Type: DSP

Syntax:

DSMXDS32 Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications and subtract the product of the bottom element of Rs1 and the top element of Rs2 from the product of the top element of Rs1 and the bottom element of Rs2. The results are written into Rd.

Description

:

It multiplies the bottom 32-bit element of Rs1 with the top 32-bit element of Rs2 and then subtracts the result from the result of multiplying the top 32-bit element of Rs1 with the bottom 32-bit element of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

t01 = op1b s* op2t;
t10 = op1t s* op2b;
Rd = t10 - t01;
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMALDA (long long t, unsigned long long a, unsigned long long b)

DSMALDA (Four Signed 16x16 with 64-bit Add)

Type: DSP

Syntax:

DSMALDA Rd, Rs1, Rs2

Purpose

:

Do four signed 16x16 and add signed multiplication results and a third register. The results are written into Rd.

Description

:

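It multiplies the bottom 16-bit content of each 32-bit element of Rs1 with the bottom 16-bit content of the corresponding 32-bit element of Rs2 and then adds the result to the result of multiplying the top 16-bit content of each 32-bit element of Rs1 with the top 16-bit content of the corresponding 32-bit element of Rs2, with unlimited precision. The four products are added to the 64-bit value in Rd.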

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[0] s* op2b.H[0];
m1 = op1b.H[1] s* op2b.H[1];
m2 = op1t.H[0] s* op2t.H[0];
m3 = op1t.H[1] s* op2t.H[1];

Rd = Rd + m0 + m1 + m2 + m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
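DSMALDA behaves like a four-term 16-bit dot product accumulated into a 64-bit value. A minimal host-side sketch in portable C, assuming W[1] is held in bits [63:32] and that the 64-bit accumulation wraps on overflow (the pseudocode shows no saturation, so wrapping is an assumption). `h16` and `dsmalda_model` are hypothetical names.

```c
#include <stdint.h>

/* Sign-extend halfword i (0 = bottom, 1 = top) of a 32-bit word. */
static int64_t h16(uint32_t w, int i)
{
    return (int16_t)(uint16_t)(w >> (16 * i));
}

/* Hypothetical host-side model of DSMALDA; the 64-bit accumulation is
 * assumed to wrap because the pseudocode applies no saturation. */
static int64_t dsmalda_model(int64_t t, uint64_t a, uint64_t b)
{
    int64_t sum = 0;    /* four products of at most 2^30 each: cannot overflow */
    for (int x = 0; x < 2; x++) {
        uint32_t wa = (uint32_t)(a >> (32 * x));
        uint32_t wb = (uint32_t)(b >> (32 * x));
        sum += h16(wa, 0) * h16(wb, 0) + h16(wa, 1) * h16(wb, 1);
    }
    return (int64_t)((uint64_t)t + (uint64_t)sum);   /* wrapping add */
}
```

With halfwords (1, 2, 3, 4) in a and (5, 6, 7, 8) in b, the dot product is 5 + 12 + 21 + 32 = 70, so an accumulator of -100 becomes -30.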

__STATIC_FORCEINLINE long long __RV_DSMALXDA (long long t, unsigned long long a, unsigned long long b)

DSMALXDA (Four Cross Signed 16x16 with 64-bit Add)

Type: DSP

Syntax:

DSMALXDA Rd, Rs1, Rs2

Purpose

:

Do four cross signed 16x16 multiplications and add the products to a third register. The result is written into Rd.

Description

:

It multiplies the top 16-bit content of Rs1 with the bottom 16-bit content of Rs2 and then adds the result to the result of multiplying the bottom 16-bit content of Rs1 with the top 16-bit content of Rs2 with unlimited precision.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[0] s* op2b.H[1];
m1 = op1b.H[1] s* op2b.H[0];
m2 = op1t.H[0] s* op2t.H[1];
m3 = op1t.H[1] s* op2t.H[0];

Rd = Rd + m0 + m1 + m2 + m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
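
The crossed pairing in the Operations block amounts to multiplying halfword i of one operand with halfword i XOR 1 of the other. A minimal sketch under that reading, with hypothetical names `half_of` and `ref_dsmalxda` and an assumed wraparound on accumulator overflow:

```c
#include <stdint.h>

/* Signed 16-bit halfword i (0..3) of a 64-bit value; index 0 is least significant. */
static int16_t half_of(uint64_t v, unsigned i)
{
    return (int16_t)(uint16_t)(v >> (16u * i));
}

/* Hypothetical reference model of DSMALXDA: each halfword of a is
 * multiplied with its neighbor in the same 32-bit word of b. */
static int64_t ref_dsmalxda(int64_t t, uint64_t a, uint64_t b)
{
    int64_t sum = 0;
    for (unsigned i = 0; i < 4; i++) {
        sum += (int32_t)half_of(a, i) * (int32_t)half_of(b, i ^ 1u); /* m0..m3 */
    }
    /* Assumption: the accumulator wraps modulo 2^64 on overflow. */
    return (int64_t)((uint64_t)t + (uint64_t)sum);
}
```

For halfwords (1, 2, 3, 4) in a and (5, 6, 7, 8) in b, the crossed products are 1*6, 2*5, 3*8 and 4*7, which sum to 68.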

__STATIC_FORCEINLINE long long __RV_DSMALDS (long long t, unsigned long long a, unsigned long long b)

DSMALDS (Four Signed 16x16 with 64-bit Add and Sub)

Type: DSP

Syntax:

DSMALDS Rd, Rs1, Rs2

Purpose

:

Do four signed 16x16 multiplications, then add the top products to and subtract the bottom products from a third register. The result is written into Rd.

Description

:

It multiplies the bottom 16-bit content of Rs1 with the bottom 16-bit content of Rs2 and then subtracts the result from the result of multiplying the top 16-bit content of Rs1 with the top 16-bit content of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[1] s* op2b.H[1];
m1 = op1b.H[0] s* op2b.H[0];
m2 = op1t.H[1] s* op2t.H[1];
m3 = op1t.H[0] s* op2t.H[0];

Rd = Rd + m0 - m1 + m2 - m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMALDRS (long long t, unsigned long long a, unsigned long long b)

DSMALDRS (Four Signed 16x16 with 64-bit Add and Reversed Sub)

Type: DSP

Syntax:

DSMALDRS Rd, Rs1, Rs2

Purpose

:

Do four signed 16x16 multiplications, then add the bottom products to and subtract the top products from a third register (the reverse of DSMALDS). The result is written into Rd.

Description

:

It multiplies the top 16-bit content of Rs1 with the top 16-bit content of Rs2 and then subtracts the result from the result of multiplying the bottom 16-bit content of Rs1 with the bottom 16-bit content of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[0] s* op2b.H[0];
m1 = op1b.H[1] s* op2b.H[1];
m2 = op1t.H[0] s* op2t.H[0];
m3 = op1t.H[1] s* op2t.H[1];

Rd = Rd + m0 - m1 + m2 - m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
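
DSMALDS and DSMALDRS differ only in the sign of the per-word difference, which the following portable C sketch makes explicit. The names `half_of`, `top_minus_bottom`, `ref_dsmalds` and `ref_dsmaldrs` are hypothetical, and wraparound on accumulator overflow is an assumption.

```c
#include <stdint.h>

/* Signed 16-bit halfword i (0..3) of a 64-bit value; index 0 is least significant. */
static int16_t half_of(uint64_t v, unsigned i)
{
    return (int16_t)(uint16_t)(v >> (16u * i));
}

/* Sum over both 32-bit words of (top*top - bottom*bottom). */
static int64_t top_minus_bottom(uint64_t a, uint64_t b)
{
    int64_t d = 0;
    for (unsigned w = 0; w < 2; w++) {
        int32_t bottom = (int32_t)half_of(a, 2u * w) * half_of(b, 2u * w);
        int32_t top = (int32_t)half_of(a, 2u * w + 1u) * half_of(b, 2u * w + 1u);
        d += (int64_t)top - bottom;
    }
    return d;
}

/* Hypothetical reference models; the accumulator is assumed to wrap. */
static int64_t ref_dsmalds(int64_t t, uint64_t a, uint64_t b)
{
    return (int64_t)((uint64_t)t + (uint64_t)top_minus_bottom(a, b));
}

static int64_t ref_dsmaldrs(int64_t t, uint64_t a, uint64_t b)
{
    return (int64_t)((uint64_t)t - (uint64_t)top_minus_bottom(a, b));
}
```

For halfwords (1, 2, 3, 4) in a and (5, 6, 7, 8) in b, the difference is (12 - 5) + (32 - 21) = 18, so with t = 0 the two models return 18 and -18.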

__STATIC_FORCEINLINE long long __RV_DSMALXDS (long long t, unsigned long long a, unsigned long long b)

DSMALXDS (Four Cross Signed 16x16 with 64-bit Add and Sub)

Type: DSP

Syntax:

DSMALXDS Rd, Rs1, Rs2

Purpose

:

Do four cross signed 16x16 multiplications, then add the top-by-bottom products to and subtract the bottom-by-top products from a third register. The result is written into Rd.

Description

:

It multiplies the bottom 16-bit content of Rs1 with the top 16-bit content of Rs2 and then subtracts the result from the result of multiplying the top 16-bit content of Rs1 with the bottom 16-bit content of Rs2.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[1] s* op2b.H[0];
m1 = op1b.H[0] s* op2b.H[1];
m2 = op1t.H[1] s* op2t.H[0];
m3 = op1t.H[0] s* op2t.H[1];

Rd = Rd + m0 - m1 + m2 - m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMSLDA (long long t, unsigned long long a, unsigned long long b)

DSMSLDA (Four Signed 16x16 with 64-bit Sub)

Type: DSP

Syntax:

DSMSLDA Rd, Rs1, Rs2

Purpose

:

Do four signed 16x16 multiplications and subtract the products from a third register. The result is written into Rd.

Description

:

For each 32-bit word, it multiplies the bottom 16-bit content of Rs1 with the bottom 16-bit content of Rs2 and multiplies the top 16-bit content of Rs1 with the top 16-bit content of Rs2. All four products are subtracted from the 64-bit value of Rd.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[0] s* op2b.H[0];
m1 = op1b.H[1] s* op2b.H[1];
m2 = op1t.H[0] s* op2t.H[0];
m3 = op1t.H[1] s* op2t.H[1];

Rd = Rd - m0 - m1 - m2 - m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMSLXDA (long long t, unsigned long long a, unsigned long long b)

DSMSLXDA (Four Cross Signed 16x16 with 64-bit Sub)

Type: DSP

Syntax:

DSMSLXDA Rd, Rs1, Rs2

Purpose

:

Do four cross signed 16x16 multiplications and subtract the products from a third register. The result is written into Rd.

Description

:

For each 32-bit word, it multiplies the top 16-bit content of Rs1 with the bottom 16-bit content of Rs2 and multiplies the bottom 16-bit content of Rs1 with the top 16-bit content of Rs2. All four products are subtracted from the 64-bit value of Rd.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.H[0] s* op2b.H[1];
m1 = op1b.H[1] s* op2b.H[0];
m2 = op1t.H[0] s* op2t.H[1];
m3 = op1t.H[1] s* op2t.H[0];

Rd = Rd - m0 - m1 - m2 - m3;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DDSMAQA (long long t, unsigned long long a, unsigned long long b)

DDSMAQA (Eight Signed 8x8 with 64-bit Add)

Type: DSP

Syntax:

DDSMAQA Rd, Rs1, Rs2

Purpose

:

Do eight signed 8x8 multiplications and add the products to a third register. The result is written into Rd.

Description

:

Do eight signed 8-bit multiplications from eight 8-bit chunks of two registers, and then add the eight 16-bit results to the 64-bit content of a third register.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.B[0] s* op2b.B[0];
m1 = op1b.B[1] s* op2b.B[1];
m2 = op1b.B[2] s* op2b.B[2];
m3 = op1b.B[3] s* op2b.B[3];
m4 = op1t.B[0] s* op2t.B[0];
m5 = op1t.B[1] s* op2t.B[1];
m6 = op1t.B[2] s* op2t.B[2];
m7 = op1t.B[3] s* op2t.B[3];

s0 = m0 + m1 + m2 + m3;
s1 = m4 + m5 + m6 + m7;
Rd = Rd + s0 + s1;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
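
The byte-wise multiply-accumulate can be sketched in portable C as a signed dot product of eight bytes. `sbyte_of` and `ref_ddsmaqa` are hypothetical names, and wraparound on accumulator overflow is an assumption not stated above.

```c
#include <stdint.h>

/* Signed byte i (0..7) of a 64-bit value; index 0 is least significant. */
static int8_t sbyte_of(uint64_t v, unsigned i)
{
    return (int8_t)(uint8_t)(v >> (8u * i));
}

/* Hypothetical reference model of DDSMAQA: t + sum of eight signed 8x8 products. */
static int64_t ref_ddsmaqa(int64_t t, uint64_t a, uint64_t b)
{
    int32_t sum = 0; /* at most 8 * 128 * 128, so 32 bits are enough */
    for (unsigned i = 0; i < 8; i++) {
        sum += (int32_t)sbyte_of(a, i) * (int32_t)sbyte_of(b, i); /* m0..m7 */
    }
    /* Assumption: the accumulator wraps modulo 2^64 on overflow. */
    return (int64_t)((uint64_t)t + (uint64_t)(int64_t)sum);
}
```

With bytes 1 through 8 in a and every byte of b equal to 2, the sum is 2 * 36 = 72.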

__STATIC_FORCEINLINE long long __RV_DDSMAQASU (long long t, unsigned long long a, unsigned long long b)

DDSMAQASU (Eight Signed 8 x Unsigned 8 with 64-bit Add)

Type: DSP

Syntax:

DDSMAQASU Rd, Rs1, Rs2

Purpose

:

Do eight signed 8 x unsigned 8 multiplications and add the products to a third register. The result is written into Rd.

Description

:

Do eight signed 8-bit by unsigned 8-bit multiplications from eight 8-bit chunks of two registers, and then add the eight 16-bit results to the 64-bit content of a third register.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.B[0] su* op2b.B[0];
m1 = op1b.B[1] su* op2b.B[1];
m2 = op1b.B[2] su* op2b.B[2];
m3 = op1b.B[3] su* op2b.B[3];
m4 = op1t.B[0] su* op2t.B[0];
m5 = op1t.B[1] su* op2t.B[1];
m6 = op1t.B[2] su* op2t.B[2];
m7 = op1t.B[3] su* op2t.B[3];

s0 = m0 + m1 + m2 + m3;
s1 = m4 + m5 + m6 + m7;
Rd = Rd + s0 + s1;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
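
The mixed-sign variant differs from the fully signed one only in how the bytes of the second operand are interpreted. The sketch below assumes that the bytes of Rs1 (`a`) are signed and the bytes of Rs2 (`b`) are unsigned, as suggested by the "signed 8 x unsigned 8" wording and the operand order of `su*`. `ref_ddsmaqasu` and its helpers are hypothetical names, and wraparound on accumulator overflow is likewise an assumption.

```c
#include <stdint.h>

/* Signed and unsigned views of byte i (0..7) of a 64-bit value. */
static int8_t sbyte_of(uint64_t v, unsigned i)
{
    return (int8_t)(uint8_t)(v >> (8u * i));
}

static uint8_t ubyte_of(uint64_t v, unsigned i)
{
    return (uint8_t)(v >> (8u * i));
}

/* Hypothetical reference model of DDSMAQASU: signed bytes of a times unsigned bytes of b. */
static int64_t ref_ddsmaqasu(int64_t t, uint64_t a, uint64_t b)
{
    int32_t sum = 0;
    for (unsigned i = 0; i < 8; i++) {
        sum += (int32_t)sbyte_of(a, i) * (int32_t)ubyte_of(b, i); /* m0..m7 */
    }
    /* Assumption: the accumulator wraps modulo 2^64 on overflow. */
    return (int64_t)((uint64_t)t + (uint64_t)(int64_t)sum);
}
```

A byte of 0x80 in b counts as 128 here, so a = 2 and b = 0x80 give 256, where a fully signed multiply would give -256.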

__STATIC_FORCEINLINE long long __RV_DDUMAQA (long long t, unsigned long long a, unsigned long long b)

DDUMAQA (Eight Unsigned 8x8 with 64-bit Unsigned Add)

Type: DSP

Syntax:

DDUMAQA Rd, Rs1, Rs2

Purpose

:

Do eight unsigned 8x8 multiplications and add the products to a third register. The result is written into Rd.

Description

:

Do eight unsigned 8-bit multiplications from eight 8-bit chunks of two registers, and then add the eight 16-bit results to the 64-bit content of a third register.

Operations:

op1t = Rs1.W[x+1]; op2t = Rs2.W[x+1]; // top
op1b = Rs1.W[x]; op2b = Rs2.W[x]; // bottom

m0 = op1b.B[0] u* op2b.B[0];
m1 = op1b.B[1] u* op2b.B[1];
m2 = op1b.B[2] u* op2b.B[2];
m3 = op1b.B[3] u* op2b.B[3];
m4 = op1t.B[0] u* op2t.B[0];
m5 = op1t.B[1] u* op2t.B[1];
m6 = op1t.B[2] u* op2t.B[2];
m7 = op1t.B[3] u* op2t.B[3];

s0 = m0 + m1 + m2 + m3;
s1 = m4 + m5 + m6 + m7;
Rd = Rd + s0 + s1;
x=0

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long __RV_DSMA32_U (unsigned long long a, unsigned long long b)

DSMA32.u (64-bit SIMD 32-bit Signed Multiply Addition With Rounding and Clip)

Type: DSP

Syntax:

DSMA32.u Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications, add the products with rounding, then shift right by 32 bits and clip the q63 value to q31. The result is written to Rd.

Description

:

For the DSMA32.u instruction, multiply the top 32-bit Q31 content of 64-bit chunks in Rs1 with the top 32-bit Q31 content of 64-bit chunks in Rs2. At the same time, multiply the bottom 32-bit Q31 content of 64-bit chunks in Rs1 with the bottom 32-bit Q31 content of 64-bit chunks in Rs2. Then add the two products, apply the additional rounding operation, shift the data right by 32 bits, and clip the 64-bit data to 32 bits. The result is written to Rd.

Operations:

Rd = (q31_t)((Rs1.W[x] s* Rs2.W[x] + Rs1.W[x + 1] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type
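
The rounding constant and the final shift can be illustrated with a portable C model of the Operations line. This is a sketch under assumptions: `word_of` and `ref_dsma32_u` are hypothetical names, the sum is taken modulo 2^64, and the "clip" is modeled as the `(q31_t)` cast shown in the Operations, that is, keeping the upper 32 bits of the rounded sum.

```c
#include <stdint.h>

/* Signed 32-bit word x (0 = bottom, 1 = top) of a 64-bit value. */
static int32_t word_of(uint64_t v, unsigned x)
{
    return (int32_t)(uint32_t)(v >> (32u * x));
}

/* Hypothetical reference model of DSMA32.u. */
static int32_t ref_dsma32_u(uint64_t a, uint64_t b)
{
    uint64_t p0 = (uint64_t)((int64_t)word_of(a, 0) * word_of(b, 0));
    uint64_t p1 = (uint64_t)((int64_t)word_of(a, 1) * word_of(b, 1));
    /* Unsigned arithmetic keeps the sum well defined (modulo 2^64). */
    uint64_t acc = p0 + p1 + 0x80000000ULL;
    /* The upper half of acc equals (q31_t)(acc s>> 32). */
    return (int32_t)(uint32_t)(acc >> 32);
}
```

The constant 0x80000000 is half of the weight of the lowest kept bit, so the shift rounds to nearest: the product 1 * (-2^31) rounds to 0 rather than truncating to -1.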

__STATIC_FORCEINLINE long __RV_DSMXS32_U (unsigned long long a, unsigned long long b)

DSMXS32.u (64-bit SIMD 32-bit Signed Multiply Cross Subtraction With Rounding and Clip)

Type: DSP

Syntax:

DSMXS32.u Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications, subtract the products with rounding, then shift right by 32 bits and clip the q63 value to q31. The result is written to Rd.

Description

:

For the DSMXS32.u instruction, multiply the top 32-bit Q31 content of 64-bit chunks in Rs1 with the bottom 32-bit Q31 content of 64-bit chunks in Rs2. At the same time, multiply the bottom 32-bit Q31 content of 64-bit chunks in Rs1 with the top 32-bit Q31 content of 64-bit chunks in Rs2. Then subtract the second product from the first, apply the additional rounding operation, shift the data right by 32 bits, and clip the 64-bit data to 32 bits. The result is written to Rd.

Operations:

Rd = (q31_t)((Rs1.W[x + 1] s* Rs2.W[x] - Rs1.W[x] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type

__STATIC_FORCEINLINE long __RV_DSMXA32_U (unsigned long long a, unsigned long long b)

DSMXA32.u (64-bit SIMD 32-bit Signed Cross Multiply Addition with Rounding and Clip)

Type: DSP

Syntax:

DSMXA32.u Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications, add the products with rounding, then shift right by 32 bits and clip the q63 value to q31. The result is written to Rd.

Description

:

For the DSMXA32.u instruction, multiply the top 32-bit Q31 content of 64-bit chunks in Rs1 with the bottom 32-bit Q31 content of 64-bit chunks in Rs2. At the same time, multiply the bottom 32-bit Q31 content of 64-bit chunks in Rs1 with the top 32-bit Q31 content of 64-bit chunks in Rs2. Then add the two products, apply the additional rounding operation, shift the data right by 32 bits, and clip the 64-bit data to 32 bits. The result is written to Rd.

Operations:

Rd = (q31_t)((Rs1.W[x + 1] s* Rs2.W[x] + Rs1.W[x] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type

__STATIC_FORCEINLINE long __RV_DSMS32_U (unsigned long long a, unsigned long long b)

DSMS32.u (64-bit SIMD 32-bit Signed Multiply Subtraction with Rounding and Clip)

Type: DSP

Syntax:

DSMS32.u Rd, Rs1, Rs2

Purpose

:

Do two signed 32x32 multiplications, subtract the products with rounding, then shift right by 32 bits and clip the q63 value to q31. The result is written to Rd.

Description

:

For the DSMS32.u instruction, multiply the bottom 32-bit Q31 content of 64-bit chunks in Rs1 with the bottom 32-bit Q31 content of 64-bit chunks in Rs2. At the same time, multiply the top 32-bit Q31 content of 64-bit chunks in Rs1 with the top 32-bit Q31 content of 64-bit chunks in Rs2. Then subtract the second product from the first, apply the additional rounding operation, shift the data right by 32 bits, and clip the 64-bit data to 32 bits. The result is written to Rd.

Operations:

Rd = (q31_t)((Rs1.W[x] s* Rs2.W[x] - Rs1.W[x + 1] s* Rs2.W[x + 1] + 0x80000000LL) s>> 32);
x=0

Parameters
  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type

__STATIC_FORCEINLINE long __RV_DSMADA16 (long long t, unsigned long long a, unsigned long long b)

DSMADA16 (Signed Multiply Two Halfs and Two Adds 32-bit)

Type: SIMD

Syntax:

DSMADA16 Rd, Rs1, Rs2

Purpose

:

Do two signed 16-bit multiplications of two 32-bit registers; and then adds the 32-bit results and the 32-bit value of an even/odd pair of registers together.

  • DSMADA16: rt pair + top*top + bottom*bottom

Description

:

This instruction multiplies each 16-bit content of the 32-bit elements of Rs1 with the corresponding 16-bit content of the 32-bit elements of Rs2. The results are added to the 32-bit value of an even/odd pair of registers specified by Rd(4,1). The 32-bit addition result is written back to the register-pair. The 16-bit values of Rs1 and Rs2, and the 32-bit value of the register-pair are treated as signed integers.

Operations:

Mres0[0][31:0] = (Rs1.W[0].H[0] * Rs2.W[0].H[0]);
Mres1[0][31:0] = (Rs1.W[0].H[1] * Rs2.W[0].H[1]);
Mres0[1][31:0] = (Rs1.W[1].H[0] * Rs2.W[1].H[0]);
Mres1[1][31:0] = (Rs1.W[1].H[1] * Rs2.W[1].H[1]);
Rd.W = Rd.W + SE32(Mres0[0][31:0]) + SE32(Mres1[0][31:0]) + SE32(Mres0[1][31:0]) + SE32(Mres1[1][31:0]);

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type

__STATIC_FORCEINLINE long __RV_DSMAXDA16 (long long t, unsigned long long a, unsigned long long b)

DSMAXDA16 (Signed Crossed Multiply Two Halfs and Two Adds 32-bit)

Type: SIMD

Syntax:

DSMAXDA16 Rd, Rs1, Rs2

Purpose

:

Do two signed 16-bit multiplications of two 32-bit registers; and then adds the 32-bit results and the 32-bit value of an even/odd pair of registers together.

  • DSMAXDA16: rt pair + top*bottom + bottom*top (all 32-bit elements)

Description

:

This instruction crossly multiplies the top 16-bit content of the 32-bit elements of Rs1 with the bottom 16-bit content of the 32-bit elements of Rs2 and then adds the result to the result of multiplying the bottom 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2 with unlimited precision. The result is added to the 64-bit value of an even/odd pair of registers specified by Rd(4,1). The 64-bit addition result is clipped to a 32-bit result.

Operations:

Mres0[0][31:0] = (Rs1.W[0].H[0] * Rs2.W[0].H[1]);
Mres1[0][31:0] = (Rs1.W[0].H[1] * Rs2.W[0].H[0]);
Mres0[1][31:0] = (Rs1.W[1].H[0] * Rs2.W[1].H[1]);
Mres1[1][31:0] = (Rs1.W[1].H[1] * Rs2.W[1].H[0]);
Rd.W = Rd.W + SE32(Mres0[0][31:0]) + SE32(Mres1[0][31:0]) + SE32(Mres0[1][31:0]) + SE32(Mres1[1][31:0]);

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type

__STATIC_FORCEINLINE unsigned long long __RV_DKSMS32_U (unsigned long long t, unsigned long long a, unsigned long long b)

DKSMS32.u (Two Signed Multiply Shift-clip and Saturation with Rounding)

Type: SIMD

Syntax:

DKSMS32.u Rd, Rs1, Rs2

Purpose

:

Computes saturated multiplication of two pairs of q31 type with shifted rounding.

Description

:

For each 32-bit element, multiply the q31_t contents of Rs1 and Rs2, take bits [47:15] of the resulting 64-bit product, add 1 to it for rounding, and keep bits [32:1] of the rounded value as the 32-bit number. This number is added to the corresponding 32-bit element of Rd, and the sum is saturated.

Operations:

Mres[x][63:0] = Rs1.W[x] s* Rs2.W[x];
Round[x][32:0] = Mres[x][47:15] + 1;
Rd.W[x] = sat.31(Rd.W[x] + Round[x][32:1]);
x=1...0

Parameters
  • t[in] unsigned long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in unsigned long long type
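
The bit selection in the Operations block can be followed step by step in a portable C model. This is a sketch that transcribes those three lines as written. `word_of`, `sat_q31`, `round_elem` and `ref_dksms32_u` are hypothetical names, and `sat.31` is assumed to mean saturation to the signed 32-bit (Q31) range.

```c
#include <stdint.h>

/* Signed 32-bit word x (0 = bottom, 1 = top) of a 64-bit value. */
static int32_t word_of(uint64_t v, unsigned x)
{
    return (int32_t)(uint32_t)(v >> (32u * x));
}

/* Assumed meaning of sat.31: clamp to the signed 32-bit range. */
static int32_t sat_q31(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Mres[47:15] + 1, then bits [32:1] of the 33-bit rounded value. */
static int32_t round_elem(int32_t x, int32_t y)
{
    uint64_t mres = (uint64_t)((int64_t)x * y);
    uint64_t round = ((mres >> 15) + 1u) & 0x1FFFFFFFFULL; /* Round[32:0] */
    return (int32_t)(uint32_t)(round >> 1);                /* Round[32:1] */
}

/* Hypothetical reference model of DKSMS32.u for both elements (x = 1...0). */
static uint64_t ref_dksms32_u(uint64_t t, uint64_t a, uint64_t b)
{
    uint64_t rd = 0;
    for (unsigned x = 0; x < 2; x++) {
        int64_t s = (int64_t)word_of(t, x) + round_elem(word_of(a, x), word_of(b, x));
        rd |= (uint64_t)(uint32_t)sat_q31(s) << (32u * x);
    }
    return rd;
}
```

For example, 0x8000 * 1 has bits [47:16] equal to 0, but the rounding step turns it into 1. Adding that to an element already holding 0x7FFFFFFF saturates instead of wrapping.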

__STATIC_FORCEINLINE long __RV_DMADA32 (long long t, unsigned long long a, unsigned long long b)

DMADA32 (Two Cross Signed 32x32 with 64-bit Add and Clip to 32-bit)

Type: SIMD

Syntax:

DMADA32 Rd, Rs1, Rs2

Purpose

:

Do two cross signed 32x32 multiplications and add the signed products to a q63 value, then clip the q63 result to q31. The final result is written into Rd.

Description

:

For the DMADA32 instruction, it multiplies the top 32-bit element in Rs1 with the bottom 32-bit element in Rs2 and then adds the result to the result of multiplying the bottom 32-bit element in Rs1 with the top 32-bit element in Rs2, then clips the q63 result to q31.

Operations:

res = (q31_t)((((q63_t) Rd.w[0] << 32) + (q63_t)Rs1.w[0] s*  Rs2.w[1] + (q63_t)Rs1.w[1] s*  Rs2.w[0]) s>> 32);
rd = res;

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long type
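
The Operations expression places the 32-bit accumulator in the upper half of a q63 value before adding the crossed products. A minimal portable C sketch of that expression follows. `word_of` and `ref_dmada32` are hypothetical names, and the model assumes that `Rd.w[0]` corresponds to the low 32 bits of `t` and that the sum is taken modulo 2^64.

```c
#include <stdint.h>

/* Signed 32-bit word x (0 = bottom, 1 = top) of a 64-bit value. */
static int32_t word_of(uint64_t v, unsigned x)
{
    return (int32_t)(uint32_t)(v >> (32u * x));
}

/* Hypothetical reference model of DMADA32. */
static int32_t ref_dmada32(int64_t t, uint64_t a, uint64_t b)
{
    /* Assumption: Rd.w[0] is the low word of t; (q63_t)Rd.w[0] << 32. */
    uint64_t acc = (uint64_t)(uint32_t)t << 32;
    acc += (uint64_t)((int64_t)word_of(a, 0) * word_of(b, 1)); /* bottom * top */
    acc += (uint64_t)((int64_t)word_of(a, 1) * word_of(b, 0)); /* top * bottom */
    /* The upper half of acc equals (q31_t)(acc s>> 32). */
    return (int32_t)(uint32_t)(acc >> 32);
}
```

Unlike the `.u` variants, no rounding constant is added, so a crossed product of -1 with a zero accumulator yields -1 after the shift.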

__STATIC_FORCEINLINE long long __RV_DSMALBB (long long t, unsigned long long a, unsigned long long b)

DSMALBB (Signed Multiply Bottom Halfs & Add 64-bit)

Type: SIMD

Syntax:

DSMALBB Rd, Rs1, Rs2

Purpose

:

Multiply the signed 16-bit content of the 32-bit elements of a register with the 16-bit content of the corresponding 32-bit elements of another register and add the results with a 64-bit value of an even/odd pair of registers. The addition result is written back to the register-pair.

  • DSMALBB: rt pair + bottom*bottom (all 32-bit elements)

Description

:

For the DSMALBB instruction, it multiplies the bottom 16-bit content of Rs1 with the bottom 16-bit content of Rs2. The multiplication results are added with the 64-bit value of Rd. The 64-bit addition result is written back to Rd.

Operations:

Mres[0][31:0] = Rs1.W[0].H[0] * Rs2.W[0].H[0];
Mres[1][31:0] = Rs1.W[1].H[0] * Rs2.W[1].H[0];
Rd = Rd + SE64(Mres[0][31:0]) + SE64(Mres[1][31:0]);

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMALBT (long long t, unsigned long long a, unsigned long long b)

DSMALBT (Signed Multiply Bottom Half & Top Half & Add 64-bit)

Type: SIMD

Syntax:

DSMALBT Rd, Rs1, Rs2

Purpose

:

Multiply the signed 16-bit content of the 32-bit elements of a register with the 16-bit content of the corresponding 32-bit elements of another register and add the results with a 64-bit value of an even/odd pair of registers. The addition result is written back to the register-pair.

  • DSMALBT: rt pair + bottom*top (all 32-bit elements)

Description

:

For the DSMALBT instruction, it multiplies the bottom 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2. The multiplication results are added with the 64-bit value of Rd. The 64-bit addition result is written back to Rd. The 16-bit values of Rs1 and Rs2, and the 64-bit value of Rd are treated as signed integers.

Operations:

Mres[0][31:0] = Rs1.W[0].H[0] * Rs2.W[0].H[1];
Mres[1][31:0] = Rs1.W[1].H[0] * Rs2.W[1].H[1];
Rd = Rd + SE64(Mres[0][31:0]) + SE64(Mres[1][31:0]);

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DSMALTT (long long t, unsigned long long a, unsigned long long b)

DSMALTT (Signed Multiply Top Half & Add 64-bit)

Type: SIMD

Syntax:

DSMALTT Rd, Rs1, Rs2

Purpose

:

Multiply the signed 16-bit content of the 32-bit elements of a register with the 16-bit content of the corresponding 32-bit elements of another register and add the results with a 64-bit value of an even/odd pair of registers. The addition result is written back to the register-pair.

  • DSMALTT: rt pair + top*top (all 32-bit elements)

Description

:

For the DSMALTT instruction, it multiplies the top 16-bit content of the 32-bit elements of Rs1 with the top 16-bit content of the 32-bit elements of Rs2. The multiplication results are added with the 64-bit value of Rd. The 64-bit addition result is written back to Rd. The 16-bit values of Rs1 and Rs2, and the 64-bit value of Rd are treated as signed integers.

Operations:

Mres[0][31:0] = Rs1.W[0].H[1] * Rs2.W[0].H[1];
Mres[1][31:0] = Rs1.W[1].H[1] * Rs2.W[1].H[1];
Rd = Rd + SE64(Mres[0][31:0]) + SE64(Mres[1][31:0]);

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
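
DSMALBB, DSMALBT and DSMALTT differ only in which halfword of each 32-bit element is selected, so one parameterized model covers all three. The sketch below is a portable C illustration. `half_of`, `ref_dsmal` and the three wrappers are hypothetical names, and wraparound on accumulator overflow is an assumption.

```c
#include <stdint.h>

/* Signed 16-bit halfword i (0..3) of a 64-bit value; index 0 is least significant. */
static int16_t half_of(uint64_t v, unsigned i)
{
    return (int16_t)(uint16_t)(v >> (16u * i));
}

/* sel_a / sel_b pick the bottom (0) or top (1) halfword of every 32-bit element. */
static int64_t ref_dsmal(int64_t t, uint64_t a, uint64_t b, unsigned sel_a, unsigned sel_b)
{
    int64_t sum = 0;
    for (unsigned w = 0; w < 2; w++) {
        sum += (int32_t)half_of(a, 2u * w + sel_a) * (int32_t)half_of(b, 2u * w + sel_b);
    }
    /* Assumption: the accumulator wraps modulo 2^64 on overflow. */
    return (int64_t)((uint64_t)t + (uint64_t)sum);
}

static int64_t ref_dsmalbb(int64_t t, uint64_t a, uint64_t b) { return ref_dsmal(t, a, b, 0, 0); }
static int64_t ref_dsmalbt(int64_t t, uint64_t a, uint64_t b) { return ref_dsmal(t, a, b, 0, 1); }
static int64_t ref_dsmaltt(int64_t t, uint64_t a, uint64_t b) { return ref_dsmal(t, a, b, 1, 1); }
```

For halfwords (1, 2, 3, 4) in a and (5, 6, 7, 8) in b, the three variants add 1*5 + 3*7 = 26, 1*6 + 3*8 = 30 and 2*6 + 4*8 = 44 to the accumulator.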

__STATIC_FORCEINLINE long long __RV_DKMABB32 (long long t, unsigned long long a, unsigned long long b)

DKMABB32 (Saturating Signed Multiply Bottom Words & Add)

Type: SIMD

Syntax:

DKMABB32 Rd, Rs1, Rs2

Purpose

:

Multiply the signed 32-bit element in a register with the 32-bit element in another register and add the result to the content of 64-bit data in the third register. The addition result may be saturated and is written to the third register.

  • DKMABB32: rd + bottom*bottom

Description

:

For the DKMABB32 instruction, it multiplies the bottom 32-bit element in Rs1 with the bottom 32-bit element in Rs2. The multiplication result is added to the content of 64-bit data in Rd. If the addition result is beyond the Q63 number range (-2^63 <= Q63 <= 2^63-1), it is saturated to the range and the OV bit is set to 1. The result after saturation is written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = Rd + (Rs1.W[0] * Rs2.W[0]);
if (res > (2^63)-1) {
  res = (2^63)-1;
  OV = 1;
} else if (res < -2^63) {
  res = -2^63;
  OV = 1;
}
Rd = res;

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type
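
The saturation test in the Operations block compares against bounds that a 64-bit C integer cannot exceed, so a portable model has to detect the overflow before adding. A minimal sketch follows. `word_of` and `ref_dkmabb32` are hypothetical names, and the `ov` out-parameter stands in for the OV bit, which the real intrinsic does not expose this way.

```c
#include <stdint.h>

/* Signed 32-bit word x (0 = bottom, 1 = top) of a 64-bit value. */
static int32_t word_of(uint64_t v, unsigned x)
{
    return (int32_t)(uint32_t)(v >> (32u * x));
}

/* Hypothetical reference model of DKMABB32.
 * *ov is set to 1 on saturation and left unchanged otherwise. */
static int64_t ref_dkmabb32(int64_t t, uint64_t a, uint64_t b, int *ov)
{
    int64_t prod = (int64_t)word_of(a, 0) * word_of(b, 0); /* bottom * bottom */
    if (prod > 0 && t > INT64_MAX - prod) { /* t + prod would exceed 2^63 - 1 */
        *ov = 1;
        return INT64_MAX;
    }
    if (prod < 0 && t < INT64_MIN - prod) { /* t + prod would fall below -2^63 */
        *ov = 1;
        return INT64_MIN;
    }
    return t + prod;
}
```

Selecting other words in the product line gives the bottom*top and top*top variants of the same family.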

__STATIC_FORCEINLINE long long __RV_DKMABT32 (long long t, unsigned long long a, unsigned long long b)

DKMABT32 (Saturating Signed Multiply Bottom & Top Words & Add)

Type: SIMD

Syntax:

DKMABT32 Rd, Rs1, Rs2

Purpose

:

Multiply the signed 32-bit element in a register with the 32-bit element in another register and add the result to the content of 64-bit data in the third register. The addition result may be saturated and is written to the third register.

  • DKMABT32: rd + bottom*top

Description

:

For the DKMABT32 instruction, it multiplies the bottom 32-bit element in Rs1 with the top 32-bit element in Rs2. The multiplication result is added to the content of 64-bit data in Rd. If the addition result is beyond the Q63 number range (-2^63 <= Q63 <= 2^63-1), it is saturated to the range and the OV bit is set to 1. The result after saturation is written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = Rd + (Rs1.W[0] * Rs2.W[1]);
if (res > (2^63)-1) {
  res = (2^63)-1;
  OV = 1;
} else if (res < -2^63) {
  res = -2^63;
  OV = 1;
}
Rd = res;

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type

__STATIC_FORCEINLINE long long __RV_DKMATT32 (long long t, unsigned long long a, unsigned long long b)

DKMATT32 (Saturating Signed Multiply Top Words & Add)

Type: SIMD

Syntax:

DKMATT32 Rd, Rs1, Rs2

Purpose

:

Multiply the signed 32-bit element in a register with the 32-bit element in another register and add the result to the content of 64-bit data in the third register. The addition result may be saturated and is written to the third register.

  • DKMATT32: rd + top*top

Description

:

For the DKMATT32 instruction, it multiplies the top 32-bit element in Rs1 with the top 32-bit element in Rs2. The multiplication result is added to the content of 64-bit data in Rd. If the addition result is beyond the Q63 number range (-2^63 <= Q63 <= 2^63-1), it is saturated to the range and the OV bit is set to 1. The result after saturation is written to Rd. The 32-bit contents of Rs1 and Rs2 are treated as signed integers.

Operations:

res = Rd + (Rs1.W[1] * Rs2.W[1]);
if (res > (2^63)-1) {
  res = (2^63)-1;
  OV = 1;
} else if (res < -2^63) {
  res = -2^63;
  OV = 1;
}
Rd = res;

Parameters
  • t[in] long long type of value stored in t

  • a[in] unsigned long long type of value stored in a

  • b[in] unsigned long long type of value stored in b

Returns

value stored in long long type