NMSIS-Core
Version 1.2.0
NMSIS-Core support for Nuclei processor-based devices
SIMD 16-bit Multiply Instructions.
Functions
__STATIC_FORCEINLINE unsigned long __RV_KHM16 (unsigned long a, unsigned long b)
    KHM16 (SIMD Signed Saturating Q15 Multiply)
__STATIC_FORCEINLINE unsigned long __RV_KHMX16 (unsigned long a, unsigned long b)
    KHMX16 (SIMD Signed Saturating Crossed Q15 Multiply)
__STATIC_FORCEINLINE unsigned long long __RV_SMUL16 (unsigned int a, unsigned int b)
    SMUL16 (SIMD Signed 16-bit Multiply)
__STATIC_FORCEINLINE unsigned long long __RV_SMULX16 (unsigned int a, unsigned int b)
    SMULX16 (SIMD Signed Crossed 16-bit Multiply)
__STATIC_FORCEINLINE unsigned long long __RV_UMUL16 (unsigned int a, unsigned int b)
    UMUL16 (SIMD Unsigned 16-bit Multiply)
__STATIC_FORCEINLINE unsigned long long __RV_UMULX16 (unsigned int a, unsigned int b)
    UMULX16 (SIMD Unsigned Crossed 16-bit Multiply)
There are six SIMD 16-bit multiply instructions.
__STATIC_FORCEINLINE unsigned long __RV_KHM16 (unsigned long a, unsigned long b)
KHM16 (SIMD Signed Saturating Q15 Multiply)
Type: SIMD
Syntax:
Purpose:
Do Q15xQ15 element multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.
Description:
For the KHM16 instruction, multiply the top 16-bit Q15 content of the 32-bit chunks in Rs1 with the top 16-bit Q15 content of the 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of the 32-bit chunks in Rs2.
For the KHMX16 instruction, multiply the top 16-bit Q15 content of the 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of the 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the 32-bit chunks in Rs1 with the top 16-bit Q15 content of the 32-bit chunks in Rs2.
The Q30 results are then right-shifted by 15 bits and saturated into Q15 values, which are written into Rd. When both Q15 inputs of a multiplication are 0x8000, saturation occurs: the result is saturated to 0x7FFF and the overflow flag OV is set.
Operations:
Parameters:
[in]  a  unsigned long type of value stored in a
[in]  b  unsigned long type of value stored in b
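The per-lane arithmetic of KHM16 can be illustrated with a portable C reference model. This is a minimal sketch, not the NMSIS implementation: `ref_khm16`, `q15_mul_sat` and `khm16_example` are hypothetical names, the model covers a single 32-bit chunk (on RV64 the same operation would apply to each 32-bit chunk of the operands), and the OV flag is modeled as an `int` written through a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating Q15 x Q15 -> Q15 multiply of one 16-bit lane (hypothetical helper).
 * Sets *ov to 1 when the result saturates; leaves it untouched otherwise. */
static int16_t q15_mul_sat(int16_t x, int16_t y, int *ov)
{
    if (x == INT16_MIN && y == INT16_MIN) {   /* 0x8000 * 0x8000 */
        *ov = 1;
        return INT16_MAX;                     /* 0x7FFF */
    }
    /* Q30 product reduced to Q15. Assumes >> on a negative int32_t is an
     * arithmetic shift, as it is with GCC and Clang. */
    return (int16_t)(((int32_t)x * (int32_t)y) >> 15);
}

/* Reference model of KHM16 on one 32-bit chunk: top x top, bottom x bottom. */
static uint32_t ref_khm16(uint32_t a, uint32_t b, int *ov)
{
    int16_t a_top = (int16_t)(uint16_t)(a >> 16), a_bot = (int16_t)(uint16_t)a;
    int16_t b_top = (int16_t)(uint16_t)(b >> 16), b_bot = (int16_t)(uint16_t)b;
    uint16_t top = (uint16_t)q15_mul_sat(a_top, b_top, ov);
    uint16_t bot = (uint16_t)q15_mul_sat(a_bot, b_bot, ov);
    return ((uint32_t)top << 16) | bot;
}

/* Worked example: 0.5 * 0.25 in the top lane, -1.0 * -1.0 in the bottom lane. */
static void khm16_example(void)
{
    int ov = 0;
    uint32_t r = ref_khm16(0x40008000u, 0x20008000u, &ov);
    assert(r == 0x10007FFFu);  /* top: 0.125 (0x1000); bottom: saturated */
    assert(ov == 1);
}
```

The bottom lane shows the only saturating case: -1.0 times -1.0 would be +1.0, which Q15 cannot represent, so the lane is clamped to 0x7FFF.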
Definition at line 2419 of file core_feature_dsp.h.
References __ASM.
__STATIC_FORCEINLINE unsigned long __RV_KHMX16 (unsigned long a, unsigned long b)
KHMX16 (SIMD Signed Saturating Crossed Q15 Multiply)
Type: SIMD
Syntax:
Purpose:
Do Q15xQ15 element multiplications simultaneously. The Q30 results are then reduced to Q15 numbers again.
Description:
For the KHM16 instruction, multiply the top 16-bit Q15 content of the 32-bit chunks in Rs1 with the top 16-bit Q15 content of the 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of the 32-bit chunks in Rs2.
For the KHMX16 instruction, multiply the top 16-bit Q15 content of the 32-bit chunks in Rs1 with the bottom 16-bit Q15 content of the 32-bit chunks in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the 32-bit chunks in Rs1 with the top 16-bit Q15 content of the 32-bit chunks in Rs2.
The Q30 results are then right-shifted by 15 bits and saturated into Q15 values, which are written into Rd. When both Q15 inputs of a multiplication are 0x8000, saturation occurs: the result is saturated to 0x7FFF and the overflow flag OV is set.
Operations:
Parameters:
[in]  a  unsigned long type of value stored in a
[in]  b  unsigned long type of value stored in b
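The crossed pairing used by KHMX16 can be sketched in portable C as follows. The names `ref_khmx16`, `q15_mul_sat` and `khmx16_example` are hypothetical, the model covers one 32-bit chunk only, and it assumes that each result is placed in the lane position of its Rs1 operand (the text above does not state the placement explicitly). Treat it as an illustration of the arithmetic rather than the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating Q15 multiply of one lane (hypothetical helper); *ov models OV. */
static int16_t q15_mul_sat(int16_t x, int16_t y, int *ov)
{
    if (x == INT16_MIN && y == INT16_MIN) {
        *ov = 1;
        return INT16_MAX;
    }
    /* Assumes arithmetic right shift of negative values (GCC, Clang). */
    return (int16_t)(((int32_t)x * (int32_t)y) >> 15);
}

/* Reference model of KHMX16 on one 32-bit chunk. */
static uint32_t ref_khmx16(uint32_t a, uint32_t b, int *ov)
{
    int16_t a_top = (int16_t)(uint16_t)(a >> 16), a_bot = (int16_t)(uint16_t)a;
    int16_t b_top = (int16_t)(uint16_t)(b >> 16), b_bot = (int16_t)(uint16_t)b;
    uint16_t top = (uint16_t)q15_mul_sat(a_top, b_bot, ov);  /* top x bottom */
    uint16_t bot = (uint16_t)q15_mul_sat(a_bot, b_top, ov);  /* bottom x top */
    return ((uint32_t)top << 16) | bot;
}

/* Worked example: saturation is triggered by a crossed pair of 0x8000 lanes. */
static void khmx16_example(void)
{
    int ov = 0;
    uint32_t r = ref_khmx16(0x80000000u, 0x00008000u, &ov);
    assert(r == 0x7FFF0000u);  /* top: -1.0 * -1.0 saturated; bottom: 0 * 0 */
    assert(ov == 1);
}
```

With the same operands, a straight (non-crossed) pairing would multiply 0x8000 by 0 in both lanes and would not saturate.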
Definition at line 2482 of file core_feature_dsp.h.
References __ASM.
__STATIC_FORCEINLINE unsigned long long __RV_SMUL16 (unsigned int a, unsigned int b)
SMUL16 (SIMD Signed 16-bit Multiply)
Type: SIMD
Syntax:
Purpose:
Do signed 16-bit multiplications and generate two 32-bit results simultaneously.
RV32 Description:
For the SMUL16 instruction, multiply the top 16-bit Q15 content of Rs1 with the top 16-bit Q15 content of Rs2. At the same time, multiply the bottom 16-bit Q15 content of Rs1 with the bottom 16-bit Q15 content of Rs2.
For the SMULX16 instruction, multiply the top 16-bit Q15 content of Rs1 with the bottom 16-bit Q15 content of Rs2. At the same time, multiply the bottom 16-bit Q15 content of Rs1 with the top 16-bit Q15 content of Rs2.
The two Q30 results are then written into an even/odd pair of registers specified by Rd(4,1). Rd(4,1), i.e., d, determines the even/odd pair group of two registers: the pair consists of registers 2d and 2d+1. The odd register 2d+1 contains the 32-bit result calculated from the top part of Rs1, and the even register 2d contains the 32-bit result calculated from the bottom part of Rs1.
RV64 Description:
For the SMUL16 instruction, multiply the top 16-bit Q15 content of the lower 32-bit word in Rs1 with the top 16-bit Q15 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the lower 32-bit word in Rs1 with the bottom 16-bit Q15 content of the lower 32-bit word in Rs2.
For the SMULX16 instruction, multiply the top 16-bit Q15 content of the lower 32-bit word in Rs1 with the bottom 16-bit Q15 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the lower 32-bit word in Rs1 with the top 16-bit Q15 content of the lower 32-bit word in Rs2.
The two 32-bit Q30 results are then written into Rd. The result calculated from the top 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[1], and the result calculated from the bottom 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[0].
Operations:
Parameters:
[in]  a  unsigned int type of value stored in a
[in]  b  unsigned int type of value stored in b
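A portable C sketch of the SMUL16 arithmetic may help when checking results on a host machine. `ref_smul16`, `lane_s16` and `smul16_example` are hypothetical names. The model assumes the 64-bit return value carries the top-lane product in bits 63:32 and the bottom-lane product in bits 31:0, mirroring Rd.W[1] and Rd.W[0] in the RV64 description; on RV32 this presumes the even register of the pair is the low word.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend one 16-bit lane of a 32-bit word; lane 0 is the bottom half. */
static int32_t lane_s16(uint32_t w, unsigned lane)
{
    return (int16_t)(uint16_t)(w >> (16u * lane));
}

/* Reference model of SMUL16: two full-width signed 16x16 -> 32-bit products. */
static uint64_t ref_smul16(uint32_t a, uint32_t b)
{
    int32_t top = lane_s16(a, 1) * lane_s16(b, 1);  /* top x top */
    int32_t bot = lane_s16(a, 0) * lane_s16(b, 0);  /* bottom x bottom */
    return ((uint64_t)(uint32_t)top << 32) | (uint32_t)bot;
}

/* Worked example: lanes (-2, 3) times lanes (5, -4). */
static void smul16_example(void)
{
    uint64_t r = ref_smul16(0xFFFE0003u, 0x0005FFFCu);
    assert((int32_t)(uint32_t)(r >> 32) == -10);  /* -2 * 5 */
    assert((int32_t)(uint32_t)r == -12);          /*  3 * -4 */
}
```

Because each product keeps all 32 bits, no saturation is involved: 0x8000 times 0x8000 simply yields 0x40000000.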
Definition at line 9484 of file core_feature_dsp.h.
References __ASM.
__STATIC_FORCEINLINE unsigned long long __RV_SMULX16 (unsigned int a, unsigned int b)
SMULX16 (SIMD Signed Crossed 16-bit Multiply)
Type: SIMD
Syntax:
Purpose:
Do signed 16-bit multiplications and generate two 32-bit results simultaneously.
RV32 Description:
For the SMUL16 instruction, multiply the top 16-bit Q15 content of Rs1 with the top 16-bit Q15 content of Rs2. At the same time, multiply the bottom 16-bit Q15 content of Rs1 with the bottom 16-bit Q15 content of Rs2.
For the SMULX16 instruction, multiply the top 16-bit Q15 content of Rs1 with the bottom 16-bit Q15 content of Rs2. At the same time, multiply the bottom 16-bit Q15 content of Rs1 with the top 16-bit Q15 content of Rs2.
The two Q30 results are then written into an even/odd pair of registers specified by Rd(4,1). Rd(4,1), i.e., d, determines the even/odd pair group of two registers: the pair consists of registers 2d and 2d+1. The odd register 2d+1 contains the 32-bit result calculated from the top part of Rs1, and the even register 2d contains the 32-bit result calculated from the bottom part of Rs1.
RV64 Description:
For the SMUL16 instruction, multiply the top 16-bit Q15 content of the lower 32-bit word in Rs1 with the top 16-bit Q15 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the lower 32-bit word in Rs1 with the bottom 16-bit Q15 content of the lower 32-bit word in Rs2.
For the SMULX16 instruction, multiply the top 16-bit Q15 content of the lower 32-bit word in Rs1 with the bottom 16-bit Q15 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit Q15 content of the lower 32-bit word in Rs1 with the top 16-bit Q15 content of the lower 32-bit word in Rs2.
The two 32-bit Q30 results are then written into Rd. The result calculated from the top 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[1], and the result calculated from the bottom 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[0].
Operations:
Parameters:
[in]  a  unsigned int type of value stored in a
[in]  b  unsigned int type of value stored in b
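The crossed variant differs from SMUL16 only in which Rs2 lane each Rs1 lane is paired with. The sketch below is a host-side reference model under stated assumptions: `ref_smulx16`, `lane_s16` and `smulx16_example` are hypothetical names, and the 64-bit result is assumed to hold the product derived from the top half of the first operand in bits 63:32 and the one derived from its bottom half in bits 31:0.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend one 16-bit lane of a 32-bit word; lane 0 is the bottom half. */
static int32_t lane_s16(uint32_t w, unsigned lane)
{
    return (int16_t)(uint16_t)(w >> (16u * lane));
}

/* Reference model of SMULX16: each lane of a meets the opposite lane of b. */
static uint64_t ref_smulx16(uint32_t a, uint32_t b)
{
    int32_t top = lane_s16(a, 1) * lane_s16(b, 0);  /* top x bottom */
    int32_t bot = lane_s16(a, 0) * lane_s16(b, 1);  /* bottom x top */
    return ((uint64_t)(uint32_t)top << 32) | (uint32_t)bot;
}

/* Worked example: lanes (-2, 3) crossed with lanes (5, -4). */
static void smulx16_example(void)
{
    uint64_t r = ref_smulx16(0xFFFE0003u, 0x0005FFFCu);
    assert((uint32_t)(r >> 32) == 8u);  /* -2 * -4 */
    assert((uint32_t)r == 15u);         /*  3 *  5 */
}
```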
Definition at line 9569 of file core_feature_dsp.h.
References __ASM.
__STATIC_FORCEINLINE unsigned long long __RV_UMUL16 (unsigned int a, unsigned int b)
UMUL16 (SIMD Unsigned 16-bit Multiply)
Type: SIMD
Syntax:
Purpose:
Do unsigned 16-bit multiplications and generate two 32-bit results simultaneously.
RV32 Description:
For the UMUL16 instruction, multiply the top 16-bit U16 content of Rs1 with the top 16-bit U16 content of Rs2. At the same time, multiply the bottom 16-bit U16 content of Rs1 with the bottom 16-bit U16 content of Rs2.
For the UMULX16 instruction, multiply the top 16-bit U16 content of Rs1 with the bottom 16-bit U16 content of Rs2. At the same time, multiply the bottom 16-bit U16 content of Rs1 with the top 16-bit U16 content of Rs2.
The two U32 results are then written into an even/odd pair of registers specified by Rd(4,1). Rd(4,1), i.e., d, determines the even/odd pair group of two registers: the pair consists of registers 2d and 2d+1. The odd register 2d+1 contains the 32-bit result calculated from the top part of Rs1, and the even register 2d contains the 32-bit result calculated from the bottom part of Rs1.
RV64 Description:
For the UMUL16 instruction, multiply the top 16-bit U16 content of the lower 32-bit word in Rs1 with the top 16-bit U16 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit U16 content of the lower 32-bit word in Rs1 with the bottom 16-bit U16 content of the lower 32-bit word in Rs2.
For the UMULX16 instruction, multiply the top 16-bit U16 content of the lower 32-bit word in Rs1 with the bottom 16-bit U16 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit U16 content of the lower 32-bit word in Rs1 with the top 16-bit U16 content of the lower 32-bit word in Rs2.
The two 32-bit U32 results are then written into Rd. The result calculated from the top 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[1], and the result calculated from the bottom 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[0].
Operations:
Parameters:
[in]  a  unsigned int type of value stored in a
[in]  b  unsigned int type of value stored in b
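For the unsigned case, the lanes are zero-extended rather than sign-extended before multiplying. The following is a minimal host-side sketch, with `ref_umul16` and `umul16_example` as hypothetical names and with the assumption that the 64-bit return value packs the top-lane product into bits 63:32 and the bottom-lane product into bits 31:0.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of UMUL16: two unsigned 16x16 -> 32-bit products. */
static uint64_t ref_umul16(uint32_t a, uint32_t b)
{
    uint32_t top = (a >> 16) * (b >> 16);           /* top x top */
    uint32_t bot = (a & 0xFFFFu) * (b & 0xFFFFu);   /* bottom x bottom */
    return ((uint64_t)top << 32) | bot;
}

/* Worked example: 0xFFFF is 65535 here, not -1 as in the signed variant. */
static void umul16_example(void)
{
    uint64_t r = ref_umul16(0xFFFF0002u, 0xFFFF0003u);
    assert((uint32_t)(r >> 32) == 0xFFFE0001u);  /* 65535 * 65535 */
    assert((uint32_t)r == 6u);                   /* 2 * 3 */
}
```

The largest possible product, 65535 times 65535, is 0xFFFE0001 and still fits in 32 bits, so the operation never overflows.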
Definition at line 12762 of file core_feature_dsp.h.
References __ASM.
__STATIC_FORCEINLINE unsigned long long __RV_UMULX16 (unsigned int a, unsigned int b)
UMULX16 (SIMD Unsigned Crossed 16-bit Multiply)
Type: SIMD
Syntax:
Purpose:
Do unsigned 16-bit multiplications and generate two 32-bit results simultaneously.
RV32 Description:
For the UMUL16 instruction, multiply the top 16-bit U16 content of Rs1 with the top 16-bit U16 content of Rs2. At the same time, multiply the bottom 16-bit U16 content of Rs1 with the bottom 16-bit U16 content of Rs2.
For the UMULX16 instruction, multiply the top 16-bit U16 content of Rs1 with the bottom 16-bit U16 content of Rs2. At the same time, multiply the bottom 16-bit U16 content of Rs1 with the top 16-bit U16 content of Rs2.
The two U32 results are then written into an even/odd pair of registers specified by Rd(4,1). Rd(4,1), i.e., d, determines the even/odd pair group of two registers: the pair consists of registers 2d and 2d+1. The odd register 2d+1 contains the 32-bit result calculated from the top part of Rs1, and the even register 2d contains the 32-bit result calculated from the bottom part of Rs1.
RV64 Description:
For the UMUL16 instruction, multiply the top 16-bit U16 content of the lower 32-bit word in Rs1 with the top 16-bit U16 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit U16 content of the lower 32-bit word in Rs1 with the bottom 16-bit U16 content of the lower 32-bit word in Rs2.
For the UMULX16 instruction, multiply the top 16-bit U16 content of the lower 32-bit word in Rs1 with the bottom 16-bit U16 content of the lower 32-bit word in Rs2. At the same time, multiply the bottom 16-bit U16 content of the lower 32-bit word in Rs1 with the top 16-bit U16 content of the lower 32-bit word in Rs2.
The two 32-bit U32 results are then written into Rd. The result calculated from the top 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[1], and the result calculated from the bottom 16 bits of the lower 32-bit word in Rs1 is written to Rd.W[0].
Operations:
Parameters:
[in]  a  unsigned int type of value stored in a
[in]  b  unsigned int type of value stored in b
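A host-side sketch of the crossed unsigned multiply is given below for comparison with UMUL16. The names `ref_umulx16` and `umulx16_example` are hypothetical, and the packing of the two 32-bit products into the 64-bit result (product from the top half of the first operand in bits 63:32) is an assumption based on the Rd.W[1]/Rd.W[0] wording above.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of UMULX16: each lane of a meets the opposite lane of b. */
static uint64_t ref_umulx16(uint32_t a, uint32_t b)
{
    uint32_t a_top = a >> 16, a_bot = a & 0xFFFFu;
    uint32_t b_top = b >> 16, b_bot = b & 0xFFFFu;
    uint32_t top = a_top * b_bot;   /* top x bottom */
    uint32_t bot = a_bot * b_top;   /* bottom x top */
    return ((uint64_t)top << 32) | bot;
}

/* Worked example: lanes (65535, 2) crossed with lanes (3, 4). */
static void umulx16_example(void)
{
    uint64_t r = ref_umulx16(0xFFFF0002u, 0x00030004u);
    assert((uint32_t)(r >> 32) == 0x0003FFFCu);  /* 65535 * 4 */
    assert((uint32_t)r == 6u);                   /* 2 * 3 */
}
```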
Definition at line 12847 of file core_feature_dsp.h.
References __ASM.