Convert 32-bit floating point value

void riscv_float_to_f16(const float32_t *pSrc, float16_t *pDst, uint32_t blockSize)
void riscv_float_to_f64(const float32_t *pSrc, float64_t *pDst, uint32_t blockSize)
void riscv_float_to_q15(const float32_t *pSrc, q15_t *pDst, uint32_t blockSize)
void riscv_float_to_q31(const float32_t *pSrc, q31_t *pDst, uint32_t blockSize)
void riscv_float_to_q7(const float32_t *pSrc, q7_t *pDst, uint32_t blockSize)
group float_to_x

Functions

void riscv_float_to_f16(const float32_t *pSrc, float16_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to f16 vector.

Converts each element of the 32-bit floating-point input vector to a 16-bit floating-point (f16) value.

Parameters
  • pSrc[in] points to the f32 input vector

  • pDst[out] points to the f16 output vector

  • blockSize[in] number of samples in each vector

Returns

none

void riscv_float_to_f64(const float32_t *pSrc, float64_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to f64 vector.

Converts each element of the 32-bit floating-point input vector to a 64-bit floating-point value.

Parameters
  • pSrc[in] points to the f32 input vector

  • pDst[out] points to the f64 output vector

  • blockSize[in] number of samples in each vector

Returns

none
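The widening conversion can be pictured with a plain C loop. This is a minimal sketch, not the library's implementation: `ref_float_to_f64` is a hypothetical name, and it assumes `float32_t` and `float64_t` correspond to C `float` and `double`.

```c
#include <stdint.h>

/* Hypothetical reference helper (not a library function):
 * widens each f32 sample to f64. Every float value is exactly
 * representable as a double, so no rounding takes place. */
void ref_float_to_f64(const float *src, double *dst, uint32_t block_size)
{
    for (uint32_t i = 0; i < block_size; i++) {
        dst[i] = (double)src[i];
    }
}
```

Because the conversion is exact, the output keeps the rounding error of the f32 input: `0.1f` widens to the double nearest to the float, which is not the same value as the double literal `0.1`.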

void riscv_float_to_q15(const float32_t *pSrc, q15_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to Q15 vector.

Details

The equation used for the conversion process is:

    pDst[n] = (q15_t)(pSrc[n] * 32768);   0 <= n < blockSize.

Scaling and Overflow Behavior

The function uses saturating arithmetic. Results outside of the allowable Q15 range [0x8000 0x7FFF] are saturated.

Note

In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.

Parameters
  • pSrc[in] points to the floating-point input vector

  • pDst[out] points to the Q15 output vector

  • blockSize[in] number of samples in each vector

Returns

none
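The scaling, saturation, and optional rounding described for the Q15 conversion can be sketched in portable C. This is an illustration under stated assumptions, not the library source: `ref_float_to_q15` and its `apply_rounding` flag are hypothetical, `q15_t` is taken to be `int16_t`, the scale factor is taken to be 32768 (2^15), and rounding is assumed to add or subtract 0.5 before truncation.

```c
#include <stdint.h>

/* Hypothetical reference helper (not a library function).
 * Scales each sample by 2^15, optionally rounds to nearest,
 * and saturates to the Q15 range [-32768, 32767].
 * NaN inputs are not handled. */
void ref_float_to_q15(const float *src, int16_t *dst,
                      uint32_t block_size, int apply_rounding)
{
    for (uint32_t i = 0; i < block_size; i++) {
        float scaled = src[i] * 32768.0f;

        /* Assumed model of the ROUNDING build option. */
        if (apply_rounding) {
            scaled += (scaled > 0.0f) ? 0.5f : -0.5f;
        }

        /* Clamp in the float domain so the integer cast is always defined. */
        if (scaled > 32767.0f) {
            dst[i] = INT16_MAX;           /* 0x7FFF */
        } else if (scaled < -32768.0f) {
            dst[i] = INT16_MIN;           /* 0x8000 */
        } else {
            dst[i] = (int16_t)scaled;     /* truncates toward zero */
        }
    }
}
```

With this sketch, 0.5f maps to 16384 (0x4000), while 1.0f and anything larger saturate to 32767 (0x7FFF), because +1.0 itself is not representable in Q15. A small input such as 0.00002f truncates to 0 without rounding and becomes 1 with rounding enabled.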

void riscv_float_to_q31(const float32_t *pSrc, q31_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to Q31 vector.

Details

The equation used for the conversion process is:

    pDst[n] = (q31_t)(pSrc[n] * 2147483648);   0 <= n < blockSize.

Scaling and Overflow Behavior

The function uses saturating arithmetic. Results outside of the allowable Q31 range [0x80000000 0x7FFFFFFF] are saturated.

Note

In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.

Parameters
  • pSrc[in] points to the floating-point input vector

  • pDst[out] points to the Q31 output vector

  • blockSize[in] number of samples in each vector

Returns

none
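For Q31 the scaled value can exceed the 32-bit integer range, so saturation has to be decided before the integer cast. The following is a minimal sketch of that idea, not the library source: `ref_float_to_q31` is a hypothetical name, `q31_t` is taken to be `int32_t`, and the scale factor is assumed to be 2147483648 (2^31). Rounding is left out.

```c
#include <stdint.h>

/* Hypothetical reference helper (not a library function).
 * Scales each sample by 2^31 and saturates to the Q31 range.
 * NaN inputs are not handled. */
void ref_float_to_q31(const float *src, int32_t *dst, uint32_t block_size)
{
    for (uint32_t i = 0; i < block_size; i++) {
        float scaled = src[i] * 2147483648.0f;   /* 2^31 is exact in f32 */

        /* Compare in the float domain: casting an out-of-range float
         * to int32_t would be undefined behavior in C. */
        if (scaled >= 2147483648.0f) {
            dst[i] = INT32_MAX;                  /* 0x7FFFFFFF */
        } else if (scaled <= -2147483648.0f) {
            dst[i] = INT32_MIN;                  /* 0x80000000 */
        } else {
            dst[i] = (int32_t)scaled;            /* truncates toward zero */
        }
    }
}
```

Note that a 32-bit float carries only 24 significant bits, so a Q31 result produced this way has at most 24 significant bits; the remaining low-order bits are zero.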

void riscv_float_to_q7(const float32_t *pSrc, q7_t *pDst, uint32_t blockSize)

Converts the elements of the floating-point vector to Q7 vector.

Details

The equation used for the conversion process is:

    pDst[n] = (q7_t)(pSrc[n] * 128);   0 <= n < blockSize.

Scaling and Overflow Behavior

The function uses saturating arithmetic. Results outside of the allowable Q7 range [0x80 0x7F] are saturated.

Note

In order to apply rounding, the library should be rebuilt with the ROUNDING macro defined in the preprocessor section of project options.

Parameters
  • pSrc[in] points to the floating-point input vector

  • pDst[out] points to the Q7 output vector

  • blockSize[in] number of samples in each vector

Returns

none
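With only 8 bits, the Q7 conversion makes the quantization step (1/128) and the saturation limits easy to see. The sketch below is illustrative only: `ref_float_to_q7` is a hypothetical name, `q7_t` is taken to be `int8_t`, the scale factor is assumed to be 128 (2^7), and rounding is left out.

```c
#include <stdint.h>

/* Hypothetical reference helper (not a library function).
 * Scales each sample by 2^7 and saturates to the Q7 range [-128, 127].
 * NaN inputs are not handled. */
void ref_float_to_q7(const float *src, int8_t *dst, uint32_t block_size)
{
    for (uint32_t i = 0; i < block_size; i++) {
        float scaled = src[i] * 128.0f;

        if (scaled > 127.0f) {
            dst[i] = INT8_MAX;            /* 0x7F */
        } else if (scaled < -128.0f) {
            dst[i] = INT8_MIN;            /* 0x80 */
        } else {
            dst[i] = (int8_t)scaled;      /* truncates toward zero */
        }
    }
}
```

In this sketch 0.5f becomes 64, 0.99f becomes 126 (126.72 truncated), 1.0f saturates to 127, and -3.0f saturates to -128.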