APyFloat¶
- class apytypes.APyFloat¶
Floating-point scalar with configurable format.
The implementation is a generalization of the IEEE 754 standard, meaning that features like subnormals, infinities, and NaN are still supported. The format is defined by the number of exponent and mantissa bits, and a non-negative bias. These fields are named exp_bits, man_bits, and bias, respectively. Similar to the hardware representation of a floating-point number, the value is stored using three fields: a sign bit sign, a biased exponent exp, and an integral mantissa man with a hidden one. The value of a normal number is thus
\[(-1)^{\texttt{sign}} \times 2^{\texttt{exp} - \texttt{bias}} \times (1 + \texttt{man} \times 2^{\texttt{-man_bits}}).\]
In general, if the bias is not explicitly given for a format, APyFloat will default to an IEEE-like bias using the formula
\[\texttt{bias} = 2^{\texttt{exp_bits - 1}} - 1.\]
Arithmetic can be performed similarly to the operations of the built-in type float in Python. The result of an operation has the same word length as the input operands and is obtained by quantizing to the nearest number with ties to even (QuantizationMode.TIES_EVEN). If the operands do not share the same format, the resulting bit widths of the exponent and mantissa fields will be the maximum of those of the inputs.
- Attributes:
bias
Exponent bias.
bits
Total number of bits.
exp
Exponent bits with bias.
exp_bits
Number of exponent bits.
is_finite
True if and only if value is zero, subnormal, or normal.
is_inf
True if and only if value is infinite.
is_nan
True if and only if value is NaN.
is_normal
True if and only if value is normal (not zero, subnormal, infinite, or NaN).
is_subnormal
True if and only if value is subnormal.
is_zero
True if and only if value is zero.
man
Mantissa bits.
man_bits
Number of mantissa bits.
sign
Sign bit.
true_exp
Exponent value.
true_man
Mantissa value.
true_sign
Sign value.
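The bit-pattern attributes listed above (is_zero, is_subnormal, is_normal, is_inf, is_nan, is_finite) can be illustrated with a small pure-Python sketch. It does not use apytypes; the function names classify and is_finite are hypothetical, and it assumes the usual IEEE 754 encoding convention (all-zero exponent for zero and subnormals, all-one exponent for infinities and NaN), which the text implies but does not spell out.

```python
def classify(exp: int, man: int, exp_bits: int) -> str:
    """Classify a value from its stored fields (hypothetical helper).

    Assumes the IEEE 754 convention: exp == 0 encodes zero/subnormals and
    the all-ones exponent encodes infinities/NaN.
    """
    max_exp = (1 << exp_bits) - 1
    if exp == 0:
        return "zero" if man == 0 else "subnormal"
    if exp == max_exp:
        return "inf" if man == 0 else "nan"
    return "normal"


def is_finite(exp: int, man: int, exp_bits: int) -> bool:
    """True if and only if the value is zero, subnormal, or normal."""
    return classify(exp, man, exp_bits) in {"zero", "subnormal", "normal"}


# Stored fields of -1.5 in a format with exp_bits=5: a normal number.
print(classify(exp=15, man=2, exp_bits=5))
# All-zero exponent with a non-zero mantissa: a subnormal number.
print(classify(exp=0, man=1, exp_bits=5))
```

The sign bit plays no role in the classification, so it is left out of the sketch.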
Methods
Change format of the floating-point number.
Cast to bfloat16 format.
Cast to IEEE 754 binary64 (double-precision) format.
Cast to IEEE 754 binary16 (half-precision) format.
Cast to IEEE 754 binary32 (single-precision) format.
Test if two APyFloat objects are identical.
Get the largest floating-point number in the same format that compares less.
Get the smallest floating-point number in the same format that compares greater.
Get the bit-representation of an APyFloat.
Examples
>>> from apytypes import APyFloat
>>> a = APyFloat.from_float(1.25, exp_bits=5, man_bits=2)
>>> b = APyFloat.from_float(1.75, exp_bits=5, man_bits=2)
Operands with the same format, result will have exp_bits=5, man_bits=2
>>> a + b
APyFloat(sign=0, exp=16, man=2, exp_bits=5, man_bits=2)
>>> d = APyFloat.from_float(1.75, exp_bits=4, man_bits=4)
Operands with different formats, result will have exp_bits=5, man_bits=4
>>> a + d
APyFloat(sign=0, exp=16, man=8, exp_bits=5, man_bits=4)
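The stored fields printed in these examples can be checked by hand against the value formula for normal numbers. The following is a minimal pure-Python sketch that does not use apytypes; the helper names default_bias and normal_value are hypothetical, and only normal numbers are handled.

```python
from fractions import Fraction


def default_bias(exp_bits: int) -> int:
    """IEEE-like bias, 2**(exp_bits - 1) - 1 (hypothetical helper)."""
    return 2 ** (exp_bits - 1) - 1


def normal_value(sign, exp, man, exp_bits, man_bits, bias=None) -> Fraction:
    """Exact value of a normal number from its stored fields.

    Evaluates (-1)**sign * 2**(exp - bias) * (1 + man * 2**-man_bits).
    Zero, subnormals, infinities, and NaN are not handled in this sketch.
    """
    if bias is None:
        bias = default_bias(exp_bits)
    significand = 1 + Fraction(man, 2 ** man_bits)
    return (-1) ** sign * Fraction(2) ** (exp - bias) * significand


# a + b: APyFloat(sign=0, exp=16, man=2, exp_bits=5, man_bits=2)
print(normal_value(0, 16, 2, exp_bits=5, man_bits=2))  # 3
# a + d: APyFloat(sign=0, exp=16, man=8, exp_bits=5, man_bits=4)
print(normal_value(0, 16, 8, exp_bits=5, man_bits=4))  # 3
```

Both results decode to 3, which matches 1.25 + 1.75; the two formats differ only in how many mantissa bits are available to hold the fraction.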
If the operands of an arithmetic operation have IEEE-like biases, then the result will also have an IEEE-like bias – based on the resulting number of exponent bits. To support operations with biases deviating from the standard, the bias of the resulting format is calculated as the “average” of the inputs’ biases:
\[\texttt{bias}_3 = \frac{\left ( \left (\texttt{bias}_1 + 1 \right ) / 2^{\texttt{exp_bits}_1} + \left (\texttt{bias}_2 + 1 \right ) / 2^{\texttt{exp_bits}_2} \right ) \times 2^{\texttt{exp_bits}_3}}{2} - 1,\]
where \(\texttt{exp_bits}_1\) and \(\texttt{exp_bits}_2\) are the exponent bit widths of the operands, \(\texttt{bias}_1\) and \(\texttt{bias}_2\) are the input biases, and \(\texttt{exp_bits}_3\) is the target exponent bit width. Note that this formula still results in an IEEE-like bias when the inputs use IEEE-like biases.
Constructor¶
- __init__(self, sign: int, exp: int, man: int, exp_bits: int, man_bits: int, bias: int | None = None) None ¶
Create an APyFloat object.
- Parameters:
  - sign (bool or int): The sign of the float. False/0 means positive. True/non-zero means negative.
  - exp (int): Exponent of the float as stored, i.e., actual value + bias.
  - man (int): Mantissa of the float as stored, i.e., without a hidden one.
  - exp_bits (int): Number of exponent bits.
  - man_bits (int): Number of mantissa bits.
  - bias (int, optional): Exponent bias. If not provided, bias is 2**(exp_bits - 1) - 1.
- Returns:
Creation from other types¶
- from_float(value: object, exp_bits: int, man_bits: int, bias: int | None = None) APyFloat ¶
Create an APyFloat object from an int, float, APyFixed, or APyFloat.
The quantization mode used is QuantizationMode.TIES_EVEN.
- Parameters:
- Returns:
Examples
>>> from apytypes import APyFloat
a, initialized from floating-point values.
>>> a = APyFloat.from_float(1.35, exp_bits=10, man_bits=15)
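To illustrate what quantizing with ties to even means for the mantissa, here is a pure-Python sketch that rounds a Python float to a given number of mantissa bits. It does not use apytypes and is not the library's implementation: the function name quantize_ties_even is hypothetical, and the sketch assumes an unbounded exponent range, so overflow and subnormals are ignored.

```python
import math


def quantize_ties_even(x: float, man_bits: int) -> float:
    """Round x to man_bits stored mantissa bits, ties to even (sketch).

    Assumes an unbounded exponent range; overflow and subnormals are ignored.
    """
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    frac, exp = math.frexp(x)  # x == frac * 2**exp with 0.5 <= |frac| < 1
    steps = man_bits + 1       # hidden one plus man_bits stored bits
    # Scaling by a power of two is exact; round() on a float rounds halves to even.
    scaled = round(frac * 2 ** steps)
    return math.ldexp(scaled, exp - steps)


# With man_bits=2 the representable values near 1 are 1.0, 1.25, 1.5, 1.75, 2.0.
print(quantize_ties_even(1.3125, 2))  # 1.25 (nearest)
print(quantize_ties_even(1.375, 2))   # 1.5  (tie, mantissa 2 is even)
print(quantize_ties_even(1.125, 2))   # 1.0  (tie, mantissa 0 is even)
```

In a tie, the candidate whose stored mantissa is even is chosen, which is why 1.375 rounds up while 1.125 rounds down.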
- from_bits(bits: int, exp_bits: int, man_bits: int, bias: int | None = None) APyFloat ¶
Create an APyFloat object from a bit-representation.
- Parameters:
- Returns:
Examples
>>> from apytypes import APyFloat
a, initialized to -1.5 from a bit pattern.
>>> a = APyFloat.from_bits(0b1_01111_10, exp_bits=5, man_bits=2)
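The bit pattern in this example is laid out as sign, exponent, mantissa from the most significant bit down. A pure-Python sketch of that split is shown below; it does not use apytypes, and the helper name split_fields is hypothetical.

```python
def split_fields(bits: int, exp_bits: int, man_bits: int):
    """Split a pattern laid out as [sign | exp | man] into its three fields."""
    man = bits & ((1 << man_bits) - 1)
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    sign = (bits >> (man_bits + exp_bits)) & 1
    return sign, exp, man


print(split_fields(0b1_01111_10, exp_bits=5, man_bits=2))  # (1, 15, 2)
```

With the default bias of 15 for exp_bits=5, these fields decode to -1 × 2^(15 - 15) × (1 + 2/4) = -1.5.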
Change word length¶
- cast(self, exp_bits: int | None = None, man_bits: int | None = None, bias: int | None = None, quantization: QuantizationMode | None = None) APyFloat ¶
Change format of the floating-point number.
This is the primary method for performing quantization when dealing with APyTypes floating-point numbers.
- Parameters:
  - exp_bits (int, optional): Number of exponent bits in the result.
  - man_bits (int, optional): Number of mantissa bits in the result.
  - bias (int, optional): Exponent bias. If not provided, bias is 2**(exp_bits - 1) - 1.
  - quantization (QuantizationMode, optional): Quantization mode to use in this cast. If None, use the global quantization mode.
- Returns:
Get bit representation¶
- to_bits(self) int ¶
Get the bit-representation of an APyFloat.
- Returns:
Examples
>>> from apytypes import APyFloat
a, initialized to -1.5 from a bit pattern.
>>> a = APyFloat.from_bits(0b1_01111_10, exp_bits=5, man_bits=2)
>>> a
APyFloat(sign=1, exp=15, man=2, exp_bits=5, man_bits=2)
>>> a.to_bits() == 0b1_01111_10
True
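The round trip in this example can be reproduced without the library by packing the three stored fields into one integer. The sketch below is pure Python; the helper name pack_fields is hypothetical, and the range checks are an addition for illustration.

```python
def pack_fields(sign: int, exp: int, man: int, exp_bits: int, man_bits: int) -> int:
    """Pack stored fields into a pattern laid out as [sign | exp | man]."""
    if not 0 <= exp < (1 << exp_bits):
        raise ValueError("exp does not fit in exp_bits")
    if not 0 <= man < (1 << man_bits):
        raise ValueError("man does not fit in man_bits")
    sign_bit = 1 if sign else 0
    return (sign_bit << (exp_bits + man_bits)) | (exp << man_bits) | man


# Fields shown in the example: sign=1, exp=15, man=2 with exp_bits=5, man_bits=2.
print(bin(pack_fields(1, 15, 2, exp_bits=5, man_bits=2)))  # 0b10111110
```

The printed value equals 0b1_01111_10, the pattern passed to from_bits above.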
Comparison¶
Convenience methods¶
Casting¶
- cast_to_bfloat16(self, quantization: QuantizationMode | None = None) APyFloat ¶
Cast to bfloat16 format.
Convenience method corresponding to f.cast(exp_bits=8, man_bits=7).
- Parameters:
  - quantization (QuantizationMode, optional): Quantization mode to use. If not provided, the global mode, see get_float_quantization_mode(), is used.
- cast_to_double(self, quantization: QuantizationMode | None = None) APyFloat ¶
Cast to IEEE 754 binary64 (double-precision) format.
Convenience method corresponding to f.cast(exp_bits=11, man_bits=52).
- Parameters:
  - quantization (QuantizationMode, optional): Quantization mode to use. If not provided, the global mode, see get_float_quantization_mode(), is used.
- cast_to_half(self, quantization: QuantizationMode | None = None) APyFloat ¶
Cast to IEEE 754 binary16 (half-precision) format.
Convenience method corresponding to f.cast(exp_bits=5, man_bits=10).
- Parameters:
  - quantization (QuantizationMode, optional): Quantization mode to use. If not provided, the global mode, see get_float_quantization_mode(), is used.
- cast_to_single(self, quantization: QuantizationMode | None = None) APyFloat ¶
Cast to IEEE 754 binary32 (single-precision) format.
Convenience method corresponding to f.cast(exp_bits=8, man_bits=23).
- Parameters:
  - quantization (QuantizationMode, optional): Quantization mode to use. If not provided, the global mode, see get_float_quantization_mode(), is used.
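The four convenience casts differ only in the exp_bits and man_bits they pass to cast. The pure-Python sketch below collects those pairs and derives two further quantities. It does not use apytypes; the names FORMATS and describe are hypothetical, the total width is assumed to be one sign bit plus the exponent and mantissa bits, and the bias is assumed to be the IEEE-like default.

```python
# (exp_bits, man_bits) pairs taken from the convenience methods above.
FORMATS = {
    "bfloat16": (8, 7),
    "half": (5, 10),
    "single": (8, 23),
    "double": (11, 52),
}


def describe(name: str) -> dict:
    """Summarize a named format (hypothetical helper)."""
    exp_bits, man_bits = FORMATS[name]
    return {
        "exp_bits": exp_bits,
        "man_bits": man_bits,
        # Assumption: one sign bit plus the exponent and mantissa fields.
        "bits": 1 + exp_bits + man_bits,
        # Assumption: IEEE-like default bias, 2**(exp_bits - 1) - 1.
        "bias": 2 ** (exp_bits - 1) - 1,
    }


for name in FORMATS:
    print(name, describe(name))
```

Note that bfloat16 and single share the same exponent width, so casting between them changes only the mantissa precision.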
Calculations¶
Properties¶
Word length¶
Values¶
- property man¶
Mantissa bits.
These are without a possible hidden one.
- Returns:
- property true_exp¶
Exponent value.
The bias value is subtracted and exponent adjusted in case of a subnormal number.
- Returns:
Bit pattern information¶
- property is_normal¶
True if and only if value is normal (not zero, subnormal, infinite, or NaN).
- Returns: