GFloat Basics

This notebook shows the use of decode_float to explore properties of some float formats.

# Import packages
from pandas import DataFrame
import numpy as np

from gfloat import decode_float
from gfloat.formats import *

List all the values in a format

The first example shows how to list all values in a given format. We will choose the OCP E5M2 format.

The object format_info_ocp_e5m2 comes from the gfloat.formats module and describes the characteristics of that format:

format_info_ocp_e5m2
FormatInfo(name='ocp_e5m2', k=8, precision=3, emax=15, has_nz=True, has_infs=True, num_high_nans=3, has_subnormals=True, is_signed=True, is_twos_complement=False)

We use the format to decode every code point from 0 to 255 and gather the results in a pandas DataFrame. decode_float returns more than just the value: it also splits out the exponent, significand, and sign, and returns the FloatClass, which distinguishes normal and subnormal numbers, as well as zero, infinity, and NaN.

fmt = format_info_ocp_e5m2
vals = [decode_float(fmt, i) for i in range(256)]
DataFrame(vals).set_index("ival")
ival  fval           exp  expval  significand  fsignificand  signbit  fclass
0     0.000000e+00   0    -14     0            0.00          0        FloatClass.ZERO
1     1.525879e-05   0    -14     1            0.25          0        FloatClass.SUBNORMAL
2     3.051758e-05   0    -14     2            0.50          0        FloatClass.SUBNORMAL
3     4.577637e-05   0    -14     3            0.75          0        FloatClass.SUBNORMAL
4     6.103516e-05   1    -14     0            1.00          0        FloatClass.NORMAL
...   ...            ...  ...     ...          ...           ...      ...
251   -5.734400e+04  30   15      3            1.75          1        FloatClass.NORMAL
252   -inf           31   16      0            1.00          1        FloatClass.INFINITE
253   NaN            31   16      1            1.25          1        FloatClass.NAN
254   NaN            31   16      2            1.50          1        FloatClass.NAN
255   NaN            31   16      3            1.75          1        FloatClass.NAN

256 rows × 7 columns
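
To see where these columns come from, the decoding can be sketched by hand in plain Python. This is only an illustration, assuming the usual E5M2 layout (1 sign bit, 5 exponent bits, 2 significand bits, bias 15); decode_e5m2 is a hypothetical helper, not part of gfloat.

```python
import math


def decode_e5m2(code: int) -> float:
    """Decode an 8-bit E5M2 code point by hand (hypothetical helper, not gfloat API)."""
    sign = -1.0 if (code >> 7) & 1 else 1.0
    exp = (code >> 2) & 0x1F  # 5-bit biased exponent field
    mant = code & 0x3  # 2-bit trailing significand
    bias = 15
    if exp == 0x1F:  # all-ones exponent: infinity or NaN
        return sign * math.inf if mant == 0 else math.nan
    if exp == 0:  # zero and subnormals: no implicit leading 1
        return sign * (mant / 4) * 2.0 ** (1 - bias)
    return sign * (1 + mant / 4) * 2.0 ** (exp - bias)


# Spot-check a few code points against the fval column above
for code in (0, 1, 4, 251, 252, 255):
    print(code, decode_e5m2(code))
```

The subnormal branch reuses the exponent of the smallest normal number (1 - bias = -14), which is why rows 0 to 4 all show expval = -14.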

Additional format info: special values, min, max, dynamic range

FormatInfo also exposes other characteristics of each format. The following cell reproduces part of Tables 1 and 2 of the OCP spec, with format_info_p3109(3) added as a third column for comparison:

def compute_dynamic_range(fi):
    return np.log2(fi.max / fi.smallest)


for prop, probe in (
    ("Max exponent (emax)    ", lambda fi: fi.emax),
    ("Exponent bias          ", lambda fi: fi.expBias),
    ("Infinities             ", lambda fi: 2 * int(fi.has_infs)),
    ("Number of NaNs         ", lambda fi: fi.num_nans),
    ("Number of zeros        ", lambda fi: int(fi.has_zero) + int(fi.has_nz)),
    ("Max normal number      ", lambda fi: fi.max),
    ("Min normal number      ", lambda fi: fi.smallest_normal),
    ("Min subnormal number   ", lambda fi: fi.smallest_subnormal),
    ("Dynamic range (binades)", lambda x: round(compute_dynamic_range(x))),
):
    print(
        f"{prop} {probe(format_info_ocp_e4m3):<20} {probe(format_info_ocp_e5m2):<20}  {probe(format_info_p3109(3))}"
    )
Max exponent (emax)     8                    15                    15
Exponent bias           7                    15                    16
Infinities              0                    2                     2
Number of NaNs          2                    6                     1
Number of zeros         2                    2                     1
Max normal number       448.0                57344.0               49152.0
Min normal number       0.015625             6.103515625e-05       3.0517578125e-05
Min subnormal number    0.001953125          1.52587890625e-05     7.62939453125e-06
Dynamic range (binades) 18                   32                    33
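
The last row can be checked without gfloat. The sketch below recomputes it from the max and min subnormal values printed above; the dictionary of hard-coded values is an assumption copied from that output, not something read from FormatInfo.

```python
import math

# (max, smallest positive value), copied from the table above
formats = {
    "ocp_e4m3": (448.0, 2.0 ** -9),  # 2**-9 == 0.001953125
    "ocp_e5m2": (57344.0, 2.0 ** -16),  # 2**-16 == 1.52587890625e-05
}

# Dynamic range in binades is log2 of the ratio largest / smallest
dynamic_range = {name: math.log2(hi / lo) for name, (hi, lo) in formats.items()}

for name, dr in dynamic_range.items():
    print(f"{name}: {dr:.3f} binades, rounds to {round(dr)}")
```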

How do subnormals affect dynamic range?

Most, if not all, low-precision formats include subnormal numbers, as they increase the number of values near zero, and increase dynamic range. A natural question is “by how much?”. To answer this, we can create a mythical new format, a copy of e4m3, but with has_subnormals set to False.

import copy

e4m3_no_subnormals = copy.copy(format_info_ocp_e4m3)
e4m3_no_subnormals.has_subnormals = False

And now compute the dynamic range with and without:

dr_with = compute_dynamic_range(format_info_ocp_e4m3)
dr_without = compute_dynamic_range(e4m3_no_subnormals)

print(f"Dynamic range with subnormals = {dr_with}")
print(f"Dynamic range without subnormals = {dr_without}")
print(f"Ratio = {2**(dr_with - dr_without):.1f}")
Dynamic range with subnormals = 17.807354922057606
Dynamic range without subnormals = 15.637429920615292
Ratio = 4.5
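
The ratio of 4.5 can be reproduced by hand. The sketch below is an inference from the printed numbers, not taken from the gfloat source: it assumes that, without subnormals, the exponent-0 codes decode as normal numbers at 2**-7 while code point 0 still represents zero, so the smallest positive value is 1.125 * 2**-7.

```python
import math

max_e4m3 = 448.0
smallest_with = 2.0 ** -9  # smallest subnormal, from the table above

# ASSUMPTION (inferred from the output above, not from the gfloat source):
# without subnormals, exponent field 0 encodes normals at 2**-7, and the
# 1.0 * 2**-7 slot is still used for zero, leaving 1.125 * 2**-7 as smallest.
smallest_without = 1.125 * 2.0 ** -7

dr_with = math.log2(max_e4m3 / smallest_with)
dr_without = math.log2(max_e4m3 / smallest_without)
ratio = smallest_without / smallest_with

print(f"with = {dr_with}, without = {dr_without}, ratio = {ratio}")
```

Under that assumption, the ratio is the factor by which subnormals shrink the smallest representable positive value, which corresponds to a little over two extra binades of dynamic range.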