Entropy

Entropy Manual

Overview

There are two types of entropy codecs: Finite State Entropy (FSE) and Huffman. Both exploit statistical redundancy to compress data, and both begin by building a probability table for the symbols in the input.

Huffman coding is the optimal binary prefix code when each symbol is encoded separately. However, this model requires every symbol to be represented by a whole number of bits. If one character occurs 90% of the time, Huffman must still assign it 1 bit, whereas the theoretical ideal is about 0.15 bits, so compression efficiency suffers. FSE breaks this "1 bit per symbol" floor and offers a better compression ratio when integer bit lengths are a poor fit, at the cost of slightly slower compression.
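The gap described above can be checked numerically. The following is a minimal, standalone sketch in plain Python; it does not use OpenZL, and the helper names (`shannon_bits_per_symbol`, `huffman_code_lengths`, `huffman_bits_per_symbol`) are made up for illustration. It builds a textbook Huffman code for a 90/10 two-symbol input and compares its cost with the Shannon entropy, which is the bound that a fractional-bit coder such as FSE can approach.

```python
import heapq
import math
from collections import Counter


def shannon_bits_per_symbol(counts):
    """Order-0 Shannon entropy of the histogram, in bits per symbol."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a textbook Huffman code."""
    if len(counts) == 1:
        # A lone symbol still needs at least one bit per occurrence.
        return {sym: 1 for sym in counts}
    # Heap entries: (subtree weight, tiebreak, symbols in the subtree).
    heap = [(c, i, [s]) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths


def huffman_bits_per_symbol(counts):
    """Average Huffman code length, weighted by symbol frequency."""
    total = sum(counts.values())
    lengths = huffman_code_lengths(counts)
    return sum(counts[s] * lengths[s] for s in counts) / total


# One symbol occurs 90% of the time, the other 10%.
data = b"a" * 90 + b"b" * 10
counts = Counter(data)

ideal_bits_for_a = -math.log2(0.9)          # about 0.152 bits
entropy = shannon_bits_per_symbol(counts)   # about 0.469 bits per symbol
huffman = huffman_bits_per_symbol(counts)   # exactly 1.0 bit per symbol

print(f"ideal cost of the frequent symbol: {ideal_bits_for_a:.3f} bits")
print(f"entropy bound: {entropy:.3f} bits/symbol, Huffman: {huffman:.3f} bits/symbol")
```

On this input the Huffman code spends roughly twice the entropy bound. That difference is the room a fractional-bit coder has to improve on; how closely a real FSE implementation approaches the bound depends on its table size and header overhead, which this sketch does not model.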

Inputs

A single serial input

Outputs

A single serial output

Use Cases

In practice, entropy codecs are expected to run as the final stage of compression, after appropriate transforms have captured the known structure of the data. They are a core ingredient of compression, eliminating the remaining statistical redundancy. Typically, these codecs are not used directly unless you are building a new LZ compressor; they are, however, used in Field LZ and Zstd compression. When used as the entropy coder for an LZ compressor, the associated graphs for these codecs can be useful as a final stage after match finding.
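To see why entropy coding belongs after structure-capturing transforms, note that an order-0 model looks only at how often each symbol occurs, not at where it occurs. The sketch below is plain Python with a hypothetical helper, `order0_entropy_bits`, and is not OpenZL code. It estimates the order-0 cost of two inputs that share a histogram but differ in ordering.

```python
import math
from collections import Counter


def order0_entropy_bits(data: bytes) -> float:
    """Total order-0 entropy of the input in bits (ignores symbol order)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c * math.log2(c / total) for c in counts.values())


alternating = b"ab" * 8          # strictly periodic
grouped = b"a" * 8 + b"b" * 8    # two long runs

# Same histogram, so an order-0 coder sees the same cost for both inputs.
print(order0_entropy_bits(alternating))  # 16.0
print(order0_entropy_bits(grouped))      # 16.0
```

Both inputs are highly regular, yet the order-0 estimate is one bit per byte in each case. Exploiting the repetition or the runs is the job of an earlier stage, such as match finding in an LZ compressor, with the entropy codec then compressing whatever that stage emits.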

openzl.ext.graphs.Entropy

Bases: Graph

Compress the input using an order-0 entropy compressor

Inputs: input: TypeMask.Serial | TypeMask.Struct | TypeMask.Numeric

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
class Entropy(Graph):
    """
    Compress the input using an order-0 entropy compressor

    Inputs:
    input: TypeMask.Serial | TypeMask.Struct | TypeMask.Numeric
    """

    def __init__(self) -> None: ...

    def __call__(self, arg: ext.Compressor, /) -> ext.GraphID: ...

    def parameterize(self, compressor: ext.Compressor) -> ext.GraphID: ...

    def set_destination(self, edge: ext.Edge) -> None: ...

    def set_multi_input_destination(self, edges: Sequence[ext.Edge]) -> None: ...

    @property
    def base_graph(self) -> ext.GraphID: ...

base_graph property

__call__(arg)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def __call__(self, arg: ext.Compressor, /) -> ext.GraphID: ...

__init__()

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def __init__(self) -> None: ...

parameterize(compressor)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def parameterize(self, compressor: ext.Compressor) -> ext.GraphID: ...

set_destination(edge)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def set_destination(self, edge: ext.Edge) -> None: ...

set_multi_input_destination(edges)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def set_multi_input_destination(self, edges: Sequence[ext.Edge]) -> None: ...

openzl.ext.graphs.Huffman

Bases: Graph

Compress the input using Huffman

Inputs: input: TypeMask.Serial | TypeMask.Struct | TypeMask.Numeric

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
class Huffman(Graph):
    """
    Compress the input using Huffman

    Inputs:
    input: TypeMask.Serial | TypeMask.Struct | TypeMask.Numeric
    """

    def __init__(self) -> None: ...

    def __call__(self, arg: ext.Compressor, /) -> ext.GraphID: ...

    def parameterize(self, compressor: ext.Compressor) -> ext.GraphID: ...

    def set_destination(self, edge: ext.Edge) -> None: ...

    def set_multi_input_destination(self, edges: Sequence[ext.Edge]) -> None: ...

    @property
    def base_graph(self) -> ext.GraphID: ...

base_graph property

__call__(arg)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def __call__(self, arg: ext.Compressor, /) -> ext.GraphID: ...

__init__()

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def __init__(self) -> None: ...

parameterize(compressor)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def parameterize(self, compressor: ext.Compressor) -> ext.GraphID: ...

set_destination(edge)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def set_destination(self, edge: ext.Edge) -> None: ...

set_multi_input_destination(edges)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def set_multi_input_destination(self, edges: Sequence[ext.Edge]) -> None: ...

openzl.ext.graphs.Fse

Bases: Graph

Compress the input using FSE

Inputs: input: TypeMask.Serial

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
class Fse(Graph):
    """
    Compress the input using FSE

    Inputs:
    input: TypeMask.Serial
    """

    def __init__(self) -> None: ...

    def __call__(self, arg: ext.Compressor, /) -> ext.GraphID: ...

    def parameterize(self, compressor: ext.Compressor) -> ext.GraphID: ...

    def set_destination(self, edge: ext.Edge) -> None: ...

    def set_multi_input_destination(self, edges: Sequence[ext.Edge]) -> None: ...

    @property
    def base_graph(self) -> ext.GraphID: ...

base_graph property

__call__(arg)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def __call__(self, arg: ext.Compressor, /) -> ext.GraphID: ...

__init__()

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def __init__(self) -> None: ...

parameterize(compressor)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def parameterize(self, compressor: ext.Compressor) -> ext.GraphID: ...

set_destination(edge)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def set_destination(self, edge: ext.Edge) -> None: ...

set_multi_input_destination(edges)

Source code in build-openzl/py/site-packages/openzl/ext/graphs.pyi
def set_multi_input_destination(self, edges: Sequence[ext.Edge]) -> None: ...