tdigest-ch

A Python library for estimating quantiles in a stream, using the ClickHouse t-digest data structure.

The t-digest data structure is designed for computing accurate quantile estimates from streaming data. Because two t-digests can be merged, it is well suited to map-reduce settings.
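
Mergeability is what makes a map-reduce layout possible: each worker summarizes its own partition, and the summaries are then combined. A minimal sketch of that pattern follows. The helper names `summarize_partitions` and `merge_all` are hypothetical and not part of the library. The `digest_factory` argument stands for any constructor that accepts an iterable of floats and whose instances support `|`, which is how `TDigest` is documented in the API reference below.

```python
from functools import reduce
from operator import or_
from typing import Callable, Iterable, List, TypeVar

D = TypeVar("D")


def summarize_partitions(
    partitions: Iterable[Iterable[float]],
    digest_factory: Callable[[Iterable[float]], D],
) -> List[D]:
    """Map step (hypothetical helper): build one digest per partition."""
    return [digest_factory(partition) for partition in partitions]


def merge_all(digests: Iterable[D]) -> D:
    """Reduce step (hypothetical helper): fold digests together with `|`.

    Raises TypeError if `digests` is empty, because there is nothing to merge.
    """
    return reduce(or_, digests)


# Assumed usage, with the tdigest-ch package installed:
#   from tdigest_ch import TDigest
#   parts = [[1.0, 2.0, 3.0], [4.0, 5.0]]
#   merged = merge_all(summarize_partitions(parts, TDigest))
#   merged.quantile(0.5)
```

In a real deployment the map step would typically run on separate workers, with only the digests sent back for the reduce step.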

Repository

API reference

class tdigest_ch.TDigest(elems: Iterable[float] | TDigest | None = None)

T-digest data structure for approximating the quantiles of a distribution.

Examples:
>>> from tdigest_ch import TDigest
>>> digest = TDigest()
>>> # Add some elements.
>>> digest.add(1.0)
>>> digest.add(2.0)
>>> digest.add(3.0)
>>> # Get the median of the distribution.
>>> digest.quantile(0.5)
2.0
__ior__(other: TDigest) -> TDigest

Update the t-digest, adding elements from the other.

Examples:
>>> digest_1 = TDigest([1.0, 2.0, 3.0])
>>> digest_2 = TDigest([4.0, 5.0])
>>> digest_1 |= digest_2
>>> len(digest_1)
5
__len__() -> int

Return the number of elements in the t-digest.

Examples:
>>> digest = TDigest([1.0, 2.0, 3.0])
>>> len(digest)
3
>>> digest.add(3.0, count=2)
>>> len(digest)
5
__or__(other: TDigest) -> TDigest

Return a new t-digest with elements from the t-digest and the other.

Examples:
>>> digest_1 = TDigest([1.0, 2.0, 3.0])
>>> digest_2 = TDigest([4.0, 5.0])
>>> digest = digest_1 | digest_2
>>> len(digest)
5
>>> digest.quantile(0.5)
3.0
add(value: float, count: int = 1) -> None

Add a value to the t-digest. The optional count adds the value that many times (default 1).

Examples:
>>> digest = TDigest()
>>> digest.add(1.0)
>>> digest.add(2.0)
>>> len(digest)
2
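
When the input is already aggregated as value/count pairs (a histogram, for instance), the `count` parameter avoids calling `add` once per occurrence. The sketch below is illustrative only: `add_counts` is a hypothetical helper, and it relies solely on the `add(value, count=...)` signature documented above.

```python
from typing import Mapping


def add_counts(digest, counts: Mapping[float, int]) -> None:
    """Hypothetical helper: add each value with its multiplicity.

    `digest` is any object with an `add(value, count=...)` method,
    such as the TDigest class documented here.
    """
    for value, count in counts.items():
        # Skip entries with no occurrences rather than passing count=0.
        if count > 0:
            digest.add(value, count=count)


# Assumed usage, with the tdigest-ch package installed:
#   from tdigest_ch import TDigest
#   digest = TDigest()
#   add_counts(digest, {1.0: 2, 3.0: 1})
#   len(digest)
```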
clear() -> None

Clear the t-digest, removing all values.

Examples:
>>> digest = TDigest()
>>> digest.add(1.0)
>>> digest.clear()
>>> len(digest)
0
copy() -> TDigest

Return a copy of the t-digest.

static from_json(json: str | bytes) -> TDigest

Return a t-digest from a JSON representation.

quantile(level: float) -> float

Return the estimated quantile of the t-digest at the given level (for example, 0.5 for the median).

Examples:
>>> digest = TDigest([1.0, 2.0, 3.0, 4.0, 5.0])
>>> digest.quantile(0.5)
3.0
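
To judge how close an estimate is, it can help to compare it with an exact quantile computed from the raw values on a small sample. The following is a reference sketch using only the standard library. `exact_quantile` is a hypothetical helper, it assumes that `level` lies between 0 and 1, and it uses linear interpolation between order statistics. The library may interpolate differently internally, so small discrepancies would not necessarily indicate a problem.

```python
import math
from typing import Sequence


def exact_quantile(values: Sequence[float], level: float) -> float:
    """Exact quantile by linear interpolation between sorted values.

    This is a reference computation for comparison, not the library's
    algorithm. It keeps all values in memory, which a t-digest avoids.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be between 0 and 1")
    ordered = sorted(values)
    # Fractional index into the sorted values.
    position = level * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


print(exact_quantile([1.0, 2.0, 3.0, 4.0, 5.0], 0.5))  # 3.0
```

For the five-element input this gives the same median as the documented `quantile(0.5)` example above.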
to_json() -> bytes

Return a JSON representation of the t-digest.
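
Together with `from_json`, this allows a digest to be persisted or sent to another process. A minimal sketch of a file round trip follows. `save_digest` and `load_digest` are hypothetical helpers, and the only library behavior assumed is what the signatures here state: `to_json()` returns bytes and `from_json` accepts str or bytes.

```python
def save_digest(digest, path: str) -> None:
    """Hypothetical helper: write the digest's JSON bytes to a file."""
    # Binary mode, because to_json() is documented as returning bytes.
    with open(path, "wb") as handle:
        handle.write(digest.to_json())


def load_digest(path: str, digest_cls):
    """Hypothetical helper: rebuild a digest from a file.

    `digest_cls` is the class providing `from_json`, for example TDigest.
    """
    with open(path, "rb") as handle:
        return digest_cls.from_json(handle.read())


# Assumed usage, with the tdigest-ch package installed:
#   from tdigest_ch import TDigest
#   save_digest(TDigest([1.0, 2.0, 3.0]), "digest.json")
#   restored = load_digest("digest.json", TDigest)
```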

union(*others: Iterable[float] | TDigest) -> TDigest

Return a new t-digest with elements from the t-digest and all others.

Examples:
>>> digest_1 = TDigest([1.0, 2.0, 3.0])
>>> digest_2 = TDigest([4.0, 5.0])
>>> digest = digest_1.union(digest_2)
>>> len(digest)
5
>>> digest.quantile(0.5)
3.0
update(*others: Iterable[float] | TDigest) -> None

Update the t-digest, adding elements from all others.

Examples:
>>> digest = TDigest([1.0, 2.0, 3.0])
>>> digest.update([4.0, 5.0])
>>> len(digest)
5
>>> digest.quantile(0.5)
3.0