famura/binned-cdf

binned-cdf


A PyTorch-based distribution parametrized by the logits of CDF bins

Background

The Cumulative Distribution Function (CDF) is a fundamental concept in probability theory and statistics that describes the probability that a random variable $X$ takes on a value less than or equal to a given threshold $x$. Formally, the CDF is defined as $F(x) = P(X \leq x)$, where $F(x)$ ranges from 0 to 1 as $x$ varies from negative to positive infinity. The CDF provides a complete characterization of the probability distribution of a random variable: for continuous distributions, it is the integral of the probability density function (PDF), while for discrete distributions, it is the sum of probabilities up to and including $x$. Key properties of any CDF are monotonicity (it is non-decreasing) and the boundary conditions $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$. CDFs are particularly useful for computing probabilities of intervals, quantiles, and for statistical inference.
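These properties can be checked numerically for a standard normal distribution, for example with Python's standard-library `statistics.NormalDist` (a self-contained sketch, independent of this repository):

```python
from statistics import NormalDist

# Standard normal distribution: F(x) = P(X <= x)
std_normal = NormalDist(mu=0.0, sigma=1.0)

# By symmetry, F(0) = 0.5 for the standard normal.
print(std_normal.cdf(0.0))  # 0.5

# Monotonicity: F is non-decreasing in x.
xs = [-3.0, -1.0, 0.0, 1.0, 3.0]
values = [std_normal.cdf(x) for x in xs]
assert all(a <= b for a, b in zip(values, values[1:]))

# Boundary behavior: F(x) -> 0 as x -> -inf and F(x) -> 1 as x -> +inf.
print(std_normal.cdf(-10.0))  # close to 0
print(std_normal.cdf(10.0))  # close to 1

# Interval probabilities: P(a < X <= b) = F(b) - F(a).
p_interval = std_normal.cdf(1.0) - std_normal.cdf(-1.0)
print(round(p_interval, 3))  # ~0.683, the familiar one-sigma mass
```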

Application to Machine Learning

This repository uses the CDF to model and learn flexible probability distributions in machine learning tasks. By parameterizing the CDF with binned logits, it enables differentiable training and efficient sampling, making it suitable for uncertainty estimation, probabilistic prediction, and distributional modeling in neural networks.

Implementation

The PiecewiseConstantBinnedCDF and PiecewiseLinearBinnedCDF classes inherit directly from torch.distributions.Distribution, implementing all necessary methods plus some convenience functions. They support multi-dimensional batch shapes and CUDA devices. The bins can be initialized linearly or log-spaced.
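To illustrate the underlying idea, here is a simplified pure-Python sketch of a piecewise-linear CDF parametrized by bin logits, assuming linear (equal-width) bin spacing and softmax normalization. This is a minimal conceptual model, not the repository's actual implementation (which is vectorized in PyTorch and also supports log spacing and sigmoid normalization):

```python
import math


def softmax(logits):
    """Normalize logits into positive bin masses that sum to 1."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def piecewise_linear_cdf(x, logits, bound_low, bound_up):
    """Evaluate a piecewise-linear CDF defined by bin logits on [bound_low, bound_up].

    Each bin carries a probability mass (the softmax of its logit); the CDF rises
    linearly within each bin, which corresponds to a piecewise-constant PDF.
    """
    if x <= bound_low:
        return 0.0
    if x >= bound_up:
        return 1.0
    masses = softmax(logits)
    width = (bound_up - bound_low) / len(logits)  # linear (equal-width) spacing
    idx = int((x - bound_low) // width)  # index of the bin containing x
    frac = (x - bound_low - idx * width) / width  # relative position inside the bin
    return sum(masses[:idx]) + frac * masses[idx]
```

With uniform logits the sketch reduces to a uniform distribution on the support, e.g. `piecewise_linear_cdf(0.0, [0.0] * 4, -1.0, 1.0)` evaluates to 0.5. Because the softmax masses are strictly positive, the resulting CDF is monotone by construction, which is what makes this parameterization trainable with unconstrained logits.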

torch>=2.7 is the only non-dev dependency of this repository.

Getting Started

I recommend using PiecewiseLinearBinnedCDF for most applications.

import torch

from binned_cdf import PiecewiseLinearBinnedCDF

logits = torch.randn(32)  # shape: (*batch_shape, num_bins)

distr = PiecewiseLinearBinnedCDF(
    logits=logits,
    bound_low=-5,  # adapt to your data
    bound_up=7,  # adapt to your data
    log_spacing=True,  # if False, linear spacing is used
    bin_normalization_method="sigmoid",  # "sigmoid" or "softmax"
)

# ... use it like any other torch.distributions.Distribution

👉 Please have a look at the documentation to get started.
