7 changes: 3 additions & 4 deletions README.md
@@ -11,7 +11,7 @@

Scientific computations are surrounded by various forms of uncertainty, requiring appropriate treatment to maximise the credibility of computations. Empirical information is often scarce, vague, conflicting and imprecise, requiring expressive uncertainty structures for trustful representation, aggregation and propagation.

This package is underpinned by a framework of ***uncertain number*** which allows for a closed computation ecosystem whereby trustworthy computations can be conducted in a rigorous manner. <ins>It provides capabilities across the typical uncertainty analysis pipeline, encompassing characterisation, aggregation, propagation, and applications including reliability analysis and optimisation under uncertainty, especially with a focus on imprecise probabilities.</ins>
This package is underpinned by a framework of ***uncertain number*** which allows for a closed computation ecosystem whereby trustworthy computations can be conducted in a rigorous manner. <ins>It provides capabilities across the typical uncertainty analysis pipeline, encompassing characterisation, aggregation, propagation, model updating, and applications including reliability analysis and optimisation under uncertainty, especially with a focus on imprecise probabilities.</ins>

> ***Uncertain Number*** refers to a generalised representation that unifies several uncertainty constructs including real numbers, intervals, probability distributions, interval bounds on probability distributions (i.e. [probability boxes](https://en.wikipedia.org/wiki/Probability_box)), and [finite Dempster–Shafer structures](https://en.wikipedia.org/wiki/Dempster–Shafer_theory). It is mostly suitable for managing mixed types of uncertainties.

Expand All @@ -22,7 +22,7 @@ This package is underpinned by a framework of ***uncertain number*** which allow
Explore the [documentation](https://pyuncertainnumber.readthedocs.io/en/latest/index.html) to get started, featuring hands-on [tutorials](https://pyuncertainnumber.readthedocs.io/en/latest/tutorials/index.html) and in-depth [examples](https://pyuncertainnumber.readthedocs.io/en/latest/examples/index.html) that showcase the power of the package.


>`pyuncertainnumber` exposes APIs at different levels. It features high-level APIs best suited for new users to quickly start with uncertainty computations with [*uncertain numbers*](https://pyuncertainnumber.readthedocs.io/en/latest/tutorials/what_is_un.html), and also low-level APIs allowing experts to have additional controls over mathematical constructs such as p-boxes, Dempster Shafer structures, probability distributions, etc.
>`pyuncertainnumber` [exposes APIs at different levels](https://pyuncertainnumber.readthedocs.io/en/latest/tutorials/getting_started.html). It features **high-level APIs** best suited for new users to quickly start uncertainty computations with [*uncertain numbers*](https://pyuncertainnumber.readthedocs.io/en/latest/tutorials/what_is_un.html), and also **low-level APIs** allowing experts additional control over mathematical constructs such as p-boxes, Dempster–Shafer structures, probability distributions, etc.


## Installation
@@ -47,8 +47,7 @@ pip install pyuncertainnumber

## UQ multiverse

UQ is a big world (like Marvel multiverse) consisting of abundant theories and software implementations on multiple platforms. We focus mainly on the imprecise probability frameworks. Some notable examples include [OpenCossan](https://github.com/cossan-working-group/OpenCossan), [UQlab](https://www.uqlab.com/) in Matlab and [UncertaintyQuantification.jl](https://github.com/FriesischScott/UncertaintyQuantification.jl), [ProbabilityBoundsAnalysis.jl](https://github.com/AnderGray/ProbabilityBoundsAnalysis.jl) in Julia, and many others of course.
`PyUncertainNumber` is rooted in Python and has close ties with the Python scientific computing ecosystem, it builds upon and greatly extends a few pioneering projects, such as [intervals](https://github.com/marcodeangelis/intervals), [scipy-stats](https://docs.scipy.org/doc/scipy/tutorial/stats.html) and [pba-for-python](https://github.com/Institute-for-Risk-and-Uncertainty/pba-for-python) to generalise probability and interval arithmetic. Beyond arithmetic, `PyUncertainNumber` has offered a wide spectrum of algorithms and methods for uncertainty characterisation, propagation, surrogate modelling, and optimisation under uncertainty, allowing imprecise uncertainty analysis in both intrusive and non-intrusive manner. `PyUncertainNumber` is under active development and will continue to be dedicated to support imprecise analysis in engineering using Python.
UQ is a big world (like the Marvel multiverse) consisting of abundant theories and software implementations on multiple platforms. Some notable examples include [OpenCossan](https://github.com/cossan-working-group/OpenCossan) and [UQlab](https://www.uqlab.com/) in Matlab and [ProbabilityBoundsAnalysis.jl](https://github.com/AnderGray/ProbabilityBoundsAnalysis.jl) in Julia, among many others. We focus mainly on imprecise probability frameworks. `PyUncertainNumber` is rooted in Python and has close ties with the Python scientific computing ecosystem: it builds upon and greatly extends a few pioneering projects, such as [intervals](https://github.com/marcodeangelis/intervals), [scipy-stats](https://docs.scipy.org/doc/scipy/tutorial/stats.html) and [pba-for-python](https://github.com/Institute-for-Risk-and-Uncertainty/pba-for-python), to generalise probability and interval arithmetic. Beyond arithmetic, `PyUncertainNumber` offers a wide spectrum of algorithms and methods for uncertainty characterisation, propagation, surrogate modelling, and optimisation under uncertainty, allowing imprecise uncertainty analysis in both intrusive and non-intrusive manners. `PyUncertainNumber` is under active development and will remain dedicated to supporting imprecise analysis in engineering using Python.


## Citation
Binary file added docs/source/_static/distribution_expert.png
Binary file added docs/source/_static/pbox_array.png
Binary file added docs/source/_static/propagation_flowchart.png
Binary file added docs/source/_static/uc_diagram_smaller.png
14 changes: 0 additions & 14 deletions docs/source/examples/calibration/2dof_tmcmc_demo.ipynb
@@ -160,30 +160,16 @@
" # standard deviations (5% noise on true eigenvalues λ1=0.382, λ2=2.618)\n",
" sig1 = 0.0191 # 0.05 * 0.382\n",
" sig2 = 0.1309 # 0.05 * 2.618\n",
"\n",
" # compute eigenvalues λ1(q1, q2) and λ2(q1, q2) for the 2×2 system\n",
" # characteristic equation: λ^2 - (q1 + 2 q2) λ + q1 q2 = 0\n",
" # closed-form solution:\n",
" # λ₁,₂ = (q1/2 + q2) ∓ sqrt((q1/2 + q2)^2 - q1 q2)\n",
" center = q1 / 2.0 + q2\n",
" disc = center**2 - q1 * q2\n",
" if disc < 0:\n",
" # if the discriminant is negative, eigenvalues are complex -> impossible here physically\n",
" # give a very low likelihood\n",
" return -np.inf\n",
"\n",
" sqrt_disc = np.sqrt(disc)\n",
" lambda1_s = center - sqrt_disc\n",
" lambda2_s = center + sqrt_disc\n",
"\n",
" # Gaussian likelihood for 5 measurements of λ1 and 5 of λ2\n",
" # p(d | θ) ∝ exp( -1/(2σ1²) Σ (λ1_s - d1_m)² - 1/(2σ2²) Σ (λ2_s - d2_m)² )\n",
" # log p(d | θ) = const - 0.5/σ1² Σ (λ1_s - d1_m)² - 0.5/σ2² Σ (λ2_s - d2_m)²\n",
"\n",
" # constant term (same form as in your Case 3 implementation)\n",
" const_term = np.log((2 * np.pi * sig1 * sig2) ** -5)\n",
"\n",
" # misfit terms\n",
" misfit1 = -0.5 * (sig1**-2) * np.sum((lambda1_s - data1) ** 2)\n",
" misfit2 = -0.5 * (sig2**-2) * np.sum((lambda2_s - data2) ** 2)\n",
"\n",
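For readers following the notebook diff above, the closed-form eigenvalue log-likelihood that this cell computes can be sketched as a self-contained function. This is a reconstruction from the surrounding cell, with an illustrative function name and the deleted explanatory comments folded back in:

```python
import numpy as np

def log_likelihood(q1, q2, data1, data2):
    # 5% noise on the true eigenvalues lambda1 = 0.382, lambda2 = 2.618
    sig1, sig2 = 0.0191, 0.1309
    # eigenvalues of the 2x2 system solve the characteristic equation
    #   lambda^2 - (q1 + 2*q2)*lambda + q1*q2 = 0
    center = q1 / 2.0 + q2
    disc = center**2 - q1 * q2
    if disc < 0:
        # complex eigenvalues are physically impossible here
        return -np.inf
    lam1 = center - np.sqrt(disc)
    lam2 = center + np.sqrt(disc)
    # Gaussian log-likelihood for len(data1) measurements of each eigenvalue
    const = np.log((2 * np.pi * sig1 * sig2) ** -len(data1))
    misfit1 = -0.5 * sig1**-2 * np.sum((np.asarray(data1) - lam1) ** 2)
    misfit2 = -0.5 * sig2**-2 * np.sum((np.asarray(data2) - lam2) ** 2)
    return const + misfit1 + misfit2
```

With `q1 = q2 = 1` the eigenvalues are `(3 ∓ √5)/2 ≈ 0.382` and `2.618`, so noise-free data at exactly those values maximises the likelihood.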
@@ -23,7 +23,7 @@
"metadata": {},
"source": [
"<figure style=\"text-align: center;\">\n",
" <img src=\"../_static/free_pbox_constraint_long_form.png\" width=\"1000\">\n",
" <img src=\"../../_static/free_pbox_constraint_long_form.png\" width=\"1000\">\n",
" <figcaption>Empirical knowledge serving as constraints in characterising an uncertain number </figcaption>\n",
"</figure>"
]
@@ -37,7 +37,7 @@
"metadata": {},
"source": [
"<figure style=\"text-align: center;\">\n",
" <img src=\"../_static/dependency_illustration.png\" width=\"1000\">\n",
" <img src=\"../../_static/dependency_illustration.png\" width=\"1000\">\n",
" <figcaption>Dependency structures </figcaption>\n",
"</figure>"
]
@@ -147,7 +147,7 @@
"\n",
"\n",
"<figure style=\"text-align: center;\">\n",
" <img src=\"../_static/function_hint.png\" width=\"1000\">\n",
" <img src=\"../../_static/function_hint.png\" width=\"1000\">\n",
"</figure>\n",
"\n",
"Mathematically, under ordinary arithmetic, we can write in a few equivalent manners:\n",
72 changes: 52 additions & 20 deletions docs/source/guides/uc.md
@@ -1,22 +1,20 @@
# Uncertainty characterisation

### variability and incertitude
## Variability and incertitude

Modern risk analysts distinguish between variability and incertitude. Variability (also called randomness, aleatory uncertainty, or irreducible uncertainty) arises from natural stochasticity, environmental or structural variation across space or time, manufacturing heterogeneity among components or individuals. Incertitude, also called ignorance, epistemic uncertainty, subjective uncertainty or reducible uncertainty, arises from incompleteness of knowledge. Sources of incertitude include measurement uncertainty, small sample sizes, and data censoring, ignorance about the details of physical mechanisms and processes.
Modern risk analysts distinguish between variability and incertitude. ***Variability*** (also called randomness, aleatory uncertainty, or irreducible uncertainty) arises from natural stochasticity, environmental or structural variation across space or time, and manufacturing heterogeneity among components or individuals. ***Incertitude***, also called ignorance, epistemic uncertainty, subjective uncertainty or reducible uncertainty, arises from incompleteness of knowledge. Sources of incertitude include measurement uncertainty, small sample sizes, data censoring, and ignorance about the details of physical mechanisms and processes.

For an engineering analysis, the **challenge** lies in formulating suitable uncertainty models given available information, **without introducing unwarranted assumptions**. However, the available information is often vague, ambiguous, or qualitive. Available data are frequently limited and of poor quality, giving rise to challenges in eliciting precise probabilistic specifications. Solutions to this problem are discussed in the literature, under the framework of imprecise probability, from various perspectives using different mathematical concepts, including for example random sets, evidence theory, fuzzy stochastic concepts, info-gap theory, and probability bounds analysis.
For an engineering analysis, the **challenge** lies in formulating suitable uncertainty models given available information, **without introducing unwarranted assumptions**. However, the available information is often vague, ambiguous, or qualitative. Available data are frequently limited and of poor quality, giving rise to challenges in eliciting precise probabilistic specifications. Solutions to this problem are discussed in the literature, under the framework of imprecise probability, from various perspectives using different mathematical concepts, including for example random sets, evidence theory, fuzzy stochastic concepts, info-gap theory, and probability bounds analysis.

```{tip}
It is suggested to use interval analysis for propagating ignorance and the methods of probability theory for propagating variability.
```

```{seealso}
See also the [propagation](./up.md) guide.
```
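As a minimal illustration of the first half of this tip, interval arithmetic propagates ignorance by tracking worst-case bounds. This toy class is for exposition only; `PyUncertainNumber` builds on the `intervals` project for its actual interval type:

```python
class Interval:
    """Toy closed interval [lo, hi] for illustrating ignorance propagation."""

    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # the sum of the bounds is the tightest enclosure of x + y
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # the product's extremes occur at endpoint combinations
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"
```

For example, `Interval(1, 2) + Interval(-1, 3)` encloses every possible sum as `[0, 5]`, with no distributional assumption about where the true values lie.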

### bounding distributional parameters

The mean of a normal distribution may be elicited from an expert but this expert cannot be precise to a certain value but rather give a range based on past experience.
## Bounding distributional parameters

The mean of a normal distribution may be elicited from an expert, but the expert may be unable to commit to a precise value and instead gives a range based on past experience.

````{tab} verbose
To comprehensively characterise a pbox, specify the bounds for the parameters along with many other ancillary fields.
Expand All @@ -42,21 +40,39 @@ un = pun.norm([0,12],[1,4])
```
````

```{tip}
The shortcut may not work for some distribution families at the moment, as an internal restructure is underway. Use the canonical verbose constructor for best compatibility.
```

### aggregation of multiple sources of information
````{tab} pba API
For low-level control and customisation:

Expert elicitation has been a challenging topic, especially when knowledge is limited and measurements are sparse. Multiple experts may not necessarily agree on the choice of elicited prbability distributions, which leads to the need for aggregation. Below shows two situations for illustration.
```python
from pyuncertainnumber import pba
pbox = pba.normal([0,12],[1,4])
```
````

Assume the expert opinions are expressed in closed intervals. There may well be multiple such intervals from different experts and these collections of interval can be overlapping, partially contradictory or even completely contradictory. Their relative credibility may be expressed in probabilities. Essentially such information creates a **Dempster-Shafer structure**. On the basis of a mixture operation, such information can be aggregated into a **pbox**.

```{tip}
The different sub-types of uncertain number can normally be converted to one another (though not necessarily one-to-one), hence the uncertain number is said to be a unified representation.
```

Pbox arithmetic also extends the convolution of probability distributions which has typically been done with the independence assumption. However, often in engineering modelling practices independence is assumed for mathematical easiness rather than warranted. Fortunately, the uncertainty about the dependency between random variables can be characterised by the probability bounds, as seen below. It should be noted that such dependency bound does not imply independence.
```{seealso}
See also the tutorial [What is an uncertain number](https://pyuncertainnumber.readthedocs.io/en/latest/tutorials/what_is_un.html) to get started.
```


## Aggregation of multiple sources of information

Expert elicitation has been a challenging topic, especially when knowledge is limited and measurements are sparse. Multiple experts may not necessarily agree on the choice of elicited probability distributions, which leads to the need for aggregation. Two situations are shown below for illustration.

Assume the expert opinions are expressed in closed intervals. There may well be multiple such intervals from different experts and these collections of intervals can be overlapping, partially contradictory or even completely contradictory. Their relative credibility may be expressed in probabilities. Essentially such information creates a **Dempster-Shafer structure**. On the basis of a mixture operation, such information can be aggregated into a **p-box**.

```{seealso}
See also the tutorial [uncertainty aggregation](https://pyuncertainnumber.readthedocs.io/en/latest/tutorials/uncertainty_aggregation.html) to get started.
```
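A back-of-the-envelope sketch of how such a Dempster-Shafer structure bounds the CDF. This is illustrative only (the function name and setup are not the package's aggregation API): the belief and plausibility of the event `X <= x` give the lower and upper bounding CDFs of the resulting p-box.

```python
def cdf_bounds(focal_intervals, masses, x):
    """Belief/plausibility bounds on P(X <= x) for a finite DS structure.

    focal_intervals: list of (lo, hi) expert intervals
    masses: their relative credibilities, summing to 1
    """
    # belief: mass of intervals certainly at or below x
    lower = sum(m for (lo, hi), m in zip(focal_intervals, masses) if hi <= x)
    # plausibility: mass of intervals possibly at or below x
    upper = sum(m for (lo, hi), m in zip(focal_intervals, masses) if lo <= x)
    return lower, upper
```

Sweeping `x` over the real line traces out the left and right bounding CDFs of the aggregated p-box.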

## Inter-variable dependence

P-box arithmetic also extends the convolution of probability distributions, which has typically been done under the independence assumption. However, in engineering modelling practice independence is often assumed for mathematical convenience rather than because it is warranted. Fortunately, the uncertainty about the dependency between random variables can be characterised by probability bounds, as seen below. It should be noted that such dependency bounds do not presume independence.

```{image} ../../../assets/addition_bound.png
:alt: sum of two random variables without dependency specification
Expand All @@ -66,8 +82,13 @@ Pbox arithmetic also extends the convolution of probability distributions which

The sum of two random variables of lognormal distribution without dependency specification
```
```{seealso}
See also the tutorial [dependency structure](https://pyuncertainnumber.readthedocs.io/en/latest/examples/characterisation/example_dependency_dev_purpose.html) to get started.
```


### known statistical properties

## Known statistical properties

When the knowledge of a quantity is limited to the point where only some statistical information is available, such as the *min*, *max*, or *median*, but not the distribution family or its parameters, such partial information can serve as **constraints** to bound the underlying distribution:

Expand All @@ -78,9 +99,14 @@ When the knowledge of a quantity is limited to the point where only some statist
:align: center
```
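When only such summary statistics are available, simple probability inequalities already yield rigorous pointwise bounds on the CDF. As a sketch (not the package's constraint machinery), assume the *min* `a`, *max* `b`, and *mean* `mu` are known; Markov-type inequalities then bound `F(x) = P(X <= x)`:

```python
def cdf_bounds_minmaxmean(a, b, mu, x):
    """Pointwise bounds on F(x) = P(X <= x) for X in [a, b] with mean mu."""
    if x < a:
        return 0.0, 0.0
    if x >= b:
        return 1.0, 1.0
    # Markov's inequality applied to (b - X): F(x) <= (b - mu) / (b - x)
    upper = min(1.0, (b - mu) / (b - x))
    # Markov's inequality applied to (X - a): F(x) >= 1 - (mu - a) / (x - a)
    lower = max(0.0, 1.0 - (mu - a) / (x - a)) if x > a else 0.0
    return lower, upper
```

These pointwise envelopes are exactly the kind of bounding structure a p-box formalises; adding more constraints (e.g. a known median) only tightens the bounds.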

### hedged numerical expression
```{seealso}
See also the tutorial [characterise as you go](https://pyuncertainnumber.readthedocs.io/en/latest/examples/characterisation/characterise_what_you_know.html) to get started.
```


Sometimes only purely qualitive information is available. An important part of processing elicited numerical inputs is an ability to quantitatively decode natural-language words, the linguistic information, that are commonly used to express or modify numerical values. Some example include ‘about’, ‘around’, ‘almost’, ‘exactly’, ‘nearly’, ‘below’, ‘at least’, ‘order of’, etc. A numerical expression with these approximators are called *hedges*. Extending upon the significant-digit convention, a series of interval interpretations of common hedged numerical expressions are proposed.
## Hedged numerical expression

Sometimes only purely qualitative information is available. An important part of processing elicited numerical inputs is an ability to quantitatively decode natural-language words, the linguistic information, that are commonly used to express or modify numerical values. Some examples include ‘about’, ‘around’, ‘almost’, ‘exactly’, ‘nearly’, ‘below’, ‘at least’, ‘order of’, etc. A numerical expression with these approximators are called *hedges*. Extending upon the significant-digit convention, a series of interval interpretations of common hedged numerical expressions are proposed.

```{image} ../../../assets/interval_hedge.png
:alt: interval hedges
Expand All @@ -107,15 +133,20 @@ pun.hedge_interpret('about 200', return_type='pbox').display()
hedged numerical expression "about 200"
```

### data uncertainty
```{seealso}
See also the tutorial [Interpret linguistic hedges](https://pyuncertainnumber.readthedocs.io/en/latest/examples/characterisation/linguistic_approximation.html) to get started.
```


## Data uncertainty

Measurement uncertainty is another main source of data uncertainty besides sampling uncertainty. Point estimates from samples vary from one sample to another. We typically use confidence intervals (as interval estimators) to account for this sampling uncertainty. As an example, `PyUncertainNumber` provides support for Kolmogorov–Smirnov (KS) confidence limits to infer confidence limits for the empirical cumulative distribution function.

```{seealso}
See also the [confidence box](../cbox.md) for a distributional estimator.
```
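A sketch of the underlying construction: the two-sided Dvoretzky-Kiefer-Wolfowitz band is one standard way to obtain KS-style confidence limits on an empirical CDF (not necessarily the package's exact implementation):

```python
import numpy as np

def ks_band(samples, alpha=0.05):
    """Empirical CDF with a two-sided (1 - alpha) DKW confidence band."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    ecdf = np.arange(1, n + 1) / n
    # DKW half-width: P(sup |F_n - F| > eps) <= 2 exp(-2 n eps^2)
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    lower = np.clip(ecdf - eps, 0.0, 1.0)
    upper = np.clip(ecdf + eps, 0.0, 1.0)
    return x, lower, upper
```

The band width shrinks at rate `1/sqrt(n)`, so the limits tighten around the empirical CDF as more data become available.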

As to measurement uncertainty, `Intervals` turn out to be a natural means of mathematical construct for imprecise data due to the common understanding of margin of error, which leads to the midpoint notation of an interval object. `PyUncertainNumber` provides an extension of the Kolmogorov–Smirnov confidence limits for interval-valued data as well. [](#KS-bounds-imprecise) shows such confidence limits for the skinny data.
As to measurement uncertainty, `Intervals` turn out to be a natural mathematical construct for imprecise data, owing to the common understanding of a margin of error, which leads to the midpoint notation of an interval object. `PyUncertainNumber` provides an extension of the Kolmogorov–Smirnov confidence limits for interval-valued data as well. The lower figure shows such confidence limits for interval-valued data.

```{figure} ../../../assets/ks_precise.png
:alt: Kolmogorov–Smirnov bounds for precise data
Expand All @@ -130,3 +161,4 @@ As to measurement uncertainty, `Intervals` turn out to be a natural means of mat
:align: center
:width: 400px
```
