Floating point bug in heterozygosity_expected resulting in nans

### Description

Hi, I think there's a float precision error in `heterozygosity_expected` caused by the comparison statement `af_sum < 1` [here](https://github.com/cggh/scikit-allel/blob/master/allel/stats/hw.py#L98).
Some values that are effectively 1 will return `True` as a result they get filled e.g.
`0.99999999...`.

### Example

```
import numpy as np
import allel

g = allel.GenotypeArray([
    [[0, 0], [0, 0], [0, 0]],
    [[0, 0], [0, 1], [1, 1]],
    [[0, 1], [2, 3], [4, 5]]
])

af = g.count_alleles().to_frequencies()

allel.heterozygosity_expected(af, ploidy=2)
```
returns
```
array([0. , 0.5, nan])
```

The expected result is 
```
array([0.        , 0.5       , 0.83333333])
```

The third value is nan-filled because
```
af_sum = np.sum(af, axis=1)
print(af_sum[0], af_sum[1], af_sum[2])
```
`1.0 1.0 0.9999999999999999`


This actually occurs in the polyploid test case but the same check is used in the reference implementation in the test suit.


### Fix

I've got a PR ready to fix it by rounding to a suitable precision based on the dtype.

```
precision = np.finfo(af_sum.dtype).precision
af_sum = np.round(np.sum(af, axis=1), decimals=precision)
print(af_sum[0], af_sum[1], af_sum[2])
```
`1.0 1.0 1.0`

Though I'm not sure what case  `af_sum < 1` is actually checking for? 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Floating point bug in heterozygosity_expected resulting in nans #274

Description

Example

Fix

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Floating point bug in heterozygosity_expected resulting in nans #274

Description

Description

Example

Fix

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions