IntCastingNaNError after outer join with int64 column

**Describe the issue**:
An outer join involving an int64 column keeps the column as int64, even though the join may have introduced NaNs.
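
For context, here is a minimal pandas-only sketch of why an int64 column cannot hold the missing values that such a join introduces. It only illustrates the dtype rule; it is not the code path dask takes internally, and the variable names are made up for the illustration.

```python
import pandas as pd
from pandas.errors import IntCastingNaNError

left = pd.DataFrame({"a": [1, 2]})
right = pd.DataFrame({"b": [1]})

# The left join has no match for the second row, so "b" gets a NaN there
# and pandas upcasts the column from int64 to float64.
joined = left.join(right, how="left")
print(joined["b"].dtype)

# Forcing the column back to int64 is what fails: NaN has no int64 representation.
raised = False
try:
    joined["b"].astype("int64")
except IntCastingNaNError as exc:
    raised = True
    print(type(exc).__name__, exc)
```

If the declared dtype of `b` stays int64 while the data actually contains NaN, any later step that casts the data to the declared dtype would presumably hit this same error.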


**Minimal Complete Verifiable Example**:

```python
from dask.distributed import Client
from dask.distributed import LocalCluster
import dask.dataframe as dd

cluster = LocalCluster()
client = Client(cluster)
client.cluster.scale(1)


dask_df = dd.from_dict(
    {
        "a": [1, 2],
    },
    npartitions=1,
)
dask_df2 = dd.from_dict(
    {
        "b": [1],
    },
    npartitions=1,
)
print(dask_df2.b.dtype)  # int64
df = dask_df.join(dask_df2, how="left")
print(df.b.dtype)  # int64; this would previously be a float
df.shuffle(on="a").compute()  # raises IntCastingNaNError
```

**Anything else we need to know?**:

In a previous version (dask 2022.5.2; I don't know in which version the change first occurred), this same code would output a float column and therefore not throw an error.

Is this expected behaviour?

The problem can of course be avoided by first manually casting `b` to "Int64" (the pandas nullable integer dtype) or "float".

**Environment**:

- Dask version:
    - dask 2023.9.1
    - dask-bigquery 2023.5.1
    - dask-glm 0.2.0
    - dask-kubernetes 2023.9.0
    - dask_labextension 7.0.0
    - dask-ml 2023.3.24
- Python version: Python 3.10.11
- Operating System: Linux
- Install method: conda

