Skip to content

aerobulk-python is SLOWWW #32

@jbusecke

Description

@jbusecke

I am currently struggling with what I perceive as very slow performance with aerobulk-python.

For some of the work we are doing over at https://github.com/ocean-transport/scale-aware-air-sea, it seems that a single timestep of CM2.6 data takes about 45-50s to execute.

This does not seem to be a particularity of the data wer are using, since I am getting similar times for synthetic data:

from typing import Dict, Tuple
import time
import numpy as np
import pytest
import xarray as xr
from aerobulk import noskin

"""Tests for the xarray wrapper"""


def create_data(
    shape: Tuple[int, ...],
    chunks: Dict[str, int] = {},
    skin_correction: bool = False,
    order: str = "F",
):
    def _arr(value, chunks, order):
        arr = xr.DataArray(np.full(shape, value, order=order))

        # adds random noise scaled by a percentage of the value
        randomize_factor = 0.001
        randomize_range = value * randomize_factor
        arr = arr + np.random.rand(*shape) + randomize_range
        if chunks:
            arr = arr.chunk(chunks)
        return arr

    sst = _arr(290.0, chunks, order)
    t_zt = _arr(280.0, chunks, order)
    hum_zt = _arr(0.001, chunks, order)
    u_zu = _arr(1.0, chunks, order)
    v_zu = _arr(-1.0, chunks, order)
    slp = _arr(101000.0, chunks, order)
    rad_sw = _arr(0.000001, chunks, order)
    rad_lw = _arr(350, chunks, order)
    if skin_correction:
        return sst, t_zt, hum_zt, u_zu, v_zu, rad_sw, rad_lw, slp
    else:
        return sst, t_zt, hum_zt, u_zu, v_zu, slp
    
data = create_data((3600, 2700, 2), chunks=None)

When I time this execution

tic = time.time()
out_data = noskin(*data)
toc = time.time() - tic

I get a runtime of ~100sec, so about 50s per timestep.

Am I expecting too much of the fortran code? Or is this slow for computing on ~1e7 data points.

I wonder if there is some obvious issue with our compiler flags?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions