Discussion: efficient SHM array interop between Julia and Python #13

@csvance

Hello, this isn't an issue so much as something I figured would help other users who want to use Julia and Python together over shared memory without converting between Fortran- and C-style array ordering. I searched the documentation, issues, and source code here, as well as Julia Discourse, but couldn't find exactly what I was looking for. In the project I'm working on, this reduced our RPC overhead to the point where the Julia RPC backend sped up the entire application by ~30% over the baseline. I'm relatively new to Julia, so there may be a better way to handle some of these things.

To avoid converting between Fortran- and C-style arrays, these helper functions let Julia load arrays of either style with zero copies when N <= 2. I'm not sure how to handle N > 2 efficiently, since permutedims allocates a copy rather than returning a view; a lazy wrapper such as Base's PermutedDimsArray may be the way to go, but I haven't needed it. For my use case N is always 2.

WrappedFArray(shm::SharedMemory, ::Type{T}, shape) where {T} = WrappedArray(shm, T, shape...)
function WrappedCArray(shm::SharedMemory, ::Type{T}, shape) where {T} 
    if length(shape) == 1
        # dense array with one axis always has stride of 1, so order doesn't matter
        return WrappedFArray(shm, T, shape)
    elseif length(shape) == 2
        # transpose gives us a view which is handled nicely by many linear algebra routines
        return transpose(WrappedArray(shm, T, reverse(shape)...))
    else
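        # permutedims allocates a copy :( a lazy wrapper such as Base's
        # PermutedDimsArray might avoid that, but it is untested here
        return permutedims(WrappedArray(shm, T, reverse(shape)...), reverse(1:length(shape)))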
    end
end
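
The reversed-shape trick above can be checked from the NumPy side. The sketch below is only an illustration: the helper name `as_c_view` is hypothetical, and a plain `bytearray` stands in for the shared memory segment. It reads one buffer as a column-major array with the shape reversed and then transposes it, which should give the same view as reading the buffer in C order, without copying, for any number of dimensions.

```python
import numpy as np

def as_c_view(buf, dtype, shape):
    """Hypothetical helper: view `buf` as a C-ordered array of `shape` by
    reading it column-major with the shape reversed, then transposing."""
    f_view = np.ndarray(shape=tuple(reversed(shape)), dtype=dtype,
                        buffer=buf, order="F")
    return f_view.T  # .T reverses all axes and returns a view, not a copy

# A bytearray stands in for the shared memory buffer (assumption for the demo).
shape = (2, 3, 4)
buf = bytearray(np.arange(24, dtype=np.int64).tobytes())

c_direct = np.ndarray(shape=shape, dtype=np.int64, buffer=buf, order="C")
c_via_f = as_c_view(buf, np.int64, shape)

same_values = bool(np.array_equal(c_direct, c_via_f))
same_memory = bool(np.shares_memory(c_direct, c_via_f))
same_strides = c_direct.strides == c_via_f.strides
```

NumPy's `.T` reverses every axis lazily, which is why this works for N > 2 as well; on the Julia side the closest lazy counterpart in Base appears to be `PermutedDimsArray`.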

Keep in mind that if you use Cartesian indexing, your Julia algorithms may want to adapt automatically to the array's memory layout. With Fortran/Julia-style (column-major) arrays you usually want the first index to vary fastest, i.e. iterate j in the outer loop and i in the inner loop rather than the other way around. Here is how I handled it for grayscale images using ResumableFunctions.jl:

using ResumableFunctions

@resumable function RowMajorIterator2D(grid_axes)
    for i in first(grid_axes[1]):last(grid_axes[1])
        for j in first(grid_axes[2]):last(grid_axes[2])
            @yield (i, j)
        end
    end
end


@resumable function ColMajorIterator2D(grid_axes)
    for j in first(grid_axes[2]):last(grid_axes[2])
        for i in first(grid_axes[1]):last(grid_axes[1])
            @yield (i, j)
        end
    end
end


function MajorIterator2D(img, grid_axes)
    if stride(img, 1) > stride(img, 2)
        return RowMajorIterator2D(grid_axes)
    else
        return ColMajorIterator2D(grid_axes)
    end
end
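
For anyone who needs the same stride-based dispatch on the Python side, here is a rough NumPy counterpart. It is a sketch, not part of the Julia code above: the name `major_indices_2d` is made up, and it walks the whole array rather than a sub-grid of axes. The axis with the smaller stride goes in the innermost loop.

```python
import numpy as np

def major_indices_2d(img):
    """Yield (i, j) index pairs of a 2-D array in memory order.

    Hypothetical counterpart of MajorIterator2D: the axis with the smaller
    stride varies fastest (innermost loop).
    """
    rows, cols = img.shape
    if img.strides[0] > img.strides[1]:
        # Row-major (C order): j is the fast axis.
        for i in range(rows):
            for j in range(cols):
                yield (i, j)
    else:
        # Column-major (Fortran/Julia order): i is the fast axis.
        for j in range(cols):
            for i in range(rows):
                yield (i, j)

c_img = np.zeros((2, 3), order="C")
f_img = np.zeros((2, 3), order="F")

c_order = list(major_indices_2d(c_img))
f_order = list(major_indices_2d(f_img))
```

With these inputs, `c_order` starts `(0, 0), (0, 1), (0, 2)` while `f_order` starts `(0, 0), (1, 0), (0, 1)`, so each traversal touches memory sequentially.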

On the Python side you can easily load arrays from Julia with zero copies via multiprocessing.shared_memory.SharedMemory:

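import numpy as np
from multiprocessing import shared_memory

def shm_array(name: str, shape, dtype, order='F'):
    # Attach to an existing segment by name; the array wraps its buffer
    # directly, so no data is copied.
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape=shape, dtype=dtype, buffer=shm.buf, order=order)
    # Return the handle too: keep it referenced while the array is in use
    # and call shm.close() once you are done with it.
    return arr, shm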
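
To sanity-check the zero-copy claim without a Julia process, the sketch below lets one Python process play both roles. That is an assumption for the demo only: in the real setup the producer would be Julia, and the shape and dtype would have to be agreed on out of band. It creates a segment, fills it column-major, attaches to it again by name, and checks that a write through one mapping is visible through the other.

```python
import numpy as np
from multiprocessing import shared_memory

shape, dtype = (3, 4), np.float64
nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize

# Producer side (standing in for Julia): create a segment, fill it column-major.
producer = shared_memory.SharedMemory(create=True, size=nbytes)
src = np.ndarray(shape, dtype=dtype, buffer=producer.buf, order="F")
src[:] = np.arange(12, dtype=dtype).reshape(shape)

# Consumer side: attach by name and wrap the same bytes without copying.
consumer = shared_memory.SharedMemory(name=producer.name)
view = np.ndarray(shape, dtype=dtype, buffer=consumer.buf, order="F")

values_match = bool(np.array_equal(view, src))
view[0, 0] = -1.0                         # a write through one mapping...
write_visible = bool(src[0, 0] == -1.0)   # ...shows up through the other

# Drop the arrays before closing the handles, then remove the segment.
del src, view
consumer.close()
producer.close()
producer.unlink()
```

The arrays are deleted before `close()` because the handle cannot release its buffer while NumPy still holds a view onto it.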

Hopefully this is helpful to people who are looking to deploy Julia together with Python/C style arrays in production.
