@jakirkham
Member

Enables Cythonization of serialize. This should speed up functions like extract_serialize, which have already been annotated for this purpose.

Note: I have sometimes encountered ImportErrors locally when doing this. Also, we've only done the work to annotate extract_serialize and nothing else, which is why this is disabled. We may end up optimizing Scheduler communication to bypass this function ( #4376 ), so this may not be needed in the end.

cc @quasiben @mrocklin @madsbk

We're no longer able to inspect the cythonized functions
@mrocklin
Member

I tried playing around with this briefly. Made a small change. Still not working sadly :/

@jakirkham
Member Author

Are you seeing an ImportError or something else?

@mrocklin
Member

Unable to serialize bytes. The serializers aren't working correctly. I may have screwed up the context bit that I added. I added that, by the way, because inspect no longer works on the cythonized functions.
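
For readers following along, here is a minimal sketch of the kind of workaround described above: deciding whether a serializer should be called with a context keyword when signature introspection may be unavailable. The names accepts_context, call_dumps, dumps_plain, and dumps_with_context are hypothetical and not taken from distributed. It also assumes that inspect.signature raises TypeError or ValueError for a compiled callable it cannot introspect; the actual change in this PR may have looked different.

```python
import inspect


def accepts_context(func):
    """Return True if ``func`` appears to accept a ``context`` keyword.

    Hypothetical helper for illustration; not part of distributed.
    """
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Assumption: introspection can fail for compiled callables.
        # Fall back to the conservative answer and omit the keyword.
        return False
    return "context" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


def dumps_plain(x):
    # Example serializer that takes no context.
    return repr(x)


def dumps_with_context(x, context=None):
    # Example serializer that takes an optional context.
    return repr(x), context


def call_dumps(dumps, x, context=None):
    """Pass ``context`` only to serializers that appear to accept it."""
    if accepts_context(dumps):
        return dumps(x, context=context)
    return dumps(x)
```

The fallback trades precision for safety: a compiled serializer that does accept context would simply be called without it.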

@jakirkham
Member Author

Yeah maybe Cythonizing this whole file is overkill. Let me try something

@mrocklin
Member

Traceback (most recent call last):
  File "/home/mrocklin/miniconda/lib/python3.8/site-packages/distributed/batched.py", line 93, in _background_send
    nbytes = yield self.comm.write(
  File "/home/mrocklin/workspace/tornado/tornado/gen.py", line 766, in run
    value = future.result()
  File "/home/mrocklin/miniconda/lib/python3.8/site-packages/distributed/comm/tcp.py", line 230, in write
    frames = await to_frames(
  File "/home/mrocklin/miniconda/lib/python3.8/site-packages/distributed/comm/utils.py", line 52, in to_frames
    return _to_frames()
  File "/home/mrocklin/miniconda/lib/python3.8/site-packages/distributed/comm/utils.py", line 32, in _to_frames
    protocol.dumps(
  File "/home/mrocklin/miniconda/lib/python3.8/site-packages/distributed/protocol/core.py", line 50, in dumps
    data = {
  File "/home/mrocklin/miniconda/lib/python3.8/site-packages/distributed/protocol/core.py", line 51, in <dictcomp>
    key: serialize(
  File "distributed/protocol/serialize.py", line 318, in distributed.protocol.serialize.serialize
TypeError: ('Could not serialize object of type bytes.', '

I tried moving the extract, Serialize, and Serialized classes to a new extract.py, but it's not behaving for me. It's entirely possible that I'm not clearing out some old files or something though.

@jakirkham
Member Author

Yeah, was thinking something similar, but actually lumping them into scheduler.py in case the ImportError turns up (though maybe that isn't the concern).

@jakirkham
Member Author

jakirkham commented Jan 22, 2021

We could also simplify the code further such that Serialize and Serialized don't need to move (like checking typ_v.__name__ instead). Just in case that is causing issues.

For example...

        elif typ_v.__name__ in {"Serialize", "Serialized"}:
            ser[path_k] = v
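
To make the name-based check concrete, here is a self-contained sketch, not the actual extract_serialize implementation. The Serialize and Serialized classes below are simplified stand-ins, and collect_marked is a hypothetical helper that walks nested dicts and lists and records each marked value under its key path.

```python
class Serialize:
    """Stand-in marker class (hypothetical; not distributed's real class)."""

    def __init__(self, data):
        self.data = data


class Serialized:
    """Stand-in marker class (hypothetical; not distributed's real class)."""

    def __init__(self, header, frames):
        self.header = header
        self.frames = frames


_MARKER_NAMES = {"Serialize", "Serialized"}


def collect_marked(x, ser, path=()):
    """Record every marked value found in nested dicts/lists in ``ser``."""
    if type(x) is dict:
        items = x.items()
    elif type(x) is list:
        items = enumerate(x)
    else:
        return
    for k, v in items:
        typ_v = type(v)
        path_k = path + (k,)
        if typ_v is dict or typ_v is list:
            # Recurse into nested containers.
            collect_marked(v, ser, path_k)
        elif typ_v.__name__ in _MARKER_NAMES:
            # Name check: no need to import the classes into this module.
            ser[path_k] = v


msg = {"op": "update", "data": Serialize(b"abc"), "args": [1, Serialized({}, [])]}
ser = {}
collect_marked(msg, ser)
```

One trade-off of comparing names rather than types: any class that happens to be called Serialize or Serialized would match, whichever module defines it.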

Base automatically changed from master to main March 8, 2021 19:04
@jakirkham
Member Author

Superseded by PR ( #4531 )

@jakirkham jakirkham closed this Mar 9, 2021
@jakirkham jakirkham deleted the bld_cy_ser branch March 9, 2021 19:11