Skip to content

Conversation

@madsbk
Copy link
Contributor

@madsbk madsbk commented Mar 2, 2021

This PR makes sure that memoryview given to Tornado's TCP stream have an item size of one. See tornadoweb/tornado#2996

Notice, since we are calling stream._write_buffer.append() directly, we still need this PR even when tornadoweb/tornado#2996 get merged.

  • Tests added / passed
  • Passes black distributed / flake8 distributed

madsbk added a commit to madsbk/distributed that referenced this pull request Mar 2, 2021
Now that we depend on dask#4555,
we don't need this hack.
@madsbk madsbk changed the title tcp.write(): cast memoryview to 1byte itemsize tcp.write(): cast memoryview to "B" Mar 2, 2021
madsbk added a commit to madsbk/distributed that referenced this pull request Mar 2, 2021
Now that we depend on dask#4555,
we don't need this hack.
@madsbk madsbk marked this pull request as ready for review March 2, 2021 13:21
@madsbk madsbk changed the title tcp.write(): cast memoryview to "B" [REVIEW] tcp.write(): cast memoryview to "B" Mar 2, 2021
@jakirkham
Copy link
Member

Currently this gets handled as part of serialization

frame = frame.cast("B")

Would it be possible to address this in the MsgPack serialization work ( #4531 )?

@madsbk
Copy link
Contributor Author

madsbk commented Mar 2, 2021

Would it be possible to address this in the MsgPack serialization work ( #4531 )?

Sure, do you think it is better to do this when serializing? My idea was to do it for any message for the sake of robustness.

@jakirkham
Copy link
Member

I think it makes sense to do in serialization as we may want to capture information about the shape and type in the message we send. That way if it is important to have when reconstructing on the other side, we can shape and cast the array as expected. Otherwise the receiving side may not incorporate that information

@madsbk
Copy link
Contributor Author

madsbk commented Mar 3, 2021

I think it makes sense to do in serialization as we may want to capture information about the shape and type in the message we send. That way if it is important to have when reconstructing on the other side, we can shape and cast the array as expected. Otherwise the receiving side may not incorporate that information

Yes, but I think we should let the specific serializers handle shape, type, etc. like the NumPy serializer does currently. The TCP just flattens the data to make sure that all the bytes are delivered to the peer correctly.

@madsbk madsbk force-pushed the tcp_cast_to_1byte_itemsize branch from 482c642 to a3c5794 Compare March 3, 2021 09:13
@jakirkham jakirkham merged commit 074bc48 into dask:master Mar 4, 2021
@jakirkham
Copy link
Member

Thanks Mads! 😄

@madsbk madsbk deleted the tcp_cast_to_1byte_itemsize branch March 4, 2021 09:19
jakirkham added a commit that referenced this pull request Mar 8, 2021
* tcp.write(): cast memoryview to 1byte itemsize

* dumps and loads now extract Serialize and Serialized

* test_numpy: now use the new dumps scheme

* Currently not handling bytes in dumps()

* Removing extract_serialize()

* test_comms: handle implicit list => tuple conversion by msgpack

* Workaround a buffer length bug

* Fixed some tuple/list mismatch in test_comms

* test_compression_takes_advantage_of_itemsize(): xfail

* to_serialize tests using bytearrays

* Removing "B" cast of memoryviews hack

Now that we depend on #4555,
we don't need this hack.

* TCP.write(): use nbytes() to determine if memoryview 0-size

Co-authored-by: John Kirkham <jakirkham@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants