Skip to content

Conversation

@Dandandan
Copy link
Contributor

@Dandandan Dandandan commented Feb 14, 2021

Same idea as #9493 but for examples in DataFusion

FYI @alamb

Clean + building with cargo test.

Moving the micro benchmarks out of the crate is also another possibility.

Old New
Nr dependencies 309 249
build time(s) 77 68

@github-actions
Copy link

@Dandandan Dandandan changed the title ARROW-11626: [Rust][DataFusion] Move datafusion examples to own project ARROW-11626: [Rust][DataFusion] Move [DataFusion] examples to own project Feb 14, 2021
@codecov-io
Copy link

codecov-io commented Feb 14, 2021

Codecov Report

Merging #9494 (edbea69) into master (bfa99d9) will decrease coverage by 0.07%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9494      +/-   ##
==========================================
- Coverage   82.49%   82.42%   -0.08%     
==========================================
  Files         245      245              
  Lines       57347    57646     +299     
==========================================
+ Hits        47311    47517     +206     
- Misses      10036    10129      +93     
Impacted Files Coverage Δ
rust/datafusion-examples/examples/flight_server.rs 0.00% <ø> (ø)
rust/datafusion-examples/examples/simple_udaf.rs 0.00% <ø> (ø)
rust/parquet/src/basic.rs 87.22% <0.00%> (-10.04%) ⬇️
rust/parquet/src/arrow/array_reader.rs 77.53% <0.00%> (-0.08%) ⬇️
rust/parquet/src/schema/parser.rs 90.20% <0.00%> (ø)
rust/parquet/src/file/footer.rs 96.26% <0.00%> (+0.03%) ⬆️
rust/parquet/src/file/writer.rs 95.32% <0.00%> (+0.21%) ⬆️
rust/parquet/src/schema/types.rs 89.79% <0.00%> (+0.25%) ⬆️
rust/parquet/src/file/metadata.rs 92.96% <0.00%> (+0.74%) ⬆️
rust/arrow/src/array/equal/utils.rs 76.47% <0.00%> (+0.98%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bfa99d9...edbea69. Read the comment docs.

@alamb
Copy link
Contributor

alamb commented Feb 15, 2021

This is a cool idea @Dandandan -- what do you think about doing this via features rather than its own crate? I can't really articulate clearly why I would prefer one way over the other. I will try and get my thoughts formed more coherently later today

@Dandandan
Copy link
Contributor Author

I have the feeling that using more features rather than moving the examples would be adding more knobs / complexity to the DataFusion crate. But I'm open for both options.

@alamb
Copy link
Contributor

alamb commented Mar 3, 2021

@Dandandan I think this idea is worth pursuing -- would you be willing to ressurect this PR?

@Dandandan
Copy link
Contributor Author

Dandandan commented Mar 4, 2021

PR is resurrected! (I think) @alamb

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. What do you think @andygrove and @jorgecarleitao ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @andygrove -- I am not sure what you think about possibly releasing the examples as a crate of their own. I personally think it would be necessary. However perhaps it would be the right way to to eventually to somehow link to the example code from the docs.rs page https://docs.rs/datafusion/3.0.0/datafusion/

I filed this JIRA https://issues.apache.org/jira/browse/ARROW-11863 to track that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lean on not publishing by default, so I'd prefer a publish = false on the Cargo file. I support clarifying this as part of ARROW-11863 @alamb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added publish = false.
The examples are currently also not on https://docs.rs/datafusion/3.0.0/datafusion/ I think - agree that it would be very useful.

Copy link
Contributor Author

@Dandandan Dandandan Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ok, outcome would be to point the docs from docs.rs always at master

another (better) option would be to link at a versioned url (based on published version)
https://github.com/apache/arrow/tree/apache-arrow-3.0.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@Dandandan Dandandan force-pushed the move_datafusion_examples branch from f7f6103 to edbea69 Compare March 7, 2021 21:43
@nevi-me
Copy link
Contributor

nevi-me commented Mar 22, 2021

@alamb @Dandandan are we still planning on completing this? The benefits are tangible enough as they reduce the number of required dependencies, and we'll catch any breakages with CI###

@Dandandan
Copy link
Contributor Author

@nevi-me yeah - that's the idea from my side at least. I will update the PR again so it could be merged if it has enough support

Copy link
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A change to not publish this, otherwise I think we can go ahead with this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd lean on not publishing by default, so I'd prefer a publish = false on the Cargo file. I support clarifying this as part of ARROW-11863 @alamb.

Copy link
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending CI

@alamb
Copy link
Contributor

alamb commented Mar 23, 2021

Looks like test failed with

error: Package `datafusion-examples v4.0.0-SNAPSHOT (/__w/arrow/arrow/rust/datafusion-examples)` does not have the feature `cli`

😞

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just need to clean up the CI and then this will be good to go. Thank you so much @Dandandan !

@alamb
Copy link
Contributor

alamb commented Mar 24, 2021

Oh no ! this PR seems like it needs a rebase

@alamb alamb added the needs-rebase A PR that needs to be rebased by the author label Mar 24, 2021
@alamb alamb force-pushed the move_datafusion_examples branch from bd26b02 to 19884c3 Compare March 28, 2021 16:47
@alamb
Copy link
Contributor

alamb commented Mar 28, 2021

I have rebased this PR against master and hopefully will get it merged once CI passes

@Dandandan
Copy link
Contributor Author

thanks a lot @alamb !

@alamb
Copy link
Contributor

alamb commented Mar 28, 2021

The integration test failure seems unrelated (out of space): https://github.com/apache/arrow/pull/9494/checks?check_run_id=2213195937


+ source_dir=/arrow/js
+ with_docs=/build
+ pushd /arrow/js
+ npm install
/arrow/js /
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm ERR! cb() never called!

@alamb
Copy link
Contributor

alamb commented Mar 28, 2021

I am going to merge this one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component: Rust - DataFusion Component: Rust needs-rebase A PR that needs to be rebased by the author

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants