[WIP] Vectorized Reads using Arrow #313

Closed
prodeezy wants to merge 15 commits into apache:vectorized-read from prodeezy:issue-9-support-arrow-based-reading-WIP

@prodeezy
Contributor

prodeezy commented Jul 25, 2019

Just a WIP POC right now, so it is not expected to be merged. This is just for comments.

Things I intend to add:

  • a separate module iceberg-arrow that will house the code for reading Parquet into Arrow
    Iceberg's Reader needs to choose row-wise or vectorized reading based on config (issue #311)
  • batch sizing as config (issue #312)
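
To make the two config items above concrete, here is a minimal sketch of what config-driven selection could look like. It is not code from this PR: the class name, the property keys, and the default batch size are all hypothetical and chosen only for illustration.

```java
import java.util.Map;

public class ReadModeSketch {
  // Hypothetical property names (assumptions, not Iceberg's actual keys).
  static final String VECTORIZED_ENABLED = "example.read.vectorized";
  static final String BATCH_SIZE = "example.read.batch-size";
  // Assumed default; the PR does not specify one.
  static final int DEFAULT_BATCH_SIZE = 1024;

  enum ReadMode { ROW_WISE, VECTORIZED }

  // Choose the read path from config; fall back to row-wise when the flag is unset.
  static ReadMode readMode(Map<String, String> props) {
    boolean vectorized =
        Boolean.parseBoolean(props.getOrDefault(VECTORIZED_ENABLED, "false"));
    return vectorized ? ReadMode.VECTORIZED : ReadMode.ROW_WISE;
  }

  // Read the batch size from config, rejecting non-positive values.
  static int batchSize(Map<String, String> props) {
    String raw = props.getOrDefault(BATCH_SIZE, String.valueOf(DEFAULT_BATCH_SIZE));
    int size = Integer.parseInt(raw);
    if (size <= 0) {
      throw new IllegalArgumentException("Batch size must be positive: " + size);
    }
    return size;
  }

  public static void main(String[] args) {
    Map<String, String> props = Map.of(VECTORIZED_ENABLED, "true", BATCH_SIZE, "4096");
    System.out.println(readMode(props) + " with batch size " + batchSize(props));
  }
}
```

Defaulting to row-wise keeps existing behavior unchanged unless vectorized reads are explicitly turned on, which seems like the safer choice while the feature is still a POC.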

Update: #319 is the new PR

@danielcweeks
Contributor

Hey @prodeezy, thanks for updating and creating the pull request. It looks like the build is breaking, and while we don't need exhaustive reviews, it would be nice to keep the branch in a buildable state. I pulled it down, and it looks like the dependency locks are preventing the build. Could you just add the updated version locks to pull in Arrow (i.e. gradlew --write-locks)?

There are a number of checkstyle violations, but I think we can look past those for now (unless you want to tackle them with the version locks).

@prodeezy
Contributor Author

prodeezy commented Jul 26, 2019

Will do. I have a different branch, "vectorized-read", in my local repo with the latest code from master merged in. I will update the locks, fix the checkstyle errors, and create another PR. Closing this for now. Sorry about the confusion.

prodeezy closed this Jul 26, 2019

@prodeezy
Contributor Author

Raised #319 as a replacement, with the updated version locks and the checkstyle errors fixed.
