
Add some checks for HadoopTables#create #298

Merged
rdblue merged 4 commits into apache:master from chenjunjiedada:minor
Jul 25, 2019

Conversation

@chenjunjiedada
Collaborator

No description provided.

@chenjunjiedada
Collaborator Author

Close and reopen to trigger CI.

*
* @param schema iceberg schema used to create the table
* @param spec partition specification
* @param properties properties of the table to be created
Contributor

Can you note that null is accepted?

Also, should we do something similar for spec? If spec is null, we could use PartitionSpec.unpartitioned().

Collaborator Author

Sure, please see the latest code change.
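
For context, a minimal sketch of the null handling being discussed, using hypothetical helper names and the iceberg-api classes mentioned in this thread (the merged change applies the same defaults directly inside HadoopTables#create):

import java.util.Collections;
import java.util.Map;

import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;

// Sketch only: helper names and the error message are illustrative, not the merged code.
class CreateArgumentDefaults {
  static Schema requireSchema(Schema schema) {
    // a table cannot be created without a schema
    if (schema == null) {
      throw new IllegalArgumentException("A table schema is required");
    }
    return schema;
  }

  static PartitionSpec specOrUnpartitioned(PartitionSpec spec) {
    // a null spec means the table is unpartitioned
    return spec == null ? PartitionSpec.unpartitioned() : spec;
  }

  static Map<String, String> propertiesOrEmpty(Map<String, String> properties) {
    // a null map is treated as empty table properties
    return properties == null ? Collections.emptyMap() : properties;
  }
}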

@chenjunjiedada
Collaborator Author

Recently, some unit tests have been failing randomly. Closing and reopening to trigger CI again. We need to fix that.

*
* @param schema iceberg schema used to create the table
* @param spec partition specification
* @param spec partition specification. It can be null in case of unpartitioned table
Contributor

How about "partitioning spec, if null the table will be unpartitioned"

* @param schema iceberg schema used to create the table
* @param spec partition specification
* @param spec partition specification. It can be null in case of unpartitioned table
* @param properties properties of the table to be created, it can be null
Contributor

How about "a string map of table properties, initialized to empty if null"

Collaborator Author

Done
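
Combining the two suggestions, the updated javadoc would read roughly as below (a sketch covering only the parameters discussed in this review; other parameters of create are omitted):

* @param schema iceberg schema used to create the table
* @param spec partitioning spec, if null the table will be unpartitioned
* @param properties a string map of table properties, initialized to empty if null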

@rdblue rdblue merged commit e2bdf33 into apache:master Jul 25, 2019
danielcweeks pushed a commit that referenced this pull request Jul 26, 2019
* First cut impl of reading Parquet FileIterator into ArrowRecordBatch based reader

* made num records per arrow batch configurable

* addressed comments

* Added docs for public methods and ArrowReader class

* Fixed javadoc

* WIP first stab at reading into Arrow and returning as InternalRow iterator

* Add publish to snapshot repository by replacing version to `1.0-adobe-2.0-SNAPSHOT` (snapshot prefix is required by snapshot repo)

* Adding arrow schema conversion utility

* adding arrow-vector dep to tests

* [WIP] Working vectorization for primitive types. Added test for VectorizedSparkParquetReaders.

* [WIP] Added Decimal types to vectorization

* [WIP] added remaining primitive type vectorization and tests

* [WIP] unused imports fixed

* Add argument validation to HadoopTables#create (#298)

* Install source JAR when running install target (#310)

* Bump version to 1.0-adobe-3.0-vectorized-SNAPSHOT

* Temporarily ignore applying style check

* Fixing javadoc error

* Updating versions.lock

* fixed checkstyle errors

* Revert "Bump version to 1.0-adobe-3.0-vectorized-SNAPSHOT"

This reverts commit ceae2fd.

* cleanup
danielcweeks pushed a commit that referenced this pull request Aug 1, 2019
* Add argument validation to HadoopTables#create (#298)

* Install source JAR when running install target (#310)

* Add projectStrict for Dates and Timestamps (#283)

* Correctly publish artifacts on JitPack (#321)

The Gradle install target produces invalid POM files that are missing
the dependencyManagement section and versions for some dependencies.
Instead, we directly tell JitPack to run the correct Gradle target.

* Add build info to README.md (#304)

* Convert Iceberg time type to Hive string type (#325)

* Add overwrite option to write builders (#318)

* Fix out of order Pig partition fields (#326)

* Add mapping to Iceberg for external name-based schemas (#338)

* Site: Fix broken link to Iceberg API (#333)

* Add forTable method for Avro WriteBuilder (#322)

* Remove multiple literal strings check rule for scala (#335)

* Fix invalid javadoc url in README.md (#336)

* Use UnicodeUtil.truncateString for Truncate transform. (#340)

This truncates by unicode codepoint instead of Java chars.

* Refactor metrics tests for reuse (#331)

* Spark: Add support for write-audit-publish workflows (#342)

* Avoid write failures if metrics mode is invalid (#301)

* Fix truncateStringMax in UnicodeUtil (#334)

Fixes #328, fixes #329.

Index to codePointAt should be the offset calculated by code points

* [Vectorization] Added batch sizing, switched to BufferAllocator, other minor style fixes.
