feat: support _rowid meta column for spark connector in java #3194

Merged
wjones127 merged 4 commits into lance-format:main from SaintBacchus:SupportRowIdInJava
Dec 6, 2024

Conversation

@SaintBacchus
Collaborator

As discussed in the PR, I have implemented the _rowid meta column in the Java package only.

@github-actions (bot) added the enhancement and java labels on Dec 4, 2024
@SaintBacchus
Collaborator Author

@LuQQiu @eddyxu Can you help review this PR?

@eddyxu requested review from LuQQiu, westonpace and wjones127, and removed the request for westonpace and wjones127, on December 4, 2024 07:55
Contributor

@wjones127 left a comment

The code looks good to me. My only serious worry is the licensing.

Comment thread java/spark/src/test/java/com/lancedb/lance/spark/TestUtils.java Outdated
@SaintBacchus force-pushed the SupportRowIdInJava branch 2 times, most recently from b329f52 to d324d62, on December 5, 2024 02:47
@SaintBacchus
Collaborator Author

@wjones127 I have added the license file in lancedb. Please review it.

Comment thread LICENSE Outdated
Comment on lines +438 to +437
------------------------------------------------------------------------------------
This product bundles various third-party components under other open source licenses.
This section summarizes those components and their licenses. See licenses/
for text of these licenses.


Apache Software Foundation License 2.0
--------------------------------------

common/network-common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java
core/src/main/java/org/apache/spark/util/collection/TimSort.java
core/src/main/resources/org/apache/spark/ui/static/bootstrap*
core/src/main/resources/org/apache/spark/ui/static/vis*
docs/js/vendor/bootstrap.js
connector/spark-ganglia-lgpl/src/main/java/com/codahale/metrics/ganglia/GangliaReporter.java
core/src/main/resources/org/apache/spark/ui/static/d3-flamegraph.min.js
core/src/main/resources/org/apache/spark/ui/static/d3-flamegraph.css

Python Software Foundation License
----------------------------------

python/pyspark/loose_version.py

BSD 3-Clause
------------

python/lib/py4j-*-src.zip
python/pyspark/cloudpickle/*.py
python/pyspark/join.py

The CSS style for the navigation sidebar of the documentation was originally
submitted by Óscar Nájera for the scikit-learn project. The scikit-learn project
is distributed under the 3-Clause BSD license.


MIT License
-----------

core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
core/src/main/resources/org/apache/spark/ui/static/*dataTables*
core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js
core/src/main/resources/org/apache/spark/ui/static/jquery*
core/src/main/resources/org/apache/spark/ui/static/sorttable.js
docs/js/vendor/anchor.min.js
docs/js/vendor/jquery*
docs/js/vendor/modernizer*

ISC License
-----------

core/src/main/resources/org/apache/spark/ui/static/d3.min.js


Creative Commons CC0 1.0 Universal Public Domain Dedication
-----------------------------------------------------------
(see LICENSE-CC0.txt)

data/mllib/images/kittens/29.5.a_b_EGDP022204.jpg
data/mllib/images/kittens/54893.jpg
data/mllib/images/kittens/DP153539.jpg
data/mllib/images/kittens/DP802813.jpg
data/mllib/images/multi-channel/chr30.4.184.jpg

https://github.com/apache/spark/blob/master/LICENSE
Contributor

I think you can remove all of this, since none of this applies to the files you brought over.

Suggested change: delete the quoted lines above (the third-party component list copied from the Spark LICENSE).

Collaborator Author

OK, I have removed it.

Comment on lines +25 to +36
/**
SPDX-License-Identifier: Apache-2.0
SPDX-FileCopyrightText: Copyright The Lance Authors

The following code is originally from https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/util/ArrowUtilsSuite.scala
and is licensed under the Apache license:

License: Apache License 2.0, Copyright 2014 and onwards The Apache Software Foundation.
https://github.com/apache/spark/blob/master/LICENSE

It has been modified by the Lance developers to fit the needs of the Lance project.
*/
Contributor

Could you move this up? It's confusing with the other license above.

Collaborator Author

I have put it in the Apache license header.

Comment on lines +25 to +36
/**
SPDX-License-Identifier: Apache-2.0
SPDX-FileCopyrightText: Copyright The Lance Authors

The following code is originally from https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/vectorized/ArrowColumnVectorSuite.scala
and is licensed under the Apache license:

License: Apache License 2.0, Copyright 2014 and onwards The Apache Software Foundation.
https://github.com/apache/spark/blob/master/LICENSE

It has been modified by the Lance developers to fit the needs of the Lance project.
*/
Contributor

Same thing here.

Contributor

@wjones127 left a comment

Thanks for working with me on the license stuff. Looks good now :)

@wjones127 merged commit e4ab9a8 into lance-format:main on Dec 6, 2024