Remove Apache Pig from the tests #7810
Hmm, ordinarily I would suggest keeping the flag in, and instead removing the Pig test dependency and testing it some other way. But Pig looks like it is not under very active development anymore. There hasn't been a release in almost two years, and I only see a handful of commits made to the repo in 2019. So for those reasons, removing the flag seems fine to me. We would need to mention it as an incompatibility in the release notes.
@gianm Can you shed some light on the flattening part? I noticed that you wrote parts of the original code. I do understand this one: https://github.com/apache/incubator-druid/blob/master/extensions-core/avro-extensions/src/main/java/org/apache/druid/data/input/avro/AvroFlattenerMaker.java#L113 However, this one looks awkward: https://github.com/apache/incubator-druid/blob/master/extensions-core/avro-extensions/src/main/java/org/apache/druid/data/input/avro/AvroFlattenerMaker.java#L120
@Fokko Check out the
Thanks @gianm for the pointer. That is actually pretty neat. I've added some additional tests.
/**
 * imitate avro extension {@link org.apache.druid.data.input.avro.AvroParsers#parseGenericRecord}
 */
@Nonnull
We aren't super consistent about this in the codebase, but we strive to be better. I think this isn't necessary because we all assume @Nonnull is the default. You can enforce this with a package-info.java if you'd like, but it should be fine to leave it out as well.
Thanks, what is the path forward here? For me, explicit is better than implicit. WDYT?
I'm also happy to make this more consistent in the codebase. Maybe we can also use SpotBugs to enforce this a bit.
Ah, I see now, thanks for pointing this out. I think setting everything to NonNull by default is the best practice. I'll pick this up in another PR if that is fine by you.
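For reference, a package-level non-null default of the kind mentioned in this thread could look roughly like the sketch below. It assumes the JSR-305 annotations are on the classpath, and the package name `com.example.avro` is hypothetical. Note that `@ParametersAreNonnullByDefault` covers method parameters only; covering fields and return values as well would need a different or custom annotation, which is not shown here.

```java
// package-info.java (sketch; the package name is a hypothetical placeholder)
// Marks every method parameter in this package as non-null unless it is
// explicitly annotated with @Nullable. Requires the JSR-305 annotations jar.
@ParametersAreNonnullByDefault
package com.example.avro;

import javax.annotation.ParametersAreNonnullByDefault;
```

With a default like this in place, an explicit `@Nonnull` on a parameter becomes redundant, and only the `@Nullable` exceptions need to be spelled out.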
@gianm Anything more required to get this merged? Thanks!
Apologies for the delay getting this merged, got busy with some stuff, thanks!
No problem, thanks!
* Remove Apache Pig from the tests
* Remove the Pig specific part
* Fix the Checkstyle issues
* Cleanup a bit
* Add an additional test
* Revert the abstract class
I strongly feel we should remove Pig from the tests since it is blocking #7772.
Currently, we write the Avro to a file, load it with Apache Pig, and then read it again with Druid. I think this is the wrong way of testing. Avro is a language-agnostic format for storing data, so if we're able to read the data with Druid, this should be sufficient. I've replaced the Pig part with just writing and reading Avro.
Besides that, I think Pig is not that commonly used anymore. Currently, we're using a version that was released in 2015.
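The write-then-read check described above can be illustrated with a minimal sketch. This is not the test code from this PR: it assumes the Apache Avro Java library is on the classpath, uses Avro's own generic reader as a stand-in for Druid's Avro parsing, and the class name `AvroRoundTripSketch`, the `Event` schema, and its fields are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class AvroRoundTripSketch {
  // Hypothetical two-field schema, used only for illustration.
  static final Schema SCHEMA = SchemaBuilder.record("Event").fields()
      .requiredString("name")
      .requiredLong("value")
      .endRecord();

  // Writes one record to a temporary Avro file, reads it back, and returns
  // the fields that were read as "name=value".
  static String roundTrip(String name, long value) {
    try {
      File file = File.createTempFile("roundtrip", ".avro");
      file.deleteOnExit();

      GenericRecord written = new GenericData.Record(SCHEMA);
      written.put("name", name);
      written.put("value", value);

      // Write step: no Pig involved, just Avro's container file writer.
      try (DataFileWriter<GenericRecord> writer =
               new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
        writer.create(SCHEMA, file);
        writer.append(written);
      }

      // Read step: the schema is taken from the file header.
      try (DataFileReader<GenericRecord> reader =
               new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
        GenericRecord readBack = reader.next();
        return readBack.get("name") + "=" + readBack.get("value");
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("druid", 42L));
  }
}
```

Because the file carries its own schema, a reader on any platform sees the same records, which is the reason an intermediate Pig load adds nothing to the check.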