Skip to content

Add @ExtensionPoint and @PublicApi annotations.#4433

Merged
gianm merged 13 commits intoapache:masterfrom
gianm:extension-annotations
Aug 28, 2017
Merged

Add @ExtensionPoint and @PublicApi annotations.#4433
gianm merged 13 commits intoapache:masterfrom
gianm:extension-annotations

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented Jun 21, 2017

This is an attempt to clarify questions about what is and is not a public API (see #4131 (comment), #4394 (comment), #4254 (comment)). It adds two new annotations: @ExtensionPoint which signifies something you can subclass, and @PublicApi which signifies something you're not meant to subclass, but that you can use for implementation. The idea is that if an extension author sticks to public APIs then their extension will keep working for at least an entire major Druid line.

Extension authors are of course free to use non-public APIs, but their extensions might break randomly.

Feedback on the specific list of annotated and non-annotated items is welcome.

Some notes:

  • The list of public apis is somewhat subjective, but I did my best guess based on the current contents of http://druid.io/docs/latest/development/modules.html and based on recent discussions about what is and isn't public. In general I erred on the side of annotating fewer rather than more entities, since I figured that annotating too many entities would make too much friction for dev of core Druid.
  • Speaking of subjectivity: relating to the last comment in the list above (Add support for headers and skipping thereof for CSV and TSV #4254 (comment)) I actually think CSVParseSpec shouldn't be considered a public api, since it's not really useful for extensions. So, I didn't annotate it. Tranquility does use it, but Tranquility really blatantly uses a ton of private Druid APIs. It should really have no expectation of being able to run with any Druid release other than the one it was built with.
  • Nothing in java-util is annotated since the annotations are in druid-api, and java-util doesn't depend on druid-api. We could deal with this by either saying "everything in java-util is public", or alternatively, creating a new package druid-annotations that java-util and druid-api both depend on.
  • I'm not sure I got the annotations for the query metrics stuff right (see QueryMetrics: abstraction layer of query metrics emitting (part of #3798) #3954). @leventov could you double check if I captured your intent? What I did was annotate GenericQueryMetricsFactory as public (since AFAIK any extension query would need to call methods on it) but nothing else in the query metrics world.

@leventov
Copy link
Copy Markdown
Member

Is druid-api module still needed?

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jun 21, 2017

Maybe not. The purpose was to put everything that was a public API in one package, but it turns out that there are public APIs in other packages too.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jun 21, 2017

I cleaned up the wording a bit, to clarify that PublicApis can have new non-default methods added to an interface and this is not considered a breakage (because they are not meant to be subclassed).

@fjy
Copy link
Copy Markdown
Contributor

fjy commented Jun 23, 2017

👍

*
* @see PublicApi
*/
@Target({ElementType.TYPE, ElementType.FIELD, ElementType.METHOD, ElementType.CONSTRUCTOR})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could a field, method or constructor be an extension point?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't, this is a copy/paste error. I'll remove those.

/**
*/
@PublicApi
public class Runnables
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such utility classes are not generally in API, they are in druid-common

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is probably here since it's used by FileIteratingFirehose which is in druid-api.

/**
*/
@PublicApi
public class Queries
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it really be public API?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's utility methods that are use in the implementations of most built-in queries and post-aggregators. If it's not a public api, then people that implement their own extension based on the built-in code (which is probably common) will accidentally use internal apis.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a judgement call here: extension implementors could always copy/paste the code from Queries rather than linking to it directly, and there is cost to the core project for having too many public apis. But I think that in this case it's worth it.

@Nullable
public ColumnCapabilities getColumnCapabilities(String column);

public Map<String, DimensionHandler> getDimensionHandlers();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was unused, and I didn't want it to be part of the public API of StorageAdapter.

this.columnSize = columnSize;
}

@PublicApi
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it needs to be public API?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's commonly used in implementations of ComplexMetricsSerde, which is an extension point.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 5, 2017

Pushed a new commit to resolve conflicts and address #4433 (comment).

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 7, 2017

Thanks for your reviews @fjy and @leventov. I'd like to get a few more eyes on this just to see what other people think. Let's keep it open for a bit and bring it up in a dev sync too.

@gianm gianm added this to the 0.11.0 milestone Jul 18, 2017
@drcrallen
Copy link
Copy Markdown
Contributor

@gianm I have heard from @cheddar before that the json is the api spec rather than the java code itself per-se. Since there is a mix of things which feed into jackson and things which don't feed into jackson that have the same annotation, would it make sense to have a special annotation for things which are part of the JSON api, rather than the Java api?

import java.util.List;

/**
* Not an {@code ExtensionPoint} since extension parsers are meant to extend {@code InputRowParser}.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that InputRowParser is an extension point, this needs to be an extension point or else how would they implement InputRowParser.getParseSpec() ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point…

I guess ParseSpec should be an extension point? And that would mean TimestampSpec and DImensionsSpec should be public apis.

I'll change it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added that in the most recent commit.

/**
*/
@PublicApi
public class StringInputRowParser implements ByteBufferInputRowParser
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we expect extensions to directly reference StringInputRowParser ?

Copy link
Copy Markdown
Contributor Author

@gianm gianm Jul 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking it would be useful for people that want to implement extension firehoses that parse strings. Otherwise they'll need to duplicate a lot of logic.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe nobody really uses it though and we can take out the annotation. Hmm, ok I'll take it out.


/**
*/
@PublicApi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually its more of an utility class and shouldn't necessarily be directly used in custom user extension (it can be used in out-of-the-box extensions) and those should really just use MapBinder.xxx .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking it would be nice to make a utility like this public. I don't feel super strongly though.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 25, 2017

@drcrallen,

@gianm I have heard from @cheddar before that the json is the api spec rather than the java code itself per-se. Since there is a mix of things which feed into jackson and things which don't feed into jackson that have the same annotation, would it make sense to have a special annotation for things which are part of the JSON api, rather than the Java api?

For this I think the attitude has been (and can continue to be) that a JSON API is stable if, and only if, it is documented. So I don't think an annotation is necessary.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 27, 2017

It was brought up on the dev sync that it would be desirable to merge java-util into druid-common, which would solve the problem of "java-util can't see the annotations". That would be a big patch though and I think I'd like to do this patch first, for everything other than java-util. And then do another one for merging in java-util to druid-common.

*/
public interface
InputRow extends Row
@PublicApi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be an extension point instead ? InputRowParser is an extension point and parse(..) should be free to return a concrete implementation that it wants.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking they would all use MapBasedRow, since the current parsers do that, rather than implement their own Row subclasses. We can make it an extension point if you think that makes more sense though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, so I took a brief look at avro and orc parsers which create MapBasedInputRow and it looks like it would be better to create an impl of InputRow instead .

to use MapBasedInputRow..

with that information, what do you think ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, thanks for looking into that. In that case I think it makes sense to make this an extension point. I'll change it.


/**
*/
@PublicApi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this public api so that users can use this class in their java clients to construct druid queries ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because ParseSpec is an extension point, and it returns DimensionsSpec, which is constructed using DimensionSchema. So implementors of custom ParseSpecs might need to use this class.

import io.druid.guice.annotations.PublicApi;
import org.joda.time.Interval;

@PublicApi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QueryToolChest.filterSegments(..) returns an impl of this , so this needs to be an extension point.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's supposed to return the same objects that it was given (but possibly filtered). So it doesn't need to be an extension point (people won't subclass it), just a public api (people may need to call methods on it).

@himanshug
Copy link
Copy Markdown
Contributor

@gianm is Task interface not marked as extension point consciously or just missed ? I know at least one user who have written their own task.

We should also mark LookupExtractorFactory and LookupExtractor as extension points. I know of multiple QTL extensions which extend those.

@himanshug
Copy link
Copy Markdown
Contributor

himanshug commented Jul 27, 2017

@gianm ServletFilterHolder and PasswordProvider too should be extension points.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 27, 2017

@gianm is Task interface not marked as extension point consciously or just missed ? I know at least one user who have written their own task.

I consciously didn't add Task because it's not listed on http://druid.io/docs/latest/development/modules.html. And with this PR, I was more trying to encode existing policy in annotations, rather than change the policy. That being said, I would +1 a proposal to make Task an extension point.

We should also mark LookupExtractorFactory and LookupExtractor as extension points. I know of multiple QTL extensions which extend those.

Similar to Task, I left it out since it's not on http://druid.io/docs/latest/development/modules.html. In addition, Lookup APIs are explicitly described as "experimental" in the docs, so I think they shouldn't be annotated as stable. As long as they're experimental, users should be free to make extensions for lookups, but they shouldn't be guaranteed to keep working across upgrades.

@gianm ServletFilterHolder and PasswordProvider too should be extension points.

Hmm, ServletFilterHolder is kind of borderline (it's not on http://druid.io/docs/latest/development/modules.html) but it's close enough to "Jersey resources" (which is) that I guess it makes sense to add. I'll do that.

PasswordProvider isn't on http://druid.io/docs/latest/development/modules.html but maybe it should be. I'll leave the annotation out for now.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 27, 2017

What do people think about adding these as extension points?

  • Task
  • PasswordProvider

They make sense to me but since they aren't currently on the modules.md list I wanted to check for opinions. If we generally think they make sense then I can add them to modules.md and annotate them.

@jihoonson
Copy link
Copy Markdown
Contributor

@gianm @himanshug it makes sense to me.

@himanshug
Copy link
Copy Markdown
Contributor

@gianm PasswordProvider should definitely be an extension point, that was the primary reason for introducing it so that users can integrate Druid with their own internal secrets management systems effectively.
I don't feel so strongly about Task though, haven't seen a general need to extend it beyond one use case.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 28, 2017

@gianm PasswordProvider should definitely be an extension point, that was the primary reason for introducing it so that users can integrate Druid with their own internal secrets management systems effectively.

I'll make PasswordProvider an extension point too.

I don't feel so strongly about Task though, haven't seen a general need to extend it beyond one use case.

I'll leave Task as "internal" for now, but if there is some consensus that it's a good idea to make it a formal extension point, then I'm ok with changing it.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Jul 28, 2017

@himanshug @jihoonson @leventov I made the adjustments discussed, and also updated the modules.md doc to reflect what has been annotated (as well as provided some more "hints" for implementors).

Please let me know what you think, especially about the modules.md changes.

@himanshug
Copy link
Copy Markdown
Contributor

👍

@drcrallen
Copy link
Copy Markdown
Contributor

There have been 3 approvals here, it should be good to go after conflicts are resolved.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Aug 25, 2017

Resolved conflicts.

@himanshug
Copy link
Copy Markdown
Contributor

@gianm feel free to merge whenever you want. it has enough approvals.

@gianm
Copy link
Copy Markdown
Contributor Author

gianm commented Aug 25, 2017

Ok, I'll merge it after CI is done.

@gianm gianm merged commit 9fbfc1b into apache:master Aug 28, 2017
@gianm gianm deleted the extension-annotations branch August 28, 2017 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants