Skip to content

Implementing dropwizard emitter for druid#7363

Merged
himanshug merged 16 commits intoapache:masterfrom
nishantmonu51:jmx-metrics
Oct 1, 2019
Merged

Implementing dropwizard emitter for druid#7363
himanshug merged 16 commits intoapache:masterfrom
nishantmonu51:jmx-metrics

Conversation

@nishantmonu51
Copy link
Copy Markdown
Member

This is an update for previous PR - #4196
Corresponding issues are
#3927
#1996

This extension integrates Dropwizard metrics library with druid so that dropwizard users can easily absorb druid into their monitoring ecosystem. It accumulates druid metrics as dropwizard metrics, and emits them to various sinks via dropwizard supported reporters.

Currently supported dropwizard metrics types counter, gauge, meter, timer and histogram.
These metrics can be emitted using either Console or JMX reporter.

Support for other reporters can be easily added in future PRs.

Note - I have cherry-picked the commit from previous author to provide him the credit of his work.

@nishantmonu51
Copy link
Copy Markdown
Member Author

@jihoonson @b-slim: Can you please help with the review here ?
It will be a nice addition to 0.15 release.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is scary, how much it accumulates is this onheap or offheap? is there a config to bound this ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the accumulation does not mean it queues things up, it stores the latest values in the metrics registry, the memory usage depends on the number of metrics and metric types.

Note: We are not handling all individual query metrics here, it's just the metrics defined in the extension config which are counters, gauges and timers. So have minimal impact on heap usage.e.g. for a counter it uses a single AtomicLong, for gauge it gives an instantaneous value and we are caching that in the metrics extensions impl.

https://metrics.dropwizard.io/4.0.0/getting-started.html has more details on the individual event types.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this includes host and ports ? or only host?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

host and port both, It is essentially ServiceMetricEvent.getHost() value

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please make the doc more clear that it includes both host and port?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am trying to understand here, why a used would like to get a metric without dimension name? am not sure am getting this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is mostly a copy of other stats-d emitter extension and left over of that, Agree, It is not much useful for dropwizard, will remove

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is statsD a typo mistake or it is actually called statsD ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left over of copy and paste, fixed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a link to the default mapping ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about ignored or warning ? seems like error is too much, especially in Druid we add metrics pretty often.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to ignored.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please add docs on how and why this will return null ? plus i see this nullable annotation all over the Druid code, am not sure what is used for?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added javadoc. null is returned when there is no mapping for a metric.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reading the signature of this method i can not tell what is actually doing and why a map object is called a builder ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed the builder to filteredDimensions and mentioned it in javadoc.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do this in one get and a null check?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the meaning of null ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it means no mapping found for metric

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about closing the stream on exception ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why is is not closed on normal exit?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought ObjectReader.readValue will close the stream in normal flow

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObjectReader.readValue(..) doesn't mention closing the InputStream by itself, it might be safer to close it in finally block.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why we are not guarding this and check for start not set ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added guards.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, when can it be called twice? It should be managed by the lifecycle.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it should be called through lifecycle, It seems other emitter has guards too so i added a it here also.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LifecycleLock utility is useful for this usecase

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private static final

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as for start, shouldn't we check for start is true ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added guard

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make sure this is not blocking?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is non-blocking, basically a no-op for the current two implementations of reporters.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could mention that it must be non blocking in the javadoc of DropwizardReporter.flush(), so that if someone implements another then they know about it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use native JDK functions/streams and move away from guava.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing to close ? this reporter has not state to clean ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why this change is needed ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

@b-slim
Copy link
Copy Markdown
Contributor

b-slim commented Apr 25, 2019

@nishantmonu51 am very ignorant of this DropWizard stuff, can you please explain where the metrics are buffered ? and how often cleaned ? how we can make sure it does not overflow ? and how the JMX one works do we need some Jvm flags to be set to turn it on can you please document how we can test that ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if let say we have a metric where the set of the key grows over time eg queryId or something similar, this map will grow without any bound and that can lead to big troubles, how about making the keys weak refs or having an actual limit on the size of this map ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking into it, we can add a limit to the map size.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a limit to the size of map.

@stale
Copy link
Copy Markdown

stale Bot commented Jun 25, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale Bot added the stale label Jun 25, 2019
@stale
Copy link
Copy Markdown

stale Bot commented Jun 26, 2019

This pull request/issue is no longer marked as stale.

1 similar comment
@stale
Copy link
Copy Markdown

stale Bot commented Jun 26, 2019

This pull request/issue is no longer marked as stale.

@nishantmonu51
Copy link
Copy Markdown
Member Author

@jihoonson @b-slim : Updated PR based on review comments, please check.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
druid.emitter.dropwizard.reporter={"type":"jmx"}
druid.emitter.dropwizard.reporters=[{"type":"jmx"}]

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don e.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
druid.emitter.dropwizard.reporter={"type":"console","emitIntervalInSecs":30}"}
druid.emitter.dropwizard.reporters=[{"type":"console","emitIntervalInSecs":30}"}]

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it is possible to point this to actual file in the source code because this has the potential to go out of sync

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a link too. Initially it was a link only but added the file as part of review comments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use string format instead of concatenations here and in all other places as this code is gonna be called many many times.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null/empty check ? it would be useless to set it up with no reporters and probably a configuration error.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private final Map<String, Number> gagues;
private final Map<String, Number> gauges;

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use guava Cache instead of doing our own ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. replaced it use caffeine cache instead.

@himanshug
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@himanshug himanshug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix the build

@himanshug himanshug merged commit 8537fbe into apache:master Oct 1, 2019
@jon-wei jon-wei added this to the 0.17.0 milestone Dec 17, 2019
@jon-wei jon-wei mentioned this pull request Dec 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants