Skip to content

Conversation

@egor-ryashin
Copy link
Contributor

Addressing #6604

@egor-ryashin egor-ryashin requested a review from leventov December 5, 2018 17:20
try {
// no Java official public API is present for the following method, so using reflection
threadMXBean = ManagementFactory.getThreadMXBean();
getThreadAllocatedBytes = threadMXBean.getClass().getMethod("getThreadAllocatedBytes", long[].class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this? I see this method: https://docs.oracle.com/javase/8/docs/jre/api/management/extension/com/sun/management/ThreadMXBean.html#getThreadAllocatedBytes-long:A- in official Java SE API.

Maybe you could do an instanceof check for com.sun.management.ThreadMXBean, I think it should be clearer than reflection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classes in the sun.* packages are not part of the public/supported Java API and should not be used directly.
https://www.oracle.com/technetwork/java/faq-sun-packages-142232.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

long[] allThreadIds = threadMXBean.getAllThreadIds();
long[] bytes = (long[]) getThreadAllocatedBytes.invoke(threadMXBean, (Object) allThreadIds);

previousResults = new HashMap<>(allThreadIds.length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Long2LongOpenHashMap from fastutil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet it's a nano-optimization for ~500 entries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it also cost nothing in terms of code complexity. In fact, it adds categorization mental load, "if optimization needed in this case, or not"? So HashMap with primitive elements should simply never be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (int i = 0; i < allThreadIds.length; i++) {
long threadId = allThreadIds[i];
previousResults.put(threadId, bytes[i]);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
initialized = true;
}
catch (Exception e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you catch only reflection access exceptions to not accidentally mask some programming mistake?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a whole lot of runtime exception types which can be thrown by reflection API (ie IllegalArgumentException and NPE)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how IllegalArgumentException and NPE are possible. NoSuchMethodException should be rethrown, because it's a programming mistake. IllegalAccessException and SecurityException should be reported in logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoSuchMethodException can be due to a JVM private API change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right.

class AllocationMetricCollector
{
private static final Logger log = new Logger(AllocationMetricCollector.class);
private Method getThreadAllocatedBytes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this variable and initialize this as a static

Copy link
Contributor Author

@egor-ryashin egor-ryashin Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't get it, it's already a variable...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an instance field, suggested to make it a static field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be final?

{
private static final Logger log = new Logger(AllocationMetricCollector.class);
private Method getThreadAllocatedBytes;
private ThreadMXBean threadMXBean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be final?


AllocationMetricCollector()
{
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a delta metric, maybe it makes more sense to start with an empty map. So the first call to calculateDelta() accounts all data emitted since the Druid application start.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this.previousResults.put(threadId, current);
}
// remove terminated thread ids
previousResults.keySet().retainAll(Arrays.stream(allThreadIds).boxed().collect(Collectors.toList()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This operation has quadratic complexity. It's easier to build a new map as you iterate allThreadIds and reassign previousResults to that new map here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say linear (iterate and remove if not exists while remove is constant).

Copy link
Member

@leventov leventov Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, retainAll(List) is quadratic, because it calls contains() on the List in the loop over the hash table. #5632 was a similar issue, for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final ServiceMetricEvent.Builder builder = builder();
MonitorUtils.addDimensionsToBuilder(builder, dimensions);
Optional<Long> delta = collector.calculateDelta();
delta.ifPresent((value) -> emitter.emit(builder.build("jvm/heapAlloc/bytes", value)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe jvm/heapAlloc/bytes/delta

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say .../delta/bytes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here: #6425 "delta" is the last part, what are the statistics across all metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What statistics? Just checked the code and /delta/* matches > 50 and /delta" just half a dozen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant the statistics that you provided. Let it be delta/bytes then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!initialized) {
return Optional.empty();
}
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it would be useful to add a comment with the expected (i. e. default) emit period of this metric, and the approximate number of threads in a Druid node, so that readers of this code could quantify the GC and computational overhead of computing this metric.

I understand that the overhead is small, but want to make it more obvious for the readers.

Copy link
Contributor Author

@egor-ryashin egor-ryashin Dec 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this guy experiments: http://openjdk.5641.n7.nabble.com/ThreadMXBean-getCurrentThreadAllocatedBytes-td343872.html
The call to getThreadAllocatedBytes for 500 threads in the worst case takes:

perl -e 'print (500*9000/1000/1000);'
4.5

4.5 ms
AllocationMetricCollector takes linear time to caculate delta, for 500 threads it's negligible. One can call it as freqeuntly as once a second, which is far too frequent for any known purpose to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this to code to make it evident for code readers

this.previousResults.put(threadId, current);
}
// remove terminated thread ids
previousResults.keySet().retainAll(Arrays.stream(allThreadIds).boxed().collect(Collectors.toSet()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you do the same work three times:

  1. put() in the loop
  2. toSet() - set construction
  3. retainAll()

Instead of doing it once in the loop and then replace the map

}
}

public Optional<Long> calculateDelta()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically, once AllocationMetricCollector is created, it should always report the metric, because the risk are inaccessible APIs. So suggested to create a static factory that returns an instance of AllocationMetricCollector if successful, or none if not successful. and then calculateDelta() always returns just long.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess then you have to check if the factory returns a true collector, still an additional check anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's coherent from type system perspective - because currently there is a method that returns Optional<Long>, it may seem that it may return empty or full from time to time, but in fact it always returns only one type of thing.

private static Method getThreadAllocatedBytes;
private static ThreadMXBean threadMXBean;

private Map<Long, Long> previousResults;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this type instead of Long2LongMap imposes boxing on access, halving the value of primitive collections.

P. S. in fact I want to prohibit using such types altogether: #6861.

previousResults = newResults;
return Optional.of(sum);
}
catch (ReflectiveOperationException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you initialized the class successfully I don't believe this is possible.

{
private static final Logger log = new Logger(AllocationMetricCollector.class);
private Method getThreadAllocatedBytes;
private ThreadMXBean threadMXBean;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be final?

class AllocationMetricCollector
{
private static final Logger log = new Logger(AllocationMetricCollector.class);
private Method getThreadAllocatedBytes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be final?

class AllocationMetricCollectorFactory
{
private static final Logger log = new Logger(AllocationMetricCollectorFactory.class);
private static Method getThreadAllocatedBytes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these fields could be final?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope.

import java.lang.management.ThreadMXBean;
import java.lang.reflect.Method;

class AllocationMetricCollectorFactory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add private constructor to prohibit accidental instantiation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class name is confusing for me because if a class is called "XxxFactory" I expect that I could instantiate it and get an "xxx" by calling an instance method of the factory. Static factory method containers are usually called "Xxxs", i. e. AllocationMetricCollectors in this case.

Copy link
Contributor Author

@egor-ryashin egor-ryashin Jan 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's the instance method getAllocationMetricCollector. I don't understand.

}
}

AllocationMetricCollector getAllocationMetricCollector()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add @Nullable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, when writing my above comments, I didn't notice that this method is not static. But it could be.

}

/**
* Uses getThreadAllocatedBytes internally.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make a Javadoc link

* Tests show the call to getThreadAllocatedBytes for a single thread ID out of 500 threads running takes around
* 9000 ns (in the worst case), which for 500 IDs should take 500*9000/1000/1000 = 4.5 ms to the max.
* AllocationMetricCollector takes linear time to calculate delta, for 500 threads it's negligible. One can call it as
* frequently as once a second, which is far too frequent for any known purpose to me.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the passage "One can call it as frequently as once a second, which is far too frequent for any known purpose to me." please refer to the default emitting period and where it is configured.


/**
* Uses getThreadAllocatedBytes internally.
* @see com.sun.management.ThreadMXBean#getThreadAllocatedBytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think @see tags could appear in the middle of the method. I meant just Uses {@link com.sun.management.ThreadMXBean#getThreadAllocatedBytes} internally.

* Tests show the call to getThreadAllocatedBytes for a single thread ID out of 500 threads running takes around
* 9000 ns (in the worst case), which for 500 IDs should take 500*9000/1000/1000 = 4.5 ms to the max.
* AllocationMetricCollector takes linear time to calculate delta, for 500 threads it's negligible.
* See the default emitting period.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly! :) I also spent good 5 minutes trying to figure out what is the default emitting period, and failed. But making each reader of this class to perform the same task is not humane.

@leventov leventov merged commit 2803fda into apache:master Jan 29, 2019
@leventov leventov deleted the feature-6604 branch January 29, 2019 13:16
justinborromeo pushed a commit to justinborromeo/incubator-druid that referenced this pull request Feb 2, 2019
@jon-wei jon-wei added this to the 0.14.0 milestone Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants