feat: Emit git commit metric and add a commit version to sys.servers for validations#19123
Conversation
Added a method to load git properties from the classpath and include the git commit SHA in emitted metrics.
There was a problem hiding this comment.
Thanks @razinbouzar! It’s useful to have this to track service health across rolling deployments with the same version. Left some high-level thoughts.
- Please document the new dimension in metrics.md: https://druid.apache.org/docs/latest/operations/metrics/ (something like a high level info note would suffice)
- Once the new dimension is in, we can also add the same info to
sys.serverstable similar to #18542. This would provide a SQL-friendly way to verify that clusters are running the intended commit, version, etc.
| private static Map<String, String> loadGitProperties() | ||
| { | ||
| try (InputStream is = EmitterModule.class.getClassLoader().getResourceAsStream("git.properties")) { | ||
| if (is != null) { | ||
| Properties props = new Properties(); | ||
| props.load(is); | ||
|
|
||
| // Convert Properties to Map<String, String> | ||
| Map<String, String> gitProperties = new HashMap<>(); | ||
| for (String key : props.stringPropertyNames()) { | ||
| String value = props.getProperty(key); | ||
| if (value != null && !value.trim().isEmpty()) { | ||
| gitProperties.put(key, value.trim()); | ||
| } | ||
| } | ||
|
|
||
| return gitProperties; |
There was a problem hiding this comment.
On the approach, I'm wondering whether using git.properties from the class loader vs using META-INF/MANIFEST.MF (produced by the JAR build) would be better.
Is the former coming from git-commit-id-maven-plugin: https://github.com/apache/druid/blob/master/pom.xml#L1878-L1903?
The latter should be available through https://github.com/apache/druid/blob/master/pom.xml#L1870:
Build-Revision: 1c2eb65...
Build-Version: 37.0.0-SNAPSHOT
Looks like version internally reads from the manifest as well: String version = getClass().getPackage().getImplementationVersion()
But those are standard properties, so we may want a small utility to read the Git SHA / build revision from the manifest, such as the Build-Revision property. I’ll also give it some thought.
I included 2 in this PR as well since. Can break it out if preferred. |
SHA gets flagged by spellcheck. Using git commit ID in its place.
changing to git commit hash.
spellcheck doesn't like jar. replacing it with binary to be consistent. also complained about build_revision
abhishekrb19
left a comment
There was a problem hiding this comment.
Looks good to me overall. Left some suggestions. Thanks!
| private static String readBuildRevisionFromManifest() | ||
| { | ||
| try { | ||
| URL classUrl = DruidNode.class.getResource(DruidNode.class.getSimpleName() + ".class"); | ||
| if (classUrl != null && "jar".equals(classUrl.getProtocol())) { | ||
| String classPath = classUrl.toString(); | ||
| String manifestPath = classPath.substring(0, classPath.lastIndexOf('!') + 1) + "/META-INF/MANIFEST.MF"; | ||
| try (InputStream is = new URL(manifestPath).openStream()) { | ||
| return new Manifest(is).getMainAttributes().getValue("Build-Revision"); | ||
| } | ||
| } | ||
| } | ||
| catch (IOException e) { | ||
| // Fall through and return null | ||
| } | ||
| return null; | ||
| } |
There was a problem hiding this comment.
We can create a class BuildInfo and move this utility to it so both EmitterModule and the DruidNode code here can share it. We can also add more build-related utilities to this class in the future.
| } | ||
| } | ||
| catch (IOException e) { | ||
| // Fall through and return null |
There was a problem hiding this comment.
Log the exception so it doesn't mask a legitimate issue
| catch (IOException e) { | ||
| // Fall through and return null | ||
| } | ||
| return null; |
There was a problem hiding this comment.
Since both EmitterModule and DruidNode converts null to empty with nullToEmptyNonDruidDataString(), I think it'd be good to have this method directly return "" (for tests)
| * `service`: the service name that emitted the metric | ||
| * `host`: the host name that emitted the metric | ||
| * `version`: the Druid version of the service that emitted the metric | ||
| * `buildRevision`: the git commit of the build that produced the service binary. Useful for verifying that all nodes in a cluster are running the intended revision during rolling deployments. Empty string when running outside a packaged binary (e.g., during `mvn test`). |
There was a problem hiding this comment.
These are user-facing docs, so I think we can exclude the note about mvn test that's already captured in the javadocs for devs:
| * `buildRevision`: the git commit of the build that produced the service binary. Useful for verifying that all nodes in a cluster are running the intended revision during rolling deployments. Empty string when running outside a packaged binary (e.g., during `mvn test`). | |
| * `buildRevision`: the git commit of the build that produced the service binary. Useful for verifying that all nodes in a cluster are running the intended revision during rolling deployments. |
| |is_leader|BIGINT|1 if the server is currently the 'leader' (for services which have the concept of leadership), otherwise 0 if the server is not the leader, or null if the server type does not have the concept of leadership| | ||
| |start_time|STRING|Timestamp in ISO8601 format when the server was announced in the cluster| | ||
| |version|VARCHAR|Druid version running on the server| | ||
| |build_revision|VARCHAR|The git commit of the build that produced the server binary. Empty string when running outside a packaged binary (e.g., during `mvn test`)| |
There was a problem hiding this comment.
| |build_revision|VARCHAR|The git commit of the build that produced the server binary. Empty string when running outside a packaged binary (e.g., during `mvn test`)| | |
| |build_revision|VARCHAR|The git commit of the build that produced the server binary.| |
There was a problem hiding this comment.
Optionally we could also wire this to the console's services view by default in web-console/src/views/services-view/services-view.tsx and web-console/src/views/services-view/__snapshots__/services-view.spec.tsx.snap: https://github.com/apache/druid/pull/18542/changes#diff-07dc2b09eeb1fc84b83d5ec80b95e4cda9fb235200e26b37271cc73d8d62b883
Adding a git commit sha as an emitted metric for version tracking. Additionally added a column to sys.servers which does the same, using #18542 as motivation.
This feature will be useful in troubleshooting/debugging, dashboards, and as a deployment status signal.
Description
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Release note
New: You can now verify the exact build revision of each Druid node.
Two new fields expose the git commit SHA of the JAR running on each node:
running outside a packaged JAR (e.g., during mvn test).
Key changed/added classes in this PR
service dimension on every emitted metric
This PR has: