40 changes: 40 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

11 changes: 11 additions & 0 deletions crates/datadog-metrics-collector/Cargo.toml
@@ -0,0 +1,11 @@
[package]
name = "datadog-metrics-collector"
version = "0.1.0"
edition.workspace = true
license.workspace = true
description = "Collector to read, compute, and submit enhanced metrics in Serverless environments"

[dependencies]
dogstatsd = { path = "../dogstatsd", default-features = true }
tracing = { version = "0.1", default-features = false }
libdd-common = { git = "https://github.com/DataDog/libdatadog", rev = "8c88979985154d6d97c0fc2ca9039682981eacad", default-features = false }
176 changes: 176 additions & 0 deletions crates/datadog-metrics-collector/src/instance.rs
@@ -0,0 +1,176 @@
// Copyright 2023-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0

//! Instance identity metric collector for Azure Functions.
//!
//! Submits `azure.functions.enhanced.instance` with value 1.0 on each
//! collection tick, tagged with the instance identifier.

use dogstatsd::aggregator::AggregatorHandle;
use dogstatsd::metric::{Metric, MetricValue, SortedTags};
use std::env;
use tracing::{error, warn};

const INSTANCE_METRIC: &str = "azure.functions.enhanced.instance";

/// Resolves the instance ID from explicit values (used by tests).
///
/// Picks the env var that matches the Azure integration metric's `instance`
/// tag for the current hosting plan with fallback logic
/// if the preferred source is empty.
fn resolve_instance_id_from(

Contributor:

Should libddcommon also be taking the website pod name / container name into account? Could this lead to inconsistencies?

Contributor Author:

Good point! I had actually created a ticket for this (https://datadoghq.atlassian.net/browse/SVLS-8931), but it led me to realize that the instance ID used in libddcommon / spans is different from the instance tag on integration metrics.

I compared the env var values to the instance tag on integration metrics across hosting plans. On the Elastic Premium and Premium plans, the integration metrics match the COMPUTERNAME env var rather than WEBSITE_INSTANCE_ID, which the spans use.

For Flex Consumption and Consumption, the instance ID on spans is often unknown. I documented my env var investigations here as well as in the ticket above.

    website_sku: Option<&str>,
    container_name: Option<&str>,
    website_pod_name: Option<&str>,
    computer_name: Option<&str>,
) -> Option<String> {
    fn non_empty(s: Option<&str>) -> Option<&str> {
        s.filter(|v| !v.is_empty())
    }

    let sku_preferred = match website_sku {
        Some("FlexConsumption") | Some("Dynamic") => {
            non_empty(container_name).or(non_empty(website_pod_name))
        }
        Some(_) => non_empty(computer_name),
        None => None,
    };

    sku_preferred
        .or_else(|| non_empty(container_name))
        .or_else(|| non_empty(website_pod_name))
        .or_else(|| non_empty(computer_name))
        .map(|s| s.to_lowercase())
}

/// Resolves the instance ID from environment variables.
fn resolve_instance_id() -> Option<String> {
    resolve_instance_id_from(
        env::var("WEBSITE_SKU").ok().as_deref(),
        env::var("CONTAINER_NAME").ok().as_deref(),
        env::var("WEBSITE_POD_NAME").ok().as_deref(),
        env::var("COMPUTERNAME").ok().as_deref(),
    )
Comment on lines +46 to +53
Copilot AI (Apr 23, 2026):

resolve_instance_id() reads COMPUTERNAME but the intended Azure Functions instance identifier (per the PR description and the test comment below) appears to be WEBSITE_INSTANCE_ID. If WEBSITE_INSTANCE_ID is set on some plans where COMPUTERNAME is not, the instance metric may never be submitted. Consider including WEBSITE_INSTANCE_ID in the resolution inputs (and updating the preference/fallback order accordingly), or updating the PR/docs to explicitly state why COMPUTERNAME is the correct source.

Contributor Author:

The PR description is updated, and I removed the test comment in e2adcff.

WEBSITE_INSTANCE_ID should not be used as the instance tag; this is explained in the PR description and the linked doc.

}

pub struct InstanceMetricsCollector {
    aggregator: AggregatorHandle,
    tags: Option<SortedTags>,
}

impl InstanceMetricsCollector {
    /// Creates a new collector, returning `None` if no instance ID is found.
    pub fn new(aggregator: AggregatorHandle, tags: Option<SortedTags>) -> Option<Self> {
        let instance_id = resolve_instance_id();
        let Some(instance_id) = instance_id else {
            warn!("No instance ID found, instance metric will not be submitted");
            return None;
        };

        // Precompute tags: enhanced metrics tags + instance tag
        let instance_tag = format!("instance:{}", instance_id);
        let tags = match tags {
            Some(mut existing) => {
                if let Ok(id_tag) = SortedTags::parse(&instance_tag) {
                    existing.extend(&id_tag);
                }
                Some(existing)
            }
            None => SortedTags::parse(&instance_tag).ok(),
        };

        Some(Self { aggregator, tags })
    }

    pub fn collect_and_submit(&self) {
        let metric = Metric::new(
            INSTANCE_METRIC.into(),
            MetricValue::gauge(1.0),
            self.tags.clone(),
            None,
        );

        if let Err(e) = self.aggregator.insert_batch(vec![metric]) {
            error!("Failed to insert instance metric: {}", e);
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_flex_consumption_uses_container_name() {
        let id = resolve_instance_id_from(
            Some("FlexConsumption"),
            Some("0--abc-DEF"),
            Some("0--abc-DEF"),
            None,
        );
        assert_eq!(id, Some("0--abc-def".to_string()));
    }

    #[test]
    fn test_flex_consumption_falls_back_to_pod_name_if_container_missing() {
        let id = resolve_instance_id_from(Some("FlexConsumption"), None, Some("pod-XYZ"), None);
        assert_eq!(id, Some("pod-xyz".to_string()));
    }

    #[test]
    fn test_consumption_uses_container_name() {
        let id = resolve_instance_id_from(
            Some("Dynamic"),
            Some("ABCD1234-111122223333444455"),
            None,
            None,
        );
        assert_eq!(id, Some("abcd1234-111122223333444455".to_string()));
    }

    #[test]
    fn test_elastic_premium_uses_computer_name() {
        let id =
            resolve_instance_id_from(Some("ElasticPremium"), None, None, Some("ep0fakewk0000A1"));
        assert_eq!(id, Some("ep0fakewk0000a1".to_string()));
    }

    #[test]
    fn test_dedicated_uses_computer_name() {
        let id = resolve_instance_id_from(Some("PremiumV3"), None, None, Some("p3fakewk0000B2"));
        assert_eq!(id, Some("p3fakewk0000b2".to_string()));
    }

    #[test]
    fn test_empty_string_is_treated_as_missing() {
        let id =
            resolve_instance_id_from(Some("ElasticPremium"), Some(""), Some(""), Some("worker-1"));
        assert_eq!(id, Some("worker-1".to_string()));
    }

    #[test]
    fn test_unknown_sku_falls_back_to_search_order() {
        let id = resolve_instance_id_from(Some("SomeNewSku"), Some("container-1"), None, None);
        assert_eq!(id, Some("container-1".to_string()));
    }

    #[test]
    fn test_missing_sku_falls_back_to_search_order() {
        let id = resolve_instance_id_from(None, Some("container-1"), None, Some("worker-1"));
        assert_eq!(id, Some("container-1".to_string()));
    }

    #[test]
    fn test_no_env_vars_returns_none() {
        let id = resolve_instance_id_from(None, None, None, None);
        assert_eq!(id, None);
    }
Comment on lines +103 to +167
Copilot AI (Apr 15, 2026):

The instance ID resolution tests only cover the fallback cases. To fully lock in the intended precedence, add a test asserting WEBSITE_INSTANCE_ID wins over WEBSITE_POD_NAME/CONTAINER_NAME, and (optionally) a test for the all-None case returning None.

Contributor Author:

If any of these env vars exist, they should give the correct instance value, so I think this test case is unnecessary. The all-None case also isn't relevant, since these env vars are injected by Azure. End-to-end tests that ensure these instance-identifying environment variables don't change would be more helpful.


    // On Windows Consumption we've observed CONTAINER_NAME and WEBSITE_POD_NAME
    // unset but COMPUTERNAME set
    #[test]
    fn test_windows_consumption_falls_through_to_computer_name() {
        let id = resolve_instance_id_from(Some("Dynamic"), None, None, Some("10-20-30-40"));
        assert_eq!(id, Some("10-20-30-40".to_string()));
    }
}
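
The review thread above asks for end-to-end confidence in the environment-variable precedence. As a rough sketch only, and not part of this PR, the same precedence could be exercised through an injected lookup closure so that tests never touch the process environment. The name `resolve_instance_id_with` and the closure-based lookup are assumptions made for illustration; the precedence mirrors `resolve_instance_id_from` as shown in the diff.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not in the PR): resolves the instance ID through an
/// injected lookup instead of reading `std::env` directly.
fn resolve_instance_id_with<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // Treat unset and empty identifier variables the same way.
    let get = |key: &str| lookup(key).filter(|v| !v.is_empty());

    // The SKU itself is not filtered, matching the function in the diff.
    let sku = lookup("WEBSITE_SKU");
    let preferred = match sku.as_deref() {
        Some("FlexConsumption") | Some("Dynamic") => {
            get("CONTAINER_NAME").or_else(|| get("WEBSITE_POD_NAME"))
        }
        Some(_) => get("COMPUTERNAME"),
        None => None,
    };

    // Generic search order used when the SKU-preferred source is missing.
    preferred
        .or_else(|| get("CONTAINER_NAME"))
        .or_else(|| get("WEBSITE_POD_NAME"))
        .or_else(|| get("COMPUTERNAME"))
        .map(|s| s.to_lowercase())
}

fn main() {
    // Made-up values standing in for what Azure would inject.
    let mut vars: HashMap<&str, &str> = HashMap::new();
    vars.insert("WEBSITE_SKU", "ElasticPremium");
    vars.insert("CONTAINER_NAME", "container-1");
    vars.insert("COMPUTERNAME", "EP0FAKEWK0000A1");

    let id = resolve_instance_id_with(|k| vars.get(k).map(|v| v.to_string()));
    println!("{:?}", id); // prints Some("ep0fakewk0000a1")
}
```

In production the closure would presumably be `|k| std::env::var(k).ok()`; whether that indirection is worth it over the explicit-argument function already in the PR is a judgment call.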
11 changes: 11 additions & 0 deletions crates/datadog-metrics-collector/src/lib.rs
@@ -0,0 +1,11 @@
// Copyright 2023-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0

#![cfg_attr(not(test), deny(clippy::panic))]
#![cfg_attr(not(test), deny(clippy::unwrap_used))]
#![cfg_attr(not(test), deny(clippy::expect_used))]
#![cfg_attr(not(test), deny(clippy::todo))]
#![cfg_attr(not(test), deny(clippy::unimplemented))]

pub mod instance;
pub mod tags;
123 changes: 123 additions & 0 deletions crates/datadog-metrics-collector/src/tags.rs
@@ -0,0 +1,123 @@
// Copyright 2023-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0

//! Shared tag builder for enhanced metrics.
//!
//! Tags are attached to all enhanced metrics submitted by the metrics collector.

use dogstatsd::metric::SortedTags;
use libdd_common::{azure_app_services, tag::Tag};
use std::env;
use tracing::warn;

/// `libdd_common::azure_app_services` returns this value when the corresponding Azure metadata isn't populated.
const AAS_UNKNOWN_VALUE: &str = "unknown";

/// Builds the common tags for all enhanced metrics.
///
/// Sources:
/// - Azure metadata (resource_group, subscription_id, name) from libdd_common
/// - Environment variables (region, plan_tier, service, env, version, serverless_compat_version)
///
/// The DogStatsD origin tag (e.g. `origin:azurefunction`) is added by the metrics aggregator,
/// not here.
pub fn build_enhanced_metrics_tags() -> Option<SortedTags> {
    let mut pairs: Vec<(&'static str, String)> = Vec::new();

    if let Some(aas_metadata) = &*azure_app_services::AAS_METADATA_FUNCTION {
        for (name, value) in [
            ("resource_group", aas_metadata.get_resource_group()),
Contributor @Lewis-E (Apr 20, 2026):

"resource_group" vs. "aas.resource.group" (used in common metadata)? Should we have both? Probably not, given the whole cardinality choice, but I'm wondering why we'd decide one way or the other.

Contributor Author:

I had the same confusion initially. We want to use the same tags that the integration metrics use so that we can JOIN them, which is why we don't have the aas* prefix!

            ("subscription_id", aas_metadata.get_subscription_id()),
            ("name", aas_metadata.get_site_name()),
        ] {
            if value != AAS_UNKNOWN_VALUE {
                pairs.push((name, value.to_string()));
            }
        }
    }

    for (tag_name, env_var) in [
        ("region", "REGION_NAME"),
        ("plan_tier", "WEBSITE_SKU"),
        ("service", "DD_SERVICE"),
        ("env", "DD_ENV"),
        ("version", "DD_VERSION"),
        ("serverless_compat_version", "DD_SERVERLESS_COMPAT_VERSION"),
    ] {
        if let Ok(val) = env::var(env_var) {
            pairs.push((tag_name, val));
        }
Comment on lines +39 to +49
Copilot AI (Apr 13, 2026):

build_enhanced_metrics_tags concatenates raw environment variable values into a comma-separated tag string and then parses it. If any value contains a comma, it will be split into multiple tags (producing incorrect tags or parse failures). Consider sanitizing/escaping tag values (or dropping values containing reserved delimiters like ,/|) before building the tag list.

    }

    build_tags(pairs)
}

fn build_tags(pairs: impl IntoIterator<Item = (&'static str, String)>) -> Option<SortedTags> {
    let mut tags: Vec<Tag> = Vec::new();
    for (key, value) in pairs {
        if value.is_empty() {
            continue;
        }
        // Tag::new validates the combined "key:value" string: it must be
        // non-empty and not start or end with a colon
        match Tag::new(key, &value) {
            Ok(t) => tags.push(t),
            Err(e) => warn!("Skipping invalid tag {key}:{value}: {e}"),
        }
    }
    if tags.is_empty() {
        return None;
    }
    let joined = tags
        .iter()
        .map(|t| t.as_ref())
        .collect::<Vec<&str>>()
        .join(",");
    SortedTags::parse(&joined).ok()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_build_tags_returns_none_when_no_pairs() {
        let pairs: Vec<(&'static str, String)> = Vec::new();
        assert!(build_tags(pairs).is_none());
    }

    #[test]
    fn test_build_tags_returns_none_when_all_values_empty() {
        let pairs = vec![("service", String::new()), ("env", String::new())];
        assert!(build_tags(pairs).is_none());
    }

    #[test]
    fn test_build_tags_skips_empty_values() {
        let pairs = vec![("service", String::new()), ("env", "dev".to_string())];
        let tags = build_tags(pairs).unwrap().to_strings();
        assert_eq!(tags, vec!["env:dev"]);
    }

    #[test]
    fn test_build_tags_includes_all_nonempty_pairs() {
        let pairs = vec![
            ("service", "svc-1".to_string()),
            ("env", "dev".to_string()),
            ("version", "1.2.3".to_string()),
        ];
        let mut tags = build_tags(pairs).unwrap().to_strings();
        tags.sort();
        assert_eq!(tags, vec!["env:dev", "service:svc-1", "version:1.2.3"]);
    }

    #[test]
    fn test_build_tags_rejects_trailing_colon_values() {
        let pairs = vec![
            ("service", "svc-1:".to_string()),
            ("env", "dev".to_string()),
        ];
        let tags = build_tags(pairs).unwrap().to_strings();
        assert_eq!(tags, vec!["env:dev"]);
    }
}
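
The Copilot comment on `build_enhanced_metrics_tags` raises the case of a tag value that contains a comma. A minimal sketch of one possible guard follows, using only the standard library. `sanitize_tag_value` and `join_tags` are hypothetical names that do not appear in the PR, and dropping (rather than escaping) such values is an assumption chosen for simplicity, not the approach the author settled on.

```rust
/// Hypothetical helper (not in the PR): returns `None` for values that are
/// empty or contain `,` or `|`, which a comma-joined tag string cannot carry
/// safely.
fn sanitize_tag_value(value: &str) -> Option<String> {
    let trimmed = value.trim();
    if trimmed.is_empty() || trimmed.contains(|c: char| c == ',' || c == '|') {
        return None;
    }
    Some(trimmed.to_string())
}

/// Joins the surviving `key:value` pairs into a comma-separated tag string.
fn join_tags(pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .filter_map(|(k, v)| sanitize_tag_value(v).map(|v| format!("{k}:{v}")))
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    let joined = join_tags(&[
        ("service", "svc-1"),
        ("env", "dev,prod"), // dropped: would split into two tags
        ("version", " 1.2.3 "),
    ]);
    println!("{joined}"); // prints service:svc-1,version:1.2.3
}
```

Dropping a value loses information silently, so a real implementation would likely log a warning, as `build_tags` already does for tags rejected by `Tag::new`.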
1 change: 1 addition & 0 deletions crates/datadog-serverless-compat/Cargo.toml
@@ -11,6 +11,7 @@ windows-pipes = ["datadog-trace-agent/windows-pipes", "dogstatsd/windows-pipes"]

[dependencies]
datadog-logs-agent = { path = "../datadog-logs-agent" }
datadog-metrics-collector = { path = "../datadog-metrics-collector" }
datadog-trace-agent = { path = "../datadog-trace-agent" }
libdd-trace-utils = { git = "https://github.com/DataDog/libdatadog", rev = "27aa92cfeeca073d8730a8b4974bd3fdef7ddf3a" }
datadog-fips = { path = "../datadog-fips", default-features = false }