Merged
40 commits
0245a18
Fix typo.
Kami Aug 17, 2018
0a975aa
Add missing __all__, instrument rules engine with additional metrics.
Kami Aug 17, 2018
9b5a5af
Use consistent metric name.
Kami Aug 17, 2018
4d1b8f7
Add missing __all__.
Kami Aug 17, 2018
20134d8
Add support for "Gauge" metric type to our metrics drivers and code.
Kami Aug 17, 2018
0b4a4e5
Add new instrumentation middleware which allows us to instrument our API
Kami Aug 17, 2018
5d78d67
Add new instrumentation middleware to all the API services.
Kami Aug 17, 2018
b483f75
Add new echo metrics driver which prints out metric calls and use it in
Kami Aug 17, 2018
24f5d1a
Also track total number of the incoming requests.
Kami Aug 17, 2018
b16fb32
Fix lint.
Kami Aug 17, 2018
c759b63
Add tests for new gauge methods.
Kami Aug 17, 2018
179768e
Use echo driver by default in dev environments.
Kami Aug 17, 2018
8d207ea
Use consistent metric names, add some additional instrumentation.
Kami Aug 20, 2018
9f8cb89
Use consistent method names.
Kami Aug 20, 2018
27e85dc
Fix method arguments.
Kami Aug 20, 2018
478744e
Get rid of format_metric_key() function calls which provide no value and
Kami Aug 20, 2018
4086b5c
Fix typo.
Kami Aug 20, 2018
387cf92
Merge branch 'master' into rules_engine_metrics_instrumentation
Kami Aug 21, 2018
55b16f9
Reduce code duplication.
Kami Aug 21, 2018
abe6ca1
Don't decrease counter value on context manager exit.
Kami Aug 21, 2018
74efba2
Increase _counter and _timer suffixes since statsd already correctly
Kami Aug 21, 2018
b014d7d
Merge branch 'rules_engine_metrics_instrumentation' of github.com:Sta…
Kami Aug 21, 2018
8354248
Remove unused module.
Kami Aug 21, 2018
82772cc
Remove unused driver for now since it's just causing confusion.
Kami Aug 21, 2018
e2883d4
Fix metric name.
Kami Aug 21, 2018
0110721
Update affected tests.
Kami Aug 21, 2018
efb7856
Update changelog.
Kami Aug 21, 2018
ebc1245
Add sample statsd config.
Kami Aug 22, 2018
cacaa42
Add sample metrics configs for statsd config and carbon cache.
Kami Aug 22, 2018
7d04e4e
Fix file extension.
Kami Aug 22, 2018
ee8996f
Add new metrics.prefix config option.
Kami Aug 22, 2018
eb2646d
Add changelog entry.
Kami Aug 22, 2018
6732471
Remove unused code.
Kami Aug 22, 2018
1170280
Add a comment.
Kami Aug 22, 2018
f4b3ca3
Make metric key generation more robust, include prefix after "st2" and
Kami Aug 22, 2018
505106e
Update affected code and tests, add new tests.
Kami Aug 22, 2018
4fb318e
Add missing module.
Kami Aug 22, 2018
d4c3aaf
Fix typo.
Kami Aug 22, 2018
fb54c2a
Re-gen sample config.
Kami Aug 22, 2018
1f30852
Re-generate sample config.
Kami Aug 22, 2018
6 changes: 6 additions & 0 deletions CHANGELOG.rst
@@ -83,6 +83,12 @@ Changed
``st2rulesengine`` service. This would make such issues very hard to troubleshoot because only
way to find out about this failure would be to inspect the ``st2rulesengine`` service logs.
(improvement) #4231
+* Improve code metric instrumentation and instrument code and various services with more metrics.
+(improvement) #4310
+* Add new ``metrics.prefix`` config option. With this option, the user can specify an optional prefix
+which is prepended to each metric key (name). This comes in handy in scenarios where the user wants to
+submit metrics from multiple environments / deployments (e.g. testing, staging, dev) to the same
+backend instance. (improvement) #4310

Fixed
~~~~~
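
The ``metrics.prefix`` entry above can be illustrated with a minimal sketch. The helper name `get_full_metric_key` and the exact key layout (`st2`, then the optional prefix, then the key) are assumptions inferred from the commit message "include prefix after "st2""; the real implementation is not visible in this part of the diff.

```python
def get_full_metric_key(key, prefix=None):
    """Hypothetical helper: build "st2[.<prefix>].<key>" (layout is an assumption)."""
    parts = ['st2']
    if prefix:
        # The optional prefix separates environments that share one backend.
        parts.append(prefix)
    parts.append(key)
    return '.'.join(parts)


if __name__ == '__main__':
    print(get_full_metric_key('action.executions'))
    print(get_full_metric_key('action.executions', prefix='staging'))
```

With a layout like this, metrics from a staging and a production deployment land under different top-level branches of the same backend instance.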
52 changes: 52 additions & 0 deletions conf/metrics/carbon/storage-aggregation.conf
@@ -0,0 +1,52 @@
# Aggregation methods for whisper files. Entries are scanned in order,
# and first match wins. This file is scanned for changes every 60 seconds
#
# [name]
# pattern = <regex>
# xFilesFactor = <float between 0 and 1>
# aggregationMethod = <average|sum|last|max|min>
#
# name: Arbitrary unique name for the rule
# pattern: Regex pattern to match against the metric name
# xFilesFactor: Ratio of valid data points required for aggregation to the next retention to occur
# aggregationMethod: function to apply to data points for aggregation
#
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min

[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max

[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum

[count_legacy]
pattern = ^stats_counts.*
xFilesFactor = 0
aggregationMethod = sum

[lower]
pattern = \.lower(_\d+)?$
xFilesFactor = 0.1
aggregationMethod = min

[upper]
pattern = \.upper(_\d+)?$
xFilesFactor = 0.1
aggregationMethod = max

[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum

[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
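
The "entries are scanned in order, and first match wins" rule in the file above can be sketched as follows. This is an illustration only: the rule table is copied from the config, `match_rule` is a hypothetical helper, the metric names in the usage lines are made up, and the use of `re.search` (a match anywhere in the name unless anchored) is an assumption about how the patterns are applied.

```python
import re

# (name, pattern, xFilesFactor, aggregationMethod), in the same order as the file.
RULES = [
    ('min', r'\.min$', 0.1, 'min'),
    ('max', r'\.max$', 0.1, 'max'),
    ('count', r'\.count$', 0.0, 'sum'),
    ('count_legacy', r'^stats_counts.*', 0.0, 'sum'),
    ('lower', r'\.lower(_\d+)?$', 0.1, 'min'),
    ('upper', r'\.upper(_\d+)?$', 0.1, 'max'),
    ('sum', r'\.sum$', 0.0, 'sum'),
    ('default_average', r'.*', 0.5, 'average'),
]


def match_rule(metric_name):
    """Return (name, xFilesFactor, method) of the first rule whose pattern matches."""
    for name, pattern, x_files_factor, method in RULES:
        if re.search(pattern, metric_name):
            return name, x_files_factor, method
    return None


if __name__ == '__main__':
    # Hypothetical metric names, used only to exercise the rules.
    for metric in ('st2.action.executions.count', 'st2.api.request.upper_90', 'st2.api.request.mean'):
        print(metric, '->', match_rule(metric))
```

Because the catch-all `.*` rule comes last, it only applies to names that no earlier, more specific rule has claimed.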
20 changes: 20 additions & 0 deletions conf/metrics/carbon/storage-schemas.conf
@@ -0,0 +1,20 @@
# Schema definitions for Whisper files. Entries are scanned in order,
# and first match wins. This file is scanned for changes every 60 seconds.
#
# [name]
# pattern = regex
# retentions = timePerPoint:timeToStore, timePerPoint:timeToStore, ...

# Carbon's internal metrics. This entry should match what is specified in
# CARBON_METRIC_PREFIX and CARBON_METRIC_INTERVAL settings
[stats]
pattern = ^stats.*
retentions = 10s:1d,1m:7d,10m:1y

[carbon]
pattern = ^carbon\.
retentions = 60:90d

[default_1min_for_1day]
pattern = .*
retentions = 60s:1d
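
To make the `retentions = timePerPoint:timeToStore` syntax above concrete, here is a small sketch that converts a retention string into (seconds per point, number of points) pairs. The helper names are hypothetical, only single-letter units are handled, and a bare number in the second position is treated as a point count; treat it as an approximation of the format, not as carbon's own parser.

```python
# Seconds per unit; a year is taken as 365 days.
UNIT_SECONDS = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400, 'w': 604800, 'y': 31536000}


def to_seconds(text):
    """Convert '10s', '1d' or a bare number of seconds such as '60' to seconds."""
    text = text.strip()
    if text.isdigit():
        return int(text)
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def parse_retentions(spec):
    """Return a list of (seconds_per_point, points) tuples, one per archive."""
    archives = []
    for part in spec.split(','):
        precision_text, duration_text = part.strip().split(':')
        seconds_per_point = to_seconds(precision_text)
        if duration_text.strip().isdigit():
            # Assumption: a bare number here is a point count, not a duration.
            points = int(duration_text)
        else:
            points = to_seconds(duration_text) // seconds_per_point
        archives.append((seconds_per_point, points))
    return archives


if __name__ == '__main__':
    print(parse_retentions('10s:1d,1m:7d,10m:1y'))
    print(parse_retentions('60:90d'))
```

For the `[stats]` rule this yields three archives of decreasing resolution: 10-second points for a day, 1-minute points for a week, and 10-minute points for a year.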
19 changes: 19 additions & 0 deletions conf/metrics/statsd/localConfig.js
@@ -0,0 +1,19 @@
// Sample statsd config for usage with metrics instrumentation
{
// IP and port of a local or remote graphite instance to which statsd will
// submit metrics
graphiteHost: "127.0.0.1",
graphitePort: 2003,

// statsd listen IP and port
address: "0.0.0.0",
port: 8125,

// Enable debug mode for easier debugging, disable in production
debug: true,

// Disable legacy name prefix
graphite: {
legacyNamespace: false
}
}
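
With statsd listening on UDP port 8125 as configured above, a client submits metrics as plain-text lines of the form `<name>:<value>|<type>`, where the type is `c` (counter), `ms` (timer) or `g` (gauge). The sketch below shows that wire format using only the standard library; the function names and the metric key are hypothetical, and this is not the statsd driver added in this pull request.

```python
import socket


def format_statsd_line(key, value, metric_type):
    """Build one statsd line, e.g. 'st2.action.executions:1|c'."""
    return '%s:%s|%s' % (key, value, metric_type)


def send_metric(key, value, metric_type, host='127.0.0.1', port=8125):
    """Send a single metric over UDP (fire and forget, no delivery guarantee)."""
    payload = format_statsd_line(key, value, metric_type).encode('utf-8')
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


# Example call (needs a statsd instance listening on the given host and port):
# send_metric('st2.action.executions', 1, 'c')
```

With `debug: true` in the config above, statsd logs what it receives, which makes it easy to check that a line like this arrived.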
2 changes: 2 additions & 0 deletions conf/st2.conf.sample
@@ -183,6 +183,8 @@ cluster_urls = # comma separated list allowed here.
[metrics]
# Destination server to connect to if driver requires connection.
host = 127.0.0.1
+# Optional prefix which is prepended to all the metric names. Comes in handy when you want to submit metrics from various environments to the same metric backend instance.
+prefix = None
# Driver type for metrics collection.
driver = noop
# Destination port to connect to if driver requires connection.
2 changes: 1 addition & 1 deletion conf/st2.dev.conf
@@ -119,6 +119,6 @@ jitter_interval = 0
enable_common_libs = True

[metrics]
-driver = noop
+driver = echo
host = 127.0.0.1
port = 8125
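
Per the commit messages, the `echo` driver enabled here for dev environments "prints out metric calls". A minimal sketch of that idea follows; the class name, the method names (`time`, `inc_counter`, `dec_counter`, `set_gauge`) and the output format are assumptions for illustration, since the actual `EchoDriver` source is not part of this excerpt.

```python
class SketchEchoDriver(object):
    """Hypothetical stand-in for an echo-style driver: reports every metric call."""

    def __init__(self, emit=print):
        # `emit` is injectable so the output can be captured instead of printed.
        self._emit = emit

    def time(self, key, seconds):
        self._emit('[metrics] time(key=%s, seconds=%s)' % (key, seconds))

    def inc_counter(self, key, amount=1):
        self._emit('[metrics] inc_counter(key=%s, amount=%s)' % (key, amount))

    def dec_counter(self, key, amount=1):
        self._emit('[metrics] dec_counter(key=%s, amount=%s)' % (key, amount))

    def set_gauge(self, key, value):
        self._emit('[metrics] set_gauge(key=%s, value=%s)' % (key, value))


if __name__ == '__main__':
    driver = SketchEchoDriver()
    driver.inc_counter('action.executions')
    driver.set_gauge('queue.size', 3)
```

A driver of this kind needs no backend, which is why it suits local development: the instrumentation can be verified from the service output alone.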
11 changes: 6 additions & 5 deletions st2actions/st2actions/container/base.py
@@ -33,7 +33,7 @@
from st2common.util.action_db import (update_liveaction_status, get_liveaction_by_id)
from st2common.util import param as param_utils
from st2common.util.config_loader import ContentPackConfigLoader
-from st2common.metrics.base import CounterWithTimer, format_metrics_key
+from st2common.metrics.base import CounterWithTimer
from st2common.util import jsonify

from st2common.runners.base import get_runner
@@ -82,7 +82,7 @@ def dispatch(self, liveaction_db):
'in an unsupported status of "%s".' % liveaction_db.status
)

-with CounterWithTimer(key="st2.action.executions"):
+with CounterWithTimer(key="action.executions"):
liveaction_db = funcs[liveaction_db.status](runner)

return liveaction_db.result
@@ -122,9 +122,10 @@ def _do_run(self, runner):
extra = {'runner': runner, 'parameters': resolved_action_params}
LOG.debug('Performing run for runner: %s' % (runner.runner_id), extra=extra)

-with CounterWithTimer(key=format_metrics_key(action_db=runner.action, key='action')):
-    (status, result, context) = runner.run(action_params)
-    result = jsonify.try_loads(result)
+with CounterWithTimer(key='action.executions'):
+    with CounterWithTimer(key='action.%s.executions' % (runner.action.ref)):
+        (status, result, context) = runner.run(action_params)
+        result = jsonify.try_loads(result)

action_completed = status in action_constants.LIVEACTION_COMPLETED_STATES
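
The nested `CounterWithTimer` blocks in the hunk above record the same run under a global key and a per-action key. The sketch below shows how such a context manager could behave: it increments a counter on entry and submits the elapsed time on exit, without decrementing the counter (in line with the commit "Don't decrease counter value on context manager exit"). Both classes and the driver method names are hypothetical simplifications, and `core.local` is only an example action ref.

```python
import time


class RecordingDriver(object):
    """Hypothetical driver that records calls instead of sending them anywhere."""

    def __init__(self):
        self.calls = []

    def inc_counter(self, key, amount=1):
        self.calls.append(('inc_counter', key, amount))

    def time(self, key, seconds):
        self.calls.append(('time', key, seconds))


class SketchCounterWithTimer(object):
    """Simplified stand-in: count on entry, report the duration on exit."""

    def __init__(self, key, driver):
        self._key = key
        self._driver = driver
        self._start = None

    def __enter__(self):
        self._driver.inc_counter(self._key)
        self._start = time.time()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only the timer is submitted here; the counter is left untouched.
        self._driver.time(self._key, time.time() - self._start)
        return False  # never swallow exceptions raised inside the block


if __name__ == '__main__':
    driver = RecordingDriver()
    with SketchCounterWithTimer('action.executions', driver):
        with SketchCounterWithTimer('action.core.local.executions', driver):
            pass  # the action would run here
    print(driver.calls)
```

Nesting the two blocks means one run produces two counters and two timers, so dashboards can show both the overall rate and the rate per action.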

4 changes: 4 additions & 0 deletions st2api/st2api/app.py
@@ -22,6 +22,8 @@
from st2common.middleware.cors import CorsMiddleware
from st2common.middleware.request_id import RequestIDMiddleware
from st2common.middleware.logging import LoggingMiddleware
+from st2common.middleware.instrumentation import RequestInstrumentationMiddleware
+from st2common.middleware.instrumentation import ResponseInstrumentationMiddleware
from st2common.router import Router
from st2common.util.monkey_patch import monkey_patch
from st2common.constants.system import VERSION_STRING
@@ -75,6 +77,8 @@ def setup_app(config={}):
app = ErrorHandlingMiddleware(app)
app = CorsMiddleware(app)
app = LoggingMiddleware(app, router)
+app = ResponseInstrumentationMiddleware(app, service_name='api')
app = RequestIDMiddleware(app)
+app = RequestInstrumentationMiddleware(app, service_name='api')

return app
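
The order of the wrapping calls in `setup_app` matters: in a WSGI stack, the middleware applied last is the outermost one and therefore sees an incoming request first. The sketch below demonstrates this with a generic wrapper; `NamedMiddleware` and `base_app` are hypothetical, and the names only mirror the wrapping order visible in the hunk above, not the behaviour of the real classes.

```python
class NamedMiddleware(object):
    """Generic WSGI wrapper that records when it is invoked."""

    def __init__(self, app, name, seen):
        self.app = app
        self.name = name
        self.seen = seen

    def __call__(self, environ, start_response):
        self.seen.append(self.name)
        return self.app(environ, start_response)


def base_app(environ, start_response):
    """Innermost WSGI application."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


def build_app(seen):
    app = base_app
    # Same wrapping order as in setup_app(): the last entry ends up outermost.
    for name in ('error_handling', 'cors', 'logging', 'response_instrumentation',
                 'request_id', 'request_instrumentation'):
        app = NamedMiddleware(app, name, seen)
    return app


def call_order():
    """Send one fake request through the stack and return the invocation order."""
    seen = []
    app = build_app(seen)
    body = app({}, lambda status, headers: None)
    return seen, body


if __name__ == '__main__':
    print(call_order()[0])
```

Under this ordering the request instrumentation wrapper runs before every other middleware listed here, while the response instrumentation wrapper sits closer to the application, inside the request ID wrapper.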
4 changes: 4 additions & 0 deletions st2auth/st2auth/app.py
@@ -20,6 +20,8 @@
from st2common.middleware.cors import CorsMiddleware
from st2common.middleware.request_id import RequestIDMiddleware
from st2common.middleware.logging import LoggingMiddleware
+from st2common.middleware.instrumentation import RequestInstrumentationMiddleware
+from st2common.middleware.instrumentation import ResponseInstrumentationMiddleware
from st2common.router import Router
from st2common.util.monkey_patch import monkey_patch
from st2common.constants.system import VERSION_STRING
@@ -69,6 +71,8 @@ def setup_app(config={}):
app = ErrorHandlingMiddleware(app)
app = CorsMiddleware(app)
app = LoggingMiddleware(app, router)
+app = ResponseInstrumentationMiddleware(app, service_name='auth')
app = RequestIDMiddleware(app)
+app = RequestInstrumentationMiddleware(app, service_name='auth')

return app
1 change: 1 addition & 0 deletions st2common/setup.py
@@ -68,6 +68,7 @@
'st2common.metrics.driver': [
'statsd = st2common.metrics.drivers.statsd_driver:StatsdDriver',
'noop = st2common.metrics.drivers.noop_driver:NoopDriver',
+'echo = st2common.metrics.drivers.echo_driver:EchoDriver',
],
}
)
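
Each entry point string above maps a short driver name (the value of the `driver` config option) to a `module:attribute` target. As a rough illustration of that mapping, the sketch below parses such a string and imports the target with `importlib`. Both helpers are hypothetical; how st2 actually discovers and loads its drivers is not shown in this diff.

```python
import importlib


def parse_entry_point(line):
    """Split 'name = package.module:Attribute' into its three parts."""
    name, target = [part.strip() for part in line.split('=', 1)]
    module_name, attribute = target.split(':', 1)
    return name, module_name, attribute


def load_entry_point(line):
    """Import the module named in the entry point and return the attribute."""
    _, module_name, attribute = parse_entry_point(line)
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


if __name__ == '__main__':
    print(parse_entry_point('echo = st2common.metrics.drivers.echo_driver:EchoDriver'))
    # A standard-library target is used here so the example runs without st2 installed.
    print(load_entry_point('dumps = json:dumps'))
```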
5 changes: 5 additions & 0 deletions st2common/st2common/config.py
@@ -538,6 +538,11 @@ def register_opts(ignore_errors=False):
cfg.IntOpt(
'port', default=8125,
help='Destination port to connect to if driver requires connection.'),
+cfg.StrOpt(
+'prefix', default=None,
+help='Optional prefix which is prepended to all the metric names. Comes in handy when '
+'you want to submit metrics from various environments to the same metric '
+'backend instance.')
]

do_register_opts(metrics_opts, group='metrics', ignore_errors=ignore_errors)