@aakarshg
Contributor

This is an initial draft to collect pprof data for the apiserver and Prometheus through conprof when running nodevert. Once this is merged, I will make the same changes for the other workloads.

Contributor

@rsevilla87 rsevilla87 left a comment

Nice job @aakarshg :) I've added a couple of minor comments

conprof_stop.sh: |
  #!/bin/sh
  set -o pipefail
  ps -ef | grep "conprof all --config.file /tmp/conprof.yaml --log.level=debug" | grep -v grep | awk '{print $2}' | xargs kill
Contributor

A pkill conprof should be enough to kill all conprof instances.

Contributor Author

Ooh nice, I'll try that and use it instead.
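
A minimal sketch of the pkill-based stop script suggested above. The `stop_conprof` function name and its messages are illustrative and not taken from the PR. Note that `pkill conprof` matches on process name, so it would signal every conprof process on the host, not only the one started with /tmp/conprof.yaml.

```shell
#!/bin/sh
# Sketch only: stop_conprof and its messages are illustrative, not from the PR.

# stop_conprof [NAME]: send SIGTERM to every process whose name matches NAME.
# pkill exits non-zero when nothing matched (or on error), so both cases
# are reported as "not running" here.
stop_conprof() {
    name="${1:-conprof}"
    if pkill "$name"; then
        echo "$name: SIGTERM sent"
    else
        echo "$name: not running"
    fi
}
```

Calling `stop_conprof` with no argument targets processes named conprof; passing a different name makes the helper reusable for other collectors.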

tls_config:
  insecure_skip_verify: true
static_configs:
- targets: ['apiserver0-openshift-kube-apiserver.apps.{{clustername}}.perf-testing.devcluster.openshift.com']
Contributor

We could specify multiple targets here to avoid repeating ourselves, e.g.:

- targets: ['apiserver0-openshift-kube-apiserver.apps.{{clustername}}.perf-testing.devcluster.openshift.com', 'apiserver1-openshift-kube-apiserver.apps.{{clustername}}.perf-testing.devcluster.openshift.com', 'apiserver2-openshift-kube-apiserver.apps.{{clustername}}.perf-testing.devcluster.openshift.com'] 

In this case we should also make the job_name more generic, with something like apiserver.

Contributor Author

@aakarshg aakarshg Apr 27, 2020

It's actually better to avoid specifying multiple targets, because we can then apply specific labels to each one. In this case I'm trying to apply a label node=node_name (might not happen in this PR, but maybe in subsequent ones) and also add a label apiserver=true. More detailed labels will make it easier to get results, versus trying to figure out which profile is associated with which master.
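
To illustrate the one-job-per-target approach with per-target labels, here is a rough sketch that renders such a config from a shell loop. It assumes conprof's static_configs accepts a Prometheus-style `labels` map; the `render_apiserver_jobs` helper, the `apiserverN` job names, and the `master-N` node values are hypothetical placeholders, not taken from the PR.

```shell
#!/bin/sh
# Sketch only: helper name, job names and node label values are hypothetical.
# Assumes static_configs accepts a Prometheus-style "labels" map.

# render_apiserver_jobs CLUSTERNAME BASE_DOMAIN [COUNT]
# Prints one scrape job per apiserver route, each with its own labels.
render_apiserver_jobs() {
    clustername="$1"
    base_domain="$2"
    count="${3:-3}"
    echo "scrape_configs:"
    i=0
    while [ "$i" -lt "$count" ]; do
        cat <<EOF
- job_name: 'apiserver${i}'
  tls_config:
    insecure_skip_verify: true
  static_configs:
  - targets: ['apiserver${i}-openshift-kube-apiserver.apps.${clustername}.${base_domain}']
    labels:
      apiserver: 'true'
      node: 'master-${i}'
EOF
        i=$((i + 1))
    done
}
```

For example, `render_apiserver_jobs mycluster perf-testing.devcluster.openshift.com 3` prints three jobs, so each profile can be traced back to a specific master by its `node` label.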

Contributor

@jtaleric jtaleric left a comment

Should we add a check for both the start and stop, to ensure the process has indeed started and stopped? It seems like we have blind faith in things running :)
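
One possible shape for such a check, as a sketch: poll with pgrep until the process reaches the expected state or a timeout expires. The `is_running` and `wait_for_state` helpers and the default timeout are assumptions for illustration, not part of the PR.

```shell
#!/bin/sh
# Sketch only: helper names and the default timeout are illustrative.

# is_running PATTERN: succeed if any process's full command line matches PATTERN.
is_running() {
    pgrep -f "$1" >/dev/null 2>&1
}

# wait_for_state PATTERN running|stopped [TIMEOUT_SECONDS]
# Checks at least once, then polls once per second until the process is in
# the wanted state. Returns 1 if the timeout expires first.
wait_for_state() {
    pattern="$1"
    want="$2"
    timeout="${3:-10}"
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        if is_running "$pattern"; then state=running; else state=stopped; fi
        if [ "$state" = "$want" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "timed out waiting for '$pattern' to be $want" >&2
    return 1
}
```

A start script could call `wait_for_state "conprof all" running` after launching conprof, and the stop script could call `wait_for_state "conprof all" stopped` after sending the signal. Because `pgrep -f` matches full command lines, the pattern should be specific enough not to match unrelated processes.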

@aakarshg aakarshg mentioned this pull request Apr 29, 2020
pbench_server: "{{ lookup('env', 'PBENCH_SERVER')|default('', true) }}"

# pprof variables
pprof_collect: "{{ lookup('env', 'PPROF_COLLECT')|default(false, true)|bool|lower }}"
Member

The docs need to be updated with details about the pprof collection.
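
As a starting point for those docs, here is a rough shell equivalent of how the `pprof_collect` expression above appears to treat the PPROF_COLLECT environment variable: unset or empty falls back to false, and the value is then normalized to a lowercase boolean string. The set of truthy strings (yes, on, 1, true, case-insensitive) reflects my understanding of Ansible's `bool` filter and should be treated as an assumption; `to_bool` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: to_bool is a hypothetical helper; the truthy values below are
# an assumption about how Ansible's bool filter treats strings.

# to_bool VALUE: print "true" for yes/on/1/true (any case), otherwise "false".
to_bool() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        yes|on|1|true) echo true ;;
        *) echo false ;;
    esac
}

# Unset or empty PPROF_COLLECT behaves like the default(false, true) fallback.
pprof_collect=$(to_bool "${PPROF_COLLECT:-}")
echo "pprof_collect=$pprof_collect"
```

Under these assumptions, running the workload with `PPROF_COLLECT=true` exported would enable collection, and leaving the variable unset would keep it disabled.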

tls_config:
  insecure_skip_verify: true
static_configs:
- targets: ['apiserver1-openshift-kube-apiserver.apps.{{clustername}}.perf-testing.devcluster.openshift.com']
Member

Do we have to modify the target when using a base domain other than perf-testing.devcluster.openshift.com, or does it not matter?

@aakarshg aakarshg force-pushed the conprof branch 2 times, most recently from 0ea5cab to 3319e64 Compare May 6, 2020 21:45
@aakarshg aakarshg requested a review from chaitanyaenr May 11, 2020 17:55
Member

@chaitanyaenr chaitanyaenr left a comment

LGTM

@aakarshg aakarshg force-pushed the conprof branch 3 times, most recently from be4f4c5 to 2b0040f Compare May 11, 2020 19:54
Contributor

@rsevilla87 rsevilla87 left a comment

LGTM!

@chaitanyaenr chaitanyaenr merged commit f8792ce into openshift-scale:master May 14, 2020
chaitanyaenr added a commit to chaitanyaenr/workloads that referenced this pull request Jun 30, 2020
This PR is highly inspired from openshift-scale#131.
It enables the users to collect pprof data through conprof.
chaitanyaenr added a commit to chaitanyaenr/workloads that referenced this pull request Jun 30, 2020
This PR is highly inspired from openshift-scale#131.
It enables the users to collect pprof data through conprof.
chaitanyaenr added a commit that referenced this pull request Jul 3, 2020
This PR is highly inspired from #131.
It enables the users to collect pprof data through conprof.