[windows] make windows python checks (specifically WMI) load and run #303
derekwbrown wants to merge 98 commits into master
This is the change that concerns me the most. Python should find everything in "site-packages", but the included DLLs weren't found until I explicitly added the directories below. That makes me concerned about the checks that use the other packages in site-packages.
hmm, so on Unix the actual script that's executed to start the agent is https://github.com/DataDog/datadog-agent/blob/1b15f18edbfb5848f16a4613c858ae31b42289d5/pkg/collector/dist/agent, which populates the PYTHONPATH before calling the actual agent binary.
Is this what was missing on Windows and what justifies this change, or is there a larger issue?
93fbb8c to 2f77868
…294) It should only panic when it makes sense for it to panic
* Fix error handling by using the same `stickyLock` when fetching the interpreter error
  Changing locks between a python call that sets the error indicator on the interpreter and a fetch of that error can result in the error indicator not being there anymore when the fetch is attempted. I'm not sure exactly why this happens. The fact that it happens "randomly" makes me think that it's related to how the goroutines are scheduled on OS threads by the go runtime, and to how this can affect the state of the python interpreter.
* Various improvements on `getPythonError`:
  * now a method of the `stickyLock` struct for clarity
  * check that an error occurred before trying to fetch it (improves quality of error that's returned by the function)
  * normalize the python error (recommended by the python C API docs)
  * use the string representation of `pvalue` (improves error message)
* Use `getPythonError` in py's `check.Run` to fetch errors from the python interpreter. We should use it wherever possible.
* [py] Add `increment`/`decrement` methods to `AgentCheck`
  Eases the transition from agent 5 to agent 6 for the checks. The methods use `count` after adding a suffix to the metric name. The suffix is required because `count` submits metrics with the `COUNT` api metric type, whereas in agent 5 these methods use the `RATE` api metric type, so we need to use a different metric name (the backend doesn't support using a different API metric type for the same metric name).
* [py] Log deprecation warning on first usage of `increment`/`decrement`
  Logged at most once per check to avoid spamming the logs.
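A minimal sketch of the `increment`/`decrement` shim described in this commit message, not the actual `AgentCheck` code. The class `CounterShimCheck`, the `.count` suffix, the warning text and the `submitted` list are all hypothetical; the commit message does not say which suffix is used.

```python
import logging

log = logging.getLogger(__name__)

# Hypothetical suffix: the commit message does not name the real one.
COUNT_SUFFIX = ".count"


class CounterShimCheck(object):
    """Stand-in for AgentCheck, only to illustrate the shim."""

    def __init__(self):
        self.submitted = []  # (metric name, value, tags) recorded by count()
        self._deprecation_logged = False

    def count(self, name, value, tags=None):
        # The real method submits with the COUNT API metric type;
        # here we only record the call.
        self.submitted.append((name, value, tags))

    def _warn_once(self):
        # Log at most once per check instance to avoid spamming the logs.
        if not self._deprecation_logged:
            log.warning("increment/decrement are deprecated")
            self._deprecation_logged = True

    def increment(self, name, value=1, tags=None):
        self._warn_once()
        # A different metric name is needed because the API metric type differs.
        self.count(name + COUNT_SUFFIX, value, tags)

    def decrement(self, name, value=1, tags=None):
        self._warn_once()
        self.count(name + COUNT_SUFFIX, -value, tags)
```

Calling `increment("my.metric")` on this stand-in records a submission under `my.metric.count`, and the warning is logged only on the first call.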
* [snmp] instance should be snmp_device, submit counters as rates.
* [snmp] fix broken snmp tests.
This time when `instances` is the only argument passed in `kwargs`. This pattern is quite common in the existing checks (since it's the signature of the agent5's `AgentCheck.__init__`), so it makes sense to support it. Added a test case, and assertions in the python test code.
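A rough sketch of what accepting both call styles could look like, assuming this commit concerns the base class constructor. `InitSketchCheck` is a hypothetical stand-in, and the argument handling here is illustrative rather than the code that was merged.

```python
class InitSketchCheck(object):
    """Stand-in for AgentCheck; the argument handling is illustrative only."""

    def __init__(self, *args, **kwargs):
        # agent5 style: AgentCheck(name, init_config, agentConfig, instances=None)
        self.name = args[0] if len(args) > 0 else kwargs.get("name", "")
        self.init_config = args[1] if len(args) > 1 else kwargs.get("init_config", {})

        if "instances" in kwargs:
            # `instances` passed as a keyword, possibly as the only argument.
            self.instances = kwargs["instances"] or []
        elif len(args) > 3:
            # `instances` passed positionally, after agentConfig.
            self.instances = args[3] or []
        else:
            self.instances = []
```

For example, `InitSketchCheck(instances=[{"host": "localhost"}])` and `InitSketchCheck("wmi", {}, {}, [{"host": "localhost"}])` both end up with the same `instances` list.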
* Add expvars to the scheduler
* Add expvar to check conf
* cleaned up dogstatsd rake commands
* build the static version of the binary when running size test
[percentiles] first pass at adding percentile sketch
* [jmx][auto-discovery] enabling auto-config for JMXfetch - WIP
  [jmxfetch] bumping jmxfetch to 0.14.0.
  [pipe][windows] fixing compilation issues.
  [jmx] try to instantiate loader, if we fail to create pipe - skip.
  [py] skip python if we cant load python loader.
  [jmx][auto-discovery] implementing whitelist for specific jmxfetch checks.
  [jmx][auto-discovery] fix output YAML indentation.
  [config] include String() method.
  [auto-discovery] separator includes cr.
  [auto-discovery] renaming some more vars.
* [jmxfetch] fix multiple vet/lint errors, adding test.
* [jmx][test] fix loader test with in-memory pipe + parallel I/O
* [check][test] adding check String function unit test.
* [jmx] renaming loader module in embed package to jmxloader.
Only imported the modules that are used by existing integrations-core checks.
As discussed. I've removed them entirely from `.gitlab-ci.yml` since there's nothing very specific to the gitlab tasks. We can re-add them once we've put some work into actually making them fully standalone.
* consolidate default values for settings
* explicitly set default values
* improved descriptions in example file
* moar fixes
Should be replaced by a "standard" `device:` tag. This change shouldn't make a difference in the backend, I'll make sure of that though.
2 parts to this change:
* `AgentCheck` supports `device_name` as a parameter to the metric submission methods for backwards-compatibility, but we should stop supporting it at some point (we log a deprecation notice when `device_name` is used).
* Remove all traces of `DeviceName` field in aggregator
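A minimal sketch of the backwards-compatible `device_name` handling described above, assuming it is folded into the tag list as `device:<name>`. `DeviceTagCheck`, its `gauge` signature, the warning text and the `submitted` list are hypothetical stand-ins, not the real `AgentCheck` code.

```python
import logging

log = logging.getLogger(__name__)


class DeviceTagCheck(object):
    """Stand-in for AgentCheck, only to illustrate the device_name shim."""

    def __init__(self):
        self.submitted = []  # (metric name, value, tags)

    def gauge(self, name, value, tags=None, device_name=None):
        tags = list(tags or [])  # copy so the caller's list is not mutated
        if device_name:
            # Deprecation notice, then translate to a standard `device:` tag.
            log.warning("device_name is deprecated, use a device: tag instead")
            tags.append("device:%s" % device_name)
        self.submitted.append((name, value, tags))
```

With this shape, a check that still passes `device_name="sda"` submits the same data as one that passes `tags=["device:sda"]` directly.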
[k8s] Support older versions missing ListDeployment API
Quite a few `integrations-core` checks use this class of exceptions for some of the exceptions they raise. It makes sense to keep it, it encourages some level of exception granularity in the checks.
[percentile] change GKArray methods to value receivers
* [load] adding system check.
* [system][iostats] adding iostats check + tests.
* [iostats] more precise comment..
* [uptime] adding check + tests.
* [load] removing logging statement.
* [iostats][test] amending expected number of asserted calls.
* [iostats] improve windows support.
* [iostats] refactor unix specific io submission, for clarity.
* [iostats] refactor, windows numbers differ greatly. Currently not BW compatible.
* [iostats] removing unnecessary C(go) header.
* [iostats] implement device blacklist.
* [iostats] fixing blacklist management + adding test.
* [windows] implement windows IO check
* Initial checkin of code using both WMI & PDH
* Modified APM request
* touch up logging. Remove calls to APM
* [load] adding system check.
* [system][iostats] adding iostats check + tests.
* [iostats] more precise comment..
* [load] removing logging statement.
* [iostats] improve windows support.
* [iostats] refactor unix specific io submission, for clarity.
* [iostats] refactor, windows numbers differ greatly. Currently not BW compatible.
* [iostats] removing unnecessary C(go) header.
* [iostats] implement device blacklist.
* [iostats] fixing blacklist management + adding test.
* [windows] implement windows IO check
* Initial checkin of code using both WMI & PDH
* Modified APM request
* touch up logging. Remove calls to APM
* Switch back to wmi, at least for now. Neither PDH nor WMI is picking up disk changes, at least in the manner tested.
* Fix problems with merge
* Fix merge. Currently uses WMI; will re-add pdh based IO check at a later date
* Changes to reflect review feedback
* centos6 changes (#354)
* adds upstart file to centos6
* adds upstart file to rpm
* changes conditional
* enables alternative python
* Fix gitlab ci so ci testing can continue
* Final review feedback. Make string conversion more efficient by using a byte buffer to do the original conversion, rather than appending a character on to the string for each pass.
* Final review feedback. Make string conversion more efficient by using a byte buffer to do the original conversion, rather than appending a character on to the string for each pass.
These python checks should be pulled from integrations-core now. Also, log with warning level the `self.warning` messages
Thin class that allows running the integrations-core `NetworkCheck`s out of the box. Also, added the `default_integration_http_timeout` config parameter and `AgentCheck` attribute that some checks use.
[percentile/forwarder] use v2 endpoint to forward sketch series
* add send_host_metadata to disable host metadata collection if running several instances/host
* rename option to enable_metadata_collection and reword documentation
Using `exec` makes the agent process replace the shell script's process instead of starting the agent process as a child process. This is desirable for init systems that can't really track the actual agent PID otherwise, and can then fail to track the agent process properly.
* [py] Initialize site, and use PYTHONHOME
  * Allow python to automatically initialize `site` when we initialize the interpreter (see https://docs.python.org/2/library/site.html). This makes python build its own default `PYTHONPATH` from the `PYTHONHOME`, so that it can load modules (built-ins, `site-packages` modules, etc).
  * Instead of setting `PYTHONPATH` (since we let python build that on its own), set `PYTHONHOME`. That'll pick up the embedded python, the rest will be handled by python.
* [py] Log paths that we add to the python path
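One way to check what the interpreter ended up with after `site` initialization is to inspect `sys.prefix` and `sys.path` from inside a check or a Python shell. This is only a diagnostic sketch; the helper name `site_packages_on_path` is made up for illustration and is not part of the agent.

```python
import sys


def site_packages_on_path():
    """Return the sys.path entries that look like site-packages directories."""
    suffixes = ("site-packages", "dist-packages")
    return [p for p in sys.path if p.rstrip("/\\").endswith(suffixes)]


# sys.prefix follows PYTHONHOME when that variable is set.
print("prefix: %s" % sys.prefix)
for entry in site_packages_on_path():
    print("site-packages entry: %s" % entry)
```

If no `site-packages` entry is listed, `site` was probably not initialized or the prefix points at the wrong Python installation.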
* Clean up the `SubmitV2` methods that were duplicates of the `Submit` methods
* Rename `CheckRuns` to `ServiceChecks` on the v2 endpoint, and change the endpoint to `/v2/service_checks`
* Add the `UNKNOWN` service check status
* Remove constants that were there only for the agent6-specific checks
@derekwbrown Can you try the changes that I've merged in #390? I think it should solve the loading issues you had on the modules in the
created new PR #406
Work in progress.
Agent6 can now load python checks, specifically the WMI check.