Merge ci_feature to ci_feature_prod by rashmichandrashekar · Pull Request #271 · microsoft/Docker-Provider

rashmichandrashekar · 2019-10-07T18:55:53Z

No description provided.

vishiy

approving to be able to resolve conflicts

vishiy · 2019-10-07T22:13:14Z

installer/conf/telegraf.conf

  #tagexclude = ["AgentVersion","AKS_RESOURCE_ID","ACS_RESOURCE_NAME", "Region", "ClusterName", "ClusterType", "Computer", "ControllerType"]
-
+  [inputs.prometheus.tagpass]
+    operation_type = ["create_container", "remove_container", "pull_image"]


please take a look at workbook.
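For readers unfamiliar with the `tagpass` table added in the diff above, here is a minimal sketch of how such a filter behaves, assuming Telegraf's documented rule that a metric is kept only if it carries one of the listed tag keys with a value matching one of the glob patterns. The function name `passes_tagpass` and the `TAGPASS` dictionary are hypothetical and exist only for illustration; Telegraf reads the table from TOML and uses its own glob matcher, which may differ from Python's `fnmatch` for non-literal patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for the [inputs.prometheus.tagpass] table in the diff.
TAGPASS = {
    "operation_type": ["create_container", "remove_container", "pull_image"],
}


def passes_tagpass(tags, tagpass):
    """Return True if any tag key listed in tagpass is present on the metric
    and its value matches at least one of that key's glob patterns."""
    for key, patterns in tagpass.items():
        value = tags.get(key)
        if value is not None and any(fnmatchcase(value, p) for p in patterns):
            return True
    return False


# Illustrative metrics (made-up tag sets, not real agent output).
kept = passes_tagpass({"operation_type": "pull_image"}, TAGPASS)
dropped = passes_tagpass({"operation_type": "list_containers"}, TAGPASS)
untagged = passes_tagpass({"node": "node-1"}, TAGPASS)
```

Under this reading, metrics without an `operation_type` tag are dropped along with those whose value is not in the list, which is worth keeping in mind when checking the workbook mentioned in the review comment.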

* Updatng release history * fixing the plugin logs for emit stream * updating log message * Remove Log Processing from fluentd configuration * Remove plugin references from base_container.data * Dilipr/fluent bit log processing (#126) * Build out_oms.so and include in docker-cimprov package * Adding fluent-bit-config file to base container * PR Feedback * Adding out_oms.conf to base_container.data * PR Feedback * Making the critical section as small as possible * PR Feedback * Fixing the newline bug for Computer, and changing containerId to Id * Dilipr/glide updates (#127) * Updating glide.* files to include lumberjack * containerID="" for pull issues * Using KubeAPI for getting image,name. Adding more logs (#129) * Using KubeAPI for getting image,name. Adding more logs * Moving log file and state file to within the omsagent container * Changing log and state paths * Dilipr/mark comments (#130) * Marks Comments + Error Handling * Drop records from files that are not in k8s format * Remove unnecessary log line' * Adding Log to the file that doesn't conform to the expected format * Rashmi/segfault latest (#132) * adding null checks in all providers * fixing type * fixing type * adding more null checks * update cjson * Adding a missed null check (#135) * reusing some variables (#136) * Rashmi/cjson delete null check (#138) * adding null check for cjson-delete * null chk * removing null check * updating log level to debug for some provider workflows (#139) * Fixing CPU Utilization and removing Fluent-bit filters (#140) Removing fluent-bit filters, CPU optimizations * Minor tweaks 1. Remove some logging 2. Added more Error Handling 3. 
Continue when there is an error with k8s api (#141) * Removing some logs, added more error checking, continue on kube-api error * Return FLB OK for json Marshall error, instead of RETRY * * Change FluentBit flush interval to 30 secs (from 5 secs) * Remove ContainerPerf, ContainerServiceLog,ContainerProcess (OMI workflows) for Daemonset * Container Log Telemetry * Fixing an issue with Send Init Event if Telemetry is not initialized properly, tab to whitespace in conf file * PR feedback * PR feedback * Sending an event every 5 mins(Heartbeat) (#146) * PR feedback to cleanup removed workflows * updating agent version for telemetry * updating agent version * Telemetry Updates (#149) * Telemetry Fixes 1. Added Log Generation Rate 2. Fixed parsing bugs 3. Added code to send Exceptions/errors * PR Feedback * Changes to send omsagent/omsagent-rs kubectl logs to App Insights (#159) * Changes to send omsagent/omsagent-rs kubectl logs to App Insights * PR Feedback * Rashmi/fluentd docker inventory (#160) * first stab * changes * changes * docker util changes * working tested util * input plugin and conf * changes * changes * changes * changes * changes * working containerinventory * fixing omi removal from container.conf * removing comments * file write and read * deleted containers working * changes * changes * socket timeout * deleting test files * adding log * fixing comment * appinsights changes * changes * tel changes * changes * changes * changes * changes * lib changes * changes * changes * fixes * PR comments * changes * updating the ownership * changes * changes * changes to container data * removing comment * changes * adding collection time * bug fix * env string truncation * changes for acs-engine test * Fix Telemetry Bug -- Initialize Telemetry Client after Initializing all required properties (#162) * Fix kube events memory leak due to yaml serialization for > 5k events (#163) * Setting Timeout for HTTP Client in PostDataHelper in outoms go plugin(#164) * 
Vishwa/perftelemetry 2 (#165) * add cpu usage telemetry for ds & rs * add cpu & memory usage telemetry for ds & rs * environment variable fix (#166) * environment variable fix * updating agent version * Fixing a bug where we were crashing due to container statuses not present when not was lost (#167) * Updating title * updating right versions for last release * Updating the break condition to look for end of response (#168) * Updating the break condition to look for end of response * changes for docker response * updating AgentVersion for telemetry * Updating readme for latest release changes * Changes - (#173) * use /var/log for state * new metric ContainerLogsAgentSideLatencyMs * new field 'timeOfComand' * Rashmi/kubenodeinventory (#174) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * Get cpuusage from usageseconds (#175) * Rashmi/kubenodeinventory (#176) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * updating socket to the new mount location * Adding exception telemetry and heartbeat * changes to fix controller type * Fixing typo * fixing method signature * updating plugins to get controller type from env * fixing bugs * Rashmi/kubenodeinventory (#178) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * updating socket to the new mount location * Adding exception telemetry and heartbeat * changes to fix controller type * Fixing typo * fixing method signature * updating plugins to get controller type from env * fixing bugs * changes to fixed type * removing comments * changes 
for fixed type * Fixing an issue on the cpurate metric, which happens for the first time (when cache is empty) (#179) * Rashmi/kubenodeinventory (#180) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * updating socket to the new mount location * Adding exception telemetry and heartbeat * changes to fix controller type * Fixing typo * fixing method signature * updating plugins to get controller type from env * fixing bugs * changes to fixed type * removing comments * changes for fixed type * adding kubelet version as a dimension * Exclude docker containers from container inventory (#181) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * updating socket to the new mount location * Adding exception telemetry and heartbeat * changes to fix controller type * Fixing typo * fixing method signature * updating plugins to get controller type from env * fixing bugs * changes to fixed type * removing comments * changes for fixed type * adding kubelet version as a dimension * Excluding raw docker containers from container inventory * making labels key case insensitive * make poduid label case insensitive * Exclude pauseamd64 containers from container inventory (#182) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * updating socket to the new mount location * Adding exception telemetry and heartbeat * changes to fix controller type * Fixing typo * fixing method signature * updating plugins to get controller type from env * fixing bugs * changes to fixed type * removing comments * changes for 
fixed type * adding kubelet version as a dimension * Excluding raw docker containers from container inventory * making labels key case insensitive * make poduid label case insensitive * changes to exclude pause amd 64 containers * Update agent version * Updating readme for the latest release * Fix indentation in kube.conf and update readme (#184) * containernodeinventory changes * changes for containernodeinventory * changes to add node telemetry * pod telemetry cahnges * updated telemetry changes * changes to get uid of owner references as controller id * updating socket to the new mount location * Adding exception telemetry and heartbeat * changes to fix controller type * Fixing typo * fixing method signature * updating plugins to get controller type from env * fixing bugs * changes to fixed type * removing comments * changes for fixed type * adding kubelet version as a dimension * Excluding raw docker containers from container inventory * making labels key case insensitive * make poduid label case insensitive * changes to exclude pause amd 64 containers * fixing indentation so that kube.conf contents can be used in config map in the yaml * updating readme to fix date and agent version * updating agent tag * Get Pods for current Node Only (#185) * Fix KubeAPI Calls to filter to get pods for current node * Reinstate log line * changes for container node inventory fixed type (#186) * Fix for mooncake (disable telemetry optionally) (#191) * disable telemetry option * fix a typo * CustomMetrics to ci_feature (#193) Custom Metrics changes to ci_feature * add ContainerNotRunning column to KubePodInventory * merge pr feedback: update name to ContainerStatusReason * Zero Fill for Missing Pod Phases, Change Namespace Dimension to Kubernetes namespace, as it might be confused with metrics namespace in Metrics Explorer (#194) * Zero Fill for Pod Counts by Phase * Change namespace dimension to Kubernetes namespace * No Retries for non 404 4xx errors (#196) * Update agent 
version for telemetry * Update readme for upcoming (ciprod01202019) release * fix readme formatting * fix formatting for readme * fix formatting for readme * fix readme * fix readme * fix agent version for telemetry * fix date in readme * update readme * Restart logs every 10MB instead of weekly (#198) * Rotate logs every 10MB instead of weekly * Removing some logging, fixed log rotation * update agent version for telemetry * update readme * Update kube.conf to use %STATE_DIR_WS% instead of hardcoded path * Fix AKSEngine Crash (#200) * hotfix * close resp.Body * remove chatty logs * membuf=5m and ignore files not updated since 5 mins * fix readme for new version * Fix the pod count in mdm agent plugin (#203) * Update readme * string freeze for out_mdm plugin * Vishwa/resourcecentric (#208) * resourceid fix (for AKS only) * fix name * Rashmi/win nodepool - PR (#206) * changes for win nodes enumeration * changes * changes * changes * node cpu metric rate changes * container cpu rate * changes * changes * changes * changes * changes * changes to include in_win_cadvisor_perf.rb file * send containerinventoryheartbeatevent * changes * cahnges for mdm metrics * changes * cahnges * changes * container states * changes * changes * changes for env variables * changes * changes * changes * changes * delete comments * changes * mutex changes * changes * changes * changes * telemetry fix for docker version * removing hardcoded values for mdm * update docker version * telemetry for windows cadvisor timeouts * exeception key update to computer * PR comments * adding os to container inventory for windows nodes (#210) * Fix omsagent crash Error when kube-api returns non-200, send events for HTTP Errors (#211) * Fix omsagent crash Error when kube-api returns non-200, send events for HTTP Errors * Fixing the bug, deferring telemetry changes for later * updating to lowercase compare for units (#212) * Merge from vishwa/telegraftcp to ci_feature for telegraf changes (#214) * merge 
from Vishwa/telegraf to Vishwa/telegraftcp for telegraf changes (#207) * add configuration for telegraf * fix for perms * fix telegraf config. * fix file location & config * update to config * fix namespace * trying different namespace and also debug=true * add placeholder for nodename * change namespace * updated config * fix uri * fix azMon settings * remove aad settings * add custom metrics regions * fix config * add support for replica-set config * fix oomkilled * Add telegraf 403 metric telemetry & non 403 trace telemetry * fix type * fix package * fix package import * fix filename * delete unused file * conf file for rs; fix 403counttotal metric for telegraf, remove host and use nodeName consistently, rename metrics * fix statefulsets * fix typo. * fix another typo. * fix telemetry * fix casing issue * fix comma issue. * disable telemetry for rs ; fix stateful set name * worksround for namespace fix * telegraf integration - v1 * telemetry changes for telegraf * telemetry & other changes * remove custom metric regions as we dont need anymore * remove un-needed files * fixes * exclude certain volumes and fix telemetry to not have computer & nodename as dimensions (redundant) * Vishwa/resourcecentric (#208) (#209) * resourceid fix (for AKS only) * fix name * near final metric shape * change from customlog to fixed type (InsightsMetrics) * fix PR feedback * fix pr feedback * Fix telemetry error for telegraf err count metric (#215) * Fix Unscheduled Pod bug, remove excess telemetry (#218) * Fix Unscheduled Pod bug, remove excess telemetry * Send Success Telemetry only once after startup for a node in a cluster for MDM Post * Sending telemetry for successful push to MDM every hour * Merge from Vishwa/promstandardmetrics into ci_feature (#220) * enable prometheus metrics collection in replica-set * fixing typos * fix config file path for replicaset * fix configuration * config changes * merge config/settings to ci_feature (#221) * updating fluentbit to use 
LOG_TAIL_PATH * changes * log exclusion pattern * changes * removing comments * adding enviornment varibale collection/disable * disable env var for cluster variable change * changes * toml parser changes * adding directory tomlrb * changes for container inventory * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * Telemetry for config overrides * add schema version telemetry * reduce the number of api calls for namespace filtering add more telemetry for config processing move liveness probe & parser to this repo * optimize for default kube-system namespace log collection exclusion * Fix Scenario when Controller name is empty (#222) * fix ; * ContainerLog collection optimizations (#223) * * derive k8s namespace from file (rather than making a api call) * optimize perf by not tailing excluded namespaces in stdout & stderr * Tuning fluentbit settings based on Cortana teams findings * making db sync off * buffer chunk and max as 1m so that we dont flush > 1m payloads * increasing rotatte wait from 5 secs to 30 secs * decreasing refresh interval from 60 secs to 30 secs * adding retry limit as 10 so that items get dropped in 50 secs rather than infinetely trying * changing flush to 5 secs from 30 secs * merge final changes for release from Vishwa/june2019agentrel to ci_feature (#224) * * derive k8s namespace from file (rather than making a api call) * optimize perf by not tailing excluded namespaces in stdout & stderr * Tuning fluentbit settings based on Cortana teams findings * making db sync off * buffer chunk and max as 1m so that we dont flush > 1m payloads * increasing rotatte wait from 5 secs to 30 secs * decreasing refresh interval from 60 secs to 30 secs * adding retry limit as 10 so that items get dropped in 50 secs rather than infinetely trying * changing flush to 5 secs from 30 secs * fix a minor 
comment * * change flush from 5 to 10 secs based on perf findings * fix fluent bit tuning for perf run (#226) * fix fluent bit tuning for perf run * stop collecting our own partition * fix merge issue * add release notes for june release in ci_feature branch * fix title * update * fix title * Trim spaces in AKS_REGION (#233) This is not an issue for normal AKS Monitoring Addon Onboarding. ONLY an issue for backdoor onboarding * Add Logs Size To Telemetry (#234) * Add Logs to telemetry * Using len instead of unsafe.Sizeof * Merge Vishwa/promcustommetrics to ci_feature (#237) * hard code config for UST CCP team * fix config * fix config after discussion * fix error log to get errros * fix config * update config * Add telemetry * Rashmi/promcustomconfig (#231) * changes * formatting changes * changes * changes * changes * changes * changes * changes * changes * changes * adding telemetry * changes * changes * changes * changes * changes * changes * changes * cahnges * changes * Rashmi/promcustomconfig (#236) * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * fix exceptions * changes to remove some exceptions * exception fixes * changes * changes for poduid nil check * Fix Region space error (#239) * Trim spaces in AKS_REGION This is not an issue for normal AKS Monitoring Addon Onboarding. 
ONLY an issue for backdoor onboarding * Fix out_mdm parsing error * Removing buffer chunk size and buffer max size from fluentbit conf (#240) * hard code config for UST CCP team * fix config * fix config after discussion * fix error log to get errros * fix config * update config * Add telemetry * Rashmi/promcustomconfig (#231) * changes * formatting changes * changes * changes * changes * changes * changes * changes * changes * changes * adding telemetry * changes * changes * changes * changes * changes * changes * changes * cahnges * changes * Rashmi/promcustomconfig (#236) * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * fix exceptions * changes to remove some exceptions * exception fixes * changes * changes for poduid nil check * removing buffer chunk size and buffer max size from fluentbit conf * changes (#243) * Collect container last state (#235) * updating the OMS agent to also collect container last state * changed a comment * git surrounded ContainerLastStatus code in a begin/rescue block * added a lot of error checking and logging * Rashmi/fix prom telemetry (#247) * fix prom telemetry * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * Merge Health Model work into ci_feature behind a feature flag Pending perf testing (#246) Merge Health to ci_feature * Fix Deserialization Bug (#249) * Fix the bug where capacity is not updated and cached value was being used (#251) * Fix the Capacity computation * fix node cpu and memory limits calculation * changes (#250) * Added new Custom Metrics Regions, fixed MDM plugin crash bug (#253) Added new regions, added handler for MDM plugin start * Add Missing Handlers (#254) * Added Missing Handlers * Return MultiEventStream.new instead of empty array (#256) * Added explicit require_relative to avoid 
loading errors (#258) * Adding explicit require_relative * Gangams/enable ai telemetry in mc (#252) * enable ai telemetry to configure different ikey and endpoint per cloud * Fixing null check out_mdm bug, tomlparser bug, exposing Replica Set service name as an ENV variable (#261) * Expose replica set service as an env variable * Fixing null check out_mdm bug, and tomlparser bug * Updating the env variable name to be more specific to health model * Changes for creating custom plugins with namespace settings for prometheus scraping (#262) * changes * changes * changes * changes * changes * changes * chnages * changes * telemetry changes * changes * Cherry-pick hotfix 09092019 to ci_feature (#265) * Gangams/add telemetry hybrid (#264) * add telemetry to detect the cloud, distro and kernel version * add null check since providerId optional * detect azurestack cloud * rename to KubernetesProviderID since ProviderID name already used in LA * capture workspaceCloud to the telemetry * trim the domain read from file * KubeMonAgentEvents changes to collect configuration events (#267) * changes * changes * changes * changes * changes * changes * env changes * changes * changes * changes * reverting * changes * cahnges * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * chnages * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * 
changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * Fix the Dupe Perf Data Issue from the DaemonSet (#266) * Dupe Perf Record Fix * PR for 1. Container Memory CPU monitor 2. Configuration for Node Conditions 3. Fixed Type Changes 4. Use Env variable, and health_forward (that handles network errors at init) 5. Unit Tests (#268) * init containers fix and other bug fixes (#269) * init container - KPI and kubeperf changes * changes * changes * changes * changes for empty array fix * changes * changes * pod inventory exception fix * nil check changes * changes * fixing typo * changes * changes * PR - feedback * remove comment * tag pass changes * changes * tagdrop changes * changes * changes * Send agg monitor signal on details change (#270) send when an agg monitor details change, but state did not change

vishiy and others added 30 commits August 1, 2018 16:54

Updatng release history

3c5b46d

fixing the plugin logs for emit stream

d31f588

updating log message

11fd5f6

Remove Log Processing from fluentd configuration

87a9cf8

Remove plugin references from base_container.data

308be41

Merge pull request #124 from Microsoft/dilipr/fluentdConfigUpdates

5bee0af

Dilipr/fluentd config updates

Dilipr/glide updates (#127)

b02f2ec

* Updating glide.* files to include lumberjack

containerID="" for pull issues

e01c678

Using KubeAPI for getting image,name. Adding more logs (#129)

b0ba22d

* Using KubeAPI for getting image,name. Adding more logs * Moving log file and state file to within the omsagent container * Changing log and state paths

Dilipr/mark comments (#130)

9783419

* Marks Comments + Error Handling * Drop records from files that are not in k8s format * Remove unnecessary log line' * Adding Log to the file that doesn't conform to the expected format

Rashmi/segfault latest (#132)

8e35b73

* adding null checks in all providers * fixing type * fixing type * adding more null checks * update cjson

Adding a missed null check (#135)

4b63021

reusing some variables (#136)

8b964fd

Rashmi/cjson delete null check (#138)

938c2ed

* adding null check for cjson-delete * null chk * removing null check

updating log level to debug for some provider workflows (#139)

fbfdf11

Fixing CPU Utilization and removing Fluent-bit filters (#140)

d426066

Removing fluent-bit filters, CPU optimizations

Minor tweaks 1. Remove some logging 2. Added more Error Handling 3. Continue when there is an error with k8s api (#141)

c2cabab

* Removing some logs, added more error checking, continue on kube-api error * Return FLB OK for json Marshall error, instead of RETRY

* Change FluentBit flush interval to 30 secs (from 5 secs)

32567db

* Remove ContainerPerf, ContainerServiceLog,ContainerProcess (OMI workflows) for Daemonset

Container Log Telemetry

afc981d

Fixing an issue with Send Init Event if Telemetry is not initialized properly, tab to whitespace in conf file

4b958dd

PR feedback

510ef9f

PR feedback

684c39b

Sending an event every 5 mins(Heartbeat) (#146)

e165275

Merge branch 'ci_feature_prod' into ci_feature

eecb5db

PR feedback to cleanup removed workflows

cfe1ca9

updating agent version for telemetry

892b51c

updating agent version

9c83160

Telemetry Updates (#149)

f0b5a61

* Telemetry Fixes 1. Added Log Generation Rate 2. Fixed parsing bugs 3. Added code to send Exceptions/errors * PR Feedback

Changes to send omsagent/omsagent-rs kubectl logs to App Insights (#159)

a58998e

* Changes to send omsagent/omsagent-rs kubectl logs to App Insights * PR Feedback

daweim0 and others added 20 commits July 15, 2019 11:01

Collect container last state (#235)

5ee482b

* updating the OMS agent to also collect container last state * changed a comment * git surrounded ContainerLastStatus code in a begin/rescue block * added a lot of error checking and logging

Rashmi/fix prom telemetry (#247)

378cc93

* fix prom telemetry * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes

Merge Health Model work into ci_feature behind a feature flag Pending perf testing (#246)

df60197

Merge Health to ci_feature

Fix Deserialization Bug (#249)

4adcd8b

Fix the bug where capacity is not updated and cached value was being used (#251)

2ee4307

* Fix the Capacity computation * fix node cpu and memory limits calculation

changes (#250)

e86f82f

Added new Custom Metrics Regions, fixed MDM plugin crash bug (#253)

c76ce47

Added new regions, added handler for MDM plugin start

Add Missing Handlers (#254)

10a79c8

* Added Missing Handlers

Return MultiEventStream.new instead of empty array (#256)

851ab4e

Added explicit require_relative to avoid loading errors (#258)

f20debb

* Adding explicit require_relative

Gangams/enable ai telemetry in mc (#252)

a8804df

* enable ai telemetry to configure different ikey and endpoint per cloud

Fixing null check out_mdm bug, tomlparser bug, exposing Replica Set service name as an ENV variable (#261)

8a5ebb0

* Expose replica set service as an env variable * Fixing null check out_mdm bug, and tomlparser bug * Updating the env variable name to be more specific to health model

Changes for creating custom plugins with namespace settings for prometheus scraping (#262)

a939bf7

* changes * changes * changes * changes * changes * changes * chnages * changes * telemetry changes * changes

Cherry-pick hotfix 09092019 to ci_feature (#265)

2a07233

Fix the Dupe Perf Data Issue from the DaemonSet (#266)

c472b12

* Dupe Perf Record Fix

PR for 1. Container Memory CPU monitor 2. Configuration for Node Conditions 3. Fixed Type Changes 4. Use Env variable, and health_forward (that handles network errors at init) 5. Unit Tests (#268)

98e4114

Send agg monitor signal on details change (#270)

3079471

send when an agg monitor details change, but state did not change

rashmichandrashekar requested review from r-dilip and vishiy October 7, 2019 18:55

vishiy approved these changes Oct 7, 2019

resolving conflicts with ci_feature_prod

d16e2b0

r-dilip approved these changes Oct 7, 2019

vishiy reviewed Oct 7, 2019

vishiy approved these changes Oct 7, 2019

rashmichandrashekar merged commit 3dce027 into ci_feature_prod Oct 7, 2019

rashmichandrashekar merged 141 commits into ci_feature_prod from ci_feature

6 participants
