Merged
85 commits
3c5b46d
Updatng release history
vishiy Aug 1, 2018
d31f588
fixing the plugin logs for emit stream
Aug 1, 2018
11fd5f6
updating log message
Aug 5, 2018
87a9cf8
Remove Log Processing from fluentd configuration
r-dilip Aug 16, 2018
308be41
Remove plugin references from base_container.data
r-dilip Aug 16, 2018
5bee0af
Merge pull request #124 from Microsoft/dilipr/fluentdConfigUpdates
r-dilip Aug 30, 2018
bcd1a3f
Dilipr/fluent bit log processing (#126)
r-dilip Sep 14, 2018
b02f2ec
Dilipr/glide updates (#127)
r-dilip Sep 14, 2018
e01c678
containerID="" for pull issues
vishiy Sep 17, 2018
b0ba22d
Using KubeAPI for getting image,name. Adding more logs (#129)
r-dilip Sep 18, 2018
9783419
Dilipr/mark comments (#130)
r-dilip Sep 27, 2018
8e35b73
Rashmi/segfault latest (#132)
rashmichandrashekar Sep 27, 2018
4b63021
Adding a missed null check (#135)
rashmichandrashekar Sep 27, 2018
8b964fd
reusing some variables (#136)
rashmichandrashekar Sep 28, 2018
938c2ed
Rashmi/cjson delete null check (#138)
rashmichandrashekar Sep 28, 2018
fbfdf11
updating log level to debug for some provider workflows (#139)
rashmichandrashekar Oct 3, 2018
d426066
Fixing CPU Utilization and removing Fluent-bit filters (#140)
r-dilip Oct 4, 2018
c2cabab
Minor tweaks 1. Remove some logging 2. Added more Error Handling 3. C…
r-dilip Oct 9, 2018
32567db
* Change FluentBit flush interval to 30 secs (from 5 secs)
vishiy Oct 10, 2018
afc981d
Container Log Telemetry
r-dilip Oct 12, 2018
4b958dd
Fixing an issue with Send Init Event if Telemetry is not initialized …
r-dilip Oct 12, 2018
510ef9f
PR feedback
r-dilip Oct 12, 2018
684c39b
PR feedback
r-dilip Oct 12, 2018
e165275
Sending an event every 5 mins(Heartbeat) (#146)
r-dilip Oct 15, 2018
eecb5db
Merge branch 'ci_feature_prod' into ci_feature
vishiy Oct 16, 2018
cfe1ca9
PR feedback to cleanup removed workflows
vishiy Oct 16, 2018
892b51c
updating agent version for telemetry
vishiy Oct 16, 2018
9c83160
updating agent version
vishiy Oct 17, 2018
f0b5a61
Telemetry Updates (#149)
r-dilip Oct 25, 2018
a58998e
Changes to send omsagent/omsagent-rs kubectl logs to App Insights (#159)
r-dilip Oct 30, 2018
4c2da9f
Rashmi/fluentd docker inventory (#160)
rashmichandrashekar Nov 5, 2018
6698fcd
Fix Telemetry Bug -- Initialize Telemetry Client after Initializing a…
r-dilip Nov 8, 2018
ad6bb93
Fix kube events memory leak due to yaml serialization for > 5k events…
vishiy Nov 12, 2018
eff92df
Setting Timeout for HTTP Client in PostDataHelper in outoms go plugi…
r-dilip Nov 14, 2018
9893e36
Vishwa/perftelemetry 2 (#165)
vishiy Nov 16, 2018
4f3c898
environment variable fix (#166)
rashmichandrashekar Nov 27, 2018
5e16467
Fixing a bug where we were crashing due to container statuses not pre…
vishiy Nov 27, 2018
b482b1e
Updating title
vishiy Nov 29, 2018
d75ba89
updating right versions for last release
vishiy Nov 29, 2018
cbd815c
Updating the break condition to look for end of response (#168)
rashmichandrashekar Nov 29, 2018
d0d5bf7
updating AgentVersion for telemetry
vishiy Nov 29, 2018
bfe27e5
Updating readme for latest release changes
vishiy Nov 29, 2018
5677560
Merge branch 'ci_feature_prod' into ci_feature
vishiy Nov 29, 2018
a621f88
Changes - (#173)
vishiy Dec 17, 2018
c9cf4fd
Rashmi/kubenodeinventory (#174)
rashmichandrashekar Dec 17, 2018
df6f122
Get cpuusage from usageseconds (#175)
vishiy Dec 20, 2018
dac9931
Rashmi/kubenodeinventory (#176)
rashmichandrashekar Dec 21, 2018
04cc1a8
Rashmi/kubenodeinventory (#178)
rashmichandrashekar Dec 26, 2018
5883f53
Fixing an issue on the cpurate metric, which happens for the first ti…
vishiy Dec 26, 2018
191f328
Rashmi/kubenodeinventory (#180)
rashmichandrashekar Dec 28, 2018
7e52e8c
Exclude docker containers from container inventory (#181)
rashmichandrashekar Jan 7, 2019
f0591f9
Exclude pauseamd64 containers from container inventory (#182)
rashmichandrashekar Jan 8, 2019
99e8813
Merge branch 'ci_feature_prod' into ci_feature
vishiy Jan 9, 2019
4782435
Update agent version
vishiy Jan 9, 2019
23bcc41
Updating readme for the latest release
vishiy Jan 9, 2019
51d5e93
Fix indentation in kube.conf and update readme (#184)
rashmichandrashekar Jan 11, 2019
decf86a
updating agent tag
rashmichandrashekar Jan 11, 2019
a1b35db
Get Pods for current Node Only (#185)
r-dilip Jan 29, 2019
22649ba
changes for container node inventory fixed type (#186)
rashmichandrashekar Jan 30, 2019
61e2eaf
Fix for mooncake (disable telemetry optionally) (#191)
vishiy Feb 13, 2019
30dff41
CustomMetrics to ci_feature (#193)
r-dilip Feb 15, 2019
f1b0cd2
add ContainerNotRunning column to KubePodInventory
bragi92 Jan 24, 2019
616a803
merge pr feedback: update name to ContainerStatusReason
bragi92 Jan 24, 2019
c33ca34
Zero Fill for Missing Pod Phases, Change Namespace Dimension to Kuber…
r-dilip Feb 19, 2019
2651750
No Retries for non 404 4xx errors (#196)
r-dilip Feb 20, 2019
195bc33
Update agent version for telemetry
vishiy Feb 20, 2019
59d6c61
Update readme for upcoming (ciprod01202019) release
vishiy Feb 20, 2019
0189bc0
fix readme formatting
vishiy Feb 20, 2019
8221d2d
fix formatting for readme
vishiy Feb 20, 2019
30aa305
fix formatting for readme
vishiy Feb 20, 2019
f401116
fix readme
vishiy Feb 20, 2019
a2f45af
fix readme
vishiy Feb 21, 2019
759dbb5
fix agent version for telemetry
vishiy Feb 21, 2019
8bff5f9
Merge branch 'ci_feature_prod' into ci_feature
vishiy Feb 21, 2019
7956f40
fix date in readme
vishiy Feb 21, 2019
ee05656
update readme
vishiy Feb 21, 2019
2abcf67
Restart logs every 10MB instead of weekly (#198)
r-dilip Feb 21, 2019
18c107c
update agent version for telemetry
vishiy Feb 21, 2019
14b2b87
update readme
vishiy Feb 21, 2019
a1b551f
Merge branch 'ci_feature_prod' into ci_feature
vishiy Feb 21, 2019
5479dff
Update kube.conf to use %STATE_DIR_WS% instead of hardcoded path
rashmichandrashekar Feb 22, 2019
cdded2e
Fix AKSEngine Crash (#200)
r-dilip Mar 4, 2019
57be1c4
hotfix
vishiy Mar 13, 2019
940a6eb
fix readme for new version
vishiy Mar 13, 2019
154fe56
Merge branch 'ci_feature_prod' into ci_feature
vishiy Mar 13, 2019
7 changes: 7 additions & 0 deletions README.md
@@ -10,6 +10,13 @@ additional questions or comments.
 ## Release History
 
 Note : The agent version(s) below has dates (ciprod<mmddyyyy>), which indicate the agent build dates (not release dates)
 
+### 03/12/2019 - Version microsoft/oms:ciprod03122019
+- Fix for closing response.Body in outoms
+- Update Mem_Buf_Limit to 5m for fluentbit
+- Tail only files that were modified since 5 minutes
+- Remove some unwanted logs that are chatty in outoms
+- Fix for MDM disablement for AKS-Engine
+
 ### 02/21/2019 - Version microsoft/oms:ciprod02212019
 - Container logs enrichment optimization
9 changes: 4 additions & 5 deletions installer/conf/td-agent-bit.conf
@@ -10,16 +10,17 @@
     Path /var/log/containers/*.log
     DB /var/log/omsagent-fblogs.db
     Parser docker
-    Mem_Buf_Limit 30m
+    Mem_Buf_Limit 5m
     Path_Key filepath
     Skip_Long_Lines On
+    Ignore_Older 5m
 
 [INPUT]
     Name tail
     Tag oms.container.log.flbplugin.*
     Path /var/log/containers/omsagent*.log
     DB /var/opt/microsoft/docker-cimprov/state/omsagent-ai.db
-    Mem_Buf_Limit 30m
+    Mem_Buf_Limit 2m
     Path_Key filepath
     Skip_Long_Lines On
 
@@ -28,6 +29,4 @@
     EnableTelemetry true
     TelemetryPushIntervalSeconds 300
     Match oms.container.log.*
-    AgentVersion ciprod02212019
-
-
+    AgentVersion ciprod03122019
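
The `Ignore_Older 5m` line added to the first tail input corresponds to the README entry about tailing only files modified in the last 5 minutes. As a rough illustration of that selection rule, here is a minimal Ruby sketch. It is not Fluent Bit's implementation; the helper name `recently_modified` and the temporary files are hypothetical and exist only for this example.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: keep only the paths whose last modification time
# falls within max_age_seconds of `now`. This mimics the idea behind
# Ignore_Older; it is an assumption-based sketch, not Fluent Bit code.
def recently_modified(paths, max_age_seconds, now = Time.now)
  paths.select { |path| now - File.mtime(path) <= max_age_seconds }
end

Dir.mktmpdir do |dir|
  fresh = File.join(dir, "fresh.log")
  stale = File.join(dir, "stale.log")
  FileUtils.touch(fresh)
  # Pretend the last write to this file happened 10 minutes ago.
  FileUtils.touch(stale, mtime: Time.now - 10 * 60)

  kept = recently_modified([fresh, stale], 5 * 60)
  puts kept.map { |p| File.basename(p) }.inspect
end
```

With a 5 minute window only `fresh.log` is selected; widening the window to an hour would select both files.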
11 changes: 4 additions & 7 deletions source/code/go/src/plugins/oms.go
@@ -246,16 +246,11 @@ func PostDataHelper(tailPluginRecords []map[interface{}]interface{}) int {
 
 		if val, ok := imageIDMap[containerID]; ok {
 			stringMap["Image"] = val
-		} else {
-			Log("ContainerId %s not present in Name Map ", containerID)
-		}
+		}
 
 		if val, ok := nameIDMap[containerID]; ok {
 			stringMap["Name"] = val
-		} else {
-			Log("ContainerId %s not present in Image Map ", containerID)
-		}
-
+		}
 
 		dataItem := DataItem{
 			ID: stringMap["Id"],
@@ -319,6 +314,8 @@ func PostDataHelper(tailPluginRecords []map[interface{}]interface{}) int {
 		return output.FLB_RETRY
 	}
 
+	defer resp.Body.Close()
+
 	numRecords := len(dataItems)
 	Log("Successfully flushed %d records in %s", numRecords, elapsed)
 	ContainerLogTelemetryMutex.Lock()
4 changes: 2 additions & 2 deletions source/code/plugin/CustomMetricsUtils.rb
@@ -9,8 +9,8 @@ class << self
     def check_custom_metrics_availability(custom_metric_regions)
       aks_region = ENV['AKS_REGION']
       aks_resource_id = ENV['AKS_RESOURCE_ID']
-      if aks_region.to_s.empty? && aks_resource_id.to_s.empty?
-        false # This will also take care of AKS-Engine Scenario. AKS_REGION/AKS_RESOURCE_ID is not set for AKS-Engine. Only ACS_RESOURCE_NAME is set
+      if aks_region.to_s.empty? || aks_resource_id.to_s.empty?
+        return false # This will also take care of AKS-Engine Scenario. AKS_REGION/AKS_RESOURCE_ID is not set for AKS-Engine. Only ACS_RESOURCE_NAME is set
       end
 
       custom_metrics_regions_arr = custom_metric_regions.split(',')
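
The `CustomMetricsUtils.rb` change does two things: it swaps `&&` for `||`, so the guard fires when either variable is missing, and it adds an explicit `return`. In Ruby, a bare `false` inside an `if` that is not the method's last expression is evaluated and discarded, so the old guard never ended the method early. The sketch below illustrates this under stated assumptions: the method names `availability_before` and `availability_after` are hypothetical, the values are passed as arguments instead of being read from `ENV`, and the region membership test after the `split` is assumed, since the diff elides the rest of the method.

```ruby
# Old shape: `&&` only matches when BOTH values are missing, and the bare
# `false` is thrown away because it is not the method's last expression.
def availability_before(aks_region, aks_resource_id, custom_metric_regions)
  if aks_region.to_s.empty? && aks_resource_id.to_s.empty?
    false
  end
  # Assumed continuation (the diff does not show what follows the split).
  custom_metric_regions.split(",").include?(aks_region.to_s)
end

# New shape: `||` matches when EITHER value is missing, and `return`
# actually leaves the method.
def availability_after(aks_region, aks_resource_id, custom_metric_regions)
  if aks_region.to_s.empty? || aks_resource_id.to_s.empty?
    return false
  end
  # Same assumed continuation as above.
  custom_metric_regions.split(",").include?(aks_region.to_s)
end

regions = "eastus,westeurope"
puts availability_before("eastus", nil, regions) # true: the guard does not stop the method
puts availability_after("eastus", nil, regions)  # false: missing resource id short-circuits
```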
23 changes: 18 additions & 5 deletions source/code/plugin/out_mdm.rb
@@ -29,6 +29,7 @@ def initialize
       @cached_access_token = String.new
       @last_post_attempt_time = Time.now
       @first_post_attempt_made = false
+      @can_send_data_to_mdm = true
     end
 
     def configure(conf)
@@ -39,7 +40,13 @@ def configure(conf)
 
     def start
       super
-      file = File.read(@@azure_json_path)
+      begin
+        file = File.read(@@azure_json_path)
+      rescue => e
+        @log.info "Unable to read file #{@@azure_json_path} #{e}"
+        @can_send_data_to_mdm = false
+        return
+      end
       # Handle the case where the file read fails. Send Telemetry and exit the plugin?
       @data_hash = JSON.parse(file)
       @token_url = @@token_url_template % {tenant_id: @data_hash['tenantId']}
@@ -48,11 +55,13 @@ def start
       aks_region = ENV['AKS_REGION']
       if aks_resource_id.to_s.empty?
         @log.info "Environment Variable AKS_RESOURCE_ID is not set.. "
-        raise Exception.new "Environment Variable AKS_RESOURCE_ID is not set!!"
+        @can_send_data_to_mdm = false
+        return
       end
       if aks_region.to_s.empty?
         @log.info "Environment Variable AKS_REGION is not set.. "
-        raise Exception.new "Environment Variable AKS_REGION is not set!!"
+        @can_send_data_to_mdm = false
+        return
       end
 
       @@post_request_url = @@post_request_url_template % {aks_region: aks_region, aks_resource_id: aks_resource_id}
@@ -115,14 +124,18 @@ def format(tag, time, record)
     # 'chunk' is a buffer chunk that includes multiple formatted records
     def write(chunk)
       begin
-        if !@first_post_attempt_made || (Time.now > @last_post_attempt_time + retry_mdm_post_wait_minutes*60)
+        if (!@first_post_attempt_made || (Time.now > @last_post_attempt_time + retry_mdm_post_wait_minutes*60)) && @can_send_data_to_mdm
           post_body = []
           chunk.msgpack_each {|(tag, record)|
             post_body.push(record.to_json)
           }
           send_to_mdm post_body
         else
-          @log.info "Last Failed POST attempt to MDM was made #{((Time.now - @last_post_attempt_time)/60).round(1)} min ago. This is less than the current retry threshold of #{@retry_mdm_post_wait_minutes} min. NO-OP"
+          if !@can_send_data_to_mdm
+            @log.info "Cannot send data to MDM since all required conditions were not met"
+          else
+            @log.info "Last Failed POST attempt to MDM was made #{((Time.now - @last_post_attempt_time)/60).round(1)} min ago. This is less than the current retry threshold of #{@retry_mdm_post_wait_minutes} min. NO-OP"
+          end
         end
       rescue Exception => e
         @log.info "Exception when writing to MDM: #{e}"
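
The `out_mdm.rb` change replaces "raise during `start`" with "record that sending is impossible and return", and `write` then checks that flag before posting. The following is a minimal sketch of that pattern only, under stated assumptions: the class `MdmGateSketch`, its injected `json_path` and `env` arguments, and the `disable` helper are hypothetical, and the retry window and the real HTTP POST are deliberately left out.

```ruby
require "json"

# Hypothetical stand-in for the output plugin: only the "disable instead of
# raise" gating is modelled. The retry window and the real POST are omitted.
class MdmGateSketch
  attr_reader :can_send_data_to_mdm, :posted, :log_lines

  # json_path and env are injected (an assumption of this sketch) so it runs
  # without the real azure.json file or AKS environment variables.
  def initialize(json_path, env)
    @json_path = json_path
    @env = env
    @can_send_data_to_mdm = true
    @posted = []
    @log_lines = []
  end

  # A missing prerequisite flips the flag and returns early; nothing is raised.
  def start
    begin
      file = File.read(@json_path)
    rescue StandardError => e
      disable("Unable to read file #{@json_path} #{e}")
      return
    end
    @data_hash = JSON.parse(file)

    %w[AKS_RESOURCE_ID AKS_REGION].each do |name|
      if @env[name].to_s.empty?
        disable("Environment Variable #{name} is not set")
        return
      end
    end
  end

  # Each call either "posts" the records or logs why it did nothing.
  def write(records)
    if @can_send_data_to_mdm
      @posted.concat(records.map(&:to_json))
    else
      @log_lines << "Cannot send data to MDM since all required conditions were not met"
    end
  end

  private

  def disable(reason)
    @log_lines << reason
    @can_send_data_to_mdm = false
  end
end

gate = MdmGateSketch.new("/no/such/dir/azure.json", {})
gate.start                      # no exception escapes
gate.write([{ "value" => 1 }])
puts gate.can_send_data_to_mdm  # false
puts gate.posted.length         # 0
```

The design choice this illustrates is that a plugin which cannot be configured stays loaded and turns every `write` into a logged no-op, rather than failing at startup.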