Gangams/fix rs ooming #473
Merged
46 commits
82d91c2 optimize kpi (ganga1980)
bede6ef optimize kube node inventory (ganga1980)
9f7759e add flags for events, deployments and hpa (ganga1980)
6073fed have separate function parseNodeLimits (ganga1980)
97f55f7 refactor code (ganga1980)
abc28c2 fix crash (ganga1980)
259a95c fix bug with service name (ganga1980)
b37529b fix bugs related to get service name (ganga1980)
7375e33 update oom fix test agent (ganga1980)
ed0857b debug logs (ganga1980)
b69f032 fix service label issue (ganga1980)
2eeaed4 update to latest agent and enable ephemeral annotation (ganga1980)
10e4b71 change stream size to 200 from 250 (ganga1980)
d003daa update yaml (ganga1980)
0ba0610 adjust chunksizes (ganga1980)
43975d9 add ruby gc env (ganga1980)
2b8660b yaml changes for cioomtest11282020-3 (ganga1980)
8e378fa telemetry to track pods latency (ganga1980)
fb56ab0 service count telemetry (ganga1980)
e9541ea rename variables (ganga1980)
023a7cb wip (ganga1980)
26f0772 nodes inventory telemetry (ganga1980)
79f40f1 configmap changes (ganga1980)
3545773 add emit streams in configmap (ganga1980)
9b7587d yaml updates (ganga1980)
9b857b4 fix copy and paste bug (ganga1980)
5597360 add todo comments (ganga1980)
8880e91 fix node latency telemetry bug (ganga1980)
87f52d6 update yaml with latest test image (ganga1980)
c4651c9 fix bug (ganga1980)
95144a6 upping rs memory change (ganga1980)
ae2cf42 fix mdm bug with final emit stream (ganga1980)
cf8da5c update to latest image (ganga1980)
11eda7c fix pr feedback (ganga1980)
2f3574d fix pr feedback (ganga1980)
6b589a9 rename health config to agent config (ganga1980)
53972c2 fix max allowed hpa chunk size (ganga1980)
f8702ff update to use 1k pod chunk since validated on 1.18+ (ganga1980)
531f768 remove debug logs (ganga1980)
cff2ee4 minor updates (ganga1980)
60d6391 move defaults to common place (ganga1980)
acb2f8f Merge branch 'ci_dev' into gangams/fix-rs-ooming (ganga1980)
f88ae92 chart updates (ganga1980)
0392e28 final oomfix agent (ganga1980)
6be2e13 update to use prod image so that can be validated with build pipeline (ganga1980)
1c25829 fix typo in comment (ganga1980)
172 changes: 172 additions & 0 deletions
build/linux/installer/scripts/tomlparser-agent-config.rb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,172 @@ | ||
| #!/usr/local/bin/ruby | ||
| | ||
| # this should be require_relative on Linux and require on Windows, since tomlrb is a gem install on Windows | ||
| @os_type = ENV["OS_TYPE"] | ||
| if !@os_type.nil? && !@os_type.empty? && @os_type.strip.casecmp("windows") == 0 | ||
| require "tomlrb" | ||
| else | ||
| require_relative "tomlrb" | ||
| end | ||
| | ||
| require_relative "ConfigParseErrorLogger" | ||
| | ||
| @configMapMountPath = "/etc/config/settings/agent-settings" | ||
| @configSchemaVersion = "" | ||
| @enable_health_model = false | ||
| | ||
| # 250 nodes (15KB per node) account for approximately 4MB | ||
| @nodesChunkSize = 250 | ||
| # 1000 pods (10KB per pod) account for approximately 10MB | ||
| @podsChunkSize = 1000 | ||
| # 4000 events (1KB per event) account for approximately 4MB | ||
| @eventsChunkSize = 4000 | ||
| # each deployment is roughly 8KB | ||
| # 500 deployments account for approximately 4MB | ||
| @deploymentsChunkSize = 500 | ||
| # each HPA is roughly 3KB | ||
| # 2000 HPAs account for approximately 6-7MB | ||
| @hpaChunkSize = 2000 | ||
| # emit stream batch sizes, used to avoid large file writes | ||
| # values that are too low consume more disk IOPS | ||
| @podsEmitStreamBatchSize = 200 | ||
| @nodesEmitStreamBatchSize = 100 | ||
| | ||
| # a higher chunk size raises rs pod memory consumption and lowers API latency | ||
| # similarly, a lower value reduces memory consumption but incurs additional round-trip latency | ||
| # these need to be tuned based on the workload | ||
| # nodes | ||
| @nodesChunkSizeMin = 100 | ||
| @nodesChunkSizeMax = 400 | ||
| # pods | ||
| @podsChunkSizeMin = 250 | ||
| @podsChunkSizeMax = 1500 | ||
| # events | ||
| @eventsChunkSizeMin = 2000 | ||
| @eventsChunkSizeMax = 10000 | ||
| # deployments | ||
| @deploymentsChunkSizeMin = 500 | ||
| @deploymentsChunkSizeMax = 1000 | ||
| # hpa | ||
| @hpaChunkSizeMin = 500 | ||
| @hpaChunkSizeMax = 2000 | ||
| | ||
| # minimum emit stream batch sizes, to prevent low values that cost disk I/O | ||
| # the max is up to the chunk size | ||
| @podsEmitStreamBatchSizeMin = 50 | ||
| @nodesEmitStreamBatchSizeMin = 50 | ||
| | ||
| def is_number?(value) | ||
| true if Integer(value) rescue false | ||
| end | ||
| | ||
| # Use parser to parse the configmap toml file to a ruby structure | ||
| def parseConfigMap | ||
| begin | ||
| # Check to see if config map is created | ||
| if (File.file?(@configMapMountPath)) | ||
| puts "config::configmap container-azm-ms-agentconfig for agent settings mounted, parsing values" | ||
| parsedConfig = Tomlrb.load_file(@configMapMountPath, symbolize_keys: true) | ||
| puts "config::Successfully parsed mounted config map" | ||
| return parsedConfig | ||
| else | ||
| puts "config::configmap container-azm-ms-agentconfig for agent settings not mounted, using defaults" | ||
| return nil | ||
| end | ||
| rescue => errorStr | ||
| ConfigParseErrorLogger.logError("Exception while parsing config map for agent settings : #{errorStr}, using defaults, please check config map for errors") | ||
| return nil | ||
| end | ||
| end | ||
| | ||
| # Use the ruby structure created after config parsing to set the right values to be used as environment variables | ||
| def populateSettingValuesFromConfigMap(parsedConfig) | ||
| begin | ||
| if !parsedConfig.nil? && !parsedConfig[:agent_settings].nil? | ||
| if !parsedConfig[:agent_settings][:health_model].nil? && !parsedConfig[:agent_settings][:health_model][:enabled].nil? | ||
| @enable_health_model = parsedConfig[:agent_settings][:health_model][:enabled] | ||
| puts "enable_health_model = #{@enable_health_model}" | ||
| end | ||
| chunk_config = parsedConfig[:agent_settings][:chunk_config] | ||
| if !chunk_config.nil? | ||
| nodesChunkSize = chunk_config[:NODES_CHUNK_SIZE] | ||
| if !nodesChunkSize.nil? && is_number?(nodesChunkSize) && (@nodesChunkSizeMin..@nodesChunkSizeMax) === nodesChunkSize.to_i | ||
| @nodesChunkSize = nodesChunkSize.to_i | ||
| puts "Using config map value: NODES_CHUNK_SIZE = #{@nodesChunkSize}" | ||
| end | ||
| | ||
| podsChunkSize = chunk_config[:PODS_CHUNK_SIZE] | ||
| if !podsChunkSize.nil? && is_number?(podsChunkSize) && (@podsChunkSizeMin..@podsChunkSizeMax) === podsChunkSize.to_i | ||
| @podsChunkSize = podsChunkSize.to_i | ||
| puts "Using config map value: PODS_CHUNK_SIZE = #{@podsChunkSize}" | ||
| end | ||
| | ||
| eventsChunkSize = chunk_config[:EVENTS_CHUNK_SIZE] | ||
| if !eventsChunkSize.nil? && is_number?(eventsChunkSize) && (@eventsChunkSizeMin..@eventsChunkSizeMax) === eventsChunkSize.to_i | ||
| @eventsChunkSize = eventsChunkSize.to_i | ||
| puts "Using config map value: EVENTS_CHUNK_SIZE = #{@eventsChunkSize}" | ||
| end | ||
| | ||
| deploymentsChunkSize = chunk_config[:DEPLOYMENTS_CHUNK_SIZE] | ||
| if !deploymentsChunkSize.nil? && is_number?(deploymentsChunkSize) && (@deploymentsChunkSizeMin..@deploymentsChunkSizeMax) === deploymentsChunkSize.to_i | ||
| @deploymentsChunkSize = deploymentsChunkSize.to_i | ||
| puts "Using config map value: DEPLOYMENTS_CHUNK_SIZE = #{@deploymentsChunkSize}" | ||
| end | ||
| | ||
| hpaChunkSize = chunk_config[:HPA_CHUNK_SIZE] | ||
| if !hpaChunkSize.nil? && is_number?(hpaChunkSize) && (@hpaChunkSizeMin..@hpaChunkSizeMax) === hpaChunkSize.to_i | ||
| @hpaChunkSize = hpaChunkSize.to_i | ||
| puts "Using config map value: HPA_CHUNK_SIZE = #{@hpaChunkSize}" | ||
| end | ||
| | ||
| podsEmitStreamBatchSize = chunk_config[:PODS_EMIT_STREAM_BATCH_SIZE] | ||
| if !podsEmitStreamBatchSize.nil? && is_number?(podsEmitStreamBatchSize) && | ||
| podsEmitStreamBatchSize.to_i <= @podsChunkSize && podsEmitStreamBatchSize.to_i >= @podsEmitStreamBatchSizeMin | ||
| @podsEmitStreamBatchSize = podsEmitStreamBatchSize.to_i | ||
| puts "Using config map value: PODS_EMIT_STREAM_BATCH_SIZE = #{@podsEmitStreamBatchSize}" | ||
| end | ||
| nodesEmitStreamBatchSize = chunk_config[:NODES_EMIT_STREAM_BATCH_SIZE] | ||
| if !nodesEmitStreamBatchSize.nil? && is_number?(nodesEmitStreamBatchSize) && | ||
| nodesEmitStreamBatchSize.to_i <= @nodesChunkSize && nodesEmitStreamBatchSize.to_i >= @nodesEmitStreamBatchSizeMin | ||
| @nodesEmitStreamBatchSize = nodesEmitStreamBatchSize.to_i | ||
| puts "Using config map value: NODES_EMIT_STREAM_BATCH_SIZE = #{@nodesEmitStreamBatchSize}" | ||
| end | ||
| end | ||
| end | ||
| rescue => errorStr | ||
| puts "config::error:Exception while reading config settings for agent configuration setting - #{errorStr}, using defaults" | ||
| @enable_health_model = false | ||
| end | ||
| end | ||
| | ||
| @configSchemaVersion = ENV["AZMON_AGENT_CFG_SCHEMA_VERSION"] | ||
| puts "****************Start Config Processing********************" | ||
| if !@configSchemaVersion.nil? && !@configSchemaVersion.empty? && @configSchemaVersion.strip.casecmp("v1") == 0 # note: v1 is the only supported schema version, so it is hardcoded | ||
| configMapSettings = parseConfigMap | ||
| if !configMapSettings.nil? | ||
| populateSettingValuesFromConfigMap(configMapSettings) | ||
| end | ||
| else | ||
| if (File.file?(@configMapMountPath)) | ||
| ConfigParseErrorLogger.logError("config::unsupported/missing config schema version - '#{@configSchemaVersion}' , using defaults, please use supported schema version") | ||
| end | ||
| @enable_health_model = false | ||
| end | ||
| | ||
| # Write the settings to file, so that they can be set as environment variables | ||
| file = File.open("agent_config_env_var", "w") | ||
| | ||
| if !file.nil? | ||
| file.write("export AZMON_CLUSTER_ENABLE_HEALTH_MODEL=#{@enable_health_model}\n") | ||
| file.write("export NODES_CHUNK_SIZE=#{@nodesChunkSize}\n") | ||
| file.write("export PODS_CHUNK_SIZE=#{@podsChunkSize}\n") | ||
| file.write("export EVENTS_CHUNK_SIZE=#{@eventsChunkSize}\n") | ||
| file.write("export DEPLOYMENTS_CHUNK_SIZE=#{@deploymentsChunkSize}\n") | ||
| file.write("export HPA_CHUNK_SIZE=#{@hpaChunkSize}\n") | ||
| file.write("export PODS_EMIT_STREAM_BATCH_SIZE=#{@podsEmitStreamBatchSize}\n") | ||
| file.write("export NODES_EMIT_STREAM_BATCH_SIZE=#{@nodesEmitStreamBatchSize}\n") | ||
| # Close file after writing all environment variables | ||
| file.close | ||
| else | ||
| puts "Exception while opening file for writing config environment variables" | ||
| end | ||
| puts "****************End Config Processing********************" |
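
Each chunk-size check in `populateSettingValuesFromConfigMap` follows the same pattern: accept the config map value only if it is an integer inside the allowed min..max range, otherwise keep the default. A minimal sketch of that pattern as one helper is shown below. The names `integer_like?` and `bounded_chunk_size` are hypothetical and not part of this PR, and the sample hash only stands in for the `chunk_config` table that Tomlrb would return with symbolized keys.

```ruby
# Hypothetical helpers, not part of the PR.

# True when Kernel#Integer can convert the value (same idea as is_number?).
def integer_like?(value)
  Integer(value)
  true
rescue ArgumentError, TypeError
  false
end

# Returns the configured value when it is an integer within `range`,
# otherwise falls back to `default`.
def bounded_chunk_size(value, range, default)
  return default if value.nil? || !integer_like?(value)
  candidate = value.to_i
  range.cover?(candidate) ? candidate : default
end

# Stand-in for parsedConfig[:agent_settings][:chunk_config]
chunk_config = { NODES_CHUNK_SIZE: 300, PODS_CHUNK_SIZE: 5000, EVENTS_CHUNK_SIZE: "abc" }

# Ranges and defaults are the ones defined in the diff.
nodes_chunk_size  = bounded_chunk_size(chunk_config[:NODES_CHUNK_SIZE], 100..400, 250)
pods_chunk_size   = bounded_chunk_size(chunk_config[:PODS_CHUNK_SIZE], 250..1500, 1000)
events_chunk_size = bounded_chunk_size(chunk_config[:EVENTS_CHUNK_SIZE], 2000..10000, 4000)

puts "NODES_CHUNK_SIZE=#{nodes_chunk_size}"   # 300 is inside 100..400, so it is used
puts "PODS_CHUNK_SIZE=#{pods_chunk_size}"     # 5000 exceeds the max, so the default stays
puts "EVENTS_CHUNK_SIZE=#{events_chunk_size}" # "abc" is not a number, so the default stays
```

Out-of-range and non-numeric values are silently ignored here, as in the diff, so a misconfigured value leaves the default in place rather than failing startup.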
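
The default chunk sizes are justified in the comments by a per-item size estimate (for example, 250 nodes at 15KB each is approximately 4MB). The arithmetic can be reproduced with a short sketch. The per-item sizes are the rough figures quoted in the diff comments, while the `ESTIMATES` table and the `chunk_megabytes` function are illustrative names introduced here, not part of the agent.

```ruby
# Default chunk sizes and rough per-item sizes (KB) taken from the diff comments.
ESTIMATES = {
  nodes:       { chunk_size: 250,  item_kb: 15 },
  pods:        { chunk_size: 1000, item_kb: 10 },
  events:      { chunk_size: 4000, item_kb: 1 },
  deployments: { chunk_size: 500,  item_kb: 8 },
  hpa:         { chunk_size: 2000, item_kb: 3 },
}

# Approximate payload of one API chunk in MB (1 MB = 1024 KB).
def chunk_megabytes(chunk_size, item_kb)
  (chunk_size * item_kb) / 1024.0
end

ESTIMATES.each do |resource, estimate|
  megabytes = chunk_megabytes(estimate[:chunk_size], estimate[:item_kb])
  puts format("%-12s %5d items x %2d KB = %.1f MB",
              resource, estimate[:chunk_size], estimate[:item_kb], megabytes)
end
```

The same function gives a quick estimate of the per-chunk payload for any value chosen inside the allowed min and max ranges.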
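
The last step writes the settings as `export` lines so that a shell can source them as environment variables. In Ruby, `File.open` raises an exception (for example `Errno::EACCES`) when the file cannot be opened, rather than returning `nil`, so a `nil` check alone does not catch that failure. The sketch below shows one possible way to handle it, using the block form of `File.open` with a `rescue`. The function name `write_env_file`, the temporary directory, and the trimmed settings hash are assumptions made for illustration.

```ruby
require "tmpdir"

# A trimmed set of settings, using defaults from the diff.
settings = {
  "AZMON_CLUSTER_ENABLE_HEALTH_MODEL" => false,
  "NODES_CHUNK_SIZE" => 250,
  "PODS_CHUNK_SIZE" => 1000,
}

# Hypothetical helper: writes one "export NAME=value" line per setting.
def write_env_file(path, settings)
  # The block form closes the file even if a write raises.
  File.open(path, "w") do |file|
    settings.each { |name, value| file.write("export #{name}=#{value}\n") }
  end
  true
rescue SystemCallError => e
  puts "Exception while opening file for writing config environment variables: #{e.message}"
  false
end

written = nil
contents = nil
# A temporary directory keeps the example from touching the working directory.
Dir.mktmpdir do |dir|
  path = File.join(dir, "agent_config_env_var")
  written = write_env_file(path, settings)
  contents = File.read(path)
end

puts contents
puts "****************End Config Processing********************"
```

Printing the end marker outside the success and failure branches keeps the log bracketed by start and end lines in both cases.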