Skip to content

Load collector configuration#18

Merged
masci merged 2 commits intomasterfrom
massi/config
Sep 8, 2016
Merged

Load collector configuration#18
masci merged 2 commits intomasterfrom
massi/config

Conversation

@masci
Copy link
Copy Markdown
Contributor

@masci masci commented Aug 30, 2016

What

Define a type to hold collector's configuration and a method to load config data from a YAML file.

Update:
Also provide an interface for configuration "providers" so that we can implement different procedures to retrieve configuration options from different locations.

Why

The collector Agent should be configured with global parameters as in v.5

How

We already have a system to search for and load configuration data from multiple sources and it is used by the check loader. This PR doesn't make any use of it for the following reasons:

  1. The configuration format for any check is unknown at boot time because the Agent binary searches for and loads config data in a generic way, and only later it tries to load the checks corresponding to the configs found. In this case instead, we know in advance which are the global configuration parameters and we can easily provide a struct to hold them.
  2. The configuration format for the checks heavily relies on the concept of Instance and we made specific choices to optimise that way, so the current loading system is not generic enough to support something that has nothing to do with instances.
  3. The global configuration must be loaded before anything else, difficult to achieve with a generic approach

Update:
Anyway, we decided to provide a similar approach also to global configuration, see comments below.

@masci masci changed the title [core] load collector configuration [WIP][core] load collector configuration Aug 30, 2016
@masci masci changed the title [WIP][core] load collector configuration [WIP] Load collector configuration Aug 30, 2016
@masci masci force-pushed the massi/config branch 5 times, most recently from c2a67cc to c145c87 Compare September 1, 2016 11:29
@masci masci changed the title [WIP] Load collector configuration Load collector configuration Sep 2, 2016
@masci masci force-pushed the massi/config branch 3 times, most recently from b4a3331 to 8a64e09 Compare September 2, 2016 15:42
@olivielpeau olivielpeau self-assigned this Sep 6, 2016
@olivielpeau
Copy link
Copy Markdown
Member

Just discussed with @masci: we could create an AgentConfigProvider interface that would roughly look like:

type AgentConfigProvider interface {
    Collect(*collector.Config) error
}

and have structs like FileAgentConfigProvider, EtcdAgentConfigProvider, etc that would implement it and populate the Config struct when Collect is called on them.

This would be "symmetric" with the way the check configs are loaded and probably make the logic clearer.

Comment thread cmd/agent/app/common_linux.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on https://golang.org/pkg/go/build/#pkg-overview I think you need to name the file common.go instead of common_linux.go, otherwise the file will only be built on linux even if you list other platforms here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will probably name it common_unix.go as they do in the standard lib, would leave common.go for stuff common to any OS

@olivielpeau
Copy link
Copy Markdown
Member

One other small thing that came to my mind: should we rather have all the agent-config-related stuff in a different package than collector since it's related to the whole agent? Maybe a config or common package, or something of that kind.

Comment thread cmd/agent/app/main.go Outdated

// for now, we only load configuration from file
fileProvider := config.NewFileProvider(configPath)
fileProvider.Configure(cfg)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably check the potential error Configure can return, and (for now, until we have more config providers) stop everything if there's an error

@olivielpeau
Copy link
Copy Markdown
Member

Added in a few comments, other than that this looks 👍 📈

@olivielpeau
Copy link
Copy Markdown
Member

:shipit:

@masci masci merged commit c3788a1 into master Sep 8, 2016
@masci masci deleted the massi/config branch September 8, 2016 15:28
masci added a commit that referenced this pull request Apr 1, 2019
hush-hush pushed a commit that referenced this pull request Apr 17, 2019
s-alad pushed a commit that referenced this pull request Nov 21, 2025
This will ensure that the proper code reviewers are auto-assigned when
PRs are opened
gh-worker-dd-mergequeue-cf854d Bot pushed a commit that referenced this pull request Jan 27, 2026
### What does this PR do?

Skip the SSH session patcher and add a test to illustrate the current issue.
In addition, adds the possibility to check specific fields in the json returned for ssh_session events.

### Motivation

The retry mechanism could cause the agent to send no more than one event per minute if an SSH session was not properly resolved.
Previously, the event was not sent and the agent would wait one minute before sending it with the `unknown` type. However, this `authtype` would never be resolved because the session was initialized before the agent started processing events. As a result, every subsequent SSH event would wait one minute for nothing, causing a significant delay in agent events, potentially blocking all the other events.

### Describe how you validated your changes
Added a test that illustrate the issue : `TestSSHUserSessionBlocking`
With this change, the ssh_session event is now sent with `authtype` set to `unknown` and directly sent.


Error without commenting the patcher :
```
        	Error:      	Received unexpected error:
        	            	All attempts fail:
        	            	#1: not found
        	            	#2: not found
        	            	#3: not found
        	            	#4: not found
        	            	#5: not found
        	            	#6: not found
        	            	#7: not found
        	            	#8: not found
        	            	#9: not found
        	            	#10: not found
        	            	#11: not found
        	            	#12: not found
        	            	#13: not found
        	            	#14: not found
        	            	#15: not found
        	            	#16: not found
        	            	#17: not found
        	            	#18: not found
        	            	#19: not found
        	            	#20: not found
        	            	#21: not found
        	            	#22: not found
        	            	#23: not found
        	            	#24: not found
        	            	#25: not found
        	            	#26: not found
        	            	#27: not found
        	            	#28: not found
        	            	#29: not found
        	            	#30: not found
        	Test:       	TestSSHUserSessionBlocking/second_ssh_no_auth
```

Co-authored-by: theo.putegnat <theo.putegnat@datadoghq.com>
dd-octo-sts Bot added a commit that referenced this pull request Jan 27, 2026
Skip the SSH session patcher and add a test to illustrate the current issue.
In addition, adds the possibility to check specific fields in the json returned for ssh_session events.

### Motivation

The retry mechanism could cause the agent to send no more than one event per minute if an SSH session was not properly resolved.
Previously, the event was not sent and the agent would wait one minute before sending it with the `unknown` type. However, this `authtype` would never be resolved because the session was initialized before the agent started processing events. As a result, every subsequent SSH event would wait one minute for nothing, causing a significant delay in agent events, potentially blocking all the other events.

### Describe how you validated your changes
Added a test that illustrate the issue : `TestSSHUserSessionBlocking`
With this change, the ssh_session event is now sent with `authtype` set to `unknown` and directly sent.

Error without commenting the patcher :
```
        	Error:      	Received unexpected error:
        	            	All attempts fail:
        	            	#1: not found
        	            	#2: not found
        	            	#3: not found
        	            	#4: not found
        	            	#5: not found
        	            	#6: not found
        	            	#7: not found
        	            	#8: not found
        	            	#9: not found
        	            	#10: not found
        	            	#11: not found
        	            	#12: not found
        	            	#13: not found
        	            	#14: not found
        	            	#15: not found
        	            	#16: not found
        	            	#17: not found
        	            	#18: not found
        	            	#19: not found
        	            	#20: not found
        	            	#21: not found
        	            	#22: not found
        	            	#23: not found
        	            	#24: not found
        	            	#25: not found
        	            	#26: not found
        	            	#27: not found
        	            	#28: not found
        	            	#29: not found
        	            	#30: not found
        	Test:       	TestSSHUserSessionBlocking/second_ssh_no_auth
```

Co-authored-by: theo.putegnat <theo.putegnat@datadoghq.com>
(cherry picked from commit 40d1f09)

___

Co-authored-by: Théo Putegnat <theo.putegnat@datadoghq.com>
dd-octo-sts Bot added a commit that referenced this pull request Jan 27, 2026
Skip the SSH session patcher and add a test to illustrate the current issue.
In addition, adds the possibility to check specific fields in the json returned for ssh_session events.

### Motivation

The retry mechanism could cause the agent to send no more than one event per minute if an SSH session was not properly resolved.
Previously, the event was not sent and the agent would wait one minute before sending it with the `unknown` type. However, this `authtype` would never be resolved because the session was initialized before the agent started processing events. As a result, every subsequent SSH event would wait one minute for nothing, causing a significant delay in agent events, potentially blocking all the other events.

### Describe how you validated your changes
Added a test that illustrate the issue : `TestSSHUserSessionBlocking`
With this change, the ssh_session event is now sent with `authtype` set to `unknown` and directly sent.

Error without commenting the patcher :
```
        	Error:      	Received unexpected error:
        	            	All attempts fail:
        	            	#1: not found
        	            	#2: not found
        	            	#3: not found
        	            	#4: not found
        	            	#5: not found
        	            	#6: not found
        	            	#7: not found
        	            	#8: not found
        	            	#9: not found
        	            	#10: not found
        	            	#11: not found
        	            	#12: not found
        	            	#13: not found
        	            	#14: not found
        	            	#15: not found
        	            	#16: not found
        	            	#17: not found
        	            	#18: not found
        	            	#19: not found
        	            	#20: not found
        	            	#21: not found
        	            	#22: not found
        	            	#23: not found
        	            	#24: not found
        	            	#25: not found
        	            	#26: not found
        	            	#27: not found
        	            	#28: not found
        	            	#29: not found
        	            	#30: not found
        	Test:       	TestSSHUserSessionBlocking/second_ssh_no_auth
```

Co-authored-by: theo.putegnat <theo.putegnat@datadoghq.com>
(cherry picked from commit 40d1f09)

___

Co-authored-by: Théo Putegnat <theo.putegnat@datadoghq.com>
gh-worker-dd-mergequeue-cf854d Bot pushed a commit that referenced this pull request Jan 28, 2026
Backport 40d1f09 from #45437.

 ___

### What does this PR do?

Skip the SSH session patcher and add a test to illustrate the current issue.
In addition, adds the possibility to check specific fields in the json returned for ssh_session events.

### Motivation

The retry mechanism could cause the agent to send no more than one event per minute if an SSH session was not properly resolved.
Previously, the event was not sent and the agent would wait one minute before sending it with the `unknown` type. However, this `authtype` would never be resolved because the session was initialized before the agent started processing events. As a result, every subsequent SSH event would wait one minute for nothing, causing a significant delay in agent events, potentially blocking all the other events.

### Describe how you validated your changes
Added a test that illustrate the issue : `TestSSHUserSessionBlocking`
With this change, the ssh_session event is now sent with `authtype` set to `unknown` and directly sent.


Error without commenting the patcher :
```
        	Error:      	Received unexpected error:
        	            	All attempts fail:
        	            	#1: not found
        	            	#2: not found
        	            	#3: not found
        	            	#4: not found
        	            	#5: not found
        	            	#6: not found
        	            	#7: not found
        	            	#8: not found
        	            	#9: not found
        	            	#10: not found
        	            	#11: not found
        	            	#12: not found
        	            	#13: not found
        	            	#14: not found
        	            	#15: not found
        	            	#16: not found
        	            	#17: not found
        	            	#18: not found
        	            	#19: not found
        	            	#20: not found
        	            	#21: not found
        	            	#22: not found
        	            	#23: not found
        	            	#24: not found
        	            	#25: not found
        	            	#26: not found
        	            	#27: not found
        	            	#28: not found
        	            	#29: not found
        	            	#30: not found
        	Test:       	TestSSHUserSessionBlocking/second_ssh_no_auth
```

Co-authored-by: axel.vonengel <axel.vonengel@datadoghq.com>
gh-worker-dd-mergequeue-cf854d Bot pushed a commit that referenced this pull request Jan 28, 2026
Backport 40d1f09 from #45437.

 ___

### What does this PR do?

Skip the SSH session patcher and add a test to illustrate the current issue.
In addition, adds the possibility to check specific fields in the json returned for ssh_session events.

### Motivation

The retry mechanism could cause the agent to send no more than one event per minute if an SSH session was not properly resolved.
Previously, the event was not sent and the agent would wait one minute before sending it with the `unknown` type. However, this `authtype` would never be resolved because the session was initialized before the agent started processing events. As a result, every subsequent SSH event would wait one minute for nothing, causing a significant delay in agent events, potentially blocking all the other events.

### Describe how you validated your changes
Added a test that illustrate the issue : `TestSSHUserSessionBlocking`
With this change, the ssh_session event is now sent with `authtype` set to `unknown` and directly sent.


Error without commenting the patcher :
```
        	Error:      	Received unexpected error:
        	            	All attempts fail:
        	            	#1: not found
        	            	#2: not found
        	            	#3: not found
        	            	#4: not found
        	            	#5: not found
        	            	#6: not found
        	            	#7: not found
        	            	#8: not found
        	            	#9: not found
        	            	#10: not found
        	            	#11: not found
        	            	#12: not found
        	            	#13: not found
        	            	#14: not found
        	            	#15: not found
        	            	#16: not found
        	            	#17: not found
        	            	#18: not found
        	            	#19: not found
        	            	#20: not found
        	            	#21: not found
        	            	#22: not found
        	            	#23: not found
        	            	#24: not found
        	            	#25: not found
        	            	#26: not found
        	            	#27: not found
        	            	#28: not found
        	            	#29: not found
        	            	#30: not found
        	Test:       	TestSSHUserSessionBlocking/second_ssh_no_auth
```

Co-authored-by: YoannGh <yoann.ghigoff@datadoghq.com>
Co-authored-by: florent.clarret <florent.clarret@datadoghq.com>
theomagellan pushed a commit that referenced this pull request Feb 2, 2026
### What does this PR do?

Skip the SSH session patcher and add a test to illustrate the current issue.
In addition, adds the possibility to check specific fields in the json returned for ssh_session events.

### Motivation

The retry mechanism could cause the agent to send no more than one event per minute if an SSH session was not properly resolved.
Previously, the event was not sent and the agent would wait one minute before sending it with the `unknown` type. However, this `authtype` would never be resolved because the session was initialized before the agent started processing events. As a result, every subsequent SSH event would wait one minute for nothing, causing a significant delay in agent events, potentially blocking all the other events.

### Describe how you validated your changes
Added a test that illustrate the issue : `TestSSHUserSessionBlocking`
With this change, the ssh_session event is now sent with `authtype` set to `unknown` and directly sent.


Error without commenting the patcher :
```
        	Error:      	Received unexpected error:
        	            	All attempts fail:
        	            	#1: not found
        	            	#2: not found
        	            	#3: not found
        	            	#4: not found
        	            	#5: not found
        	            	#6: not found
        	            	#7: not found
        	            	#8: not found
        	            	#9: not found
        	            	#10: not found
        	            	#11: not found
        	            	#12: not found
        	            	#13: not found
        	            	#14: not found
        	            	#15: not found
        	            	#16: not found
        	            	#17: not found
        	            	#18: not found
        	            	#19: not found
        	            	#20: not found
        	            	#21: not found
        	            	#22: not found
        	            	#23: not found
        	            	#24: not found
        	            	#25: not found
        	            	#26: not found
        	            	#27: not found
        	            	#28: not found
        	            	#29: not found
        	            	#30: not found
        	Test:       	TestSSHUserSessionBlocking/second_ssh_no_auth
```

Co-authored-by: theo.putegnat <theo.putegnat@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants