Add dependency to ECS agent service #215

Closed
lewayne-aws wants to merge 4 commits into aws:mainline from lewayne-aws:ecs-drop-in

@lewayne-aws
Contributor

Issue #, if available:
n/a

Description of changes:
If ECS starts before credentials-fetcher, the agent fails to connect to the credentials-fetcher socket. So, when installing credentials-fetcher.service, add a drop-in configuration for ecs.service, so that credentials-fetcher.service effectively becomes a component of ecs.service.

With this change, if ecs.service is started, stopped, or restarted, credentials-fetcher.service will also do the same, and ecs.service will wait for the socket file to be created when starting credentials-fetcher.service.

Since ecs.service becomes so tightly coupled with credentials-fetcher.service, we don't have to explicitly enable credentials-fetcher.service for it to start on systems where the ECS agent is installed. Instead, ecs.service pulls it in as a dependency when it starts.

If this is installed on a system where ECS agent is not installed, there are no side effects. Although there would be a drop-in file for ecs.service, without the underlying unit file also present (provided by ECS agent), the drop-in will be ignored by systemd. Similarly, this approach does not clobber any of the files provided by ECS agent directly, and simply leverages existing systemd functionality to enforce our intent and requirement.
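
As a rough illustration of the drop-in described above, the sketch below writes a drop-in for `ecs.service` into a scratch directory rather than the live systemd tree. The file name `10-credentials-fetcher.conf`, the socket path, and the `ExecStartPre` wait step are assumptions for illustration only; the actual file shipped by this PR is not reproduced in this description. `BindsTo=` and `After=` are standard systemd unit options.

```shell
#!/bin/sh
# Sketch only: writes an ecs.service drop-in to a scratch directory.
# The drop-in file name and SOCKET_PATH are hypothetical placeholders.
DROPIN_ROOT="${DROPIN_ROOT:-$(mktemp -d)}"
DROPIN_DIR="$DROPIN_ROOT/ecs.service.d"
DROPIN_FILE="$DROPIN_DIR/10-credentials-fetcher.conf"
SOCKET_PATH="${SOCKET_PATH:-/run/example/credentials-fetcher.sock}"

mkdir -p "$DROPIN_DIR"

# Unquoted heredoc delimiter: $SOCKET_PATH is expanded when the file is written.
cat > "$DROPIN_FILE" <<EOF
[Unit]
# Starting ecs.service pulls in credentials-fetcher.service, and ecs.service
# is stopped if credentials-fetcher.service stops.
BindsTo=credentials-fetcher.service
# Order ecs.service after credentials-fetcher.service.
After=credentials-fetcher.service

[Service]
# Assumed wait step: poll for the socket file, giving up after 30 seconds.
ExecStartPre=/usr/bin/timeout 30 /bin/sh -c 'until [ -S $SOCKET_PATH ]; do sleep 1; done'
EOF

echo "wrote $DROPIN_FILE"
```

On a real host the directory would be a systemd drop-in location such as `/etc/systemd/system/ecs.service.d/`, followed by `systemctl daemon-reload`.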

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Startup order is important for this service to integrate with the ECS
agent. To account for this, create a systemd drop-in file for
`ecs.service` so that it depends on `credentials-fetcher.service` as
long as it is installed.

The drop-in also makes the agent wait for the credentials-fetcher socket
to be created.

Comment thread on credentials-fetcher.spec
/usr/bin/systemctl stop credentials-fetcher.service
/usr/bin/systemctl is-enabled --quiet ecs.service 2>/dev/null && /usr/bin/systemctl restart ecs.service || :
fi
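
The guarded restart in the quoted scriptlet can be read as "restart `ecs.service` only if it is enabled, and never fail the scriptlet". A minimal sketch of that idiom follows; the function name `restart_if_enabled` and the overridable `SYSTEMCTL` variable are hypothetical and exist only so the logic can be exercised without systemd.

```shell
#!/bin/sh
# Sketch only: restart a unit if it is enabled, and never fail the caller.
# SYSTEMCTL is overridable so the logic can be tried without systemd.
SYSTEMCTL="${SYSTEMCTL:-/usr/bin/systemctl}"

restart_if_enabled() {
    unit="$1"
    # `is-enabled --quiet` exits 0 only when the unit is enabled. If either
    # command fails, `|| :` swallows the error so the function returns 0.
    "$SYSTEMCTL" is-enabled --quiet "$unit" 2>/dev/null \
        && "$SYSTEMCTL" restart "$unit" || :
}
```

The trailing `|| :` matters in RPM scriptlets because a non-zero exit status there is reported as a scriptlet failure.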

Contributor

Can you add this to the changelog as well?

@heathhey

After reviewing the pull request, I think that if you want systemd to enforce the ordering of startup, restart, and so on, this is the best solution. I have not personally set up or used this solution, so I recommend testing it. With this configuration, if the credentials-fetcher service dies, is stopped, or is restarted, the ecs service will follow suit; this is what the BindsTo option does.

I would highly recommend testing this thoroughly. For example, make sure both packages are installed and the services are running, then force a restart or a stop and start of credentials-fetcher and verify that the ecs service follows along. Also, test with a custom AMI that has both installed to verify that this solves the race condition and that both services start properly.
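
The manual check suggested above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, `SYSTEMCTL` is overridable so the logic can be tried against a stub, and a real run would need root on a host with both packages installed. It uses only `systemctl stop` and `systemctl is-active --quiet`.

```shell
#!/bin/sh
# Sketch only: check that ecs.service follows credentials-fetcher.service.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Returns 0 only if both units report active.
both_active() {
    "$SYSTEMCTL" is-active --quiet credentials-fetcher.service \
        && "$SYSTEMCTL" is-active --quiet ecs.service
}

# Stops credentials-fetcher.service and confirms ecs.service went down too.
check_stop_propagates() {
    "$SYSTEMCTL" stop credentials-fetcher.service || return 1
    if "$SYSTEMCTL" is-active --quiet ecs.service; then
        echo "FAIL: ecs.service still active after stop" >&2
        return 1
    fi
    echo "OK: ecs.service stopped with credentials-fetcher.service"
}
```

A fuller test would start both services again afterwards and call `both_active` once they have had time to come up.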

lewayne-aws added a commit to lewayne-aws/credentials-fetcher that referenced this pull request Feb 13, 2026
Alternative approach to aws#215.

Currently, credentials-fetcher needs to start _before_ ECS agent in
order for gMSA tasks to be started successfully. In this approach, we
ship a script that is callable from the userdata script which installs
a dependency so that `ecs.service` requires
`credentials-fetcher.service` to start.

This is a loose dependency with a `Wants`, rather than the strict
`BindsTo` dependency from aws#215, so if credentials-fetcher runs into
issues, it will not attempt to restart the ECS agent.

This also does not default to creating the dependency via the package,
only a script that will do it for you. The package will, however, check
for the dependency and remove it if applicable.
Squashed commit of the following:

commit 748bf59
Author: Wayne Galen <lewayne@amazon.com>
Date:   Thu Feb 12 18:26:43 2026 -0800

    Fix rpmspec and userdata script check

    Libexec macro was not defined in the spec, and `systemctl` uses
    `is-active`, not `is-running`.

commit a573214
Author: Wayne Galen <lewayne@amazon.com>
Date:   Thu Feb 12 17:39:21 2026 -0800

    Ship a userdata script to set up startup order

    Alternative approach to aws#215.

    Currently, credentials-fetcher needs to start _before_ ECS agent in
    order for gMSA tasks to be started successfully. In this approach, we
    ship a script that is callable from the userdata script which installs
    a dependency so that `ecs.service` requires
    `credentials-fetcher.service` to start.

    This is a loose dependency with a `Wants`, rather than the strict
    `BindsTo` dependency from aws#215, so if credentials-fetcher runs into
    issues, it will not attempt to restart the ECS agent.

    This also does not default to creating the dependency via the package,
    only a script that will do it for you. The package will, however, check
    for the dependency and remove it if applicable.
@lewayne-aws
Contributor Author

Based on concerns that were raised about the first approach, this PR now ships a script that can be run from userdata instead, along with the beginnings of a docs/ directory, detailing how to use the script.

The idea behind putting this into a script is that if the underlying socket issue is later addressed differently and this workaround is no longer needed, we can simply update the script to clean up the drop-in, while still providing a solid solution for now.
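
For illustration, a userdata-callable script of this kind might look like the sketch below. It is not the script shipped in this PR: the function names, the drop-in file name, and the scratch-directory default are all hypothetical, and the `After=` line is an assumption based on the stated requirement that credentials-fetcher start before the ECS agent. `Wants=` is the loose dependency named in the commit message.

```shell
#!/bin/sh
# Sketch only: install or remove a loose (Wants=) dependency drop-in.
# SYSTEMD_DIR defaults to a scratch directory so nothing touches the live
# system; a real script would target /etc/systemd/system and then run
# `systemctl daemon-reload`.
SYSTEMD_DIR="${SYSTEMD_DIR:-$(mktemp -d)}"
DROPIN_FILE="$SYSTEMD_DIR/ecs.service.d/10-credentials-fetcher.conf"  # hypothetical name

install_dropin() {
    mkdir -p "$(dirname "$DROPIN_FILE")"
    cat > "$DROPIN_FILE" <<'EOF'
[Unit]
# Loose dependency: starting ecs.service also starts credentials-fetcher.service,
# but ecs.service is not stopped or restarted if that service fails.
Wants=credentials-fetcher.service
# Assumed ordering so that credentials-fetcher.service starts first.
After=credentials-fetcher.service
EOF
}

remove_dropin() {
    rm -f "$DROPIN_FILE"
    # Remove the drop-in directory only if it is now empty.
    rmdir "$(dirname "$DROPIN_FILE")" 2>/dev/null || :
}
```

Keeping install and removal side by side is what makes the later cleanup described above a one-line change to the script.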

`postun` is called during upgrades as well as full removal. We need to
ensure we *don't* automatically remove the workaround unless it's a full removal.

Added relevant changelog details, and adjusted the release number.

The underlying application is identical to the previous 2.0.0, and only
the packaging is changed by my updates, so the release number seems like
the appropriate adjustment to make.

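
The `postun` distinction between upgrade and full removal mentioned above can be sketched as follows. RPM passes the number of remaining package instances as the scriptlet's first argument: 0 on full removal, 1 or more on upgrade. The function name and `DROPIN_FILE` path here are hypothetical, and the path defaults to a scratch location so the sketch is safe to run.

```shell
#!/bin/sh
# Sketch only: the decision an RPM %postun scriptlet can make from "$1".
# DROPIN_FILE is a hypothetical path for the workaround drop-in.
DROPIN_FILE="${DROPIN_FILE:-$(mktemp -d)/10-credentials-fetcher.conf}"

postun_cleanup() {
    remaining="$1"   # package instances left after this transaction
    if [ "$remaining" -eq 0 ]; then
        # Full removal: delete the workaround drop-in.
        rm -f "$DROPIN_FILE"
        echo "removed"
    else
        # Upgrade: keep the drop-in so the startup order is preserved.
        echo "kept"
    fi
}
```

In a spec file the body of `%postun` would call this logic with `$1`.
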
Contributor

@smhmhmd smhmhmd left a comment

Thanks for making the change. The behavior is not turned on by default; it needs a userdata script.

muskanlalit18 pushed a commit to muskanlalit18/credentials-fetcher that referenced this pull request Mar 6, 2026
Startup order is important for this service to integrate with the ECS
agent. To account for this, create a systemd drop-in file for
`ecs.service` so that it depends on `credentials-fetcher.service` as
long as it is installed.

The drop-in also makes the agent wait for the credentials-fetcher socket
to be created.

Ship a userdata script to set up startup order as an alternative
approach callable from the userdata script which installs a dependency
so that ecs.service requires credentials-fetcher.service to start.

Fix postun script to only clean up on full removal, not upgrades.
Update changelog and bump release number to 2.

GitHub-PR: aws#215

cr: https://code.amazon.com/reviews/CR-255829737
@bhallasaksham
Contributor

The fix was merged after internal testing, so I will close this PR. Thanks for your contribution, @lewayne-aws.
