@uncleDecart
Member

@uncleDecart uncleDecart commented Sep 18, 2025

Description

Subscriptions must match the persistence of their corresponding publications. Previously, this consistency was not enforced in the msrv service, potentially causing issues—especially when using the PubSub memdriver, which requires all subscriptions to be persistent.

This patch fixes the inconsistency and adds a clarifying comment to help maintain correct persistence settings in future changes.

How to test and validate this PR

The issue is not seen with socketdriver, since the AF_UNIX socket pathname is the same for persistent and non-persistent publications.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR

And the last but not least:

  • I've checked the boxes above, or I've provided a good reason why I didn't
    check them.

Please, check the boxes above after submitting the PR in interactive mode.

@uncleDecart uncleDecart added the bug label Sep 18, 2025
@OhmSpectator
Member

I don't get what exactly it fixes. If it fixes anything, we should have a test scenario that can reproduce the original problem... If it fixes only some upcoming implementation, it's not a fix per se.
I am also sceptical about making subscriptions persistent. Why, in this case, is it not a problem for the update scenario? Also, pinging @milan-zededa to take a look.

@uncleDecart
Member Author

The AF_UNIX socket path is the same for both persistent and non-persistent publications. That is a detail of the underlying implementation; the pubsub API itself differentiates between persistent and non-persistent topics. All the changed subscriptions are persistent, which ensures consistency with the pubsub API and its promise. The fact that it works right now with the socket driver is a slippery slope.
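
To illustrate the point above, here is a minimal sketch of a driver that treats persistence as part of the topic identity. The `topicKey` and `memBus` names are hypothetical and are not taken from the EVE pubsub code; the sketch only assumes that a driver keys its storage by agent, topic, and persistence.

```go
package main

import "fmt"

// topicKey is a hypothetical topic identity that includes persistence.
type topicKey struct {
	Agent      string
	Topic      string
	Persistent bool
}

// memBus is a toy in-memory store, not the EVE pubsub implementation.
type memBus struct {
	items map[topicKey]map[string]string
}

func newMemBus() *memBus {
	return &memBus{items: make(map[topicKey]map[string]string)}
}

// Publish stores value under name within the topic identified by k.
func (b *memBus) Publish(k topicKey, name, value string) {
	if b.items[k] == nil {
		b.items[k] = make(map[string]string)
	}
	b.items[k][name] = value
}

// Get looks up name within the topic identified by k.
func (b *memBus) Get(k topicKey, name string) (string, bool) {
	v, ok := b.items[k][name]
	return v, ok
}

func main() {
	bus := newMemBus()
	pub := topicKey{Agent: "nim", Topic: "DevicePortConfigList", Persistent: true}
	bus.Publish(pub, "global", "v1")

	// A subscriber that forgets Persistent=true addresses a different topic.
	mismatched := pub
	mismatched.Persistent = false
	_, ok := bus.Get(mismatched, "global")
	fmt.Println("mismatched subscriber sees item:", ok)

	_, ok = bus.Get(pub, "global")
	fmt.Println("matching subscriber sees item:", ok)
}
```

With a driver shaped like this, a subscription whose persistence differs from the publication never receives updates, which is the failure mode a socket-path-based driver hides.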

@eriknordmark
Contributor

I wonder whether it makes sense to make the "persist" be part of the pathname for the AF_UNIX socket (so that e.g., the persistent DevicePortConfigList would have its socket be named /run/nim/DevicePortConfigList-persist.sock instead of the current /run/nim/DevicePortConfigList.sock)
That would ensure that the subscribers and publishers match in their settings.
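
As a rough sketch of this naming idea, the function below derives a socket path that differs for persistent topics. `sockPath` is a hypothetical helper written for illustration; it is not the function socketdriver uses today, and the `/run/<agent>/<topic>.sock` layout is taken only from the example paths above.

```go
package main

import (
	"fmt"
	"path"
)

// sockPath is a hypothetical helper: it appends "-persist" to the socket
// name for persistent topics so that publisher and subscriber only meet
// when their persistence settings agree.
func sockPath(agent, topic string, persistent bool) string {
	name := topic
	if persistent {
		name += "-persist"
	}
	return path.Join("/run", agent, name+".sock")
}

func main() {
	fmt.Println(sockPath("nim", "DevicePortConfigList", true))
	fmt.Println(sockPath("nim", "DevicePortConfigList", false))
}
```

Under this scheme a subscriber with the wrong setting would try to connect to a socket that does not exist, so the mismatch would show up as a connection failure instead of silently working.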

@OhmSpectator
Member

Still trying to understand what happened. Was it the problem that the consumer and producer had different persistence settings? And you are just adjusting them with this change?

@uncleDecart
Member Author

> Still trying to understand what happened. Was it the problem that the consumer and producer had different persistence settings? And you are just adjusting them with this change?

Exactly. I found this problem while testing nkvdriver, where I keep the promise that different persistence settings mean different topics. In socketdriver we don't honor that promise, which is why it works there, but it's a bug, not a feature.

@milan-zededa
Contributor

> > Still trying to understand what happened. Was it the problem that the consumer and producer had different persistence settings? And you are just adjusting them with this change?
>
> Exactly. I found this problem while testing nkvdriver, where I keep the promise that different persistence settings mean different topics. In socketdriver we don't honor that promise, which is why it works there, but it's a bug, not a feature.

Is msrv the only place we have this inconsistency, or is it necessary to manually check all the subscribers across pillar?

@uncleDecart
Member Author

> > > Still trying to understand what happened. Was it the problem that the consumer and producer had different persistence settings? And you are just adjusting them with this change?
> >
> > Exactly. I found this problem while testing nkvdriver, where I keep the promise that different persistence settings mean different topics. In socketdriver we don't honor that promise, which is why it works there, but it's a bug, not a feature.
>
> Is msrv the only place we have this inconsistency, or is it necessary to manually check all the subscribers across pillar?

So far I've found just this one. I haven't looked into all publications and subscriptions; the rest will surface during my nkv shenanigans.

@OhmSpectator
Member

The branch is not rebased on top of the base branch!

@uncleDecart
Member Author

Will rebase it and add some more changes later today

@OhmSpectator
Member

It's still not rebased

Contributor

@eriknordmark eriknordmark left a comment

LGTM

@uncleDecart uncleDecart force-pushed the fix-msrv-pubusb-persist-consistency branch from bb7a400 to 762f921 Compare September 19, 2025 08:46
@uncleDecart
Member Author

I've found one with GlobalConfig, which is persistent, while the msrv subscription is not. I'll check that one across services. @milan-zededa, if it isn't too much trouble during your artistic refactoring efforts, it would be great if you could also check the consistency in the services you're refactoring.

@OhmSpectator
Copy link
Member

@uncleDecart, @milan-zededa, @rucoder, I wrote a tool to check for that type of consistency. Along with the errors fixed here, it also showed me this:

Producer: monitor https://github.com/lf-edge/eve/blob/0107f28a68b0f367abf7e0e5320e77a279c5db25/pkg/pillar/cmd/monitor/subscriptions.go#L247 Topic=types.DevicePortConfig Persistent=true
  Consumers (remoteAgent matches producer):
    - nim, from=monitor https://github.com/lf-edge/eve/blob/0107f28a68b0f367abf7e0e5320e77a279c5db25/pkg/pillar/cmd/nim/nim.go#L465 Persistent=false [MISMATCH]

Producer: nim https://github.com/lf-edge/eve/blob/0107f28a68b0f367abf7e0e5320e77a279c5db25/pkg/pillar/cmd/nim/nim.go#L365 Topic=types.DevicePortConfigList Persistent=true
  Consumers (remoteAgent matches producer):
    - zedagent, from=nim https://github.com/lf-edge/eve/blob/0107f28a68b0f367abf7e0e5320e77a279c5db25/pkg/pillar/cmd/zedagent/zedagent.go#L1718 Persistent=false (implicit default (unset -> false)) [MISMATCH]
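
The kind of check reported above can be sketched as follows. This is not the tool mentioned in the comment; it is a minimal illustration that assumes the publications and subscriptions have already been collected into the hypothetical `publication` and `subscription` records, and that each agent publishes a given topic with only one persistence setting.

```go
package main

import "fmt"

// publication and subscription are hypothetical records, e.g. gathered by
// scanning the source tree; they are not types from the EVE code base.
type publication struct {
	Agent      string
	Topic      string
	Persistent bool
}

type subscription struct {
	Subscriber string
	FromAgent  string
	Topic      string
	Persistent bool
}

// findMismatches reports every subscription whose Persistent flag differs
// from that of the publication it refers to. Subscriptions without a known
// publication are skipped.
func findMismatches(pubs []publication, subs []subscription) []string {
	type key struct{ agent, topic string }
	persist := make(map[key]bool, len(pubs))
	for _, p := range pubs {
		persist[key{p.Agent, p.Topic}] = p.Persistent
	}
	var out []string
	for _, s := range subs {
		want, known := persist[key{s.FromAgent, s.Topic}]
		if known && want != s.Persistent {
			out = append(out, fmt.Sprintf(
				"%s subscribes to %s from %s with Persistent=%t, publisher uses Persistent=%t",
				s.Subscriber, s.Topic, s.FromAgent, s.Persistent, want))
		}
	}
	return out
}

func main() {
	// Sample data mirrors the two mismatches listed in the report above.
	pubs := []publication{
		{Agent: "monitor", Topic: "DevicePortConfig", Persistent: true},
		{Agent: "nim", Topic: "DevicePortConfigList", Persistent: true},
	}
	subs := []subscription{
		{Subscriber: "nim", FromAgent: "monitor", Topic: "DevicePortConfig", Persistent: false},
		{Subscriber: "zedagent", FromAgent: "nim", Topic: "DevicePortConfigList", Persistent: false},
	}
	for _, m := range findMismatches(pubs, subs) {
		fmt.Println("[MISMATCH]", m)
	}
}
```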

@uncleDecart
Member Author

Super cool, @OhmSpectator! I can include those fixes in my PR.

@eriknordmark
Contributor

@uncleDecart let me know when you have updated this PR with the other fixes.

Subscriptions _must match the persistence of their corresponding publications_.
Previously, this consistency was not enforced in the msrv service, potentially
causing issues—especially when using the PubSub memdriver, which requires all
subscriptions to be persistent.

This patch fixes the inconsistency and adds a clarifying comment to help
maintain correct persistence settings in future changes.

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
@uncleDecart uncleDecart force-pushed the fix-msrv-pubusb-persist-consistency branch from 762f921 to e1b535c Compare September 24, 2025 16:18
@uncleDecart
Member Author

@eriknordmark updated the PR

Contributor

@eriknordmark eriknordmark left a comment

LGTM

it doesn't have any effect

Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
@uncleDecart
Member Author

I've added a commit to this PR that removes the duplicate activation.

@uncleDecart uncleDecart changed the title from "msrv: ensure subscription persistence consistency" to "pubsub: ensure subscription persistence consistency & remove double activation" Sep 25, 2025
@eriknordmark eriknordmark merged commit 670c70f into lf-edge:master Sep 26, 2025
44 of 45 checks passed