Synchronization problem when quickly restarting KER in distributed-mode #639

@bnouwt

This problem can be reproduced with the following setup and scenario:

Start the following applications:

  • Knowledge Directory (KD)
  • KER-A
  • KER-B

Restart KER-B quickly (before the KD notices its absence) and immediately (before it discovers new peers) add an SC. This should result in the message:
• Notifying 0 peer(s) of new or removed Smart Connectors, there are now 1 smart connectors

KER-B will never notice the new SC that was created in VUKER and ignores any messages received from this SC.

The workaround is to shut down KER-B and keep it off for 5 minutes, so that KER-A notices its absence. Then start KER-B again and it will function correctly.

Fixing this problem has two parts:

  1. Improve graceful shutdown in the KER. For example, when the RemoteKerConnectionManager stops, send a shutdown message by calling the RemoteKerConnection.stop() method. Also implement a JVM shutdown hook that stops the Message Dispatcher and gracefully stops all smart connectors in the KER via SmartConnectorImpl.stop().
  2. When a KER discovers a new peer (KER), it should send its runtimedetails to that peer, to make sure the peer gets an update after a restart. In normal circumstances this may be a duplicate action, but after a quick restart the other KER has not yet noticed the absence and will not by itself update its information about the restarted KER.
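
The two parts above could be sketched roughly as follows. This is only an illustration, not the KER's actual code: `PeerConnection`, `PeerManager`, `RecordingConnection` and `installShutdownHook` are hypothetical names made up for this sketch, and the real RemoteKerConnectionManager and RemoteKerConnection classes will look different. The only real API used is `Runtime.getRuntime().addShutdownHook(Thread)` from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KerRestartSketch {

    /** Hypothetical stand-in for a connection to one remote KER. */
    interface PeerConnection {
        void sendRuntimeDetails(String details);
        void sendShutdownNotice();
    }

    /** Hypothetical stand-in for the component that tracks known peers. */
    static class PeerManager {
        private final String runtimeDetails;
        private final Map<String, PeerConnection> peers = new LinkedHashMap<>();

        PeerManager(String runtimeDetails) {
            this.runtimeDetails = runtimeDetails;
        }

        /** Part 2: always push our runtime details to a newly discovered peer. */
        synchronized void onPeerDiscovered(String peerId, PeerConnection connection) {
            peers.put(peerId, connection);
            // Possibly redundant in normal operation, but it covers the case
            // where the peer never noticed that this KER was restarted.
            connection.sendRuntimeDetails(runtimeDetails);
        }

        /** Part 1: tell every known peer that this KER is going away. */
        synchronized void stop() {
            for (PeerConnection connection : peers.values()) {
                connection.sendShutdownNotice();
            }
            peers.clear();
        }

        synchronized int peerCount() {
            return peers.size();
        }
    }

    /** Records calls so the sketch can be exercised without a network. */
    static class RecordingConnection implements PeerConnection {
        final List<String> events = new ArrayList<>();

        @Override
        public void sendRuntimeDetails(String details) {
            events.add("runtime-details: " + details);
        }

        @Override
        public void sendShutdownNotice() {
            events.add("shutdown");
        }
    }

    /** Part 1: run the graceful stop when the JVM exits normally or on SIGTERM. */
    static Thread installShutdownHook(PeerManager manager) {
        Thread hook = new Thread(manager::stop, "ker-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        PeerManager kerB = new PeerManager("KER-B: 1 smart connector");
        installShutdownHook(kerB);

        RecordingConnection toKerA = new RecordingConnection();
        kerB.onPeerDiscovered("KER-A", toKerA);
        System.out.println(toKerA.events);
    }
}
```

Note that a JVM shutdown hook does not run when the process is killed forcibly (for example with `kill -9`), so part 2 is still needed as a fallback even when part 1 is in place.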
