Description
Hi everyone,
I have seen many new users delete the entire storage (Elasticsearch), which makes the metadata (service, instance, address, and endpoint inventories) unavailable for streaming analysis and queries.
Many issues have come from this. There is also a proposal to provide a reset through agent settings, #1929, which hasn't been merged.
I have been thinking about a solution for several months and discussing it with several contributors, and I was involved in #1929 as well. Now I have a new proposal.
Core requirement
The key to this feature is getting the whole environment working again, even after the data has somehow been deleted. So coming back online immediately is not the high-priority requirement; convenience is.
Solution
Add a new UI page that shows:
- A list of all services and service instances. Each instance shows the agent register time and last ping time.
- A RESET button with a time in minutes, defaulting to 2 minutes, which gives the user a way to reset all agent cores.
Tech related.
- We already have the `Commands` response in the trace, JVM, and `ServiceInstancePing#doPing` services, so commands are available to use.
- The UI page is not very hard. We need to add a new GraphQL query to support it. @TinyAllen Any interest in helping us with this? FYI @hanahmily
- Backend: provide a timer for this specific command. Right now, we don't think there is a common way to handle all commands. Is that all right? @peng-yongsheng When the timer finds this command, it asks all receivers to stay in `response Commands only` mode, which means `don't do dispatch` and `clear the buffer`.
- Agent: support this command by clearing the buffer, queue, serviceId, and service instanceId, and keep running in this sleep mode until the time is up.
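
To make the backend and agent items above more concrete, here is a minimal sketch of the two pieces of state involved. It is only an illustration under assumptions: `ResetWindow`, `AgentCore`, and all of their methods are hypothetical names, not existing SkyWalking classes, and the current time is passed in explicitly so the logic is easy to check. It is not thread-safe on the agent side and leaves out the real `Commands` protocol entirely.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical backend-side state. While the window is active, receivers would
// reply with the reset command only and skip dispatch.
final class ResetWindow {
    private long endsAtMillis = 0L; // 0 means no reset is in progress

    // Called when the user presses RESET in the UI (default would be 2 minutes).
    synchronized void start(long nowMillis, int minutes) {
        endsAtMillis = nowMillis + minutes * 60_000L;
    }

    synchronized boolean isActive(long nowMillis) {
        return nowMillis < endsAtMillis;
    }

    // Time the agents should still sleep, so they all wake up together.
    synchronized long remainingMillis(long nowMillis) {
        return Math.max(0L, endsAtMillis - nowMillis);
    }
}

// Hypothetical agent-side state touched by the reset command.
final class AgentCore {
    static final int UNREGISTERED = 0;

    private final Queue<String> buffer = new ArrayDeque<>();
    private int serviceId = UNREGISTERED;
    private int serviceInstanceId = UNREGISTERED;
    private long sleepUntilMillis = 0L;

    void register(int serviceId, int serviceInstanceId) {
        this.serviceId = serviceId;
        this.serviceInstanceId = serviceInstanceId;
    }

    // Data collected during sleep mode is dropped instead of buffered.
    boolean offer(String segment, long nowMillis) {
        if (isSleeping(nowMillis)) {
            return false;
        }
        return buffer.offer(segment);
    }

    // Clear the buffer and both IDs, then stay in sleep mode until time is up.
    void onResetCommand(long nowMillis, long sleepMillis) {
        buffer.clear();
        serviceId = UNREGISTERED;
        serviceInstanceId = UNREGISTERED;
        sleepUntilMillis = nowMillis + sleepMillis;
    }

    boolean isSleeping(long nowMillis) {
        return nowMillis < sleepUntilMillis;
    }

    boolean isRegistered() {
        return serviceId != UNREGISTERED && serviceInstanceId != UNREGISTERED;
    }

    int bufferedCount() {
        return buffer.size();
    }
}
```

In this sketch, the agent re-registers on its own once `isSleeping` returns false, which is what would rebuild the deleted inventories. Sending the remaining time rather than the full duration is one possible design choice, so that agents receiving the command late still wake up with the others.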
@apache/skywalking-committers @JaredTan95 I want to provide this in the beta release. Looking forward to your feedback.