7052 Fix infinite IN PROGRESS and DELETE IN PROGRESS labels #7053
Changes from all commits
```diff
@@ -0,0 +1,23 @@
+package edu.harvard.iq.dataverse.harvest.client;
+
+import javax.annotation.PostConstruct;
+import javax.ejb.EJB;
+import javax.ejb.Singleton;
+import javax.ejb.Startup;
+
+@Singleton
```
Member

Maybe I'm just being paranoid, but we've had trouble with Singleton beans recently. See IQSS/dataverse.harvard.edu#73.
Contributor (Author)

I read the issue you linked very closely, since the last thing I want to do is introduce new errors. In that issue there's an authentication bean that is a singleton, and when the bean gets held up by a process, no other clients can use it to authenticate because there's only one instance. That is bad. In my use case, however, the singleton is only used during the startup of Dataverse to reset any hanging harvesting clients. After that there's no use for it, so a singleton bean seems like the right approach here. Moreover, in Dataverse there are many singleton beans with an additional
Contributor

I'm not worried about it being a singleton - seeing how it has nothing in it but the

It's easy to address for the most part: in a multi-server situation only one node is designated to do harvesting and other timer-run jobs. This "master node" is the one that has the JVM option

HOWEVER, there is an exception to the rule: when a harvesting run is started not by the timer but manually, through the harvesting client page and/or API, the job can end up running on any one of the multiple nodes.

Admittedly, this will not affect many other Dataverse installations (I'm actually not aware of another production installation at this point that is using a multiple-server setup). But it would affect us, so I am a bit hesitant about doing this.
Contributor

Another solution would be to change how manually-run harvests are handled. I.e., instead of actually running the harvest, the page or API call could create a one-time timer entry for 5 seconds from now. The job would then be picked up by the timer service on the master node, so we would be guaranteed to have all harvesting runs stay on one server.
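The one-time-timer idea above can be sketched in plain Java. This is an illustration only: in Dataverse the scheduling would go through the container's EJB timer service, not a `ScheduledExecutorService`, and the class and method names here are hypothetical.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustration of the proposal: instead of running a manually triggered harvest
// directly on whichever node served the request, register a one-shot timer entry
// 5 seconds out, so the timer machinery on the designated node picks it up.
public class OneShotHarvestTrigger {
    public static ScheduledFuture<?> scheduleHarvest(ScheduledExecutorService timer,
                                                     Runnable harvestJob) {
        // one-time entry, 5 seconds from now, as suggested in the comment
        return timer.schedule(harvestJob, 5, TimeUnit.SECONDS);
    }
}
```

The point of the indirection is that the node handling the HTTP request only records the intent to harvest; whichever node owns the timers actually executes the job.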
Contributor

I'm not 100% sure that being able to have these harvest-related flags cleared automatically is worth adding complicated code to address the multiple-server scenario...
Contributor (Author)

@landreev I think it's best to take the simplest approach and only build in support for the single-server scenario. My plan is to add a sane default of "true" for a configurable DB key "ClearingHarvestJobsOnStartupEnabled" using the SettingsServiceBean. Do you have any objections?
```diff
+@Startup
+public class ResetHarvestInProgressBean {
+
+    @EJB
+    HarvestingClientServiceBean harvestingClientService;
+
+    @PostConstruct
+    public void init() {
+        for (HarvestingClient client : harvestingClientService.getAllHarvestingClients()) {
+            harvestingClientService.resetHarvestInProgress(client.getId());
+            harvestingClientService.resetDeleteInProgress(client.getId());
+        }
+    }
+}
```
Not a big deal, but what is the value of adding this status condition for a failure to delete a client? It has some value for a Dataverse admin to know that the last attempt to harvest from a certain server resulted in a failure. But is it really useful to know that an attempt to delete a client failed? Should the startup check simply remove the "delete in progress" flag quietly instead?

After all, seeing that the client is still there makes it fairly clear that the attempt to get rid of it didn't work, so they should just try again.

I may be missing some situation where it could actually be useful, so I'm open to hearing it.
@landreev I guess handling it quietly would be OK. So instead of DELETE FAILED or FAILED we would just show a success label (those are the only options)?
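The "quiet" option under discussion can be sketched as follows. `HarvestingClient` here is a minimal stand-in for the real entity, and the field names are hypothetical; the point is only to contrast clearing the flags with recording a failure label.

```java
// Sketch of handling a stale "delete in progress" flag quietly: clear the
// in-progress flags at startup without recording a DELETE FAILED / FAILED
// status that would surface as a label in the UI.
public class QuietReset {
    static class HarvestingClient {      // minimal stand-in, not the real entity
        boolean harvestInProgress = true;
        boolean deleteInProgress = true;
        String lastStatusLabel = null;   // quiet approach: left untouched
    }

    static void resetQuietly(HarvestingClient c) {
        c.harvestInProgress = false;
        c.deleteInProgress = false;
        // deliberately no c.lastStatusLabel = "FAILED" or "DELETE FAILED"
    }
}
```

Since the client is still visible after a failed delete, the admin can infer the failure and retry without an explicit failure label.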