-
Notifications
You must be signed in to change notification settings - Fork 1.3k
CLOUDSTACK-9570: Bug in listSnapshots for snapshots with deleted data stores #1735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
LGTM. I'll fire some regression tests. |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-131 |
|
@blueorangutan test |
|
LGTM on code review. |
dd7d85a to
b63805c
Compare
|
@rhtyd @jburwell The tests didn't kick in last time. Can we fire matrix again? |
|
if the data store is deleted, the snapshot should be removed as well, right? |
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
|
||
| long storagePoolId = snapshotStore.getDataStoreId(); | ||
| DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary); | ||
| if (! snapshotStore.getState().equals(ObjectInDataStoreStateMachine.State.Destroyed)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nvazquez this can cause NPE, instead use ObjectInDataStoreStateMachine.State.Destroyed.equals(...) or an explicit != null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks @rhtyd
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-149 |
|
@ustcweizhou Not necessary. The issue affects removal of primary storage DS where there are snapshot copies. After that the snapshot could remain on secondary storage. |
b63805c to
18d2a05
Compare
|
@nvazquez could you please add a Marvin test to verify this fix? |
|
@jburwell sure, will work on it |
| Boolean supportsStorageSystemSnapshots = new Boolean(value); | ||
| if (mapCapabilities != null) { | ||
| String value = mapCapabilities.get(DataStoreCapabilities.STORAGE_SYSTEM_SNAPSHOT.toString()); | ||
| Boolean supportsStorageSystemSnapshots = new Boolean(value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new Boolean skips the constant pool -- putting unnecessary pressure on the heap and creating a potential memory leak. Please use Boolean.valueOf to part the value to avoid this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
18d2a05 to
0c2ad75
Compare
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-184 |
42be7f6 to
32c3e89
Compare
|
@jburwell I added new Marvin test in |
|
@blueorangutan package |
|
LGTM. |
|
@nvazquez could you please check into the Travis failures? |
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-245 |
|
@jburwell the Travis failure is ignorable caused by a oobm corner case (exception is ignorable). I'll work on a fix soon. |
|
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
@jburwell @rhtyd sorry I've been off yesterday, do I push force to kick tests again? |
|
Trillian test result (tid-426)
|
32c3e89 to
8369007
Compare
8369007 to
ed87d66
Compare
| _accountMgr.buildACLSearchParameters(caller, id, cmd.getAccountName(), cmd.getProjectId(), permittedAccounts, domainIdRecursiveListProject, cmd.listAll(), false); | ||
| Long domainId = domainIdRecursiveListProject.first(); | ||
| Boolean isRecursive = domainIdRecursiveListProject.second(); | ||
| ListProjectResourcesCriteria listProjectResourcesCriteria = domainIdRecursiveListProject.third(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nvazquez What is changed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @anshul1886, at the beggining of each line there's one space added, to be consistent with indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These type of changes should not be done on unrelated file as it makes following changes difficult.
| snapshotResponse.setObjectName("snapshot"); | ||
| snapshotResponses.add(snapshotResponse); | ||
| if (snapshotResponse != null) { | ||
| snapshotResponse.setObjectName("snapshot"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is any issue observed here for which null check is added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is added as we introduced a case in createSnapshotResponse where null is returned, and in that case we shouldn't add the response to the response list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the use case for null case when that can happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few comments.
If we implement those comments, then null should not be a problem here (in other words, the code in this class could remain unchanged).
|
|
||
| long storagePoolId = snapshotStore.getDataStoreId(); | ||
| DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary); | ||
| if (snapshotStore.getState() != null && ! snapshotStore.getState().equals(ObjectInDataStoreStateMachine.State.Destroyed)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pointed out by @mike-tutkowski store cannot be destroyed if there are snapshots. Also that enum is meant for snapshot (and entities like that) and not for snapshot store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For managed storage, that is true. For unmanaged storage, one should be able to delete the primary storage from which a snapshot was created and the snapshot should be able to remain (since the snapshot, in this case, is on secondary storage).
|
@mike-tutkowski @anshul1886 @rhtyd We had enough LGTM and successful test results for this PR. Are there any concern before merging it? |
|
@serg38 sure |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-453 |
|
@serg38 @anshul1886 @rhtyd Maybe someone could take a look at my comments before we check this in and see if they think any of those should be implemented here. Thanks |
|
@mike-tutkowski sorry I don't have the bandwidth for a review, though I can facilitate with running tests. Maybe @anshul1886 and/or @serg38 can comment. Thanks. @blueorangutan test |
|
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
@mike-tutkowski The only comments in this PR are about deletion of PS with a snapshot copy. I am 100% agree with your statement. This PR excludes destroyed snapshot copies including ones on the PS from evaluation during the listSnaphsot call thus avoiding NPE if original PS is already deleted. |
|
@serg38 I had some open questions on code which are not yet answered specifically related to use case of null case and comparison of non comparable properties. Also, for unmanaged storage there is nothing called snapshot copy on primary yet once the snapshot is created. This PR is meant to fix that case i.e. unmanaged primary storage case only. So I am not understanding what are you referring to here. I had replied to your comment on dev mail but didn't get any response. |
| } | ||
|
|
||
| long storagePoolId = snapshotStore.getDataStoreId(); | ||
| DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possibility here is that we could simply still try to retrieve "dataStore" and then perform this check:
DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary);
if (dataStore == null) {
return DataStoreRole.Image;
}
If "dataStore" equals null, then it was removed, which should only be something that happened when unmanaged storage is being used (thus when the the snapshot resides on secondary storage).
| snapshotInfo = (SnapshotInfo)snapshot; | ||
| } else { | ||
| DataStoreRole dataStoreRole = getDataStoreRole(snapshot, _snapshotStoreDao, _dataStoreMgr); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say the default should be DataStoreRole.Image for getDataStoreRole and so getDataStoreRole should probably never return null.
| @Override | ||
| public PrimaryDataStore getPrimaryDataStore(long dataStoreId) { | ||
| StoragePoolVO dataStoreVO = dataStoreDao.findById(dataStoreId); | ||
| StoragePoolVO dataStoreVO = dataStoreDao.findByIdIncludingRemoved(dataStoreId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is getPrimaryDataStore(long) called from many places? If so, it might be a bit risky to change this from findById to findByIdIncludingRemoved unless we are pretty sure all of the calling code is OK with that change.
| snapshotResponse.setObjectName("snapshot"); | ||
| snapshotResponses.add(snapshotResponse); | ||
| if (snapshotResponse != null) { | ||
| snapshotResponse.setObjectName("snapshot"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few comments.
If we implement those comments, then null should not be a problem here (in other words, the code in this class could remain unchanged).
|
Trillian test result (tid-774)
|
|
@mike-tutkowski @anshul1886 thanks for your comments! I'll work on them! @rhtyd this PR was targeted for 4.9, should I retarget it to master branch? |
|
Closing this PR as #1847 includes this changes in a simpler way |
Actual behaviour
If there is snapshot on a data store that is removed,
listSnapshotsstill tries to enumerate it and gives error (in this example data store 2 has been removed):client/api?command=listSnapshots&isrecursive=true&listall=trueReproduce error
This steps can be followed to reproduce issue:
client/api?command=listSnapshots&isrecursive=true&listall=true