Skip to content

Delete datastore files iteratively for all datastore types#7441

Merged
AngieCris merged 12 commits intovmware:masterfrom
AngieCris:vic/7359
Mar 14, 2018
Merged

Delete datastore files iteratively for all datastore types#7441
AngieCris merged 12 commits intovmware:masterfrom
AngieCris:vic/7359

Conversation

@AngieCris
Copy link
Contributor

@AngieCris AngieCris commented Mar 2, 2018

[specific ci=Group6-VIC-Machine]

Fixes #7359

Revert of commit 1fd1e7f PR #6951

For non-vSAN type datastores, sometimes it complains when we try to delete non-empty datastore folders with child image files, causing trailing VCH datastore folders after vic-machine delete finishes.
So the new approach is, we attempt to delete a datastore folder first. If it fails, we delete children files one by one, and then finally empty the folder.

@AngieCris AngieCris self-assigned this Mar 2, 2018
@AngieCris AngieCris requested review from emlin and hickeng March 2, 2018 22:47
@emlin
Copy link
Contributor

emlin commented Mar 5, 2018

#2652 is the PR to delete all kind of datastore files recursively, for vc6.5 requires that.

@emlin
Copy link
Contributor

emlin commented Mar 5, 2018

lgtm

Approved with PullApprove

@AngieCris AngieCris requested a review from anchal-agrawal March 5, 2018 17:33
}
d.op.Debugf("Folder %q is not found", dsPath)
return nil
// Get sorted result to make sure children files listed ahead of folder. Then we can empty folder before delete it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/children/child

}
d.op.Debugf("Folder %q is not found", dsPath)
return nil
// Get sorted result to make sure children files listed ahead of folder. Then we can empty folder before delete it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/delete/deleting

d.op.Debugf("Folder %q is not found", dsPath)
return nil
// Get sorted result to make sure children files listed ahead of folder. Then we can empty folder before delete it
// This function specifically designed for vSan, as vSan sometimes will throw error to delete folder is the folder is not empty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this comment if it's not relevant anymore?

Copy link
Contributor

@anchal-agrawal anchal-agrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Do we need to update any tests in 6-03-Delete to account for the changes here?

Copy link
Contributor

@hickeng hickeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems I'm reluctant to continue with the epically long delete times for large numbers of images...

@@ -158,26 +158,23 @@ func (d *Dispatcher) isVSAN(ds *object.Datastore) bool {
func (d *Dispatcher) deleteFilesIteratively(m *object.DatastoreFileManager, ds *object.Datastore, dsPath string) error {
defer trace.End(trace.Begin(dsPath, d.op))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the whiplash Angie - given this isn't a direct revert if addressing anchal's comments, could we add a recursive delete attempt as the first thing, then fail over into the current path.
So adding the following at the top:

// don't check errors given detail process of any remaining files will occur after
d.deleteVMFSFiles(m, ds, dsPath)

Copy link
Contributor Author

@AngieCris AngieCris Mar 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's alright. unnecessary recursive delete doesn't sound very neat tbh....
it seems like ESX and older versions of VC don't have the issue but vSAN and new VC version do have that, in most cases it has to go thru iterative delete

if we add d.deleteVMFSFiles(m, ds, dsPath) at the top, then it would probably print failure log messages complaining not being able to delete non-empty datastore folders. Since we ignore the error coming out of this, and recursively clean stuff up later, if the folder did get cleaned up in the end, looking at the log it would still feel like that there's a failure deleting the folder.
sorry about the messy description, but just wondering will it confuse users?

Copy link
Contributor Author

@AngieCris AngieCris Mar 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log looks like this from a test run I did:

INFO[0003] Removing image stores 
DEBU[0004] Failed to delete "[datastore1] vch-test-4/VIC": Cannot delete file [datastore1] vch-test-4/VIC 
DEBU[0013] Folder "[datastore1] vch-test-4/kvStores" is not found 
DEBU[0013] Image store parent directory not empty, leaving in place.
......
DEBU[0016] Folder "[datastore1] vch-test-4" is not found 
INFO[0016] Removing Resource Pool "vch-test-4" 

The later Folder "[datastore1] vch-test-4" is not found is from vm.Unregsiter, it shows that the folder is already been deleted
but still we see Failed to delete "[datastore1] vch-test-4/VIC": Cannot delete file [datastore1] vch-test-4/VIC on the top, even tho it's actually deleted successfully right after

(also we're sort of not supposed to see Folder "[datastore1] vch-test-4" is not found from unregister at all, but it shows up because the line we added returns NotFound error)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with them showing up as debug messages - we could add an additional debug message saying "Attempting recursive delete" then if that's not clean a message saying "Cleaning up recusive deletion failures via iterative delete" or similar.

We may want to return a different error type rather than re-use NotFound if it's causing misleading output.

Copy link
Contributor Author

@AngieCris AngieCris Mar 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Here's the log output with the most recent change:

DEBU[0007] No disks found attached to VM                
INFO[0007] Removing image stores                        
DEBU[0009] Failed to delete "[datastore1] vch-2/VIC": Cannot delete file [datastore1] vch-2/VIC 
DEBU[0009] Attempt to delete top level folder "[datastore1] vch-2/VIC" failed. Remove the children files instead. 
DEBU[0027] Image store parent directory not empty, leaving in place. 
INFO[0027] Removing volume stores 

Does this look okay? I added a log saying Attempt to delete top level folder failed. Remove the children files instead

@AngieCris
Copy link
Contributor Author

AngieCris commented Mar 5, 2018

@anchal-agrawal
6-03-Delete test suite already includes a test case called Attach Disks and Delete VCH that pulls an image, attaches the disk files, then delete the VCH, and finally check if the image folder is deleted. This covers the case where it could potentially fail.

The reason why all group 6 tests didn't fail on CI before was that old CI ran on ESX, and deleting a non-empty datastore folder with child image files works on ESX. But the operation won't go thru on newer version of VC.

@AngieCris AngieCris requested a review from zjs March 5, 2018 21:48
@codecov-io
Copy link

codecov-io commented Mar 5, 2018

Codecov Report

Merging #7441 into master will decrease coverage by 0.07%.
The diff coverage is 42.85%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #7441      +/-   ##
=========================================
- Coverage   31.58%   31.5%   -0.08%     
=========================================
  Files         278     278              
  Lines       42169   42172       +3     
=========================================
- Hits        13318   13286      -32     
- Misses      27685   27719      +34     
- Partials     1166    1167       +1
Impacted Files Coverage Δ
lib/install/management/store_files.go 46.24% <42.85%> (+0.23%) ⬆️
lib/portlayer/attach/communication/lazy.go 68.18% <0%> (-22.73%) ⬇️
lib/portlayer/attach/communication/connector.go 54.2% <0%> (-7.48%) ⬇️
lib/dns/cache.go 89.85% <0%> (-4.35%) ⬇️
lib/portlayer/attach/communication/interactor.go 24.08% <0%> (-2.92%) ⬇️
pkg/logmgr/logmgr.go 67.36% <0%> (+1.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 401438a...0b2cedc. Read the comment docs.

@AngieCris AngieCris changed the title Delete datastore files recursively for all datastore types Delete datastore files iteratively for all datastore types Mar 8, 2018
@AngieCris AngieCris merged commit 42375aa into vmware:master Mar 14, 2018
@AngieCris AngieCris deleted the vic/7359 branch March 14, 2018 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants