Skip to content

Publish update checks#100

Merged
dmullen17 merged 5 commits intoNCEAS:masterfrom
dmullen17:publish_update_checks
Jul 12, 2018
Merged

Publish update checks#100
dmullen17 merged 5 commits intoNCEAS:masterfrom
dmullen17:publish_update_checks

Conversation

@dmullen17
Copy link
Member

Added publish_update argument checks + unit tests to verify that the resource_map_pid and metadata_pid are current. Resolve #99

@jeanetteclark
Copy link
Collaborator

this looks good to me @dmullen17. @amoeba do you want to have a look as well?

R/editing.R Outdated
}

# Check that resource_map_pid, and metadata_pid are the current versions
meta_obsoletedBy <- dataone::getSystemMetadata(mn, metadata_pid)@obsoletedBy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other checks in this function are guarded by the value fo the check_first arg. Is this one and, if not, could it be?

These two calls alone could slow down an publish_update by 2-10 seconds depending on server load.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_first just checks if the objects exist, not if they're already obsoleted. It certainly could be put into the check_first block though.

I would argue against that (at least for the resource map check) because you can set to check_first = FALSE and using an obsoleted pid by mistake is one of our most common errors. It's harmless to use an incorrect metadata pid, but using an incorrect resource map pid will actually obsolete the metadata first before it fails, leaving the package as stranded metadata. This isn't a problem for the experienced user to fix, but new users typically make matters worse before asking for help, and we spend lots of time figuring out what happened.

In short: moving the metadata pid check into the check_first block is ok, but an incorrect resource map pid causes problems that I'd rather avoid. What do you think?

MRE:

## Incorrect metadata pid 
pkg1 <- create_dummy_package(adc_test)
pkg2 <- publish_update(adc_test, pkg1$metadata, pkg1$resource_map, pkg1$data)

# update package with incorrect metadata pid: 
pkg3 <- publish_update(adc_test, pkg1$metadata, pkg2$resource_map, pkg2$data) # this call errors but is harmless
# We can still publish successfully 
pkg3 <- publish_update(adc_test, pkg2$metadata, pkg2$resource_map, pkg2$data)


## Incorrect resource map 
pkg1 <- create_dummy_package(adc_test)
pkg2 <- publish_update(adc_test, pkg1$metadata, pkg1$resource_map, pkg1$data)

# update package with incorrect resource map pid:  
pkg3 <- publish_update(adc_test, pkg2$metadata, pkg1$resource_map, pkg2$data) 
# this call errors but obsoletes the metadata beforehand 
browseURL(paste0("https://test.arcticdata.io/#view/", pkg2$metadata))
# we also can't publish correctly now: 
pkg3 <- publish_update(adc_test, pkg2$metadata, pkg2$resource_map, pkg2$data)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent points! I think I could go either way on this. Always doing these two checks each time the function is run is fine w/ me if you prefer that.

R/editing.R Outdated
}
rm_obsoletedBy <- dataone::getSystemMetadata(mn, resource_map_pid)@obsoletedBy
if (!is.na(rm_obsoletedBy)) {
stop("resource_map_pid is already obsoleted by: ", rm_obsoletedBy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these error messages by a touch more helpful? Perhaps:

stop("The value passed in for the argument 'resource_map_pid' of '", resource_map_pid, "' is already obsoleted by a newer version with PID '", rm_obsoletedBy, "'. All PID arguments to publish_update should be the latest versions of each Object series.")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love this error message!

expect_equal(child$resource_map, parent$child_packages)
})

test_that("publish_update errors if the non-current resource map or metadata pid is provided", {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice and tidy test!

@amoeba
Copy link
Contributor

amoeba commented Jul 11, 2018

Looks good! I didn't look at the full source code, just the patch: Are these checks guarded behind the value of the check_first arg so a user can opt not to run them? Otherwise looks good, sans my minor comments.

@dmullen17 dmullen17 merged commit ed26220 into NCEAS:master Jul 12, 2018
laijasmine pushed a commit that referenced this pull request Oct 2, 2020
Put the metadata check in the `check_first` block and left the resource map check as non-optional.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants