only abbreviate siteid if numeric and over a billion #3428
Merged
Description
Working on support for arbitrary string identifiers by changing places that abbreviate siteIDs.
Previously these places assumed the ID was coercible to numeric with billions place = server number, then dropped zeroes from the middle:
- 33 -> "0-33"
- 1000000005 -> "1-5"
- "3000001875" -> "3-1875"
- "foo" -> "NA-NA" from some fns and error from others

Now they check whether the coercion succeeds and gives a value greater than 1e+09, and treat the siteID as a string otherwise.
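As a rough illustration of the check described here, the sketch below abbreviates only IDs that coerce to a number greater than 1e9 and passes everything else through as a string. The helper name `abbreviate_site_id` is hypothetical and is not the code in this PR; it also assumes a single (scalar) ID and that small numeric IDs such as 33 are returned unchanged.

```r
# Hypothetical helper for illustration only; not the function used in this PR.
# Assumes `site_id` is a single value, either numeric or character.
abbreviate_site_id <- function(site_id) {
  # IDs read from XML arrive as character, so try coercing first.
  # Non-numeric strings such as "foo" give NA (warning suppressed).
  id_num <- suppressWarnings(as.numeric(site_id))

  if (!is.na(id_num) && id_num > 1e9) {
    # BETY-style ID: billions place is the server number,
    # the remainder is the site number on that server.
    return(sprintf("%.0f-%.0f", id_num %/% 1e9, id_num %% 1e9))
  }

  # Everything else is treated as an opaque string identifier.
  if (is.numeric(site_id)) {
    # Avoid scientific notation such as "1e+05" for round numbers.
    return(format(site_id, scientific = FALSE, trim = TRUE))
  }
  as.character(site_id)
}

abbreviate_site_id(1000000005)   # "1-5"
abbreviate_site_id("3000001875") # "3-1875"
abbreviate_site_id("foo")        # "foo"
abbreviate_site_id(33)           # "33"
```

Using `sprintf("%.0f")` rather than `paste0()` on the numeric parts is a deliberate choice in this sketch: `as.character()` can switch to scientific notation for round values, which would corrupt the abbreviated ID.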
Note that there are a few places with patterns like "siteid %/% 1e9" that I didn't change here:
- shiny/ and 3 in inst/ folders, which look like they're not used often enough to bother updating right now.
- /modules/data.remote/R/remote_process.R, which is heavily DB-dependent in other ways and it's probably reasonable for now to keep assuming all the IDs it handles come from BETY

site.info if not present #3324
Motivation and Context
As we move away from requiring BETY connections, siteIDs will keep being useful as unique identifiers but need not be constrained to be numeric. They will probably be smaller than 1e9, and if they are larger, the billions place won't have any special significance. For the initial CCMMF workflows, I've been using site names as IDs and finding they mostly Just Work. Of the changes here, I only needed the ones in pool_ic_list2netcdf today, but decided to tackle the others I saw that used the same assumption.
This does add a bit of complexity because the ID might be passed as actual numeric or as character containing digits (as read from XML).
One obvious alternate design would be to stop abbreviating at all (or move it to a step further upstream) and have all these functions use the ID exactly as passed, coercing to character if it isn't already. I considered this but thought that for backward compatibility it was worth keeping the existing behavior when running with BETY ids.
Note also that in #3324 we discussed what to do if passed a lat-lon with no siteID, and one design we considered was "generate a siteID by pasting lat and lon together". If we proceed with that design, we may want to consider potential confusion between "1-35" meaning siteID 1000000035 vs meaning a site at 1 degree north and 35 degrees west.
Review Time Estimate
Types of changes
Checklist: