Skip to content

3583 dv cannot parse dois that don't have a separator in authority/prefix#4651

Merged
kcondon merged 49 commits intoIQSS:developfrom
QualitativeDataRepository:3583-DV-cannot-parse-DOIs-that-don't-have-a-separator-in-authority/prefix
May 31, 2018
Merged

3583 dv cannot parse dois that don't have a separator in authority/prefix#4651
kcondon merged 49 commits intoIQSS:developfrom
QualitativeDataRepository:3583-DV-cannot-parse-DOIs-that-don't-have-a-separator-in-authority/prefix

Conversation

@qqmyers
Copy link
Member

@qqmyers qqmyers commented May 4, 2018

New Contributors

Welcome! New contributors should at least glance at CONTRIBUTING.md, especially the section on pull requests where we encourage you to reach out to other developers before you start coding. Also, please note that we measure code coverage and prefer you write unit tests. Pull requests can still be reviewed without tests or completion of the checklist outlined below. Thanks!

Related Issues

Pull Request Checklist

@qqmyers
Copy link
Member Author

qqmyers commented May 4, 2018

The required database update is the two lines below - not sure if the version update sql files go in git somewhere or not...

UPDATE dataset SET identifier=substring(authority, strpos(authority,doiseparator)+1) || doiseparator || identifier WHERE strpos(authority,doiseparator)>0;
UPDATE dataset SET authority=substring(authority from 0 for strpos(authority,doiseparator)) WHERE strpos(authority,doiseparator)>0;

@coveralls
Copy link

coveralls commented May 4, 2018

Coverage Status

Coverage increased (+0.0008%) to 13.258% when pulling f646a82 on QualitativeDataRepository:3583-DV-cannot-parse-DOIs-that-don't-have-a-separator-in-authority/prefix into e600c29 on IQSS:develop.

qqmyers and others added 23 commits May 14, 2018 21:43
always '/' 'for DOIs and Handles, and any shoulder:identifier separator
is now handled as part of the separator.

Also fixed the bug I introduced case logic for Handles, which are case
sensitive.
first indexOf '/' only, so the restriction is now obsolete.
…-DOIs-that-don't-have-a-separator-in-authority/prefix
3583-DV-cannot-parse-DOIs-that-don't-have-a-separator-in-authority/prefix

Conflicts:
	src/main/java/edu/harvard/iq/dataverse/Dataset.java
	src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java
	src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java
Both Handles and DOIs require '/' as the authority/identifier separator.
Since any separator / sub-identifier separator is now just part of the
shoulder, there is no need for a configurable value for that. It is also
problematic to make the authority/identifier separator configurable at
this point since there are many places in the code where '/' is
hardcoded (reasonable given that Handle and DOI both require it), so
supporting a new Id type that doesn't use this character requires more
work than creating a new serviceBean and setting the key.

Nominally, all identifiers will have '/' as the separator in the
database and the separator column in the dataset table (and now for
datafiles too) could be removed, but that would complicate id matching
relative to current code which just concatenates
<authority><separator><identifier> and would add work to restore it if a
different identifier with new separator is ever created. So - leaving
this, and code to get/set it as is...
files

Given the code for generating datafile ids, it looks like the new
ability to support a shoulder should only apply to datasets as a start.

This commit splits the existing IdGenerationStyle key into two one for
datasets and one for datafiles. Datafiles will continue to only support
random or sequential identifiers (with an additional choice as to
whether to those are independent or dependent (appended to the dataset
identifier) while datasets can also use the shoulder with random or
sequential options.

It seems like allowing files to use a shoulder when they are independent
would also make sense, but I don't think using that with dependent would
make as much sense. If that's a reasonable choice, we might want to
rethink having the dependent/independent choice as a separate key
(datafiles would ultimately have 6 choices - random or sequential with
dependent/independent or independent with shoulder and random or
sequential).
These updates fix an issue with checking the db for entries that match
the new one so that tests include the shoulder.

They also use a shoulder, if defined, for datafile, only when the ids
for datafiles are being generated as 'INDEPENDENT'. When DEPENDENT, the
shoulder is already used once in the part of the id coming from the
dataset, so it is not used again.
Having a shoulder set will cause it to be used.
Datafile identifiers now prepend either the dataset identifier
(dependent mode) or the shoulder (if defined, independent mode only) and
the appropriate prepend string is sent into the appropriate generateId
method so it can be prepended before the isIdentifierUniqueInDatabase
call.
3583-DV-cannot-parse-DOIs-that-don't-have-a-separator-in-authority/prefix

Conflicts:
	src/main/java/edu/harvard/iq/dataverse/DataFileServiceBean.java
	src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java
qqmyers and others added 11 commits May 22, 2018 18:08
…n-authority/prefix' of https://github.com/QualitativeDataRepository/dataverse.git into 3583-DV-cannot-parse-DOIs-that-don't-have-a-separator-in-authority/prefix
doiShoulder is a misnomer, it is called in the generation of new
identifiers, regardless of the protocol. However, it is no longer
required to be '/' as the misnamed doiSeparator was.

I also updated the doc to note that the shoulder is applied to
identifiers, including INDEPENDENT file ids.
since it applies to generated Handles as well.

--Add new setting into content for shoulder
INSERT INTO setting(name, content)
VALUES (':Shoulder', (SELECT substring(content, strpos(content,'/')+1) || '/' from setting where name = ':Authority'));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an :Authority with no shoulder and this insert sets the :Shoulder to '/' instead of just leaving it unset. If we (QDR) are the only ones (since we've already deployed the early custom shoulder code that started this PR), then this would be OK.


--Add new setting into content for shoulder
INSERT INTO setting(name, content)
VALUES (':Shoulder', (SELECT substring(content, strpos(content,'/')+1) || '/' from setting where name = ':Authority'));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sekmiller - I'm confused as to why you want to use doiseparator since any value other than '/' would have broken an install at least for >~4.6. Is it for earlier versions? If so, I think you'll need to restore the lines to set the doiseparator in the dvobject table that I removed this morning in my commit. (Sorry for causing confusion...)

@pdurbin
Copy link
Member

pdurbin commented May 31, 2018

After this pull request was merged, the API tests on the phoenix server started failing: #4725

@matthew-a-dunlap matthew-a-dunlap changed the title 3583 dv cannot parse do is that don't have a separator in authority/prefix 3583 dv cannot parse dois that don't have a separator in authority/prefix Jun 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants