10116 incomplete matadata label setting#10172
Conversation
src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java
Outdated
Show resolved
Hide resolved
…hub.com/ErykKul/dataverse into 10116_incomplete_matadata_label_setting
…dity-label-when-published
qqmyers
left a comment
There was a problem hiding this comment.
Looks good. I think the build is failing because something isn't up to date w.r.t. the war file name after the version change. Once that's fixed, the build should be rerun.
|
Hello and thank you for this fix. Looking at the developments of this ticket, I see that it is not possible to have an overview of the published datasets concerned for an administrator (this is however specified in the doc), perhaps this is normal? Indeed, "my Data" will not display the dataset if the administrator is not the depositor. dataverse/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java Lines 787 to 798 in 7a3ee97 Perhaps we need to modify the condition on the draft versions to obtain a consistent search? Thanks a lot |
|
@stevenferey Good catch on the indexing part! Thanks! I think I over-fixed it while making sure that published datasets never show as incomplete to regular users. I will fix it and retest it to see if it works as intended. As for the filters part and datasets you see as administrator, it might be because of the roles that are assigned to the administrator account? You could have accounts with different roles, even one specific to detecting the incomplete datasets. I think it might be the The filter does work too, but indeed I see only draft datasets with incomplete metadata. I overlooked that it does not show any published datasets with incomplete metadata as we do not have any. I will create some on my test installation and fix the problem. |
|
Thank you for the feedback and future adjustments, Indeed, an administrator with "contributor" rights on a dataverse displays the draft datasets with incomplete metadata in the "my data" page. |
|
@ErykKul - I assigned you since it looks like you were going to make additional changes. If that's not true or that's a separate PR, let me know so we can move this to Ready For QA. Otherwise, just assign me when you've made the changes and I'll re-review. |
|
Yes, I need to make some changes first. Easy to do, but then I need to test it thoughtfully. I rill do it after my vacation after 15th. This PR is not urgent, once ready, I will let you know. |
…nd incmoplete) are checked
…ata, you are allowed to see incomplete metadata label on published datasets when the flag is enabled
|
@qqmyers |
| for (DatasetField dsf : retList) { | ||
| if (dsfType.equals(dsf.getDatasetFieldType())) { | ||
| add = false; | ||
| if (removeEmptyValues) { |
There was a problem hiding this comment.
I'm not sure I understand this part. Can you explain more about why this change is needed? With removeEmptyValues = true, which is called only from the DatasetPage.isValid method (and not the FilePage.isValid method in line 325?), this removes empty values for any field that is in the version and is in the metadata blocks. Could it just be for any field in the version, e.g. called around line 1597? If so, since this whole method just adds empty values, do you just not need line 2301 in DatasetPage (and 325 in FilePage)?
There was a problem hiding this comment.
I find this part strange and confusing too. Maybe I am missing something. The reasons why I did it this way are the following:
- when I simply do a
isValid()check on a version with incomplete metadata that does not have some mandatory fields, it says that it is valid. Only when I first call "initDatasetFields", it works and detects when mandatory fields are missing (it keeps the fields that are already there, but adds empty values for the fields that are not there yet, and now the validator can detect that these are empty). It still says valid for valid metadata. - however, I managed to break that logic in my tests by manipulating metadata blocks and making fields mandatory, etc. As it turns out, it worked for new versions made with the new citation block, but old version, already published, still came out as 'valid' from the check (even after making a new draft version from an old published version with now incomplete metadata, it still would be marked as 'valid'). This was inconsistent, since you can make some changes to the citation block that do get detected in an already published datasets... After some digging, it turned out that some fields were not empty and had the special "N/A" value in them. I am not sure if empty strings were fine or not for the isValid check, I did not dig deep enough for that, I think that only "N/A" was causing the problems.
- I have tried using the "is empty" checks implemented in the dataset field class, none of them worked for this special case, datasets were still "valid", even when incomplete (both tests did not detect empty values well enough to fix the problem).
- inspired by what the 'is empty' checks are doing, I implemented this new method that removes primitive values if they are either empty strings or contain the special "N/A" value. With this new method I had consistent results: published datasets with incomplete metadata showed as invalid, and with complete metadata as valid. The same for all draft versions. I only tested with the dataset page, so I missed the file page thing.
- since what I did looks strange and drastic to me, I did not make it a "new standard" and decided to use it only for the test on completeness of metadata. I am still not sure how exactly the UI validation logic works on metadata, and how it is different from "isValid" check done in the dataset version, it seems that they use the same bean validators, but results are very different. It looks to me that it might be worth it to invest more time in this and do some refactoring there.
src/main/webapp/file.xhtml
Outdated
| <h:outputText value="#{bundle['dataset.versionUI.deaccessioned']}" styleClass="label label-danger" rendered="#{FilePage.fileMetadata.datasetVersion.deaccessioned}"/> | ||
| <h:outputText value="#{FilePage.fileMetadata.datasetVersion.externalStatusLabel}" styleClass="label label-info" rendered="#{FilePage.fileMetadata.datasetVersion.externalStatusLabel!=null and FilePage.canPublishDataset()}"/> | ||
| <h:outputText value="#{bundle['incomplete']}" styleClass="label label-danger" rendered="#{FilePage.fileMetadata.datasetVersion.draft and !FilePage.fileMetadata.datasetVersion.valid}"/> | ||
| <h:outputText value="#{bundle['incomplete']}" styleClass="label label-danger" rendered="#{FilePage.fileMetadata.datasetVersion.draft and !FilePage.valid}"/> |
There was a problem hiding this comment.
This just seems to bypass lines 324-325 in FilePage.isValid() (and skips the new caching there). Is this why you didn't need to call initDatasetFields(true) there?
Also - FilePage.isValid is called in the FilePage.displayPublishMessage() call - should that have the same logic as the call here?
There was a problem hiding this comment.
I focused on the dataset page and missed that one. I have changed the logic to 'show incomplete label when not FilePage.valid', and added the 'true' in the call to initDatasetFields.
src/main/java/edu/harvard/iq/dataverse/mydata/DataRetrieverAPI.java
Outdated
Show resolved
Hide resolved
|
@qqmyers |
|
It turns out that I was the only one using the "isValid" method in DatasetVersion, so I changed it to keep the logic centralized. I added comments to make it more clear what is happening there. |
|
I retested it: collection now do show up in my data, incomplete and complete (draft and published) datasets have correct labels when "dataverse.ui.show-validity-label-when-published" is enabled, and published datasets do not have incomplete labels when they are disabled for published datasets. |
|
You need to reindex for the changes to take effect. |
qqmyers
left a comment
There was a problem hiding this comment.
Looks good. For QA, it sounds like two ways to set up - allow incomplete datasets and upload one via the api, or publish a dataset and then make a field required in a metadata block.
|
@ErykKul This looks good. The only concern I have is that the show-validity-label defaults to false. I think if I have edit rights, I would want to know if I have a published dataset that needs attention - especially since it should be a rare occurrence. I would argue for a default of 'true' then if it gets to be too much you could shut it off with false. |
|
@sekmiller Sounds good, I changed the default to |


What this PR does / why we need it:
Fixed the bug where incomplete metadata label was shown on a published dataset and visible for everybody.
The label is now only shown for draft dataset or when the new
dataverse.api.show-label-for-incomplete-when-publishedfeature is enabled, but only for the published datasets that the users can edit (e.g., when you are logged in, and you are a contributor for a given published dataset with incomplete metadata).Which issue(s) this PR closes:
Closes #10116
Is there a release notes update needed for this change?:
Yes