Merged
31 commits
4764b8e
read metadatalanauge if sent, set if exists and is allowed
qqmyers Feb 15, 2022
212c824
add metadata language as schema.org/inLanguage for import/export
qqmyers Feb 15, 2022
d3d1aab
typo (Thanks pdurbin) and de-merge issues brealing this PR out
qqmyers Feb 18, 2022
b7d0c52
Merge remote-tracking branch 'IQSS/develop' into IQSS/8337-handle_met…
qqmyers Feb 18, 2022
36e077f
Add debug info to test
qqmyers Feb 21, 2022
99174b9
check undefined code, not null
qqmyers Feb 21, 2022
2163721
explicitly print
qqmyers Feb 21, 2022
74c5a03
Merge remote-tracking branch 'IQSS/develop' into IQSS/8337-handle_met…
qqmyers Mar 7, 2022
5c4b85a
check versus undefined code, not null
qqmyers Mar 8, 2022
fdbf71c
Merge remote-tracking branch 'IQSS/develop' into IQSS/8337-handle_met…
qqmyers Mar 8, 2022
5835d8f
captialization typo
qqmyers Mar 11, 2022
a62af79
Merge remote-tracking branch 'IQSS/develop' into IQSS/8337-handle_met…
qqmyers Mar 21, 2022
692fa2b
add example files for creating with a metadatalanguage
qqmyers Mar 24, 2022
5512f20
missed update in checking null vs undefined
qqmyers Mar 24, 2022
852e2aa
add metadataLanguage field/set to fr
qqmyers Mar 24, 2022
ebd62dc
metadataLang goes at dataset level
qqmyers Mar 28, 2022
0afbe20
refactor API checks - require consistent value
qqmyers Mar 28, 2022
d690073
note about metadataLang in API calls
qqmyers Mar 28, 2022
b3fb7fc
fix logic for owner has undefined option
qqmyers Mar 28, 2022
a9e8dea
inverted logic for the metadatalang setting not set case
qqmyers Mar 29, 2022
5497434
Merge remote-tracking branch 'IQSS/develop' into IQSS/8337-handle_met…
qqmyers Apr 6, 2022
7f8e683
refactor, fix, add tests
qqmyers Apr 7, 2022
8fc524f
add mlang /check to import ddi
qqmyers Apr 8, 2022
2bb334f
use namespace in parsing xml:lang
qqmyers Apr 8, 2022
948660a
add logging, switch order (as a test)
qqmyers Apr 8, 2022
091cc19
try namespace url
qqmyers Apr 8, 2022
7c2fa01
Merge remote-tracking branch 'IQSS/develop' into IQSS/8337-handle_met…
qqmyers Apr 11, 2022
67de235
change logging to fine
qqmyers Apr 11, 2022
bea2c57
fix typo in string
qqmyers Apr 11, 2022
60c9f13
add logging to debug test fails
qqmyers Apr 11, 2022
dee802a
move debug logging to fine
qqmyers Apr 11, 2022
16 changes: 16 additions & 0 deletions doc/sphinx-guides/source/_static/api/dataset-create_en.jsonld
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"http://purl.org/dc/terms/title": "Darwin's Finches",
"http://purl.org/dc/terms/subject": "Medicine, Health and Life Sciences",
"http://schema.org/inLanguage":"en",
"http://purl.org/dc/terms/creator": {
"https://dataverse.org/schema/citation/author#Name": "Finch, Fiona",
"https://dataverse.org/schema/citation/author#Affiliation": "Birds Inc."
},
"https://dataverse.org/schema/citation/Contact": {
"https://dataverse.org/schema/citation/datasetContact#E-mail": "finch@mailinator.com",
"https://dataverse.org/schema/citation/datasetContact#Name": "Finch, Fiona"
},
"https://dataverse.org/schema/citation/Description": {
"https://dataverse.org/schema/citation/dsDescription#Text": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds."
}
}
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -459,7 +459,7 @@ To create a dataset, you must supply a JSON file that contains at least the foll
- Description
- Subject

As a starting point, you can download :download:`dataset-finch1.json <../../../../scripts/search/tests/data/dataset-finch1.json>` and modify it to meet your needs. (In addition to this minimal example, you can download :download:`dataset-create-new-all-default-fields.json <../../../../scripts/api/data/dataset-create-new-all-default-fields.json>` which populates all of the metadata fields that ship with a Dataverse installation.)
As a starting point, you can download :download:`dataset-finch1.json <../../../../scripts/search/tests/data/dataset-finch1.json>` and modify it to meet your needs. (:download:`dataset-finch1_fr.json <../../../../scripts/api/data/dataset-finch1_fr.json>` is a variant of this file that sets the metadata language (see :ref:`:MetadataLanguages`) to French (fr). In addition to this minimal example, you can download :download:`dataset-create-new-all-default-fields.json <../../../../scripts/api/data/dataset-create-new-all-default-fields.json>`, which populates all of the metadata fields that ship with a Dataverse installation.)

The curl command below assumes you have kept the name "dataset-finch1.json" and that this file is in your current working directory.

@@ -99,5 +99,5 @@ With curl, this is done by adding the following header:

curl -H X-Dataverse-key:$API_TOKEN -H 'Content-Type: application/ld+json' -X POST $SERVER_URL/api/dataverses/$DATAVERSE_ID/datasets --upload-file dataset-create.jsonld

An example jsonld file is available at :download:`dataset-create.jsonld <../_static/api/dataset-create.jsonld>`
An example jsonld file is available at :download:`dataset-create.jsonld <../_static/api/dataset-create.jsonld>`. (:download:`dataset-create_en.jsonld <../_static/api/dataset-create_en.jsonld>` is a version that sets the metadata language (see :ref:`:MetadataLanguages`) to English (en).)
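
The curl call above can be mirrored from Java. What follows is a minimal sketch, assuming Java 11+ and the standard `java.net.http` client. The class `JsonLdCreateRequest`, its `build` method, and the server URL, collection identifier, and token passed to it are placeholders for illustration and not part of Dataverse. The sketch only builds the request; sending it would additionally need `HttpClient.send`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper (not part of Dataverse): mirrors the curl command above.
final class JsonLdCreateRequest {

    // serverUrl, collectionId and apiToken are caller-supplied placeholders.
    static HttpRequest build(String serverUrl, String collectionId, String apiToken, String jsonLdBody) {
        URI target = URI.create(serverUrl + "/api/dataverses/" + collectionId + "/datasets");
        return HttpRequest.newBuilder(target)
                // Same two headers as the curl example.
                .header("X-Dataverse-key", apiToken)
                .header("Content-Type", "application/ld+json")
                // The JSON-LD document, e.g. the content of dataset-create_en.jsonld.
                .POST(HttpRequest.BodyPublishers.ofString(jsonLdBody))
                .build();
    }
}
```

A body that carries `"http://schema.org/inLanguage": "en"` is what requests English as the metadata language; whether that value is accepted depends on the server-side check added in this PR.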

3 changes: 2 additions & 1 deletion doc/sphinx-guides/source/installation/config.rst
@@ -730,7 +730,8 @@ Allowing the Language Used for Dataset Metadata to be Specified
Since dataset metadata can only be entered in one language, and administrators may wish to limit which languages metadata can be entered in, Dataverse also offers a separate setting defining allowed metadata languages.
The presence of the :ref:`:MetadataLanguages` database setting identifies the available options (which can be different from those in the :Languages setting above, with fewer or more options).

Dataverse collection admins can select from these options to indicate which language should be used for new Datasets created with that specific collection. If they do not, users will be asked when creating a dataset to select the language they want to use when entering metadata.
Dataverse collection admins can select from these options to indicate which language should be used for new Datasets created with that specific collection. If they do not, users will be asked when creating a dataset to select the language they want to use when entering metadata.
Similarly, when this setting is defined, Datasets that are created, imported, or migrated are required to specify a metadataLanguage that is compatible with the collection's requirement.

When creating or editing a dataset, users will be asked to enter the metadata in that language. The metadata language selected will also be shown when dataset metadata is viewed and will be included in metadata exports (as appropriate for each format) for published datasets:

78 changes: 78 additions & 0 deletions scripts/api/data/dataset-finch1_fr.json
@@ -0,0 +1,78 @@
{
"metadataLanguage": "fr",
"datasetVersion": {
"metadataBlocks": {
"citation": {
"fields": [
{
"value": "Darwin's Finches",
"typeClass": "primitive",
"multiple": false,
"typeName": "title"
},
{
"value": [
{
"authorName": {
"value": "Finch, Fiona",
"typeClass": "primitive",
"multiple": false,
"typeName": "authorName"
},
"authorAffiliation": {
"value": "Birds Inc.",
"typeClass": "primitive",
"multiple": false,
"typeName": "authorAffiliation"
}
}
],
"typeClass": "compound",
"multiple": true,
"typeName": "author"
},
{
"value": [
{ "datasetContactEmail" : {
"typeClass": "primitive",
"multiple": false,
"typeName": "datasetContactEmail",
"value" : "finch@mailinator.com"
},
"datasetContactName" : {
"typeClass": "primitive",
"multiple": false,
"typeName": "datasetContactName",
"value": "Finch, Fiona"
}
}],
"typeClass": "compound",
"multiple": true,
"typeName": "datasetContact"
},
{
"value": [ {
"dsDescriptionValue":{
"value": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
"multiple":false,
"typeClass": "primitive",
"typeName": "dsDescriptionValue"
}}],
"typeClass": "compound",
"multiple": true,
"typeName": "dsDescription"
},
{
"value": [
"Medicine, Health and Life Sciences"
],
"typeClass": "controlledVocabulary",
"multiple": true,
"typeName": "subject"
}
],
"displayName": "Citation Metadata"
}
}
}
}
19 changes: 18 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java
@@ -11,6 +11,7 @@
import edu.harvard.iq.dataverse.api.datadeposit.SwordServiceBean;
import edu.harvard.iq.dataverse.authorization.DataverseRole;
import edu.harvard.iq.dataverse.DvObject;
import edu.harvard.iq.dataverse.DvObjectContainer;
import edu.harvard.iq.dataverse.GlobalId;
import edu.harvard.iq.dataverse.GuestbookResponseServiceBean;
import edu.harvard.iq.dataverse.GuestbookServiceBean;
@@ -29,6 +30,7 @@
import edu.harvard.iq.dataverse.authorization.groups.impl.explicit.ExplicitGroupServiceBean;
import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser;
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.dataverse.DataverseUtil;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.engine.command.impl.AddRoleAssigneesToExplicitGroupCommand;
import edu.harvard.iq.dataverse.engine.command.impl.AssignRoleCommand;
@@ -67,6 +69,7 @@
import static edu.harvard.iq.dataverse.util.StringUtil.nonEmpty;

import edu.harvard.iq.dataverse.util.json.JSONLDUtil;
import edu.harvard.iq.dataverse.util.json.JsonLDTerm;
import edu.harvard.iq.dataverse.util.json.JsonParseException;
import static edu.harvard.iq.dataverse.util.json.JsonPrinter.brief;
import java.io.StringReader;
@@ -111,6 +114,7 @@
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import javax.servlet.http.HttpServletResponse;
@@ -223,6 +227,7 @@ public Response addDataverse(String body, @PathParam("identifier") String parent
@Consumes("application/json")
public Response createDataset(String jsonBody, @PathParam("identifier") String parentIdtf) {
try {
logger.fine("Json is: " + jsonBody);
User u = findUserOrDie();
Dataverse owner = findDataverseOrDie(parentIdtf);
Dataset ds = parseDataset(jsonBody);
@@ -236,6 +241,9 @@ public Response createDataset(String jsonBody, @PathParam("identifier") String p
return badRequest(BundleUtil.getStringFromBundle("dataverses.api.create.dataset.error.superuserFiles"));
}

//Throw BadRequestException if metadataLanguage isn't compatible with setting
DataverseUtil.checkMetadataLangauge(ds, owner, settingsService.getBaseMetadataLanguageMap(null, true));

// clean possible version metadata
DatasetVersion version = ds.getVersions().get(0);
version.setMinorVersionNumber(null);
@@ -304,6 +312,9 @@ public Response createDatasetFromJsonLd(String jsonLDBody, @PathParam("identifie
ds.setIdentifier(null);
ds.setProtocol(null);
ds.setGlobalIdCreateTime(null);

//Throw BadRequestException if metadataLanguage isn't compatible with setting
DataverseUtil.checkMetadataLangauge(ds, owner, settingsService.getBaseMetadataLanguageMap(null, true));

Dataset managedDs = execCommand(new CreateNewDatasetCommand(ds, createDataverseRequest(u)));
return created("/datasets/" + managedDs.getId(),
@@ -333,6 +344,9 @@ public Response importDataset(String jsonBody, @PathParam("identifier") String p
return badRequest("Supplied json must contain a single dataset version.");
}

//Throw BadRequestException if metadataLanguage isn't compatible with setting
DataverseUtil.checkMetadataLangauge(ds, owner, settingsService.getBaseMetadataLanguageMap(null, true));

DatasetVersion version = ds.getVersions().get(0);
if (version.getVersionState() == null) {
version.setVersionState(DatasetVersion.VersionState.DRAFT);
@@ -400,6 +414,7 @@ public Response importDatasetDdi(String xml, @PathParam("identifier") String par
Dataset ds = null;
try {
ds = jsonParser().parseDataset(importService.ddiToJson(xml));
DataverseUtil.checkMetadataLangauge(ds, owner, settingsService.getBaseMetadataLanguageMap(null, true));
} catch (JsonParseException jpe) {
return badRequest("Error parsing data as Json: "+jpe.getMessage());
} catch (ImportException e) {
@@ -486,8 +501,10 @@ public Response recreateDataset(String jsonLDBody, @PathParam("identifier") Stri
if(!datasetSvc.isIdentifierLocallyUnique(ds)) {
throw new BadRequestException("Cannot recreate a dataset whose PID is already in use");
}


//Throw BadRequestException if metadataLanguage isn't compatible with setting
DataverseUtil.checkMetadataLangauge(ds, owner, settingsService.getBaseMetadataLanguageMap(null, true));


if (ds.getVersions().isEmpty()) {
return badRequest("Supplied json must contain a single dataset version.");
@@ -99,6 +99,10 @@ public String toString() {
return "DatasetDTO{" + "id=" + id + ", identifier=" + identifier + ", protocol=" + protocol + ", authority=" + authority + ", globalIdCreateTime=" + globalIdCreateTime + ", datasetVersion=" + datasetVersion + ", dataFiles=" + dataFiles + '}';
}

public void setMetadataLanguage(String metadataLanguage) {
this.metadataLanguage = metadataLanguage;
}

public String getMetadataLanguage() {
return metadataLanguage;
}
@@ -182,6 +182,11 @@ private void processDDI(ImportType importType, XMLStreamReader xmlr, DatasetDTO
throw new XMLStreamException("It doesn't start with the XML element <codeBook>");
}

//Include metadataLanguage from an xml:lang attribute if present (null==undefined)
String metadataLanguage = xmlr.getAttributeValue("http://www.w3.org/XML/1998/namespace", "lang");
logger.fine("Found metadatalanguage in ddi xml: " + metadataLanguage);
datasetDTO.setMetadataLanguage(metadataLanguage);

// Some DDIs provide an ID in the <codeBook> section.
// We are going to treat it as just another otherId.
// (we've seen instances where this ID was the only ID found in
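
The `xml:lang` lookup added in this hunk can be tried in isolation. Below is a minimal sketch, assuming the JDK's built-in StAX implementation with its default namespace-aware configuration. The class `XmlLangSketch` and the method `readRootLang` are hypothetical names used only for illustration; only the `getAttributeValue` call with the XML namespace URI comes from the diff.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical stand-alone sketch: reads xml:lang from the first (root) element.
final class XmlLangSketch {

    // Returns the xml:lang value of the root element, or null when it is absent.
    static String readRootLang(String xml) {
        try {
            XMLStreamReader xmlr = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (xmlr.hasNext()) {
                    if (xmlr.next() == XMLStreamConstants.START_ELEMENT) {
                        // XML_NS_URI is "http://www.w3.org/XML/1998/namespace",
                        // the same namespace URI used in the hunk above.
                        return xmlr.getAttributeValue(XMLConstants.XML_NS_URI, "lang");
                    }
                }
                return null;
            } finally {
                xmlr.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so that callers of this sketch need no checked exception handling.
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }
}
```

Looking the attribute up by namespace URI rather than by the literal name `xml:lang` matches the namespace-aware way the reader exposes attributes.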
79 changes: 57 additions & 22 deletions src/main/java/edu/harvard/iq/dataverse/dataverse/DataverseUtil.java
@@ -1,73 +1,108 @@
package edu.harvard.iq.dataverse.dataverse;

import edu.harvard.iq.dataverse.Dataset;
import edu.harvard.iq.dataverse.Dataverse;
import edu.harvard.iq.dataverse.DvObjectContainer;
import edu.harvard.iq.dataverse.authorization.groups.impl.ipaddress.ip.IpAddress;
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.util.BundleUtil;
import edu.harvard.iq.dataverse.util.json.JsonLDTerm;

import static edu.harvard.iq.dataverse.util.json.JsonPrinter.json;
import java.io.File;
import java.io.IOException;
import java.math.BigDecimal;

import java.util.Map;
import java.util.logging.Logger;

import javax.ws.rs.BadRequestException;

import opennlp.tools.util.StringUtil;
import org.apache.commons.io.FileUtils;

public class DataverseUtil {

private static final Logger logger = Logger.getLogger(DataverseUtil.class.getCanonicalName());

public static String getSuggestedDataverseNameOnCreate(User user) {
if (user == null) {
return null;
}
// getDisplayInfo() is never null.
return user.getDisplayInfo().getTitle() + " " + BundleUtil.getStringFromBundle("dataverse");
}

public static boolean validateDataverseMetadataExternally(Dataverse dv, String executable, DataverseRequest request) {
String jsonMetadata;

String sourceAddressLabel = "0.0.0.0";


public static boolean validateDataverseMetadataExternally(Dataverse dv, String executable,
DataverseRequest request) {
String jsonMetadata;

String sourceAddressLabel = "0.0.0.0";

if (request != null) {
IpAddress sourceAddress = request.getSourceAddress();
if (sourceAddress != null) {
sourceAddressLabel = sourceAddress.toString();
}
}

try {
jsonMetadata = json(dv).add("sourceAddress", sourceAddressLabel).build().toString();
} catch (Exception ex) {
logger.warning("Failed to export dataverse metadata as json; "+ex.getMessage() == null ? "" : ex.getMessage());
return false;
logger.warning(
"Failed to export dataverse metadata as json; " + (ex.getMessage() == null ? "" : ex.getMessage()));
return false;
}

if (StringUtil.isEmpty(jsonMetadata)) {
logger.warning("Failed to export dataverse metadata as json.");
return false;
return false;
}
// save the metadata in a temp file:

// save the metadata in a temp file:

try {
File tempFile = File.createTempFile("dataverseMetadataCheck", ".tmp");
FileUtils.writeStringToFile(tempFile, jsonMetadata);
// run the external executable:

// run the external executable:
String[] params = { executable, tempFile.getAbsolutePath() };
Process p = Runtime.getRuntime().exec(params);
p.waitFor();

return p.exitValue() == 0;

} catch (IOException | InterruptedException ex) {
logger.warning("Failed to run the external executable.");
return false;
return false;
}

}

public static void checkMetadataLangauge(Dataset ds, Dataverse owner, Map<String, String> mLangMap) {
// Verify metadatalanguage is allowed
logger.fine("Dataset mdl: " + ds.getMetadataLanguage());
logger.fine("Owner mdl: " + owner.getMetadataLanguage());
logger.fine("Map langs: " + mLangMap.toString());

// :MetadataLanguages setting is not set
// Must send UNDEFINED or match parent
if (mLangMap.isEmpty()) {
if (!(ds.getMetadataLanguage().equals(DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE)
|| ds.getMetadataLanguage().equals(owner.getMetadataLanguage()))) {
throw new BadRequestException("This repository is not configured to support metadataLanguage.");
}
} else {
// When :MetadataLanguages is set, the specified language must either match the
// parent collection choice, or, if that is undefined, be one of the choices
// allowed by the setting
if (!((ds.getMetadataLanguage().equals(owner.getMetadataLanguage())
&& !owner.getMetadataLanguage().equals(DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE))
|| (owner.getMetadataLanguage().equals(DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE)
&& (mLangMap.containsKey(ds.getMetadataLanguage()))))) {
throw new BadRequestException("Specified metadatalanguage ( metadataLanguage, "
+ JsonLDTerm.schemaOrg("inLanguage").getUrl() + ") not allowed in this collection.");
}
}

}

}
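
The nested condition in `checkMetadataLangauge` can be easier to follow when restated over plain strings. The following is a simplified sketch, not the Dataverse implementation: the class `MetadataLanguageRule`, the method `isAllowed`, and the `"undefined"` value standing in for `DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE` are assumptions made for illustration, since the constant's actual value is not shown in this diff.

```java
import java.util.Map;

// Hypothetical, simplified restatement of the check above using plain strings.
final class MetadataLanguageRule {

    // Assumed stand-in for DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE.
    static final String UNDEFINED = "undefined";

    static boolean isAllowed(String datasetLang, String ownerLang, Map<String, String> allowedLangs) {
        if (allowedLangs.isEmpty()) {
            // Setting not configured: the dataset must send UNDEFINED or match its parent.
            return datasetLang.equals(UNDEFINED) || datasetLang.equals(ownerLang);
        }
        if (!ownerLang.equals(UNDEFINED)) {
            // The collection has chosen a language: the dataset must use exactly that one.
            return datasetLang.equals(ownerLang);
        }
        // The collection left the choice open: any language from the setting is fine.
        return allowedLangs.containsKey(datasetLang);
    }
}
```

One consequence worth noting: once the setting lists languages and the collection has made no choice, a dataset that sends the undefined code is rejected unless that code is itself one of the map's keys.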
@@ -8,6 +8,7 @@
import edu.harvard.iq.dataverse.DatasetFieldServiceBean;
import edu.harvard.iq.dataverse.DatasetFieldType;
import edu.harvard.iq.dataverse.DatasetVersion;
import edu.harvard.iq.dataverse.DvObjectContainer;
import edu.harvard.iq.dataverse.FileMetadata;
import edu.harvard.iq.dataverse.TermsOfUseAndAccess;
import edu.harvard.iq.dataverse.branding.BrandingUtil;
@@ -214,7 +215,11 @@ public JsonObjectBuilder getOREMapBuilder(boolean aggregationOnly) throws Except

aggBuilder.add(JsonLDTerm.schemaOrg("includedInDataCatalog").getLabel(),
BrandingUtil.getRootDataverseCollectionName());

String mdl = dataset.getMetadataLanguage();
if(!mdl.equals(DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE)) {
aggBuilder.add(JsonLDTerm.schemaOrg("inLanguage").getLabel(), mdl);
}

// The aggregation aggregates aggregatedresources (Datafiles) which each have
// their own entry and metadata
JsonArrayBuilder aggResArrayBuilder = Json.createArrayBuilder();
@@ -89,6 +89,9 @@ public static Dataset updateDatasetMDFromJsonLD(Dataset ds, String jsonLDBody,
+ "'. Make sure it is in valid form - see Dataverse Native API documentation.");
}
}

//Store the metadatalanguage if sent - the caller needs to check whether it is allowed (as with any GlobalID)
ds.setMetadataLanguage(jsonld.getString(JsonLDTerm.schemaOrg("inLanguage").getUrl(),null));

dsv = updateDatasetVersionMDFromJsonLD(dsv, jsonld, metadataBlockSvc, datasetFieldSvc, append, migrating, licenseSvc);
dsv.setDataset(ds);
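
The import side above and the OREMap export change earlier in this diff are two halves of one round trip: the language is read if it was sent, and written only if it is defined. Below is a minimal sketch of that pairing using plain `java.util.Map` objects instead of the `javax.json` types the real code uses. The class `InLanguageSketch`, both method names, and the `"undefined"` marker are assumptions for illustration. The real export also keys the value by the term's label rather than its full URL; the URL is used here only to keep the sketch self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the inLanguage round trip, using plain maps.
final class InLanguageSketch {

    static final String IN_LANGUAGE = "http://schema.org/inLanguage";
    // Assumed stand-in for DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE.
    static final String UNDEFINED = "undefined";

    // Import side: return the language if one was sent as a string, otherwise null.
    static String readLanguage(Map<String, Object> jsonLd) {
        Object value = jsonLd.get(IN_LANGUAGE);
        return (value instanceof String) ? (String) value : null;
    }

    // Export side: emit the term only when a language is actually defined.
    static Map<String, Object> writeLanguage(String metadataLanguage) {
        Map<String, Object> out = new LinkedHashMap<>();
        if (metadataLanguage != null && !metadataLanguage.equals(UNDEFINED)) {
            out.put(IN_LANGUAGE, metadataLanguage);
        }
        return out;
    }
}
```

As the comment in the hunk says, reading the value does not validate it; the caller is still expected to run the compatibility check before creating the dataset.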