Language missing in the OAI-DDI #7388

@bappun

Most of our datasets are described as "French" datasets in the metadata. For example: https://data.sciencespo.fr/dataset.xhtml?persistentId=doi:10.21410/7E4/00LYOG (the detailed metadata are embedded in the dataset there).

We imported a DDI file in which the French language is set both at the file level (xml:lang on codeBook) and at the study level (xml:lang on titl):

<?xml version="1.0" encoding="UTF-8"?>
<codeBook xml:lang="fr" xsi:schemaLocation="ddi:codebook:2_5 https://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" version="2.5" ID="fr.cdsp.ddi.OV70"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xmlns="ddi:codebook:2_5">
<stdyDscr>
	<citation>
		<titlStmt>
			<titl xml:lang="fr">L'ouvrier français en 1970</titl>
			<parTitl xml:lang="en">The French Working Class in 1970</parTitl>
			<IDNo agency="CDSP">fr.cdsp.ddi.OV70</IDNo>
			<IDNo agency="DataCite">doi:10.21410/7E4/00LYOG</IDNo>
		</titlStmt>

However, this information is lost at the study level, both when harvesting in oai-ddi and when downloading the metadata as DDI. For example:

<OAI-PMH
	xmlns="http://www.openarchives.org/OAI/2.0/"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
	<responseDate>2020-11-04T13:29:10Z</responseDate>
	<request verb="ListRecords" metadataPrefix="oai_ddi" set="CDSP">https://data.sciencespo.fr/oai</request>
	<ListRecords>
		<record>
			<header>
				<identifier>doi:10.21410/7E4/00LYOG</identifier>
				<datestamp>2020-08-22T00:00:07Z</datestamp>
				<setSpec>CDSP</setSpec>
				<setSpec>ALL_SCPO</setSpec>
			</header>
			<metadata>
				<codeBook
					xmlns="ddi:codebook:2_5"
					xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:codebook:2_5 https://ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" version="2.5">
					<docDscr>
						<citation>
							<titlStmt>
								<titl>L'ouvrier français en 1970</titl>
								<IDNo agency="DOI">doi:10.21410/7E4/00LYOG</IDNo>
							</titlStmt>
							<distStmt>
								<distrbtr source="archive">data.sciencespo</distrbtr>
								<distDate>2020-05-05</distDate>
							</distStmt>
							<verStmt source="DVN">
								<version date="2020-05-13" type="RELEASED">2</version>
							</verStmt>
							<biblCit>Adam, Gérard; Bon, Frédéric; Capdevielle, Jacques; Mouriaux, René; Lavau, Georges, 2020, "L'ouvrier français en 1970", https://doi.org/10.21410/7E4/00LYOG, data.sciencespo, V2</biblCit>
						</citation>
					</docDscr>

However, the language is kept in oai-dc harvesting: <dcterms:language>French</dcterms:language>

These may be two different issues: one about the import, which does not save the xml:lang attribute, and another about the dataset language, which is not added to the OAI-DDI output.
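
To check whether a given DDI document still carries the language, something along these lines could be used. This is only a sketch, not part of Dataverse: the function name ddi_languages and the two trimmed sample strings are made up for illustration, based on the excerpts above. It relies on Python's standard xml.etree.ElementTree, which exposes xml:lang under the predefined XML namespace.

```python
import xml.etree.ElementTree as ET

DDI_NS = "ddi:codebook:2_5"
# ElementTree reports the xml:lang attribute under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def ddi_languages(ddi_xml):
    """Return the xml:lang found on codeBook and on the first titl (None if absent).

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(ddi_xml)
    code_book_tag = f"{{{DDI_NS}}}codeBook"
    # codeBook may be the root itself or nested inside an OAI-PMH envelope.
    code_book = root if root.tag == code_book_tag else root.find(f".//{code_book_tag}")
    if code_book is None:
        raise ValueError("no DDI codeBook element found")
    title = code_book.find(f".//{{{DDI_NS}}}titl")
    return {
        "codeBook": code_book.get(XML_LANG),
        "titl": title.get(XML_LANG) if title is not None else None,
    }


# Trimmed-down versions of the imported and harvested documents quoted above.
IMPORTED = (
    '<codeBook xmlns="ddi:codebook:2_5" xml:lang="fr" version="2.5">'
    "<stdyDscr><citation><titlStmt>"
    '<titl xml:lang="fr">L\'ouvrier français en 1970</titl>'
    "</titlStmt></citation></stdyDscr></codeBook>"
)
EXPORTED = (
    '<codeBook xmlns="ddi:codebook:2_5" version="2.5">'
    "<docDscr><citation><titlStmt>"
    "<titl>L'ouvrier français en 1970</titl>"
    "</titlStmt></citation></docDscr></codeBook>"
)

print(ddi_languages(IMPORTED))
print(ddi_languages(EXPORTED))
```

Running the same check against the full oai-ddi response and the downloaded DDI file would show in which of the two paths the attribute disappears.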
