diff --git a/conf/solr/4.6.0/schema.xml b/conf/solr/4.6.0/schema.xml index b4fee380844..2577c3b27f4 100644 --- a/conf/solr/4.6.0/schema.xml +++ b/conf/solr/4.6.0/schema.xml @@ -298,22 +298,22 @@ - + - - - + + + - - - + + + @@ -336,7 +336,7 @@ - + @@ -350,7 +350,7 @@ - + @@ -370,9 +370,9 @@ - - - + + + diff --git a/doc/Sphinx/source/User/account.rst b/doc/Sphinx/source/User/account.rst index 257f3212f01..e0d1222e9b7 100644 --- a/doc/Sphinx/source/User/account.rst +++ b/doc/Sphinx/source/User/account.rst @@ -15,8 +15,8 @@ Create User Account Edit Your Account ================== -#. To edit your account, click on your account name in the header on the right hand side. -#. On the right of your account information, click on the "Edit Account" button and from there you can select to edit either your Account Information or your Account Password. +#. To edit your account, click on your account name in the header on the right hand side and click on either Notifications or Data Related to Me. +#. On the top right of your account page, click on the "Edit Account" button and from there you can select to edit either your Account Information or your Account Password. #. Select "Save Changes" when you are done. Notifications: Setup & Maintainance @@ -25,7 +25,8 @@ Notifications appear in the notifications tab on your account page and are also You will receive a notification when: -- You've published a dataverse or dataset -- When another user has published a dataverse or dataset within your dataverse (only if your dataverse allows this)- A user has requested access to restricted dataverses, datasets, or files within your dataverse +- You've created your account +- You've created a dataverse or added a dataset +- More notifications to come! Dataverse will email your unread notifications once a day. Notifications will only be emailed one time even if you haven't read the notification on the Dataverse site. diff --git a/doc/Sphinx/source/User/dataset-management.rst b/doc/Sphinx/source/User/dataset-management.rst index 93737478ab6..bec1b74e65a 100644 --- a/doc/Sphinx/source/User/dataset-management.rst +++ b/doc/Sphinx/source/User/dataset-management.rst @@ -1,43 +1,47 @@ Dataset & File Management +++++++++++++++++++++++++++++ -A Dataset in Dataverse is a container for your data, documentation, code, and the metadata describing this Dataset. +A dataset in Dataverse is a container for your data, documentation, code, and the metadata describing this Dataset. |image1| -Add Dataset +New Dataset ==================== -#. Navigate to the dataverse in which you want to add a Dataset (or in the "root" dataverse). -#. Click on the "Add Data" button and select "Add Dataset" in the dropdown menu. +#. Navigate to the dataverse in which you want to add a dataset (or in the "root" dataverse). +#. Click on the "Add Data" button and select "New Dataset" in the dropdown menu. #. To quickly get started, enter at minimum all the required fields with an asterisk (e.g., the Dataset Title, Author, Description, etc) #. Scroll down to "Files" tab and click on "Select Files to Add" to add all the relevant files to your Dataset. You can also upload your files directly from your Dropbox. **Tip:** You can drag and drop your files from your desktop, directly into the upload widget. Your file will appear below the "Select Files to Add" button where you can add a description of the file. -#. Click the "Create Dataset" button when you are done. Your unpublished Dataset is now created. +#. Click the "Add Dataset" button when you are done. 
Your unpublished dataset is now created.
 
-Note: You can add additional metadata once you have completed the initial Dataset creation by going to Edit Dataset.
+Note: You can add additional metadata once you have completed the initial dataset creation by going to Edit Dataset.
 
 Edit Dataset
 ==================
 
-Go to your Dataset page and click on the "Edit Dataset" button. There you will have two options where you can either edit:
+Go to your dataset page and click on the "Edit Dataset" button. There you will have two options where you can either edit:
 
-- Files (Upload or Edit Data): to add or edit files in this Dataset.
+- Files (Upload or Edit Data): to add or edit files in this dataset.
 - Metadata: to add/edit metadata, including additional metadata that was not previously available during Dataset Creation.
 
 Publish Dataset
 ====================
 
-When you publish a Dataset, you make it available to the public. Users can
-browse it or search for it. Once your Dataset is ready to go public, go to your Dataset page,
-click on the "Unpublished" button on the right hand side of the page which should indicate:
-"This Dataset is unpublished. To publish it click 'Publish Dataset' link."
+When you publish a dataset, you make it available to the public so that other users can
+browse or search for it. Once your dataset is ready to go public, go to your dataset page and click on the
+"Publish" button on the right hand side of the page. A pop-up will appear to confirm that you are ready to
+publish, since once a dataset is made public it can no longer be unpublished.
+
+Whenever you edit your dataset, you are able to publish a new version of it. The "Publish" button will reappear whenever you edit the metadata of the dataset or add a file.
+
+Note: Prior to publishing your dataset, the Data Citation will indicate that it is a draft, but the "DRAFT VERSION" text
+will be removed as soon as you publish.
 
-Important Note: Once a Dataset is published it **cannot be unpublished**; it can be archived instead.
 
 .. |image1| image:: ./img/DatasetDiagram.png
diff --git a/doc/Sphinx/source/User/dataverse-management.rst b/doc/Sphinx/source/User/dataverse-management.rst
index d98872e71a5..c920d5f54ab 100644
--- a/doc/Sphinx/source/User/dataverse-management.rst
+++ b/doc/Sphinx/source/User/dataverse-management.rst
@@ -44,12 +44,9 @@ where you will be presented with the following editing options.
 
 Publish Your Dataverse
 =================================================================
 
-Once your dataverse is ready to go public, go to your dataverse page, click on the "Unpublished" button on the right
-hand side of the page which should indicate:
-"This dataverse is Unpublished. To publish it click 'Publish dataverse' link." Once you click "Publish dataverse" it
-will be made public.
-
-**Important Note**: Once a dataverse is made public it can no longer be unpublished.
+Once your dataverse is ready to go public, go to your dataverse page, click on the "Publish" button on the right
+hand side of the page. A pop-up will appear to confirm that you are ready to publish, since once a dataverse
+is made public it can no longer be unpublished.
 
 ..
|image1| image:: ./img/Dataverse-Diagram.png diff --git a/doc/Sphinx/source/User/find-use-data.rst b/doc/Sphinx/source/User/find-use-data.rst index 50885016302..096dc5d9fb2 100644 --- a/doc/Sphinx/source/User/find-use-data.rst +++ b/doc/Sphinx/source/User/find-use-data.rst @@ -40,6 +40,12 @@ To perform an advanced search, click the Advanced Search link next to the search **Advanced Search fields** +*Dataverses:* + +- Name - The project, department, university, or professor this Dataverse will contain data for. +- Affiliation - The organization with which this Dataverse is affiliated. +- Description - A summary describing the purpose, nature, or scope of this Dataverse. + *Citation Metadata:* - Title - Full title by which the dataset is known. @@ -84,6 +90,11 @@ To perform an advanced search, click the Advanced Search link next to the search - Organism - The taxonomic name of the organism used in a study or from which the starting biological material derives. - Cell Type - The name of the cell line from which the source or sample derives. +*Files:* + +- Name - Full name by which the file is known. +- Description - A summary describing the file, variables, or type. +- File Type - The extension for the file, for example, JPEG, PNG, dta, etc Browsing Dataverse -------------------- diff --git a/scripts/api/data/metadatablocks/astrophysics.tsv b/scripts/api/data/metadatablocks/astrophysics.tsv index 0e64f4fc8cc..79fde5ab4b6 100644 --- a/scripts/api/data/metadatablocks/astrophysics.tsv +++ b/scripts/api/data/metadatablocks/astrophysics.tsv @@ -7,16 +7,16 @@ astroObject Object Astronomical Objects represented in the data (Given as SIMBAD recognizable names preferred). text 3 TRUE FALSE TRUE TRUE FALSE FALSE astrophysics resolution.Spatial Spatial Resolution The spatial (angular) resolution that is typical of the observations, in decimal degrees. float 4 TRUE FALSE TRUE TRUE FALSE FALSE astrophysics resolution.Spectral Spectral Resolution The spectral resolution that is typical of the observations, given as the ratio λ/Δλ. float 5 TRUE FALSE TRUE TRUE FALSE FALSE astrophysics - resolution.Temporal Time Resolution The temporal resolution that is typical of the observations, given in seconds. float 6 FALSE FALSE FALSE FALSE FALSE FALSE astrophysics + resolution.Temporal Time Resolution The temporal resolution that is typical of the observations, given in seconds. float 6 FALSE FALSE TRUE FALSE FALSE FALSE astrophysics coverage.Spectral.Bandpass Bandpass Conventional bandpass name text 7 TRUE TRUE TRUE TRUE FALSE FALSE astrophysics coverage.Spectral.CentralWavelength Central Wavelength (m) The central wavelength of the spectral bandpass, in meters. float 8 TRUE FALSE TRUE TRUE FALSE FALSE astrophysics coverage.Spectral.Wavelength Wavelength Range The minimum and maximum wavelength of the spectral bandpass. float 9 FALSE FALSE TRUE FALSE FALSE FALSE astrophysics coverage.Spectral.MinimumWavelength Minimum (m) The minimum wavelength of the spectral bandpass, in meters. float 10 TRUE FALSE FALSE TRUE FALSE FALSE coverage.Spectral.Wavelength astrophysics coverage.Spectral.MaximumWavelength Maximum (m) The maximum wavelength of the spectral bandpass, in meters. float 11 TRUE FALSE FALSE TRUE FALSE FALSE coverage.Spectral.Wavelength astrophysics - coverage.Temporal Dataset Date Range Time period covered by the data. 
date 12 TRUE FALSE TRUE FALSE FALSE FALSE astrophysics - coverage.Temporal.StartTime Start Dataset Start Date YYYY-MM-DD date 13 FALSE FALSE FALSE TRUE FALSE FALSE coverage.Temporal astrophysics - coverage.Temporal.StopTime End Dataset End Date YYYY-MM-DD date 14 FALSE FALSE FALSE TRUE FALSE FALSE coverage.Temporal astrophysics - coverage.Spatial Sky Coverage The sky coverage of the data object. text 15 FALSE FALSE FALSE FALSE FALSE FALSE astrophysics + coverage.Temporal Dataset Date Range Time period covered by the data. date 12 FALSE FALSE TRUE FALSE FALSE FALSE astrophysics + coverage.Temporal.StartTime Start Dataset Start Date YYYY-MM-DD date 13 TRUE FALSE FALSE TRUE FALSE FALSE coverage.Temporal astrophysics + coverage.Temporal.StopTime End Dataset End Date YYYY-MM-DD date 14 TRUE FALSE FALSE TRUE FALSE FALSE coverage.Temporal astrophysics + coverage.Spatial Sky Coverage The sky coverage of the data object. text 15 FALSE FALSE TRUE FALSE FALSE FALSE astrophysics coverage.Depth Depth Coverage The (typical) depth coverage, or sensitivity, of the data object in Jy. float 16 FALSE FALSE FALSE FALSE FALSE FALSE astrophysics coverage.ObjectDensity Object Density The (typical) density of objects, catalog entries, telescope pointings, etc., on the sky, in number per square degree. float 17 FALSE FALSE FALSE FALSE FALSE FALSE astrophysics coverage.ObjectCount Object Count The total number of objects, catalog entries, etc., in the data object. int 18 FALSE FALSE FALSE FALSE FALSE FALSE astrophysics @@ -51,4 +51,4 @@ astroType Object 20 astroType Value 21 astroType ValuePair 22 - astroType Survey 23 + astroType Survey 23 \ No newline at end of file diff --git a/scripts/api/data/metadatablocks/citation.tsv b/scripts/api/data/metadatablocks/citation.tsv index 652f8c155cc..d4cbbc6dc81 100644 --- a/scripts/api/data/metadatablocks/citation.tsv +++ b/scripts/api/data/metadatablocks/citation.tsv @@ -5,7 +5,7 @@ author Author The person(s), corporate body(ies), or agency(ies) responsible for creating the work. text 1 FALSE FALSE TRUE FALSE TRUE FALSE citation authorName Name The author's Family Name, Given Name or the name of the organization responsible for this dataset. FamilyName, GivenName or Organization text 2 TRUE FALSE FALSE TRUE TRUE TRUE author citation authorAffiliation Affiliation The organization with which the author is affiliated. text 3 TRUE TRUE FALSE TRUE TRUE FALSE author citation - distributorContact Contact E-mail The e-mail address(es) of the contact(s) for the Dataset. This will not be displayed. text 4 FALSE FALSE TRUE FALSE TRUE TRUE citation + distributorContact Contact E-mail The e-mail address(es) of the contact(s) for the Dataset. This will not be displayed. email 4 FALSE FALSE TRUE FALSE TRUE TRUE citation dsDescription Description A summary describing the purpose, nature, and scope of the Dataset. textbox 5 TRUE FALSE FALSE FALSE TRUE TRUE citation keyword Keyword Key terms that describe important aspects of the Dataset. text 6 TRUE FALSE TRUE TRUE TRUE FALSE citation subject Subject Domain-specific Subjects that are topically relevant to the Dataset. 
text 7 TRUE TRUE TRUE TRUE TRUE TRUE citation
@@ -82,4 +82,4 @@
 contributorType Sponsor 14
 contributorType Supervisor 15
 contributorType Work Package Leader 16
- contributorType Other 17
+ contributorType Other 17
\ No newline at end of file
diff --git a/scripts/api/setup-dvs.sh b/scripts/api/setup-dvs.sh
index 6afa21d80a5..4294813eccc 100755
--- a/scripts/api/setup-dvs.sh
+++ b/scripts/api/setup-dvs.sh
@@ -30,4 +30,6 @@ curl -H "Content-type:application/json" -X POST -d @data/role-guest.json "http:/
 curl -H "Content-type:application/json" -X POST -d"{\"userName\":\"__GUEST__\",\"roleAlias\":\"guest-role\"}" http://localhost:8080/api/dvs/root/assignments/?key=pete
 echo Set the metadata block for Root
-curl -X POST -H "Content-type:application/json" -d "[\"citation\"]" http://localhost:8080/api/dvs/:root/metadatablocks/?key=pete
\ No newline at end of file
+curl -X POST -H "Content-type:application/json" -d "[\"citation\"]" http://localhost:8080/api/dvs/:root/metadatablocks/?key=pete
+
+
diff --git a/scripts/database/homebrew/rebuild-and-test b/scripts/database/homebrew/rebuild-and-test
index 6eb8f4ba013..57d97b07469 100755
--- a/scripts/database/homebrew/rebuild-and-test
+++ b/scripts/database/homebrew/rebuild-and-test
@@ -12,3 +12,4 @@ sleep 15
 scripts/database/homebrew/run-post-create-post-deploy
 scripts/search/tests/permissions
 scripts/search/tests/delete-dataverse
+scripts/search/tests/query-unparseable
diff --git a/scripts/search/tests/dataset-versioning02 b/scripts/search/tests/dataset-versioning02
index 4a3e856f932..5a5e826bf44 100755
--- a/scripts/search/tests/dataset-versioning02
+++ b/scripts/search/tests/dataset-versioning02
@@ -7,6 +7,21 @@
 #
 # To do this, you must also release the Trees and Root dataverses
 # raw version
+#
+# the expected output in server.log is something like this:
+#
+#deleteDraftDatasetVersionResult: Attempted to delete draft dataset_17_draft from Solr index.
updateReponse was: {responseHeader={status=0,QTime=1}}, deleteDraftFilesResults: , indexReleasedVersionResult:indexed dataset 17 as dataset_17 +#indexFilesResults for dataset_17, filesIndexed: [datafile_18], +#rationale: +#version found with database id 1 +#- title: Rings of Trees and Other Observations +#- semanticVersion-STATE: 1.0-RELEASED +#- isWorkingCopy: false +#- isReleased: true +#- files: 1 [18:trees.png] +#The latest version is not a working copy (latestVersionState: RELEASED) and will be indexed as dataset_17 (visible by anonymous) and we will be deleting dataset_17_draft and its files (if any, num:1): [datafile_18_draft] +#The released version is 1.0 (releasedVersionState: RELEASED) and will be (again) indexed as dataset_17 (visible by anonymous)]] +# #diff <(curl -s 'http://localhost:8080/api/datasets/17/versions/1?key=pete') scripts/search/tests/expected/dataset-versioning02raw # anon can now see the dataset and file and parents diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true') scripts/search/tests/expected/dataset-versioning02anon diff --git a/scripts/search/tests/dataset-versioning03 b/scripts/search/tests/dataset-versioning03 index 7b341ab4d38..177593f17f8 100755 --- a/scripts/search/tests/dataset-versioning03 +++ b/scripts/search/tests/dataset-versioning03 @@ -6,6 +6,25 @@ # to # Title: Rings of Conifers and Other Observations # +# expected log information for indexing while in this state: +#indexFilesResults for dataset_17_draft, filesIndexed: [datafile_18_draft], indexReleasedVersionResult:indexed dataset 17 as dataset_17 +#indexFilesResults for dataset_17, filesIndexed: [datafile_18], +#rationale: +#version found with database id 2 +#- title: Rings of Conifers and Other Observations +#- semanticVersion-STATE: null.null-DRAFT +#- isWorkingCopy: true +#- isReleased: false +#- files: 1 [18:trees.png] +#version found with database id 1 +#- title: Rings of Trees and Other Observations +#- semanticVersion-STATE: 1.0-RELEASED +#- isWorkingCopy: false +#- isReleased: true +#- files: 1 [18:trees.png] +#The latest version is a working copy (latestVersionState: DRAFT) and will be indexed as dataset_17_draft (only visible by creator) +#The released version is 1.0 (releasedVersionState: RELEASED) and will be indexed as dataset_17 (visible by anonymous) +# # anon should be able to see the published 1.0 version but not the new draft (no change from dataset-versioning02anon) diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true') scripts/search/tests/expected/dataset-versioning02anon # pete should be able to see both published and unpublished by not specifying either diff --git a/scripts/search/tests/dataset-versioning04 b/scripts/search/tests/dataset-versioning04 index 5ae34c7248c..bc5ea3ff8aa 100755 --- a/scripts/search/tests/dataset-versioning04 +++ b/scripts/search/tests/dataset-versioning04 @@ -17,6 +17,27 @@ # The new file should be named trees2.png and have a description of # "Another tree image." 
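The "rationale" blocks in these expected logs all follow one rule, spelled out in the last lines of each: a latest version in DRAFT state is indexed as dataset_<id>_draft and is visible only to the creator; a released version is indexed as dataset_<id> and is visible to anonymous users; and once the latest version is itself RELEASED, any leftover _draft documents (dataset and datafiles) are deleted from Solr. A minimal standalone sketch of that rule (the class, enum, and method names here are illustrative, not the application's actual indexing API):

    import java.util.ArrayList;
    import java.util.List;

    public class IndexDecisionSketch {

        enum VersionState { DRAFT, RELEASED }

        // Which Solr documents should exist for a dataset, given the state of
        // its latest version and whether a released version exists at all.
        static List<String> solrDocsFor(long id, VersionState latest, boolean hasReleased) {
            List<String> docs = new ArrayList<String>();
            if (latest == VersionState.DRAFT) {
                docs.add("dataset_" + id + "_draft"); // visible only to the creator
            }
            if (hasReleased) {
                docs.add("dataset_" + id);            // visible to anonymous users
            }
            // When the latest version is RELEASED, the leftover *_draft documents
            // (dataset and datafile) are deleted from the index instead.
            return docs;
        }

        public static void main(String[] args) {
            System.out.println(solrDocsFor(17, VersionState.DRAFT, true));
            // [dataset_17_draft, dataset_17] -- matches the versioning03/04 logs
            System.out.println(solrDocsFor(17, VersionState.RELEASED, true));
            // [dataset_17] -- matches the versioning02/06 logs
        }
    }
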
# +# expected info in logs for indexing: +# +#indexDraftResult:indexed dataset 17 as dataset_17_draft +#indexFilesResults for dataset_17_draft, filesIndexed: [datafile_18_draft, datafile_19_draft], indexReleasedVersionResult:indexed dataset 17 as dataset_17 +#indexFilesResults for dataset_17, filesIndexed: [datafile_18], +#rationale: +#version found with database id 2 +#- title: Rings of Conifers and Other Observations +#- semanticVersion-STATE: null.null-DRAFT +#- isWorkingCopy: true +#- isReleased: false +#- files: 2 [18:trees.png, 19:trees2.png] +#version found with database id 1 +#- title: Rings of Trees and Other Observations +#- semanticVersion-STATE: 1.0-RELEASED +#- isWorkingCopy: false +#- isReleased: true +#- files: 1 [18:trees.png] +#The latest version is a working copy (latestVersionState: DRAFT) and will be indexed as dataset_17_draft (only visible by creator) +#The released version is 1.0 (releasedVersionState: RELEASED) and will be indexed as dataset_17 (visible by anonymous) +# # anon should be able to see the published 1.0 version but not the new draft and not the new file # (no change from dataset-versioning02anon) diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true') scripts/search/tests/expected/dataset-versioning02anon diff --git a/scripts/search/tests/dataset-versioning05 b/scripts/search/tests/dataset-versioning05 index ff8bc9a52d5..a00a71339f4 100755 --- a/scripts/search/tests/dataset-versioning05 +++ b/scripts/search/tests/dataset-versioning05 @@ -15,10 +15,11 @@ # to # "The first picture of trees I uploaded." # +# (logging for indexing is unchanged from output in dataset-versioning04) +# # anon should be able to see the published 1.0 version but not the new draft and not the new file # and not the change in description # (no change from dataset-versioning02anon) diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true') scripts/search/tests/expected/dataset-versioning02anon -# What about pete? should he see multiple cards for the two versions -# (with different descriptions) of 18:trees.png? Right now there is only one -# card per file and for published files it always shows the published information. +# pete should be able to see published and draft versions of the dataset and files +diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true&key=pete') scripts/search/tests/expected/dataset-versioning05pete diff --git a/scripts/search/tests/dataset-versioning06 b/scripts/search/tests/dataset-versioning06 new file mode 100755 index 00000000000..1c1f70c3720 --- /dev/null +++ b/scripts/search/tests/dataset-versioning06 @@ -0,0 +1,35 @@ +#!/bin/bash +# We assume you've done everything in scripts/search/tests/dataset-versioning05 +# +# Now we publish the version that is post 1.0 but unpublished and has +# two files (the description was changed on trees.png) +# Title: Rings of Conifers and Other Observations +# files: 18:trees.png, 19:trees2.png +# +# the expected index logging should be something like this: +# +#deleteDraftDatasetVersionResult: Attempted to delete draft dataset_17_draft from Solr index. updateRep +#onse was: {responseHeader={status=0,QTime=0}}, deleteDraftFilesResults: Attempted to delete draft datafile_19_draft from Solr index. updateReponse +# was: {responseHeader={status=0,QTime=0}}Attempted to delete draft datafile_18_draft from Solr index. 
updateReponse was: {responseHeader={status=0 +# ,QTime=0}}, indexReleasedVersionResult:indexed dataset 17 as dataset_17 +# indexFilesResults for dataset_17, filesIndexed: [datafile_19, datafile_18], +# rationale: +# version found with database id 2 +# - title: Rings of Conifers and Other Observations +# - semanticVersion-STATE: 2.0-RELEASED +# - isWorkingCopy: false +# - isReleased: true +# - files: 2 [19:trees2.png, 18:trees.png] +# version found with database id 1 +# - title: Rings of Trees and Other Observations +# - semanticVersion-STATE: 1.0-RELEASED +# - isWorkingCopy: false +# - isReleased: true +# - files: 1 [18:trees.png] +# The latest version is not a working copy (latestVersionState: RELEASED) and will be indexed as dataset_17 (visible by anonymous) and we will be de +# leting dataset_17_draft and its files (if any, num:2): [datafile_19_draft, datafile_18_draft] +# +# anon should be able to see the published 2.0 version +diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true') scripts/search/tests/expected/dataset-versioning06anon +# pete should see only the published 2.0 version too +diff -u <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true&key=pete') scripts/search/tests/expected/dataset-versioning06pete diff --git a/scripts/search/tests/expected/dataset-versioning03pete-both b/scripts/search/tests/expected/dataset-versioning03pete-both index 1927c45f7c9..4dff25154e2 100644 --- a/scripts/search/tests/expected/dataset-versioning03pete-both +++ b/scripts/search/tests/expected/dataset-versioning03pete-both @@ -3,8 +3,8 @@ "q":"trees", "fq_provided":"[]", "fq_actual":"[({!join from=groups_s to=perms_ss}id:group_public OR {!join from=groups_s to=perms_ss}id:group_user1)]", - "total_count":6, + "total_count":7, "start":0, - "count_in_response":6, - "items":"[datafile_18:trees.png:18, dataset_17:Rings of Trees and Other Observations:17, dataset_17_draft:Rings of Conifers and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]" + "count_in_response":7, + "items":"[datafile_18:trees.png:18, datafile_18_draft:trees.png:18, dataset_17:Rings of Trees and Other Observations:17, dataset_17_draft:Rings of Conifers and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]" } \ No newline at end of file diff --git a/scripts/search/tests/expected/dataset-versioning04pete b/scripts/search/tests/expected/dataset-versioning04pete index 0b718de71a4..b7f358d471a 100644 --- a/scripts/search/tests/expected/dataset-versioning04pete +++ b/scripts/search/tests/expected/dataset-versioning04pete @@ -3,8 +3,8 @@ "q":"trees", "fq_provided":"[]", "fq_actual":"[({!join from=groups_s to=perms_ss}id:group_public OR {!join from=groups_s to=perms_ss}id:group_user1)]", - "total_count":7, + "total_count":8, "start":0, - "count_in_response":7, - "items":"[datafile_18:trees.png:18, datafile_19:trees2.png:19, dataset_17:Rings of Trees and Other Observations:17, dataset_17_draft:Rings of Conifers and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]" + "count_in_response":8, + "items":"[datafile_18:trees.png:18, datafile_18_draft:trees.png:18, datafile_19_draft:trees2.png:19, dataset_17:Rings of Trees and Other Observations:17, dataset_17_draft:Rings of Conifers and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]" } \ No newline at end of file diff --git 
a/scripts/search/tests/expected/dataset-versioning05pete b/scripts/search/tests/expected/dataset-versioning05pete new file mode 100644 index 00000000000..1e6362ea3b7 --- /dev/null +++ b/scripts/search/tests/expected/dataset-versioning05pete @@ -0,0 +1,175 @@ + +{ + "q":"trees", + "fq_provided":"[]", + "fq_actual":"[({!join from=groups_s to=perms_ss}id:group_public OR {!join from=groups_s to=perms_ss}id:group_user1)]", + "total_count":8, + "start":0, + "count_in_response":8, + "items":"[datafile_18:trees.png:18, datafile_18_draft:trees.png:18, datafile_19_draft:trees2.png:19, dataset_17:Rings of Trees and Other Observations:17, dataset_17_draft:Rings of Conifers and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]", + "relevance":[ + { + "id":"datafile_18", + "matched_fields":"[description, filename_without_extension_en]", + "detailsArray":[ + { + "description":[ + "Trees are lovely." + ] + }, + { + "filename_without_extension_en":[ + "trees" + ] + } + ] + }, + { + "id":"datafile_18_draft", + "matched_fields":"[description, filename_without_extension_en]", + "detailsArray":[ + { + "description":[ + "The first picture of trees I uploaded." + ] + }, + { + "filename_without_extension_en":[ + "trees" + ] + } + ] + }, + { + "id":"datafile_19_draft", + "matched_fields":"[description]", + "detailsArray":[ + { + "description":[ + "Another tree image." + ] + } + ] + }, + { + "id":"dataset_17", + "matched_fields":"[dsDescription, title, notesText, authorAffiliation, authorName, keyword, contributorName]", + "detailsArray":[ + { + "dsDescription":[ + "Trees have rings. Trees can be tall." + ] + }, + { + "title":[ + "Rings of Trees and Other Observations" + ] + }, + { + "notesText":[ + "Many notes have been taken about trees over the years." + ] + }, + { + "authorAffiliation":[ + "Trees Inc." + ] + }, + { + "authorName":[ + "Tree, Tony" + ] + }, + { + "keyword":[ + "trees" + ] + }, + { + "contributorName":[ + "Edward Trees Jr." + ] + } + ] + }, + { + "id":"dataset_17_draft", + "matched_fields":"[dsDescription, notesText, authorAffiliation, authorName, keyword, contributorName]", + "detailsArray":[ + { + "dsDescription":[ + "Trees have rings. Trees can be tall." + ] + }, + { + "notesText":[ + "Many notes have been taken about trees over the years." + ] + }, + { + "authorAffiliation":[ + "Trees Inc." + ] + }, + { + "authorName":[ + "Tree, Tony" + ] + }, + { + "keyword":[ + "trees" + ] + }, + { + "contributorName":[ + "Edward Trees Jr." 
+ ] + } + ] + }, + { + "id":"dataverse_10", + "matched_fields":"[description]", + "detailsArray":[ + { + "description":[ + "A bird dataverse with some trees" + ] + } + ] + }, + { + "id":"dataverse_11", + "matched_fields":"[description, name]", + "detailsArray":[ + { + "description":[ + "A tree dataverse with some birds" + ] + }, + { + "name":[ + "Trees" + ] + } + ] + }, + { + "id":"dataverse_16", + "matched_fields":"[description, name]", + "detailsArray":[ + { + "description":[ + "A dataverse with chestnut trees and an oriole" + ] + }, + { + "name":[ + "Chestnut Trees" + ] + } + ] + } + ] +} \ No newline at end of file diff --git a/scripts/search/tests/expected/dataset-versioning06anon b/scripts/search/tests/expected/dataset-versioning06anon new file mode 100644 index 00000000000..9182140e372 --- /dev/null +++ b/scripts/search/tests/expected/dataset-versioning06anon @@ -0,0 +1,91 @@ + +{ + "q":"trees", + "fq_provided":"[]", + "fq_actual":"[{!join from=groups_s to=perms_ss}id:group_public]", + "total_count":4, + "start":0, + "count_in_response":4, + "items":"[datafile_18:trees.png:18, datafile_19:trees2.png:19, dataset_17:Rings of Conifers and Other Observations:17, dataverse_11:Trees:11]", + "relevance":[ + { + "id":"datafile_18", + "matched_fields":"[description, filename_without_extension_en]", + "detailsArray":[ + { + "description":[ + "The first picture of trees I uploaded." + ] + }, + { + "filename_without_extension_en":[ + "trees" + ] + } + ] + }, + { + "id":"datafile_19", + "matched_fields":"[description]", + "detailsArray":[ + { + "description":[ + "Another tree image." + ] + } + ] + }, + { + "id":"dataset_17", + "matched_fields":"[dsDescription, notesText, authorAffiliation, authorName, keyword, contributorName]", + "detailsArray":[ + { + "dsDescription":[ + "Trees have rings. Trees can be tall." + ] + }, + { + "notesText":[ + "Many notes have been taken about trees over the years." + ] + }, + { + "authorAffiliation":[ + "Trees Inc." + ] + }, + { + "authorName":[ + "Tree, Tony" + ] + }, + { + "keyword":[ + "trees" + ] + }, + { + "contributorName":[ + "Edward Trees Jr." + ] + } + ] + }, + { + "id":"dataverse_11", + "matched_fields":"[description, name]", + "detailsArray":[ + { + "description":[ + "A tree dataverse with some birds" + ] + }, + { + "name":[ + "Trees" + ] + } + ] + } + ] +} \ No newline at end of file diff --git a/scripts/search/tests/expected/dataset-versioning06pete b/scripts/search/tests/expected/dataset-versioning06pete new file mode 100644 index 00000000000..2a69d6461e8 --- /dev/null +++ b/scripts/search/tests/expected/dataset-versioning06pete @@ -0,0 +1,118 @@ + +{ + "q":"trees", + "fq_provided":"[]", + "fq_actual":"[({!join from=groups_s to=perms_ss}id:group_public OR {!join from=groups_s to=perms_ss}id:group_user1)]", + "total_count":6, + "start":0, + "count_in_response":6, + "items":"[datafile_18:trees.png:18, datafile_19:trees2.png:19, dataset_17:Rings of Conifers and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]", + "relevance":[ + { + "id":"datafile_18", + "matched_fields":"[description, filename_without_extension_en]", + "detailsArray":[ + { + "description":[ + "The first picture of trees I uploaded." + ] + }, + { + "filename_without_extension_en":[ + "trees" + ] + } + ] + }, + { + "id":"datafile_19", + "matched_fields":"[description]", + "detailsArray":[ + { + "description":[ + "Another tree image." 
+ ] + } + ] + }, + { + "id":"dataset_17", + "matched_fields":"[dsDescription, notesText, authorAffiliation, authorName, keyword, contributorName]", + "detailsArray":[ + { + "dsDescription":[ + "Trees have rings. Trees can be tall." + ] + }, + { + "notesText":[ + "Many notes have been taken about trees over the years." + ] + }, + { + "authorAffiliation":[ + "Trees Inc." + ] + }, + { + "authorName":[ + "Tree, Tony" + ] + }, + { + "keyword":[ + "trees" + ] + }, + { + "contributorName":[ + "Edward Trees Jr." + ] + } + ] + }, + { + "id":"dataverse_10", + "matched_fields":"[description]", + "detailsArray":[ + { + "description":[ + "A bird dataverse with some trees" + ] + } + ] + }, + { + "id":"dataverse_11", + "matched_fields":"[description, name]", + "detailsArray":[ + { + "description":[ + "A tree dataverse with some birds" + ] + }, + { + "name":[ + "Trees" + ] + } + ] + }, + { + "id":"dataverse_16", + "matched_fields":"[description, name]", + "detailsArray":[ + { + "description":[ + "A dataverse with chestnut trees and an oriole" + ] + }, + { + "name":[ + "Chestnut Trees" + ] + } + ] + } + ] +} \ No newline at end of file diff --git a/scripts/search/tests/expected/highlighting-nick-trees b/scripts/search/tests/expected/highlighting-nick-trees index 87091ece143..13a9c0beea9 100644 --- a/scripts/search/tests/expected/highlighting-nick-trees +++ b/scripts/search/tests/expected/highlighting-nick-trees @@ -6,10 +6,10 @@ "total_count":5, "start":0, "count_in_response":5, - "items":"[datafile_18:trees.png:18, dataset_17_draft:Rings of Trees and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]", + "items":"[datafile_18_draft:trees.png:18, dataset_17_draft:Rings of Trees and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]", "relevance":[ { - "id":"datafile_18", + "id":"datafile_18_draft", "matched_fields":"[description, filename_without_extension_en]", "detailsArray":[ { diff --git a/scripts/search/tests/expected/highlighting-pete-trees b/scripts/search/tests/expected/highlighting-pete-trees index 9564910dd0a..56a05be8be4 100644 --- a/scripts/search/tests/expected/highlighting-pete-trees +++ b/scripts/search/tests/expected/highlighting-pete-trees @@ -6,10 +6,10 @@ "total_count":5, "start":0, "count_in_response":5, - "items":"[datafile_18:trees.png:18, dataset_17_draft:Rings of Trees and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]", + "items":"[datafile_18_draft:trees.png:18, dataset_17_draft:Rings of Trees and Other Observations:17, dataverse_10:Birds:10, dataverse_11:Trees:11, dataverse_16:Chestnut Trees:16]", "relevance":[ { - "id":"datafile_18", + "id":"datafile_18_draft", "matched_fields":"[description, filename_without_extension_en]", "detailsArray":[ { diff --git a/scripts/search/tests/expected/query-unparseable b/scripts/search/tests/expected/query-unparseable new file mode 100644 index 00000000000..9a87780a96c --- /dev/null +++ b/scripts/search/tests/expected/query-unparseable @@ -0,0 +1,11 @@ + +{ + "q":":", + "fq_provided":"[]", + "fq_actual":"[]", + "total_count":0, + "start":0, + "count_in_response":0, + "items":"[]", + "error":"Trouble parsing query? 
org.apache.solr.search.SyntaxError: Cannot parse ':': Encountered \" \":\" \": \"\" at line 1, column 0.\nWas expecting one of:\n ...\n \"+\" ...\n \"-\" ...\n ...\n \"(\" ...\n \"*\" ...\n ...\n ...\n ...\n ...\n ...\n \"[\" ...\n \"{\" ...\n ...\n ...\n ...\n \"*\" ...\n " +} \ No newline at end of file diff --git a/scripts/search/tests/expected/solr-down b/scripts/search/tests/expected/solr-down new file mode 100644 index 00000000000..abe16b49a9c --- /dev/null +++ b/scripts/search/tests/expected/solr-down @@ -0,0 +1,11 @@ + +{ + "q":"*", + "fq_provided":"[]", + "fq_actual":"[]", + "total_count":0, + "start":0, + "count_in_response":0, + "items":"[]", + "error":"Internal error: Server refused connection at: http://localhost:8983/solr" +} \ No newline at end of file diff --git a/scripts/search/tests/highlighting b/scripts/search/tests/highlighting index bf01ee1e6c1..1d76815f4b7 100755 --- a/scripts/search/tests/highlighting +++ b/scripts/search/tests/highlighting @@ -31,7 +31,5 @@ # We assume you add a file called "trees.png" to this dataset # with a description of "Trees are lovely." # -# Then run an "index all" until this is fixed: https://redmine.hmdc.harvard.edu/issues/3809 -# curl http://localhost:8080/api/index diff <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true&key=nick') scripts/search/tests/expected/highlighting-nick-trees diff <(curl -s 'http://localhost:8080/api/search?q=trees&showrelevance=true&key=pete') scripts/search/tests/expected/highlighting-pete-trees diff --git a/scripts/search/tests/query-unparseable b/scripts/search/tests/query-unparseable new file mode 100755 index 00000000000..1a4f2e708d7 --- /dev/null +++ b/scripts/search/tests/query-unparseable @@ -0,0 +1,2 @@ +#!/bin/bash +diff <(curl -s 'http://localhost:8080/api/search?q=:') scripts/search/tests/expected/query-unparseable diff --git a/scripts/search/tests/solr-down b/scripts/search/tests/solr-down new file mode 100755 index 00000000000..d0a145a4dac --- /dev/null +++ b/scripts/search/tests/solr-down @@ -0,0 +1,2 @@ +#!/bin/bash +diff <(curl -s 'http://localhost:8080/api/search?q=*') scripts/search/tests/expected/solr-down diff --git a/src/main/java/edu/harvard/iq/dataverse/AdvancedSearchPage.java b/src/main/java/edu/harvard/iq/dataverse/AdvancedSearchPage.java index e897611d3cc..1c0a8296949 100644 --- a/src/main/java/edu/harvard/iq/dataverse/AdvancedSearchPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/AdvancedSearchPage.java @@ -11,6 +11,7 @@ import javax.ejb.EJB; import javax.faces.view.ViewScoped; import javax.inject.Named; +import org.apache.commons.lang.StringUtils; @ViewScoped @Named("AdvancedSearchPage") @@ -26,8 +27,8 @@ public class AdvancedSearchPage { private Dataverse dataverse; private List metadataBlocks; - private Map> metadataFieldMap = new HashMap(); - private List metadataFieldList; + private Map> metadataFieldMap = new HashMap(); + private List metadataFieldList; private String dvFieldName; private String dvFieldDescription; private String dvFieldAffiliation; @@ -45,7 +46,7 @@ public void init() { this.metadataFieldList = datasetFieldService.findAllAdvancedSearchFieldTypes(); for (MetadataBlock mdb : metadataBlocks) { - + List dsfTypes = new ArrayList(); for (DatasetFieldType dsfType : metadataFieldList) { if (dsfType.getMetadataBlock().getId().equals(mdb.getId())) { @@ -53,114 +54,96 @@ public void init() { } } metadataFieldMap.put(mdb.getId(), dsfTypes); - } - + } + } public String find() throws IOException { - /* - logger.info("clicked find. 
author: " + author + ". title: " + title); - List queryStrings = new ArrayList(); - if (title != null && !title.isEmpty()) { - queryStrings.add(SearchFields.TITLE + ":" + title); - } - - if (author != null && !author.isEmpty()) { - queryStrings.add(SearchFields.AUTHOR_STRING + ":" + author); - } - query = new String(); - for (String string : queryStrings) { - query += string + " "; - } - logger.info("query: " + query); */ - StringBuilder queryBuilder = new StringBuilder(); - - String delimiter = "[\"]+"; + List queryStrings = new ArrayList(); + queryStrings.add(constructDataverseQuery()); + queryStrings.add(constructDatasetQuery()); + queryStrings.add(constructFileQuery()); + + return "/dataverse.xhtml?q=" + constructQuery(queryStrings, false, false) + "faces-redirect=true"; + } + + private String constructDatasetQuery() { + List queryStrings = new ArrayList(); for (DatasetFieldType dsfType : metadataFieldList) { - List queryStrings = new ArrayList(); if (dsfType.getSearchValue() != null && !dsfType.getSearchValue().equals("")) { - String myString = dsfType.getSearchValue(); - if (myString.contains("\"")) { - String [] tempString = dsfType.getSearchValue().split(delimiter); - for (int i = 1; i < tempString.length; i++) { - if (!tempString[i].equals(" ") && !tempString[i].isEmpty()) { - queryStrings.add(dsfType.getSolrField().getNameSearchable() + ":" + "\"" + tempString[i].trim() + "\""); - } - } - } else { - StringTokenizer st = new StringTokenizer(dsfType.getSearchValue()); - while (st.hasMoreElements()) { - queryStrings.add(dsfType.getSolrField().getNameSearchable() + ":" + st.nextElement()); - } - } - } else if (dsfType.getListValues() != null && !dsfType.getListValues().isEmpty()){ + queryStrings.add(constructQuery(dsfType.getSolrField().getNameSearchable(), dsfType.getSearchValue())); + } else if (dsfType.getListValues() != null && !dsfType.getListValues().isEmpty()) { + List listQueryStrings = new ArrayList(); for (String value : dsfType.getListValues()) { - queryStrings.add(dsfType.getSolrField().getNameSearchable() + ":" + "\"" + value + "\""); + listQueryStrings.add(dsfType.getSolrField().getNameSearchable() + ":" + "\"" + value + "\""); } + queryStrings.add(constructQuery(listQueryStrings, false)); } + } + return constructQuery(queryStrings, true); - if (queryStrings.size() > 0 && queryBuilder.length() > 0 ) { - queryBuilder.append(" AND "); - } - - if (queryStrings.size() > 1) { - queryBuilder.append("("); - } - - for (int i = 0; i < queryStrings.size(); i++) { - if ( i > 0 ) { - queryBuilder.append(" "); - } - queryBuilder.append(queryStrings.get(i)); - } - - if (queryStrings.size() > 1) { - queryBuilder.append(")"); - } - - /** - * @todo: What people really want (we think) is fancy combination - * searches with users typing a little under Dataverses, a little - * under Datasets, and a little under Files and logic would exist - * here to construct and OR (or AND?) query. For now, we reset the - * whole query every time we pass through the if's below. 
- * - * see also https://redmine.hmdc.harvard.edu/issues/3745 - */ - if (!dvFieldName.isEmpty()) { - queryBuilder = constructQuery(SearchFields.DATAVERSE_NAME, dvFieldName); - } + } - if (!dvFieldAffiliation.isEmpty()) { - queryBuilder = constructQuery(SearchFields.DATAVERSE_AFFILIATION, dvFieldAffiliation); - } + private String constructDataverseQuery() { + List queryStrings = new ArrayList(); + if (!dvFieldName.isEmpty()) { + queryStrings.add(constructQuery(SearchFields.DATAVERSE_NAME, dvFieldName)); + } - if (!dvFieldDescription.isEmpty()) { - queryBuilder = constructQuery(SearchFields.DATAVERSE_DESCRIPTION, dvFieldDescription); - } + if (!dvFieldAffiliation.isEmpty()) { + queryStrings.add(constructQuery(SearchFields.DATAVERSE_AFFILIATION, dvFieldAffiliation)); + } - if (!fileFieldName.isEmpty()) { - queryBuilder = constructQuery(SearchFields.FILE_NAME, fileFieldName); - } + if (!dvFieldDescription.isEmpty()) { + queryStrings.add(constructQuery(SearchFields.DATAVERSE_DESCRIPTION, dvFieldDescription)); + } - if (!fileFieldDescription.isEmpty()) { - queryBuilder = constructQuery(SearchFields.FILE_DESCRIPTION, fileFieldDescription); - } + return constructQuery(queryStrings, true); + } - if (!fileFieldFiletype.isEmpty()) { - queryBuilder = constructQuery(SearchFields.FILE_TYPE_SEARCHABLE, fileFieldFiletype); - } + private String constructFileQuery() { + List queryStrings = new ArrayList(); + if (!fileFieldName.isEmpty()) { + queryStrings.add(constructQuery(SearchFields.FILE_NAME, fileFieldName)); + } + if (!fileFieldDescription.isEmpty()) { + queryStrings.add(constructQuery(SearchFields.FILE_DESCRIPTION, fileFieldDescription)); } - return "/dataverse.xhtml?q=" + queryBuilder.toString().trim() + "faces-redirect=true"; + if (!fileFieldFiletype.isEmpty()) { + queryStrings.add(constructQuery(SearchFields.FILE_TYPE_SEARCHABLE, fileFieldFiletype)); + } + + return constructQuery(queryStrings, true); + } + + private String constructQuery(List queryStrings, boolean isAnd) { + return constructQuery(queryStrings, isAnd, true); } + private String constructQuery(List queryStrings, boolean isAnd, boolean surroundWithParens) { + StringBuilder queryBuilder = new StringBuilder(); + + int count = 0; + for (String string : queryStrings) { + if (!StringUtils.isBlank(string)) { + if (++count > 1) { + queryBuilder.append(isAnd ? " AND " : " OR "); + } + queryBuilder.append(string); + } + } - /** - * @todo have the code that operates on dataset fields call into this? 
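The refactoring above replaces the reset-the-whole-query logic with small helpers: each section (dataverse, dataset, file) builds its own clause list, and constructQuery(List, boolean, boolean) joins the non-blank clauses with AND or OR and wraps multi-clause groups in parentheses. A self-contained sketch of that joining behavior, with the Commons Lang StringUtils.isBlank check inlined so the snippet has no dependencies (the Solr field names in main are illustrative):

    import java.util.Arrays;
    import java.util.List;

    public class QueryJoinSketch {

        // Mirrors constructQuery(List, boolean, boolean): skip blank clauses,
        // join with AND/OR, add parentheses when more than one clause remains.
        static String join(List<String> clauses, boolean isAnd, boolean parens) {
            StringBuilder sb = new StringBuilder();
            int count = 0;
            for (String clause : clauses) {
                if (clause != null && !clause.trim().isEmpty()) {
                    if (++count > 1) {
                        sb.append(isAnd ? " AND " : " OR ");
                    }
                    sb.append(clause);
                }
            }
            if (parens && count > 1) {
                sb.insert(0, "(").append(")");
            }
            return sb.toString().trim();
        }

        public static void main(String[] args) {
            // e.g. a name and an affiliation entered on the advanced search page
            System.out.println(join(
                    Arrays.asList("dvName:trees", "", "dvAffiliation:\"Trees Inc.\""),
                    true, true));
            // prints: (dvName:trees AND dvAffiliation:"Trees Inc.")
        }
    }
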
- */ - private StringBuilder constructQuery(String solrField, String userSuppliedQuery) { + if (surroundWithParens && count > 1) { + queryBuilder.insert(0, "("); + queryBuilder.append(")"); + } + + return queryBuilder.toString().trim(); + } + + private String constructQuery(String solrField, String userSuppliedQuery) { StringBuilder queryBuilder = new StringBuilder(); String delimiter = "[\"]+"; @@ -182,23 +165,23 @@ private StringBuilder constructQuery(String solrField, String userSuppliedQuery) } } } - + if (queryStrings.size() > 1) { queryBuilder.append("("); } - + for (int i = 0; i < queryStrings.size(); i++) { if (i > 0) { queryBuilder.append(" "); } queryBuilder.append(queryStrings.get(i)); } - + if (queryStrings.size() > 1) { queryBuilder.append(")"); } - return queryBuilder; + return queryBuilder.toString().trim(); } public Dataverse getDataverse() { diff --git a/src/main/java/edu/harvard/iq/dataverse/DataFile.java b/src/main/java/edu/harvard/iq/dataverse/DataFile.java index 209fe3f0000..de12660f3a4 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataFile.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataFile.java @@ -45,20 +45,15 @@ public class DataFile extends DvObject { @OneToMany(mappedBy="dataFile", cascade={CascadeType.REMOVE, CascadeType.MERGE, CascadeType.PERSIST}) private List fileMetadatas; - @OneToMany(mappedBy="dataFile", cascade={CascadeType.REMOVE, CascadeType.MERGE, CascadeType.PERSIST}) - private List fileMetadataFieldValues; - private char ingestStatus = INGEST_STATUS_NONE; public DataFile() { this.fileMetadatas = new ArrayList<>(); - fileMetadataFieldValues = new ArrayList<>(); } public DataFile(String contentType) { this.contentType = contentType; this.fileMetadatas = new ArrayList<>(); - fileMetadataFieldValues = new ArrayList<>(); } // The dvObject field "name" should not be used in @@ -69,7 +64,6 @@ public DataFile(String name, String contentType) { this.name = name; this.contentType = contentType; this.fileMetadatas = new ArrayList<>(); - fileMetadataFieldValues = new ArrayList<>(); } public List getDataTables() { @@ -200,14 +194,6 @@ public void setDescription(String description) { } } - public List getFileMetadataFieldValues() { - return fileMetadataFieldValues; - } - - public void setFileMetadataFieldValues(List fileMetadataFieldValues) { - this.fileMetadataFieldValues = fileMetadataFieldValues; - } - public FileMetadata getFileMetadata() { return getLatestFileMetadata(); } diff --git a/src/main/java/edu/harvard/iq/dataverse/Dataset.java b/src/main/java/edu/harvard/iq/dataverse/Dataset.java index d6b09208acc..ee47d9d9c13 100644 --- a/src/main/java/edu/harvard/iq/dataverse/Dataset.java +++ b/src/main/java/edu/harvard/iq/dataverse/Dataset.java @@ -101,7 +101,7 @@ public void setFiles(List files) { private List versions = new ArrayList(); public DatasetVersion getLatestVersion() { - return versions.get(0); + return getVersions().get(0); } public List getVersions() { diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetAuthor.java b/src/main/java/edu/harvard/iq/dataverse/DatasetAuthor.java index 483e9ff5701..59e4549511f 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetAuthor.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetAuthor.java @@ -23,7 +23,6 @@ public int compare(DatasetAuthor o1, DatasetAuthor o2) { } }; - private DatasetVersion datasetVersion; public DatasetVersion getDatasetVersion() { return datasetVersion; diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetField.java 
b/src/main/java/edu/harvard/iq/dataverse/DatasetField.java index 8b2397ccd8e..cf73feef459 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetField.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetField.java @@ -237,7 +237,7 @@ public List getValues() { public boolean isEmpty() { if (datasetFieldType.isPrimitive()) { // primitive for (String value : getValues()) { - if (value != null && value.trim() != "") { + if (value != null && value.trim().isEmpty() ) { return false; } } diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java index 195ca5c2e5b..efabd59725a 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java @@ -10,6 +10,7 @@ import javax.ejb.Stateless; import javax.inject.Named; import javax.persistence.EntityManager; +import javax.persistence.NoResultException; import javax.persistence.PersistenceContext; import javax.persistence.Query; @@ -24,7 +25,7 @@ public class DatasetFieldServiceBean { @PersistenceContext(unitName = "VDCNet-ejbPU") private EntityManager em; - private static final String NAME_QUERY = "SELECT dsfType from DatasetFieldType dsfType where dsfType.name= :fieldName "; + private static final String NAME_QUERY = "SELECT dsfType from DatasetFieldType dsfType where dsfType.name= :fieldName"; private static final String FILEMETA_NAME_QUERY = "SELECT fmf from FileMetadataField fmf where fmf.name= :fieldName "; private static final String FILEMETA_NAME_FORMAT_QUERY = "SELECT fmf from FileMetadataField fmf where fmf.name= :fieldName and fmf.fileFormatName= :fileFormatName "; @@ -53,62 +54,26 @@ public DatasetFieldType findByName(String name) { return dsfType; } - public ControlledVocabularyValue findControlledVocabularyValue(Object pk) { - return (ControlledVocabularyValue) em.find(ControlledVocabularyValue.class, pk); - } - - public List findAvailableFileMetadataFields() { - List fileMetadataFields = null; - fileMetadataFields = (List ) em.createQuery("SELECT fmf from FileMetadataField fmf ORDER by fmf.id").getResultList(); - - return fileMetadataFields; - } - - public List findFileMetadataFieldByName (String name) { - List fmfs = null; - try { - fmfs = (List) em.createQuery(FILEMETA_NAME_QUERY).setParameter("fieldName",name).getResultList(); - } catch (Exception ex) { - // getResultList() can throw an IllegalStateException. - // - we just return null. - return null; - } - // If there are no results, getResultList returns an empty list. - return fmfs; - } - - public FileMetadataField findFileMetadataFieldByNameAndFormat (String fieldName, String formatName) { - FileMetadataField fmf = null; + /** + * Gets the dataset field type, or returns {@code null}. Does not throw exceptions. + * @param name the name do the field type + * @return the field type, or {@code null} + * @see #findByName(java.lang.String) + */ + public DatasetFieldType findByNameOpt( String name ) { try { - Query query = em.createQuery(FILEMETA_NAME_FORMAT_QUERY); - query.setParameter("fieldName", fieldName); - query.setParameter("fileFormatName", formatName); - fmf = (FileMetadataField) query.getSingleResult(); - } catch (Exception ex) { - // getSingleResult() can throw several different exceptions: - // NoResultException, NonUniqueResultException, IllegalStateException... - // - we just return null. 
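The new findByNameOpt in this hunk replaces string-built queries with a typed JPA named query and maps the no-result case to null rather than letting the exception escape. A standalone sketch of that lookup pattern, reusing the DatasetFieldType.findByName named query this patch defines (the EntityManager is assumed to be injected by the container, as in the bean):

    import javax.persistence.EntityManager;
    import javax.persistence.NoResultException;

    public class FieldTypeLookup {

        private final EntityManager em;

        public FieldTypeLookup(EntityManager em) {
            this.em = em;
        }

        // Same pattern as DatasetFieldServiceBean.findByNameOpt: a typed named
        // query, with "no result" mapped to null instead of an exception.
        public DatasetFieldType findByNameOpt(String name) {
            try {
                return em.createNamedQuery("DatasetFieldType.findByName", DatasetFieldType.class)
                         .setParameter("name", name)
                         .getSingleResult();
            } catch (NoResultException nre) {
                return null;
            }
        }
    }
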
- return null; + return em.createNamedQuery("DatasetFieldType.findByName", DatasetFieldType.class) + .setParameter("name", name) + .getSingleResult(); + } catch ( NoResultException nre ) { + return null; } - return fmf; } - public FileMetadataField createFileMetadataField (String fieldName, String formatName) { - FileMetadataField fmf = new FileMetadataField(); - fmf.setName(fieldName); - fmf.setFileFormatName(formatName); - //em.persist(fmf); - //em.flush(); - - return fmf; - } + public ControlledVocabularyValue findControlledVocabularyValue(Object pk) { + return (ControlledVocabularyValue) em.find(ControlledVocabularyValue.class, pk); + } - public void saveFileMetadataField (FileMetadataField fmf) { - em.persist(fmf); - em.flush(); - - } - public DatasetFieldType save(DatasetFieldType dsfType) { return em.merge(dsfType); } diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldType.java b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldType.java index 5ac89a93b52..0e926d2048b 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldType.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldType.java @@ -1,8 +1,3 @@ -/* - * To change this license header, choose License Headers in Project Properties. - * To change this template file, choose Tools | Templates - * and open the template in the editor. - */ package edu.harvard.iq.dataverse; import java.util.Collection; @@ -19,6 +14,10 @@ * * @author Stephen Kraffmiller */ +@NamedQueries({ + @NamedQuery( name="DatasetFieldType.findByName", + query="SELECT dsfType FROM DatasetFieldType dsfType WHERE dsfType.name=:name") +}) @Entity public class DatasetFieldType implements Serializable, Comparable { @@ -343,8 +342,15 @@ public String getDisplayName() { public SolrField getSolrField() { SolrField.SolrType solrType = SolrField.SolrType.TEXT_EN; if (fieldType != null) { - solrType = fieldType.equals("date") ? SolrField.SolrType.INTEGER : SolrField.SolrType.TEXT_EN; - + + /** + * @todo made more decisions based on fieldType: index as dates, + * integers, and floats so we can do range queries etc. + */ + if (fieldType.equals("date")) { + solrType = SolrField.SolrType.DATE; + } + Boolean parentAllowsMultiplesBoolean = false; if (isHasParent()) { if (getParentDatasetFieldType() != null) { @@ -355,8 +361,8 @@ public SolrField getSolrField() { boolean makeSolrFieldMultivalued; // http://stackoverflow.com/questions/5800762/what-is-the-use-of-multivalued-field-type-in-solr - if (solrType == SolrField.SolrType.TEXT_EN) { - makeSolrFieldMultivalued = (allowMultiples || parentAllowsMultiplesBoolean); + if (allowMultiples || parentAllowsMultiplesBoolean) { + makeSolrFieldMultivalued = true; } else { makeSolrFieldMultivalued = false; } diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldValueValidator.java b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldValueValidator.java index 73dccc3751e..0ca72bff60e 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldValueValidator.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldValueValidator.java @@ -54,6 +54,19 @@ public boolean isValid(DatasetFieldValue value, ConstraintValidatorContext conte if (!valid ) { valid = isValidDate(value.getValue(), "yyyy"); } + if (!valid) { + // TODO: + // This is a temporary fix for the early beta! + // (to accommodate dates with time stamps from Astronomy files) + // As a real fix, we need to introduce a different type - + // "datetime" for ex. 
and use it for timestamps; + // We do NOT want users to be able to enter a full time stamp + // as the release date... + // -- L.A. 4.0 beta + + valid = (isValidDate(value.getValue(), "yyyy-MM-dd'T'HH:mm:ss") || isValidDate(value.getValue(), "yyyy-MM-dd HH:mm:ss")); + + } if (!valid ) { context.buildConstraintViolationWithTemplate(" " + dsfType.getDisplayName() + " is not a valid date.").addConstraintViolation(); return false; @@ -134,5 +147,6 @@ private boolean isValidDate(String dateString, String pattern) { } return valid; } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java b/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java index 6e59183392d..85143efc1de 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java @@ -91,6 +91,8 @@ public enum DisplayMode { EjbDataverseEngine commandEngine; @Inject DataverseSession session; + @EJB + UserNotificationServiceBean userNotificationService; private Dataset dataset = new Dataset(); private EditMode editMode; @@ -235,6 +237,7 @@ public void init() { // create mode for a new child dataset editMode = EditMode.CREATE; editVersion = dataset.getLatestVersion(); + dataset.setOwner(dataverseService.find(ownerId)); datasetVersionUI = new DatasetVersionUI(editVersion); //On create set pre-populated fields @@ -262,6 +265,7 @@ public void init() { } } FacesContext.getCurrentInstance().addMessage(null, new FacesMessage(FacesMessage.SEVERITY_INFO, "Add New Dataset", " - Enter metadata to create the dataset's citation. You can add more metadata about this dataset after it's created.")); + displayVersion = editVersion; } else { throw new RuntimeException("On Dataset page without id or ownerid."); // improve error handling } @@ -467,6 +471,9 @@ public String save() { cmd = new UpdateDatasetCommand(dataset, session.getUser()); } dataset = commandEngine.submit(cmd); + if (editMode == EditMode.CREATE) { + userNotificationService.sendNotification(session.getUser(), dataset.getCreateDate(), UserNotification.Type.CREATEDS, dataset.getLatestVersion().getId()); + } } catch (EJBException ex) { StringBuilder error = new StringBuilder(); error.append(ex + " "); diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java index 423e8b8a116..ac843229ec3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java @@ -5,8 +5,6 @@ */ package edu.harvard.iq.dataverse; -import java.util.Collection; -import java.util.Iterator; import java.util.List; import java.util.logging.Logger; import javax.ejb.EJB; @@ -15,7 +13,6 @@ import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; import javax.persistence.Query; -import org.apache.commons.lang.StringUtils; /** * @@ -31,9 +28,6 @@ public class DatasetServiceBean { @PersistenceContext(unitName = "VDCNet-ejbPU") private EntityManager em; - - - public Dataset find(Object pk) { return (Dataset) em.find(Dataset.class, pk); diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java index d983f558630..a5c91f8195d 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java @@ -180,6 +180,10 @@ public void setLastUpdateTime(Date lastUpdateTime) { } this.lastUpdateTime = lastUpdateTime; } + + public String 
getVersionDate(){ + return new SimpleDateFormat("MMMM d, yyyy").format(releaseTime); + } public Date getReleaseTime() { return releaseTime; diff --git a/src/main/java/edu/harvard/iq/dataverse/DataverseUserPage.java b/src/main/java/edu/harvard/iq/dataverse/DataverseUserPage.java index de4d262cd3a..7b262127f31 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataverseUserPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataverseUserPage.java @@ -5,7 +5,9 @@ */ package edu.harvard.iq.dataverse; +import java.sql.Timestamp; import java.util.ArrayList; +import java.util.Date; import java.util.List; import javax.ejb.EJB; import javax.faces.application.FacesMessage; @@ -247,6 +249,7 @@ public String save() { } } dataverseUser = dataverseUserService.save(dataverseUser); + userNotificationService.sendNotification(dataverseUser, new Timestamp(new Date().getTime()), UserNotification.Type.CREATEACC, null); if (editMode == EditMode.CREATE) { session.setUser(dataverseUser); diff --git a/src/main/java/edu/harvard/iq/dataverse/FileMetadataField.java b/src/main/java/edu/harvard/iq/dataverse/FileMetadataField.java deleted file mode 100644 index c5e982b389c..00000000000 --- a/src/main/java/edu/harvard/iq/dataverse/FileMetadataField.java +++ /dev/null @@ -1,215 +0,0 @@ -/* - Copyright (C) 2005-2012, by the President and Fellows of Harvard College. - - Licensed under the Apache License, Version 2.0 (the "License"); - you may not use this file except in compliance with the License. - You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. - - Dataverse Network - A web application to share, preserve and analyze research data. - Developed at the Institute for Quantitative Social Science, Harvard University. - Version 3.0. -*/ -/* - * FileMetadataField.java - * - * Taken virtually unchanged from DVN 3.*; - * Originally created in Feb. 2013 - * - */ -package edu.harvard.iq.dataverse; - -/** - * - * @author Leonid Andreev - */ - -import java.io.Serializable; -import java.util.ArrayList; -import java.util.List; -import javax.persistence.*; - -/** - * - * @author Leonid Andreev - */ - -// TODO: update the comment below, to reflect the object name changes in -// 4.0: - -// This is the studyfile-level equivalent of the StudyField table; this will -// store metadata fields associated with study files. -// For consistency with the StudyField and StudyFieldValue, I could have called -// it "StudyFileField"; but decided to go wtih "FileMetadataField" and -// "FileMetadataFieldValue", to have the names that are more descriptive. - - -@Entity -public class FileMetadataField implements Serializable { - /** - * Properties: - * ========== - */ - - @Id - @GeneratedValue(strategy = GenerationType.IDENTITY) - private Long id; - - @Column(name="name", columnDefinition="TEXT") - private String name; // This is the internal, DDI-like name, no spaces, etc. - @Column(name="title", columnDefinition="TEXT") - private String title; // A longer, human-friendlier name - punctuation allowed - @Column(name="description", columnDefinition="TEXT") - private String description; // A user-friendly Description; will be used for - // mouse-overs, etc. 
- - // TODO: - // decide if we even need this "custom field" flag; since all the file-level - // fields are going to be custom. - // On the other hand, we may want to add a set of standard file-level fields - // - something like "author", "date" and "keyword" maybe? - General enough - // attributes that can be associated with any document. - - private boolean customField; - private boolean basicSearchField; - private boolean advancedSearchField; - private boolean searchResultField; - private boolean prefixSearchable; - - private String fileFormatName; - - private int displayOrder; - - /** - * Constructors: - * ============ - */ - - /** Creates a new instance of FileMetadataField */ - public FileMetadataField() { - } - - - /** - * Getters and Setters: - * =================== - */ - - public Long getId() { - return this.id; - } - - public void setId(Long id) { - this.id = id; - } - - public int getDisplayOrder() { - return this.displayOrder; - } - - public void setDisplayOrder(int displayOrder) { - this.displayOrder = displayOrder; - } - - public String getName() { - return name; - } - - public void setName(String name) { - this.name = name; - } - - public String getTitle() { - return title; - } - - public void setTitle(String title) { - this.title = title; - } - - public String getDescription() { - return description; - } - - public void setDescription(String description) { - this.description = description; - } - - public boolean isCustomField() { - return customField; - } - - public void setCustomField(boolean customField) { - this.customField = customField; - } - - public String getFileFormatName() { - return fileFormatName; - } - - public void setFileFormatName(String fileFormatName) { - this.fileFormatName = fileFormatName; - } - - public boolean isBasicSearchField() { - return this.basicSearchField; - } - - public void setBasicSearchField(boolean basicSearchField) { - this.basicSearchField = basicSearchField; - } - - public boolean isAdvancedSearchField() { - return this.advancedSearchField; - } - - public void setAdvancedSearchField(boolean advancedSearchField) { - this.advancedSearchField = advancedSearchField; - } - - public boolean isSearchResultField() { - return this.searchResultField; - } - - public void setSearchResultField(boolean searchResultField) { - this.searchResultField = searchResultField; - } - - - public boolean isPrefixSearchable() { - return this.prefixSearchable; - } - - public void setPrefixSearchable(boolean prefixSearchale) { - this.prefixSearchable = prefixSearchable; - } - - /** - * Helper methods: - * ============== - * - */ - public int hashCode() { - int hash = 0; - hash += (this.id != null ? this.id.hashCode() : 0); - return hash; - } - - public boolean equals(Object object) { - // TODO: Warning - this method won't work in the case the id fields are not set - if (!(object instanceof FileMetadataField)) { - return false; - } - FileMetadataField other = (FileMetadataField)object; - if (this.id != other.id && (this.id == null || !this.id.equals(other.id))) return false; - return true; - } - -} diff --git a/src/main/java/edu/harvard/iq/dataverse/FileMetadataFieldValue.java b/src/main/java/edu/harvard/iq/dataverse/FileMetadataFieldValue.java deleted file mode 100644 index 98d4bc02ab1..00000000000 --- a/src/main/java/edu/harvard/iq/dataverse/FileMetadataFieldValue.java +++ /dev/null @@ -1,164 +0,0 @@ -/* - Copyright (C) 2005-2012, by the President and Fellows of Harvard College. 
- - Licensed under the Apache License, Version 2.0 (the "License"); - you may not use this file except in compliance with the License. - You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. - - Dataverse Network - A web application to share, preserve and analyze research data. - Developed at the Institute for Quantitative Social Science, Harvard University. - Version 3.0. -*/ -/* - * FileMetadataField.java - * - * Taken virtually unchanged from DVN 3.*; - * Originally created in Feb. 2013 - * - */ -package edu.harvard.iq.dataverse; - -/** - * - * @author Leonid Andreev - */ - -import java.io.Serializable; -import javax.persistence.Column; -import javax.persistence.Entity; -import javax.persistence.GeneratedValue; -import javax.persistence.GenerationType; -import javax.persistence.Id; -import javax.persistence.JoinColumn; -import javax.persistence.ManyToOne; - - -@Entity -public class FileMetadataFieldValue implements Serializable { - - public FileMetadataFieldValue () { - } - - public FileMetadataFieldValue(FileMetadataField fmf, DataFile sf, String val) { - setFileMetadataField(fmf); - setStudyFile(sf); - setStrValue(val); - } - - - @Id - @GeneratedValue(strategy = GenerationType.IDENTITY) - private Long id; - - public Long getId() { - return id; - } - - public void setId(Long id) { - this.id = id; - } - - /** - * fileMetadataField, corresponding FileMetadataField - */ - @ManyToOne - @JoinColumn(nullable=false) - private FileMetadataField fileMetadataField; - - /** - * dataFile, corresponding StudyFile - */ - @ManyToOne - @JoinColumn(nullable=false) - private DataFile dataFile; - - - @Column(columnDefinition="TEXT") - private String strValue; - - private int displayOrder; - - - /** - * Getter and Setter methods: - */ - - - public FileMetadataField getFileMetadataField() { - return fileMetadataField; - } - - public void setFileMetadataField(FileMetadataField fileMetadataField) { - this.fileMetadataField=fileMetadataField; - } - - public DataFile getStudyFile() { - return dataFile; - } - - public void setStudyFile(DataFile dataFile) { - this.dataFile=dataFile; - } - - - - public String getStrValue() { - return strValue; - } - - public void setStrValue(String strValue) { - this.strValue = strValue; - - } - - public int getDisplayOrder() { return this.displayOrder;} - public void setDisplayOrder(int displayOrder) {this.displayOrder = displayOrder;} - - /** - * Class-specific method overrides: - */ - - @Override - public int hashCode() { - int hash = 0; - hash += (id != null ? 
id.hashCode() : 0); - return hash; - } - - @Override - public boolean equals(Object object) { - // TODO: Warning - this method won't work in the case the id fields are not set - if (!(object instanceof FileMetadataFieldValue)) { - return false; - } - FileMetadataFieldValue other = (FileMetadataFieldValue) object; - if ((this.id == null && other.id != null) || (this.id != null && !this.id.equals(other.id))) { - return false; - } - return true; - } - - @Override - public String toString() { - return "edu.harvard.iq.dataverse.FileMetadataFieldValue[ id=" + id + " ]"; - } - - /** - * Helper methods: - * - * @return - */ - public boolean isEmpty() { - return ((strValue==null || strValue.trim().equals(""))); - } - - -} diff --git a/src/main/java/edu/harvard/iq/dataverse/IndexServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/IndexServiceBean.java index 98ce11c8ae1..778bae848df 100644 --- a/src/main/java/edu/harvard/iq/dataverse/IndexServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/IndexServiceBean.java @@ -21,10 +21,14 @@ import javax.ejb.EJB; import javax.ejb.Stateless; import javax.inject.Named; +import org.apache.solr.client.solrj.SolrQuery; import org.apache.solr.client.solrj.SolrServer; import org.apache.solr.client.solrj.SolrServerException; import org.apache.solr.client.solrj.impl.HttpSolrServer; +import org.apache.solr.client.solrj.response.QueryResponse; import org.apache.solr.client.solrj.response.UpdateResponse; +import org.apache.solr.common.SolrDocument; +import org.apache.solr.common.SolrDocumentList; import org.apache.solr.common.SolrInputDocument; @Stateless @@ -236,7 +240,7 @@ public String indexDataset(Dataset dataset) { */ String solrIdPublishedStudy = "dataset_" + dataset.getId(); StringBuilder sb = new StringBuilder(); - sb.append("rationale:\n"); + sb.append("\nrationale:\n"); List versions = dataset.getVersions(); for (DatasetVersion datasetVersion : versions) { Long versionDatabaseId = datasetVersion.getId(); @@ -299,7 +303,12 @@ public String indexDataset(Dataset dataset) { return "indexDraftResult:" + indexDraftResult + ", " + sb.toString(); } } else { - sb.append("The latest version is not a working copy (latestVersionState: " + latestVersionState + ") and will be indexed as " + solrIdPublishedStudy + " (visible by anonymous) and we will be deleting " + solrIdDraftStudy + "\n"); + List solrDocIdsForDraftFilesToDelete = findSolrDocIdsForDraftFilesToDelete(dataset); + sb.append("The latest version is not a working copy (latestVersionState: " + latestVersionState + + ") and will be indexed as " + solrIdPublishedStudy + + " (visible by anonymous) and we will be deleting " + + solrIdDraftStudy + " and its files (if any, num:" + solrDocIdsForDraftFilesToDelete.size() + + "): " + solrDocIdsForDraftFilesToDelete + "\n"); if (releasedVersion != null) { IndexableDataset indexableReleasedVersion = new IndexableDataset(releasedVersion); String releasedVersionState = releasedVersion.getVersionState().name(); @@ -314,9 +323,17 @@ public String indexDataset(Dataset dataset) { * and will be (again) indexed as dataset_34 (visible by anonymous) */ logger.info(sb.toString()); - String deleteDraftVersionResult = removeDatasetDraftFromIndex(solrIdDraftStudy); + String deleteDraftDatasetVersionResult = removeDraftFromIndex(solrIdDraftStudy); + StringBuilder deleteDraftFilesResults = new StringBuilder(); + /** + * @todo remove draft files from Solr index + */ + for (String doomed : solrDocIdsForDraftFilesToDelete) { + String result = removeDraftFromIndex(doomed); + 
deleteDraftFilesResults.append(result); + } String indexReleasedVersionResult = addOrUpdateDataset(indexableReleasedVersion); - return "deleteDraftVersionResult: " + deleteDraftVersionResult + ", indexReleasedVersionResult:" + indexReleasedVersionResult + ", " + sb.toString(); + return "deleteDraftDatasetVersionResult: " + deleteDraftDatasetVersionResult + ", deleteDraftFilesResults: " + deleteDraftFilesResults.toString() + ", indexReleasedVersionResult:" + indexReleasedVersionResult + ", " + sb.toString(); } else { sb.append("We don't ever expect to ever get here. Why is there no released version if the latest version is not a working copy? The latestVersionState is " + latestVersionState + " and we don't know what to do with it. Nothing will be added or deleted from the index."); logger.info(sb.toString()); @@ -339,8 +356,8 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { } List dataversePaths = getDataversePathsFromSegments(dataverseSegments); SolrInputDocument solrInputDocument = new SolrInputDocument(); - String solrDocId = indexableDataset.getSolrDocId(); - solrInputDocument.addField(SearchFields.ID, solrDocId); + String datasetSolrDocId = indexableDataset.getSolrDocId(); + solrInputDocument.addField(SearchFields.ID, datasetSolrDocId); solrInputDocument.addField(SearchFields.ENTITY_ID, dataset.getId()); solrInputDocument.addField(SearchFields.TYPE, "datasets"); @@ -416,7 +433,8 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { if (dsf.getValues() != null && !dsf.getValues().isEmpty() && dsf.getValues().get(0) != null && solrFieldSearchable != null) { logger.info("indexing " + dsf.getDatasetFieldType().getName() + ":" + dsf.getValues() + " into " + solrFieldSearchable + " and maybe " + solrFieldFacetable); - if (dsfType.getSolrField().getSolrType().equals(SolrField.SolrType.INTEGER)) { +// if (dsfType.getSolrField().getSolrType().equals(SolrField.SolrType.INTEGER)) { + if (dsfType.getSolrField().getSolrType().equals(SolrField.SolrType.DATE)) { String dateAsString = dsf.getValues().get(0); logger.info("date as string: " + dateAsString); if (dateAsString != null && !dateAsString.isEmpty()) { @@ -431,9 +449,11 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { SimpleDateFormat yearOnly = new SimpleDateFormat("yyyy"); String datasetFieldFlaggedAsDate = yearOnly.format(dateAsDate); logger.info("YYYY only: " + datasetFieldFlaggedAsDate); - solrInputDocument.addField(solrFieldSearchable, Integer.parseInt(datasetFieldFlaggedAsDate)); +// solrInputDocument.addField(solrFieldSearchable, Integer.parseInt(datasetFieldFlaggedAsDate)); + solrInputDocument.addField(solrFieldSearchable, datasetFieldFlaggedAsDate); if (dsfType.getSolrField().isFacetable()) { - solrInputDocument.addField(solrFieldFacetable, Integer.parseInt(datasetFieldFlaggedAsDate)); +// solrInputDocument.addField(solrFieldFacetable, Integer.parseInt(datasetFieldFlaggedAsDate)); + solrInputDocument.addField(solrFieldFacetable, datasetFieldFlaggedAsDate); } } catch (Exception ex) { logger.info("unable to convert " + dateAsString + " into YYYY format and couldn't index it (" + dsfType.getName() + ")"); @@ -564,15 +584,12 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { docs.add(solrInputDocument); + List filesIndexed = new ArrayList<>(); if (datasetVersion != null) { List fileMetadatas = datasetVersion.getFileMetadatas(); for (FileMetadata fileMetadata : fileMetadatas) { SolrInputDocument datafileSolrInputDocument = new SolrInputDocument(); 
Long fileEntityId = fileMetadata.getDataFile().getId(); - /** - * @todo: should this sometimes end with "_draft" like datasets do? - */ - datafileSolrInputDocument.addField(SearchFields.ID, "datafile_" + fileEntityId); datafileSolrInputDocument.addField(SearchFields.ENTITY_ID, fileEntityId); datafileSolrInputDocument.addField(SearchFields.TYPE, "files"); @@ -603,6 +620,8 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { datafileSolrInputDocument.addField(SearchFields.NAME_SORT, filenameCompleteFinal); datafileSolrInputDocument.addField(SearchFields.FILE_NAME, filenameCompleteFinal); + datafileSolrInputDocument.addField(SearchFields.DATASET_VERSION_ID, datasetVersion.getId()); + /** * for rules on sorting files see * https://docs.google.com/a/harvard.edu/document/d/1DWsEqT8KfheKZmMB3n_VhJpl9nIxiUjai_AIQPAjiyA/edit?usp=sharing @@ -647,13 +666,19 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { if (majorVersionReleaseDate == null) { datafileSolrInputDocument.addField(SearchFields.PUBLICATION_STATUS, UNPUBLISHED_STRING); } + + String fileSolrDocId = "datafile_" + fileEntityId; if (indexableDataset.getDatasetState().equals(indexableDataset.getDatasetState().PUBLISHED)) { + fileSolrDocId = "datafile_" + fileEntityId; datafileSolrInputDocument.addField(SearchFields.PUBLICATION_STATUS, PUBLISHED_STRING); datafileSolrInputDocument.addField(SearchFields.PERMS, publicGroupString); addDatasetReleaseDateToSolrDoc(datafileSolrInputDocument, dataset); } else if (indexableDataset.getDatasetState().equals(indexableDataset.getDatasetState().WORKING_COPY)) { + fileSolrDocId = "datafile_" + fileEntityId + indexableDataset.getDatasetState().getSuffix(); datafileSolrInputDocument.addField(SearchFields.PUBLICATION_STATUS, DRAFT_STRING); } + datafileSolrInputDocument.addField(SearchFields.ID, fileSolrDocId); + filesIndexed.add(fileSolrDocId); if (creator != null) { datafileSolrInputDocument.addField(SearchFields.PERMS, groupPerUserPrefix + creator.getId()); @@ -740,25 +765,6 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { } } - // And if the file has indexable file-level metadata associated - // with it, we'll index that too: - - List fileMetadataFieldValues = fileMetadata.getDataFile().getFileMetadataFieldValues(); - if (fileMetadataFieldValues != null && fileMetadataFieldValues.size() > 0) { - for (int j = 0; j < fileMetadataFieldValues.size(); j++) { - - String fieldValue = fileMetadataFieldValues.get(j).getStrValue(); - - FileMetadataField fmf = fileMetadataFieldValues.get(j).getFileMetadataField(); - String fileMetadataFieldName = fmf.getName(); - String fileMetadataFieldFormatName = fmf.getFileFormatName(); - String fieldName = fileMetadataFieldFormatName + "-" + fileMetadataFieldName + "_s"; - - datafileSolrInputDocument.addField(fieldName, fieldValue); - - } - } - docs.add(datafileSolrInputDocument); } } @@ -779,7 +785,8 @@ private String addOrUpdateDataset(IndexableDataset indexableDataset) { return ex.toString(); } - return "indexed dataset " + dataset.getId() + " as " + solrDocId; // + ":" + dataset.getTitle(); +// return "indexed dataset " + dataset.getId() + " as " + solrDocId + "\nindexFilesResults for " + solrDocId + ":" + fileInfo.toString(); + return "indexed dataset " + dataset.getId() + " as " + datasetSolrDocId + "\nindexFilesResults for " + datasetSolrDocId + ", filesIndexed: " + filesIndexed; } public String indexGroup(Map.Entry group) { @@ -948,12 +955,12 @@ public String delete(Dataverse doomed) { return response; } 
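The hunk below renames removeDatasetDraftFromIndex to removeDraftFromIndex because the same deleteById call is now used to clean up both dataset and datafile draft cards. As a rough sketch of the publish-time cleanup this enables, condensing findSolrDocIdsForDraftFilesToDelete and removeDraftFromIndex into one loop (the hard-coded Solr URL mirrors the patch, which carries a @todo about making it configurable; the commit call is an assumption about the server's autocommit settings):

    // Find every draft file card filed under a dataset, then delete each by Solr id.
    private void deleteDraftFileCardsSketch(Long datasetId)
            throws SolrServerException, IOException {
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("parentid:" + datasetId);
        query.addFilterQuery(SearchFields.ID + ":*_draft"); // e.g. "datafile_42_draft"
        for (SolrDocument doc : solr.query(query).getResults()) {
            solr.deleteById((String) doc.getFieldValue(SearchFields.ID));
        }
        solr.commit(); // assumption: no autocommit configured
    }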
- public String removeDatasetDraftFromIndex(String doomed) { + public String removeDraftFromIndex(String doomed) { /** * @todo allow for configuration of hostname and port */ SolrServer server = new HttpSolrServer("http://localhost:8983/solr/"); - logger.info("deleting Solr document for dataset draft: " + doomed); + logger.info("deleting Solr document draft: " + doomed); UpdateResponse updateResponse; try { updateResponse = server.deleteById(doomed); @@ -965,7 +972,7 @@ public String removeDatasetDraftFromIndex(String doomed) { } catch (SolrServerException | IOException ex) { return ex.toString(); } - String response = "Successfully deleted dataset draft " + doomed + " from Solr index. updateReponse was: " + updateResponse.toString(); + String response = "Attempted to delete draft " + doomed + " from Solr index. updateResponse was: " + updateResponse.toString(); logger.info(response); return response; } @@ -980,4 +987,31 @@ public String convertToFriendlyDate(Date dateAsDate) { return friendlyDate; } + private List findSolrDocIdsForDraftFilesToDelete(Dataset datasetWithDraftFilesToDelete) { + Long datasetId = datasetWithDraftFilesToDelete.getId(); + SolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr"); + SolrQuery solrQuery = new SolrQuery(); + solrQuery.setQuery("parentid:" + datasetId); + /** + * @todo rather than hard coding "_draft" here, tie to + * IndexableDataset(new DatasetVersion()).getDatasetState().getSuffix() + */ +// String draftSuffix = new IndexableDataset(new DatasetVersion()).getDatasetState().WORKING_COPY.name(); + solrQuery.addFilterQuery(SearchFields.ID + ":" + "*_draft"); + List solrIdsOfFilesToDelete = new ArrayList<>(); + try { + QueryResponse queryResponse = solrServer.query(solrQuery); + SolrDocumentList results = queryResponse.getResults(); + for (SolrDocument solrDocument : results) { + String id = (String) solrDocument.getFieldValue(SearchFields.ID); + if (id != null) { + solrIdsOfFilesToDelete.add(id); + } + } + } catch (SolrServerException ex) { + logger.info("error in findSolrDocIdsForDraftFilesToDelete method: " + ex.toString()); + } + return solrIdsOfFilesToDelete; + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/SearchIncludeFragment.java b/src/main/java/edu/harvard/iq/dataverse/SearchIncludeFragment.java index a8e83ba89dc..50f22293030 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SearchIncludeFragment.java +++ b/src/main/java/edu/harvard/iq/dataverse/SearchIncludeFragment.java @@ -95,6 +95,7 @@ public class SearchIncludeFragment { // private boolean showUnpublished; List filterQueriesDebug = new ArrayList<>(); // private Map friendlyName = new HashMap<>(); + private String errorFromSolr; /** * @todo: */ @@ -288,6 +289,7 @@ public void search(boolean onlyDataRelatedToMe) { this.datasetfieldFriendlyNamesBySolrField = solrQueryResponse.getDatasetfieldFriendlyNamesBySolrField(); this.staticSolrFieldFriendlyNamesBySolrField = solrQueryResponse.getStaticSolrFieldFriendlyNamesBySolrField(); this.filterQueriesDebug = solrQueryResponse.getFilterQueriesActual(); + this.errorFromSolr = solrQueryResponse.getError(); paginationGuiStart = paginationStart + 1; paginationGuiEnd = Math.min(page * paginationGuiRows,searchResultsCount); List searchResults = solrQueryResponse.getSolrSearchResults(); @@ -904,4 +906,13 @@ public String getNewSelectedTypes(String typeClicked) { return combine(arr, ":"); } + + public String getErrorFromSolr() { + return errorFromSolr; + } + + public void setErrorFromSolr(String errorFromSolr) { +
this.errorFromSolr = errorFromSolr; + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/SearchServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/SearchServiceBean.java index 74c3e699b35..ee62bfd34f8 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SearchServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/SearchServiceBean.java @@ -25,6 +25,7 @@ import org.apache.solr.client.solrj.SolrServer; import org.apache.solr.client.solrj.SolrServerException; import org.apache.solr.client.solrj.impl.HttpSolrServer; +import org.apache.solr.client.solrj.impl.HttpSolrServer.RemoteSolrException; import org.apache.solr.client.solrj.response.FacetField; import org.apache.solr.client.solrj.response.QueryResponse; import org.apache.solr.client.solrj.response.RangeFacet; @@ -238,8 +239,31 @@ public SolrQueryResponse search(DataverseUser dataverseUser, Dataverse dataverse QueryResponse queryResponse; try { queryResponse = solrServer.query(solrQuery); - } catch (SolrServerException ex) { - throw new RuntimeException("Is the Solr server down?"); + } catch (SolrServerException | RemoteSolrException ex) { + String error = "Bloops!"; + if (ex instanceof SolrServerException) { + error = "Internal error: " + ex.getLocalizedMessage(); + } else if (ex instanceof RemoteSolrException) { + error = "Trouble parsing query? " + ex.getLocalizedMessage(); + } + + logger.info(error + " " + ex.getLocalizedMessage()); + SolrQueryResponse exceptionSolrQueryResponse = new SolrQueryResponse(); + exceptionSolrQueryResponse.setError(error); + + long zeroNumResultsFound = 0; + long zeroGetResultsStart = 0; + List emptySolrSearchResults = new ArrayList<>(); + List exceptionFacetCategoryList = new ArrayList<>(); + Map> emptySpellingSuggestion = new HashMap<>(); + exceptionSolrQueryResponse.setNumResultsFound(zeroNumResultsFound); + exceptionSolrQueryResponse.setResultsStart(zeroGetResultsStart); + exceptionSolrQueryResponse.setSolrSearchResults(emptySolrSearchResults); + exceptionSolrQueryResponse.setFacetCategoryList(exceptionFacetCategoryList); + exceptionSolrQueryResponse.setTypeFacetCategories(exceptionFacetCategoryList); + exceptionSolrQueryResponse.setSpellingSuggestionsByToken(emptySpellingSuggestion); + + return exceptionSolrQueryResponse; } SolrDocumentList docs = queryResponse.getResults(); Iterator iter = docs.iterator(); @@ -351,6 +375,7 @@ public SolrQueryResponse search(DataverseUser dataverseUser, Dataverse dataverse } else if (type.equals("files")) { solrSearchResult.setName(name); solrSearchResult.setFiletype(filetype); + solrSearchResult.setDatasetVersionId(datasetVersionId); } /** * @todo store PARENT_ID as a long instead and cast as such diff --git a/src/main/java/edu/harvard/iq/dataverse/SolrField.java b/src/main/java/edu/harvard/iq/dataverse/SolrField.java index 1940e8b37a9..2cce2ea9a93 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SolrField.java +++ b/src/main/java/edu/harvard/iq/dataverse/SolrField.java @@ -51,7 +51,7 @@ public enum SolrType { * non-English languages? 
We changed it to text_en to improve English * language searching in https://redmine.hmdc.harvard.edu/issues/3859 */ - STRING("string"), TEXT_EN("text_en"), INTEGER("int"), LONG("long"); + STRING("string"), TEXT_EN("text_en"), INTEGER("int"), LONG("long"), DATE("text_en"); private String type; diff --git a/src/main/java/edu/harvard/iq/dataverse/SolrQueryResponse.java b/src/main/java/edu/harvard/iq/dataverse/SolrQueryResponse.java index f2a3c9eefdf..946d8e18cac 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SolrQueryResponse.java +++ b/src/main/java/edu/harvard/iq/dataverse/SolrQueryResponse.java @@ -19,6 +19,7 @@ public class SolrQueryResponse { Map datasetfieldFriendlyNamesBySolrField = new HashMap<>(); private Map staticSolrFieldFriendlyNamesBySolrField; private List filterQueriesActual = new ArrayList(); + private String error; public List getSolrSearchResults() { return solrSearchResults; @@ -92,4 +93,12 @@ public void setFilterQueriesActual(List filterQueriesActual) { this.filterQueriesActual = filterQueriesActual; } + public String getError() { + return error; + } + + public void setError(String error) { + this.error = error; + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/SolrSearchResult.java b/src/main/java/edu/harvard/iq/dataverse/SolrSearchResult.java index 89f313e5d32..5531053ae2a 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SolrSearchResult.java +++ b/src/main/java/edu/harvard/iq/dataverse/SolrSearchResult.java @@ -412,4 +412,12 @@ public void setDatasetVersionId(long datasetVersionId) { this.datasetVersionId = datasetVersionId; } + public String getDatasetUrl() { + return "/dataset.xhtml?id=" + entityId + "&versionId=" + datasetVersionId; + } + + public String getFileUrl() { + return "/dataset.xhtml?id=" + parent.get(SearchFields.ID) + "&versionId=" + datasetVersionId; + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/UserNotification.java b/src/main/java/edu/harvard/iq/dataverse/UserNotification.java index 8ecd9760b22..6c6717af6dc 100644 --- a/src/main/java/edu/harvard/iq/dataverse/UserNotification.java +++ b/src/main/java/edu/harvard/iq/dataverse/UserNotification.java @@ -22,7 +22,7 @@ @Entity public class UserNotification implements Serializable { public enum Type { - CREATEDV, CREATEDS + CREATEDV, CREATEDS, CREATEACC }; private static final long serialVersionUID = 1L; diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Config.java b/src/main/java/edu/harvard/iq/dataverse/api/Config.java index bdbff72a4c7..b627f5489a8 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Config.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Config.java @@ -58,8 +58,8 @@ public String getSolrSchema() { String nameFacetable = datasetField.getSolrField().getNameFacetable(); if (listOfStaticFields.contains(nameSearchable)) { - if (nameSearchable.equals(SearchFields.DESCRIPTION)) { - // Skip, known conflct. We are merging these fields together across types. + if (nameSearchable.equals(SearchFields.DATASET_DESCRIPTION)) { + // Skip, expected conflict.
} else { return error("searchable dataset metadata field conflict detected with static field: " + nameSearchable); } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java b/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java index 8684997b5d3..86837993673 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java @@ -41,11 +41,13 @@ import javax.json.stream.JsonParsingException; import javax.validation.ConstraintViolation; import javax.validation.ConstraintViolationException; +import javax.ws.rs.Consumes; import javax.ws.rs.DELETE; import javax.ws.rs.POST; import javax.ws.rs.PathParam; import javax.ws.rs.Produces; import javax.ws.rs.QueryParam; +import javax.ws.rs.core.MediaType; import javax.ws.rs.core.Response; import javax.ws.rs.core.Response.Status; @@ -58,7 +60,6 @@ public class Dataverses extends AbstractApiBean { private static final Logger logger = Logger.getLogger(Dataverses.class.getName()); - @GET public String list() { JsonArrayBuilder bld = Json.createArrayBuilder(); @@ -115,9 +116,8 @@ public String addDataverse( Dataverse d, @PathParam("identifier") String parentI } @POST - @Path("{identifier}/datasets/") - @Produces("application/json") - public Response createDataset( @PathParam("identifier") String parentIdtf, String jsonBody, @QueryParam("key") String apiKey ) { + @Path("{identifier}/datasets") + public Response createDataset( String jsonBody, @PathParam("identifier") String parentIdtf, @QueryParam("key") String apiKey ) { DataverseUser u = userSvc.findByUserName(apiKey); if ( u == null ) return errorResponse( Response.Status.UNAUTHORIZED, "Invalid apikey '" + apiKey + "'"); @@ -130,6 +130,7 @@ public Response createDataset( @PathParam("identifier") String parentIdtf, Strin try ( StringReader rdr = new StringReader(jsonBody) ) { json = Json.createReader(rdr).readObject(); } catch ( JsonParsingException jpe ) { + logger.log(Level.SEVERE, "Json: " + jsonBody); return errorResponse( Status.BAD_REQUEST, "Error parsing Json: " + jpe.getMessage() ); } @@ -240,7 +241,7 @@ public String listMetadataBlocks( @PathParam("identifier") String dvIdtf, @Query @POST @Path("{identifier}/metadatablocks") - @Produces("application/json") + @Produces(MediaType.APPLICATION_JSON) public Response setMetadataBlocks( @PathParam("identifier")String dvIdtf, @QueryParam("key") String apiKey, String blockIds ) { DataverseUser u = userSvc.findByUserName(apiKey); if ( u == null ) return badApiKey(apiKey); @@ -449,9 +450,6 @@ public String deleteAssignment( @PathParam("id") long assignmentId, @PathParam(" } } - // CONTPOINT add a POST method for datasets here, and a POST method for dataset versions in the dataset part. 
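Against the reworked createDataset endpoint above, a client call would look roughly like the following sketch (the "/api/dataverses" prefix is an assumption, since the class-level @Path and servlet mapping are outside this hunk; apiKey and datasetJson are placeholder values, and the usual java.net, java.io and java.nio.charset imports are assumed):

    // POST a dataset JSON document into the "root" dataverse.
    URL url = new URL("http://localhost:8080/api/dataverses/root/datasets?key=" + apiKey);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/json");
    try (OutputStream out = conn.getOutputStream()) {
        out.write(datasetJson.getBytes(StandardCharsets.UTF_8)); // arrives as jsonBody
    }
    int status = conn.getResponseCode(); // 400 with "Error parsing Json: ..." on malformed input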
- - @GET @Path(":gv") public String toGraphviz() { diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Search.java b/src/main/java/edu/harvard/iq/dataverse/api/Search.java index bd4221859c1..8c41b0dd473 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Search.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Search.java @@ -21,6 +21,8 @@ import javax.ws.rs.GET; import javax.ws.rs.Path; import javax.ws.rs.QueryParam; +import org.apache.solr.client.solrj.SolrServerException; +import org.apache.solr.client.solrj.impl.HttpSolrServer.RemoteSolrException; @Path("search") public class Search extends AbstractApiBean { @@ -134,6 +136,9 @@ public String search(@QueryParam("key") String apiKey, value.add("spelling_alternatives", spelling_alternatives); value.add("facets", facets); } + if (solrQueryResponse.getError() != null) { + value.add("error", solrQueryResponse.getError()); + } return Util.jsonObject2prettyString(value.build()); } else { /** diff --git a/src/main/java/edu/harvard/iq/dataverse/ingest/IngestServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/ingest/IngestServiceBean.java index 3db2fad333d..8d13541fb8d 100644 --- a/src/main/java/edu/harvard/iq/dataverse/ingest/IngestServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/ingest/IngestServiceBean.java @@ -32,8 +32,6 @@ import edu.harvard.iq.dataverse.DatasetPage; import edu.harvard.iq.dataverse.DatasetVersion; import edu.harvard.iq.dataverse.FileMetadata; -import edu.harvard.iq.dataverse.FileMetadataField; -import edu.harvard.iq.dataverse.FileMetadataFieldValue; import edu.harvard.iq.dataverse.MetadataBlock; import edu.harvard.iq.dataverse.dataaccess.ImageThumbConverter; import edu.harvard.iq.dataverse.dataaccess.TabularSubsetGenerator; @@ -207,6 +205,14 @@ public boolean ingestAsTabular(DataFile dataFile) throws IOException { public boolean ingestAsTabular(String tempFileLocation, DataFile dataFile) throws IOException { boolean ingestSuccessful = false; + PushContext pushContext = PushContextFactory.getDefault().getPushContext(); + if (pushContext != null) { + Logger.getLogger(DatasetPage.class.getName()).log(Level.FINE, "Ingest: Obtained push context " + + pushContext.toString()); + } else { + Logger.getLogger(DatasetPage.class.getName()).log(Level.SEVERE, "Warning! 
Could not obtain push context."); + } + // Locate ingest plugin for the file format by looking // it up with the Ingest Service Provider Registry: //TabularDataFileReader ingestPlugin = IngestSP.getTabDataReaderByMIMEType(dFile.getContentType()); @@ -215,6 +221,11 @@ public boolean ingestAsTabular(String tempFileLocation, DataFile dataFile) throw TabularDataFileReader ingestPlugin = getTabDataReaderByMimeType(dataFile); if (ingestPlugin == null) { + dataFile.SetIngestProblem(); + dataFile = fileService.save(dataFile); + FacesMessage facesMessage = new FacesMessage("ingest failed"); + pushContext.push("/ingest"+dataFile.getOwner().getId(), facesMessage); + Logger.getLogger(DatasetPage.class.getName()).log(Level.INFO, "Ingest failure: Sent push notification to the page."); throw new IOException("Could not find ingest plugin for the file " + fileName); } @@ -223,6 +234,11 @@ public boolean ingestAsTabular(String tempFileLocation, DataFile dataFile) throw try { tempFileInputStream = new FileInputStream(new File(tempFileLocation)); } catch (FileNotFoundException notfoundEx) { + dataFile.SetIngestProblem(); + dataFile = fileService.save(dataFile); + FacesMessage facesMessage = new FacesMessage("ingest failed"); + pushContext.push("/ingest"+dataFile.getOwner().getId(), facesMessage); + Logger.getLogger(DatasetPage.class.getName()).log(Level.INFO, "Ingest failure: Sent push notification to the page."); throw new IOException("Could not open temp file "+tempFileLocation); } @@ -266,27 +282,21 @@ public boolean ingestAsTabular(String tempFileLocation, DataFile dataFile) throw try { produceSummaryStatistics(dataFile); } catch (IOException sumStatEx) { + dataFile.SetIngestProblem(); + dataFile = fileService.save(dataFile); + FacesMessage facesMessage = new FacesMessage("ingest failed"); + pushContext.push("/ingest"+dataFile.getOwner().getId(), facesMessage); + Logger.getLogger(DatasetPage.class.getName()).log(Level.INFO, "Ingest failure: Sent push notification to the page."); throw new IOException ("Ingest: failed to calculate summary statistics. "+sumStatEx.getMessage()); } - ingestSuccessful = true; - PushContext pushContext = PushContextFactory.getDefault().getPushContext(); - if (pushContext != null ) { - Logger.getLogger(DatasetPage.class.getName()).log(Level.FINE, "Ingest: Obtained push context " - + pushContext.toString()); - } else { - Logger.getLogger(DatasetPage.class.getName()).log(Level.SEVERE, "Warning! 
Could not obtain push context."); - } - - - - - FacesMessage facesMessage = new FacesMessage("ingest done"); - pushContext.push("/ingest"+dataFile.getOwner().getId(), facesMessage); - Logger.getLogger(DatasetPage.class.getName()).log(Level.INFO, "Ingest: Sent push notification to the page."); - + ingestSuccessful = true; } } + + FacesMessage facesMessage = new FacesMessage("ingest done"); + pushContext.push("/ingest"+dataFile.getOwner().getId(), facesMessage); + Logger.getLogger(DatasetPage.class.getName()).log(Level.INFO, "Ingest: Sent push notification to the page."); return ingestSuccessful; } @@ -396,10 +406,10 @@ public boolean extractIndexableMetadata(String tempFileLocation, DataFile dataFi logger.fine("Ingest Service: Processing extracted metadata;"); if (extractedMetadata.getMetadataBlockName() != null) { logger.fine("Ingest Service: This metadata belongs to the "+extractedMetadata.getMetadataBlockName()+" metadata block."); - ingestDatasetMetadata(extractedMetadata, editVersion); + processDatasetMetadata(extractedMetadata, editVersion); } - ingestFileLevelMetadata(extractedMetadata, dataFile, fileMetadata, extractorPlugin.getFormatName()); + processFileLevelMetadata(extractedMetadata, fileMetadata); } @@ -409,7 +419,7 @@ public boolean extractIndexableMetadata(String tempFileLocation, DataFile dataFi } - private void ingestDatasetMetadata(FileMetadataIngest fileMetadataIngest, DatasetVersion editVersion) throws IOException { + private void processDatasetMetadata(FileMetadataIngest fileMetadataIngest, DatasetVersion editVersion) throws IOException { for (MetadataBlock mdb : editVersion.getDataset().getOwner().getMetadataBlocks()) { @@ -511,15 +521,18 @@ private void ingestDatasetMetadata(FileMetadataIngest fileMetadataIngest, Datase } - private void ingestFileLevelMetadata(FileMetadataIngest fileLevelMetadata, DataFile dataFile, FileMetadata fileMetadata, String fileFormatName) { - // First, add the "metadata summary" generated by the file reader/ingester - // to the fileMetadata object, as the "description": + private void processFileLevelMetadata(FileMetadataIngest fileLevelMetadata, FileMetadata fileMetadata) { + // The only type of metadata that ingest plugins can extract from ingested + // files (as of 4.0 beta) that *stay* on the file-level is the automatically + // generated "metadata summary" note. We attach it to the "description" + // field of the fileMetadata object. -- L.A. + String metadataSummary = fileLevelMetadata.getMetadataSummary(); if (metadataSummary != null) { if (!metadataSummary.equals("")) { - // The AddFiles page allows a user to enter file description + // The file upload page allows a user to enter file description // on ingest. We don't want to overwrite whatever they may - // have entered. Rather, we'll append our metadata summary + // have entered. Rather, we'll append this generated metadata summary // to the existing value. 
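+ // For example (illustrative values): if the user typed "Raw exposures, night 3"
+ // and the plugin generated "7 HDUs total: 6 Table HDU(s); 1 Image HDU;", the
+ // saved description keeps the user's text first, with the generated summary
+ // appended after it; the actual concatenation happens further down, outside
+ // this hunk.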
String userEnteredFileDescription = fileMetadata.getDescription(); if (userEnteredFileDescription != null @@ -530,88 +543,30 @@ private void ingestFileLevelMetadata(FileMetadataIngest fileLevelMetadata, DataF fileMetadata.setDescription(metadataSummary); } } - - Map> fileMetadataMap = fileLevelMetadata.getMetadataMap(); - - // And now we can go through the remaining key/value pairs in the - // metadata maps and process the metadata elements that were found in the - // file: - for (String mKey : fileMetadataMap.keySet()) { - - Set mValues = fileMetadataMap.get(mKey); - - Logger.getLogger(DatasetPage.class.getName()).log(Level.FINE, "Looking up file meta field " + mKey + ", file format " + fileFormatName); - FileMetadataField fileMetaField = fieldService.findFileMetadataFieldByNameAndFormat(mKey, fileFormatName); - - if (fileMetaField == null) { - //fileMetaField = studyFieldService.createFileMetadataField(mKey, fileFormatName); - fileMetaField = new FileMetadataField(); - - if (fileMetaField == null) { - Logger.getLogger(DatasetPage.class.getName()).log(Level.WARNING, "Failed to create a new File Metadata Field; skipping."); - continue; - } - - fileMetaField.setName(mKey); - fileMetaField.setFileFormatName(fileFormatName); - // TODO: provide meaningful descriptions and labels: - fileMetaField.setDescription(mKey); - fileMetaField.setTitle(mKey); - - try { - fieldService.saveFileMetadataField(fileMetaField); - } catch (Exception ex) { - Logger.getLogger(DatasetPage.class.getName()).log(Level.WARNING, "Failed to save new file metadata field (" + mKey + "); skipping values."); - continue; - } - - Logger.getLogger(DatasetPage.class.getName()).log(Level.FINE, "Created file meta field " + mKey); - } - - String fieldValueText = null; - - if (mValues != null) { - for (String mValue : mValues) { - if (mValue != null) { - if (fieldValueText == null) { - fieldValueText = mValue; - } else { - fieldValueText = fieldValueText.concat(" ".concat(mValue)); - } - } - } - } - - FileMetadataFieldValue fileMetaFieldValue = null; - - if (!"".equals(fieldValueText)) { - Logger.getLogger(DatasetPage.class.getName()).log(Level.FINE, "Attempting to create a file meta value for study file " + dataFile.getId() + ", value " + fieldValueText); - if (dataFile != null) { - fileMetaFieldValue - = new FileMetadataFieldValue(fileMetaField, dataFile, fieldValueText); - } - } - if (fileMetaFieldValue == null) { - Logger.getLogger(DatasetPage.class.getName()).log(Level.WARNING, "Failed to create a new File Metadata Field value; skipping"); - continue; - } else { - if (dataFile.getFileMetadataFieldValues() == null) { - dataFile.setFileMetadataFieldValues(new ArrayList()); - } - dataFile.getFileMetadataFieldValues().add(fileMetaFieldValue); - } - } } public void performPostProcessingTasks(DataFile dataFile) { /* - * At this point (4.0 alpha 1) the only ingest "post-processing task" performed + * At this point (4.0 beta) the only ingest "post-processing task" performed * is pre-generation of image thumbnails in a couple of popular sizes. * -- L.A. */ - if (dataFile != null && dataFile.isImage()) { - ImageThumbConverter.generateImageThumb(dataFile.getFileSystemLocation().toString(), ImageThumbConverter.DEFAULT_THUMBNAIL_SIZE); - ImageThumbConverter.generateImageThumb(dataFile.getFileSystemLocation().toString(), ImageThumbConverter.DEFAULT_PREVIEW_SIZE); + if (dataFile != null) { + // These separate methods for generating thumbnails, for PDF files and + // and for regular images, will eventually go away. 
We'll have a unified + // system of generating "previews" for datafiles of all kinds; the + // differentiation between different types of content and different + // methods for generating these previews will be hidden inside that + // subsystem (could be as simple as a type-specific icon, or even a + // special "content unknown" icon, for some types of files). + // -- L.A. 4.0 beta + if ("application/pdf".equalsIgnoreCase(dataFile.getContentType())) { + ImageThumbConverter.generatePDFThumb(dataFile.getFileSystemLocation().toString(), ImageThumbConverter.DEFAULT_THUMBNAIL_SIZE); + ImageThumbConverter.generatePDFThumb(dataFile.getFileSystemLocation().toString(), ImageThumbConverter.DEFAULT_PREVIEW_SIZE); + } else if (dataFile.isImage()) { + ImageThumbConverter.generateImageThumb(dataFile.getFileSystemLocation().toString(), ImageThumbConverter.DEFAULT_THUMBNAIL_SIZE); + ImageThumbConverter.generateImageThumb(dataFile.getFileSystemLocation().toString(), ImageThumbConverter.DEFAULT_PREVIEW_SIZE); + } } } diff --git a/src/main/java/edu/harvard/iq/dataverse/ingest/metadataextraction/impl/plugins/fits/FITSFileMetadataExtractor.java b/src/main/java/edu/harvard/iq/dataverse/ingest/metadataextraction/impl/plugins/fits/FITSFileMetadataExtractor.java index f821685b405..87cd0d509d6 100644 --- a/src/main/java/edu/harvard/iq/dataverse/ingest/metadataextraction/impl/plugins/fits/FITSFileMetadataExtractor.java +++ b/src/main/java/edu/harvard/iq/dataverse/ingest/metadataextraction/impl/plugins/fits/FITSFileMetadataExtractor.java @@ -15,12 +15,16 @@ import java.io.InputStreamReader; import java.io.IOException; import java.io.File; +import java.text.ParseException; +import java.text.ParsePosition; +import java.text.SimpleDateFormat; import java.util.Map; import java.util.HashMap; import java.util.Set; import java.util.HashSet; import java.util.List; import java.util.ArrayList; +import java.util.Date; import java.util.Properties; import java.util.logging.Logger; import nom.tam.fits.BasicHDU; @@ -69,65 +73,116 @@ public class FITSFileMetadataExtractor extends FileMetadataExtractor { // This map defines the names of the keys under which they will be indexed // and made searchable in the application - private static final String CONFIG_TOKEN_META_KEY = "RECOGNIZED_META_KEY"; private static final String CONFIG_TOKEN_COLUMN_KEY = "RECOGNIZED_COLUMN_KEY"; private static final String ASTROPHYSICS_BLOCK_NAME = "astrophysics"; + private static final int FIELD_TYPE_TEXT = 0; + private static final int FIELD_TYPE_DATE = 1; + private static final int FIELD_TYPE_FLOAT = 2; + + private static final String ATTRIBUTE_TYPE = "astroType"; + private static final String ATTRIBUTE_FACILITY = "astroFacility"; + private static final String ATTRIBUTE_INSTRUMENT = "astroInstrument"; + private static final String ATTRIBUTE_START_TIME = "coverage.Temporal.StartTime"; + private static final String ATTRIBUTE_STOP_TIME = "coverage.Temporal.StopTime"; + + static { dbgLog.fine("FITS plugin: loading the default configuration values;"); - defaultRecognizedFitsMetadataKeys.put("DATE", 0); - //defaultRecognizedFitsMetadataKeys.put("DATE-OBS", 0); - defaultRecognizedFitsMetadataKeys.put("ORIGIN", 0); - defaultRecognizedFitsMetadataKeys.put("AUTHOR", 0); - defaultRecognizedFitsMetadataKeys.put("REFERENC", 0); - defaultRecognizedFitsMetadataKeys.put("COMMENT", 0); - defaultRecognizedFitsMetadataKeys.put("HISTORY", 0); - defaultRecognizedFitsMetadataKeys.put("OBSERVER", 0); + + // The following fields have been dropped from the configuration + // 
map, not because we are not interested in them anymore - but + // because they are now *mandatory*, i.e. non-configurable. + // We will be checking for the "telescope" and "instrument" + // fields on all files and HDUs: + // -- 4.0 beta + //defaultRecognizedFitsMetadataKeys.put("TELESCOP", 0); //defaultRecognizedFitsMetadataKeys.put("INSTRUME", 0); - defaultRecognizedFitsMetadataKeys.put("EQUINOX", 0); - defaultRecognizedFitsMetadataKeys.put("EXTNAME", 0); - defaultRecognizedFitsColumnKeys.put("TTYPE", 1); + //defaultRecognizedFitsMetadataKeys.put("NAXIS", 0); + //defaultRecognizedFitsMetadataKeys.put("DATE-OBS", FIELD_TYPE_DATE); + // both coverage.Temporal.StartTime and .EndTime are derived from + // the DATE-OBS values; extra rules apply (coded further down) + //defaultIndexableFitsMetaKeys.put("DATE-OBS", "coverage.Temporal.StartTime"); + //defaultIndexableFitsMetaKeys.put("DATE-OBS", "coverage.Temporal.StopTime"); + + + //defaultIndexableFitsMetaKeys.put("NAXIS", "naxis"); + + + // Optional, configurable fields: + + defaultRecognizedFitsMetadataKeys.put("FILTER", FIELD_TYPE_TEXT); + defaultRecognizedFitsMetadataKeys.put("OBJECT", FIELD_TYPE_TEXT); + defaultRecognizedFitsMetadataKeys.put("CD1_1", FIELD_TYPE_FLOAT); + defaultRecognizedFitsMetadataKeys.put("CDELT", FIELD_TYPE_FLOAT); + defaultRecognizedFitsMetadataKeys.put("EXPTIME", FIELD_TYPE_DATE); + defaultRecognizedFitsMetadataKeys.put("CRVAL1", FIELD_TYPE_TEXT); + defaultRecognizedFitsMetadataKeys.put("CRVAL2", FIELD_TYPE_TEXT); + + + // And the mapping to the corresponding values in the + // metadata block: + // (per 4.0 beta implementation, the names below must match + // the names of the fields in the corresponding metadata block!) + + defaultIndexableFitsMetaKeys.put("TELESCOP", ATTRIBUTE_FACILITY); + defaultIndexableFitsMetaKeys.put("INSTRUME", ATTRIBUTE_INSTRUMENT); + defaultIndexableFitsMetaKeys.put("FILTER", "coverage.Spectral.Bandpass"); + defaultIndexableFitsMetaKeys.put("OBJECT", "astroObject"); + defaultIndexableFitsMetaKeys.put("CD1_1", "resolution.Spatial"); + defaultIndexableFitsMetaKeys.put("CDELT", "resolution.Spatial"); + defaultIndexableFitsMetaKeys.put("EXPTIME", "resolution.Temporal"); + defaultIndexableFitsMetaKeys.put("CDELT", "resolution.Spatial"); + defaultIndexableFitsMetaKeys.put("CRVAL1", "coverage.Spatial"); + defaultIndexableFitsMetaKeys.put("CRVAL2", "coverage.Spatial"); + + + + // The following fields have been dropped from the configuration + // in 4.0 beta because we are not interested in them + // any longer: + + //defaultRecognizedFitsMetadataKeys.put("EQUINOX", 0); + //defaultIndexableFitsMetaKeys.put("EQUINOX", "Equinox"); + + //defaultRecognizedFitsMetadataKeys.put("DATE", 0); + //defaultRecognizedFitsMetadataKeys.put("ORIGIN", 0); + //defaultRecognizedFitsMetadataKeys.put("AUTHOR", 0); + //defaultRecognizedFitsMetadataKeys.put("REFERENC", 0); + //defaultRecognizedFitsMetadataKeys.put("COMMENT", 0); + //defaultRecognizedFitsMetadataKeys.put("HISTORY", 0); + //defaultRecognizedFitsMetadataKeys.put("OBSERVER", 0); + //defaultRecognizedFitsMetadataKeys.put("EXTNAME", 0); + //defaultRecognizedFitsColumnKeys.put("TTYPE", 1); //defaultRecognizedFitsColumnKeys.put("TCOMM", 0); //defaultRecognizedFitsColumnKeys.put("TUCD", 0); - defaultRecognizedFitsMetadataKeys.put("FILTER", 0); - defaultRecognizedFitsMetadataKeys.put("OBJECT", 0); - defaultRecognizedFitsMetadataKeys.put("NAXIS", 0); - defaultRecognizedFitsMetadataKeys.put("CD1_1", 0); - defaultRecognizedFitsMetadataKeys.put("CDELT", 0); - 
defaultRecognizedFitsMetadataKeys.put("CUNIT", 0); + //defaultRecognizedFitsMetadataKeys.put("CUNIT", 0); - defaultIndexableFitsMetaKeys.put("DATE", "Date"); - //defaultIndexableFitsMetaKeys.put("DATE-OBS", "coverage.Temporal.StartTime"); - defaultIndexableFitsMetaKeys.put("ORIGIN", "Origin"); - defaultIndexableFitsMetaKeys.put("AUTHOR", "Author"); - defaultIndexableFitsMetaKeys.put("REFERENC", "Reference"); - defaultIndexableFitsMetaKeys.put("COMMENT", "Comment"); - defaultIndexableFitsMetaKeys.put("HISTORY", "History"); + //defaultIndexableFitsMetaKeys.put("DATE", "Date"); + //defaultIndexableFitsMetaKeys.put("ORIGIN", "Origin"); + //defaultIndexableFitsMetaKeys.put("AUTHOR", "Author"); + //defaultIndexableFitsMetaKeys.put("REFERENC", "Reference"); + //defaultIndexableFitsMetaKeys.put("COMMENT", "Comment"); + //defaultIndexableFitsMetaKeys.put("HISTORY", "History"); //defaultIndexableFitsMetaKeys.put("OBSERVER", "Observer"); - //defaultIndexableFitsMetaKeys.put("TELESCOP", "Telescope"); - defaultIndexableFitsMetaKeys.put("INSTRUME", "Instrument"); - defaultIndexableFitsMetaKeys.put("EQUINOX", "Equinox"); - defaultIndexableFitsMetaKeys.put("EXTNAME", "Extension-Name"); - defaultIndexableFitsMetaKeys.put("TTYPE", "Column-Label"); + //defaultIndexableFitsMetaKeys.put("EXTNAME", "Extension-Name"); + //defaultIndexableFitsMetaKeys.put("TTYPE", "Column-Label"); //defaultIndexableFitsMetaKeys.put("TCOMM", "Column-Comment"); //defaultIndexableFitsMetaKeys.put("TUCD", "Column-UCD"); - defaultIndexableFitsMetaKeys.put("FILTER", "coverage.Spectral.Bandpass"); - defaultIndexableFitsMetaKeys.put("OBJECT", "object"); - defaultIndexableFitsMetaKeys.put("NAXIS", "naxis"); - defaultIndexableFitsMetaKeys.put("CD1_1", "cd1_1"); - defaultIndexableFitsMetaKeys.put("CUNIT", "cunit"); + //defaultIndexableFitsMetaKeys.put("CUNIT", "cunit"); } - private static final String METADATA_SUMMARY = "FILE_METADATA_SUMMARY_INFO"; - private static final String OPTION_PREFIX_SEARCHABLE = "PREFIXSEARCH"; + //private static final String METADATA_SUMMARY = "FILE_METADATA_SUMMARY_INFO"; + //private static final String OPTION_PREFIX_SEARCHABLE = "PREFIXSEARCH"; + private static final String HDU_TYPE_IMAGE = "Image"; private static final String HDU_TYPE_IMAGE_CUBE = "Cube"; @@ -141,11 +196,13 @@ public class FITSFileMetadataExtractor extends FileMetadataExtractor { private static final String FILE_TYPE_TABLE = "Table"; private static final String FILE_TYPE_SPECTRUM = "Spectrum"; + // Recognized date formats, for extracting temporal values: - private static final String ATTRIBUTE_FACILITY = "facility"; - private static final String ATTRIBUTE_INSTRUMENT = "instrument"; - private static final String ATTRIBUTE_START_TIME = "coverage.Temporal.StartTime"; - private static final String ATTRIBUTE_STOP_TIME = "coverage.Temporal.StopTime"; + private static SimpleDateFormat[] DATE_FORMATS = new SimpleDateFormat[] { + new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss"), + new SimpleDateFormat("yyyy-MM-dd") + //new SimpleDateFormat("yyyy") + }; /** * Constructs a FITSFileMetadataExtractor instance with a @@ -164,7 +221,8 @@ public FITSFileMetadataExtractor() { public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException{ dbgLog.fine("Attempting to read FITS file;"); - Map> fitsMetaMap = new HashMap>(); + Map> fitsMetaMap = new HashMap<>(); + Map> tempMetaMap = new HashMap<>(); FileMetadataIngest ingest = new FileMetadataIngest(); ingest.setMetadataBlockName(ASTROPHYSICS_BLOCK_NAME); @@ -180,6 +238,7 @@ public FileMetadataIngest 
ingest (BufferedInputStream stream) throws IOException throw new IOException ("Failed to open FITS stream; null Fits object"); } + readConfig(); BasicHDU hdu = null; @@ -197,7 +256,7 @@ public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException List hduNames = new ArrayList(); try { - fitsMetaMap.put("type", new HashSet()); + fitsMetaMap.put(ATTRIBUTE_TYPE, new HashSet()); while ((hdu = fitsFile.readHDU()) != null) { dbgLog.fine("reading HDU number " + i); @@ -212,15 +271,20 @@ public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException nAxis = hduHeader.getIntValue("NAXIS"); dbgLog.fine("NAXIS (directly from header): "+nAxis); - if (nAxis > 1) { - nImageHDUs++; - if (nAxis > 2) { - hduTypes.add(HDU_TYPE_IMAGE_CUBE); + if (nAxis > 0) { + metadataKeys.add("NAXIS"); + + if (nAxis > 1) { + + nImageHDUs++; + if (nAxis > 2) { + hduTypes.add(HDU_TYPE_IMAGE_CUBE); - } else { - // Check for type Spectrum: + } else { + // Check for type Spectrum: - hduTypes.add(HDU_TYPE_IMAGE); + hduTypes.add(HDU_TYPE_IMAGE); + } } } else { hduTypes.add(HDU_TYPE_UNKNOWN); @@ -257,20 +321,23 @@ public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException if (hduInstrument != null) { fitsMetaMap.put(ATTRIBUTE_INSTRUMENT, new HashSet()); fitsMetaMap.get(ATTRIBUTE_INSTRUMENT).add(hduInstrument); - metadataKeys.add("TELESCOP"); + metadataKeys.add("INSTRUME"); } } - - if (fitsMetaMap.get(ATTRIBUTE_START_TIME) == null) { - String obsDate = hduHeader.getStringValue("DATE-OBS"); - if (obsDate != null) { - fitsMetaMap.put(ATTRIBUTE_START_TIME, new HashSet()); - fitsMetaMap.get(ATTRIBUTE_START_TIME).add(obsDate); - fitsMetaMap.put(ATTRIBUTE_STOP_TIME, new HashSet()); - fitsMetaMap.get(ATTRIBUTE_STOP_TIME).add(obsDate); - metadataKeys.add("DATE-OBS"); + + //if (fitsMetaMap.get(ATTRIBUTE_START_TIME) == null) { + String obsDate = hduHeader.getStringValue("DATE-OBS"); + if (obsDate != null) { + // The value of DATE-OBS will be used later, to determine + // coverage.Temporal.StartTime and .StopTime; for now + // we are storing the values in a temporary map.
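+ // For example, if three HDUs carry DATE-OBS values "2012-11-20",
+ // "2013-01-05" and "2014-05-07T14:52:01", the post-processing pass below
+ // parses each value, takes the earliest ("2012-11-20") as
+ // coverage.Temporal.StartTime and the latest ("2014-05-07T14:52:01") as
+ // coverage.Temporal.StopTime.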
+ if (tempMetaMap.get("DATE-OBS") == null) { + tempMetaMap.put("DATE-OBS", new HashSet()); } + tempMetaMap.get("DATE-OBS").add(obsDate); + metadataKeys.add("DATE-OBS"); } + //} /* TODO: @@ -278,6 +345,7 @@ public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException */ for (int j = 0; j < hdu.getAxes().length; j++) { int nAxisN = hdu.getAxes()[j]; + metadataKeys.add("NAXIS"+j); dbgLog.fine("NAXIS"+j+" value: "+nAxisN); } @@ -305,12 +373,12 @@ public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException dbgLog.fine("recognized key: " + headerKey); recognized = true; metadataKeys.add(headerKey); - } else if (isRecognizedColumnKey(headerKey)) { + } /*else if (isRecognizedColumnKey(headerKey)) { dbgLog.fine("recognized column key: " + headerKey); recognized = true; //columnKeys.add(getTrimmedColumnKey(headerKey)); columnKeys.add(headerKey); - } + }*/ } if (recognized) { @@ -377,25 +445,116 @@ public FileMetadataIngest ingest (BufferedInputStream stream) throws IOException String imageFileType = determineImageFileType (nImageHDUs, hduTypes); if (imageFileType != null) { - fitsMetaMap.get("type").add(imageFileType); + fitsMetaMap.get(ATTRIBUTE_TYPE).add(imageFileType); } - if (fitsMetaMap.get("type").isEmpty()) { + if (fitsMetaMap.get(ATTRIBUTE_TYPE).isEmpty()) { String tableFileType = determineTableFileType (nTableHDUs, hduTypes); if (tableFileType != null) { - fitsMetaMap.get("type").add(tableFileType); + fitsMetaMap.get(ATTRIBUTE_TYPE).add(tableFileType); } } - if (n == 1 && fitsMetaMap.get("type").isEmpty()) { + if (n == 1 && fitsMetaMap.get(ATTRIBUTE_TYPE).isEmpty()) { // If there's only 1 (primary) HDU in the file, we'll make sure // the file type is set to (at least) "image" - even if we skipped // that HDU because it looked empty: - fitsMetaMap.get("type").add(FILE_TYPE_IMAGE); + fitsMetaMap.get(ATTRIBUTE_TYPE).add(FILE_TYPE_IMAGE); } + // Final post-processing. + // Some values are derived from the collected fields + // (for example, the coverage.temporal.StopTime is the min. + // of all the collected OBS-DATE values). + // Specific rules are applied below: + + // start time and and stop time: + + int numObsDates = tempMetaMap.get("DATE-OBS") == null ? 0 : tempMetaMap.get("DATE-OBS").size(); + if (numObsDates > 0) { - String metadataSummary = createMetadataSummary (n, nTableHDUs, nImageHDUs, nUndefHDUs, metadataKeys, columnKeys, hduNames, fitsMetaMap.get("Column-Label")); + String[] obsDateValues = new String[numObsDates]; + obsDateValues = tempMetaMap.get("DATE-OBS").toArray(new String[0]); + + Date minDate = null; + Date maxDate = null; + + String startObsTime = ""; + String stopObsTime = ""; + + for (int k = 0; k < obsDateValues.length; k++) { + Date obsDate = null; + String obsDateString = obsDateValues[k]; + + for (SimpleDateFormat format : DATE_FORMATS) { + // Strict parsing - it will throw an + // exception if it doesn't parse! + format.setLenient(false); + // replace all slashes with dashes: + obsDateString = obsDateString.replace('/', '-'); + // parse date string without truncating: + try { + obsDate = format.parse(obsDateString); + dbgLog.fine("Valid date: " + obsDateString + ", format: " + format.toPattern()); + break; + } catch (ParseException ex) { + obsDate = null; + } + // Alternative method: + // We'll truncate the string to the point where the parser + // stopped; e.g., if our format was yyyy-mm-dd and the + // string was "2014-05-07T14:52:01" we'll truncate the + // string to "2014-05-07". 
+ /* + ParsePosition pos = new ParsePosition(0); + obsDate = format.parse(obsDateString, pos); + if (obsDate == null) { + continue; + } + if (pos.getIndex() != obsDateString.length()) { + obsDateString = obsDateString.substring(0, pos.getIndex()); + } + dbgLog.fine("Valid date: " + obsDateString + ", format: " + format.toPattern()); + break; + */ + } + + if (obsDate != null) { + + if (minDate == null) { + minDate = obsDate; + startObsTime = obsDateString; + } else if (obsDate.before(minDate)) { + minDate = obsDate; + startObsTime = obsDateString; + } + + if (maxDate == null) { + maxDate = obsDate; + stopObsTime = obsDateString; + } else if (obsDate.after(maxDate)) { + maxDate = obsDate; + stopObsTime = obsDateString; + } + } + } + + if (!startObsTime.equals("")) { + fitsMetaMap.put(ATTRIBUTE_START_TIME, new HashSet()); + fitsMetaMap.get(ATTRIBUTE_START_TIME).add(startObsTime); + } + + if (!stopObsTime.equals("")) { + fitsMetaMap.put(ATTRIBUTE_STOP_TIME, new HashSet()); + fitsMetaMap.get(ATTRIBUTE_STOP_TIME).add(stopObsTime); + } + } + + // TODO: + // Numeric fields should also be validated! + // -- L.A. 4.0 beta + + String metadataSummary = createMetadataSummary (n, nTableHDUs, nImageHDUs, nUndefHDUs, metadataKeys); //, columnKeys, hduNames, fitsMetaMap.get("Column-Label")); ingest.setMetadataMap(fitsMetaMap); ingest.setMetadataSummary(metadataSummary); @@ -418,7 +577,7 @@ private void readConfig () { int nConfiguredKeys = 0; if (domainRoot != null && !(domainRoot.equals(""))) { - String configFileName = domainRoot + "/config/fits.conf"; + String configFileName = domainRoot + "/config/fits.conf_DONOTREAD"; File configFile = new File (configFileName); BufferedReader configFileReader = null; @@ -466,11 +625,13 @@ private void readConfig () { // Extra field options: // (the only option currently supported is prefix-stem searching // on the field) + /* if (configTokens.length > 3 && configTokens[3] != null) { if (configTokens[3].equalsIgnoreCase(OPTION_PREFIX_SEARCHABLE)) { recognizedFitsMetadataKeys.put(configTokens[1], 1); } } + */ nConfiguredKeys++; } else { dbgLog.warning("FITS plugin: empty (or malformed) meta key entry in the config file."); @@ -489,11 +650,12 @@ private void readConfig () { indexableFitsMetaKeys.put(configTokens[1], configTokens[1]); } // Extra field options: + /* if (configTokens.length > 3 && configTokens[3] != null) { if (configTokens[3].equalsIgnoreCase(OPTION_PREFIX_SEARCHABLE)) { recognizedFitsColumnKeys.put(configTokens[1], 1); } - } + } */ nConfiguredKeys++; } else { dbgLog.warning("FITS plugin: empty (or malformed) column key entry in the config file."); @@ -611,7 +773,7 @@ private String getTrimmedColumnKey (String key) { return null; } - private String createMetadataSummary (int nHDU, int nTableHDUs, int nImageHDUs, int nUndefHDUs, Set metadataKeys, Set columnKeys, List hduNames, Set columnNames) { + private String createMetadataSummary (int nHDU, int nTableHDUs, int nImageHDUs, int nUndefHDUs, Set metadataKeys) { //, Set columnKeys, List hduNames, Set columnNames) { String summary = ""; if (nHDU > 1) { @@ -620,7 +782,7 @@ private String createMetadataSummary (int nHDU, int nTableHDUs, int nImageHDUs, summary = summary.concat("The primary HDU; "); if (nTableHDUs > 0) { summary = summary.concat(nTableHDUs + " Table HDU(s) "); - summary = summary.concat("(column names: "+StringUtils.join(columnNames, ", ")+"); "); + //summary = summary.concat("(column names: "+StringUtils.join(columnNames, ", ")+"); "); } if (nImageHDUs > 0) { summary =
summary.concat(nImageHDUs + " Image HDU(s); "); diff --git a/src/main/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/csv/CSVFileReader.java b/src/main/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/csv/CSVFileReader.java index 0399baddd4d..d03fe9bc298 100644 --- a/src/main/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/csv/CSVFileReader.java +++ b/src/main/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/csv/CSVFileReader.java @@ -443,8 +443,19 @@ public int readFile(BufferedReader csvReader, DataTable dataTable, PrintWriter f } else if (valueTokens[i].equalsIgnoreCase("null")) { // By request from Gus - "NULL" is recognized as a // numeric zero: - caseRow[i] = "0"; + if (isIntegerVariable[i]) { + caseRow[i] = "0"; + } else { + caseRow[i] = "0.0"; + } } else { + /* No re-formatting is done on any other numeric values. + * We'll save them as they were, for archival purposes. + * The alternative solution - formatting in sci. notation + * is commented-out below. + */ + caseRow[i] = valueTokens[i]; + /* if (isIntegerVariable[i]) { try { Integer testIntegerValue = new Integer(valueTokens[i]); @@ -466,13 +477,11 @@ public int readFile(BufferedReader csvReader, DataTable dataTable, PrintWriter f // in an IEEE 754-like "scientific notation" - for ex., // 753.24 will be encoded as 7.5324e2 BigDecimal testBigDecimal = new BigDecimal(valueTokens[i], doubleMathContext); - /* // an experiment - what's gonna happen if we just // use the string representation of the bigdecimal object // above? - caseRow[i] = testBigDecimal.toString(); - */ - + //caseRow[i] = testBigDecimal.toString(); + caseRow[i] = String.format(FORMAT_IEEE754, testBigDecimal); // Strip meaningless zeros and extra + signs: @@ -486,6 +495,7 @@ public int readFile(BufferedReader csvReader, DataTable dataTable, PrintWriter f throw new IOException("Failed to parse a value recognized as numeric in the first pass! (?)"); } } + */ } } else if (isTimeVariable[i] || isDateVariable[i]) { // Time and Dates are stored NOT quoted (don't ask).
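The FITS ingest changes above stop writing DATE-OBS straight into the metadata map; each header's value accumulates in tempMetaMap, and the post-processing pass derives coverage.Temporal.StartTime and StopTime as the earliest and latest dates that parse. A minimal, self-contained sketch of that min/max derivation follows. The two patterns below are assumptions for illustration only (the reader's real DATE_FORMATS list is defined elsewhere in the class); note that SimpleDateFormat.parse(String) ignores trailing text even with lenient parsing off, which is why longer patterns should be tried first and why the commented-out ParsePosition variant exists.

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class ObsDateRangeSketch {
    // Assumed patterns, longest first; the plugin's DATE_FORMATS may differ.
    private static final SimpleDateFormat[] FORMATS = {
        new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss"),
        new SimpleDateFormat("yyyy-MM-dd")
    };

    public static void main(String[] args) {
        List<String> obsDates = Arrays.asList("2014/05/07T14:52:01", "2013-11-30");

        Date minDate = null, maxDate = null;
        String startObsTime = "", stopObsTime = "";

        for (String raw : obsDates) {
            String s = raw.replace('/', '-'); // normalize slashes, as the hunk does
            Date parsed = null;
            for (SimpleDateFormat format : FORMATS) {
                format.setLenient(false); // reject out-of-range field values
                try {
                    parsed = format.parse(s);
                    break;
                } catch (ParseException ex) {
                    parsed = null; // try the next pattern
                }
            }
            if (parsed == null) {
                continue; // unparseable dates simply don't contribute
            }
            if (minDate == null || parsed.before(minDate)) {
                minDate = parsed;
                startObsTime = s; // earliest value becomes StartTime
            }
            if (maxDate == null || parsed.after(maxDate)) {
                maxDate = parsed;
                stopObsTime = s; // latest value becomes StopTime
            }
        }

        System.out.println(startObsTime + " .. " + stopObsTime);
        // prints: 2013-11-30 .. 2014-05-07T14:52:01
    }
}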
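The CSVFileReader hunk just above changes numeric cells in two ways: a literal "NULL" now becomes a zero matching the variable's type, and every other numeric token is archived exactly as read instead of being re-encoded in scientific notation. The sketch below contrasts the new behavior with the abandoned alternative; the "%.15e" format string is only a guess at what the class's FORMAT_IEEE754 constant holds.

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class NumericCellSketch {
    public static void main(String[] args) {
        String[] tokens = { "NULL", "753.24" };
        boolean[] isIntegerVariable = { false, false }; // both columns are doubles here
        String[] caseRow = new String[tokens.length];

        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equalsIgnoreCase("null")) {
                // "NULL" becomes a typed numeric zero:
                caseRow[i] = isIntegerVariable[i] ? "0" : "0.0";
            } else {
                // New behavior: keep the token verbatim, for archival fidelity.
                caseRow[i] = tokens[i];
            }
        }
        // caseRow is now ["0.0", "753.24"]

        // The commented-out alternative would have re-encoded the value
        // in scientific notation instead, e.g. 753.24 -> 7.532400000000000e+02:
        MathContext doubleMathContext = new MathContext(15, RoundingMode.HALF_EVEN);
        BigDecimal testBigDecimal = new BigDecimal(caseRow[1], doubleMathContext);
        System.out.println(String.format("%.15e", testBigDecimal));
    }
}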
diff --git a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonParser.java b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonParser.java index d686417ac3e..b43aec1f42f 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonParser.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonParser.java @@ -98,7 +98,6 @@ public DatasetVersion parseDatasetVersion( JsonObject obj ) throws JsonParseExce dsv.setDatasetDistributors(distros); } - return dsv; } catch (ParseException ex) { @@ -125,42 +124,40 @@ public List parseMetadataBlocks( JsonObject json ) throws JsonPars public DatasetField parseField( JsonObject json ) throws JsonParseException { if ( json == null ) return null; - try { - DatasetField ret = new DatasetField(); - DatasetFieldType type = datasetFieldSvc.findByName(json.getString("typeName","")); - - if ( type == null ) { - throw new NoResultException("Can't find type '" + json.getString("typeName","") +"'"); - } - ret.setDatasetFieldType(type); + + DatasetField ret = new DatasetField(); + DatasetFieldType type = datasetFieldSvc.findByNameOpt(json.getString("typeName","")); - if ( type.isCompound() ) { - List vals = parseCompoundValue(type, json); - for ( DatasetFieldCompoundValue dsfcv : vals ) { - dsfcv.setParentDatasetField(ret); - } - ret.setDatasetFieldCompoundValues(vals); + if ( type == null ) { + throw new JsonParseException("Can't find type '" + json.getString("typeName","") +"'"); + } + + ret.setDatasetFieldType(type); - } else if ( type.isControlledVocabulary() ) { - List vals = parseControlledVocabularyValue(type, json); - for ( ControlledVocabularyValue cvv : vals ) { - cvv.setDatasetFieldType(type); - } - ret.setControlledVocabularyValues(vals); + if ( type.isCompound() ) { + List vals = parseCompoundValue(type, json); + for ( DatasetFieldCompoundValue dsfcv : vals ) { + dsfcv.setParentDatasetField(ret); + } + ret.setDatasetFieldCompoundValues(vals); - } else { - // primitive - List values = parsePrimitiveValue( json ); - for ( DatasetFieldValue val : values ) { - val.setDatasetField(ret); - } - ret.setDatasetFieldValues(values); + } else if ( type.isControlledVocabulary() ) { + List vals = parseControlledVocabularyValue(type, json); + for ( ControlledVocabularyValue cvv : vals ) { + cvv.setDatasetFieldType(type); } + ret.setControlledVocabularyValues(vals); - return ret; - } catch ( NoResultException nre ) { - throw new JsonParseException("Can't find field type named '" + json.getString("typeName","") + "'"); + } else { + // primitive + List values = parsePrimitiveValue( json ); + for ( DatasetFieldValue val : values ) { + val.setDatasetField(ret); + } + ret.setDatasetFieldValues(values); } + + return ret; } public List parseCompoundValue( DatasetFieldType compoundType, JsonObject json ) throws JsonParseException { diff --git a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java index 0ddb264ab73..e9309ee481b 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java @@ -33,6 +33,8 @@ import java.util.Deque; import java.util.LinkedList; import java.util.Map; +import javax.json.JsonArray; +import javax.json.JsonObject; /** * Convert objects to Json.
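One detail of the JsonParser hunk above: the try/catch on NoResultException is gone because parseField now calls findByNameOpt, which is expected to return null when no DatasetFieldType matches, so the parser can raise its own JsonParseException. The diff does not show the service method itself; a plausible shape for it, purely as a hypothetical sketch, would be:

import javax.persistence.EntityManager;
import javax.persistence.NoResultException;

public class DatasetFieldLookupSketch {
    private EntityManager em; // injected in the real service bean

    // Hypothetical null-returning counterpart to findByName; the actual
    // DatasetFieldServiceBean query may be written differently.
    public DatasetFieldType findByNameOpt(String name) {
        try {
            return em.createQuery(
                    "SELECT t FROM DatasetFieldType t WHERE t.name = :name",
                    DatasetFieldType.class)
                    .setParameter("name", name)
                    .getSingleResult();
        } catch (NoResultException nre) {
            return null; // the caller decides how to report the missing type
        }
    }
}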
@@ -225,12 +227,12 @@ public static String typeClassString( DatasetFieldType typ ) { public static JsonObjectBuilder json( DatasetField dfv ) { JsonObjectBuilder bld = jsonObjectBuilder(); - bld.add( "id", dfv.getId() ); if ( dfv.isEmpty() ) { bld.addNull("value"); } else { - // TODO traverse the fields - bld.add( "value", dfv.getDisplayValue() ); + JsonArrayBuilder fieldArray = Json.createArrayBuilder(); + DatasetFieldWalker.walk(dfv, new DatasetFieldsToJson(fieldArray)); + bld.add( "value", fieldArray.build().getJsonObject(0) ); } return bld; @@ -319,66 +321,63 @@ public static String format( Date d ) { return (d==null) ? null : dateFormat.format(d); } - private static class DatasetFieldsToJson implements DatasetFieldWalker.Listener { Deque objectStack = new LinkedList<>(); Deque valueArrStack = new LinkedList<>(); - Deque fieldAggregator = new LinkedList<>(); - - public DatasetFieldsToJson(JsonArrayBuilder fieldsArray) { - fieldAggregator.push(fieldsArray); + + DatasetFieldsToJson( JsonArrayBuilder result ) { + valueArrStack.push(result); } - + @Override public void startField(DatasetField f) { objectStack.push( jsonObjectBuilder() ); + // Invariant: all values are multiple. Differentiation between multiple and single is done at endField. + valueArrStack.push(Json.createArrayBuilder()); + DatasetFieldType typ = f.getDatasetFieldType(); objectStack.peek().add("typeName", typ.getName() ); objectStack.peek().add("multiple", typ.isAllowMultiples()); objectStack.peek().add("typeClass", typeClassString(typ) ); - if ( typ.isAllowMultiples() ) { - valueArrStack.push(Json.createArrayBuilder()); - } } @Override public void endField(DatasetField f) { - if ( f.getDatasetFieldType().isAllowMultiples() ) { - objectStack.peek().add("value", valueArrStack.pop()); + JsonObjectBuilder jsonField = objectStack.pop(); + JsonArray jsonValues = valueArrStack.pop().build(); + if ( ! jsonValues.isEmpty() ) { + jsonField.add("value", + f.getDatasetFieldType().isAllowMultiples() ? jsonValues + : jsonValues.get(0) ); + valueArrStack.peek().add(jsonField); } - fieldAggregator.peek().add(objectStack.pop()); } @Override public void primitiveValue(DatasetFieldValue dsfv) { - if ( dsfv.getDatasetField().getDatasetFieldType().isAllowMultiples() ) { - valueArrStack.peek().add( dsfv.getValue() ); - } else { - objectStack.peek().add("value", dsfv.getValue()); - } + valueArrStack.peek().add( dsfv.getValue() ); } @Override public void controledVocabularyValue(ControlledVocabularyValue cvv) { - if ( cvv.getDatasetFieldType().isAllowMultiples() ) { - valueArrStack.peek().add( cvv.getStrValue() ); - } else { - objectStack.peek().add("value", cvv.getStrValue()); - } + valueArrStack.peek().add( cvv.getStrValue() ); } @Override public void startCompoundValue(DatasetFieldCompoundValue dsfcv) { - fieldAggregator.push( Json.createArrayBuilder() ); + valueArrStack.push( Json.createArrayBuilder() ); } @Override public void endCompoundValue(DatasetFieldCompoundValue dsfcv) { - if ( dsfcv.getParentDatasetField().getDatasetFieldType().isAllowMultiples() ) { - valueArrStack.peek().add( fieldAggregator.pop() ); - } else { - objectStack.peek().add("value", fieldAggregator.pop() ); + JsonArray jsonValues = valueArrStack.pop().build(); + if ( !
jsonValues.isEmpty() ) { + JsonObjectBuilder jsonField = jsonObjectBuilder(); + for ( JsonObject jobj : jsonValues.getValuesAs(JsonObject.class) ) { + jsonField.add( jobj.getString("typeName"), jobj ); + } + valueArrStack.peek().add( jsonField ); } } } diff --git a/src/main/webapp/dataset.xhtml b/src/main/webapp/dataset.xhtml index 6f0addf7671..22f95d9ff64 100644 --- a/src/main/webapp/dataset.xhtml +++ b/src/main/webapp/dataset.xhtml @@ -11,9 +11,10 @@ - + + @@ -88,76 +89,20 @@
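The rewritten DatasetFieldsToJson above (just before the dataset.xhtml changes) follows one invariant: every field's values are first collected into a JsonArrayBuilder, and endField alone decides the final shape - multiple-value types keep the array, single-value types get the lone element unwrapped, and fields whose value array came out empty are omitted entirely. A minimal sketch of that unwrap rule, with illustrative field names:

import javax.json.Json;
import javax.json.JsonArray;
import javax.json.JsonObjectBuilder;

public class EndFieldSketch {
    public static void main(String[] args) {
        boolean allowMultiples = false; // e.g., a single-value "title" field

        JsonArray jsonValues = Json.createArrayBuilder()
                .add("My Dataset Title")
                .build();

        JsonObjectBuilder jsonField = Json.createObjectBuilder()
                .add("typeName", "title")
                .add("multiple", allowMultiples);

        if (!jsonValues.isEmpty()) {
            // Single-value fields get the element itself, not a one-element array:
            jsonField.add("value",
                    allowMultiples ? jsonValues : jsonValues.get(0));
            System.out.println(jsonField.build());
            // {"typeName":"title","multiple":false,"value":"My Dataset Title"}
        }
        // An empty jsonValues would mean the field is dropped altogether.
    }
}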
- @@ -178,7 +123,7 @@ - + - Keyword(s) + Keyword(s) #{DatasetPage.datasetVersionUI.keyword.displayValue} - Subject(s) + Subject(s) #{DatasetPage.datasetVersionUI.subject.displayValue} @@ -246,7 +191,7 @@ - Notes
#{DatasetPage.datasetVersionUI.notes.value}
@@ -266,9 +211,15 @@ Host Dataverse
@@ -326,7 +277,7 @@
--> - + @@ -453,21 +404,30 @@ - - - + + + + + Title + - - - + + + - - - + + + Version + + + - - + + + Version Date + + @@ -477,7 +437,7 @@ - + diff --git a/src/main/webapp/dataverse.xhtml b/src/main/webapp/dataverse.xhtml index d475b6bccd9..59b10328cc7 100644 --- a/src/main/webapp/dataverse.xhtml +++ b/src/main/webapp/dataverse.xhtml @@ -64,18 +64,19 @@ - + - + - - - - +
@@ -85,8 +86,8 @@ - +
@@ -97,28 +98,22 @@
@@ -126,9 +121,7 @@
- - + + + diff --git a/src/main/webapp/dataverse_header.xhtml b/src/main/webapp/dataverse_header.xhtml index 1262c913c3c..17fa93a050c 100644 --- a/src/main/webapp/dataverse_header.xhtml +++ b/src/main/webapp/dataverse_header.xhtml @@ -19,7 +19,9 @@ - Beta + + Beta +
diff --git a/src/main/webapp/dataverse_template.xhtml b/src/main/webapp/dataverse_template.xhtml index 124b72640da..570fedf8065 100644 --- a/src/main/webapp/dataverse_template.xhtml +++ b/src/main/webapp/dataverse_template.xhtml @@ -33,7 +33,7 @@ Default Body
@@ -47,25 +47,9 @@ + -