Error in Redetect File Type API #40

@helkv

Description

Calling the Redetect File Type API (https://guides.dataverse.org/en/latest/api/native-api.html#redetect-file-type) throws multiple exceptions if the file's original type is text/plain and S3 is used as file storage.
As a result, the file type is detected incorrectly and the file is removed from the index.

Equivalent issues in IQSS/Dataverse: IQSS#7527 and IQSS#7631

Server: All instances
Date of Test: 06.07.2022
Browser: -
Version: Dataverse v5.10.1 and v5.11, among others
User: -

Preconditions:

  • S3 is used as file storage
  • The file's original type is plain text (text/plain)

Actions:

  1. Call the Redetect File Type API: {{base_url}}/api/files/64/redetect?dryRun=false
  2. Result:
    • The API call returns an incorrect result
    • The file is missing from the Solr index
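
For reproduction from code rather than from a REST client, the call in step 1 can be sketched in Java as below. This is a minimal sketch under assumptions: it uses a POST request with the `X-Dataverse-key` header, as in the curl example of the linked Native API guide; the class name, the base URL, and the token value are hypothetical placeholders. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class RedetectRequestSketch {
    /**
     * Builds (but does not send) the request for the redetect endpoint.
     * baseUrl and apiToken are placeholders supplied by the caller.
     */
    static HttpRequest build(String baseUrl, long fileId, boolean dryRun, String apiToken) {
        // Same URL shape as in step 1: {{base_url}}/api/files/64/redetect?dryRun=false
        URI uri = URI.create(baseUrl + "/api/files/" + fileId + "/redetect?dryRun=" + dryRun);
        return HttpRequest.newBuilder(uri)
                .header("X-Dataverse-key", apiToken)        // API token header (assumed from the guide)
                .POST(HttpRequest.BodyPublishers.noBody())  // the call carries no request body
                .build();
    }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Using `dryRun=true` first should show the detected type without persisting it.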

Root cause:

  1. With S3, the file type is checked on a temporary file (.tmp).
  2. FileUtil.determineFileTypeByExtension() returns null for the .tmp file when the original file type is text/plain.
  3. JPA/EJB exceptions follow because a file's type must not be null.
  4. Indexing the file then fails because of the preceding errors (NPE while indexing the dataset).
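
The failure chain above can be illustrated with a small, self-contained Java sketch. It is not Dataverse code: the class, the lookup table, and both methods are hypothetical stand-ins that only mimic an extension-based lookup returning null for an unknown extension such as .tmp. The `redetect` method shows one possible guard (keeping the stored type when nothing is detected); whether this matches the upstream fix is not confirmed here.

```java
import java.util.Locale;
import java.util.Map;

final class FileTypeFallbackSketch {
    // Hypothetical extension table; not Dataverse's actual mapping.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "csv", "text/csv",
            "txt", "text/plain",
            "json", "application/json");

    /** Mimics an extension lookup: returns null when the extension is unknown, e.g. "tmp". */
    static String determineByExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(extension);
    }

    /** Possible guard: never hand a null type onward, keep the stored type instead. */
    static String redetect(String storedType, String tempFileName) {
        String detected = determineByExtension(tempFileName);
        return detected != null ? detected : storedType;
    }
}
```

With this guard, `redetect("text/plain", "upload123.tmp")` keeps `text/plain` instead of passing null to the persistence layer, which is the step where the JPA/EJB exceptions in item 3 arise.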

Server.log with the related errors:
Server_Log_with_Redetect_File_Type_Errors.log

Labels: bug