diff --git a/hapi-3.3.1/HAPI-data-access-spec-3.3.1.md b/hapi-3.3.1/HAPI-data-access-spec-3.3.1.md new file mode 100644 index 0000000..750e5ef --- /dev/null +++ b/hapi-3.3.1/HAPI-data-access-spec-3.3.1.md @@ -0,0 +1,2223 @@ + +[1 Significant Changes to Specification](#1-significant-changes-to-specification)
+   [1.1 v2 to v3 API Changes](#11-v2-to-v3-api-changes)
+   [1.2 v2 to v3 Schema Changes](#12-v2-to-v3-schema-changes)
+   [1.3 3.X changes](#13-3x-changes)
+[2 Introduction](#2-introduction)
+   [2.1 Overview](#21-overview)
+   [2.2 Facilitating Adoption](#22-facilitating-adoption)
+   [2.3 Limitations](#23-limitations)
+[3 Endpoints](#3-endpoints)
+   [3.1 Overview](#31-overview)
+   [3.2 hapi](#32-hapi)
+   [3.3 about](#33-about)
+   [3.4 capabilities](#34-capabilities)
+   [3.5 catalog](#35-catalog)
+   [3.6 info](#36-info)
+      [3.6.1 Request Parameters](#361-request-parameters)
+      [3.6.2 info Response Object](#362-info-response-object)
+      [3.6.3 unitsSchema Details](#363-unitsschema-details)
+      [3.6.4 coordinateSystemSchema Details](#364-coordinatesystemschema-details)
+      [3.6.5 additionalMetadata Object](#365-additionalmetadata-object)
+      [3.6.6 stringType Object](#366-stringtype-object)
+      [3.6.7 parameter Object](#367-parameter-object)
+      [3.6.8 size Details](#368-size-details)
+      [3.6.9 fill Details](#369-fill-details)
+      [3.6.10 units and label Array](#3610-units-and-label-array)
+      [3.6.11 vectorComponents Object](#3611-vectorcomponents-object)
+      [3.6.12 location and geoLocation Details](#3612-location-and-geolocation-details)
+      [3.6.13 bins Object](#3613-bins-object)
+      [3.6.14 JSON References](#3614-json-references)
+   [3.7 data](#37-data)
+      [3.7.1 Request Parameters](#371-request-parameters)
+      [3.7.2 Response](#372-response)
+      [3.7.3 Examples](#373-examples)
+         [3.7.3.1 Data with Header](#3731-data-with-header)
+         [3.7.3.2 Data Only](#3732-data-only)
+      [3.7.4 Response formats](#374-response-formats)
+         [3.7.4.1 CSV](#3741-csv)
+         [3.7.4.2 Binary](#3742-binary)
+         [3.7.4.3 JSON](#3743-json)
+      [3.7.5 Errors While Streaming](#375-errors-while-streaming)
+      [3.7.6 Representation of Time](#376-representation-of-time)
+         [3.7.6.1 Incoming time values](#3761-incoming-time-values)
+         [3.7.6.2 Outgoing time values](#3762-outgoing-time-values)
+         [3.7.6.3 Time Range With No Data](#3763-time-range-with-no-data)
+         [3.7.6.4 Time Range With All Fill Values](#3764-time-range-with-all-fill-values)
+[4 Status Codes](#4-status-codes)
+   [4.1 status Object](#41-status-object)
+   [4.2 status Error Codes](#42-status-error-codes)
+   [4.3 Client Error Handling](#43-client-error-handling)
+[5 Implementation Details](#5-implementation-details)
+   [5.1 Cross-Origin Resource Sharing](#51-cross-origin-resource-sharing)
+   [5.2 Security Notes](#52-security-notes)
+   [5.3 HEAD Requests and Efficiency](#53-head-requests-and-efficiency)
+[6 References](#6-references)
+[7 Contact](#7-contact)
+[8 Appendix](#8-appendix)
+   [8.1 Sample Landing Page](#81-sample-landing-page)
+   [8.2 Allowed Characters in id, dataset, and parameters](#82-allowed-characters-in-id-dataset-and-parameters)
+   [8.3 JSON Object of Status Codes](#83-json-object-of-status-codes)
+   [8.4 Examples](#84-examples)
+   [8.5 Robot clients](#85-robot-clients)
+   [8.6 FAIR](#86-fair)
+      [8.6.1 Findable](#861-findable)
+      [8.6.2 Accessible](#862-accessible)
+      [8.6.3 Interoperable](#863-interoperable)
+      [8.6.4 Reusable](#864-reusable) + + +Version 3.3.1 \| Heliophysics Data and Model Consortium (HDMC) \| + +[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.17516129.svg)](https://doi.org/10.5281/zenodo.17516129) + +# 1 Significant Changes to Specification + +Note that version 3.0 introduced some breaking changes. They are listed here and are documented as issues in the [3.0 Milestone list](https://github.com/hapi-server/data-specification/milestone/4?closed=1). + +## 1.1 v2 to v3 API Changes + +Non-backward compatible changes introduced to the request interface in HAPI 3.0: +1. The URL parameter `id` was replaced with `dataset`. +2. `time.min` and `time.max` were replaced with `start` and `stop`, respectively. +3. Addition of a new endpoint, "[about](#33-about)", for server description metadata. + +These changes were discussed in issue [#77](https://github.com/hapi-server/data-specification/issues/77). HAPI 3 servers must accept both the old and these new parameter names, but the HAPI 2 specification requires an error response if the new URL parameter names are used. In a future version, the deprecated older names will no longer be valid. + +## 1.2 v2 to v3 Schema Changes + +This list captures new optional elements that can be included in the `info` response as of 3.0: +1. Ability to specify time-varying bins ([#83](https://github.com/hapi-server/data-specification/issues/83)) +1. Ability to use JSON references in `info` response ([#82](https://github.com/hapi-server/data-specification/issues/82)) +1. Ability to indicate a units schema (if one is being used for `units` strings) ([#81](https://github.com/hapi-server/data-specification/issues/81)) + +## 1.3 3.X changes + +3.0, 3.1, 3.2, and 3.3 changes are documented in the [changelog](https://github.com/hapi-server/data-specification/blob/master/hapi-3.3.0/changelog.md). We follow [Semantic Versioning](https://semver.org/) conventions, so all 3.X versions are backward compatible with version 3.0. 
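Because HAPI 3 servers must accept both the old and the new request-parameter names, while HAPI 2 servers must reject the new ones, a client that supports both generations can choose parameter names based on the `HAPI` version string a server reports. A minimal Python sketch; the helper name is hypothetical and not part of the specification:

```python
def time_request_params(server_version, dataset, start, stop):
    """Map a data request onto parameter names valid for the server's HAPI version.

    server_version: the `HAPI` version string reported by the server,
    e.g. "2.1" or "3.3".
    """
    if int(server_version.split(".")[0]) >= 3:
        return {"dataset": dataset, "start": start, "stop": stop}
    # HAPI 2 servers must return an error for the new names,
    # so fall back to the deprecated ones.
    return {"id": dataset, "time.min": start, "time.max": stop}
```

For example, `time_request_params("2.1", "alpha", "2016-07-13", "2016-07-14")` yields the deprecated names `id`, `time.min`, and `time.max`.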
+ +# 2 Introduction + +## 2.1 Overview + +This document describes the Heliophysics Application Programmer’s Interface (HAPI) specification, which is an API, metadata, and data streaming format specification for time-series data. The intent of this specification is to enhance interoperability among time series data providers. Its objective is to capture features available from many existing data providers and to codify implementation details so that providers can use a common API. This makes it possible to obtain time series science data content seamlessly from many sources using a variety of programming languages. + +This document is intended to be used by two groups of people: first by data providers who want to make time-series data available through a HAPI server, and second by data users who want to understand how data is made available from a HAPI server, or perhaps to write client software to obtain data from an existing HAPI server. + +HAPI constitutes a minimum but complete set of capabilities needed for a server to allow access to the time series data values within one or more data collections. Because of its focus on data access, the HAPI metadata standard is not intended for complex search and discovery. However, the metadata schema allows ways to indicate where further descriptive details for any dataset could be found, and the metadata contains enough information to enable its use by complex search and discovery software. + +The HAPI API is based on REpresentational State Transfer (RESTful) principles, which emphasize that URLs are stable endpoints through which clients can request data. Because it is based on well-established HTTP request and response rules, a wide range of HTTP clients can be used to interact with HAPI servers. 
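Since every endpoint is a stable URL, client code largely reduces to building URLs and issuing HTTP GET requests. A minimal Python sketch of the URL construction (the helper name is hypothetical, and `example.com` is a placeholder host):

```python
from urllib.parse import urlencode

def hapi_url(server, endpoint="", **params):
    """Build a HAPI URL; `server` may include prefix path elements before `hapi`."""
    url = server.rstrip("/") + "/hapi"
    if endpoint:
        url += "/" + endpoint
    if params:
        url += "?" + urlencode(params)
    return url

hapi_url("http://example.com/public/data", "data",
         dataset="alpha", start="2016-07-13", stop="2016-07-14")
# -> "http://example.com/public/data/hapi/data?dataset=alpha&start=2016-07-13&stop=2016-07-14"
```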
+ +Key definitions for terms used in this document include + +* **parameter** – a measured science quantity or a related ancillary quantity at one instant in time; may be a scalar or a multi-dimensional array; must have units and must have a fill value that indicates no measurement was available or information was absent +* **record** – all the parameters and an associated time value +* **dataset** – a collection of records with the same parameters; a HAPI service presents a dataset as a seamless +collection of time-ordered records such that it or a subset of it can be retrieved without knowledge of the actual storage details +* **catalog** - a collection of datasets +* **request parameter** – keywords that appear after the `?` in a URL + + For example, in the request (see [change notes](#1-significant-changes-to-specification)) + + `http://server/hapi/data?dataset=alpha&start=2016-07-13&stop=2016-07-14` + + the three request parameters are `dataset` (corresponding to the identifier of the dataset), `start`, and `stop`. They have values of `alpha`, `2016-07-13`, and `2016-07-14`, respectively. This document will always use the full phrase "request parameter" to refer to these URL elements to distinguish them from the parameters in a dataset. + + In the above URL, the segment represented as `server` captures the hostname for the HAPI server as well as any prefix path elements before the required `hapi` element. For example, in `http://example.com/public/data/hapi`, the `server` element is `example.com/public/data`. + +## 2.2 Facilitating Adoption + +The text from this section was taken from [Weigel et al., 2021](https://dx.doi.org/10.1029/2021JA029534). + +The HAPI specification is designed to standardize the streaming services that many time-series data providers already offer. 
The use of the common aspects of existing features of Heliophysics data providers means that data providers can, with minimal effort, modify existing streaming servers to make them HAPI compliant. + +In addition, several tools have been developed to facilitate adoption. + +1. Although the HAPI streaming formats are simple enough that only 10-20 lines of code are needed to read a basic HAPI response, there are error conditions and other optimizations to consider (such as caching and parallelization of requests); a robust and comprehensive client implementation is beyond what should be expected of science users or even software developers who want to create tools that use HAPI data. Mature HAPI client libraries are available from [open-source HAPI projects](https://github.com/hapi-server?q=client-*) for Java, IDL, MATLAB, and Python, so most users can begin with these libraries. These clients read a HAPI response into a data structure appropriate for the given language (e.g., in Python, the response is read into a NumPy N-D array with timestamps converted to Python `datetime` objects). Work is underway to incorporate HAPI readers into existing popular analysis frameworks, such as [SPEDAS in IDL and Python](http://spedas.org), [Autoplot in Java](http://autoplot.org), [SunPy in Python](https://sunpy.org), and [SpacePy in Python](https://spacepy.github.io). + +2. To assist server developers with adoption, [a cross-platform reference server](https://github.com/hapi-server/server-nodejs) has been developed. With this server, a data provider only needs to provide HAPI JSON metadata and a command-line program that emits HAPI CSV given inputs of a dataset identifier and start/stop times. The server handles HAPI error responses, request validation, and logging. + +3. A [validator/verifier](https://hapi-server.org/verify) has been developed that executes a comprehensive suite of tests for compliance with the specification. 
By either downloading and running the test software locally, or by entering the URL of a HAPI server into a website, data providers can test most aspects of their HAPI implementation, with detailed results indicating warnings or errors. This has proven to be a very effective way to assist developers, and it helps ensure that new HAPI servers are fully functional. The tests include checks of JSON responses against the HAPI metadata schema, verifying that the server handles the different allowed start/stop time representations (e.g., year and day-of-year, or year, month, and day, each with varying amounts of additional time precision), error responses and messages, timeouts, and testing that the output content is independent of the output format. Also, the validator/verifier makes recommendations. For example, if a server does not include cross-origin resource sharing headers or respond with compressed data when gzip is specified in the `Accept-Encoding` request header, it emits a message noting the advantages of these features and a link to information on how to enable or implement them. + +## 2.3 Limitations + +Because HAPI requires a single time column to be the first column, each record (one row of data) must be associated with one time value (the first value in the row). This has implications for serving files with multiple time arrays. Suppose a file contains 1-second data, 3-second data, and 5-second data, all from the same measurement but averaged differently. A HAPI server could expose this data, but not as a single dataset. Instead, each time resolution could be presented as a separate dataset, each with its own time array. + +Note that there are only a few supported data types: `isotime`, `string`, `integer`, and `double`. This is intended to keep the client code simple in terms of dealing with the data stream. 
However, the spec may be expanded to include other types, such as 4-byte floating-point values (which would be called float), or 2-byte integers (which would be called short). + +# 3 Endpoints + +## 3.1 Overview + +The HAPI specification has five required endpoints that give clients a precise way to first determine the data holdings of the server and then to request data. The functionality of the required endpoints is as follows: + +1. `/hapi/capabilities` lists the output formats the server can stream (`csv`, `binary`, or `json`, [described below](#374-response-formats)). + +2. `/hapi/about` lists the server id and title, contact information, and a brief description of the datasets served (this endpoint is new in the version 3 HAPI specification). + +3. `/hapi/catalog` lists the catalog of available datasets; each dataset is associated with a unique id and may optionally have a title. + +4. `/hapi/info` lists information about a dataset with a given id; a primary component of the description is the list of parameters in the dataset. + +5. `/hapi/data` streams data for a dataset with a given id over a given time range; a subset of parameters in a dataset may be requested (default is all parameters). + +There is also an optional landing page endpoint `/hapi` that returns human-readable HTML. Although there is recommended content for this landing page, it is not essential to the functioning of the server. + +Servers may have non-standard endpoints (an endpoint not given above) under `/hapi/` if they are prefixed by `x_`, e.g., `/hapi/x_custom_endpoint`. This will communicate to the user that this is not a standard endpoint and will prevent a name conflict if the HAPI standard adds `custom_endpoint` to the specification. + +The five required endpoints are RESTful-style in that the resulting HTTP response is the complete response for each endpoint. 
In particular, the `/data` endpoint does not return URLs of files or links from which the data can be downloaded; instead, it streams the data in the HTTP response body. The full specification for each endpoint is described below. + +All endpoints must have a `/hapi` path element in the URL, and only the `/catalog`, `/info`, and `/data` endpoints take query parameters: + +``` +http://server/hapi (Optional HTML landing page) +http://server/hapi/capabilities +http://server/hapi/about +http://server/hapi/catalog +http://server/hapi/info?dataset=...[&...] +http://server/hapi/data?dataset=...&... +``` + +We recommend that any requests with a trailing slash be handled by sending a `301` redirect, e.g., + +``` +http://server/hapi/ => 301 redirect to http://server/hapi +http://server/hapi/info/?dataset=... => 301 redirect to http://server/hapi/info?dataset=... +``` + +Requests to a HAPI server must not change the server state. Therefore, all HAPI endpoints must respond only to HTTP HEAD and GET requests. + +The request parameters and their allowed values must be strictly enforced by the server. **HAPI servers must not add request parameters beyond those in the specification.** If a request URL contains any unrecognized or misspelled request parameters, a HAPI server must respond with an error status (see [HAPI Status Codes](#4-status-codes) for more details). The principle being followed is that the server must not silently ignore unrecognized request parameters because this would falsely indicate to clients that the request parameter was understood and was taken into account when creating the output. That is, if a server is given a request parameter that is not part of the HAPI specification, such as `averagingInterval=5s`, the server must report an error for two reasons: 1. additional request parameters are not allowed, and 2. the server will not do any averaging. 
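This strict checking amounts to comparing the query keys against a per-endpoint whitelist before doing any other work. A server-side sketch in Python (the whitelists below are illustrative, not the normative lists from this specification):

```python
# Illustrative whitelists; HAPI 3 servers also accept the deprecated
# v2 names (id, time.min, time.max).
ALLOWED = {
    "data": {"dataset", "id", "start", "stop", "time.min", "time.max",
             "parameters", "include", "format"},
    "info": {"dataset", "id", "parameters", "resolve_references"},
}

def validate_request(endpoint, query_params):
    """Return (ok, unknown) where `unknown` lists unrecognized parameter names."""
    unknown = sorted(set(query_params) - ALLOWED.get(endpoint, set()))
    return (len(unknown) == 0, unknown)

ok, unknown = validate_request("data", {"dataset": "alpha", "averagingInterval": "5s"})
# ok is False and unknown is ["averagingInterval"]; the server must then
# respond with an error status rather than silently ignore the parameter.
```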
+ +The outputs from a HAPI server to the `about`, `catalog`, `capabilities`, and `info` endpoints are JSON objects, the formats of which are described below in the sections detailing each endpoint. The `data` endpoint must be able to deliver Comma Separated Value (CSV) data following the RFC 4180 standard [[1](#6-references)], but may optionally deliver data content in binary format or JSON format. The response stream formats are described in the [Data Stream Content](#374-response-formats) section. + +The following is the detailed specification for the five main HAPI endpoints as well as the optional landing page endpoint. + +## 3.2 `hapi` + +This root endpoint is optional and should provide a human-readable landing page for the server. Unlike the other endpoints, there is no strict definition for the output, but if present, it should include a brief description of the data and other endpoints and links to documentation on how to use the server. An example landing page that can be easily customized for a new server is in [the Appendix](#81-sample-landing-page). + +There are many options for landing page content, such as an HTML view of the catalog or links to commonly requested data. + +**Sample Invocation** + +``` +http://server/hapi +``` + +**Request Parameters** + +None + +**Response** + +The response is in HTML format with a mime type of `text/html`. There is no specification for the content, but it should provide an overview that is useful for science users. + +**Example** + +Retrieve the landing page for this server. + +``` +http://server/hapi +``` + +**Example Response** + +See [the Appendix](#81-sample-landing-page). + +## 3.3 `about` + +**Sample Invocation** + +``` +http://server/hapi/about +``` + +**Request Parameters** + +None + +**Response** + +The server's response to this endpoint must be in JSON format [[3](#6-references)] as defined by RFC 7159, and the response must indicate a mime type of `application/json`. 
Server attributes are described using keyword-value pairs, with the required and optional keywords described in the following table. + +**About Object** + +| Name | Type | Description | +|---------------------|---------------|-----------------------| +| `HAPI` | string | **Required** The version number of the HAPI specification this description complies with. | +| `status` | Status object | **Required** Server response status for this request. | +| `id` | string | **Required** A unique ID for the server. Ideally, this ID has the organization name in it, e.g., NASA/SPDF/SSCWeb, NASA/SPDF/CDAWeb, INTERMAGNET, UIowa/RPWG, LASP/TSI, etc. A list of known ids is available in the [servers repository](https://github.com/hapi-server/servers). | +| `title` | string | **Required** A short human-readable name for the server that defines an acronym in the `id` or provides additional context about the data served (e.g., LASP Total Solar Irradiance (TSI) data). The suggested maximum length is 40 characters. We recommend against including "HAPI" and/or the HAPI version of the server in the title. | +| `contact` | string | **Required** Contact information or email address for server issues. HAPI clients should show this contact information when it is certain that an error is due to a problem with the server (as opposed to the client). Ideally, a HAPI client will recommend that the user check their connection and try again at least once before contacting the server contact. | +| `description` | string | **Optional** A brief description of the type of data the server provides. | +| `contactID` | string | **Optional** The identifier in the discovery system for information about the contact. For example, a SPASE ID of a person identified in the `contact` string. | +| `resourceID` | string | **Optional** An identifier associated with all datasets. | +| `citation` | string | **Optional** Deprecated; use `serverCitation`. 
| +| `serverCitation` | string | **Optional** How to cite the HAPI server. An actionable DOI is preferred (e.g., https://doi.org/...). This `serverCitation` differs from the `dataCitation` in an `/info` response. Here the `serverCitation` is for the entity that maintains the data server. | +| `note` | string or array of strings | **Optional** General notes about the server that are not appropriate to include in `description`. For example, a change log that lists added datasets or parameters in datasets. If an array of strings is used to describe datestamped notes, we recommend prefixing each note with a timestamp in the [HAPI restricted ISO 8601 format](#376-representation-of-time), e.g., `["2024-01-01T00: Note on this date/time", "2024-04-02T00: Note on this date/time"]`. | +| `warning` | string or array of strings | **Optional** Temporary warnings about the data server, such as scheduled downtime and known general problems. Dataset-specific warnings should be placed in the `warning` element of the `/info` response. | +| `dataTest` | `DataTest` | **Optional** Information that a client can use to check that a server is operational. The data response should contain at least one record. See below for the definition of this object. | + +**`DataTest` Object** + +| Name | Type | Description | +|-----------------|-----------|--------------| +| `query` | `Query` | **Required** See the `Query` object definition below. | +| `name` | string | **Optional** Name of the test. | + +A server test query then has the following definition: + +**`Query` Object** + +| Name | Type | Description | +|--------------|--------|--------------| +| `dataset` | string | **Required** String identifier for the dataset to be used in the test. | +| `start` | string | **Required** String value of test data start time in [restricted ISO 8601](#376-representation-of-time) format. | +| `stop` | string | **Required** String value of test data stop time in [restricted ISO 8601](#376-representation-of-time) format. 
| `parameters` | string | **Optional** A comma-separated list of parameters to include in the response ([allowed characters](#82-allowed-characters-in-id-dataset-and-parameters)). If not provided, all parameters will be included. | + + +**Example** + +``` +http://server/hapi/about +``` + +**Example Response:** + +```javascript +{ + "HAPI": "3.3", + "status": {"code": 1200, "message": "OK"}, + "id": "TestData3.3", + "title": "HAPI 3.3 Test Data and Metadata", + "contact": "example@example.org", + "dataTest": { + "name": "Ping Test", + "query": { + "dataset": "dataset1", + "start": "2023-01-01T12:00:00Z", + "stop": "2023-01-01T14:00:01Z", + "parameters": "parameter1,parameter2" + } + } +} +``` + + +## 3.4 `capabilities` + +This endpoint describes relevant implementation capabilities for this server. Currently, the server-to-server variability is limited to the list of supported output formats and the supported `/catalog` depth options. + +A server must support the `csv` output format; the `binary` and `json` output formats may optionally be supported. Servers may support custom output formats, which would be advertised here. All custom formats listed by a server must begin with the string `x_` to indicate that they are custom formats and avoid naming conflicts with possible future additions to the specification. + +**Sample Invocation** + +``` +http://server/hapi/capabilities +``` + +**Request Parameters** + +None + +**Response** + +The server's response to this endpoint must be in JSON format [[3](#6-references)] as defined by RFC 7159, and the response must indicate a mime type of `application/json`. Server capabilities are described using keyword-value pairs, with the currently defined keywords listed in the following table. + +**Capabilities Object** + +| Name | Type | Description | +|-----------------|---------------|--------------| +| `HAPI` | string | **Required** The version number of the HAPI specification this description complies with. 
| +| `status` | Status object | **Required** Server response status for this request. | +| `outputFormats` | string array | **Required** The list of output formats the server can emit. All HAPI servers must support at least the `csv` output format, with `binary` and `json` output formats being optional. Any custom format names must begin with a prefix of `x_` as shown in the example below. | +| `catalogDepthOptions` | string array | **Optional** A list of options for `/catalog?depth=` requests. It can include one or more of `dataset` and `all`. See [the catalog endpoint description](#35-catalog). | + +**Example** + +Retrieve a listing of the capabilities of this server. + +``` +http://server/hapi/capabilities +``` + +**Example Response:** + +```javascript +{ + "HAPI": "3.3", + "status": {"code": 1200, "message": "OK"}, + "outputFormats": ["csv", "binary", "json", "x_hdf"] +} +``` + +Note that `x_hdf` is a non-standard HDF output format and has the required `x_` prefix. + +If a server only reports an output format of `csv`, then requesting `binary` data should cause the server to respond with an error status of `1409 "Bad request - unsupported output format"` with a corresponding HTTP response code of 400. [See below](#4-status-codes) for more about error responses. + +## 3.5 `catalog` + +This endpoint provides a list of datasets available from the server. + +**Sample Invocation** + +``` +http://server/hapi/catalog +``` + +**Request Parameters** + +| Name | Description | +|------------|-------------------------------------------------------------------| +| `depth` | **Optional** Possible values are `dataset` (the default) and `all`. Servers may choose to implement the `all` option, which allows all of the metadata from a server to be obtained in a single request. If this request parameter is supported, the `/capabilities` endpoint must return `catalogDepthOptions=["dataset", "all"]`. 
| +| `resolve_references` | **Optional** If `false`, allow the server to send responses with references that need resolution by the client. Default is `true`. See [3.6.14 JSON References](#3614-json-references). | + +**Response** + +The response is in JSON format [[3](#6-references)] as defined by RFC 7159 and has a MIME type of `application/json`. The catalog +is a simple listing of identifiers for the datasets available from the server. Additional metadata about each dataset is available through the `info` endpoint (described below). The catalog response always lists the full catalog; there is no way to request a listing of only a subset of the datasets. + +**Catalog Object** + +| Name | Type | Description | +|-----------|------------------|----------------------------------------------------------------------------------------------------| +| `HAPI` | string | **Required** The version number of the HAPI specification this description complies with. | +| `status` | object | **Required** Server response status for this request. (see [HAPI Status Codes](#4-status-codes)) | +| `catalog` | array of Dataset | **Required** A list of datasets available from this server. | + +**Dataset Object** + +| Name | Type | Description | +|---------|--------|-------------| +| `id` | string | **Required** The computer-friendly identifier ([allowed characters](#82-allowed-characters-in-id-dataset-and-parameters)) that the host system uses to locate the dataset. Each identifier must be unique within the HAPI server where it is provided. | +| `title` | string | **Optional** A short label for the dataset usable in selecting a dataset from a list. If none is given, it defaults to the id. The suggested maximum length is 40 characters. | +| `info` | object | **Optional** but required when `depth=all` is used in the request. This object should be identical in content to what is returned by a `/info?dataset=id` request. The `HAPI` and `status` nodes should be omitted inside the `info` objects. 
(The `status` does not belong there, and the `HAPI` version should be the same for all `info` elements, so it is redundant information.) | + +The identifiers must be unique within a single HAPI server. Also, dataset identifiers in the catalog should be stable over time. Including rapidly changing version numbers or other revolving elements (dates, processing ids, etc.) in the dataset identifiers should be avoided. The intent of the HAPI specification is to allow data to be referenced using RESTful URLs with a reasonable lifetime. + +Identifiers must be limited to a restricted set of characters: upper- and lower-case letters, numbers, and the characters comma, colon, slash, minus, and plus. See issue [#89](https://github.com/hapi-server/data-specification/issues/89) for a discussion of this. + +**Example** + +Retrieve a listing of datasets available from a server. + +``` +http://server/hapi/catalog +``` + +**Example Response:** + +```javascript +{ + "HAPI" : "3.3", + "status": {"code": 1200, "message": "OK"}, + "catalog": + [ + {"id": "ACE_MAG", "title": "ACE Magnetometer data"}, + {"id": "data/IBEX/ENA/AVG5MIN"}, + {"id": "data/CRUISE/PLS"}, + {"id": "any_identifier_here"} + ] +} +``` + +**Example** + +Retrieve a listing of all the datasets and embed the `info` metadata for each in the dataset object. + +First, verify that this is supported: + +``` +http://server/hapi/capabilities +``` + +```javascript +{ + "HAPI": "3.3", + "status": {"code": 1200, "message": "OK"}, + "outputFormats": ["csv", "binary", "json"], + "catalogDepthOptions": ["dataset", "all"] +} +``` + +The `/capabilities` response indicates that the `/catalog` endpoint allows the `depth=all` option. 
Then, a subsequent request for the entire catalog will succeed: + +``` +http://server/hapi/catalog?depth=all +``` + +**Example Response:** + +(where `...` is a placeholder for additional metadata that has been omitted for clarity) + +```javascript +{ + "HAPI" : "3.3", + "status": {"code": 1200, "message": "OK"}, + "catalog": + [ + { + "id": "MAG1", "title": "Magnetometer data", + "info": { + "startDate": "2010-01-01T00:00:00Z", + "stopDate": "2020-01-02T00:00:00Z", + "parameters": [{"name": "Time", ...}] + ... + } + }, + { + "id": "MAG2", "title": "Magnetometer data", + "info": { + "startDate": "2000-01-02T00:00:00Z", + "stopDate": "2010-01-01T00:00:00Z", + "parameters": [{"name": "Time", ...}] + ... + } + }, + { + ... + } + ] +} +``` + +## 3.6 `info` + +This endpoint provides a data header for a given dataset. The header is expressed in JSON format [[3](#6-references)] as defined by RFC 7159 and has a MIME type of `application/json`. The specification for the header is that it provides a minimal amount of metadata that allows for the automated reading by a client of the data content streamed via the `data` endpoint. The header must include a list of the parameters in the dataset and the date range covered by the dataset. There are also optional metadata elements for capturing other high-level information, such as a brief description of the dataset, the nominal cadence of the data, and ways to learn more about a dataset. The table below lists all required and optional dataset attributes in the header. + +Servers may include additional custom (server-specific) keywords or keyword/value pairs in the header provided that the keywords begin with the prefix `x_`. While a HAPI server must check all request parameters (servers must return an error code given any unrecognized request parameter as described earlier), the JSON content output by a HAPI server may contain additional, user-defined metadata elements. 
All non-standard metadata keywords must begin with the prefix `x_` to indicate to HAPI clients that these are extensions. Custom clients could use the additional keywords, but standard clients will ignore the extensions. By using the standard prefix, the custom keywords will not conflict with any future keywords added to the HAPI standard. Servers using these extensions may wish to include additional, domain-specific characters after the `x_` to avoid possible collisions with extensions from other servers.

Each parameter listed in the header must itself be described by specific metadata elements, and a separate table below describes the required and optional parameter attributes.

By default, the parameter list in the `info` response will include *all* parameters in the dataset. However, a client may request a header for just a subset of the parameters. The subset of interest is specified as a comma-separated list via the request parameter called `parameters`. (Note that the client would have to obtain the parameter names from a prior request.) There must not be any duplicates in the subset list, and the subset list must be arranged according to the ordering in the original, full list of parameters. The reduced header is useful because it is also possible to request a subset of parameters when asking for data (see the `data` endpoint), and a reduced header can be requested that matches the subset of parameters in the data. This correspondence of reduced header and reduced data ensures that a data request for a subset of parameters can be properly interpreted even if the data is requested without a header. (The safest way to write a client, however, is to always request the full header and rely on its parameter ordering to determine the data column ordering.)

Note that the `data` endpoint may optionally prepend the `info` header to the data stream when `include=header` is included in the request URL.
In cases where the `data` endpoint response includes a header followed by `csv` or `binary` data, the header must always end with a newline. This enables the end of the JSON header to be more easily detected.

**Sample Invocation**

```
http://server/hapi/info?dataset=ACE_MAG
```

### 3.6.1 Request Parameters

Items with a * superscript in the following table have been modified from version 2 to 3; see [change notes](#1-significant-changes-to-specification).

| Name | Description |
|------------|-------------------------------------------------------------------|
| `dataset`[ * ](#1-significant-changes-to-specification) | **Required** The identifier for the dataset ([allowed characters](#82-allowed-characters-in-id-dataset-and-parameters)) |
| `parameters` | **Optional** A comma-separated list of parameters to include in the response ([allowed characters](#82-allowed-characters-in-id-dataset-and-parameters)). Default is all; `...&parameters=&...` in a URL should be interpreted as meaning all parameters. See "Subsetting Parameters" in this subsection for additional examples. |
| `resolve_references` | **Optional** If `false`, allow the server to send responses with references that need resolution by the client. Default is `true`. See [JSON References](#3614-json-references).|

**Response**

The response is in JSON format [[3](#6-references)] and provides metadata about one dataset.

**Subsetting Parameters**

Clients may request an `info` response that includes only a subset of the parameters, or a data stream for a subset of parameters (via the `data` endpoint, described next). The logic on the server is the same for `info` and `data` requests in terms of which dataset parameters are included in the response. The primary time parameter (always required to be the first parameter in the list) is always included, even if not requested.
These examples clarify the way a server must respond to various types of dataset parameter subsetting requests:

- **request:** do not ask for any specific parameters (i.e., there is no request parameter called `parameters`);
  **example:** `http://server/hapi/data?dataset=MY_MAG_DATA&start=1999Z&stop=2000Z`
  **response:** all columns

- **request:** ask for just the primary time parameter;
  **example:** `http://server/hapi/data?dataset=MY_MAG_DATA&parameters=Epoch&start=1999Z&stop=2000Z`
  **response:** just the primary time column

- **request:** ask for a single parameter other than the primary time column (like `parameters=Bx`);
  **example:** `http://server/hapi/data?dataset=MY_MAG_DATA&parameters=Bx&start=1999Z&stop=2000Z`
  **response:** the primary time column and the one requested data column

- **request:** ask for two or more parameters other than the primary time column;
  **example:** `http://server/hapi/data?dataset=MY_MAG_DATA&parameters=Bx,By&start=1999Z&stop=2000Z`
  **response:** the primary time column followed by the requested parameters in the order they occurred in the original, non-subsetted dataset header (note that the parameter ordering in the request must match the original ordering anyway -- see just below)

Note that the order in which parameters are listed in the request must not differ from the order in which they appear in the response. For a dataset with parameters `Time,param1,param2,param3`, this subset request

`?dataset=ID&parameters=Time,param1,param3`

is acceptable, because `param1` is before `param3` in the `parameters` array (as determined by the `/info` response). However, asking for a subset of parameters in a different order, as in

`?dataset=ID&parameters=Time,param3,param1`

is not allowed, and servers must respond with an error status. See [HAPI Status Codes](#4-status-codes) for more about error conditions and codes.
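The subsetting rules illustrated above can be sketched in code. The following is a minimal illustration in Python; the function and variable names are invented, and `all_names` stands in for the ordered parameter list obtained from an `/info` response:

```python
# Sketch of server-side parameter subsetting (names are invented).
# `all_names` is the dataset's full parameter list from /info, in order;
# `requested` is the parsed, comma-separated `parameters` request value.
def subset_columns(all_names, requested):
    if not requested:                       # no `parameters` given: all columns
        return list(all_names)
    if len(set(requested)) != len(requested):
        raise ValueError("duplicate parameters in subset")
    positions = [all_names.index(name) for name in requested]  # raises for unknown names
    if positions != sorted(positions):
        raise ValueError("subset must follow the original parameter order")
    if 0 not in positions:                  # primary time is always included
        positions.insert(0, 0)
    return [all_names[i] for i in positions]

names = ["Time", "param1", "param2", "param3"]
subset_columns(names, ["param1", "param3"])   # -> ["Time", "param1", "param3"]
```

Requesting `["param3", "param1"]` raises an error, mirroring the out-of-order case above for which a server must return an error status.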
+ +### 3.6.2 `info` Response Object + +| Dataset Attribute | Type | Description | +|---------------------|--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `HAPI` | string | **Required** The version number of the HAPI specification with which this description complies. | +| `status` | object | **Required** Server response status for this request; see [HAPI Status Codes](#4-status-codes). | +| `format` | string | **Required** (when the header is prefixed to data stream) Format of the data as `csv` or `binary` or `json`. | +| `parameters` | array of Parameter | **Required** Description of the parameters in the data. | +| `startDate` | string | **Required** [Restricted ISO 8601](#376-representation-of-time) date/time of first record of data in the entire dataset. | +| `stopDate` | string | **Required** [Restricted ISO 8601](#376-representation-of-time) date/time of the last record of data in the entire dataset. For actively growing datasets, the end date can be approximate, but it is the server's job to report an accurate end date. | +| `timeStampLocation` | string | **Optional** Indicates the positioning of the timestamp within the measurement window. Must be one of `begin`, `center`, `end` or `other`. **If this attribute is absent, clients are to assume a default value of `center`,** which is meant to indicate the exact middle of the measurement window. A value of `other` indicates that the location of the time stamp in the measurement window is not known or these options cannot be used for an accurate description. See also [HAPI convention notes](https://github.com/hapi-server/data-specification/wiki/implementation-notes). (Note: version 2.0 indicated that these labels were in all upper case. Starting with version 2.1, servers should use all lower case. 
Clients, however, should be able to handle both the all-upper-case and all-lower-case versions of these labels.) |
| `cadence` | string | **Optional** Time difference between records as an ISO 8601 duration. This is meant as a guide to the nominal cadence of the data and not a precise statement about the time between measurements. See also [HAPI convention notes](https://github.com/hapi-server/data-specification/wiki/implementation-notes). |
| `sampleStartDate` | string | **Optional** [Restricted ISO 8601](#376-representation-of-time) date/time of the start of a sample time period for a dataset, where the time period must contain a manageable, representative example of valid, non-fill data. **Required** if `sampleStopDate` given. |
| `sampleStopDate` | string | **Optional** [Restricted ISO 8601](#376-representation-of-time) date/time of the end of a sample time period for a dataset, where the time period must contain a manageable, representative example of valid, non-fill data. **Required** if `sampleStartDate` given. |
| `maxRequestDuration` | string | **Optional** An [ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations) indicating the maximum duration for a request. This duration should be interpreted by clients as a limit above which a request for all parameters will very likely be rejected with a HAPI 1408 error; requests for fewer parameters and a longer duration may or may not be rejected. |
| `description` | string | **Optional** A detailed description of the dataset content, caveats, relationships to other data, references and links -- the information users need to know for a basic interpretation of the data. Suggested length is a few lines of text, e.g., 1000 characters; extended details can be referenced with links. |
| `unitsSchema` | string | **Optional** The name of the units convention that describes how to parse all `units` strings in this dataset. Currently, the allowed values are: `udunits2`, `astropy3`, `cdf-cluster`, and `vounits1.1`.
See [`unitsSchema` Details](#363-unitsschema-details) for additional information about these conventions. The list of allowed unit specifications is expected to grow to include other well-documented unit standards. | +| `coordinateSystemSchema` | string | **Optional** The name of the schema or convention that contains a list of coordinate system names and definitions. If this keyword is provided, any `coordinateSystemName` keyword given in a [parameter](#367-parameter-object) definition should follow this schema. See [`coordinateSystemSchema` Details](#364-coordinatesystemschema-details) for additional information. | +| `location` | object | **Optional** A way to specify where a dataset's measurements were made. See [location and geoLocation Details](#3612-location-and-geolocation-details) for an explanation of the two ways to indicate location: one for a fixed location and one for when the measurement location changes with time. +| `geoLocation` | array of double | **Optional** A simplified, compact way of indicating a fixed location for a dataset in Earth-based coordinates. The array has 2 or 3 elements: [longitude in degrees, latitude in degrees, (optional) altitude in meters]. For information and examples, see [location and geoLocation Details](#3612-location-and-geolocation-details). +| `resourceURL` | string | **Optional** URL linking to more detailed information about this dataset. | +| `resourceID` | string | **Optional** An identifier by which this data is known in another setting (e.g., DOI) +| `creationDate` | string | **Optional** [Restricted ISO 8601](#376-representation-of-time) date/time of the dataset creation. | +| `citation` | string | **Optional** Deprecated; use `datasetCitation`. | +| `datasetCitation` | string | **Optional** How to cite the data set. An actionable DOI is preferred (e.g., https://doi.org/...). 
Note that there is a `serverCitation` in an `/about` response for citing the data server, but `datasetCitation` is for the dataset citation, which is typically different. |
| `licenseURL` | string or array of strings | **Optional** A URL or array of URLs to a license landing page. If the license is in the [spdx.org](https://spdx.org/) list, provide a link to it. A list of licenses implies that users may choose any one from the list.|
| `provenance` | string | **Optional** A description of the provenance of this dataset.|
| `modificationDate` | string | **Optional** [Restricted ISO 8601](#376-representation-of-time) date/time at which the metadata and/or data was last modified. (Use this to signal that any content of the dataset in the full time range of data has changed; more granularity may be possible in the future; see [issue 218](https://github.com/hapi-server/data-specification/issues/218).) |
| `contact` | string | **Optional** Relevant contact person name (and possibly contact information) for science questions about the dataset. |
| `contactID` | string | **Optional** The identifier in the discovery system for information about the contact. For example, the SPASE ID or ORCID of the person. |
| `additionalMetadata` | object | **Optional** A way to include a block of other (non-HAPI) metadata. See below for a description of the object, which can directly contain the metadata or point to it via a URL. |
| `definitions` | object | **Optional** An object containing definitions that are referenced using a [JSON reference](#3614-json-references). |
| `note` | string or array of strings | **Optional** General notes about the dataset that are inappropriate to include in `description`. For example, a change log that lists added parameters.
If an array of strings is used to describe datestamped notes, we recommend prefixing each note with a [HAPI restricted ISO 8601 format](#376-representation-of-time) date, e.g., `["2024-01-01T00: Note on this date time", "2024-04-02T00: Note on this date time"]`.|
| `warning` | string or array of strings | **Optional** Temporary warnings about the dataset, such as "dataset stopDate is typically updated continuously, but ..." |


### 3.6.3 `unitsSchema` Details

One optional attribute is `unitsSchema`. This allows a server to specify, for each dataset, what convention is followed for the `units` strings in the parameters of the dataset. Currently, the allowed values for `unitsSchema` are: `udunits2`, `astropy3`, `cdf-cluster`, and `vounits1.1`. These represent the currently known set of unit conventions with software available for parsing and interpreting unit strings. Only major version numbers (if available) are indicated in the convention name. It is expected that this list will grow over time as needed.
The current locations of the official definitions and software tools for interpreting the various unit conventions are given in the following table:

| Convention Name | Current URL | Description (context help if link is broken) |
|-----------------|--------------------------------|----------------------------------------------|
| `udunits2` | https://www.unidata.ucar.edu/software/udunits | Unidata from UCAR; a C library for units of physical quantities |
| `astropy3` | https://docs.astropy.org/en/stable/units/ | A package inside `astropy` that handles defining, converting between, and performing arithmetic with physical quantities, such as meters, seconds, Hz, etc. |
| `cdf-cluster` | https://caa.esac.esa.int/documents/DS-QMW-TN-0010.pdf which is referenced on this page: https://www.cosmos.esa.int/web/csa/documentation | Conventions created and used by ESA's Cluster mission |
| `vounits1.1` | https://www.ivoa.net/documents/VOUnits/ | IVOA Recommendation for Units in the VO |



### 3.6.4 `coordinateSystemSchema` Details

The `parameter` object (described below) allows for a `coordinateSystemName` to be associated with any parameter.
To give these coordinate system names a precise and computer-readable meaning, a dataset
can have a `coordinateSystemSchema`. This schema should contain a computer-readable and publicly available
list of coordinate system names and definitions. Few such schemas exist. One suggested schema designation
for Heliophysics is provided in the following table.
If more community-approved schemas emerge for coordinate systems, they could be included in the table. If these schemas become
widely adopted, it may make sense in the future to restrict `coordinateSystemSchema` to a set of enumerated values, but for now
it is unconstrained.
| Schema Name | URL | Description |
|----------------|--------------------------------------|------------------------------------------------------------------------|
| `spase2.4.1` | [https://spase-group.org/data/schema/spase-2.4.1.xsd](https://spase-group.org/data/schema/spase-2.4.1.xsd) | Schema containing the names and definitions of coordinate systems commonly used in Heliophysics. See also [an HTML version of the relevant section of the schema.](https://spase-group.org/data/model/spase-2.4.1/spase-2_4_1_xsd.html#CoordinateSystemName) |

This schema captures a Heliophysics-specific list of coordinate systems and is part of the larger SPASE metadata model (also Heliophysics-specific). It serves as an example of how a publicly accessible list of coordinate frames can be referenced by the `coordinateSystemSchema` keyword. Different communities can reference their own schemas.

Within the `parameter` object is the `coordinateSystemName` keyword, which contains the name of the coordinate
system (which must come from the schema). There is also the `vectorComponents` keyword to capture the details about which
coordinate components are present in the parameter. See the description
below ([the `vectorComponents` section](#3611-vectorcomponents-object)) for details on how to describe
parameters that contain directional (i.e., vector) quantities.

### 3.6.5 `additionalMetadata` Object

HAPI allows for bulk inclusion of additional metadata that may exist for a dataset. Additional metadata keywords can be inserted by prefixing them with `x_` (which indicates that the element is not part of the HAPI standard), but this would require renaming the keywords of the original metadata.

The `additionalMetadata` object is a list of objects represented by the table below and allows for one or more sets of additional metadata, which can be isolated from each other and from HAPI keywords.
```json
{
  "additionalMetadata" : [ md1, md2, md3 ]
}
```

There can be one or more metadata objects (`md1`, `md2`, `md3`, etc. above) in the list, and the keywords for these objects are as follows:

| Keyword | Type | Description |
|---------------------|-------------|--------------------------------------------|
| `name` | string | **Optional** the name of the additional metadata |
| `content` | string or JSON object | **Required** (if no `contentURL`) either a string with the metadata content (for XML), or a JSON object representing the object tree for the additional metadata |
| `contentURL` | string | **Required** (if no `content`) URL pointing to additional metadata |
| `schemaURL` | string | **Optional** points to a computer-readable schema for the additional metadata |
| `aboutURL` | string | **Optional** points to a human-readable explanation of the metadata |

The `name` is appropriate if the additional metadata follows a well-known standard. One of `content` or `contentURL` must be present. The `content` can be a string version of the actual metadata or a JSON object tree. If there is a schema reference embedded in the metadata (easy to do with XML and JSON), clients can figure that out, but if no internal schema is in the metadata, then the `schemaURL` can point to an external schema. The `aboutURL` is for humans to learn about the given type of additional metadata.

For the `name`, please use one of these if appropriate: `CF`, `FITS`, `ISTP`, `SPASE`. Other fields beyond Heliophysics will likely have their own metadata names, which could be listed here if requested.
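As a sketch of how a client might consume this structure (in Python; the helper name and the `fetch` callback are invented for illustration, and no real server is contacted):

```python
# Hypothetical client-side helper: collect (name, content) pairs from the
# additionalMetadata list of an /info response. Inline `content` is used
# directly; otherwise `contentURL` is retrieved via a caller-supplied
# fetch() callback (a stand-in for an HTTP GET).
def additional_metadata(info, fetch=None):
    blocks = []
    for md in info.get("additionalMetadata", []):
        if "content" in md:
            body = md["content"]            # inline string or JSON object tree
        elif "contentURL" in md and fetch is not None:
            body = fetch(md["contentURL"])  # retrieve the metadata from its URL
        else:
            body = None                     # nothing retrievable without a fetcher
        blocks.append((md.get("name"), body))
    return blocks
```

An inline `content` object is returned as-is, while a `contentURL` entry triggers exactly one `fetch` call, so a client can defer or skip network access as needed.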
**Example**

```json
{
  "additionalMetadata" : [
     { "name": "SPASE",
       "contentURL": "https://hpde.io/NASA/DisplayData/ACE/MAG/27-Day.xml",
       "aboutURL": "http://spase-group.org"
     },
     { "name": "CF",
       "content": { "keyword" : "value1", "keyword2": "value2" },
       "aboutURL": "https://cfconventions.org/"
     },
     { "name": "ISTP",
       "content": { "note": "JSON object (not shown) representing the tree of ISTP keyword-value pairs" },
       "aboutURL": "https://spdf.gsfc.nasa.gov/istp_guide/variables.html"
     },
     { "name": "FITS",
       "content": { "fits_header_keyword1" : "value1", "fits_header_keyword2": "value2" },
       "aboutURL": "http://fits.gsfc.nasa.gov"
     }
  ]
}
```

Note that no single dataset would likely have this variety of additional metadata -- these are provided for illustration.

### 3.6.6 `stringType` Object

`stringType` is an optional element within each `parameter` object, and it allows servers to indicate
that a string parameter has a special interpretation.

Currently, the only special `stringType` allowed is a URI, and it can be used to indicate that a string
parameter contains a time series of links to resources (pointed to by the URIs).

The main use of HAPI is serving numeric data, but the ability to also serve URIs that point to data
opens up two use cases for HAPI servers.

1. Serving image URIs. In this case, the images should be in a widely recognized format that could be easily interpreted by libraries available to many clients, such as JPG, PNG, etc.

2. Serving data file URIs to provide a list of files used to construct a HAPI numeric data response. For example, if a server has a dataset named `dataset1`, the files used to construct a request for `dataset1` could be provided in a dataset named `dataset1Files`. In this case, a user can request `dataset1` over a time range and determine what files `dataset1` came from using a request for `dataset1Files` over the same time range.
   It is emphasized that it is not recommended for a HAPI server to provide only datasets of data-file URIs when the files contain time series data that could instead be served as HAPI numeric data. HAPI clients should only need to read a HAPI stream and not have to read and parse data in arbitrary file formats.

A recommended practice in both cases is also to include columns that provide metadata values.

The `stringType` attribute can either have a simple value that is just the string `uri`,
or it can be an object that is a dictionary with `uri` as
the key and a value that is another object with three optional elements: `mediaType`, `scheme`, and `base`.
Thus a `stringType` will have one of the following forms:

```javascript
"stringType": "uri"
```
or
```javascript
"stringType": {
    "uri": {
        "mediaType": "image/png",
        "scheme": "https",
        "base": "https://cdaweb.gsfc.nasa.gov/pub/pre_generated_plots/de/pwi/"
    }
}
```

The `uri` object attributes are:

| stringType Attribute | Type | Description |
|----------------------|---------|-----------------------------------------------------------------|
| `mediaType` | string | **Optional** indicates the content type behind the URI (also referred to as MIME type) |
| `scheme` | string | **Optional** the access protocol for the URI |
| `base` | string | **Optional** allows each URI string value to be relative to a base URI |


The media type indicates the type of data each URI points to. HAPI places no constraints on the values
for `mediaType`, but servers should only use standard values,
such as `image/fits`, `image/png`, or `application/x-cdf`. There are standard lists of media types available,
and we do not repeat them in the HAPI specification.

The `scheme` describes the access protocol. Again, there are no restrictions, but there is an expectation that it should
be a well-known protocol, such as `http`, `https`, `ftp`, `doi`, or `s3` (used for Amazon object stores).
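Client handling of the two `stringType` forms shown above can be sketched as follows (a minimal Python illustration; the function name is invented, and the attribute checking mirrors the table above):

```python
# Hypothetical helper: normalize the two allowed forms of `stringType`
# into a (type, attributes) pair. Accepts the simple string form "uri"
# or the object form {"uri": {...}} with optional mediaType, scheme, base.
def parse_string_type(string_type):
    if string_type == "uri":
        return "uri", {}                    # simple form: no attributes
    if isinstance(string_type, dict) and "uri" in string_type:
        attrs = string_type["uri"]
        unknown = set(attrs) - {"mediaType", "scheme", "base"}
        if unknown:
            raise ValueError("unexpected uri attributes: %s" % sorted(unknown))
        return "uri", dict(attrs)
    raise ValueError("unsupported stringType: %r" % (string_type,))
```

For example, `parse_string_type({"uri": {"mediaType": "image/png"}})` yields `("uri", {"mediaType": "image/png"})`, giving a client one code path for both forms.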
The `base` allows the individual string values for the parameter to be relative to a base URI, typically a web-accessible location ending in a slash.

URIs should follow the syntax outlined in [RFC 3986](https://www.rfc-editor.org/rfc/rfc3986). The basic pattern is:
```
URI = scheme ":" ["//" authority] path ["?" query] ["#" fragment]
```

URI strings should not be encoded, because this is what most clients expect, and clients typically do their own encoding of a URI before
issuing a request to retrieve the content.

The units for a string parameter that is a URI should be `null`. The units value here should not be used
to try to describe the contents behind the URIs. URI content is likely too variable to be uniformly
handled by this simple units indicator.

Example:

```javascript
"parameters":
  [ { "name": "Time",
      "type": "isotime",
      "units": "UTC",
      "fill": null,
      "length": 24
    },
    { "name": "solar_images",
      "description": "full-disk images of the Sun by SDO/AIA at wavelength of 304 angstroms",
      "type": "string",
      "length": 64,
      "stringType": {"uri": {"mediaType": "image/fits", "scheme": "https"}},
      "units": null,
      "fill": null
    },
    { "name": "cadence",
      "description": "time between images",
      "type": "double",
      "units": "sec",
      "fill": "-999"
    },
    { "name": "wavelength",
      "description": "which wavelength filter was active for this image",
      "type": "double",
      "units": "angstrom",
      "fill": "NaN"
    },
    { "name": "contains_active_region",
      "description": "boolean indicator of active region presence: 0=no, 1=yes",
      "type": "integer",
      "units": null,
      "fill": "-1"
    }
  ]
```

This example shows what the `parameters` portion of a HAPI `info` response would look like for a set of solar images.
The parameter named `solar_images` is given a `stringType` of `uri` and has a `mediaType` and `scheme` specified.
No `base` is given, so the URIs must be fully qualified.
There are also other parameters (`cadence`,
`wavelength`, and `contains_active_region`) that could be used on the client side for filtering the images
based on the values of those parameters.

The approach shown here illustrates a useful way for HAPI to provide image lists. HAPI queries
can only constrain a set of images by time, but if the response contains metadata values in other columns, clients can restrict the image list further by filtering on values in the metadata columns.

### 3.6.7 parameter Object

The focus of the header is to list the parameters in a dataset. The first parameter in the list must be a time value. This time column serves as the independent variable for the dataset. The time column parameter may have any name, but its type must be `isotime`, and there must not be any fill values in the data stream for this column. The optional dataset attribute `timeStampLocation` (described above) indicates where the time values fall within the measurement intervals. There can be other parameters of type `isotime` in the parameter list. The table below describes the `parameter` object attributes.
+ +| Parameter Attribute | Type | Description | +|-----------------------|----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `name` | string | **Required** A short name for this parameter. Each parameter must have a unique name. It is recommended that all parameter names start with a letter or underscore, followed by letters, underscores, or numbers. This allows the parameter names to become variable names in computer languages. Parameter names in a dataset must be unique, and names are not allowed to differ only by having a different case. Note that because parameter names can appear in URLs that can serve as permanent links to data, changing them will have negative implications, such as breaking links to data. Therefore, parameter names should be stable over time. | +| `type` | string | **Required** One of `string`, `double`, `integer`, or `isotime`. Binary content for `double` is always 8 bytes in IEEE 754 format, `integer` is 4 bytes signed little-endian. 
There is no default length for `string` or `isotime` types. String parameters may include UTF-8 encoded Unicode characters. Strings may be flagged as representing URIs -- see the optional `stringType` attribute in this table. |
| `length` | integer | **Required** For types `string` and `isotime`; **not allowed for others**. The maximum number of bytes that the string may contain. If the response format is binary and a string has fewer than this maximum number of bytes, the string must be padded with ASCII null bytes. If the string parameter contains only ASCII characters, `length` means the maximum number of ASCII characters. If a string parameter contains UTF-8 encoded Unicode characters, `length` means the maximum number of bytes required to represent all of the characters. For example, if a string parameter can be `A` or `α`, then `length: 2` is required, because `α` in Unicode requires two bytes when encoded as UTF-8. HAPI clients that read CSV output from a HAPI server will generally not need to use the `length` parameter. However, for HAPI binary, the `length` parameter is needed for parsing the stream. See also [the description of HAPI binary](#3742-binary). |
| `size` | array of integers | **Required** For array parameters; **not allowed for others**. Must be a 1-D array whose values are the number of array elements in each dimension of this parameter. For example, `"size"=[7]` indicates that the value in each record is a 1-D array of length 7. For the `csv` and `binary` output, there must be 7 columns for this parameter -- one column for each array element, effectively unwinding this array. The `json` output for this data parameter must contain an actual JSON array (whose elements would be enclosed by `[ ]`).
For arrays 2-D and higher, such as `"size"=[2,3]`, the later indices are the fastest moving, so that the CSV and binary columns for such a 2 by 3 would be `[0,0]`, `[0,1]`, `[0,2]` and then `[1,0]`, `[1,1]`, `[1,2]`.Note that `"size": [1]` is allowed but discouraged because clients may interpret it as either an array of length 1 or as a scalar. Similarly, an array size of 1 in any dimension is discouraged because of ambiguity in how clients would treat this structure. Array sizes of arbitrary dimensionality are allowed, but from a practical view, clients typically support up to 3D or 4D arrays. See [the `size` Details section](#368-size-details) for more about array sizes. | +| `units` | null OR string OR array of string | **Required** The units for the data values represented by this parameter. For dimensionless quantities, the value can be the literal string `"dimensionless"` or the special JSON value `null`. Note that an empty string `""` is not allowed. For `isotime` parameters, the units must be `UTC`. If a parameter is a scalar, the units must be a single string. For an array parameter, a `units` value that is a single string means that the same units apply to all elements in the array. If the elements in the array parameter have different units, then `units` can be an array of strings to provide specific units strings for each element in the array. Individual values for elements in the array can also be `"dimensionless"` or `null` (but not an empty string) to indicate no units for that element. The shape of such a `units` array must match the shape given by the `size` of the parameter, and the ordering of multi-dimensional arrays of unit strings is as discussed in the `size` attribute definition above. See the following example responses to an `info` query for examples of a single string and string array units and additional details in [`units` and `label` Arrays](#3610-units-and-label-array). 
| +| `fill` | string | **Required** A fill value indicates no valid data. If a parameter has no fill present for any records in the dataset, this can be indicated by using a JSON null for this attribute as in `"fill": null`. [See the `fill` Details section](#369-fill-details) for more about fill values, **including the issues related to specifying numeric fill values as strings**. Since the primary time column cannot have fill values, it must specify `"fill": null` in the header. | +| `description` | string | **Optional** A brief, one-sentence description of the parameter. | +| `label` | string OR array of string | **Optional** A word or very short phrase that could serve as a label for this parameter (as on a plot axis or in a selection list of parameters). It is intended to be less cryptic than the parameter name. If the parameter is a scalar, this label must be a single string. If the parameter is an array, either a single string label or an array of string labels is allowed. `null` and the empty string `""` are not allowed. A single label string will be applied to all elements in the array, whereas an array of label strings specifies a different label string for each element in the array parameter. The shape of the array of label strings must match the `size` attribute, and the ordering of multi-dimensional arrays of label strings is as discussed in the `size` attribute definition above. See also the following example responses to an `info` query for examples of a single string and string array labels and additional details in [Units and Label Arrays](#3610-units-and-label-array). | +| `stringType` | string or object | **Optional** A string parameter can have a specialized type. Currently, the only supported specialized type is a URI. See [The `stringType` Object](#366-stringtype-object) for more details on syntax and allowed values for `stringType`. 
| +| `coordinateSystemName` | string | **Optional** Some parameters contain directional or position information, such as look direction, spacecraft location, or a measured vector quantity. This keyword specifies the name of the coordinate system for these quantities. If `coordinateSystemName` is given, `vectorComponents` should be given (but this is not required). If a [`coordinateSystemSchema`](#364-coordinatesystemschema-details) was given for this dataset, then the `coordinateSystemName` must come from the schema. [See the `vectorComponents` Object](#3611-vectorcomponents-object) for more about coordinate systems. | +| `vectorComponents` | string or array of strings | **Optional** The name or list of names of the vector-related elements. For a scalar `parameter`, only a single string indicating its component type is allowed. For an array `parameter`, an array of strings corresponding to the component names is expected. If `vectorComponents` is given, `length(size)=1` and `size[0]=length(vectorComponents)` are required. If a parameter has a `coordinateSystemName` and three elements, and they represent Cartesian (x,y,z) values, then as a convenience, you may omit the `vectorComponents`. For any other situation (not three elements, not Cartesian, not ordered as x,y,z), you must specify the `vectorComponents` explicitly. [See the `vectorComponents` Object](#3611-vectorcomponents-object) for additional details. | +| `bins` | array of Bins object | **Optional** For array parameters, each object in the `bins` array corresponds to one of the dimensions of the array and describes values associated with each element in the corresponding dimension of the array. The [bins object table](#3613-bins-object) below describes all required and optional attributes within each `bins` object. 
For example, if the parameter represents a 1-D frequency spectrum, the `bins` array will have one object describing the frequency values for each frequency bin; within that object, the `centers` attribute points to an array of values to use for the central frequency of each channel, and the `ranges` attribute specifies a range associated with each channel. | + +**Example** + +These examples show an `info` response for a hypothetical magnetic field dataset. + +``` +http://server/hapi/info?dataset=ACE_MAG +``` + +**Example Response:** + +```javascript +{ + "HAPI": "3.3", + "status": { "code": 1200, "message": "OK"}, + "startDate": "1998-001Z", + "stopDate" : "2017-100Z", + "coordinateSystemSchema" : { "schemaName": "SPASE", "schemaURI": "TBD"}, + "parameters": [ + { "name": "Time", + "type": "isotime", + "units": "UTC", + "fill": null, + "length": 24 }, + { "name": "radial_position", + "type": "double", + "units": "km", + "fill": null, + "description": "radial position of the spacecraft", + "label": "R Position"}, + { "name": "quality_flag", + "type": "integer", + "units": "none", + "fill": null, + "description": "0=OK and 1=bad"}, + { "name": "mag_GSE", + "type": "double", + "units": "nT", + "fill": "-1e31", + "size" : [3], + "coordinateSystemName": "GSE", + "description": "hourly average Cartesian magnetic field in nT in GSE", + "label": "B field in GSE"} + ] +} +``` + +### 3.6.8 size Details + +The `size` attribute is required for array parameters and not allowed for others. The length of the `size` array indicates the number of dimensions, and each element in the `size` array indicates the number of elements in that dimension. For example, the `size` attribute for a 1-D array would be a 1-D JSON array of length one, with the one element in the JSON array indicating the number of elements in the data array. For a spectrum, this number of elements is the number of wavelengths or energies in the spectrum. 
Thus `"size": [9]` refers to a data parameter that is a 1-D array of length 9, and in the `csv` and `binary` output formats, there will be 9 columns for this data parameter. In the `json` output for this data parameter, each record will contain a JSON array of 9 elements (enclosed in brackets `[ ]`). + +For 2-D or higher arrays, the column ordering needs to be specified for the `csv` and `binary` output formats. For both of these formats, the array needs to be "unrolled" into individual columns. The mapping of a 2-D array element to an unrolled column index is done so that the later array elements change the fastest. For example, given a 2-D array of `"size": [2, 5]`, the 5-item index changes the most quickly. Items in each record will be like this: `[0,0] [0,1] [0,2] [0,3] [0,4] [1,0] [1,1] [1,2] [1,3] [1,4]`. The ordering is similar for higher dimensions. + +No unrolling is needed for JSON arrays because JSON syntax can represent arrays of any dimension. The following example shows one record of data with a time parameter and a single data parameter `"size":[2,5]` (of type double): + +``` +["2017-11-13T12:34:56.789Z", [ [0.0, 1.1, 2.2, 3.3, 4.4], [5.0, 6.0, 7.0, 8.0, 9.0] ] ] +``` + +**Time-Varying `size`** + +If the size of a dimension in a multi-dimensional parameter changes over time, the only way to represent this in HAPI is to define the parameter as having the largest potential `size`, and then use a `fill` value for any data elements which are no longer actually being provided. + +If this size-changing parameter has bins, then the number of bins would also presumably change over time. Servers can indicate the absence of one or more bins by using the time-varying bin mechanism described in [the `bins` section](#3613-bins-object) and then providing all fill values for the `ranges` and `centers` of the records where those bins are absent. 
+ +The following example shows a conceptual data block (not in HAPI format) where there is an array parameter whose values are in columns `data0` through `data3`. The corresponding bin centers are in the columns `center0` through `center3`. The data block shows what happens in the data if the size of the parameter changes from 4 to 3 after the third time step. The data values change to fill (`-1.0e31` in this case), and the values for the centers also change to fill to indicate that the corresponding array elements are no longer valid elements in the array. + +``` +time data0 data1 data2 data3 center0 center1 center2 center3 +2019-01-01T14:10:30.5 1.2 3.4 5.4 8.9 10.0 20.0 30.0 40.0 +2019-01-01T14:10:31.5 1.1 3.6 5.8 8.4 10.0 20.0 30.0 40.0 +2019-01-01T14:10:32.5 1.4 3.8 5.9 8.3 10.0 20.0 30.0 40.0 +2019-01-01T14:10:33.5 1.3 3.1 5.3 -1.0e31 15.0 25.0 35.0 -1.0e31 +2019-01-01T14:10:34.5 1.2 3.0 5.4 -1.0e31 15.0 25.0 35.0 -1.0e31 +2019-01-01T14:10:35.5 1.2 3.0 5.4 -1.0e31 15.0 25.0 35.0 -1.0e31 +``` + +Note that the fill value in the bin centers column indicates that this `data3` array element is gone more permanently than just finding a fill value in `data3`. Finding some fill values in an array parameter would not necessarily indicate that the column was permanently gone, while the bin center being fill indicates that the array size has effectively changed. If a bin center is fill, the corresponding data column should also be fill, even though this is duplicate information (since having a fill `center3` in a record already indicates a non-usable `data3` in that record). + +Recall that the static `centers` and `ranges` objects in the JSON `info` header cannot contain null or fill values. + +### 3.6.9 fill Details + +Note that fill values for all types must be specified as a string (not just as ASCII within the JSON, but as a literal JSON string inside quotes). For `double` and `integer` types, the string should correspond to a numeric value. 
In other words, using a string like `invalid_int` would not be allowed for an integer fill value. Care should be taken to ensure that the string value given has an exact numeric representation; special care is needed for `double` values, which can suffer from round-off problems. For integers, string fill values must correspond to an integer value small enough to fit into a 4-byte signed integer. For `double` parameters, the fill string must parse to an exact IEEE 754 double representation. One suggestion is to use large negative values, such as `-1.0E30`. The string `NaN` is allowed, in which case the `csv` output should contain the string `NaN` for fill values. For `binary` data output with double NaN values, the bit pattern for quiet NaN should be used, as opposed to the signaling NaN (see [[6](#6-references)]). For `string` and `isotime` parameters, the string `fill` value is taken at face value, and it must fit within the declared `length` of the parameter. + +### 3.6.10 units and label Array + +When a scalar `units` value is given for an array parameter, the scalar is assumed to apply to all elements in the array -- a kind of broadcast application of the single value to all values in the array. For multi-dimensional arrays, the broadcast applies to all elements in every dimension. A partial broadcast to only one dimension in the array is not allowed. Either a full set of unit strings is given to describe every element in the multi-dimensional array, or a single value is given to apply to all elements. This allows for the handling of special cases while keeping the specification simple. The same broadcast rules govern labels. + +The previous example included the optional `label` attribute for some parameters. Using a single string for the `units` and `label` of the array parameter `mag_GSE` indicates that all array elements have the same units and label. 
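This broadcast rule can be sketched in code. The following helper is illustrative only (not part of the specification); it expands a HAPI `units` (or `label`) value into one entry per array element, in unrolled column order:

```python
def expand_units(units, size):
    """Expand a HAPI `units` (or `label`) value into a flat list with one
    entry per array element, in the unrolled column order.

    A scalar string (or None, for units) broadcasts to every element; a
    length-1 array acts like a scalar (allowed but discouraged). Otherwise
    the nested array must match `size` exactly -- partial broadcasts to
    only one dimension are not allowed.
    """
    if isinstance(units, list) and len(units) == 1 and not isinstance(units[0], list):
        units = units[0]                      # ["m/s"] is treated like "m/s"
    if not isinstance(units, list):           # scalar: broadcast to all elements
        n = 1
        for dim in size:
            n *= dim
        return [units] * n

    def flatten(u, dims):
        if not dims:                          # leaf level: must be a scalar
            if isinstance(u, list):
                raise ValueError("units nested deeper than the parameter size")
            return [u]
        if not isinstance(u, list) or len(u) != dims[0]:
            raise ValueError("units array shape does not match parameter size")
        out = []
        for item in u:
            out.extend(flatten(item, dims[1:]))
        return out

    return flatten(units, size)
```

For example, `expand_units("m/s", [2, 3])` yields six `"m/s"` entries, while a partially broadcast value such as `["m/s", "m/s", "km/s"]` for `size` `[2,3]` raises an error, matching the disallowed cases shown later in this section.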
The next example shows an `info` response for a magnetic field dataset where the vector components are assigned distinct units and labels. + +**Example** + +```json +{ + "HAPI": "3.3", + "status": {"code": 1200, "message": "OK"}, + "startDate": "1998-001Z", + "stopDate" : "2017-100Z", + "parameters": [ + { "name": "Time", + "type": "isotime", + "units": "UTC", + "fill": null, + "length": 24 }, + { "name": "radial_position", + "type": "double", + "units": "km", + "fill": null, + "description": "radial position of the spacecraft", + "label": "R Position"}, + { "name": "quality_flag", + "type": "integer", + "units": "none", + "fill": null, + "description": "0=OK and 1=bad" }, + { "name": "mag_GSE", + "description": "B field as magnitude and two angles theta (colatitude) and phi (longitude)", + "type": "double", + "size" : [3], + "coordinateSystemName": "GSE", + "vectorComponents": [ "r", "colatitude", "longitude" ], + "units": ["nT","degrees", "degrees"], + "label": ["B Magnitude", "theta", "phi"], + "fill": "-1e31" } + ] +} +``` + +Each element in the string array applies to the corresponding element in the `mag_GSE` data array. + +The following are examples for `units` and `label` values for a parameter of `size=[2,3]`. The parameter is vector velocity from two separate instruments. In a JSON response, the parameter at a given time would have the form + +    ["2017-11-13T12:34:56.789Z", [ [1, 2, 3], [4, 5, 6] ] ] + +The `[1,2,3]` are measurements from the first instrument and the `[4, 5, 6]` are measurements from the second instrument. 
+ +**Allowed** (scalar for `units` and for `label` applies to all elements in the array) + +```Javascript +"size": [2,3], +"units": "m/s", +"label": "velocity" +``` + +**Allowed** (scalar units applied to all six elements in the array; unique label for each element) + +```Javascript +"size": [2,3], +"units": "m/s", +"label": [["V1x","V1y","V1z"],["V2x","V2y","V2z"]] +``` + +**Allowed** (array of length 1 is treated like scalar; not preferred but allowed) + +```Javascript +"size": [2,3], +"units": ["m/s"], +"label": [["V1x","V1y","V1z"],["V2x","V2y","V2z"]] +``` + +**Allowed** (all elements are properly given their own `units` string) + +```Javascript +"size": [2,3], +"units": [["m/s","m/s","km/s"],["m/s","m/s","km/s"]], +"label": [["V1x","V1y","V1z"],["V2x","V2y","V2z"]] +``` + +**Not Allowed** (array size does not match parameter size -- must specify all `units` elements if not just giving a scalar) + +```Javascript +"size": [2,3], +"units": ["m/s","m/s","km/s"], +"label": [["V1x","V1y","V1z"],["V2x","V2y","V2z"]] +``` + +**Not Allowed** (`units` array size does not match parameter size) + +```Javascript +"size": [2,3], +"units": ["m/s",["m/s","m/s","km/s"]], +"label": [["V1x","V1y","V1z"],["V2x","V2y","V2z"]] +``` + +### 3.6.11 `vectorComponents` Object + +For a `parameter` that describes a vector-related quantity (position of spacecraft relative to a body, location of ground station, direction of measured vector quantity, detector look direction), the `vectorComponents` keyword indicates the vector-related components present in the data. For a scalar `parameter` that is associated with a vector component, this keyword contains a single string describing the component. For non-scalar parameters, this keyword contains an array of strings naming the coordinate values present in the parameter. + +The names represent the mathematical components typically found in vector or positional quantities. 
+ +Possible component names are constrained to be one of the following: + +| Component Name | Meaning | +|----------------|--------------------------------------------------------------------| +| `x` | Cartesian x component of vector | +| `y` | Cartesian y component of vector | +| `z` | Cartesian z component of vector | +| `r` | Magnitude of vector | +| `rho` | Magnitude of radial component (perpendicular distance from z-axis) in a Cylindrical coordinate representation | +| `latitude` | Angle relative to x-y plane from -90° to 90°, or -π/2 to π/2 (90° corresponds to +z axis) | +| `colatitude` | Angle relative to +z axis, from 0 to 180°, or 0 to π | +| `longitude` | Angle relative to +x axis of a projection of the vector into the x-y plane, from -180° to 180°, or -π to π (90° corresponds to +y axis; this is also known as "East longitude") | +| `longitude0` | Angle relative to +x axis of a projection of the vector into the x-y plane, from 0° to 360°, or 0 to 2π (270° corresponds to -y axis; this is also known as "East longitude") | +| `altitude` | Altitude above a reference sphere or surface as identified in `coordinateSystemName`. For example, locations on Earth are often described by longitude, latitude, and altitude.| +| `other` | Any parameter component that cannot be described by a name in this list | + +Note that `vectorComponents` should be interpreted as describing vector-related elements. In some instances, the `vectorComponents` will describe a proper vector (e.g., if `vectorComponents = ["x", "y", "z"]`). + +Many angular quantities in datasets have different names than the ones here +(azimuth instead of longitude, or elevation or inclination instead of latitude), +but most quantities directly map to these commonly used component names, which come from +spherical or cylindrical coordinate systems. A [future version](https://github.com/hapi-server/data-specification/issues/115) of HAPI may offer +a way to represent other angular components. 
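To make the geometric meaning of these names concrete, here is an illustrative (not normative) client-side conversion of an `["r", "colatitude", "longitude"]` component triple to Cartesian `x`, `y`, `z`, assuming the angles are given in degrees:

```python
import math

def spherical_to_cartesian(r, colatitude_deg, longitude_deg):
    """Convert (r, colatitude, longitude) component values -- as named by
    `vectorComponents` -- to Cartesian (x, y, z). Colatitude is measured
    from the +z axis and longitude from the +x axis, both in degrees."""
    theta = math.radians(colatitude_deg)
    phi = math.radians(longitude_deg)
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))
```

A vector with `r=1`, `colatitude=90`, `longitude=0` lies along the +x axis, and `colatitude=0` gives the +z axis, consistent with the component definitions in the table above.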
+ +The following parameter object of an `/info` response (but not a full `/info` response) demonstrates the use of directional quantities. + +```json +"parameters": [ + { "name": "spacecraftLocationXYZ", + "description": "S/C location in X,Y,Z; vector direction from center of Earth to spacecraft", + "size": [3], + "units": "km", + "coordinateSystemName": "GSM", + "vectorComponents": ["x", "y", "z"] }, + + { "name": "spacecraftLocationSpherical", + "description": "position in R, MLT and magnetic latitude", + "size": [3], + "units": ["Re", "hours", "degrees" ], + "coordinateSystemName": "GSM", + "vectorComponents": ["r", "longitude", "latitude"] }, + + { "name": "magnetic_field", + "description": "mag vector in Cartesian coords; components are x,y,z which is the assumed default for size [3]", + "size": [3], + "units": "nT", + "coordinateSystemName": "GSM" }, + + { "name": "magnetic_field_cylindrical", + "description": "mag vector in cylindrical coords", + "size": [3], + "units": ["nT","degrees", "nT"], + "coordinateSystemName": "GSM", + "vectorComponents": ["rho", "longitude", "z"] }, + + { "name": "Bx", + "description": "mag X component", + "units": "nT", + "coordinateSystemName": "GSE", + "vectorComponents": "x" }, + + { "name": "By", + "description": "mag Y component", + "units": "nT", + "coordinateSystemName": "GSE", + "vectorComponents": "y" }, + + { "name": "Bz", + "description": "mag Z component", + "units": "nT", + "coordinateSystemName": "GSE", + "vectorComponents": "z" } + ] +``` + +In this example, the spacecraft position is provided in two different ways, Cartesian and spherical. +The default value for `vectorComponents` is `["x", "y", "z"]`, and so it may be included if +desired, as is the case with the first position parameter, `spacecraftLocationXYZ`, or omitted +as is shown for the `magnetic_field` parameter. 
If the `units` provided is a scalar, it applies +to all components of the parameter (this is true regardless of whether the `parameter` is a vector). +The parameter `magnetic_field_cylindrical` is indeed in cylindrical coordinates, so it does list +specific vector components of `["rho", "longitude", "z"]`, and since the units of each component are +not the same, those are listed in an array as `["nT", "degrees", "nT"]`. + +Note that the `longitude` units of the `spacecraftLocationSpherical` parameter are `hours`. This +must be a floating point value to capture fractions of hours, not a string with hours, minutes, and seconds. +So, within the data values for this parameter component, 20.5 would be an interpretable value +for 20 hours 30 minutes, but `20:30:00` would not be valid. + +The last three parameters in this example (`Bx`, `By`, and `Bz`) illustrate how a vector quantity +can be spread across multiple scalar parameters. The `vectorComponents` attribute identifies which +component is in the parameter. All relevant components would need to be manually combined to create the +vector quantity. + +### 3.6.12 location and geoLocation Details + +Some datasets have parameters whose measurements are associated with a fixed location. Other measurements +are made from locations that change with time. +The `location` attribute can express the location of measurements either as a single, fixed location, +or as location values that change with time. An alternative attribute, `geoLocation`, can +be used instead as a compact way of describing an Earth-based position for a dataset's measurements. +Only `location` or `geoLocation` should be used, not both. + +For the case of a fixed location on Earth, the `geoLocation` attribute concisely +represents a point location with an array of longitude, latitude, and optionally altitude. 
+ +```javascript + "geoLocation": [longitude, latitude] + -OR- + "geoLocation": [longitude, latitude, altitude] +``` + +Angles in `geoLocation` must be in `deg`, altitude must be in `m`, and the coordinate system is WGS84 (see [Geodetic Coordinate Systems](https://en.wikipedia.org/wiki/World_Geodetic_System)). +This makes `geoLocation` consistent with the [GeoJSON](https://geojson.org) specification for a point location. + +As an example, the location of measurements made by a hypothetical magnetometer at the peak of Sugarloaf Mountain in Maryland, USA, could be represented as: + +```javascript +"geoLocation": [-77.395, 39.269, 391.0] +``` + +For a fixed location expressed in a different coordinate system, or with other vector components for the location, +use a `location` object with these four attributes: + +| location Attribute | Type | Description | +|------------------------|---------|-----------------------------------------------------------------| +| `point` | double or double array | **Required** Values to specify the location. | +| `vectorComponents` | string or string array | **Required** The kind of [`vectorComponents`](#3611-vectorcomponents-object) values in the `point` array. The number of values should match what is in `point`. | +| `units` | string or string array | **Required** Units string or strings for the `vectorComponents` values. To see how `units` are applied to scalars or arrays, see [the section about units and arrays](#3610-units-and-label-array). | +| `coordinateSystemName` | string | **Required** The name of the coordinate system for the `vectorComponents` quantities. | + +If a `unitsSchema` has been specified for this dataset, any string given for the `units` must be consistent with that schema. Similarly, if a `coordinateSystemSchema` has been specified for this dataset, any string given for the `coordinateSystemName` must be consistent with that schema. 
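A client could sanity-check these requirements with a short helper. This sketch is purely illustrative (the attribute rules come from the table above; the function itself is not part of the specification):

```python
def check_location(loc):
    """Illustrative consistency checks for a fixed HAPI `location` object."""
    for key in ("point", "vectorComponents", "units", "coordinateSystemName"):
        if key not in loc:
            raise ValueError(f"location is missing required attribute {key!r}")
    # Normalize scalar values to one-element lists so the length checks work.
    point = loc["point"] if isinstance(loc["point"], list) else [loc["point"]]
    comps = (loc["vectorComponents"]
             if isinstance(loc["vectorComponents"], list)
             else [loc["vectorComponents"]])
    if len(comps) != len(point):
        raise ValueError("number of vectorComponents must match number of point values")
    units = loc["units"]  # a scalar string broadcasts to all point values
    if isinstance(units, list) and len(units) != len(point):
        raise ValueError("a units array must have one entry per point value")
```

The fixed-location examples that follow pass these checks; a `point` with three values but only one component name, for instance, would be rejected.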
+ +**Examples** + +A verbose version of `"geoLocation": [-77.395, 39.269, 391.0]`: + +```javascript +"location": { + "point": [-77.395, 39.269, 391.0], + "vectorComponents": ["longitude", "latitude", "altitude"], + "units": ["deg", "deg", "m"], + "coordinateSystemName": "wgs84" +} +``` + +Location in non-WGS84 coordinate system and with Cartesian vector components: + +```javascript +"location": { + "point": [-4.1452e3, 1.2050e3, 0.10201e3], + "vectorComponents": ["x", "y", "z"], + "units": "km", + "coordinateSystemName": "GSE" +} +``` + +Note that in the second example, the units value of `km` [applies to all component elements](#3610-units-and-label-array). + +**Time-Varying Locations** + +If a dataset has parameters where the measurement location changes over time, the `location` object can indicate +the names of other parameters in the dataset that contain the corresponding time-varying locations. If the time-varying position is +present in more than one coordinate system, each can be referenced. Therefore, the `parameters` +attribute of the `location` object is an array (outer array) consisting of a list of inner arrays of string parameter names. +If a parameter containing location info has all the `vectorComponents` in it to fully represent the location, then the +inner array will have just one element: the name of the fully sufficient parameter. +If the position info for one coordinate system is spread over multiple parameters, then +each parameter name needs to be in the inner array. +``` +"location": { + "parameters": [ ["param_name_for_location_using_coord_sys_A"], ["param_name_for_location_using_coord_sys_B"] ] + // each parameter here must be a vector and have in its attributes a full set of `vectorComponents` to describe the vector +} +``` +Examples help illustrate: +``` +"location": { + "parameters": [ ["Location_GEO"], ["Location_GSE"] ] + } +``` +In this first example, two parameters provide position info, each in a different coordinate system. 
+Within the definition of each location parameter, there must be a `vectorComponents` description. +``` +"location": { + "parameters": [ ["Location_GEO_X", "Location_GEO_Y", "Location_GEO_Z"], + ["Location_J2000_X", "Location_J2000_Y", "Location_J2000_Z"] + ] +} +``` +In this second example, there are also two coordinate systems, but each one is expressed +across three parameters, one each for the x, y, and z values of the position. + +### 3.6.13 bins Object + +The `bins` attribute of a parameter is an array of JSON objects with the following attributes. + +| Bins Attribute | Type | Description | +|------------------|-------------------------------|-----------------------------------------------------------------| +| `name` | string | **Required** Name for the dimension (e.g. "Frequency"). | +| `centers` | array of n doubles | **Required if no `ranges`** The centers of each bin range. | +| `ranges` | array of n arrays of 2 doubles | **Required if no `centers`** The boundaries for each bin (array of `[min, max]`). | +| `units` | string | **Required** The units for the bin ranges and/or center values. | +| `label` | string | **Optional** A label appropriate for a plot (use if `name` is not appropriate). | +| `description` | string | **Optional** Brief comment explaining what the bins represent. | + +Notes: +* At least one of `ranges` and `centers` must be given. +* Some dimensions of a multi-dimensional parameter may not represent binned data. Each dimension must be described in the `bins` object, but any dimension not representing binned data should indicate this by using `"centers": null` and not including the `ranges` attribute. +* Bin centers and/or ranges may depend on time (but the number of centers and/or ranges may not change); see the description of time-varying bins later in this section. +* Bin centers and/or ranges may not depend on non-time dimensions. For example, suppose the parameter dimensions are time, energy and pitch angle. 
The pitch angle bin centers and/or ranges cannot depend on the energy bins. + + +**Example** +```javascript +"parameters": +[ + { + "name": "Time", + "type": "isotime", + "units": "UTC", + "fill": null, + "length": 24 + }, + { + "name": "Protons_10_to_20_keV_pitch_angle_spectrogram", + "type": "double", + "units": "1/(cm^2 s^2 ster keV)", + "fill": "-1.0e31", + "size": [6], + "bins": [{ + "name": "angle_bins", + "ranges": [ + [0.0, 30.0], + [30.0, 60.0], + [60.0, 90.0], + [90.0, 120.0], + [120.0, 150.0], + [150.0, 180.0] + ], + "units": "degrees", + "label": "Pitch Angle" + }] + } +] +``` + +The data given for `centers` and `ranges` must not contain any `null` or missing values. The number of valid numbers in the `centers` array and the number of valid min/max pairs in the `ranges` array must match the size of the parameter dimension being described. So this is not allowed: + +**Invalid** +```Javascript +centers = [2.0, 3.0, 4.0], +ranges = [[1.0, 3.0], null, [3.0, null]] +``` +**Invalid** +```Javascript +centers = [2.0, 3.0, null] +``` + +**Valid** +```Javascript +centers = [2.0, 3.0, 4.0], +ranges = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]] +``` + +**Time-Varying Bins** + +In some datasets, the bin centers and/or ranges may vary with time. The static values in the `bins` object definition for `ranges` or `centers` are fixed arrays and therefore cannot represent bin boundaries that change over time. As of HAPI 3.0, the `ranges` and `centers` objects can be, instead of a numeric array, a string value that is the name of another parameter in the dataset. This allows the `ranges` and `centers` objects to point to a parameter that is the source of numbers for the bin `centers` or `ranges`. The size of the target parameter must match that of the bins being represented. And, of course, each record of data can contain a different value for the parameter, effectively allowing the bin `ranges` and `centers` to change potentially at every time step. 
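On the client side, resolving this indirection can be sketched as follows. This helper is illustrative only (not part of the specification): when a bins `centers` or `ranges` value is a string, it is looked up by name among the dataset's parameters.

```python
def resolve_bin_values(info, value):
    """If a bins `centers` or `ranges` value is a string, it names another
    parameter in the same dataset whose per-record values supply the bin
    numbers; return that parameter's definition. Static numeric arrays
    (or None) pass through unchanged."""
    if not isinstance(value, str):
        return value
    for param in info["parameters"]:
        if param["name"] == value:
            return param
    raise ValueError(f"bins refer to unknown parameter {value!r}")
```

For the proton-spectrum example later in this section, `resolve_bin_values(info, "energy_centers")` would return the `energy_centers` parameter definition, whose values in each data record supply that record's bin centers.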
+ +Due to its complexity, plotting data with time-varying bins may not be supported by all clients. We recommend that clients that do not support it communicate this limitation to the user. + +The following example shows a dataset of multi-dimensional values: proton intensities over multiple energies and at multiple pitch angles. The data parameter name is `proton_spectrum`, and it has bins for both an energy dimension (16 different energy bins) and a pitch angle dimension (3 different pitch angle bins). For the bins in both of these dimensions, a parameter name is given instead of numeric values for the bin locations. The parameter `energy_centers` contains an array of 16 values at each time step, and these are to be interpreted as the time-varying centers of the energies. Likewise, there is a `pitch_angle_centers` parameter, which serves as the source of numbers for the centers of the other bin dimension. There are also `ranges` parameters, which are two-dimensional since each range consists of a low and a high value. 
+ +```javascript +{ + "HAPI": "3.3", + "status": {"code": 1200, "message": "OK"}, + "startDate": "2016-01-01T00:00:00.000Z", + "stopDate": "2016-01-31T24:00:00.000Z", + "parameters": + [ { "name": "Time", + "type": "isotime", + "units": "UTC", + "fill": null, + "length": 24 + }, + { "name": "proton_spectrum", + "type": "double", + "size": [16,3], + "units": "particles/(sec ster cm^2 keV)", + "fill": "-1e31", + "bins": + [ + { "name": "energy", + "units": "keV", + "centers": "energy_centers", + "ranges": "energy_ranges" + }, + { "name": "pitch_angle", + "units": "degrees", + "centers": "pitch_angle_centers", + "ranges": "pitch_angle_ranges" + } + ] + }, + { + "name": "energy_centers", + "type": "double", + "size": [16], // 16 matches size[0] in #/proton_spectrum/size + "units": "keV", // Should match #/proton_spectrum/units + "fill": "-1e31" // Clients should interpret as meaning no measurement made in bin + }, + { "name": "energy_ranges", + "type": "double", + "size": [16,2], // 16 matches size[0] in #/proton_spectrum/size; size[1] must be 2 + "units": "keV", // Should match #/proton_spectrum/units + "fill": "-1e31" // Clients should interpret as meaning no measurement made in bin + }, + { "name": "pitch_angle_centers", + "type": "double", + "size": [3], // 3 matches size[1] in #/proton_spectrum/size + "units": "degrees", // Should match #/proton_spectrum/units + "fill": "-1e31" // Clients should interpret as meaning no measurement made in bin + }, + { "name": "pitch_angle_ranges", + "type": "double", + "size": [3,2], // 3 matches size[1] in #/proton_spectrum/size; size[1] must be 2 + "units": "degrees", // Should match #/proton_spectrum/units + "fill": "-1e31" // Clients should interpret as meaning no measurement made in bin + } + ] +} +``` + +### 3.6.14 JSON References + +If the same information appears more than once within the `info` response, it is better to represent this in a structured way rather than to copy and paste duplicate information. 
Consider a dataset with two parameters -- one for the measurement values and one for the uncertainties. If the two parameters both have `bins` associated with them, the bin definitions would likely be identical. Having each `bins` entity refer back to a pre-defined, single entity ensures that the bin values are identical, and it also more readily communicates the connection to users, who otherwise would have to do a value-by-value comparison to see if the bin values are the same. + +JSON Schema has a built-in mechanism for handling references. HAPI utilizes a subset of these features, focusing on the simple aspects implemented in many existing JSON parsers. Also, using only simple features makes it easier for users to implement custom parsers. Note that familiarity with the full description of JSON references [[5](#6-references)] is helpful in understanding the use of references in HAPI described below. + +JSON objects placed within a `definitions` block can be pointed to using the `$ref` notation. We begin with an example. 
This `definitions` block contains one object, keyed `pitch_angle_bins`, that can be referred to using the JSON syntax `#/definitions/pitch_angle_bins`:
+
+```json
+{
+  "definitions": {
+    "pitch_angle_bins": {
+      "name": "angle_bins",
+      "ranges": [ [0.0, 30.0],
+                  [30.0, 60.0],
+                  [60.0, 90.0],
+                  [90.0, 120.0],
+                  [120.0, 150.0],
+                  [150.0, 180.0]
+                ],
+      "units": "degrees",
+      "label": "Pitch Angle"
+    }
+  }
+}
+```
+
+Here is a parameter fragment showing the reference used in two places:
+
+```json
+{
+  "parameters": [
+    { "name": "Time",
+      "type": "isotime",
+      "units": "UTC",
+      "fill": null,
+      "length": 24
+    },
+    { "name": "Protons_10_to_20_keV_pitch_angle_spectrogram",
+      "type": "double",
+      "units": "1/(cm^2 s^2 ster keV)",
+      "fill": "-1.0e38",
+      "size": [6],
+      "bins": [
+        {"$ref": "#/definitions/pitch_angle_bins"}
+      ]
+    },
+    { "name": "Uncert_for_Protons_10_to_20_keV_pitch_angle_spectrogram",
+      "type": "double",
+      "units": "1/(cm^2 s^2 ster keV)",
+      "fill": "-1.0e38",
+      "size": [6],
+      "bins": [
+        {"$ref": "#/definitions/pitch_angle_bins"}
+      ]
+    }
+  ]
+}
+```
+
+The following rules govern the use of JSON references in a HAPI `info` response.
+
+1. Anything referenced must appear in a top-level node named `definitions` (this is a JSON Schema convention [[5](#6-references)] but a HAPI requirement).
+1. Objects in the `definitions` node may not contain references (JSON Schema [[5](#6-references)] allows this; HAPI does not).
+1. Referencing by `id` is not allowed (JSON Schema [[5](#6-references)] allows this; HAPI does not).
+1. `name` may not be a reference (names must be unique anyway, and allowing this would make HAPI `info` responses potentially very confusing).
+
+By default, a server resolves these references and excludes the `definitions` node. Stated more directly, a server should not return a `definitions` block unless the request URL includes
+
+```
+resolve_references=false
+```
+
+in which case the response metadata should contain references to items in the `definitions` node. 
Note these constraints on what can be in the `definitions` node:
+
+1. Any element found in the `definitions` node must be used somewhere in the full set of metadata. Note that this full metadata can be obtained via an `info` request or by a `data` request to which the header is prepended (using `include=header`).
+1. If an `info` request or a `data` request with `include=header` is for a *subset* of parameters (e.g., `/hapi/info?id=DATASETID&parameters=p1,p2`), the `definitions` node may contain objects that are not referenced in the metadata for the requested subset of parameters; removal of unused definitions is optional in this case.
+
+Here, then, is a complete example of an info response with references unresolved, showing a `definitions` block and the use of references. The `units` string is commonly used, so it is captured as a reference, as is the full bins definition. Note that this example also shows how just a part of the `bins` object can be represented as a reference -- the `centers` array in this case. The example is valid HAPI content, but normally, using both approaches in a single `info` response would not make sense. 
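+
+As an illustration of these rules, a client-side resolver can be quite small. The following sketch (not part of the specification; the helper names are hypothetical) replaces each `{"$ref": "#/definitions/<key>"}` object with its definition and drops the `definitions` node, relying on the constraints above that definitions contain no nested references:
+
+```python
+def resolve_refs(node, definitions):
+    """Recursively replace {"$ref": "#/definitions/<key>"} objects with the definition."""
+    if isinstance(node, dict):
+        if set(node) == {"$ref"}:
+            key = node["$ref"].rsplit("/", 1)[-1]
+            return definitions[key]  # definitions contain no references, so no recursion needed
+        return {k: resolve_refs(v, definitions) for k, v in node.items()}
+    if isinstance(node, list):
+        return [resolve_refs(v, definitions) for v in node]
+    return node
+
+def resolve_info(info):
+    """Return an info response with references resolved and the definitions node removed."""
+    definitions = info.get("definitions", {})
+    return {k: resolve_refs(v, definitions) for k, v in info.items() if k != "definitions"}
+
+info = {
+    "HAPI": "3.3",
+    "definitions": {"spectrum_units": "particles/(sec ster cm^2 keV)"},
+    "parameters": [
+        {"name": "proton_spectrum", "units": {"$ref": "#/definitions/spectrum_units"}}
+    ],
+}
+print(resolve_info(info)["parameters"][0]["units"])  # particles/(sec ster cm^2 keV)
+```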
+
+```json
+{
+  "HAPI": "3.3",
+  "status": {
+    "code": 1200,
+    "message": "OK"
+  },
+  "startDate": "2016-01-01T00:00:00.000Z",
+  "stopDate": "2016-01-31T24:00:00.000Z",
+  "definitions": {
+    "spectrum_units": "particles/(sec ster cm^2 keV)",
+    "spectrum_units_explicit": ["particles/(sec ster cm^2 keV)","particles/(sec ster cm^2 keV)","particles/(sec ster cm^2 keV)","particles/(sec ster cm^2 keV)"],
+    "spectrum_bins_centers": [15, 25, 35, 45],
+    "spectrum_bins": {
+      "name": "energy",
+      "units": "keV",
+      "centers": [15, 25, 35, 45]
+    }
+  },
+  "parameters": [{
+      "name": "Time",
+      "type": "isotime",
+      "units": "UTC",
+      "fill": null,
+      "length": 24
+    },
+    {
+      "name": "proton_spectrum",
+      "type": "double",
+      "size": [4],
+      "units": {"$ref": "#/definitions/spectrum_units"},
+      "fill": "-1e31",
+      "bins": [{
+        "name": "energy",
+        "units": "keV",
+        "centers": {"$ref": "#/definitions/spectrum_bins_centers"}
+      }]
+    },
+    {
+      "name": "proton_spectrum2",
+      "type": "double",
+      "size": [4],
+      "units": {"$ref": "#/definitions/spectrum_units"},
+      "fill": "-1e31",
+      "bins": [
+        {"$ref": "#/definitions/spectrum_bins"}
+      ]
+    },
+    {
+      "name": "proton_spectrum3",
+      "type": "double",
+      "size": [4],
+      "units": {"$ref": "#/definitions/spectrum_units_explicit"},
+      "fill": "-1e31",
+      "bins": [
+        {"$ref": "#/definitions/spectrum_bins"}
+      ]
+    }
+  ]
+}
+```
+
+## 3.7 `data`
+
+This endpoint provides access to a dataset and allows for selecting time ranges and parameters to return. Data is returned as a CSV [[2](#6-references)], binary, or JSON stream. The [Data Stream Content](#374-response-formats) section describes the stream structure and layout for each format.
+
+The resulting data stream can be considered a stream of records, where each record contains one value for each dataset parameter. Each data record must contain a data value or a fill value (of the same data type) for each parameter. 
+
+### 3.7.1 Request Parameters
+
+Items with a * superscript in the following table have been modified from version 2 to 3; see [change notes](#1-significant-changes-to-specification).
+
+| Name | Description |
+|------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `dataset`[ * ](#1-significant-changes-to-specification) | **Required** The identifier for the dataset ([allowed characters](#82-allowed-characters-in-id-dataset-and-parameters)). |
+| `start`[ * ](#1-significant-changes-to-specification) | **Required** The inclusive begin time for the data to include in the response. |
+| `stop`[ * ](#1-significant-changes-to-specification) | **Required** The exclusive end time for the data to include in the response. |
+| `parameters` | **Optional** A comma-separated list of parameters to include in the response ([allowed characters](#82-allowed-characters-in-id-dataset-and-parameters)). Default is all; `...&parameters=&...` in a URL should be interpreted as meaning all parameters. |
+| `include` | **Optional** Has one possible value of "header" to indicate that the info header should precede the data. The header lines will be prefixed with the "\#" character. |
+| `format` | **Optional** The desired format for the data stream. Possible values are "csv", "binary", and "json". |
+
+### 3.7.2 Response
+
+The `data` endpoint response is in one of three formats: CSV format as defined by [[2](#6-references)] with a mime type of `text/csv`; binary format, where floating-point numbers are in IEEE 754 [[4](#6-references)] format and the byte order is LSB, with a mime type of `application/octet-stream`; or JSON format with the structure described below and a mime type of `application/json`. The default data format is CSV. See the [Data Stream Content](#374-response-formats) section for more details. 
+ +If the header is requested, then for binary and CSV formats, each line of the header must begin with a hash (\#) character. For JSON output, no prefix character should be used because the data object will be another JSON element within the response. Other than the possible prefix character, the contents of the header should be the same as returned from the info endpoint. When a data stream has an attached header, the header must contain an additional "format" attribute to indicate if the content after the header is `csv`, `binary`, or `json`. Note that when a header is included in a CSV response, the data stream is not strictly in CSV format. + +The first parameter in the data must be a time column (type of `isotime`) and this must be the independent variable for the dataset. If a subset of parameters is requested, the time column is always provided, even if it is not requested. + +Note that the `start` request parameter represents an inclusive lower bound, and the `stop` request parameter is the exclusive upper bound. The server must return data records within these time constraints, i.e., no extra records outside the requested time range. This enables the concatenation of results from adjacent time ranges. + +There is a relationship between the `info` endpoint and the `data` endpoint because the header from the `info` endpoint describes the record structure of data emitted by the `data` endpoint. Thus, after a single call to the `info` endpoint, a client could make multiple calls to the `data` endpoint (for multiple time ranges, for example) with the expectation that each data response would contain records described by the single call to the `info` endpoint. The `data` endpoint can optionally prefix the data stream with header information, potentially obviating the need for the `info` endpoint. 
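+
+The inclusive-`start`/exclusive-`stop` convention makes it safe to concatenate responses from adjacent time ranges. The following sketch (the server URL and dataset id are hypothetical, not from the specification) builds a set of `data` request URLs where each request's `stop` equals the next request's `start`, so no record is duplicated or skipped:
+
+```python
+from urllib.parse import urlencode
+
+SERVER = "http://server/hapi"  # hypothetical HAPI server base URL
+
+def data_url(dataset, start, stop, parameters=None):
+    """Build a data endpoint URL; omitting parameters requests all of them."""
+    query = {"dataset": dataset, "start": start, "stop": stop}
+    if parameters:
+        query["parameters"] = ",".join(parameters)
+    return SERVER + "/data?" + urlencode(query, safe=",:")
+
+# Adjacent day-long ranges; the stop of one request equals the start of the next.
+days = ["2016-01-01Z", "2016-01-02Z", "2016-01-03Z"]
+urls = [data_url("MY_MAG_DATA", d0, d1) for d0, d1 in zip(days, days[1:])]
+for u in urls:
+    print(u)
+```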
+
+Both the `info` and `data` endpoints take an optional request parameter (recall the definition of request parameter in the introduction) called `parameters` that allows users to restrict the dataset parameters listed in the header and data stream, respectively. This enables clients (that already have a list of dataset parameters from a previous info or data request) to request a header for a subset of parameters that will match a data stream for the same subset of parameters. The parameters in the subset request must be ordered according to the original order of the parameters in the metadata, i.e., the subset can contain fewer parameters, but their order must be the same as in the `parameters` array in the `/info` response.
+
+Consider the following dataset header for a fictional dataset with the identifier `MY_MAG_DATA`.
+
+An `info` request for this dataset[*](#1-significant-changes-to-specification)
+
+```
+http://server/hapi/info?dataset=MY_MAG_DATA
+```
+
+results in a header listing of all the dataset parameters:
+
+```javascript
+{
+  "HAPI": "3.3",
+  "status": { "code": 1200, "message": "OK"},
+  "startDate": "2005-01-21T12:05:00.000Z",
+  "stopDate" : "2010-10-18T00:00:00Z",
+  "parameters": [
+    { "name": "Time",
+      "type": "isotime",
+      "units": "UTC",
+      "fill": null,
+      "length": 24 },
+    { "name": "Bx", "type": "double", "units": "nT", "fill": "-1e31"},
+    { "name": "By", "type": "double", "units": "nT", "fill": "-1e31"},
+    { "name": "Bz", "type": "double", "units": "nT", "fill": "-1e31"}
+  ]
+}
+```
+
+An `info` request for a single parameter has the form
+
+```
+http://server/hapi/info?dataset=MY_MAG_DATA&parameters=Bx
+```
+
+and would result in the following header:
+
+```javascript
+{
+  "HAPI": "3.3",
+  "status": { "code": 1200, "message": "OK"},
+  "startDate": "2005-01-21T12:05:00.000Z",
+  "stopDate" : "2010-10-18T00:00:00Z",
+  "parameters": [
+    { "name": "Time",
+      "type": "isotime",
+      "units": "UTC",
+      "fill": null,
+      "length": 24 },
+    { "name": "Bx",
"type": "double", "units": "nT", "fill": "-1e31" }, + ] +} +``` + +Note that the time parameter is included even though it was not requested. + +In this request[*](#1-significant-changes-to-specification), + +``` +http://server/hapi/info?dataset=MY_MAG_DATA¶meters=By,Bx +``` + +the parameters are out of order. So the server should respond with an error code. See [HAPI Status Codes](#4-status-codes) for more about error conditions. + +### 3.7.3 Examples + +Two examples of data requests and responses are given – one with the header and one without. + +#### 3.7.3.1 Data with Header + +Note that in the following request, the header is to be included, so the same header from the `info` endpoint will be prepended to the data but with a ‘\#’ character as a prefix for every header line. + +``` +http://server/hapi/data?dataset=path/to/ACE_MAG&start=2016-01-01Z&stop=2016-02-01Z&include=header +``` + +Response + + +``` +#{ +# "HAPI": "3.3", +# "status": { "code": 1200, "message": "OK"}, +# "format": "csv", +# "startDate": "1998-001Z", +# "stopDate" : "2017-001Z", +# "parameters": [ +# { "name": "Time", +# "type": "isotime", +# "units": "UTC", +# "fill": null, +# "length": 24 +# }, +# { "name": "radial_position", +# "type": "double", +# "units": "km", +# "fill": null, +# "description": "radial position of the spacecraft" +# }, +# { "name": "quality flag", +# "type": "integer", +# "units ": null, +# "fill": null, +# "description ": "0=OK and 1=bad " +# }, +# { "name": "mag_GSE", +# "type": "double", +# "units": "nT", +# "fill": "-1e31", +# "size" : [3], +# "description": "hourly average Cartesian magnetic field in nT in GSE" +# } +# ] +#} +2016-01-01T00:00:00.000Z,6.848351,0,0.05,0.08,-50.98 +2016-01-01T01:00:00.000Z,6.890149,0,0.04,0.07,-45.26 + ... + ... +2016-01-01T02:00:00.000Z,8.142253,0,2.74,0.17,-28.62 +``` + +#### 3.7.3.2 Data Only + +The following example is the same, except it lacks the request to include the +header. 
+ +``` +http://server/hapi/data?dataset=path/to/ACE_MAG&start=2016-01-01&stop=2016-02-01 +``` + +Response + +Consider a dataset that contains a time field, two scalar fields, and one array field of length 3. The response will have the form: + +``` +2016-01-01T00:00:00.000Z,6.848351,0,0.05,0.08,-50.98 +2016-01-01T01:00:00.000Z,6.890149,0,0.04,0.07,-45.26 +... +... +2016-01-01T02:00:00.000Z,8.142253,0,2.74,0.17,-28.62 +``` + +Note that there is no leading row with column names. The RFC 4180 CSV standard [[2](#6-references)] indicates that such a header row is optional. Leaving out this row avoids the complication of naming individual columns representing array elements within an array parameter. Recall that an array parameter has only a single name. HAPI specifies parameter names via the `info` endpoint, which also provides size details for each parameter (scalar or array, and array size if needed). The size of each parameter must be used to determine how many columns it will use in the CSV data. By not specifying a row of column names, HAPI avoids the need to have a naming convention for columns representing elements within an array parameter. + +### 3.7.4 Response formats + +The three possible output formats are `csv`, `binary`, and `json`. A HAPI server must support `csv`, while `binary` and `json` are optional. We emphasize that this is a simple and ephemeral streaming transport format and should not be considered or used in the same way as standard file formats such as FITS, HDF, CDF, and netCDF, which were not designed for streaming. + +#### 3.7.4.1 CSV + +The format of the CSV stream should follow the guidelines for CSV data as described by RFC 4180 [[2](#6-references)]. Each CSV record is one line of text, with commas between the values for each dataset parameter. Any value containing a comma must be surrounded with double quotes, and any double-quote within a value must be escaped by a preceding double quote. 
An array parameter (i.e., the value of a parameter within one record is an array) will have multiple columns resulting from placing each element in the array into its own column. For 1-D arrays, the ordering of the unwound columns is just the index ordering of the array elements. For 2-D arrays or higher, the right-most array index is the fastest moving index when mapping array elements to columns.
+
+It is up to the server to decide how much precision to include in the ASCII values when generating CSV output.
+
+Client programs interpreting the HAPI CSV stream are encouraged to use existing CSV parsing libraries to be able to interpret the full range of possible CSV values, including quoted commas and escaped quotes. However, it is expected that a simple CSV parser would probably handle more than 90% of known cases.
+
+#### 3.7.4.2 Binary
+
+The binary data output is best described as a binary translation of the CSV stream, with full numerical precision and no commas or newlines. Recall that the dataset header provides type information for each dataset parameter, and this definitively indicates the number of bytes and the byte structure of each parameter and, thus, of each binary record in the stream. Array parameters are unwound in the same way for binary as for CSV data, as described above. All numeric values are little-endian (LSB); integers are always signed and four bytes long, and floating-point values are always IEEE 754 double-precision values.
+
+Dataset parameters of type `string` and `isotime` (strings of ISO 8601 dates) have a maximum length specified in the info header. This length indicates how many bytes to read for each string value. If the string content is shorter than the length, the remaining bytes must be padded with ASCII null bytes. If a string uses all the bytes specified in the `length`, no null terminator or padding is needed. 
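+
+Because the header fixes the byte layout of every record, the size of one binary record can be computed directly from the parameter list. A minimal sketch (using Python's `struct` notation; the parameter metadata shown is illustrative, not from the specification) that builds a little-endian format string from 4-byte signed integers, 8-byte doubles, and fixed-length strings:
+
+```python
+import math
+import struct
+
+TYPE_CODES = {"integer": "i", "double": "d"}  # 4-byte signed int, 8-byte IEEE 754 double
+
+def record_format(parameters):
+    """Build a struct format string describing one binary record."""
+    fmt = "<"  # little-endian, no alignment padding
+    for p in parameters:
+        n = math.prod(p.get("size", [1]))  # array parameters are unwound
+        if p["type"] in ("string", "isotime"):
+            fmt += ("%ds" % p["length"]) * n  # fixed-length, null-padded strings
+        else:
+            fmt += TYPE_CODES[p["type"]] * n
+    return fmt
+
+parameters = [
+    {"name": "Time", "type": "isotime", "length": 24},
+    {"name": "quality_flag", "type": "integer"},
+    {"name": "mag_GSE", "type": "double", "size": [3]},
+]
+fmt = record_format(parameters)
+print(fmt, struct.calcsize(fmt))  # <24siddd 52
+```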
+ +#### 3.7.4.3 JSON + +For the JSON output, an additional `data` element added to the header contains the array of data records. These records are similar to the CSV output, except that strings must be quoted and arrays must be delimited with array brackets in standard JSON fashion. An example helps to illustrate what the JSON format looks like. Consider a dataset with four parameters: time, a scalar value, a 1-D array value with an array length of 3, and a string value. The header with the data object might look like this: + +```javascript +{ "HAPI": "3.3", + "status": { "code": 1200, "message": "OK"}, + "startDate": "2005-01-21T12:05:00.000Z", + "stopDate" : "2010-10-18T00:00:00Z", + "parameters": [ + { "name": "Time", "type": "isotime", "units": "UTC", "fill": null, "length": 24 }, + { "name": "quality_flag", "type": "integer", "description": "0=ok; 1=bad", "fill": null }, + { "name": "mag_GSE", "type": "double", "units": "nT", "fill": "-1e31", "size" : [3], + "description": "hourly average Cartesian magnetic field in nT in GSE" }, + { "name": "region", "type": "string", "length": 20, "fill": "???", "units" : null} + ], +"format": "json", +"data" : [ +["2010-001T12:01:00Z",0,[0.44302,0.398,-8.49],"sheath"], +["2010-001T12:02:00Z",0,[0.44177,0.393,-9.45],"sheath"], +["2010-001T12:03:00Z",0,[0.44003,0.397,-9.38],"sheath"], +["2010-001T12:04:00Z",1,[0.43904,0.399,-9.16],"sheath"] +] + +} +``` + +The data element is a JSON array of records. Each record is itself an array of parameters. The time and string values are in quotes, and any data parameter in the record that is an array must be inside square brackets. This data element appears as the last JSON element in the header. + +The record-oriented arrangement of the JSON format is designed to allow a streaming client reader to begin reading (and processing) the JSON data stream before it is complete. Note also that servers can stream the data when records are available. 
In other words, the JSON format can be read and written without holding all the records in memory. This may require a custom JSON formatter, but this streaming capability is important for large responses. + +### 3.7.5 Errors While Streaming + +If the server encounters an error after the data has begun and can no longer continue, it must terminate the stream. As a result, clients must detect an abnormally terminated stream and treat this aborted condition the same as an internal server error. See [HAPI Status Codes](#4-status-codes) for more about error conditions. + +### 3.7.6 Representation of Time + +Time values are always strings, and the HAPI Time format is a subset of the ISO 8601 date and time format [[1](#6-references)]. + +The restriction on the ISO 8601 standard is that time must be represented as + +``` +yyyy-mm-ddThh:mm:ss.sssZ +``` + +or + +``` +yyyy-dddThh:mm:ss.sssZ +``` + +and the trailing `Z` is required. Strings with less precision are allowed as per ISO 8601, e.g., `1999-01Z` and `1999-001Z`. The [HAPI JSON schema](https://github.com/hapi-server/data-specification-schema/blob/main/HAPI-data-access-schema-3.3.json) lists a series of regular expressions that codifies the intention of the HAPI Time specification. The schema allows leap seconds and hour=24, but it should be expected that not all clients will be able to correctly interpret such time stamps. + +The name of the time parameter is not constrained. However, the time column name is strongly recommended to be "Time" or "Epoch" or some easily recognizable label. + +Note that the ISO 8601 time format allows arbitrary precision on the time values. HAPI servers and clients should be able to interpret times with at least nanosecond precision. + +#### 3.7.6.1 Incoming time values + +Servers must require incoming time values from clients (i.e., the `start` and `stop` values on a data request) to be valid ISO 8601 time values. 
The full ISO 8601 specification allows many esoteric options, but servers must only accept a subset of the full ISO 8601 specification, namely one of either year-month-day (`yyyy-mm-ddThh:mm:ss.sssZ`) or day-of-year (`yyyy-dddThh:mm:ss.sssZ`). Any date or time elements missing from the string are assumed to take on their smallest possible value. For example, the string `2017-01-15T23:00:00.000Z` could be given in truncated form as `2017-01-15T23Z`. Servers should be able to parse and properly interpret these truncated time strings. When clients provide a date at day resolution only, the `T` must not be included, so servers should be able to parse day-level time strings without the `T`, as in `2017-01-15Z`.
+
+Note that in the ISO 8601 specification, a trailing `Z` on the time string indicates that no time zone offset should be applied (so the time zone is GMT+0). If a server receives an input value without the trailing `Z`, it should still interpret the time zone as GMT+0 rather than a local time zone. This is true for time strings with all fields present and for truncated time strings with some fields missing.
+
+| Example time range request | comments |
+|------------------------------|------------------------------------------------|
+| `start=2017-01-15T00:00:00.000Z&stop=2017-01-16T00:00:00.000Z` | OK - fully specified time values with proper trailing Z |
+| `start=2017-01-15Z&stop=2017-01-16Z` | OK - truncated time values that assume 00:00:00.000 for the time |
+| `start=2017-01-15&stop=2017-01-16` | OK - truncated with missing trailing Z, but GMT+0 should be assumed |
+
+There is no restriction on the earliest date or latest date a HAPI server can accept, but as a practical limit, clients are likely to be written to handle dates only from 1700 to 2100.
+
+#### 3.7.6.2 Outgoing time values
+
+Time values in the outgoing data stream must be ISO 8601 strings. 
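+
+Because missing fields take their smallest possible values and the full form has a fixed layout, a truncated year-month-day time can be expanded by padding it against a template. A minimal sketch (the function name is hypothetical; the day-of-year form would need a second template because its field lengths differ):
+
+```python
+def expand_time(s):
+    """Expand a possibly truncated yyyy-mm-dd style time to the full form."""
+    template = "2000-01-01T00:00:00.000Z"  # missing fields default to these values
+    s = s.rstrip("Z")  # a missing trailing Z still means GMT+0
+    return s + template[len(s):]
+
+print(expand_time("2017-01-15T23Z"))  # 2017-01-15T23:00:00.000Z
+print(expand_time("2017-01-15"))      # 2017-01-15T00:00:00.000Z
+print(expand_time("1999-01Z"))        # 1999-01-01T00:00:00.000Z
+```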
A server may use one of either the `yyyy-mm-ddThh:mm:ss.sssZ` or the `yyyy-dddThh:mm:ss.sssZ` form, but must use one format and length within any given dataset. The time values must not have any local time zone offset, and they must indicate this by including the trailing `Z`. Time or date elements may be omitted from the end to indicate that the missing time or date elements should be given their lowest possible value. For date values at day resolution (i.e., no time values), the `T` must be omitted, but the `Z` is still required. Note that this implies that clients must be able to parse potentially truncated ISO strings of both Year-Month-Day and Year-Day-of-year styles. + +For `binary` and `csv` data, the length of the time string, truncated or not, is indicated with the `length` attribute for the time parameter, which refers to the number of printable characters in the string. Every time string must have the same length, so padding of time strings is unnecessary. + +The data returned from a request should strictly fall within the limits of `start` and `stop`, i.e., servers should not pad the data with extra records outside the requested time range. Furthermore, note that the `start` value is inclusive (data at or beyond this time can be included), while `stop` is exclusive (data at or beyond this time shall not be included in the response). + +The primary time column is not allowed to contain any fill values. Each record must be identified with a valid time value. If other columns contain parameters of type `isotime` (i.e., time columns that are not the primary time column), there may be fill values in these columns. Note that the `fill` definition is required for all types, including `isotime` parameters. The fill value for a (non-primary) `isotime` parameter does not have to be a valid time string - it can be any string, but it must be the same length string as the time variable. 
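+
+A server emitting year-month-day times at millisecond resolution would thus format every record's timestamp at the same 24-character length. A sketch (using Python's `datetime`; the helper is illustrative only, not part of the specification):
+
+```python
+from datetime import datetime, timezone
+
+def format_time(t):
+    """Format a UTC datetime in the fixed-length yyyy-mm-ddThh:mm:ss.sssZ form."""
+    s = t.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % (t.microsecond // 1000)
+    assert len(s) == 24  # every time string in the stream has the same length
+    return s
+
+t = datetime(2016, 1, 1, 12, 30, 0, 250000, tzinfo=timezone.utc)
+print(format_time(t))  # 2016-01-01T12:30:00.250Z
+```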
+
+HAPI metadata (in the `info` header for a dataset) allows a server to specify where timestamps fall within the measurement window. The `timeStampLocation` attribute for a dataset is an enumeration with possible values of `begin`, `center`, `end`, or `other`. This attribute is optional, and the default value is `center`, which refers to the exact middle of the measurement window. If the location of the timestamp is not known or is more complex than any of the allowed options, the server can report `other` for the `timeStampLocation`. Clients are likely to treat `other` as if it were `center`, simply because there is not much else they can do. Note that the optional `cadence` attribute is not meant to be accurate enough to use as a way to compute an alternate time stamp location. In other words, given a `timeStampLocation` of `begin` and a `cadence` of 10 seconds, it may not always work to just add 5 seconds to get to the center of the measurement interval for this dataset. This is because the `cadence` provides a nominal duration, and the actual duration of each measurement may vary significantly throughout the dataset. Some datasets may have specific parameters devoted to accumulation time or other measurement window parameters, but HAPI metadata does not capture this level of measurement window detail. Suggestions on handling the issues discussed in this paragraph are given on the [implementation notes](https://github.com/hapi-server/data-specification/wiki/Implementation-Notes#Durations) page.
+
+#### 3.7.6.3 Time Range With No Data
+
+If a request is made for a time range in which there are no data, the server must respond with an HTTP 200 status code. The HAPI [status-code](#4-status-codes) must be either `HAPI 1201` (the explicit no-data code) or `HAPI 1200` (OK). 
While the more specific `HAPI 1201` code is preferred, servers may have a difficult time recognizing the lack of data before issuing the header, in which case issuing `HAPI 1200` with a subsequent absence of any data records communicates to clients that everything worked but no data was present in the given interval. Any response that includes a header (JSON always does, and CSV and binary when requested) must have this same HAPI status in the header. For CSV or binary responses without a header, the message body should be empty to indicate no data records.
+
+Consider the case where a user has not requested the header. If the server has no data, the HTTP header
+
+```
+HTTP/1.1 200 OK
+```
+
+is acceptable, but a more specific header
+
+```
+HTTP/1.1 200 OK; HAPI 1201 OK - no data for time range
+```
+
+is preferred if the server can detect that there is no data in the time range and can modify the HTTP status message. This allows clients to verify that the empty body was intended without making a second request that includes the header containing a HAPI status object.
+
+#### 3.7.6.4 Time Range With All Fill Values
+
+If a request is made with a time range in which the response will contain all fill values, the server must respond with all fill values and not an empty body.
+
+# 4 Status Codes
+
+There are two ways that HAPI servers must report errors, and these must be consistent. Because every HAPI server response is an HTTP response, an appropriate HTTP status and message must be set for each response. The HTTP integer status codes to use are the standard ones (200 means OK, 404 means not found, etc.), and these are listed below.
+
+The text message in the HTTP status should not just be the standard HTTP message but should include a HAPI-specific message, and this text should include the HAPI integer code along with the corresponding HAPI status message for that code. These HAPI codes and messages are also described below. 
Note the careful use of "must" and "should" here. The use of the HTTP header message to include HAPI-specific details is optional, but the setting of the HTTP integer code status is required. + + +As an example, it is recommended that a status message such as + +``` +HTTP/1.1 404 Not Found +``` + +is modified to include the HAPI error code and error message (as described below) + +``` +HTTP/1.1 404 Not Found; HAPI 1402 Bad request - error in start time +``` + +Although the HTTP header mechanism is robust, it is more difficult for some clients to access -- a HAPI client using a high-level URL retrieving mechanism may not have easy access to HTTP header content. Therefore the HAPI response itself must also include a status indicator. This indicator appears as a `status` object in the HAPI header. The two status indicators (HAPI and HTTP) must be consistent, i.e., if one indicates success, so must the other. Note that some HAPI responses do not include a header, and in these cases, the HTTP header is the only place to obtain the status. + +## 4.1 `status` Object + +The HAPI `status` object is described as follows: + +| Name | Type | Description | +|-----------|---------|--------------------------------------------------------------------------------------------------------------------| +| `code` | integer | Specific value indicating the category of the outcome of the request - see [HAPI Status Codes](#4-status-codes). | +| `message` | string | Human readable description of the status - must conceptually match the intent of the integer code. | + +HAPI servers must categorize the response status using at least three status codes: `1200 - OK`, `1400 - Bad Request`, and `1500 - Internal Server Error`. These are intentionally analogous to the similar HTTP codes `200 - OK`, `400 - Bad Request`, and `500 - Internal Server Error`. 
+
+| HTTP code | HAPI status `code` | HAPI status `message` |
+|-------------|--------------------|--------------------------------|
+| `200` | 1200 | OK |
+| `400` | 1400 | Bad request - user input error |
+| `500` | 1500 | Internal server error |
+
+The exact wording in the HAPI message does not need to match what is shown here. The conceptual message must be consistent with the status, but the wording can differ (or be in another language, for example). If the server also includes the HAPI error message in the HTTP status message (recommended, not required), the HTTP status wording should be as similar as possible to the HAPI message wording.
+
+The `about`, `capabilities`, and `catalog` endpoints need only indicate `1200 - OK` or `1500 - Internal Server Error` since they do not take any request parameters. The `info` and `data` endpoints do take request parameters, so their status response must include `1400 - Bad Request` when appropriate.
+
+A response of `1400 - Bad Request` must also be given when the user requests an endpoint that does not exist.
+
+## 4.2 `status` Error Codes
+
+Servers may optionally provide the more specific codes in the table below. For convenience, a JSON object with these codes and messages is given in [the Appendix](#83-json-object-of-status-codes). It is recommended but not required that a server implement this more complete set of status responses. Servers may add their own codes, but must use numbers outside the `1200`s, `1400`s, and `1500`s to avoid collisions with possible future HAPI codes. 
+ +| HTTP code | HAPI status `code` | HAPI status `message` | +|-------------|----------------------|------------------------------------------------| +| `200` | `1200` | OK | +| `200` | `1201` | OK - no data for time range | +| `400` | `1400` | Bad request - user input error | +| `400` | `1401` | Bad request - unknown API parameter name | +| `400` | `1402` | Bad request - syntax error in start time | +| `400` | `1403` | Bad request - syntax error in stop time | +| `400` | `1404` | Bad request - start equal to or after stop | +| `400` | `1405` | Bad request - `start` < `startDate` and/or `stop` > `stopDate` | +| `404` | `1406` | Bad request - unknown dataset id | +| `404` | `1407` | Bad request - unknown dataset parameter | +| `400` | `1408` | Bad request - too much time or data requested | +| `400` | `1409` | Bad request - unsupported output format | +| `400` | `1410` | Bad request - unsupported include value | +| `400` | `1411` | Bad request - out-of-order or duplicate parameters | +| `400` | `1412` | Bad request - unsupported resolve_references value | +| `400` | `1413` | Bad request - unsupported depth value | +| `500` | `1500` | Internal server error | +| `500` | `1501` | Internal server error - upstream request error | + +Note that there is an OK status to indicate that the request was fulfilled correctly but that no data was found. This can be useful feedback to clients and users, who may otherwise suspect server problems if no data is returned. + +Error `1405` implies that a HAPI server should not send data outside of the time range of availability indicated by `startDate` and `stopDate` from an `/info` response. The motivation for this is consistency between metadata and data responses and complexities that could result if servers did not enforce this error condition (see [a related issue discussion](https://github.com/hapi-server/data-specification/issues/97)). 
For a frequently updated dataset, we recommend that the server set the `stopDate` to a future time if there is a desire not to update the `stopDate` in the `/info` response at the same frequency. We also recommend that the allowed start and stop dates are included in the error message associated with a `1405` error.
+
+Note also the response `1408`, indicating that the server will not fulfill the request because it is too large. This gives a HAPI server a way to inform clients about internal limits within the server.
+
+For errors that prevent any HAPI content from being returned (such as a `404 - not found` or `500 - internal server error`), the HAPI server should return a JSON object that is a HAPI header with just the status information. The JSON object should be returned even if the request was for non-JSON data. Returning server-specified content for an error response is also how HTTP servers handle error messages -- think of the custom HTML content that accompanies the `404 - not found` response when asking a server for a data file that does not exist.
+
+In cases where the server cannot create a full response (such as an `info`
+request or `data` request for an unknown dataset), the JSON header response must include the HAPI version and a HAPI status object indicating that an error has occurred.
+
+```javascript
+{
+  "HAPI": "3.3",
+  "status": { "code": 1406, "message": "Bad request - unknown dataset id"}
+}
+```
+
+For the `data` endpoint, clients may request data with no JSON header, and in the case of an error, the HTTP status is then the only place from which a client can determine the response status.
+
+## 4.3 Client Error Handling
+
+Because web servers are not required to limit HTTP return codes to those in the above table, HAPI clients should be able to handle the full range of HTTP responses. Also, the HAPI server code may not be the only software that interacts with a URL-based request from a HAPI server.
There may be a load balancer or upstream request routing or caching mechanism in place. Therefore, it is a good client-side practice to be able to handle any HTTP errors.
+
+Also, recall that in a three-digit HTTP status code, the first digit is the main one that client code should examine when determining the response status. Subsequent digits give a finer nuance to the error, but there may be variability between HTTP server implementations in the exact values of the second and third digits. HAPI servers can use more specific values for these second and third digits but must keep the first digit consistent with the table above.
+
+Consider the HTTP `204` status code, which represents "No Content." A HAPI server can return this code when no data was present over the time range indicated, but (per HTTP rules) it must only do so in cases where the HTTP body contains no data. A HAPI header would count as HTTP data, so the HTTP 204 code can only be sent by a server when a client requests CSV or binary data with no header. Here is a sample HTTP response for this case:
+
+```
+HTTP/1.1 204 OK - no data for time range
+```
+(Note: it may be difficult to change a server's response code message, so this is optional.)
+
+Regardless of whether the server uses a more specific HTTP code, the HAPI code embedded in the HTTP message must properly indicate the HAPI status.
+
+# 5 Implementation Details
+
+## 5.1 Cross-Origin Resource Sharing
+
+Because of the increasing importance of JavaScript clients that use AJAX requests, HAPI servers are strongly encouraged to implement Cross-Origin Resource Sharing [CORS](https://www.w3.org/TR/cors/). This will allow AJAX requests by browser clients from any domain. Enabling CORS is common for servers with only public data, and not implementing CORS limits the types of clients that can interface with a HAPI server.
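Enabling CORS on the server side amounts to attaching a few response headers. As an illustrative sketch (not part of the specification; the handler and its canned response are hypothetical), a minimal Python server that attaches the needed headers to every GET response:

```python
# Minimal illustrative server showing where the CORS headers go.
# This is a sketch, not a HAPI implementation; the response body is a
# placeholder HAPI status object.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class CorsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"HAPI": "3.3",
                           "status": {"code": 1200, "message": "OK"}}).encode()
        self.send_response(200)
        # CORS headers that allow browser (AJAX) clients from any domain:
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "GET")
        self.send_header("Access-Control-Allow-Headers", "Content-Type")
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence request logging in this sketch
```

A real HAPI server would of course route the individual `/hapi/...` endpoints and stream data; only the header-setting lines are the point here.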
For testing purposes, the following headers have been sufficient for browser clients to HAPI servers:
+
+```
+Access-Control-Allow-Origin: *
+Access-Control-Allow-Methods: GET
+Access-Control-Allow-Headers: Content-Type
+```
+
+## 5.2 Security Notes
+
+When the server sees a request parameter that it does not recognize, it should respond with an error.
+
+So, given this query
+
+```
+http://server/hapi/data?dataset=DATA&start=T1&stop=T2&fields=mag_GSE&avg=5s
+```
+
+the server should respond with a HAPI status of `1400 - Bad request - user input error` and an HTTP status of `400`. The server could optionally be more specific and respond with `1401 - Bad request - unknown API parameter name`, also with an HTTP status of `400`.
+
+Following general security best practices, HAPI servers should screen incoming request parameter names and values. Unknown request parameters and values, including incorrectly formatted time values, should **not** be echoed in the error response.
+
+## 5.3 HEAD Requests and Efficiency
+
+Although HEAD requests are allowed (and required by the HTTP specification), the HAPI specification does not define
+any additional or new aspects to the response of a HEAD request. Note that many server frameworks will respond to
+a HEAD request by making a GET request and then omitting the body in the response since this is a simple way to
+guarantee that the meta-information in the HEAD request is the same as that in the GET request (as is required by
+the HTTP specification). As a result, HAPI server developers may want to modify this default behavior to prevent
+unnecessary processing for HEAD requests.
+
+# 6 References
+
+1. ISO 8601:2019 Date Time Format Standard, https://www.iso.org/standard/70908.html
+2. CSV format, https://tools.ietf.org/html/rfc4180
+3. JSON Format, https://tools.ietf.org/html/rfc7159; http://json-schema.org/
+4. IEEE Standard for Floating-Point Arithmetic, http://doi.org/10.1109/IEEESTD.2008.4610935
+5. 
Understanding JSON Schema - Structuring a Complex Schema, https://json-schema.org/understanding-json-schema/structuring.html + +# 7 Contact + +* Jon Vandegriff (jon.vandegriff\@jhuapl.edu) +* Robert Weigel (rweigel\@gmu.edu) +* Jeremy Faden (faden\@cottagesystems.com) +* Sandy Antunes (sandy.antunes\@jhuapl.edu) +* Doug Lindholm (doug.lindholm@lasp.colorado.edu) +* Robert Candey (Robert.M.Candey\@nasa.gov) +* Bernard Harris (bernard.t.harris\@nasa.gov) + +# 8 Appendix + +## 8.1 Sample Landing Page + +See https://github.com/hapi-server/server-ui + +## 8.2 Allowed Characters in `id`, `dataset`, and `parameters` + +HAPI allows the use of UTF-8 encoded Unicode characters for `id`, `dataset`, and `parameters`. (`id` is used in the [`/catalog`](#35-catalog) request and `dataset` and `parameter` are used in [`/info`](#36-info) and [`/data`](#37-data) requests.) + +**Not allowed** + +* Comma (ASCII decimal code 44) + +**Recommended** + +1. strings that match the regular expression `[_a-zA-Z][_a-zA-Z0-9]{0,30}` so that URL encoding is not required and names can be mapped directly to a variable name in most programming languages and a file name on modern operating systems; +2. strings with any of `a-z`, `A-Z`, `-`, `.`, `\_`, and `~` so that URL encoding is not required; and +3. short strings - the number of bytes required to write a comma-separated list of all parameters in a dataset and all other parts of a request URL (i.e., `http://.../hapi/data?dataset=...`) should be less than 2048 bytes, which is a limitation on a URL length for most web browsers. 
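The first recommendation is easy to enforce with the regular expression given above. A minimal sketch in Python (the function name is hypothetical, not part of the specification):

```python
import re

# The recommended form for ids and parameter names: a letter or underscore
# followed by up to 30 letters, digits, or underscores, so URL encoding is
# never needed and the name maps to a variable name in most languages.
RECOMMENDED_ID = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]{0,30}")

def follows_recommendation(name):
    """Return True if `name` matches the recommended id/parameter pattern."""
    return RECOMMENDED_ID.fullmatch(name) is not None
```

A server might apply such a check when building its catalog, and a client might use it to decide whether a name needs URL encoding.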
+ +**Allowed** + +* Any non-control Unicode characters (but we recommend against using Unicode characters that look like a comma) + + +## 8.3 JSON Object of Status Codes + +```javascript +{ + "1200": {"status":{"code": 1200, "message": "OK"}}, + "1201": {"status":{"code": 1201, "message": "OK - no data for time range"}}, + "1400": {"status":{"code": 1400, "message": "Bad request - user input error"}}, + "1401": {"status":{"code": 1401, "message": "Bad request - unknown API parameter name"}}, + "1402": {"status":{"code": 1402, "message": "Bad request - syntax error in start time"}}, + "1403": {"status":{"code": 1403, "message": "Bad request - syntax error in stop time"}}, + "1404": {"status":{"code": 1404, "message": "Bad request - start equal to or after stop"}}, + "1405": {"status":{"code": 1405, "message": "Bad request - start < startDate and/or stop > stopDate"}}, + "1406": {"status":{"code": 1406, "message": "Bad request - unknown dataset id"}}, + "1407": {"status":{"code": 1407, "message": "Bad request - unknown dataset parameter"}}, + "1408": {"status":{"code": 1408, "message": "Bad request - too much time or data requested"}}, + "1409": {"status":{"code": 1409, "message": "Bad request - unsupported output format"}}, + "1410": {"status":{"code": 1410, "message": "Bad request - unsupported include value"}}, + "1411": {"status":{"code": 1411, "message": "Bad request - out-of-order or duplicate parameters"}}, + "1412": {"status":{"code": 1412, "message": "Bad request - unsupported resolve_references value"}}, + "1413": {"status":{"code": 1413, "message": "Bad request - unsupported depth value"}}, + "1500": {"status":{"code": 1500, "message": "Internal server error"}}, + "1501": {"status":{"code": 1501, "message": "Internal server error - upstream request error"}} +} +``` + + +## 8.4 Examples + +The following two examples illustrate two different ways to represent a magnetic field dataset. 
The first lists a time column and three scalar data columns, Bx, By, and Bz for the Cartesian components.
+
+```javascript
+{
+  "HAPI": "3.3",
+  "status": { "code": 1200, "message": "OK"},
+  "startDate": "2016-01-01T00:00:00.000Z",
+  "stopDate": "2016-01-31T24:00:00.000Z",
+  "parameters": [
+    {"name" : "timestamp", "type": "isotime", "units": "UTC", "fill": null, "length": 24},
+    {"name" : "bx", "type": "double", "units": "nT", "fill": "-1e31"},
+    {"name" : "by", "type": "double", "units": "nT", "fill": "-1e31"},
+    {"name" : "bz", "type": "double", "units": "nT", "fill": "-1e31"}
+  ]
+}
+```
+
+This example shows a header for the same conceptual data (time and three magnetic field components), but with the three components grouped into a one-dimensional array of size 3.
+
+```javascript
+{
+  "HAPI": "3.3",
+  "status": { "code": 1200, "message": "OK"},
+  "startDate": "2016-01-01T00:00:00.000Z",
+  "stopDate": "2016-01-31T24:00:00.000Z",
+  "parameters": [
+    { "name" : "timestamp", "type": "isotime", "units": "UTC", "fill": null, "length": 24 },
+    { "name" : "b_field", "type": "double", "units": "nT", "fill": "-1e31", "size": [3] }
+  ]
+}
+```
+
+These two different representations affect how a subset of parameters can be requested from a server. The first example, by listing Bx, By, and Bz as separate parameters, allows clients to request individual components:
+
+```
+http://server/hapi/data?dataset=MY_MAG_DATA&start=2001Z&stop=2010Z&parameters=Bx
+```
+
+This request would just return a time column (always included as the first column) and a Bx column. But in the second example, the components are all inside a single parameter named `b_field`, and so a request for this parameter must always return all the components of the parameter. There is no way to request individual elements of an array parameter.
+
+The following example shows a proton energy spectrum and illustrates the use of the ‘bins’ element.
Note also that the uncertainty of the values associated with the proton spectrum is a separate variable. There is currently no way in the HAPI spec to explicitly link a variable to its uncertainties.
+
+```javascript
+{"HAPI": "3.3",
+ "status": { "code": 1200, "message": "OK"},
+ "startDate": "2016-01-01T00:00:00.000Z",
+ "stopDate": "2016-01-31T24:00:00.000Z",
+ "parameters": [
+   { "name": "Time",
+     "type": "isotime",
+     "units": "UTC",
+     "fill": null,
+     "length": 24
+   },
+   { "name": "qual_flag",
+     "type": "int",
+     "units": null,
+     "fill": null
+   },
+   { "name": "maglat",
+     "type": "double",
+     "units": "degrees",
+     "fill": null,
+     "description": "magnetic latitude"
+   },
+   { "name": "MLT",
+     "type": "string",
+     "length": 5,
+     "units": "hours:minutes",
+     "fill": "??:??",
+     "description": "magnetic local time in HH:MM"
+   },
+   { "name": "proton_spectrum",
+     "type": "double",
+     "size": [3],
+     "units": "particles/(sec ster cm^2 keV)",
+     "fill": "-1e31",
+     "bins": [ {
+        "name": "energy",
+        "units": "keV",
+        "centers": [ 15, 25, 35 ]
+     } ]
+   },
+   { "name": "proton_spectrum_uncerts",
+     "type": "double",
+     "size": [3],
+     "units": "particles/(sec ster cm^2 keV)",
+     "fill": "-1e31",
+     "bins": [ {
+        "name": "energy",
+        "units": "keV",
+        "centers": [ 15, 25, 35 ]
+     } ]
+   }
+ ]
+}
+```
+
+This example shows how "ranges" can specify the bins:
+
+```javascript
+{
+  "HAPI": "3.3",
+  "status": { "code": 1200, "message": "OK"},
+  "startDate": "2016-01-01T00:00:00.000Z",
+  "stopDate": "2016-01-31T24:00:00.000Z",
+  "parameters": [
+    {
+      "length": 24,
+      "name": "Time",
+      "type": "isotime",
+      "fill": null,
+      "units": "UTC"
+    },
+    {
+      "bins": [{
+        "ranges": [
+          [ 0, 30 ],
+          [ 30, 60 ],
+          [ 60, 90 ],
+          [ 90, 120 ],
+          [ 120, 150 ],
+          [ 150, 180 ]
+        ],
+        "units": "degrees"
+      }],
+      "fill": "-1e31",
+      "name": "pitchAngleSpectrum",
+      "size": [6],
+      "type": "double",
+      "units": "particles/sec/cm^2/ster/keV"
+    }
+  ]
+}
+```
+
+## 8.5 Robot clients
+
+Processes that introduce traffic to servers which does
not represent immediate use for science purposes should be
+identifiable by the servers. Examples include an indexing service that queries
+a server for metadata or a process that verifies that a server is responsive. Robot clients making such requests should set the `User-Agent` header. This header should include both an `id` for the bot and a URL that describes it. For example, suppose `hapibot-a` pings all HAPI servers each hour to check that each
+is responsive; it would set the `User-Agent` header to
+
+```
+hapibot-a/1.0; https://github.com/hapi-server/data-specification/wiki/hapi-bots.md#hapibot-a
+```
+
+Note that the use of the [wiki page](https://github.com/hapi-server/data-specification/wiki/hapi-bots.md) to describe bots is encouraged.
+
+## 8.6 FAIR
+
+For each of the elements of FAIR listed below (copied from https://www.go-fair.org/fair-principles/), we describe their relationship with the HAPI specification.
+
+Some aspects of HAPI, which is an API and metadata standard, directly address FAIR; however, some FAIR principles must be addressed by an external service or the data provider. As such, HAPI supports FAIR principles to the extent that the principles are within its scope.
+
+### 8.6.1 Findable
+
+_The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services, so this is an essential component of the FAIRification process._
+
+1\. _(Meta)data are assigned a globally unique and persistent identifier_
+
+The `resourceID` attribute can be used for a globally unique and persistent identifier.
+
+Alternatively,
+* If each HAPI dataset does not have a globally unique and persistent identifier, but the server does, a data provider can use the `resourceID` in the `/about` response (but data providers are encouraged to have dataset-level `resourceID`s).
+* If a dataset is associated with more than one identifier, a data provider can create a dataset of identifiers and serve this dataset as a time series. + +2\. _Data are described with rich metadata (defined by Reusable, item 1. below)_ + +Reusable, item 1: _(Meta)data are richly described with a plurality of accurate and relevant attributes_ + +The HAPI metadata specification has accurate and relevant attributes; the data provider needs to ensure the attribute values accurately describe the data and include information required for interpretation. + +3\. _Metadata clearly and explicitly include the identifier of the data they describe_ + +The HAPI metadata specification requires an identifier (`id`) for every dataset. The list of all `id`s is returned in a `catalog/` request. `id` is also used in the `dataset` request parameter in the URL for an `info/` or `data/` request. (The HAPI `/info` response does not contain the `id` because we have generally avoided the duplication of metadata in responses from different endpoints.) + +4\. _(Meta)data are registered or indexed in a searchable resource_ + +This is outside the scope of the HAPI specification. However, there is a way to explore all known HAPI servers at https://hapi-server.org/servers/. We also work with other projects that address registration, indexing, and searching. + +### 8.6.2 Accessible + +Once the user finds the required data, they need to know how they can be accessed, possibly including authentication and authorization. + +1\. _(Meta)data are retrievable by their identifier using a standardized communications protocol_ + +All HAPI endpoints use the HTTP protocol; HAPI metadata is JSON. The `info/` endpoint takes the dataset `id` and returns JSON metadata. The `data/` endpoint also takes the `id` and returns data in CSV, JSON, or binary. + +2\. _The protocol is open, free, and universally implementable_ + +HAPI delivers JSON metadata and well-structured data over HTTP(S). 
The schema for the data and metadata is free, open, and can be implemented in any programming language. + +3\. _The protocol allows for an authentication and authorization procedure, where necessary_ + +This is out of scope for HAPI, which was designed to access open data. The HAPI specification explicitly does not include an option for authentication. Access restrictions can still be implemented using authentication mechanisms outside or independent of HAPI. + +4\. _Metadata are accessible, even when the data are no longer available_ + +This is outside the scope of the HAPI specification. However, an [affiliated HAPI project](https://github.com/hapi-server/servers) caches metadata from all known HAPI servers nightly. + +### 8.6.3 Interoperable + +_The data usually needs to be integrated with other data. In addition, the data needs to interoperate with applications or workflows for analysis, storage, and processing._ + +1\. _(Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation._ + +HAPI metadata are in JSON, and JSON schemas are available for validation. HAPI data is transmitted as JSON or Comma Separated Values (CSV), both widely used. (HAPI servers may use a simple binary format, which uses IEEE standards for binary numbers, and the layout mimics the CSV output.) + +2\. _(Meta)data use vocabularies that follow FAIR principles_ + +HAPI metadata does not use vocabularies directly, but links can be made to external metadata that uses vocabularies (see next item). + +3\. _(Meta)data include qualified references to other (meta)data_ + +Other metadata can be referenced using `additionalMetadata`. + +### 8.6.4 Reusable + +_The ultimate goal of FAIR is to optimize the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings._ + +1\. 
_(Meta)data are richly described with a plurality of accurate and relevant attributes_
+
+This is satisfied by the HAPI specification.
+
+2\. _(Meta)data are released with a clear and accessible data usage license_
+
+This can be satisfied by using the `license` attribute.
+
+3\. _(Meta)data are associated with detailed provenance_
+
+This can be satisfied using the `provenance` attribute.
+
+4\. _(Meta)data meet domain-relevant community standards_
+
+HAPI is built using the widely adopted RESTful approach to web-accessible resources. We use JSON in a way that is common in the community. The time standard is a subset of the ISO 8601 standard for time strings. The design of the HAPI API for requesting and receiving data followed from an analysis of the APIs of many time series data providers, and HAPI is a standard that provides a common set of features.
diff --git a/hapi-3.3.1/HAPI-data-access-spec-3.3.1.pdf b/hapi-3.3.1/HAPI-data-access-spec-3.3.1.pdf
new file mode 100644
index 0000000..62acf6c
Binary files /dev/null and b/hapi-3.3.1/HAPI-data-access-spec-3.3.1.pdf differ
diff --git a/hapi-3.3.1/changelog.md b/hapi-3.3.1/changelog.md
new file mode 100644
index 0000000..02b125c
--- /dev/null
+++ b/hapi-3.3.1/changelog.md
@@ -0,0 +1,218 @@
+Changelog for the HAPI Specification
+==================
+
+For each version, there may be up to three types of changes to the specification:
+* API changes - new or different features to the request interface
+* Response Format Changes - new or different items in the server responses
+* Clarifications - better description or handling of corner cases, etc.
+
+Some API and Response Format changes may be non-backward compatible, but this is so far only true in Version 3.0.
+
+# Version 3.3.1
+
+Fix [error in `location` example](https://github.com/hapi-server/data-specification/issues/276).
+
+# Version 3.3
+
+Note 3.3 is fully backward compatible with 3.2.
+ +## API Changes + +* add new request parameter option `resolve_references=false` to `catalog` and `info` endpoints to tell the server not to perform JSON reference substitution ([#220](https://github.com/hapi-server/data-specification/pull/220)) + +## Response Format Changes + +* added optional `location` and `geoLocation` attributes to `/info` response for indicating where measurements were made ([#238](https://github.com/hapi-server/data-specification/pull/238)) +* added altitude quantity to list of valid vector component types since this is common for geo-location values ([#233](https://github.com/hapi-server/data-specification/pull/233)) +* now have distinct `serverCitation` in `about` endpoint and `datasetCitation` in `info` endpoint, and plain `citation` is deprecated ([#235](https://github.com/hapi-server/data-specification/pull/235)) +* added `resourceID` in `/about` and `provenance` and `licenseURL` in `/info` so that HAPI can describe FAIR data and added a section on how FAIR principles map to HAPI ([#224](https://github.com/hapi-server/data-specification/pull/224)). Also added appendix section describing relationship between HAPI and FAIR. 
+* added optional `warning` and `note` attributes to `about` and `info` endpoints ([#223](https://github.com/hapi-server/data-specification/pull/223))
+* added new units schema for VOUnits to enumerated list of allowed schemas ([#260](https://github.com/hapi-server/data-specification/pull/260))
+
+## Clarifications
+
+* clarified how an extra trailing slash should be handled ([#248](https://github.com/hapi-server/data-specification/issues/248))
+* clarified how to name a custom (i.e., non-standard) endpoint ([#245](https://github.com/hapi-server/data-specification/pull/245))
+* clarified how scalar parameters can also contain vector components ([#244](https://github.com/hapi-server/data-specification/pull/244))
+* clarified requirements for which of bin centers and ranges need to be present ([#237](https://github.com/hapi-server/data-specification/pull/237))
+* clarified that HAPI is RESTful rather than strictly based on the original REST concept ([#236](https://github.com/hapi-server/data-specification/pull/236))
+* clarified that any non-standard data format in the `outputFormats` of the `capabilities` endpoint needs to begin with `x_` ([#222](https://github.com/hapi-server/data-specification/pull/222))
+* clarified the difference between `title` (a short label) in the `/catalog` endpoint versus the `description` (a few lines of details) in the `/info` endpoint ([#221](https://github.com/hapi-server/data-specification/pull/221))
+* clarified expectations for `id` and `title` in the `about` endpoint (acronyms are OK in `id`, but expand them in `title`; don't include the word HAPI) ([#219](https://github.com/hapi-server/data-specification/pull/219))
+* fixed some typos and inconsistencies ([#241](https://github.com/hapi-server/data-specification/pull/241))
+* rearranged `info` section for clarity ([#247](https://github.com/hapi-server/data-specification/pull/247))
+
+# Version 3.2
+
+## API Changes
+
+Version 3.2 is backward compatible with 3.1.
+
+There is a new optional way to query the `catalog` endpoint, which now takes a request parameter
+called `depth` to indicate how much detail the catalog response should include. The catalog
+can now include all the elements in each dataset's `info` response.
+The `capabilities` endpoint advertises if this functionality is supported. ([#164](https://github.com/hapi-server/data-specification/pull/164))
+
+The spec document now suggests a way for HAPI clients to identify themselves as bots or
+non-human users to a HAPI server. Doing this can help server administrators and developers log actual
+science usage, as opposed to web scraping or mirroring activity. This is not a change in the spec.
+([#174](https://github.com/hapi-server/data-specification/pull/174))
+
+## Response Format Changes
+
+There is now an error code for when the `depth` request parameter to the `catalog` endpoint is invalid.
+Only specific values are allowed for `depth`. ([#191](https://github.com/hapi-server/data-specification/pull/191))
+
+The `capabilities` response now includes a way to ping the server to test that it is functioning. The way to do
+this is to make a simple data request, and optional new info in the `capabilities` response allows a
+server to identify exactly what data request to use for this kind of ping.
+([#172](https://github.com/hapi-server/data-specification/pull/172))
+
+There have been many requests for HAPI to also serve images. To accommodate this, string parameters
+can now be identified as URIs that point to image files. This also enables HAPI to
+more uniformly offer lists of any kind of file. This file listing capability should be viewed as
+a different kind of service from the usual numeric data serving capability offered by HAPI.
+See 3.6.16 for more details and also [#166](https://github.com/hapi-server/data-specification/pull/166).
+
+## Clarifications
+
+The error message text was made more precise for HAPI error codes related to invalid start and stop times in a data request.
+([#163](https://github.com/hapi-server/data-specification/pull/163))
+
+When describing response formats offered by HAPI, it is now emphasized that the output formats offered by HAPI are
+transport formats meant for streaming data and are not intended to be used as traditional file formats.
+([#159](https://github.com/hapi-server/data-specification/pull/159))
+
+Clarified that a HAPI request with an empty string after `parameters=` is the same as
+not requesting any specific parameters, which defaults to requesting all parameters.
+([#201](https://github.com/hapi-server/data-specification/pull/201))
+
+# Version 3.1
+
+Version 3.1 is backward compatible with 3.0. It adds support for three optional aspects in the `info` response:
+
+The list of changes from 3.0.1 is [here](https://github.com/hapi-server/data-specification/compare/cfed14f74995b39598b43e1976be702f2c8350c4..964f44f8bbe07f5d3fd97fb8adb07ab71debb328)
+
+## Response Format Changes
+
+1. support for vector quantities: parameters that are vector quantities can optionally specify a coordinate system and can identify vector components as such; datasets can optionally specify a coordinate system schema ([#115](https://github.com/hapi-server/data-specification/issues/115))
+1. a dataset may optionally include other types of metadata inside a separate block ([#117](https://github.com/hapi-server/data-specification/issues/117))
+1. each dataset may optionally indicate a maximum time range to request data ([#136](https://github.com/hapi-server/data-specification/issues/136))
+
+
+# Version 3.0.1
+
+## Clarifications
+
+Added a statement that `dataset` and `parameters` may not contain Unicode but that this support will be added in 3.1.
See [GitHub Issue #128](https://github.com/hapi-server/data-specification/issues/128).
+
+
+# Version 3.0
+
+## API Changes
+
+Non-backward compatible changes to the request interface in HAPI 3.0:
+
+* The URL parameter `id` was replaced with `dataset`.
+* `time.min` and `time.max` were replaced with `start` and `stop`, respectively.
+* Addition of a new endpoint, `about`, for server description metadata.
+
+These changes were discussed in issue #77. HAPI 3 servers must accept both the old and these new parameter names, but the HAPI 2 specification requires an error response if the new URL parameter names are used. In a future version, the deprecated older names will no longer be valid.
+
+## Response Format Changes
+* Ability to specify time-varying bins (#83)
+* Ability to use JSON references in info response (#82)
+* Ability to indicate a units schema (if one is being used for units strings) (#81)
+
+
+This URL generates a diff:
+
+https://github.com/hapi-server/data-specification/compare/4702968b13439af684d43416b442c534bf569f6c..4a02df680b76d757a7cbf4a06e55c53b9b91e310
+
+## Clarifications
+
+deprecated the use of all uppercase for the time stamp location options in favor of all lowercase, which is more consistent with the rest of the specification
+
+replaced "hapi-server.org" with just "server" in URLs
+
+clarified what the length attribute means for data parameters
+
+clarified how to specify array sizes when one array dimension size is 1
+
+changed the definition of 'units' to allow for different units in the elements of an array
+
+changed the definition of 'label' to allow for different labels for each element in the array
+
+now allow multi-dimensional data to not have bins in every dimension; any dimensions with null for 'centers' and nothing present for 'ranges' will not be considered to have any binning in that dimension
+
+clarified that reordering parameters in a request has no effect on the order of the parameters in the data returned by the server
(order of the fields returned is always the same regardless of the order in which they were requested)
+
+clarified server responses when the time range has no data (HTTP status of 200 and HAPI status of 1201 and return no data content) or data is all fill (HTTP status of 200 and HAPI status of 1200 and return the fill values)
+
+more clarification and examples for error messages from HAPI servers
+
+fixed multiple typos and made wording improvements
+
+
+
+# Version 2.1.1
+
+These are all small clarifications on relatively infrequent situations.
+
+Pull request for going from 2.1.0 to 2.1.1:
+https://github.com/hapi-server/data-specification/pull/93
+
+This URL shows differences between 2.1.0 and 2.1.1:
+https://github.com/hapi-server/data-specification/compare/b85e1db..8969633
+
+## Clarifications
+
+* updated version number to 2.1 when used in text and in example output
+* clarified how to indicate a dimensionless quantity within an array of units for an array-valued parameter with non-uniform units; see Issue #85
+* clarified the use of scalar and array values for labels and units that describe an array-valued parameter; see Issue #91
+* clarified that `null` is not allowed as a value within a `centers` or `ranges` array in a `bins` description; see Issue #86
+
+In a future release, there will be an occasion to use `null` values for some bin definitions, but only when the bin `centers` and `ranges` are able to be specified as time-varying elements within the data (as opposed to fixed quantities in the `info` metadata). This is expected to be included in version 3.0.
+
+# Version 2.1.0
+
+## Clarifications
+
+1. replaced "hapi-server.org" with just "server" in example URLs
+2. clarified what the length attribute means for data parameters
+3. clarified how to specify array sizes when one array dimension size is 1
+4. more clarification and examples for error messages from HAPI servers
+5. 
fixed multiple typos and made wording improvements + +## Response Format Changes + +1. changed the definition of 'units' to allow for different units in the elements of an array +2. changed the definition of 'label' to allow for different labels for each element in the array +3. deprecated the use of all uppercase for the time stamp location options in favor of all lowercase, which is more consistent with the rest of the specification +4. now allow multi-dimensional data to not have bins in every dimension; any dimensions with null for 'centers' and nothing present for 'ranges' will not be considered to have any binning in that dimension + +## API Changes + +1. add HAPI 1411 error `out of order or duplicate parameters` +2. clarified server responses when the time range has no data (HTTP status of 200 and HAPI status of 1201 and return no data content) or data is all fill (HTTP status of 200 and HAPI status of 1200 and return the fill values) + + +# Version 2.0 + +Difference between 1.1 and 2.0: https://www.diffchecker.com/sBWweoDa + +Summary of changes: +1. time values output by the server now must end with "Z" (not backward compatible) [diff](https://github.com/hapi-server/data-specification/commit/9bcb2f43014a05380425c8e2be24b457da7c5542) +2. the "units" on the bins are required (not backward compatible) [diff](https://github.com/hapi-server/data-specification/commit/30d8c967e252069a256019b71537694cd7fe7f97) +3. incoming time values (in request URL) should have a trailing "Z" to indicate GMT+0, but are interpreted as such even if there is no trailing Z [diff](https://github.com/hapi-server/data-specification/commit/a7a528b455b57a02987fb50b5e8c4890b721e774) +4. 
the length value for strings is now consistently described; all unused characters in the string should be filled with null characters (so no null terminator is needed if the string is exactly the given length) +[diff 1](https://github.com/hapi-server/data-specification/commit/4d8395578c42a7a20fd8cd0895c6cedb5b28abc7) [diff 2](https://github.com/hapi-server/data-specification/commit/32f2b219fdd6e8f7c33546077f9dc95f908b0752) [diff 3](https://github.com/hapi-server/data-specification/commit/c0aa8b9ca2f3c0d573a40019bdaf5c6a1edb1008) [diff4](https://github.com/hapi-server/data-specification/commit/533120bcd3622b58ae0f776f2e19fd854b4d3b54) +5. in CSV output, the use of quotes for string values with commas is clarified [diff](https://github.com/hapi-server/data-specification/commit/a1c444298f4cb0c7da2dfeb25b40879a80d100b0) +6. the subset of ISO8601 that we use is better described [diff](https://github.com/hapi-server/data-specification/commit/a7a528b455b57a02987fb50b5e8c4890b721e774) +7. added `timeStampLocation` [diff](https://github.com/hapi-server/data-specification/commit/9ee24e0e3f9c09243b0762664e13adbf446024b8) [diff](https://github.com/hapi-server/data-specification/commit/2603550251fefee14d4f2192095981c35753a2f3) + +# Version 1.1 + +# Version 1.0 + +Initial version!