Sensor Data Structure (SDS): Specification

1. Introduction

The growing array of available "sensors" (cameras, wearable devices, mobile phones) that measure human behavior provides a wide variety of data types in different formats. Managing video, audio, and related files and metadata can involve significant effort. Most studies organize their data in their own specific way, which makes the data harder for others to understand, while curated and well-structured datasets can improve the quality of research. The Sensor Data Structure (SDS) offers a simple and easy-to-adopt way of organizing these data, to ease collaboration across researchers, facilitate the development of new software tools, and support replicable science.

SDS is designed to closely follow the Brain Imaging Data Structure (BIDS), which addresses similar organizational problems with neuroimaging data. SDS is intended as an extension of BIDS to sensor data: it aligns sensor data and metadata organization with BIDS, enabling more efficient collaboration and the use of existing and future BIDS-compliant tools, including the BIDS Validator, for sensor data.

2. Common Principles and Definitions

This specification extends the Brain Imaging Data Structure (BIDS) specification to sensor data, such as video, audio, and wearables. Many of the terms below are inherited directly from BIDS, but are listed again here to clarify their relation to the “sensor data” that is the focus of this extension.

To illustrate the SDS terminology, we describe each term in reference to an example dataset from a fictitious study. In this study, a research participant and an interviewer (the examiner) completed the Biographical Conversation Task (BCT) – a conversational interaction task in which the interviewer asks three questions about positive topics (birthdays, activities, etc.), and three questions about negative topics (fearful events, disappointment, etc.). During the interaction, video data are recorded using separate cameras pointed at the participant and the interviewer (who are seated facing one another). Heart rate data are also collected from the participant (but not the examiner) using a wireless ECG device. To make the example more interesting, we will pretend that the ECG device accidentally stopped recording when the task was first administered; the whole task was therefore repeated immediately afterwards. This recording session yields 12 different files, which are described below in relation to each data field within the file name. The files (and the folders containing them) are named:

sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-pos_dev-gopro_tgt-part_run-01.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-pos_dev-gopro_tgt-exam_run-01.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-pos_dev-bh_tgt-part_run-01.edf sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-neg_dev-gopro_tgt-part_run-01.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-neg_dev-gopro_tgt-exam_run-01.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-neg_dev-bh_tgt-part_run-01.edf sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-pos_dev-gopro_tgt-part_run-02.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-pos_dev-gopro_tgt-exam_run-02.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-pos_dev-bh_tgt-part_run-02.edf sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-neg_dev-gopro_tgt-part_run-02.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-neg_dev-gopro_tgt-exam_run-02.mp4 sub-mystudy001/ses-01/sds/sub-mystudy001_ses-01_task-bct_cnd-neg_dev-bh_tgt-part_run-02.edf
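
The example file names are the combination of two runs, two conditions, and three recording streams (participant camera, examiner camera, participant ECG). The following is a minimal sketch of how such names could be generated; the function name build_example_paths and its parameters are hypothetical and are not part of any SDS tooling.

```python
from itertools import product


def build_example_paths(subject="mystudy001", session="01", task="bct"):
    """Return the example file paths for the fictitious BCT study."""
    # (device, target, extension) for each recording stream in the example.
    streams = [
        ("gopro", "part", "mp4"),  # camera pointed at the participant
        ("gopro", "exam", "mp4"),  # camera pointed at the examiner
        ("bh", "part", "edf"),     # ECG device worn by the participant
    ]
    folder = f"sub-{subject}/ses-{session}/sds"
    paths = []
    # Runs vary slowest, then conditions, then recording streams.
    for run, cnd, (dev, tgt, ext) in product(("01", "02"), ("pos", "neg"), streams):
        name = (
            f"sub-{subject}_ses-{session}_task-{task}_cnd-{cnd}"
            f"_dev-{dev}_tgt-{tgt}_run-{run}.{ext}"
        )
        paths.append(f"{folder}/{name}")
    return paths


paths = build_example_paths()
print(len(paths))
for path in paths:
    print(path)
```

Generating the names from the design of the study, rather than typing them by hand, makes it easy to check that no stream, condition, or run is missing.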

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

The following terminology is used:

Dataset – A collection of video, audio, depth, and wearable (ECG, accelerometry) data from one or more subjects and sessions.
Modality – The category of data recorded in a specific data file. The standard denotes a series of modality types based on basic terms - "image", "video", "audio", "depth" (for depth cameras), "ecg", etc. The term "mix" is used for files that combine multiple modalities (for example, wearable devices that capture heart rate as well as accelerometry); however, the "video" modality is allowed to have an audio channel (as is commonplace with most devices).
Data Type – A functional group of different types of data. This term is designed to be compatible with the BIDS standard, which has multiple data types; for simplicity, we classify all SDS-compliant data as falling under a single type, called sds. In raw and derivative folders (following the BIDS standard), the data type directory is nested inside subject and (optionally) session directories.
Subject - A person (or animal) participating in the study. Used interchangeably with the term Participant. In the example dataset, we assign the subject ID mystudy001.
Session - A logical grouping of sensor and behavioral data consistent across subjects. Usually this corresponds to the data collected during a single laboratory visit (or remote data collection session), though the term is designed to be used flexibly. In our example above, we have only one session, referred to as 01.
Acquisition - A continuous period of time during which data are recorded. This is somewhat different from imaging, where acquisition typically refers to a continuous scan series in which only one type of data is collected and only one task is administered. During sensor data recording, a single acquisition will often include multiple distinct tasks. Perhaps the most common example of this is a study in which a camera is left on during the administration of several tasks. As noted below, this is an Optional term; in the context of sensor data, it is typically most useful in instances where data from a single task are split across multiple files (acquisitions). In the example files described above, we chose not to specify the acquisition.
Task - A set of structured activities performed by the participant. Note that in the context of brain scanning, a task is almost always tied to one data acquisition, such that even if during one acquisition the subject performed multiple conceptually different behaviors (with different sets of instructions) they will be considered one (combined) "task". However, in the context of sensor data collection, multiple independent tasks may be collected in one acquisition. In our example files above, we refer to the Biographical Conversation Task with the abbreviation bct.
Condition - This field refers to a specific portion of a task. We include this to allow for additional specificity when referring to the contents of SDS-compliant files (note that this field is not specified in BIDS, and is therefore new here). In our example above, we have two conditions - positive questions and negative questions, abbreviated as pos and neg, respectively.
Run - A counter that tracks multiple instances of a particular task or acquisition within the same subject (and session). This is primarily designed to reflect a task that is administered repeatedly (i.e., repeated measures), and can also reflect a re-administration of a task (for example, due to some kind of technical or administration failure). In our example files above, we use run-01 for the first administration of the task (which was ended prematurely due to ECG problems), and run-02 for the second administration.
Device - The type of device used to record the file. Unlike neuroimaging data, where device information is typically embedded in the file, information about the recording device may not always be stored with the data themselves - hence the need for a field to describe it. In the above example we use gopro to represent the video cameras, and bh (short for the Zephyr Bioharness) as the ECG recording device.
Target - The person that the sensor data correspond to. In digital phenotyping, this may be the same as the subject, or may be a different person, for example an examiner or confederate. In the example above, we have videos with tgt-part for the participant, and tgt-exam for the examiner.
Channel - This field is used when a single device provides more than one type of data, and the different data types are saved to separate files. An example of this may be a wearable device that provides both ECG and accelerometry data, or the Xbox Kinect, which records traditional video as well as 3D depth map data. Note that this is different from Modality above: Modality alone does not indicate that several files were recorded simultaneously by the same device, whereas Channel identifies the individual streams of one device (for example, camera 1 and camera 2 for treecam, or the depth and video streams of a single camera). ?? example values under discussion: sensortreeversion1, camerachannel2
Notes - An unstructured open text field where users can place any information they choose. An example of this might be the name of the original data file from which a given file was derived. ?? Open question: this field is proposed for removal from the file name, with the information stored in the JSON sidecar instead.
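
Because each field in a file name is written as a key and a value joined by a hyphen, with fields separated by underscores, the example file names can be read back into their fields programmatically. The following is a minimal sketch under that assumption; parse_sds_name is a hypothetical helper, not part of any existing SDS or BIDS tool.

```python
from pathlib import PurePosixPath


def parse_sds_name(path):
    """Split an SDS-style file name into its key-value fields and extension."""
    name = PurePosixPath(path).name
    # Everything after the first dot is treated as the file extension.
    stem, _, extension = name.partition(".")
    fields = {}
    for part in stem.split("_"):
        key, sep, value = part.partition("-")
        if sep:  # only "key-value" parts are stored as fields
            fields[key] = value
    return fields, extension


fields, extension = parse_sds_name(
    "sub-mystudy001/ses-01/sds/"
    "sub-mystudy001_ses-01_task-bct_cnd-neg_dev-bh_tgt-part_run-02.edf"
)
print(fields)
print(extension)
```

For this file, the parsed fields identify the subject (mystudy001), session (01), task (bct), condition (neg), device (bh), target (part), and run (02).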

3. Methodology

3.1 Brain Imaging Data Structure

The Brain Imaging Data Structure (BIDS) is the guiding framework for the Sensor Data Structure (SDS); in fact, SDS is designed to fit "within" the BIDS framework and directory structure. In the introduction to the BIDS standard, the authors explain that "so far there is no consensus how to organize and share data obtained in neuroimaging experiments". The same can be said about the growing number of sensors that can be used to quantify and understand behavior - i.e., what is often referred to as "digital phenotyping". Because of the tremendous resources that have been put into the development of the BIDS standard over the past decade, and the corresponding maturation of that standard, it represents an ideal framework for the development of a standard governing sensor data. Wherever possible, SDS adopts concepts, fields, and file structures from BIDS, with modifications as necessary to better fit the SDS standard to the types of data it encompasses.

3.2 Sensor Data Structure

SDS extends the BIDS standard by 1) integrating a broader array of data types (video, audio, wearables, etc.), 2) accommodating new metadata associated with these data types (for example, video recording parameters), and 3) integrating additional information regarding who is being recorded. Regarding the latter point: the MRI use cases for BIDS normally assume that the main study participant is the one who is providing data. However, in many digital phenotyping applications, data are also collected on additional individuals associated with the participant (such as conversational partners). Furthermore, digital phenotyping data sometimes involve the measurement of mutual information between individuals that cannot be attributed solely to any one individual (for example, data on synchronized gestures and language between two or more people). SDS is designed to accommodate these use cases.

3.2.1 Filesystem and File Name Structure

SDS follows the BIDS filesystem structure and the BIDS file name structure as closely as possible. Of particular note: consistent with BIDS file names, the file extension denotes the file modality where this is possible (i.e., for modalities that are only saved with one file extension); where this is not possible, the characters after the final underscore in the file name denote the modality. For example, if saving an audio-only file using the Matroska file format (which allows for the storing of many modalities), one would end the file name with _aud.mkv.
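
The rule above (extension first, then the characters after the final underscore) can be sketched as a small lookup. This is an illustration only: the two mapping tables are hypothetical, since the specification does not yet list extensions or suffix codes, and only the aud suffix and the .mkv example come from the text.

```python
from pathlib import PurePosixPath

# Hypothetical: extensions assumed here to imply exactly one modality.
EXTENSION_MODALITY = {".mp4": "video", ".wav": "audio"}

# Suffix codes; only "aud" appears in the text, "vid" is a hypothetical addition.
SUFFIX_MODALITY = {"aud": "audio", "vid": "video"}


def infer_modality(path):
    """Infer the modality from the extension, else from the trailing suffix."""
    p = PurePosixPath(path)
    extension = p.suffix.lower()
    if extension in EXTENSION_MODALITY:
        return EXTENSION_MODALITY[extension]
    # Fall back to the characters after the final underscore.
    last = p.stem.rsplit("_", 1)[-1]
    # A key-value tag such as "run-01" is not a modality suffix.
    if "-" not in last and last in SUFFIX_MODALITY:
        return SUFFIX_MODALITY[last]
    return None  # modality cannot be determined from the name alone


print(infer_modality("sub-01_task-bct_dev-mic_aud.mkv"))
print(infer_modality("sub-01_task-bct_dev-gopro_run-01.mp4"))
print(infer_modality("sub-01_task-bct_dev-cam_run-01.mkv"))
```

Returning None for an ambiguous container format without a suffix leaves the decision to the caller, for example to report the file as non-compliant.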

?? need to fix figure so that you have raw/sub-????/ses-????/sds

3.2.2 File Tags

SDS uses two existing file tags exactly as they are defined in BIDS: subject and session. SDS uses four existing file tags with minor extensions and additional clarification for sensor data: data type, task, acquisition, and run. SDS also introduces five new file tags specific to sensor data: condition, device, target, channel, and notes.

File Tag	Abbreviation	Level	Description
Subject	sub	Required	Same as BIDS
Session	ses	Recommended	Same as BIDS
Acquisition	acq	Optional	Same as BIDS, with clarifications for sensor data (see Section 2)
Task	task	Required	Same as BIDS, with clarifications for sensor data (see Section 2)
Condition	cnd	Optional	Specific portion of a task (e.g., "pos", "neg"); new in SDS, not defined in BIDS
Run	run	Optional	Same as BIDS, with clarifications for sensor data (see Section 2)
Device	dev	Required	Description of actual device used in data collection (e.g., "gopro")
Target	tgt	Optional	Person (or animal) from whom the data were collected (e.g., "participant", "confederate")
Channel	chn	Optional	Type of data from a device that saves multiple types to separate files for a single recording acquisition
Notes	note	Optional	Free text of any kind
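
The levels in the table lend themselves to a simple check of a file name. The following is a minimal sketch, not the SDS validator itself: check_tags is a hypothetical function, and treating a missing Required tag as an error and a missing Recommended tag as a warning is an assumption about how the levels would be enforced.

```python
# Tag abbreviations and levels, copied from the table above.
TAG_LEVELS = {
    "sub": "Required",
    "ses": "Recommended",
    "acq": "Optional",
    "task": "Required",
    "cnd": "Optional",
    "run": "Optional",
    "dev": "Required",
    "tgt": "Optional",
    "chn": "Optional",
    "note": "Optional",
}


def check_tags(filename):
    """Return (errors, warnings, unknown_tags) for one SDS-style file name."""
    # Drop any directory part and everything after the first dot.
    stem = filename.rsplit("/", 1)[-1].split(".", 1)[0]
    present = {part.split("-", 1)[0] for part in stem.split("_") if "-" in part}
    errors = [
        f"missing required tag: {tag}"
        for tag, level in TAG_LEVELS.items()
        if level == "Required" and tag not in present
    ]
    warnings = [
        f"missing recommended tag: {tag}"
        for tag, level in TAG_LEVELS.items()
        if level == "Recommended" and tag not in present
    ]
    unknown = sorted(present - TAG_LEVELS.keys())
    return errors, warnings, unknown


print(check_tags("sub-mystudy001_ses-01_task-bct_cnd-pos_dev-gopro_tgt-part_run-01.mp4"))
print(check_tags("sub-mystudy001_task-bct.mp4"))
```

The first call reports no problems; the second reports the missing dev tag as an error and the missing ses tag as a warning.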

3.2.3 File types

??To be added- describe the likely video, audio, and other formats, and likely file extensions (insert a chart). Describe multiple files from different angles of the same task/acq.

3.2.4 Device Types

3.2.5 Target: who is being recorded

?? describe the target(s), the need for this in sensor/behavioral. Make sure to check the beh portion of BIDS spec to align. Introduce sub (subject), exam (examiner), conf (confederate), clin (clinician), par (parent), sib (sibling), etc (make a chart). NOTE: we currently use ‘par’ for participant, this could get confusing!! We need to switch to ‘sub’, and then have ‘par’ available for parent-child interactions.

Include a graphic showing files and noting which are from what target in a given subject directory, to illustrate.

3.2.6 Sidecar files

??describe the json, yaml sidecars. Follow BIDS spec, one file per associated file, or file in parent folder

4. Software and Data

Software for assembling sensor data from source to raw, validating that sensor directories meet the SDS format, and curating datasets from SDS format is open source and available on GitHub (insert link to the public repo that the NOSI software development team creates with the assembler, validator, and curator). The software is written in [Python 3 or insert other language]. Figure X shows the flow in which these software tools are used (to be created and inserted).
