Skip to content

The duration calculation for display in BaseRAW._repr_html_ doesn't account for the lenght of the last sample. #12954

@leorochael

Description

@leorochael

Description of the problem

The calculation of the duration display in BaseRAW._repr_html_ assumes that the length of the acquisition is the timestamp of the last sample which, for a high sampling rate, is close to the actual duration, but ignores the length of the last sample.

Depending on the acquisition length and the sampling frequency, this might result in a weird display.

Steps to reproduce

In a Jupyter Notebook, open an EDF containing data that is sampled at 500 hz and lasts exactly one hour and then display it. E.g:

import mne.io
raw_edf = mne.io.read_raw_edf("path-to-1h.edf", preload=False)
raw_edf

You'll get an output containing:

  • Acquisition
    • Duration: 00:59:60 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Instead of what would be expected:

  • Acquisition
    • Duration: 01:00:00 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Link to data

No response

Expected results

As mentioned, I expect to get:

  • Acquisition
    • Duration: 01:00:00 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Actual results

Instead I get:

  • Acquisition
    • Duration: 00:59:60 (HH:MM:SS)
    • Sampling frequency: 500.00 Hz
    • Time points: 1,800,000

Additional information

The code that calculates the duration is in BaseRaw._repr_html_:

        # https://stackoverflow.com/a/10981895
        duration = timedelta(seconds=self.times[-1])
        hours, remainder = divmod(duration.seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        seconds += duration.microseconds / 1e6
        seconds = np.ceil(seconds)  # always take full seconds

        duration = f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"

To fix it, instead of calculating the duration based on the time of the last sample, use the total amount of samples and divide it by the sampling frequency. E.g:

        duration = timedelta(
            seconds=len(self.times) / self.info['sfreq'],
        )
        # https://stackoverflow.com/a/10981895
        hours, remainder = divmod(duration.seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        seconds += duration.microseconds / 1e6
        seconds = np.ceil(seconds)  # always take full seconds

        duration = f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"

By the way, Pandas has a very nice representation for time deltas, if you want to remove the hours/minutes/seconds calculation (though that would mean pandas needs to be installed, even though it's an optional dependency for MNE):

>>> str( pd.Timedelta(3597, 's') )
'0 days 00:59:57'

Here is the mne.sys_info() (not that it makes a difference):

Platform Linux-6.1.85+-x86_64-with-glibc2.35
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
Executable /usr/bin/python3
CPU x86_64 (2 cores)
Memory 12.7 GB

Core
├☑ mne 1.8.0 (latest release)
├☑ numpy 1.26.4 (OpenBLAS 0.3.23.dev with 2 threads)
├☑ scipy 1.14.1
└☑ matplotlib 3.8.0 (backend=module://matplotlib_inline.backend_inline)

Numerical (optional)
├☑ sklearn 1.5.2
├☑ numba 0.60.0
├☑ nibabel 5.3.2
├☑ pandas 2.2.3
├☑ h5py 3.12.1
└☐ unavailable nilearn, dipy, openmeeg, cupy, h5io

Visualization (optional)
├☑ ipywidgets 8.1.5
└☐ unavailable pyvista, pyvistaqt, vtk, qtpy, ipympl, pyqtgraph, mne-qt-browser, trame_client, trame_server, trame_vtk, trame_vuetify

Ecosystem (optional)
└☐ unavailable mne-bids, mne-nirs, mne-features, mne-connectivity, mne-icalabel, mne-bids-pipeline, neo, eeglabio, edfio, mffpy, pybv

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions