@alejoe91
Member

This was previously discussed here: #2828

To sum up, loading and decompressing zarr timestamps for very long recordings can be quite time-consuming, so we want to avoid doing that at init. When fetching the timestamps, though, if they are not a numpy array they are cast to np.ndarray and cached, to avoid re-reading and re-decompressing at every call.

@alejoe91 alejoe91 added the core Changes to core module label Aug 20, 2024
@alejoe91 alejoe91 requested a review from h-mayorquin August 20, 2024 13:25
Collaborator

@h-mayorquin h-mayorquin left a comment

Yes, I think it is better. Zarr is not the format where we want the data to be read when called.

Can you add the typing to get_times() -> np.ndarray? This will make it clear that it returns an in-memory object. We should also add a docstring describing this behavior.

return self.time_vector
else:
return np.array(self.time_vector)
if not isinstance(self.time_vector, np.ndarray):
Collaborator

I think you can just always call np.asarray(), which by default will just pass the data along if it is already an np.ndarray but will create a copy if it is hdf5, zarr or a memmap.
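A minimal sketch of the pass-through behavior described in this comment, using only NumPy. The variable names are illustrative, and a plain list stands in for a non-ndarray array-like; zarr and HDF5 datasets are not exercised here.

```python
import numpy as np

in_memory = np.arange(5, dtype="float64")

# np.asarray hands back the very same object when given a plain ndarray ...
same = np.asarray(in_memory)

# ... whereas np.array makes a copy by default.
copied = np.array(in_memory)

# A non-ndarray array-like (a list here, as a stand-in) becomes a new ndarray.
from_list = np.asarray([0.0, 1.0, 2.0])

print(same is in_memory)                      # no copy was made
print(np.shares_memory(copied, in_memory))    # independent buffer
print(type(from_list).__name__)
```

Note that np.memmap is itself an ndarray subclass, so its behavior under np.asarray may differ from the zarr and HDF5 cases; this sketch does not cover it.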

Member Author

Thanks! Great suggestion!

Member Author

Just to clarify, you suggest doing this?

def get_times(self) -> np.ndarray:
    if self.time_vector is not None:
        self.time_vector = np.asarray(self.time_vector)
        return self.time_vector
    else:
        time_vector = np.arange(self.get_num_samples(), dtype="float64")
        time_vector /= self.sampling_frequency
        if self.t_start is not None:
            time_vector += self.t_start
        return time_vector
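To illustrate the effect of caching the cast, here is a self-contained sketch built around the snippet above. `CountingLazyArray` and `ToySegment` are hypothetical stand-ins invented for this example (they are not SpikeInterface or zarr classes); the counter simply simulates how often an expensive read and decompress would happen.

```python
import numpy as np


class CountingLazyArray:
    """Hypothetical stand-in for a lazily read array (e.g. a zarr dataset)."""

    def __init__(self, data):
        self._data = np.array(data, dtype="float64")
        self.num_reads = 0  # how many times the data was materialized

    def __array__(self, dtype=None, copy=None):
        # Called by np.asarray(); simulates an expensive read + decompress.
        self.num_reads += 1
        out = self._data.copy()
        return out if dtype is None else out.astype(dtype)


class ToySegment:
    """Hypothetical, simplified segment; not the actual SpikeInterface class."""

    def __init__(self, num_samples, sampling_frequency, t_start=None, time_vector=None):
        self.num_samples = num_samples
        self.sampling_frequency = sampling_frequency
        self.t_start = t_start
        self.time_vector = time_vector

    def get_num_samples(self):
        return self.num_samples

    def get_times(self) -> np.ndarray:
        if self.time_vector is not None:
            # Cast once and keep the in-memory array for later calls.
            self.time_vector = np.asarray(self.time_vector)
            return self.time_vector
        # No explicit timestamps: build them from the sampling frequency.
        time_vector = np.arange(self.get_num_samples(), dtype="float64")
        time_vector /= self.sampling_frequency
        if self.t_start is not None:
            time_vector += self.t_start
        return time_vector


lazy = CountingLazyArray([0.0, 0.5, 1.0, 1.5])
segment = ToySegment(num_samples=4, sampling_frequency=2.0, time_vector=lazy)
first = segment.get_times()
second = segment.get_times()
print(lazy.num_reads)  # the simulated read count after two calls

uniform = ToySegment(num_samples=4, sampling_frequency=2.0, t_start=10.0).get_times()
```

In this sketch the second call reuses the cached array, so the lazy source is only materialized on the first call; without the assignment back to `self.time_vector`, every call would trigger a new read.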

Collaborator

Yes!

Member Author

Then it's done in the last commit :)

@alejoe91 alejoe91 merged commit a2f157c into SpikeInterface:main Aug 21, 2024