@Logiquo
Collaborator

@Logiquo Logiquo commented Dec 20, 2025

Contributor: Yongda Fan (yongdaf2@illinois.edu)

Contribution Type: Dataset

Description
Add multi-worker support for the task transformation. The improvement is not substantial: with 6 workers, the task transformation for the MP task on MIMIC4 only drops from 3.5 hours to 2 hours.
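
For readers unfamiliar with the pattern, here is a minimal sketch of what a multi-worker task transformation can look like. It is not the code in pyhealth/datasets/base_dataset.py: the names `transform_patients`, `toy_task`, `_run_chunk`, and `speedup_and_efficiency`, the dict-based patient records, and the `chunk_size` parameter are all hypothetical and chosen only for illustration. The last helper just works through the timing reported above.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-ins for patient records and task samples.
Patient = Dict[str, Any]
Sample = Dict[str, Any]
TaskFn = Callable[[Patient], List[Sample]]


def toy_task(patient: Patient) -> List[Sample]:
    """Hypothetical task: emit one sample per visit of a patient."""
    return [{"patient_id": patient["id"], "visit": v} for v in patient["visits"]]


def _run_chunk(job: Tuple[TaskFn, List[Patient]]) -> List[Sample]:
    """Apply the task to every patient in one chunk (runs inside a worker)."""
    task_fn, chunk = job
    samples: List[Sample] = []
    for patient in chunk:
        samples.extend(task_fn(patient))
    return samples


def transform_patients(
    patients: List[Patient],
    task_fn: TaskFn,
    num_workers: int = 1,
    chunk_size: int = 64,
) -> List[Sample]:
    """Split patients into chunks and apply task_fn, optionally in parallel."""
    jobs = [
        (task_fn, patients[i:i + chunk_size])
        for i in range(0, len(patients), chunk_size)
    ]
    if num_workers <= 1:
        # Serial fallback: same chunking logic, no worker processes.
        per_chunk = [_run_chunk(job) for job in jobs]
    else:
        # Executor.map yields results in input order, so sample order
        # matches the serial path.
        with ProcessPoolExecutor(max_workers=num_workers) as pool:
            per_chunk = list(pool.map(_run_chunk, jobs))
    return [sample for chunk_samples in per_chunk for sample in chunk_samples]


def speedup_and_efficiency(
    serial_hours: float, parallel_hours: float, num_workers: int
) -> Tuple[float, float]:
    """Return (speedup, per-worker efficiency) for a measured run."""
    speedup = serial_hours / parallel_hours
    return speedup, speedup / num_workers
```

With the numbers from the description, `speedup_and_efficiency(3.5, 2.0, 6)` gives a speedup of 1.75x and a per-worker efficiency of roughly 0.29, which is consistent with "not substantial". When calling `transform_patients` with `num_workers > 1` on platforms that start workers by spawning a fresh interpreter, the call should sit under an `if __name__ == "__main__":` guard, and `task_fn` must be a picklable top-level function.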

Files to Review
pyhealth/datasets/base_dataset.py

@Logiquo Logiquo requested a review from jhnwu3 December 20, 2025 09:53
@Logiquo Logiquo added the labels enhancement (New feature or request), core (Core functionality: Patient API, BaseDataset, event stream format, etc.), and component: dataset (Contribute a new dataset to PyHealth) on Dec 20, 2025
@Logiquo Logiquo marked this pull request as draft December 20, 2025 21:30
@Logiquo Logiquo marked this pull request as ready for review December 24, 2025 00:02
Collaborator

@jhnwu3 jhnwu3 left a comment

Will approve when the dev job finishes running on the remote cluster.

Collaborator

@jhnwu3 jhnwu3 left a comment

LGTM from the dev run. Will report back if anything crashes later.

@jhnwu3 jhnwu3 merged commit 864b228 into sunlabuiuc:master Dec 24, 2025
1 check passed
@Logiquo Logiquo deleted the multiprocess-task-transformation branch December 24, 2025 22:48