Skip to content

Conversation

@jhnwu3
Copy link
Collaborator

@jhnwu3 jhnwu3 commented Jul 27, 2025

This pull request introduces a new test suite for the MIMIC3Dataset class, ensuring its functionality with demo data from PhysioNet. The tests focus on verifying dataset statistics, patient retrieval, and event extraction, while also handling dataset setup and cleanup efficiently.

New test suite for MIMIC3Dataset:

  • Setup and cleanup for demo dataset:

    • Added methods to download the MIMIC-III demo dataset using wget, load it into MIMIC3Dataset, and clean up temporary files after tests. These operations are encapsulated in setUpClass and tearDownClass for efficient resource management.
  • Testing dataset functionality:

    • Implemented tests for the .stats() method to ensure it executes without errors.
    • Added tests for get_patient and get_events methods to validate patient retrieval and event extraction, including checks for non-empty and correctly typed results.

@jhnwu3 jhnwu3 requested review from Copilot, plandes and zzachw July 27, 2025 03:08
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a comprehensive test suite for the MIMIC3Dataset class that validates core functionality using the PhysioNet demo dataset. The implementation uses class-level setup and teardown methods to efficiently manage the demo dataset download and cleanup process.

  • Downloads MIMIC-III demo dataset from PhysioNet using wget and handles download failures gracefully
  • Tests core dataset functionality including stats generation, patient retrieval, and event extraction
  • Implements efficient resource management with shared dataset across all test methods
Comments suppressed due to low confidence (1)

tests/core/test_mimic3.py:79

  • The test method name test_get_events doesn't accurately reflect what the test does. It tests both get_patient and get_events methods. Consider renaming to test_get_patient_and_events for clarity.
    def test_get_events(self):

def test_get_events(self):
"""Test get_patient and get_events methods with patient 10006."""
# Test get_patient method
patient = self.dataset.get_patient("10006")
Copy link

Copilot AI Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patient ID "10006" is a magic number. Consider defining it as a class constant (e.g., TEST_PATIENT_ID = "10006") to improve maintainability and make it clear why this specific patient ID is used.

Copilot uses AI. Check for mistakes.
@classmethod
def _load_dataset(cls):
"""Load the dataset once for all tests."""
tables = ["diagnoses_icd", "procedures_icd", "prescriptions", "noteevents"]
Copy link

Copilot AI Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The tables list contains magic strings. Consider defining these as class constants to improve maintainability and make it easier to modify the test configuration.

Suggested change
tables = ["diagnoses_icd", "procedures_icd", "prescriptions", "noteevents"]
tables = [
cls.TABLE_DIAGNOSES_ICD,
cls.TABLE_PROCEDURES_ICD,
cls.TABLE_PRESCRIPTIONS,
cls.TABLE_NOTEEVENTS,
]

Copilot uses AI. Check for mistakes.
@classmethod
def _download_demo_dataset(cls):
"""Download MIMIC-III demo dataset using wget."""
download_url = "https://physionet.org/files/mimiciii-demo/1.4/"
Copy link

Copilot AI Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The download URL is hardcoded. Consider defining it as a class constant to make it easier to update if the URL changes and to improve maintainability.

Suggested change
download_url = "https://physionet.org/files/mimiciii-demo/1.4/"
download_url = cls.DEMO_DATASET_URL

Copilot uses AI. Check for mistakes.
@plandes plandes merged commit ce02081 into master Jul 27, 2025
1 check passed
dalloliogm pushed a commit to dalloliogm/PyHealth that referenced this pull request Nov 26, 2025
* create mimic3 initialize test case with demo dataset

* remove extra nonsense from Claude

---------

Co-authored-by: John Wu <johnwu3@sunlab-work-01.cs.illinois.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants