MIMIC3 Initialize Test Case for Tests/Core #527
Pull Request Overview
This PR introduces a comprehensive test suite for the MIMIC3Dataset class that validates core functionality using the PhysioNet demo dataset. The implementation uses class-level setup and teardown methods to efficiently manage the demo dataset download and cleanup process.
- Downloads MIMIC-III demo dataset from PhysioNet using wget and handles download failures gracefully
- Tests core dataset functionality including stats generation, patient retrieval, and event extraction
- Implements efficient resource management with shared dataset across all test methods
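The class-level setup and teardown described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `fetch_demo_files` is a hypothetical stand-in for the wget download, and skipping the class on failure is only one assumed way to handle a failed download gracefully.

```python
import shutil
import tempfile
import unittest
from pathlib import Path


def fetch_demo_files(target_dir: Path) -> None:
    """Hypothetical stand-in for the wget download; writes one tiny CSV."""
    (target_dir / "PATIENTS.csv").write_text("subject_id\n10006\n")


class SharedFixtureTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so every test shares one download.
        cls.temp_dir = Path(tempfile.mkdtemp())
        try:
            fetch_demo_files(cls.temp_dir)
        except OSError as exc:
            # Assumption: skip the class instead of failing when data is unavailable.
            shutil.rmtree(cls.temp_dir, ignore_errors=True)
            raise unittest.SkipTest(f"demo data unavailable: {exc}")

    @classmethod
    def tearDownClass(cls):
        # Runs once after the last test; removes the temporary files.
        shutil.rmtree(cls.temp_dir, ignore_errors=True)

    def test_file_present(self):
        self.assertTrue((self.temp_dir / "PATIENTS.csv").exists())
```

Raising `unittest.SkipTest` inside `setUpClass` marks the whole class as skipped, which keeps an offline CI run from reporting spurious failures.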
Comments suppressed due to low confidence (1)
tests/core/test_mimic3.py:79
- The test method name `test_get_events` doesn't accurately reflect what the test does. It tests both `get_patient` and `get_events` methods. Consider renaming to `test_get_patient_and_events` for clarity.
    def test_get_events(self):
        """Test get_patient and get_events methods with patient 10006."""
        # Test get_patient method
        patient = self.dataset.get_patient("10006")
Copilot AI, Jul 27, 2025
The patient ID "10006" is a magic number. Consider defining it as a class constant (e.g., TEST_PATIENT_ID = "10006") to improve maintainability and make it clear why this specific patient ID is used.
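One way to apply this suggestion is sketched below. `StubDataset` and the dictionary it returns are hypothetical stand-ins so the snippet runs without the demo files; only the `get_patient` name and the ID "10006" come from the PR, and the real return type may differ.

```python
import unittest


class StubDataset:
    """Hypothetical stand-in for MIMIC3Dataset; not the real API."""

    def get_patient(self, patient_id: str) -> dict:
        # Assumed shape for illustration only.
        return {"patient_id": patient_id, "events": ["admission"]}


class TestPatientLookup(unittest.TestCase):
    # Named once so the choice of patient is documented in a single place.
    TEST_PATIENT_ID = "10006"

    @classmethod
    def setUpClass(cls):
        cls.dataset = StubDataset()

    def test_get_patient(self):
        patient = self.dataset.get_patient(self.TEST_PATIENT_ID)
        # Check both identity and that some events came back.
        self.assertEqual(patient["patient_id"], self.TEST_PATIENT_ID)
        self.assertGreater(len(patient["events"]), 0)
```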
    @classmethod
    def _load_dataset(cls):
        """Load the dataset once for all tests."""
        tables = ["diagnoses_icd", "procedures_icd", "prescriptions", "noteevents"]
Copilot AI, Jul 27, 2025
[nitpick] The tables list contains magic strings. Consider defining these as class constants to improve maintainability and make it easier to modify the test configuration.
Suggested change:
    - tables = ["diagnoses_icd", "procedures_icd", "prescriptions", "noteevents"]
    + tables = [
    +     cls.TABLE_DIAGNOSES_ICD,
    +     cls.TABLE_PROCEDURES_ICD,
    +     cls.TABLE_PRESCRIPTIONS,
    +     cls.TABLE_NOTEEVENTS,
    + ]
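The suggestion refers to `cls.TABLE_*` names without showing where they are defined. A minimal sketch of those definitions follows; the `TableConfig` class and its `tables` helper are hypothetical, and in the PR the constants would presumably sit on the test class itself.

```python
class TableConfig:
    """Hypothetical holder for the table-name constants."""

    TABLE_DIAGNOSES_ICD = "diagnoses_icd"
    TABLE_PROCEDURES_ICD = "procedures_icd"
    TABLE_PRESCRIPTIONS = "prescriptions"
    TABLE_NOTEEVENTS = "noteevents"

    @classmethod
    def tables(cls) -> list:
        # Same order as the original hardcoded list.
        return [
            cls.TABLE_DIAGNOSES_ICD,
            cls.TABLE_PROCEDURES_ICD,
            cls.TABLE_PRESCRIPTIONS,
            cls.TABLE_NOTEEVENTS,
        ]
```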
    @classmethod
    def _download_demo_dataset(cls):
        """Download MIMIC-III demo dataset using wget."""
        download_url = "https://physionet.org/files/mimiciii-demo/1.4/"
Copilot AI, Jul 27, 2025
[nitpick] The download URL is hardcoded. Consider defining it as a class constant to make it easier to update if the URL changes and to improve maintainability.
Suggested change:
    - download_url = "https://physionet.org/files/mimiciii-demo/1.4/"
    + download_url = cls.DEMO_DATASET_URL
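A rough sketch of how the constant might feed the download step is shown here. The PR only states that wget is used; the specific flags, the helper names `build_wget_command` and `download_demo_dataset`, and the choice to return `False` on failure are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path

# Defined once so a URL change touches a single line.
DEMO_DATASET_URL = "https://physionet.org/files/mimiciii-demo/1.4/"


def build_wget_command(url: str, target_dir: Path) -> list:
    """Return a wget invocation that mirrors `url` into `target_dir`."""
    # -r recurse, -N use timestamping, -c resume partial files,
    # -np never ascend to the parent directory, -P set the download directory.
    return ["wget", "-r", "-N", "-c", "-np", "-P", str(target_dir), url]


def download_demo_dataset(target_dir: Path, url: str = DEMO_DATASET_URL) -> bool:
    """Run wget and report failure instead of raising (assumed behavior)."""
    try:
        subprocess.run(build_wget_command(url, target_dir), check=True)
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing wget binary; CalledProcessError a non-zero exit.
        return False
    return True
```

Keeping the command construction separate from the `subprocess.run` call lets a test check the arguments without touching the network.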
* create mimic3 initialize test case with demo dataset
* remove extra nonsense from Claude
---------
Co-authored-by: John Wu <johnwu3@sunlab-work-01.cs.illinois.edu>
This pull request introduces a new test suite for the `MIMIC3Dataset` class, ensuring its functionality with demo data from PhysioNet. The tests focus on verifying dataset statistics, patient retrieval, and event extraction, while also handling dataset setup and cleanup efficiently.
New test suite for `MIMIC3Dataset`:
Setup and cleanup for demo dataset:
- Download the MIMIC-III demo dataset using `wget`, load it into `MIMIC3Dataset`, and clean up temporary files after tests. These operations are encapsulated in `setUpClass` and `tearDownClass` for efficient resource management.
Testing dataset functionality:
- Test the `.stats()` method to ensure it executes without errors.
- Test the `get_patient` and `get_events` methods to validate patient retrieval and event extraction, including checks for non-empty and correctly typed results.