-
Notifications
You must be signed in to change notification settings - Fork 555
[Bounty] ConCare PyHealth 2.0 ver #649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bounty] ConCare PyHealth 2.0 ver #649
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will review more of the codebase later (as I find time tomorrow sorry)
For some reason this notebook doesn't show up in GitHub? "The Notebook Does Not Appear to Be Valid JSON"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, no pressure.
I believe I've fixed this error now in the latest commit.
Much appreciated for your guidance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jhnwu3, been updated to fix any issues. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR successfully migrates the ConCare model from PyHealth 1.0 to PyHealth 2.0 API, implementing personalized clinical feature embedding through channel-wise GRUs and multi-head self-attention. The update simplifies the API by leveraging the new SampleDataset and EmbeddingModel interfaces while maintaining the core ConCare functionality including DeCov regularization and time-aware attention mechanisms.
Key Changes:
- Replaced PyHealth 1.0 API (
SampleEHRDataset, explicitfeature_keys/label_keyparameters) with PyHealth 2.0 API (SampleDataset, schema-based feature derivation) - Integrated
EmbeddingModelfor unified embedding handling across different input types - Added comprehensive Google-style docstrings, type hints, and examples
- Fixed bug in
SingleAttention.__init__(removed undefinedself.Wdinitialization) - Added extensive unit tests and MIMIC-IV example notebook
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
pyhealth/models/concare.py |
Main model implementation updated to PyHealth 2.0 API with improved documentation and type hints |
tests/core/test_concare.py |
Comprehensive unit tests covering initialization, forward/backward passes, embeddings, and edge cases |
examples/concare_mimic4_example.ipynb |
Example notebook demonstrating in-hospital mortality prediction on MIMIC-IV dataset |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pyhealth/models/concare.py
Outdated
|
|
||
| # Get dynamic feature keys (excluding static key) | ||
| self.dynamic_feature_keys = [ | ||
| k for k in self.dataset.input_processors.keys() |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dynamic_feature_keys derivation is incorrect and inconsistent with PyHealth 2.0 model conventions. This code filters out the static_key from dataset.input_processors.keys(), but should instead use self.feature_keys (which is already set by BaseModel.init from dataset.input_schema). The current approach may fail if static_key is not in input_processors or may include unexpected keys.
Following the AdaCare model pattern (line 382 in adacare.py), this should be: self.dynamic_feature_keys = [k for k in self.feature_keys if k != self.static_key]
| k for k in self.dataset.input_processors.keys() | |
| k for k in self.feature_keys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - updated to use self.feature_keys
pyhealth/models/concare.py
Outdated
| embedded = self.embedding_model(kwargs) | ||
|
|
||
| # Get static features if available | ||
| static = None | ||
| if self.static_key is not None and self.static_key in kwargs: | ||
| static_data = kwargs[self.static_key] | ||
| if isinstance(static_data, torch.Tensor): | ||
| static = static_data.float().to(self.device) | ||
| else: | ||
| static = torch.tensor( | ||
| kwargs[self.static_key], dtype=torch.float, device=self.device | ||
| static_data, dtype=torch.float, device=self.device | ||
| ) | ||
| x, decov = self.concare[feature_key](x, static=static, mask=mask) | ||
| else: | ||
| x, decov = self.concare[feature_key](x, mask=mask) | ||
|
|
||
| for feature_key in self.dynamic_feature_keys: | ||
| x = embedded[feature_key] | ||
| mask = (x.sum(dim=-1) != 0).int() |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mask generation approach is not optimal and inconsistent with PyHealth 2.0 patterns. The current implementation creates masks using (x.sum(dim=-1) != 0).int(), which assumes zero-padded embeddings. However, following the AdaCare pattern (line 428 in adacare.py), the EmbeddingModel should be called with output_mask=True to get proper masks that are aware of padding tokens.
Change line 953 to: embedded, masks = self.embedding_model(kwargs, output_mask=True) and then use mask = masks[feature_key] instead of computing it manually. This ensures masks are correctly computed based on the actual padding indices used by each processor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - now using output_mask=True
pyhealth/models/concare.py
Outdated
| "Only one label key is supported for ConCare" | ||
| ) | ||
| self.label_key = self.label_keys[0] | ||
| self.mode = self.dataset.output_schema[self.label_key] |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant mode assignment. Line 894 assigns self.mode from the dataset output schema, but BaseModel already sets self.mode during initialization (see base_model.py:36-42). This assignment is redundant and could potentially override the mode resolution logic in BaseModel. Consider removing this line since self.mode is already set by the parent class.
| self.mode = self.dataset.output_schema[self.label_key] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - removed redundant self.mode
pyhealth/models/concare.py
Outdated
| def forward(self, x, sublayer): | ||
| "Apply residual connection to any sublayer with the same size." | ||
| def forward( | ||
| self, x: torch.Tensor, sublayer: callable |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type hint callable (lowercase) is not valid in Python's typing system. It should be Callable from the typing module. Since Callable is already imported at the top of the file (line 24: from typing import Dict, List, Optional, Tuple), add Callable to that import and use Callable[[torch.Tensor], Tuple[torch.Tensor, any]] as the type hint for the sublayer parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - added Callable to imports
pyhealth/models/concare.py
Outdated
| """ | ||
|
|
||
| import math | ||
| from typing import Dict, List, Optional, Tuple |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Callable type is used in the SublayerConnection.forward method (line 399) but is not imported. Add Callable to the typing imports on line 24.
Change: from typing import Dict, List, Optional, Tuple
To: from typing import Callable, Dict, List, Optional, Tuple
| from typing import Dict, List, Optional, Tuple | |
| from typing import Callable, Dict, List, Optional, Tuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - added Callable to imports
c6706ae to
8fdd93b
Compare
jhnwu3
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean adaptation, no complaints for me.
Pull Request: ConCare Model Update (1.0 → 2.0)
Contributor Information
Contribution Type
High-Level Description
This PR updates the ConCare model from PyHealth 1.0 API to PyHealth 2.0 API.
Paper Reference
Changes Made
API Migration (1.0 → 2.0):
SampleEHRDatasetwithSampleDatasetEmbeddingModelfor unified embedding handlingfeature_keys,label_key,mode,use_embeddingparametersCode Improvements:
SingleAttention(removed undefinedself.Wdinitialization)Testing:
Examples:
Files to Review
How to Test
1. Run Unit Tests
2. Run Quick Example (main block)
3. Expected Output
Checklist
Additional Notes
The ConCare model includes:
This update maintains full backward compatibility with the original ConCare functionality while adopting the cleaner 2.0 API patterns.