Increase head #103
base: core
Conversation
Walkthrough

The pull request introduces modifications across several files in the ehr2vec project.

Changes
Actionable comments posted: 0
🧹 Nitpick comments (2)
ehr2vec/model/heads.py (2)
58-64: Consider making dropout and hidden size configurable.
Using hardcoded values (128 for hidden size and 0.1 for dropout) restricts the flexibility of your model. Providing these as configuration options allows for easier experimentation and tuning. Here's a sample diff to fetch them from the config (with a fallback to the current values):

```diff
-classifier_hidden = 128
+classifier_hidden = config.to_dict().get("classifier_hidden_dim", 128)
 ...
-nn.Dropout(0.1),
+nn.Dropout(config.to_dict().get("classifier_dropout", 0.1)),
```
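For illustration, a minimal self-contained sketch of what config-driven head construction could look like, assuming `config` exposes a `to_dict()` method as used elsewhere in this file; `build_classifier` is a hypothetical helper name, while the keys `classifier_hidden_dim` and `classifier_dropout` come from the diff above:

```python
import torch.nn as nn


def build_classifier(config, input_size: int) -> nn.Sequential:
    """Hypothetical helper: reads classifier hyperparameters from the config,
    falling back to the current hardcoded values (128 and 0.1)."""
    cfg = config.to_dict()
    hidden = cfg.get("classifier_hidden_dim", 128)
    dropout = cfg.get("classifier_dropout", 0.1)
    return nn.Sequential(
        nn.Linear(input_size, hidden),
        nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(hidden, 1),
    )
```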
145-151: Unify the configuration retrieval for the classifier parameters.
Similar to BaseRNN, using configurable parameters for hidden size and dropout in FineTuneHead fosters consistency and makes it easier to share hyperparameters across multiple classes. Below is a sample diff for consistency:

```diff
-classifier_hidden = 128
+classifier_hidden = config.to_dict().get("classifier_hidden_dim", 128)
 ...
-nn.Dropout(0.1),
+nn.Dropout(config.to_dict().get("classifier_dropout", 0.1)),
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- ehr2vec/main/04_finetune_cv.py (1 hunks)
- ehr2vec/model/heads.py (2 hunks)
🔇 Additional comments (1)
ehr2vec/main/04_finetune_cv.py (1)
198-198: Good practice adding run parameter to finetune_fold.
This enhancement allows for more dynamic logging or tracking (e.g., with W&B), aligning with common MLOps patterns.
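As a rough sketch only (not the actual signature in 04_finetune_cv.py), a fold function with an optional run handle might look like this; the loss value is a placeholder and the run object is assumed to behave like a W&B run:

```python
from typing import Any, Optional


def finetune_fold(cfg: dict, fold: int, run: Optional[Any] = None) -> float:
    """Simplified sketch of a CV fold that optionally logs to a tracking run."""
    val_loss = 0.0  # placeholder for the fold's actual training/validation loop
    if run is not None:  # e.g. a wandb run object passed down by the caller
        run.log({"fold": fold, "val_loss": val_loss})
    return val_loss
```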
Actionable comments posted: 1
🧹 Nitpick comments (5)
ehr2vec/model/heads.py (2)
198-210: Consider parameterizing the BigHead configuration.
The implementation looks good, but consider making the following improvements:

- Add configuration parameters for classifier_hidden and dropout instead of hard-coding
- Add a docstring explaining the purpose and architecture of this classifier
```diff
 class BigHead(nn.Module):
     def __init__(self, input_size):
         super().__init__()
-        self.classifier_hidden = 128
+        self.classifier_hidden = config.classifier_hidden_size
         self.classifier = nn.Sequential(
             nn.Linear(input_size, self.classifier_hidden),
             nn.ReLU(),
-            nn.Dropout(0.1),
+            nn.Dropout(config.classifier_dropout),
             nn.Linear(self.classifier_hidden, 1),
         )
```
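An alternative sketch that passes the hyperparameters through the constructor instead of reading a global config; the defaults mirror the current hardcoded values, and the forward method is an assumption not shown in the diff above:

```python
import torch.nn as nn


class BigHead(nn.Module):
    """Sketch: MLP classifier head with configurable width and dropout."""

    def __init__(self, input_size: int, hidden_size: int = 128, dropout: float = 0.1):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, x):
        return self.classifier(x)
```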
58-64: Consider using enum for classifier types.
Instead of string literals, consider using an enum or constants for classifier types to prevent typos and enable better IDE support.

```python
from enum import Enum

class ClassifierType(Enum):
    BIG = "big"
    STANDARD = "standard"
```

ehr2vec/configs/example_configs/05_02_finetune_simulated.yaml (1)
18-18: Document available classifier options.
Please add a comment documenting the available classifier options ("big" or "standard") and their implications.

```diff
+# Classifier type: "big" (MLP with dropout) or "standard" (linear)
 classifier: "big"
```
ehr2vec/double_robust/counterfactual.py (2)
14-14: Improve type hints for optional parameters.
Consider using Optional from the typing module for optional parameters and document the behavior in the docstring.

```diff
-def create_counterfactual_data(data: Data, exposure_regex: List[str] = None, control_regex: str = None) -> Data:
+from typing import Optional
+def create_counterfactual_data(
+    data: Data,
+    exposure_regex: Optional[List[str]] = None,
+    control_regex: Optional[str] = None
+) -> Data:
```

Also applies to: 19-27
61-70: Simplify the conditional logic in the return statement.
The current implementation can be simplified using a conditional expression.

```diff
-        exposed_patients=(
-            [
-                pid
-                for pid in data.pids
-                if counterfactual_exposures[data.pids.index(pid)] == 1
-            ]
-            if counterfactual_exposures is not None
-            else []
-        ),
+        exposed_patients=[
+            pid for pid in data.pids
+            if counterfactual_exposures and counterfactual_exposures[data.pids.index(pid)] == 1
+        ],
```
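A quick self-contained check of the suggested comprehension on toy data (the pids and exposure values are made up); note that when counterfactual_exposures is None, the truthiness guard yields an empty list, matching the original `else []` branch:

```python
pids = ["p1", "p2", "p3"]
counterfactual_exposures = [1, 0, 1]

exposed = [
    pid for pid in pids
    if counterfactual_exposures and counterfactual_exposures[pids.index(pid)] == 1
]
assert exposed == ["p1", "p3"]

# With no exposures, the same expression degrades to an empty list.
counterfactual_exposures = None
assert [
    pid for pid in pids
    if counterfactual_exposures and counterfactual_exposures[pids.index(pid)] == 1
] == []
```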
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- ehr2vec/configs/example_configs/05_02_finetune_simulated.yaml (1 hunks)
- ehr2vec/double_robust/counterfactual.py (2 hunks)
- ehr2vec/model/heads.py (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: tests
- GitHub Check: tests
🔇 Additional comments (2)
ehr2vec/model/heads.py (1)
213-219: LGTM!
Clean implementation of a standard linear classifier.
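For context, a standard linear head of the kind being praised might look like this minimal sketch (the real StandardHead in heads.py may differ):

```python
import torch.nn as nn


class StandardHead(nn.Module):
    """Sketch: single linear layer mapping features to one logit."""

    def __init__(self, input_size: int):
        super().__init__()
        self.classifier = nn.Linear(input_size, 1)

    def forward(self, x):
        return self.classifier(x)
```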
ehr2vec/double_robust/counterfactual.py (1)
51-54: LGTM! Good practice to avoid modifying input data.
Creating a new list instead of modifying the original preserves data immutability.
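A toy illustration of the copy-instead-of-mutate pattern being praised here (the concept codes are made up and unrelated to the actual data in counterfactual.py):

```python
original_concepts = [["DIAG_A", "MED_B"], ["DIAG_C"]]

# Build new lists rather than appending to the caller's lists in place.
augmented = [concepts + ["EXPOSURE"] for concepts in original_concepts]

assert original_concepts == [["DIAG_A", "MED_B"], ["DIAG_C"]]  # input untouched
assert augmented == [["DIAG_A", "MED_B", "EXPOSURE"], ["DIAG_C", "EXPOSURE"]]
```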
```python
classifier_input_size = config.hidden_size + self.exposure_dim
if config.to_dict().get("classifier", None) is not None:
    if config.classifier == "big":
        self.classifier = BigHead(classifier_input_size)
    else:
        self.classifier = StandardHead(classifier_input_size)
else:
    self.classifier = StandardHead(classifier_input_size)
```
🛠️ Refactor suggestion
Extract classifier creation logic to avoid duplication.
The classifier creation logic is duplicated between BaseRNN and FineTuneHead. Consider extracting it to a factory method.
```python
def create_classifier(config, input_size):
    """Create classifier based on config."""
    if config.to_dict().get("classifier", None) is not None:
        if config.classifier == "big":
            return BigHead(input_size)
    return StandardHead(input_size)
```
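For illustration, both heads could then delegate to the factory; this is a sketch only, with simplified constructor signatures, assuming `create_classifier`, `BigHead`, and `StandardHead` are defined as above:

```python
import torch.nn as nn


class BaseRNN(nn.Module):
    def __init__(self, config, input_size):
        super().__init__()
        # Shared factory replaces the duplicated if/else block in both classes.
        self.classifier = create_classifier(config, input_size)


class FineTuneHead(nn.Module):
    def __init__(self, config, input_size):
        super().__init__()
        self.classifier = create_classifier(config, input_size)
```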
Summary by CodeRabbit

New Features
- BigHead and StandardHead classifier heads.

Improvements

Technical Updates