-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor & feat: re-organized the scripts and added new features #199
Conversation
Signed-off-by: nanayves <[email protected]>
Signed-off-by: nanayves <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some initial comments.
Main points:
- fix the build
- a lot of stuff that lives in processing should go in a specific submodule, e.g., CrossverGenerator or the selection, probably in optimisation.py
- docstrings are not consistent
@@ -81,6 +81,9 @@ gt4sd = | |||
training_pipelines/tests/*json | |||
training_pipelines/tests/*smi | |||
frameworks/crystals_rfc/*.csv | |||
frameworks/enzeptional/tests/*pkl | |||
frameworks/enzeptional/tests/*json | |||
frameworks/enzeptional/mutation_model/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add extension otherwise it's risky
from gt4sd.frameworks.enzeptional import core | ||
from gt4sd.frameworks.enzeptional.processing import AutoModelFromHFEmbedding, TAPEEmbedding | ||
|
||
warnings.simplefilter(action="ignore", category=FutureWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write it as a pytest unit test, also it would be good to have atomical tests (per method) as well
sequence_from_intervals (str): orignial sequence extrcated from interval. | ||
tmp_results (List[Dict[str, Any]]): the temporary results | ||
intervals (List[Tuple[int, int]]): list of ranges in the sequence, zero-based. The same interval is applied to all sequences | ||
selection_method (str): methodology used to selection | ||
crossover_method (str): methodology used for crossover | ||
crossover_probability (float, optional): Crossover probability. Used in case Uniform crossover is selected. Defaults to 0.5. | ||
top_k_selection (Optional[int], optional): Number of samples to select. Defaults to -1. | ||
|
||
Returns: | ||
List[str]: New samples for the next round of optimization | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this docstring style is to compliant and will not the rendered properly in the docs, fix it following the other examples
class SelectionGenerator: | ||
def __init__(self) -> None: | ||
pass | ||
|
||
def selection( | ||
self, scores: List[Dict[str, Any]], k: Optional[int] = -1 | ||
) -> List[Any]: | ||
"""Select the top k mutated sequences based on the score | ||
Args: | ||
scores: dictionary containing sequences and scores. | ||
k: number of top sequences to return. | ||
return: | ||
The top k sequences | ||
""" | ||
return list(sorted(scores, key=lambda d: d["score"], reverse=True))[:k] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not need to be a class, it's just a method :)
Old PR |
Signed-off-by: nanayves [email protected]