Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor & feat: re-organized the scripts and added new features #199

Closed
wants to merge 11 commits into from

Conversation

yvesnana
Copy link
Contributor

Signed-off-by: nanayves [email protected]

@yvesnana yvesnana requested a review from drugilsberg February 10, 2023 09:54
@cla-bot cla-bot bot added the cla-signed CLA has been signed label Feb 10, 2023
Copy link
Contributor

@drugilsberg drugilsberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some initial comments.
Main points:

  • fix the build
  • a lot of stuff that lives in processing should go in a specific submodule, e.g., CrossverGenerator or the selection, probably in optimisation.py
  • docstrings are not consistent

@@ -81,6 +81,9 @@ gt4sd =
training_pipelines/tests/*json
training_pipelines/tests/*smi
frameworks/crystals_rfc/*.csv
frameworks/enzeptional/tests/*pkl
frameworks/enzeptional/tests/*json
frameworks/enzeptional/mutation_model/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add extension otherwise it's risky

from gt4sd.frameworks.enzeptional import core
from gt4sd.frameworks.enzeptional.processing import AutoModelFromHFEmbedding, TAPEEmbedding

warnings.simplefilter(action="ignore", category=FutureWarning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write it as a pytest unit test, also it would be good to have atomical tests (per method) as well

Comment on lines 638 to 648
sequence_from_intervals (str): orignial sequence extrcated from interval.
tmp_results (List[Dict[str, Any]]): the temporary results
intervals (List[Tuple[int, int]]): list of ranges in the sequence, zero-based. The same interval is applied to all sequences
selection_method (str): methodology used to selection
crossover_method (str): methodology used for crossover
crossover_probability (float, optional): Crossover probability. Used in case Uniform crossover is selected. Defaults to 0.5.
top_k_selection (Optional[int], optional): Number of samples to select. Defaults to -1.

Returns:
List[str]: New samples for the next round of optimization
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this docstring style is to compliant and will not the rendered properly in the docs, fix it following the other examples

Comment on lines 389 to 403
class SelectionGenerator:
def __init__(self) -> None:
pass

def selection(
self, scores: List[Dict[str, Any]], k: Optional[int] = -1
) -> List[Any]:
"""Select the top k mutated sequences based on the score
Args:
scores: dictionary containing sequences and scores.
k: number of top sequences to return.
return:
The top k sequences
"""
return list(sorted(scores, key=lambda d: d["score"], reverse=True))[:k]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not need to be a class, it's just a method :)

@yvesnana yvesnana closed this Nov 22, 2023
@yvesnana
Copy link
Contributor Author

Old PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed CLA has been signed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants