Sourcery refactored master branch #1
base: master
Sourcery timed out performing refactorings.
Due to GitHub API limits, only the first 60 comments can be shown.
for _ in dataset: | ||
pass |
Function read_unformated refactored with the following changes:
- Remove redundant pass statement (remove-redundant-pass)
- Remove nested block which has no effect (remove-empty-nested-block)
for _ in dataset: | ||
pass |
Function read_formatted_as_numpy refactored with the following changes:
- Remove redundant pass statement (remove-redundant-pass)
- Remove nested block which has no effect (remove-empty-nested-block)
times[read_func.__name__ + " after write_array2d"] = read_func(feats, tmp_dir) | ||
times[f"{read_func.__name__} after write_array2d"] = read_func(feats, tmp_dir) |
Function benchmark_array_xd refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
print(func.__name__, str(kwargs)) | ||
times[func.__name__ + " " + " ".join(str(v) for v in kwargs.values())] = func(dataset, **kwargs) | ||
print(func.__name__, kwargs) | ||
times[ | ||
f"{func.__name__} " + " ".join(str(v) for v in kwargs.values()) | ||
] = func(dataset, **kwargs) | ||
|
||
|
||
print("shuffling dataset") | ||
dataset = dataset.shuffle() | ||
print("Second set of iterations (after shuffling") | ||
for func, kwargs in functions_shuffled: | ||
print("shuffled ", func.__name__, str(kwargs)) | ||
times["shuffled " + func.__name__ + " " + " ".join(str(v) for v in kwargs.values())] = func( | ||
dataset, **kwargs | ||
) | ||
print("shuffled ", func.__name__, kwargs) | ||
times[ | ||
f"shuffled {func.__name__} " | ||
+ " ".join(str(v) for v in kwargs.values()) | ||
] = func(dataset, **kwargs) | ||
|
Function benchmark_iterating refactored with the following changes:
- Remove unnecessary call to str() within print() [×2] (remove-str-from-print)
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
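The remove-str-from-print change is safe because print() already converts each argument with str(). A minimal sketch that checks this, using a hypothetical helper captured_print and made-up keyword arguments:

```python
import contextlib
import io


def captured_print(*args):
    # Capture what print() writes to stdout for the given arguments.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        print(*args)
    return buffer.getvalue()


kwargs = {"batch_size": 10, "length": 100}  # hypothetical benchmark arguments
with_str = captured_print("read", str(kwargs))
without_str = captured_print("read", kwargs)
print(with_str == without_str)
```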
title += " " + metric_name + " |" | ||
title += f" {metric_name} |" | ||
lines += "---|" | ||
value += val_str + " |" | ||
value += f"{val_str} |" |
Function format_json_to_md refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
@@ -16,6 +16,7 @@ | |||
"""Amazon Customer Reviews Dataset --- US REVIEWS DATASET.""" | |||
|
Lines 101-101 refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
for i, row in enumerate(reader): | ||
yield i, row | ||
yield from enumerate(reader) |
Function AmazonUSReviews._generate_examples refactored with the following changes:
- Replace yield inside for loop with yield from (yield-from)
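A minimal sketch of why the yield-from rewrite should be behavior-preserving: both generators below produce the same (index, row) pairs. The function names and sample rows are hypothetical and are not taken from the dataset script.

```python
def generate_with_loop(reader):
    # Original form: re-yield each (index, row) pair explicitly.
    for i, row in enumerate(reader):
        yield i, row


def generate_with_yield_from(reader):
    # Refactored form: delegate directly to the enumerate iterator.
    yield from enumerate(reader)


rows = [{"review_id": "a"}, {"review_id": "b"}]  # made-up sample rows
loop_result = list(generate_with_loop(rows))
delegated_result = list(generate_with_yield_from(rows))
print(loop_result == delegated_result)
```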
@@ -16,6 +16,7 @@ | |||
"""AmbigQA: Answering Ambiguous Open-domain Questions""" | |||
|
Lines 43-46 refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
features_dict.update(detail_features) | ||
features_dict |= detail_features |
Function AmbigQa._info refactored with the following changes:
- Merge dictionary updates via the union operator (dict-assign-update-to-union)
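Worth noting when reviewing this change: the in-place union operator |= on dict only exists from Python 3.9 onward (PEP 584), so it raises a TypeError on older interpreters where update() still works. A small sketch with hypothetical feature names, guarded so it runs on either version:

```python
import sys

features_dict = {"id": "string", "question": "string"}  # hypothetical features
detail_features = {"annotations": "list"}  # hypothetical features

# Form used before the refactoring; works on every Python 3 version.
updated = dict(features_dict)
updated.update(detail_features)

# Form introduced by the refactoring; needs Python 3.9 or newer.
merged = dict(features_dict)
if sys.version_info >= (3, 9):
    merged |= detail_features
else:
    merged.update(detail_features)  # fallback for older interpreters

print(updated == merged)
```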
path_dict = dict() | ||
path_dict = {} | ||
for round_tag in ["R1", "R2", "R3"]: | ||
path_dict[round_tag] = dict() | ||
path_dict[round_tag] = {} |
Function ANLI._split_generators refactored with the following changes:
- Replace dict() with {} [×2] (dict-literal)
for idx, line in enumerate(open(filepath, "rb")): | ||
for line in open(filepath, "rb"): |
Function ANLI._generate_examples refactored with the following changes:
- Remove unnecessary calls to enumerate when the index is not used (remove-unused-enumerate)
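Dropping enumerate is fine when the index is unused. Separately, iterating over a bare open(...) call leaves closing the file to the garbage collector. The sketch below shows the same loop inside a with-block; the with-block is a suggestion beyond this PR, and the helper name, file name, and file contents are hypothetical.

```python
import os
import tempfile


def iter_lines(filepath):
    # Iterate directly over the file; no index is needed, so no enumerate().
    # The with-block (an addition, not part of the PR) closes the handle.
    with open(filepath, "rb") as f:
        for line in f:
            yield line


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.jsonl")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b'{"uid": 1}\n{"uid": 2}\n')
    lines = list(iter_lines(path))

print(len(lines))
```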
@@ -15,6 +15,7 @@ | |||
"""AQUA-RAT (Algebra Question Answering with Rationales) Dataset""" | |||
|
Lines 56-60 refactored with the following changes:
- Use f-string instead of string concatenation [×6] (use-fstring-for-concatenation)
features = {} | ||
|
||
features["tweetID"] = datasets.Value("int64") | ||
features = {"tweetID": datasets.Value("int64")} |
Function ArCov19._info refactored with the following changes:
- Merge dictionary assignment with declaration (merge-dict-assign)
@@ -15,6 +15,7 @@ | |||
"""Arabic Billion Words Corpus""" | |||
|
Lines 44-53 refactored with the following changes:
- Use f-string instead of string concatenation [×10] (use-fstring-for-concatenation)
out = re.findall(r"" + pattern, sample.group(0), re.MULTILINE | re.DOTALL) | ||
out = re.findall(f"{pattern}", sample.group(0), re.MULTILINE | re.DOTALL) |
Function ArabicBillionWords._extract_tags refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)
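One observation on this particular rewrite: if pattern is already a str, both r"" + pattern and f"{pattern}" evaluate to the same string as pattern itself, so passing pattern directly would likely be simpler. A sketch under that assumption, with a made-up pattern and input text:

```python
import re

pattern = r"<Headline>(.*?)</Headline>"  # hypothetical tag pattern
text = "<Headline>Title</Headline>"  # hypothetical input

concatenated = r"" + pattern  # form before the refactoring
interpolated = f"{pattern}"  # form after the refactoring

# Both wrappers are no-ops for a str, so the plain variable is enough.
out = re.findall(pattern, text, re.MULTILINE | re.DOTALL)
print(concatenated == interpolated == pattern, out)
```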
if self.config.name == "full" or self.config.name == "ptpt": | ||
if self.config.name in ["full", "ptpt"]: | ||
train_paths.append(os.path.join(data_dir, "assin-ptpt-train.xml")) | ||
dev_paths.append(os.path.join(data_dir, "assin-ptpt-dev.xml")) | ||
test_paths.append(os.path.join(data_dir, "assin-ptpt-test.xml")) | ||
|
||
if self.config.name == "full" or self.config.name == "ptbr": | ||
if self.config.name in ["full", "ptbr"]: |
Function Assin._split_generators refactored with the following changes:
- Replace multiple comparisons of same variable with in operator [×2] (merge-comparisons)
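A small sketch of the merged comparison, assuming the config name is always a plain string. The function name and return values are hypothetical; a tuple is used here instead of the list in the diff, which behaves the same for membership tests.

```python
def selected_subsets(config_name):
    # Mirrors the two independent checks in the diff: "full" selects both.
    subsets = []
    if config_name in ("full", "ptpt"):
        subsets.append("ptpt")
    if config_name in ("full", "ptbr"):
        subsets.append("ptbr")
    return subsets


print(selected_subsets("full"))
```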
rest = "[" + rest | ||
rest = f"[{rest}" |
Function Atomic._generate_examples refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)
for idx, line in enumerate(f): | ||
for line in f: |
Function BabiQa._generate_examples refactored with the following changes:
- Remove unnecessary calls to enumerate when the index is not used (remove-unused-enumerate)
- If else clause is always executed move code to same level as loop (useless-else-on-loop)
This removes the following comments (why?):
# After last line
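For context on useless-else-on-loop: a for loop's else clause runs whenever the loop finishes without hitting break, so in a loop with no break it always runs and can be moved after the loop. A sketch with hypothetical function names and data, which also shows that the dropped comment could be kept at the new location:

```python
def with_else(lines):
    collected = []
    for line in lines:
        collected.append(line.strip())
    else:
        # Runs whenever the loop ends without break, which is always here.
        collected.append("<end>")
    return collected


def without_else(lines):
    collected = []
    for line in lines:
        collected.append(line.strip())
    # After last line
    collected.append("<end>")
    return collected


sample = ["1 Mary moved.\n", "2 Where is Mary?\n"]  # made-up lines
print(with_else(sample) == without_else(sample))
```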
if line == "" or line == "\n": | ||
if line in ["", "\n"]: |
Function Bc2gmCorpus._generate_examples refactored with the following changes:
- Replace multiple comparisons of same variable with in operator (merge-comparisons)
yield f"{file_idx}_{line_idx}", { | ||
"fname": fname.name, | ||
"char": chars, | ||
"char_type": char_types, | ||
"is_beginning": is_beginnings if split == "train" else [0 for i in range(len(chars))], | ||
} | ||
yield ( | ||
f"{file_idx}_{line_idx}", | ||
{ | ||
"fname": fname.name, | ||
"char": chars, | ||
"char_type": char_types, | ||
"is_beginning": is_beginnings | ||
if split == "train" | ||
else [0 for _ in range(len(chars))], | ||
}, | ||
) |
Function Best2009._generate_examples refactored with the following changes:
- Replace unused for index with underscore (for-index-underscore)
name = "%s_to_%s" % (language_pair[0], language_pair[1]) | ||
name = f"{language_pair[0]}_to_{language_pair[1]}" | ||
|
||
description = f"Translation dataset from {language_pair[0]} to {language_pair[1]} or {language_pair[1]} to {language_pair[0]}." | ||
|
Function BianetConfig.__init__ refactored with the following changes:
- Replace interpolated string formatting with f-string [×2] (replace-interpolation-with-fstring)
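A quick check that the percent-style and f-string forms build the same name. The language pair below is a made-up example value.

```python
language_pair = ("en", "tr")  # hypothetical language pair

# Percent-style interpolation, as before the refactoring.
percent_name = "%s_to_%s" % (language_pair[0], language_pair[1])

# f-string, as after the refactoring.
fstring_name = f"{language_pair[0]}_to_{language_pair[1]}"

print(percent_name == fstring_name)
```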
result = ( | ||
yield ( | ||
sentence_counter, | ||
{ | ||
"id": str(sentence_counter), | ||
"translation": {lang1: x, lang2: y}, | ||
}, | ||
) | ||
yield result |
Function Bianet._generate_examples refactored with the following changes:
- Inline variable that is immediately yielded (inline-immediately-yielded-variable)
folder = l1 + "-" + l2 | ||
folder = f"{l1}-{l2}" |
Function BiblePara._generate_examples refactored with the following changes:
- Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
{k: os.path.join(dl_path, "bigPatentData", k + ".tar.gz") for k in split_types} | ||
{ | ||
k: os.path.join(dl_path, "bigPatentData", f"{k}.tar.gz") | ||
for k in split_types | ||
} | ||
) | ||
|
Function BigPatent._split_generators refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)
"train": "https://archive.org/download/biomrc_dataset/biomrc_large/dataset_train{}.json.gz".format( | ||
setting | ||
), | ||
"val": "https://archive.org/download/biomrc_dataset/biomrc_large/dataset_val{}.json.gz".format( | ||
setting | ||
), | ||
"test": "https://archive.org/download/biomrc_dataset/biomrc_large/dataset_test{}.json.gz".format( | ||
setting | ||
), | ||
"train": f"https://archive.org/download/biomrc_dataset/biomrc_large/dataset_train{setting}.json.gz", | ||
"val": f"https://archive.org/download/biomrc_dataset/biomrc_large/dataset_val{setting}.json.gz", | ||
"test": f"https://archive.org/download/biomrc_dataset/biomrc_large/dataset_test{setting}.json.gz", | ||
} | ||
|
||
elif self.config.biomrc_version == "small": | ||
urls_to_download = { | ||
"train": "https://archive.org/download/biomrc_dataset/biomrc_small/dataset_train_small{}.json.gz".format( | ||
setting | ||
), | ||
"val": "https://archive.org/download/biomrc_dataset/biomrc_small/dataset_val_small{}.json.gz".format( | ||
setting | ||
), | ||
"test": "https://archive.org/download/biomrc_dataset/biomrc_small/dataset_test_small{}.json.gz".format( | ||
setting | ||
), | ||
"train": f"https://archive.org/download/biomrc_dataset/biomrc_small/dataset_train_small{setting}.json.gz", | ||
"val": f"https://archive.org/download/biomrc_dataset/biomrc_small/dataset_val_small{setting}.json.gz", | ||
"test": f"https://archive.org/download/biomrc_dataset/biomrc_small/dataset_test_small{setting}.json.gz", | ||
} | ||
|
||
else: | ||
urls_to_download = { | ||
"test": "https://archive.org/download/biomrc_dataset/biomrc_tiny/dataset_tiny{}.json.gz".format( | ||
setting | ||
) | ||
"test": f"https://archive.org/download/biomrc_dataset/biomrc_tiny/dataset_tiny{setting}.json.gz" | ||
} | ||
|
Function Biomrc._split_generators refactored with the following changes:
- Simplify conditional into switch-like form [×7] (switch)
- Replace call to format with f-string [×7] (use-fstring-for-formatting)
result = ( | ||
yield ( | ||
sentence_counter, | ||
{ | ||
"id": str(sentence_counter), | ||
"text": row, | ||
}, | ||
) | ||
yield result |
Function Cc100._generate_examples refactored with the following changes:
- Inline variable that is immediately yielded (inline-immediately-yielded-variable)
url = my_urls + "en_XX-" + self.config.language_code + ".tsv.xz" | ||
url = f"{my_urls}en_XX-{self.config.language_code}.tsv.xz" |
Function CcalignedMultilingual._split_generators refactored with the following changes:
- Use f-string instead of string concatenation [×3] (use-fstring-for-concatenation)
elif reverse: | ||
yield id_, { | ||
"translation": {lc: data[0].strip(), "en_XX": data[1].strip()}, | ||
"LASER_similarity": data[2], | ||
} | ||
|
||
else: | ||
if not reverse: | ||
yield id_, { | ||
"translation": {"en_XX": data[0].strip(), lc: data[1].strip()}, | ||
"LASER_similarity": data[2], | ||
} | ||
else: | ||
yield id_, { | ||
"translation": {lc: data[0].strip(), "en_XX": data[1].strip()}, | ||
"LASER_similarity": data[2], | ||
} | ||
yield id_, { | ||
"translation": {"en_XX": data[0].strip(), lc: data[1].strip()}, | ||
"LASER_similarity": data[2], | ||
} |
Function CcalignedMultilingual._generate_examples refactored with the following changes:
- Merge else clause's nested if statement into elif (merge-else-if-into-elif)
- Swap if/else branches (swap-if-else-branches)
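Because the flattened diff is hard to read here, the sketch below shows the shape of the two transformations on a simplified stand-in. The outer condition is_documents and the string return values are hypothetical; only the reverse flag comes from the diff.

```python
def nested(is_documents, reverse):
    # Shape before the refactoring: an if/else nested inside else.
    if is_documents:
        return "documents"
    else:
        if not reverse:
            return "en_XX first"
        else:
            return "lc first"


def flattened(is_documents, reverse):
    # Shape after: nested if merged into elif, with the branches swapped
    # so the condition no longer needs a negation.
    if is_documents:
        return "documents"
    elif reverse:
        return "lc first"
    else:
        return "en_XX first"


cases = [(d, r) for d in (True, False) for r in (True, False)]
print(all(nested(d, r) == flattened(d, r) for d, r in cases))
```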
self.split_file = os.path.join(directory, name + ".json") | ||
self.split_file = os.path.join(directory, f"{name}.json") |
Function CfqConfig.__init__ refactored with the following changes:
- Use f-string instead of string concatenation (use-fstring-for-concatenation)
if split == "train": | ||
batches = ["data_batch_1", "data_batch_2", "data_batch_3", "data_batch_4", "data_batch_5"] | ||
|
||
if split == "test": | ||
batches = ["test_batch"] | ||
|
||
elif split == "train": | ||
batches = ["data_batch_1", "data_batch_2", "data_batch_3", "data_batch_4", "data_batch_5"] |
Function Cifar10._generate_examples refactored with the following changes:
- Simplify conditional into switch-like form (switch)
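The two independent if statements and the if/elif chain select the same batches, since split cannot equal both values at once. As far as the diff shows, neither form handles any other split value. The sketch below adds an explicit error for that case; the error is an assumption and not part of the PR, and the function name is hypothetical.

```python
def batches_for(split):
    # Switch-like form from the refactoring, wrapped in a helper.
    if split == "test":
        return ["test_batch"]
    elif split == "train":
        return [f"data_batch_{i}" for i in range(1, 6)]
    # Assumption: failing loudly beats leaving the batch list unassigned.
    raise ValueError(f"unknown split: {split!r}")


print(batches_for("train"))
```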
Sourcery Code Quality Report
✅ Merging this PR will increase code quality in the affected files by 0.77%.
The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.
Branch master refactored by Sourcery.
If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.