Validation of config.yaml against datafile #182

Open
willu47 opened this issue Jun 20, 2023 · 1 comment

Comments

@willu47
Member

willu47 commented Jun 20, 2023

Duplicate of #160. See also #151

Issue #179 highlighted that if the list of parameters and sets in the config.yaml file does not match what is in the data file, otoole returns a cryptic error message from Amply.

It is currently not possible to validate parameters and sets against what is in a datafile, because Amply needs those definitions up front in order to parse the datafile.

There are some hacky ways to extract parameters and sets from datafiles using string matching or regex, but they are fragile.
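For illustration, a minimal sketch of the regex approach, assuming declarations start a line with `set NAME` or `param NAME`. The function name `declared_names` is hypothetical and not part of otoole. It also shows the fragility: tabular forms such as `param default 0 : A B :=` would be misread, with `default` picked up as a name.

```python
import re

# Assumption: each declaration starts a line with "set NAME" or "param NAME".
DECLARATION = re.compile(r"^\s*(set|param)\s+([A-Za-z_]\w*)", re.MULTILINE)


def declared_names(datafile_text):
    """Return (sets, params) named in a datafile, found by pattern matching only."""
    sets, params = set(), set()
    for keyword, name in DECLARATION.findall(datafile_text):
        if keyword == "set":
            sets.add(name)
        else:
            params.add(name)
    return sets, params


sample = """
# set IGNORED := x;
set REGION := SIMPLICITY;
set YEAR := 2014 2015;
param AccumulatedAnnualDemand default 0 :=
;
end;
"""
sets, params = declared_names(sample)
```

The resulting names could then be compared with the keys in config.yaml before the datafile is handed to Amply.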

Catching the Amply error and returning a more useful error in otoole, e.g. "DatafileParseError: Please check that the config file provided matches the parameters and sets in your datafile", may be a quicker route to something usable.
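A rough sketch of that catch-and-re-raise idea. Everything here is an assumption made for illustration: `DatafileParseError` is only the name suggested above, `parse_datafile` is a hypothetical wrapper, and the parser and its exception types are passed in so that the sketch does not presume anything about Amply's actual API.

```python
class DatafileParseError(Exception):
    """Hypothetical otoole-level error, named after the suggestion above."""


def parse_datafile(text, parser, parser_errors=(Exception,)):
    """Run `parser` on `text`, translating its failures into a clearer error.

    `parser` stands in for the Amply parsing call and `parser_errors` for
    whatever exception types that call raises; both are assumptions.
    """
    try:
        return parser(text)
    except parser_errors as ex:
        # "from ex" keeps the original parser error in the traceback.
        raise DatafileParseError(
            "Please check that the config file provided matches "
            "the parameters and sets in your datafile"
        ) from ex
```

Chaining with `from ex` means the original parser message is still available for debugging, while the user sees the friendlier one first.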

@trevorb1
Member

Hi @willu47! I have a question on this issue. I think I have implemented the solution for catching the Amply error. However, I am wondering what general logic we want to implement for when a config file does not match the input data.

If there are inconsistencies when reading in the data (for example, if the config file has a parameter/set defined which is not in the input data, or vice versa), do we still want to read in the data and just print a warning to the user? Or should we halt the conversion altogether and raise an error? The current logic is shown in PR #157.

I heard from a few people that when the OtooleNameMismatchError is raised:

  1. It's a confusing name. I have addressed this in the upcoming PR for this issue.
  2. If they have extra data in a folder of CSVs, otoole will no longer work with their workflow. I guess my thought when I implemented this error message was that we should be checking for perfectly matching data and config files. But maybe this is too strict a requirement and it should be relaxed to only print warnings if the config file and input data do not match? Or maybe we allow users to bypass this error through the use of a --skip_input_check flag (or something similar to this)?
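To make the two options concrete, here is a minimal sketch of a check that is strict by default but can be relaxed to a warning. The names are hypothetical: `NameMismatchError` is a stand-in for the existing error, `check_names` is not an otoole function, and `skip_input_check` mirrors the flag floated above rather than an existing option.

```python
import warnings


class NameMismatchError(Exception):
    """Stand-in for the existing mismatch error; hypothetical name."""


def check_names(config_names, data_names, skip_input_check=False):
    """Compare names in the config with names in the input data.

    Returns True when they match. On a mismatch, raises by default,
    or warns and returns False when skip_input_check is set.
    """
    missing_from_data = sorted(set(config_names) - set(data_names))
    missing_from_config = sorted(set(data_names) - set(config_names))
    if not missing_from_data and not missing_from_config:
        return True
    message = (
        f"In config but not in data: {missing_from_data}; "
        f"in data but not in config: {missing_from_config}"
    )
    if skip_input_check:
        warnings.warn(message)
        return False
    raise NameMismatchError(message)
```

With this shape, a folder of CSVs that contains extra files would only trigger a warning when the flag is passed, while the default still enforces an exact match.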

Do you have any thoughts on these questions? Cause right now I think I am going in circles with what we want to implement haha.
