Workflow6: Workflow4 + Workflow5 + demo_mv #21
Comments
This comment is related to what I wrote earlier in #20. The current usage of the pipeline looks like
Although it's easy to have another script that collects these commands together, using the pipeline this way looks much like not using CWL at all.
Then why do we need to use CWL? The likely reason, as far as I can see (maybe I am wrong), is that the functionality covered by each block we define is too small. That makes this no different from running the commands one by one. As I imagine it, a block in a pipeline should group 5~10 commands together. For example, wget and gunzip should be put in the same block, together with creating the initial file directory.
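To make the granularity idea concrete, here is a minimal Python sketch of one coarse-grained block that creates the directory, downloads a file, and decompresses it in a single step. It is only an illustration, not the project's CWL: the function name `get_data` and its parameters are assumptions, and the standard-library calls stand in for `mkdir`, `wget`, and `gunzip`.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def get_data(url: str, workdir: str) -> Path:
    """Hypothetical block: create the directory, download, then decompress."""
    target_dir = Path(workdir)
    target_dir.mkdir(parents=True, exist_ok=True)  # stands in for mkdir -p

    # Derive the local file name from the URL path, ignoring any query string.
    filename = PurePosixPath(urlparse(url).path).name
    archive = target_dir / filename
    urllib.request.urlretrieve(url, archive)  # stands in for wget

    if archive.suffix != ".gz":
        return archive  # nothing to decompress

    unpacked = archive.with_suffix("")  # drop the trailing .gz
    with gzip.open(archive, "rb") as src, open(unpacked, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stands in for gunzip
    archive.unlink()  # like gunzip, do not keep the compressed file
    return unpacked
```

A caller then sees one step with one input (the URL) and one output (the unpacked file), which is roughly what a single block would expose.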
I still don't understand why we need to create a wrapper for a single Linux command. I agree that it is easy to develop, but in terms of long-term maintenance, a wrapper around only one command adds another layer of complexity and another possible source of bugs (compared with using the original Linux command directly). Does it really help maintainability, or does it hurt it?
I totally agree with that. However, it seems to me that wrapping a single Linux command doesn't always improve flexibility. For
This issue is worth discussing. In my design, the reason why there is only one Linux command per block is simple. Helpful link:
It's a great question, but I don't think there is a perfect or universal answer to it. In our case, though, I do have an opinion, based on projects similar to what we are doing (see the links below). For this project, I think each block should be defined by a somewhat high-level meaning rather than by how it is implemented. My observation comes from other existing projects:
Basic units are like
Probably, we can make use of what we already have. Based on our internal wiki, organism on-boarding has been divided into several steps, and it seems to me that each step would not require more than 5 blocks. For example, the first step, setting up data directories and getting data, can be divided into at least two blocks (set_up_data_directories and get_data), but we probably don't want to implement a block like
Another thought is that each block should be unit-testable and worth testing (https://github.com/ncbi/pgap/tree/master/progs/unit_tests).
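As a rough illustration of what "unit-testable" could mean for a block such as set_up_data_directories, here is a small Python sketch with pytest-style tests. The subdirectory names in `SUBDIRS` and the `organism` parameter are hypothetical placeholders; the real layout would come from the internal wiki.

```python
import tempfile
from pathlib import Path

# Hypothetical layout, used only for illustration.
SUBDIRS = ("raw", "working", "release")


def set_up_data_directories(root: str, organism: str) -> list:
    """Create the per-organism directory tree and return the created paths."""
    base = Path(root) / organism
    created = []
    for name in SUBDIRS:
        directory = base / name
        directory.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(directory)
    return created


def test_creates_expected_tree():
    with tempfile.TemporaryDirectory() as root:
        dirs = set_up_data_directories(root, "demo_organism")
        assert [d.name for d in dirs] == list(SUBDIRS)
        assert all(d.is_dir() for d in dirs)


def test_is_idempotent():
    with tempfile.TemporaryDirectory() as root:
        first = set_up_data_directories(root, "demo_organism")
        second = set_up_data_directories(root, "demo_organism")
        assert first == second
```

A block at this level has behavior worth asserting on (which directories exist afterwards, whether re-running is safe), whereas a wrapper around a single command mostly re-tests the command itself.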
I think I get your point.
Link:
https://github.com/NAL-i5K/CWL_Common-Workflow-Language/tree/dev/demo_workflow6
wget
checksums (Workflow5)
gunzip (Workflow4)
tree
mv
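For readers who want a feel for the checksum and mv steps listed above, here is a hedged Python sketch of them combined into one unit. It is not the CWL in the linked demo: the function names are made up for illustration, and MD5 is an assumption, since the issue does not say which checksum Workflow5 uses.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_and_move(src, expected, dest_dir, algorithm="md5"):
    """Hypothetical step: check the checksum, then move the file into place."""
    actual = file_digest(src, algorithm)
    if actual != expected.lower():
        raise ValueError(f"checksum mismatch for {src}: {actual} != {expected}")
    destination = Path(dest_dir)
    destination.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the final path of the moved file.
    return Path(shutil.move(str(src), str(destination / Path(src).name)))
```

Failing before the move means a corrupted download never reaches the final directory, which is the main reason to keep the two steps together.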