The release candidate (v0.4-rc1) has a small issue that can crop up if you have derived columns that require multiple column variables, plus a merge table. Looper was first merging the columns and then doing the derived column variable expansion. This created bad file paths, because the merged columns are space-delimited strings and those spaces ended up inside the paths.
When a sample has a derived column and is also present in the merge table, there are a couple of different ways looper could proceed. If the derived column uses only a single column variable, there is no problem, but if it uses two or more sample attributes, then the order of merging becomes relevant. Should the derived column first be derived for each row in the merge table individually, and then merged into a space-delimited string? Or should the columns be merged first, and the derived column then be constructed from the merged columns?
The way that makes the most sense to me is that the derived column should be populated for each row in the merge table independently, and then these values should be merged into a space-delimited string. This way, file paths are constructed for each entry in the merge table, which usually corresponds to a file. Then the list of files is concatenated into a single string at the end of the merge step. I can't think of a situation where it makes sense to first merge the columns and then derive new columns.
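To make the difference between the two orders concrete, here is a minimal Python sketch. It is not looper's actual code: the template, the attribute names (`flowcell`, `lane`), and both helper functions are hypothetical, chosen only to illustrate a derived column that uses two attributes.

```python
# Hypothetical illustration only; none of these names come from looper.
# A derived-column template that needs two sample attributes.
TEMPLATE = "/data/{flowcell}/{lane}.fastq"

# Two merge-table rows belonging to the same sample.
merge_rows = [
    {"flowcell": "FC1", "lane": "L001"},
    {"flowcell": "FC2", "lane": "L002"},
]


def merge_then_derive(rows, template):
    """Merge each column into a space-delimited string, then expand once."""
    merged = {key: " ".join(row[key] for row in rows) for key in rows[0]}
    return template.format(**merged)


def derive_then_merge(rows, template):
    """Expand the template for each row, then join the resulting paths."""
    return " ".join(template.format(**row) for row in rows)


bad = merge_then_derive(merge_rows, TEMPLATE)
good = derive_then_merge(merge_rows, TEMPLATE)

print(bad)   # /data/FC1 FC2/L001 L002.fastq
print(good)  # /data/FC1/L001.fastq /data/FC2/L002.fastq
```

With a single-attribute template the two orders can still differ in general, but the merged value is at least a clean list when the template is just the attribute itself; with two attributes, merging first splices spaces into the middle of one malformed path.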
The way around this error in v0.4-rc1 is to include, in the merge table, a column for any derived column that you want populated at the individual row level. As long as the column is included in the merge table, it will be derived individually for each row. If a derived column is not included in the merge table, and is only included in the main sample table, then it is populated from the already merged columns of the merge table, which is what led to the errors.
I have now made a change that solves the problem in both scenarios. Derived columns that are not present in the merge table will now still be derived individually for each row in the merge table, before being merged. Unit test added.