derived columns and merge tables #25

nsheff · 2017-01-25T16:06:59Z

The release candidate version (v0.4-rc1) has a small issue that can crop up if you have derived columns that require multiple column variables, plus a merge table. It was first merging the columns and then doing the derived column variable expansion, which created bad file paths because the merged columns now contained spaces, and those spaces ended up in the middle of the expanded paths.

When a sample has a derived column and is also present in the merge table, there are a couple of different ways looper could proceed. If the derived column uses only a single column variable, there is no problem, but if the derived column uses two sample attributes, then the order of merging becomes relevant. Should the derived column first be derived for each row in the merge table individually, and then merged into a space-delimited string? Or should the columns be merged first, and then the derived column be constructed from the merged columns?

The way that makes the most sense to me is that the derived column should be populated for each row in the merge table independently, and then these columns should be merged into a space-delimited string. This way, file paths are constructed for each entry in the merge table, which usually corresponds to a file. Then the list of files is concatenated into a single string at the end of the merge step. I can't think of a situation where it makes sense to first merge the columns and then derive new columns.
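To make the difference between the two orders concrete, here is a minimal sketch in plain Python. It is not looper's actual code: the template, the attribute names (`flowcell`, `lane`), and both helper functions are hypothetical, chosen only to show why merging first breaks a path built from two attributes.

```python
# Hypothetical path template that uses two sample attributes.
TEMPLATE = "/data/{flowcell}/{lane}.fastq"

# Hypothetical merge-table rows for a single sample (one row per file).
merge_rows = [
    {"flowcell": "FC1", "lane": "L001"},
    {"flowcell": "FC2", "lane": "L002"},
]


def merge_then_derive(rows, template):
    """Join each attribute across rows first, then expand the template once."""
    merged = {key: " ".join(row[key] for row in rows) for key in rows[0]}
    return template.format(**merged)


def derive_then_merge(rows, template):
    """Expand the template for each row, then join the resulting paths."""
    return " ".join(template.format(**row) for row in rows)


bad = merge_then_derive(merge_rows, TEMPLATE)
good = derive_then_merge(merge_rows, TEMPLATE)

print(bad)   # /data/FC1 FC2/L001 L002.fastq
print(good)  # /data/FC1/L001.fastq /data/FC2/L002.fastq
```

With a single attribute in the template, both orders produce the same set of space-separated values, which is why the problem only appears for derived columns that combine two or more attributes.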

The way around this error in v0.4-rc1 is to include, in the merge table, a column for any derived column that you want populated at the individual row level. As long as you include the column in the merge table, it will be derived individually for each row. If it was not included in the merge table, and was only included in the main sample table, then the column would be populated based on the already merged columns from the merge table, which is what led to the errors.

I have now made a change that solves the problem in both scenarios. Derived columns that are not present in the merge table will now still be derived individually for each row in the merge table, before being merged. Unit test added.
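The behavior described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the fix: the function `derive_merged_column`, the column name `data_source`, and the sample and row dictionaries are all hypothetical. The point it shows is that a derived column defined only in the sample table is still expanded once per merge-table row, and the results are joined afterwards.

```python
def derive_merged_column(sample, merge_rows, column):
    """Derive `column` once per merge-table row, then join with spaces.

    `sample` holds the sample-table attributes; each entry of `merge_rows`
    holds one merge-table row. Both are plain dicts in this sketch.
    """
    derived = []
    for row in merge_rows:
        # Row values take precedence; anything the merge table lacks
        # (including the template itself) falls back to the sample table.
        attrs = {**sample, **row}
        template = attrs[column]
        derived.append(template.format(**attrs))
    return " ".join(derived)


# Hypothetical sample: the template lives only in the sample table.
sample = {
    "sample_name": "s1",
    "data_source": "/data/{flowcell}/{sample_name}_{lane}.fastq",
}

# Hypothetical merge-table rows: no `data_source` column here.
rows = [
    {"sample_name": "s1", "flowcell": "FC1", "lane": "L001"},
    {"sample_name": "s1", "flowcell": "FC2", "lane": "L002"},
]

result = derive_merged_column(sample, rows, "data_source")
print(result)  # /data/FC1/s1_L001.fastq /data/FC2/s1_L002.fastq
```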

nsheff added the bug label Jan 25, 2017

nsheff added this to the 0.4 milestone Jan 25, 2017

nsheff closed this as completed in 1121372 Feb 13, 2017
