Problem Scenario
- The client has requested that you perform calculations on their data set.
- Each data entry is made up of a number of columns/attributes, and each entry has a unique ID.
- The data is provided in 3 files. Each file contains information about products, but only a subset of the columns/attributes for each entry; the rest can be found in the other files.
- The files need to be merged into a single file containing all columns before the processing can take place.
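To make the scenario concrete, here is a minimal sketch of how rows that share an ID combine into one complete record. The helper name `merge_records`, the column names, and the sample values are all hypothetical; the real files may use different attributes.

```python
from collections import defaultdict


def merge_records(*sources):
    """Combine rows that share an id into a single record per id."""
    merged = defaultdict(dict)
    for rows in sources:
        for row in rows:
            # Each source contributes only its own subset of columns.
            merged[row["id"]].update(row)
    return dict(merged)


# Hypothetical sample data: three sources, each holding different columns.
names = [{"id": "1", "name": "Widget"}, {"id": "2", "name": "Gadget"}]
prices = [{"id": "1", "price": "9.99"}, {"id": "2", "price": "24.50"}]
stock = [{"id": "1", "stock": "12"}]

merged = merge_records(names, prices, stock)
```

In this sketch an ID that is absent from one source (ID 2 has no stock row) simply ends up without that column, so the later processing step would need to decide how to treat missing values.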
Task
- Write a console program to merge the files provided in a timely and efficient manner. File paths should be supplied as arguments, so that the program can be evaluated on different data sets.
- The merged file should be saved as CSV, using the id column as the unique key for merging.
- The program should do any necessary data cleaning and error checking, and any potential errors should be handled in the code.
- We will evaluate the submission based on code, extensibility, performance, readability, correctness and integrity of the merged files (large files will be used), validity of the output file schema, etc.
- Notes and observations must be submitted along with the files, together with instructions and any dependencies required to run the software (the version of Python, for example, is very important).
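One possible starting point for the task is sketched below, using only the Python standard library. It is not a reference solution. The names `merge_files` and `main`, the `-o/--output` flag, and the default output name `merged.csv` are assumptions made for illustration. The sketch also assumes the inputs are CSV files with a header row and that the merged data fits in memory, which may not hold for the large files used in evaluation.

```python
import argparse
import csv
import sys


def merge_files(paths, out_path, key="id"):
    """Merge CSV files on `key`; return the number of rows written."""
    merged = {}      # key value -> combined row
    columns = [key]  # output schema, in first-seen order
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = [name.strip() for name in next(reader, [])]
            if key not in header:
                raise ValueError(f"{path}: missing '{key}' column")
            for name in header:
                if name not in columns:
                    columns.append(name)
            for values in reader:
                # Cleaning: trim whitespace around every value.
                row = {n: v.strip() for n, v in zip(header, values)}
                row_id = row.get(key, "")
                if not row_id:
                    continue  # cleaning: skip rows without an id
                merged.setdefault(row_id, {}).update(row)
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        # restval fills columns that a given id never received.
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(merged.values())
    return len(merged)


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Merge CSV files on a shared id column.")
    parser.add_argument("inputs", nargs="+", help="input CSV files")
    parser.add_argument("-o", "--output", default="merged.csv",
                        help="path of the merged CSV file")
    args = parser.parse_args(argv)
    try:
        count = merge_files(args.inputs, args.output)
    except (OSError, ValueError, csv.Error) as error:
        print(f"error: {error}", file=sys.stderr)
        return 1
    print(f"wrote {count} rows to {args.output}")
    return 0
```

To run it as a console program, call `sys.exit(main())` from an `if __name__ == "__main__":` guard at the bottom of the script. If the data does not fit in memory, a streaming approach (for example, sorting each file by id and merging the sorted streams) would be worth considering instead of the dictionary used here.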