Also, this is a very basic and naïve implementation of the requirements.
- searches a path and creates an index based on the filename without extension
- assumes filename is unique (so it won't bother dealing with
file.txt
andfile.doc
separately) - finds all the files matching some extensions and treats them as backups
- checks the filestamps and reports back older files
- filetype checks based on header or metadata
- can't handle filename sans extension clashes
- doesn't handle files with multiple extensions (e.g. tar.gz)
Basic unittest
testing is included (run-tests.py
), it will also handle unpacking the mock data tarfile.
Main client is in the run.py
file and usage looks like this:
./run-tests.py
... tests run ...
./run.py -p ./mock-data
./mock-data/backups/test.bak may be superseded by ./mock-data/new/test.xls
.
.
...
To see exactly the train of thought and how the data looks like run the prototype.py
file.
The Iterator does a few things:
- Scans the path passed to the constructor
- Builds a
dictionary
data structure to hold the file names, principals and backup files. - Finding the older backup files is done by going through the structure above and looking for files with a
mtime
older than what the principal has. - Final data structure is a
list
oftuples
having the form of[(principal1, backup1), (principal1, backup2), (principal2, backup)]
and so on and so forth. - The iterator results can either be stored into a list via a comprehension and simply iterated through and worked on.
Thanks for looking!