This repo holds all our data sets, code, and text for the Tech Tank project. Once everything is in place, we can package it for delivery and upload it to the Tech Tank GitHub along with our deck. The deliverables are:
- [Problem text](https://github.com/jgendron/boozonians/blob/master/1. Problem Text/Problem Text.md) – Text file (.txt) that provides background information, the specific problem, a description of the data, and the format for question submission. This is the problem that will be presented to participants.
- [Participant data](https://github.com/jgendron/boozonians/blob/master/2. Participant Data/) – Two files that will be provided to the participants:
  - [Training data](https://github.com/jgendron/boozonians/blob/master/2. Participant Data/alien_train.csv) – Data points that have all the features for each observation, plus the value of the target variable (the quantity you are trying to predict).
  - [Test data](https://github.com/jgendron/boozonians/blob/master/2. Participant Data/alien_test.csv) – Data points that have all the features for each observation, but no value for the target variable.
- [Answer key](https://github.com/jgendron/boozonians/blob/master/3. Truth Data/alien_answer_key.txt) – The same data points contained in the test data, but with the value of the target variable. This is often called the truth set. This file will be used internally by the Data Science Expedition platform to evaluate participant submissions.
- [Scoring script](https://github.com/jgendron/boozonians/blob/master/4. Scoring Script/) – A Python script that takes a participant-submitted set of predictions, compares them against the answer key, and returns a quantitative accuracy score. This file is used internally by the Data Science Expedition platform.
- [Configuration file](https://github.com/jgendron/boozonians/blob/master/5. Configuration File/Configuration File.md/) – A text file that contains basic information about the question, including category, relevant file paths, total points, and hints. This is the least clearly specified deliverable. Our working interpretation is that it summarizes our submission (which file is which) and lists hints for participants who get stuck, along with the penalty associated with each hint.
- [Solution approach](https://github.com/jgendron/boozonians/blob/master/6. Solution Approach/) – A predictive model that can be used to create a submission. This can be written in any language (R, Python, Excel, etc.) using any modeling approach appropriate to the problem. The file is not used by the Data Science Expedition platform or provided to participants. It serves as proof that the question can be solved using the provided data.
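
To make the scoring script deliverable more concrete, here is a minimal sketch of how such a script might compare a submission against the answer key. It is not the script in this repo. It assumes that both files are plain text with one predicted label per line in the same order, and that "accuracy" means the fraction of exact matches. The function names `read_labels`, `accuracy_score`, and `score_submission` are hypothetical, and the platform's real input format may differ.

```python
from pathlib import Path


def read_labels(path):
    """Read one label per line, ignoring blank lines and surrounding whitespace.

    Assumption: the file holds a single label per line, with no header row.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def accuracy_score(predictions, answers):
    """Return the fraction of predictions that exactly match the answer key."""
    if not answers:
        raise ValueError("answer key is empty")
    if len(predictions) != len(answers):
        raise ValueError(
            f"expected {len(answers)} predictions, got {len(predictions)}"
        )
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def score_submission(submission_path, answer_key_path):
    """Score a submission file against an answer key file (paths are hypothetical)."""
    return accuracy_score(read_labels(submission_path), read_labels(answer_key_path))
```

Rejecting a submission whose length differs from the answer key, rather than scoring the overlap, is a design choice made here so that a misaligned file fails loudly instead of receiving a misleading score.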