A data analysis completed for the Data Analysis class offered by Johns Hopkins University on [Coursera][1], performed using the [R language][2].
The data for this assignment are the Samsung activity data available from the course website:
https://spark-public.s3.amazonaws.com/dataanalysis/samsungData.rda
These data are slightly processed to make them easier to load into R. You can also find the raw data here:
http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
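As a minimal sketch of loading the processed file, the helper below reads an `.rda` file into its own environment so you can check which objects it contains before relying on any name. The helper name `load_rda` and the local file name `samsungData.rda` are assumptions for illustration, not part of the assignment.

```r
# Load an .rda file into a fresh environment and return that environment,
# so the stored objects can be inspected without assuming their names.
# (load_rda is a hypothetical helper, not something the course provides.)
load_rda <- function(path) {
  env <- new.env()
  load(path, envir = env)
  env
}

# Usage, assuming the file is downloaded to the working directory:
# download.file(
#   "https://spark-public.s3.amazonaws.com/dataanalysis/samsungData.rda",
#   destfile = "samsungData.rda", mode = "wb"
# )
# env <- load_rda("samsungData.rda")
# ls(env)  # names of the objects stored in the file
```

Loading into a separate environment avoids silently overwriting objects in the global workspace.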
Each column of the data set (except the last two) represents one measurement from the Samsung phone. The variable subject indicates which subject was performing the tasks when the measurements were taken. The variable activity tells what activity they were performing.
Your task is to build a function that predicts what activity a subject is performing based on the quantitative measurements from the Samsung phone. For this analysis your training set must include the data from subjects 1, 3, 5, and 6, but you may use more subjects' data to train if you wish. Your test set is the data from subjects 27, 28, 29, and 30, but you may use more data to test. Be careful that your training and test sets do not overlap.
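The split described above could be sketched as follows. This assumes a data frame with a `subject` column, as the variable description states. The function name `split_by_subject` and the toy data are hypothetical and stand in for the real data set.

```r
# Split a data frame into training and test sets by subject ID,
# refusing to proceed if any subject appears in both sets.
# (split_by_subject is a hypothetical helper for illustration.)
split_by_subject <- function(data, train_subjects, test_subjects) {
  overlap <- intersect(train_subjects, test_subjects)
  if (length(overlap) > 0) {
    stop("Subjects in both sets: ", paste(overlap, collapse = ", "))
  }
  list(
    train = data[data$subject %in% train_subjects, , drop = FALSE],
    test  = data[data$subject %in% test_subjects, , drop = FALSE]
  )
}

# Toy stand-in for the real data: one made-up measurement column,
# followed by the subject and activity columns.
toy <- data.frame(
  feature  = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
  subject  = c(1, 3, 5, 6, 27, 30),
  activity = c("walk", "sit", "walk", "laying", "sit", "walk")
)

# Required minimum sets from the assignment text.
sets <- split_by_subject(toy, c(1, 3, 5, 6), c(27, 28, 29, 30))
nrow(sets$train)
nrow(sets$test)
```

Splitting by subject rather than by row keeps every measurement from a given person on one side of the split, which is what the no-overlap requirement asks for.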
You should perform all of the steps in building a predictive model and describe your analysis in a report as explained below.
Your data analysis submission will consist of the following components:
- The main text of your document, including a numbered list of references. This can be uploaded either as a PDF document or typed into the text box (not both!). The limit for the text and references is 2000 words. Your main text should be written in the form of an essay with introduction, methods, results, and conclusions sections.
- One figure for your data analysis uploaded as a .png, .jpg, or .pdf file, along with a figure caption of up to 500 words.