##Getting and Cleaning Data Project (Johns Hopkins)

##Description

This code book describes the variables, data, and transformations used in the course project for the Getting and Cleaning Data course.

##Source Data

A full description of the data used in this project can be found at the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones).

The source data for this project can be downloaded from: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

##Data Set Information

The experiments have been carried out with a group of 30 volunteers within an age bracket of 19-48 years. Each person performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) wearing a smartphone (Samsung Galaxy S II) on the waist. Using its embedded accelerometer and gyroscope, we captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50Hz. The experiments have been video-recorded to label the data manually. The obtained dataset has been randomly partitioned into two sets, where 70% of the volunteers were selected for generating the training data and 30% for the test data.

The sensor signals (accelerometer and gyroscope) were pre-processed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 sec and 50% overlap (128 readings/window). The sensor acceleration signal, which has gravitational and body motion components, was separated using a Butterworth low-pass filter into body acceleration and gravity. The gravitational force is assumed to have only low frequency components, therefore a filter with 0.3 Hz cutoff frequency was used. From each window, a vector of features was obtained by calculating variables from the time and frequency domain.
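The window arithmetic above can be checked with a short sketch: 2.56 sec at 50 Hz gives 128 readings per window, and a 50% overlap means each window starts 64 readings after the previous one. The Python below is illustrative only. The function name and the stand-in signal are assumptions, and it is not the preprocessing code used by the dataset authors.

```python
SAMPLE_RATE_HZ = 50      # sampling rate stated in the dataset description
WINDOW_SECONDS = 2.56    # window length stated in the dataset description
OVERLAP = 0.5            # 50% overlap between consecutive windows


def sliding_windows(signal, rate_hz=SAMPLE_RATE_HZ,
                    seconds=WINDOW_SECONDS, overlap=OVERLAP):
    """Split a 1-D sequence into fixed-width, overlapping windows."""
    width = round(rate_hz * seconds)       # 50 * 2.56 = 128 readings
    step = round(width * (1 - overlap))    # advance 64 readings each time
    last_start = len(signal) - width
    return [signal[i:i + width] for i in range(0, last_start + 1, step)]


# Made-up stand-in for one sensor axis: 320 consecutive readings.
readings = list(range(320))
windows = sliding_windows(readings)
```

With 320 readings, windows start at readings 0, 64, 128 and 192, so four full windows are produced and any trailing partial window is dropped.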

##Attribute Information

For each record in the dataset, the following is provided:

- Triaxial acceleration from the accelerometer (total acceleration) and the estimated body acceleration.
- Triaxial angular velocity from the gyroscope.
- A 561-feature vector with time and frequency domain variables.
- Its activity label.
- An identifier of the subject who carried out the experiment.

##1. Merge the training and the test sets to create one data set.

After setting the source directory for the files, read the following data into tables:

- features.txt
- activity_labels.txt
- subject_train.txt
- X_train.txt
- y_train.txt
- subject_test.txt
- X_test.txt
- y_test.txt

Assign column names and merge to create one data set.
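As a rough illustration of this step, the sketch below column-binds the subject, activity and feature tables for each partition and then stacks the training rows on top of the test rows. It is written in Python with small in-memory tables rather than the real files. The helper name, the column names `subject_id` and `activity_id`, and all numeric values are assumptions made for the example, not the project's actual script.

```python
def bind_columns(subjects, activities, features):
    """Column-bind three tables: one row = [subject, activity, features...]."""
    if not (len(subjects) == len(activities) == len(features)):
        raise ValueError("the three tables must have the same number of rows")
    return [[s, a, *x] for s, a, x in zip(subjects, activities, features)]


# Three feature names as they appear in features.txt; the values are made up.
feature_names = ["tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X"]

train = bind_columns(
    subjects=[1, 1],
    activities=[5, 4],
    features=[[0.28, -0.99, -0.93], [0.27, -0.98, -0.94]],
)
test = bind_columns(
    subjects=[2],
    activities=[5],
    features=[[0.25, -0.93, -0.89]],
)

# Column names: the two ID columns (hypothetical names) plus the features.
columns = ["subject_id", "activity_id", *feature_names]

# Row-bind the two partitions into one data set.
merged = train + test
```

Binding the columns within each partition first keeps every measurement row attached to its own subject and activity before the two partitions are combined.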

##2. Extract only the measurements on the mean and standard deviation for each measurement.

Create a logical vector that contains TRUE values for the ID, mean and standard deviation columns and FALSE values for the others. Subset the data to keep only the necessary columns.
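The logical-vector approach can be sketched as follows, here in Python with a boolean list standing in for the logical vector. The ID column names and the regular expression are assumptions. In particular, this pattern matches only names containing `-mean()` or `-std()`, so it would exclude variables such as `meanFreq()`; whether the project keeps those is not stated here.

```python
import re

columns = ["subject_id", "activity_id",
           "tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X"]

# Assumed rule: keep the two ID columns plus any -mean() or -std() feature.
KEEP = re.compile(r"subject_id|activity_id|-mean\(\)|-std\(\)")

# Boolean mask, one entry per column (the "logical vector").
keep_mask = [bool(KEEP.search(name)) for name in columns]


def subset_columns(values, mask):
    """Keep only the positions where the mask is True."""
    return [v for v, keep in zip(values, mask) if keep]


row = [1, 5, 0.28, -0.99, -0.93]          # made-up record
kept_names = subset_columns(columns, keep_mask)
kept_row = subset_columns(row, keep_mask)
```

The same mask is applied to the header and to every data row, so names and values stay aligned after subsetting.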

##3. Use descriptive activity names to name the activities in the data set

Merge the data subset with the impactivitylabel table to include the descriptive activity names.
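Conceptually, this merge is a lookup from the numeric activity code to its name. A minimal Python sketch, assuming the activity code sits in the second column of each row (the function name, column position and data values are illustrative assumptions):

```python
# Code-to-name mapping as listed in activity_labels.txt.
activity_labels = {
    1: "WALKING",
    2: "WALKING_UPSTAIRS",
    3: "WALKING_DOWNSTAIRS",
    4: "SITTING",
    5: "STANDING",
    6: "LAYING",
}


def add_activity_name(rows, labels, activity_col=1):
    """Append the descriptive activity name to each row."""
    return [row + [labels[row[activity_col]]] for row in rows]


# Made-up rows: [subject, activity code, two feature values].
subset = [[1, 5, 0.28, -0.99], [2, 4, 0.25, -0.93]]
labelled = add_activity_name(subset, activity_labels)
```

Using a lookup keyed on the activity code preserves the original row order, which a general-purpose merge does not always guarantee.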

##4. Appropriately label the data set with descriptive activity names.

Use the gsub function for pattern replacement to clean up the data labels.
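The specific patterns used for the cleanup are not listed here, so the following is only a sketch of the general technique: an ordered list of pattern/replacement pairs applied to each label. It uses Python's `re.sub`, which plays the same role as `gsub`. Every pattern and replacement below is a hypothetical choice for illustration.

```python
import re

# Hypothetical cleanup rules, applied in order to each label.
RENAMES = [
    (r"\(\)", ""),        # drop the empty parentheses
    (r"^t", "time"),      # leading t  -> time domain
    (r"^f", "freq"),      # leading f  -> frequency domain
    (r"-mean", "Mean"),
    (r"-std", "StdDev"),
    (r"-", ""),           # remove any remaining hyphens
]


def clean_label(name):
    """Apply each pattern replacement in turn, like chained gsub calls."""
    for pattern, replacement in RENAMES:
        name = re.sub(pattern, replacement, name)
    return name


cleaned = [clean_label(n) for n in ["tBodyAcc-mean()-X", "fBodyGyro-std()-Z"]]
# cleaned == ["timeBodyAccMeanX", "freqBodyGyroStdDevZ"]
```

The order of the rules matters: the anchored prefix rules run before the generic hyphen removal so that each rule sees the label in a predictable form.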

##5. Create a second, independent tidy data set with the average of each variable for each activity and each subject.

Following the project instructions, produce a data set with the average of each variable for each activity and subject.
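The averaging step amounts to grouping rows by (activity, subject) and taking the mean of every remaining column. A minimal Python sketch, assuming the two grouping keys come first in each row (the function name, column layout and data values are assumptions for illustration):

```python
from collections import defaultdict
from statistics import mean


def average_by_group(rows, n_keys=2):
    """Average every measurement column within each (activity, subject) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[:n_keys])].append(row[n_keys:])
    return [
        [*key, *(mean(column) for column in zip(*values))]
        for key, values in sorted(groups.items())
    ]


# Made-up rows: [activity, subject, two measurement columns].
rows = [
    ["STANDING", 1, 0.5, -1.0],
    ["STANDING", 1, 1.5, -0.5],
    ["SITTING", 2, 0.25, -0.75],
]
tidy = average_by_group(rows)
# tidy == [["SITTING", 2, 0.25, -0.75], ["STANDING", 1, 1.0, -0.75]]
```

Each activity/subject pair yields exactly one row, so with 30 subjects and six activities the full data set would contain at most 180 rows.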