The source data was collected from the 3-axial signals of a smartphone accelerometer and gyroscope.
The raw data was preprocessed into a vector of 561 features, along with vectors containing the
activities and the test subject IDs.
The zipped source data contains a `features_info.txt` file with additional metadata.
The raw dataset contains 68 features. 66 of those were filtered from the source data using a regular expression on the
features from the `features.txt` file contained in the source data. The expression `-(std|mean)\\(\\)` reduced the
feature list to the 66 features containing one of the strings `-std()` or `-mean()` in their labels. These are the features
that contain the standard deviation and mean measurements in the source data.
The other 2 columns contain the activity identifiers (`activity`) and the subject IDs (`subject`).
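The filtering step can be sketched in Python as follows. This is only an illustration, not the script used to build the dataset: it assumes the doubled backslashes in the expression are string-literal escaping, so the regex itself is `-(std|mean)\(\)`. The function name `select_features` and the sample labels are hypothetical.

```python
import re

# The expression from the text as a Python raw string.
# Assumption: the doubled backslashes in the original are string escaping.
PATTERN = re.compile(r"-(std|mean)\(\)")


def select_features(labels):
    """Return the labels that contain -std() or -mean()."""
    return [label for label in labels if PATTERN.search(label)]


# Illustrative sample in the style of features.txt, not the full list.
sample = [
    "tBodyAcc-mean()-X",
    "tBodyAcc-std()-Y",
    "tBodyAcc-mad()-X",
    "fBodyAcc-meanFreq()-X",
    "angle(tBodyAccMean,gravity)",
]

selected = select_features(sample)
print(selected)
```

Because the expression requires `()` directly after `std` or `mean`, a label such as `fBodyAcc-meanFreq()-X` is not selected.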
The merged data contains 7314 rows from the training dataset and 2909 rows from the test dataset.
The tidy data contains the mean values from the raw dataset, calculated per subject and activity.
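As a rough sketch of the aggregation into tidy data, the snippet below groups rows by subject and activity and averages each feature. It uses only the Python standard library. The function name `tidy_means` and the sample rows are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def tidy_means(rows, feature_names):
    """Average every feature per (subject, activity) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["subject"], row["activity"])].append(row)

    tidy = []
    for (subject, activity), members in sorted(groups.items()):
        record = {"subject": subject, "activity": activity}
        for name in feature_names:
            record[name] = mean(m[name] for m in members)
        tidy.append(record)
    return tidy


# Hypothetical rows; the real raw dataset has 66 feature columns.
rows = [
    {"subject": 1, "activity": "WALKING", "tBodyAcc-mean()-X": 0.25},
    {"subject": 1, "activity": "WALKING", "tBodyAcc-mean()-X": 0.75},
    {"subject": 2, "activity": "SITTING", "tBodyAcc-mean()-X": 0.5},
]

tidy = tidy_means(rows, ["tBodyAcc-mean()-X"])
for record in tidy:
    print(record)
```

Each subject and activity combination ends up as exactly one row in the result.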
The labels were modified as follows:
- Replaced `std()` with `Standard`
- Replaced `mean()` with `Mean`
- Dashes are removed
- Replaced `BodyBody` with `Body`
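The renaming rules above can be expressed as a short Python function. This is a sketch under the assumption that the replacements are applied in the order listed; the function name `tidy_label` is hypothetical.

```python
def tidy_label(label):
    """Apply the four label rules in the order they are listed."""
    label = label.replace("std()", "Standard")
    label = label.replace("mean()", "Mean")
    label = label.replace("-", "")            # remove dashes
    label = label.replace("BodyBody", "Body")
    return label


print(tidy_label("tBodyAcc-mean()-X"))
print(tidy_label("tGravityAcc-std()-Z"))
```

Labels without any of these patterns, such as `activity` and `subject`, pass through unchanged.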
Raw data labels | Tidy data labels |
---|---|
`activity` | `activity` |
`subject` | `subject` |
`tBodyAcc-mean()-X` | `tBodyAccMeanX` |
`tBodyAcc-mean()-Y` | `tBodyAccMeanY` |
`tBodyAcc-mean()-Z` | `tBodyAccMeanZ` |
`tBodyAcc-std()-X` | `tBodyAccStandardX` |
`tBodyAcc-std()-Y` | `tBodyAccStandardY` |
`tBodyAcc-std()-Z` | `tBodyAccStandardZ` |
`tGravityAcc-mean()-X` | `tGravityAccMeanX` |
`tGravityAcc-mean()-Y` | `tGravityAccMeanY` |
`tGravityAcc-mean()-Z` | `tGravityAccMeanZ` |
`tGravityAcc-std()-X` | `tGravityAccStandardX` |
`tGravityAcc-std()-Y` | `tGravityAccStandardY` |
`tGravityAcc-std()-Z` | `tGravityAccStandardZ` |
`tBodyAccJerk-mean()-X` | `tBodyAccJerkMeanX` |