
Calling libraries

library(dplyr)

## Warning: package 'dplyr' was built under R version 3.3.2

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)

## Warning: package 'tidyr' was built under R version 3.3.2

Downloading and unzipping files

fileurl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip"

download.file(fileurl,destfile = "./UCI HAR Dataset.zip", mode = "wb")

unzip("./UCI HAR Dataset.zip")

Reading files into tables

X_test <- read.table("./UCI HAR Dataset/test/X_test.txt")
X_train <- read.table("./UCI HAR Dataset/train/X_train.txt")

Y_test <- read.table("./UCI HAR Dataset/test/Y_test.txt")
Y_train <- read.table("./UCI HAR Dataset/train/Y_train.txt")

subject_test <- read.table("./UCI HAR Dataset/test/subject_test.txt")
subject_train <- read.table("./UCI HAR Dataset/train/subject_train.txt")

features <- read.table("./UCI HAR Dataset/features.txt")
activity_labels <- read.table("./UCI HAR Dataset/activity_labels.txt")

Combining train and test tables of each content set. Converting to table format

X <- rbind(X_test, X_train);                    X <- tbl_df(X)
Y <- rbind(Y_test, Y_train);                    Y <- tbl_df(Y)

subject <- rbind(subject_test, subject_train);  subject <- tbl_df(subject)

features <- tbl_df(features)
activity_labels <- tbl_df(activity_labels)

Removing redundant tables

rm(X_test, X_train, Y_test, Y_train, subject_test, subject_train)

Setting variable names to the feature descriptions

column_names <- sapply(features$V2, as.character)
names(X) <- column_names

Extracting only the measurements on the mean and standard deviation for each measurement.

subset_X <- X[,grep("mean|std", names(X))]
subset_X <- subset_X[,-grep("meanFreq|Jerk|Mag|^f",names(subset_X))]

Here we filter out measures that are derived rather than raw measurements, such as Jerk, Magnitude, Fourier Transformation, and Mean Frequency.
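
To see what the two filtering steps keep and drop, here is a minimal sketch on a handful of feature names. The vector sample_names is a hypothetical hand-written sample in the style of features.txt, not data read from the dataset, and the sketch uses grepl() instead of grep() for the selection.

```r
# Hypothetical sample of feature names in the style of features.txt
sample_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-Y",
                  "tBodyAcc-max()-X", "tBodyAccJerk-mean()-X",
                  "tBodyAccMag-mean()", "fBodyAcc-mean()-X",
                  "fBodyAcc-meanFreq()-X", "tGravityAcc-std()-Z")

# Step 1: keep only names that mention mean or std
kept <- sample_names[grepl("mean|std", sample_names)]

# Step 2: drop derived measures
# (mean frequency, jerk, magnitude, and frequency-domain names starting with "f")
kept <- kept[!grepl("meanFreq|Jerk|Mag|^f", kept)]

print(kept)
# [1] "tBodyAcc-mean()-X"   "tBodyAcc-std()-Y"    "tGravityAcc-std()-Z"
```

Using !grepl() rather than -grep() is a deliberate choice in this sketch: negative indexing with an empty grep() result would select nothing at all, whereas a logical mask keeps everything when no name matches.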

Making variable names readable

names(subset_X) <- sub("Acc", "Acceleration", names(subset_X))
names(subset_X) <- sub("Gyro", "Gyroscope", names(subset_X))
names(subset_X) <- sub("^t", "Time", names(subset_X))
names(subset_X) <- sub("-", "Signal", names(subset_X))
names(subset_X) <- sub("\\(\\)", "", names(subset_X))
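
As an illustration of what these substitutions produce, the sketch below applies the same sequence to a single feature name. It is only a walkthrough on one string; the final step uses fixed = TRUE to match the literal "()", which is an assumption of this sketch rather than the call used above.

```r
# One raw feature name, as it appears in features.txt
name <- "tBodyAcc-mean()-X"

name <- sub("Acc", "Acceleration", name)   # "tBodyAcceleration-mean()-X"
name <- sub("Gyro", "Gyroscope", name)     # no match, unchanged
name <- sub("^t", "Time", name)            # "TimeBodyAcceleration-mean()-X"
name <- sub("-", "Signal", name)           # sub() replaces only the first hyphen
name <- sub("()", "", name, fixed = TRUE)  # drop the literal "()"

print(name)
# [1] "TimeBodyAccelerationSignalmean-X"
```

Because sub() replaces only the first match, the hyphen before the axis letter is left in place; gsub() would replace every hyphen instead.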

Replacing activity codes with activity names

Y <- Y %>% inner_join(activity_labels, by = "V1") %>%
        transmute(activity = V2)
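
The join above can be checked on a miniature example. The data frames Y_small and labels_small below are hypothetical stand-ins for Y and activity_labels, built by hand for illustration.

```r
library(dplyr)

# Hypothetical miniature versions of Y and activity_labels
Y_small <- data.frame(V1 = c(2, 1, 2, 3))
labels_small <- data.frame(V1 = c(1, 2, 3),
                           V2 = c("WALKING", "WALKING_UPSTAIRS",
                                  "WALKING_DOWNSTAIRS"),
                           stringsAsFactors = FALSE)

# Look up the label for each code, then keep only the label column
activity_small <- Y_small %>%
        inner_join(labels_small, by = "V1") %>%
        transmute(activity = V2)

print(activity_small$activity)
# [1] "WALKING_UPSTAIRS"   "WALKING"            "WALKING_UPSTAIRS"
# [4] "WALKING_DOWNSTAIRS"
```

The rows come back in the order of the left-hand table, which matters here because the activity column is later bound side by side with the subject and measurement tables. Base merge() sorts by the key by default and would break that row alignment.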

Renaming the subject variable from "V1" to "subject"

names(subject) <- "subject"

Combining subject, subset_X, and Y tables

df <- subject %>% cbind(Y) %>% cbind(subset_X) %>% tbl_df()

Making data narrow to perform averaging across subject, activity, and measure

tidy <- df %>% gather(measure, value, -subject, -activity)
tidy <- tidy %>% group_by(subject, activity, measure) %>%
        summarize(meanvalue = mean(value))

Converting the data back to wide format, where each column holds the mean of a different measurement

final_table <- tidy %>% spread(measure, meanvalue)
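
The narrow-then-wide round trip can be sketched on toy data. The data frame toy and its columns m1 and m2 are hypothetical and exist only for this illustration; the measurement columns are selected by excluding subject and activity rather than by position.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two subjects, one activity, two measurements
toy <- data.frame(subject  = c(1, 1, 2, 2),
                  activity = c("WALKING", "WALKING", "WALKING", "WALKING"),
                  m1 = c(1, 3, 5, 7),
                  m2 = c(10, 20, 30, 40),
                  stringsAsFactors = FALSE)

toy_means <- toy %>%
        gather(measure, value, -subject, -activity) %>%  # wide -> narrow
        group_by(subject, activity, measure) %>%
        summarize(meanvalue = mean(value)) %>%           # one mean per group
        ungroup() %>%
        spread(measure, meanvalue)                       # narrow -> wide

print(toy_means)
```

Each subject and activity pair ends up as one row, with one column of means per measurement: subject 1 has m1 = 2 and m2 = 15, subject 2 has m1 = 6 and m2 = 35. In newer versions of tidyr, pivot_longer() and pivot_wider() are the recommended replacements for gather() and spread().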

Hayk86/Getting_and_Cleaning_Data

Folders and files

Latest commit

History

Repository files navigation

Calling libraries

Downloading and unzipping files

Reading files into tables

Combining train and test tables of each content set. Converting to table format

Removing redundant tables

Setting Variable names to features' descriptions

Extracting only the measurements on the mean and standard deviation for each measurement.

Making variable names readable

Using activity Names

Subject variable name change from "V1" to "subject"

Combining subject, subset_X, and Y tables

Making data narrow to perform averaging across subject, activity, and measure

Reverting data back to the wide format when in each column there is a different measurement mean

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages