forked from DataScienceSpecialization/courses
Commit
Pete Jarvis
committed
Aug 23, 2015
1 parent
21a70b7
commit c593dfd
Showing
5 changed files
with
491 additions
and
0 deletions.
153 changes: 153 additions & 0 deletions
153
09_DevelopingDataProducts/00CourseWork/StormDatabase/ReadMe.html
Large diffs are not rendered by default.
62 changes: 62 additions & 0 deletions
62
09_DevelopingDataProducts/00CourseWork/StormDatabase/ReadMe.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
--- | ||
title: "Readme" | ||
author: "Technophobe1" | ||
date: "August 18, 2015" | ||
output: html_document | ||
--- | ||
|
||
Developing Data Products - Course Project | ||
========================= | ||
This is a submission for Coursera: Developing Data Products - Course Project | ||
|
||
## Synopsis | ||
Our aim is to describe the impact of severe weather events in the United States between **1950** and **2011**. This analysis is intended to address the following questions: | ||
|
||
- Across the United States, which types of events (as indicated in the eventType variable) are most harmful with respect to population health? | ||
- Across the United States, which types of events have the greatest economic consequences? | ||
|
||
To investigate and answer these questions and validate our hypothesis, we obtained the Storm and Weather Events database from the National Oceanic and Atmospheric Administration ([NOAA][1]), which is compiled from various sources across the U.S. We specifically obtained data for the years 1950 to 2011. | ||
|
||
### Usage | ||
|
||
This Shiny application allows the viewer to review and manipulate the NOAA data in a number of ways. You may adjust the year using the control panel on the left side. The result is shown in the main panel on the right side of the page. | ||
|
||
- Bubble Chart - **Economic Impact by Year** | ||
- Bubble Chart - **Population Impact by Year** | ||
- Geographic Map - **Economic Geographic Impact by State** | ||
- You can view Property Damage, Crop Damage or both forms of event damage by **state** and **year**. | ||
|
||
### Data, Code and Presentation | ||
|
||
- Source code for the project is available on [GitHub][5]. | ||
- The Shiny application is available on [Shinyapps.io][7]. | ||
- The presentation is available on [RPubs][8]. | ||
- The dataset can be downloaded from [NOAA][2]. | ||
|
||
Note: Additional documentation is available from NOAA that explains how the data was obtained and how the variables are defined and constructed. | ||
|
||
- [NOAA Event Database Website][2] | ||
- [National Weather Service Storm Data Documentation][3] | ||
- [National Climatic Data Center Storm Events FAQ][4] | ||
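As a rough illustration of working with the compressed data, the sketch below writes a tiny bzip2-compressed CSV to a temporary file and reads it back with `read.csv`, which decompresses `.bz2` input transparently. The helper name `loadStormData`, the toy rows and the temporary file are assumptions made for illustration; the real dataset has many more columns and rows.

```r
# Minimal sketch (assumed names): read a bzip2-compressed CSV in base R.
loadStormData <- function(path) {
  # read.csv() opens the path via file(), which detects bzip2 compression
  read.csv(path, stringsAsFactors = FALSE)
}

# Toy stand-in for the real file; column names follow server.R
toy <- data.frame(
  eventType  = c("TORNADO", "FLOOD"),
  FATALITIES = c(3, 1),
  INJURIES   = c(12, 0)
)

# Write the toy data through a bzip2 connection to a temporary file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

stormSample <- loadStormData(tmp)
print(stormSample)
```

The same call pattern applies to a locally downloaded copy of the dataset, given its path.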
|
||
# Conclusions: | ||
|
||
The events most harmful to property based on the data are Floods, Hurricanes and Tornadoes. Floods and Hurricanes, whilst occurring at lower incidence, have tremendous capacity to inflict large-scale economic damage on property. The events most harmful to health based on the data and review period are Tornadoes, Heat and Wind Storms. Tornadoes and Heat are by far the most impactful on human health. It is worth noting that the events most harmful to health and property in the US principally occur in the central and mid states based on the data and review period analysed. | ||
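A ranking of this kind can be sketched as follows. This is only an illustration on made-up numbers, so it does not reproduce the conclusions above. It uses base R `aggregate` instead of the dplyr pipeline in `server.R` so that it runs without extra packages. The column names follow `server.R`, while `toyStorms`, `byEvent`, `byDamage` and `byHealth` are hypothetical names.

```r
# Minimal sketch with made-up numbers; column names follow server.R.
toyStorms <- data.frame(
  eventType    = c("FLOOD", "TORNADO", "FLOOD", "HEAT", "TORNADO"),
  PROPDMG      = c(5, 2, 1, 0, 3),
  propExponent = c(1e6, 1e6, 1e9, 1, 1e3),
  CROPDMG      = c(1, 0, 2, 4, 0),
  cropExponent = c(1e6, 1, 1e6, 1e3, 1),
  FATALITIES   = c(2, 10, 0, 25, 4),
  INJURIES     = c(5, 40, 1, 30, 6)
)

# Dollar damage and number of people affected for each record
toyStorms$totalDamage <- toyStorms$PROPDMG * toyStorms$propExponent +
  toyStorms$CROPDMG * toyStorms$cropExponent
toyStorms$popImpacted <- toyStorms$FATALITIES + toyStorms$INJURIES

# Sum per event type, then sort from most to least harmful
byEvent <- aggregate(cbind(totalDamage, popImpacted) ~ eventType,
                     data = toyStorms, FUN = sum)
byDamage <- byEvent[order(-byEvent$totalDamage), ]
byHealth <- byEvent[order(-byEvent$popImpacted), ]

print(byDamage)
print(byHealth)
```

Sorting the same summary by two different columns keeps the economic and the health rankings consistent with each other, since both are derived from one aggregation.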
|
||
# References | ||
|
||
- [NOAA][1] | ||
- *R in Action* | ||
- By: Robert Kabacoff Publisher: Manning Publications Pub. Date: August 24, 2011, ISBN-10: 1-935182-39-0 | ||
- *Mathematical Statistics with Resampling and R* | ||
- By: Laura Chihara; Tim Hesterberg Publisher: John Wiley & Sons Pub. Date: September 6, 2011 Print ISBN: 978-1-11-02985-5 | ||
|
||
|
||
[1]: http://www.noaa.gov/ | ||
[2]: http://www.ncdc.noaa.gov/stormevents/details.jsp?type=collection | ||
[3]: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf | ||
[4]: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf | ||
[5]: https://github.com/Technophobe01/courses/tree/master/09_DevelopingDataProducts/00CourseWork/StormDatabase | ||
[6]: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 "Storm Data" | ||
[7]: https://technophobe01.shinyapps.io/StormDatabase | ||
[8]: https://rpubs.com/Technophobe01/StormDatabase |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# UI for full dashboard | ||
|
||
library(shiny) | ||
|
||
shinyUI(pageWithSidebar( | ||
headerPanel("Analysis of [NOAA] Storm Database and Weather Events"), | ||
|
||
sidebarPanel( | ||
helpText( | ||
"Data derived from the National Oceanic and Atmospheric Administration ([NOAA])" | ||
), | ||
|
||
conditionalPanel( | ||
condition = "input.theTabs == '3Tab' ", | ||
h3('Economic Impact'), | ||
sliderInput( | ||
"EconImpactSliderYear", "Select a year:",min = 1950, max = 2011, step = 1, value = 2000, animate = TRUE, sep = "" | ||
) | ||
), | ||
|
||
conditionalPanel( | ||
condition = "input.theTabs == '4Tab' ", | ||
h3('Population Impact'), | ||
sliderInput( | ||
"EventImpactSliderYear", "Select a year:",min = 1950, max = 2011, step = 1, value = 2000, animate = TRUE, sep = "" | ||
) | ||
), | ||
|
||
conditionalPanel( | ||
condition = "input.theTabs == '5Tab' ", | ||
h3('Geographic Impact'), | ||
sliderInput("GeographicImpactSliderYear", "Select a year:", min = 1950, max = 2011, value = 2000, step = 1, animate = TRUE, sep = ""), | ||
radioButtons("economicCategoryButton","Select Impact category:", | ||
c("Both" = "both", "Property damage" = "property", "Crops damage" = "crops") | ||
) | ||
) | ||
|
||
), | ||
|
||
mainPanel( | ||
tabsetPanel( | ||
tabPanel("About",includeMarkdown("ReadMe.md")), | ||
tabPanel( | ||
"Economic Impact", plotOutput("EconomicImpact"), | ||
verbatimTextOutput("descriptionTab3"), value = "3Tab" | ||
), | ||
tabPanel( | ||
"Population Impact", plotOutput("PopulationImpact"), | ||
verbatimTextOutput("descriptionTab4"), value = "4Tab" | ||
), | ||
tabPanel( | ||
"Geographic Impact", | ||
h3(textOutput("GeographicImpactYear")), | ||
htmlOutput("GeographicImpact"), | ||
verbatimTextOutput("descriptionTab5"), value = "5Tab" | ||
), | ||
id = "theTabs" | ||
) | ||
) | ||
)) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Global File for shared data |
215 changes: 215 additions & 0 deletions
215
09_DevelopingDataProducts/00CourseWork/StormDatabase/server.R
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,215 @@ | ||
# server for full dashboard | ||
|
||
requiredPackages <- c( | ||
"shiny", | ||
"rmarkdown", | ||
"RCurl", | ||
"ggplot2", # Used to plot graphics | ||
"dplyr", # Used for data manipulation (very Cool!) | ||
"scales", # Used to scale map data to aesthetics, | ||
# and provide methods for automatically | ||
# determining breaks and labels for axes | ||
# and legends. | ||
"knitr", | ||
"R.utils", # Used for unzipping the bz2 file | ||
"lubridate", # Used for time formatting | ||
"reshape2", # Used to manipulate and reshape the data | ||
"gridExtra", # Used to map out plots in Grids | ||
"ggthemes", # Extra themes, scales and geoms for ggplot (Very cool!) | ||
"xtable", # Used to print R objects as HTML tables | ||
"maps", | ||
"rCharts", | ||
"data.table", | ||
"mapproj", | ||
"googleVis" | ||
) | ||
|
||
ipak <- function(pkg) | ||
{ | ||
new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])] | ||
if (length(new.pkg)) | ||
install.packages(new.pkg, dependencies = TRUE) | ||
sapply(pkg, require, character.only = TRUE) | ||
} | ||
|
||
ipak(requiredPackages) | ||
|
||
options(scipen = 999) | ||
|
||
# load the data | ||
|
||
cleanStormData <- read.csv("./data/cleanStormData.csv.bz2") | ||
# cleanStormData$X <- NULL | ||
|
||
# Helper function to print the tab descriptions | ||
printDescription <- function(name,description) { | ||
s <- name | ||
s <- paste0(s, "\n", paste(rep("=",nchar(name)),collapse = '')) | ||
s <- paste0(s, "\n", description) | ||
cat(s) | ||
} | ||
|
||
shinyServer(function(input,output) { | ||
# bubble chart: Greatest Economic Impact by Year for Weather Events | ||
# | ||
output$EconomicImpact <- renderPlot({ | ||
propertyDamageSummary <- cleanStormData %>% | ||
filter(eventBeginYear == input$EconImpactSliderYear) %>% | ||
group_by(eventType) %>% | ||
summarise( | ||
propertyDamage = sum(PROPDMG * propExponent), | ||
cropDamage = sum(CROPDMG * cropExponent), | ||
eventCount = n() | ||
) %>% | ||
mutate(totalDamage = propertyDamage + cropDamage) %>% | ||
mutate(eventFreq = totalDamage / eventCount) %>% | ||
arrange(desc(totalDamage)) | ||
gp <- ggplot( | ||
head(propertyDamageSummary, n = 10), | ||
aes( | ||
x = totalDamage, | ||
y = eventCount, | ||
color = eventType, | ||
label = eventType, | ||
xmin = -250, | ||
ymin = -600, | ||
xmax = 120000000, | ||
ymax = 1000 | ||
) | ||
) | ||
gp <- gp + geom_point(aes(size = totalDamage)) | ||
gp <- gp + scale_size_area(max_size = 20) | ||
gp <- | ||
gp + geom_point(size = 5) + geom_text(size = 4, hjust = .5, vjust = 4) | ||
gp <- gp + theme( | ||
axis.text = element_text(size = 12), | ||
axis.title = element_text(size = 14,face = "bold"), | ||
plot.title = element_text(face = "bold") | ||
) | ||
gp <- gp + xlab(paste0("\n","Total Damage")) | ||
gp <- gp + ylab(paste0("Weather Events","\n")) | ||
gp <- | ||
gp + ggtitle( | ||
paste( | ||
"Events with the greatest economic consequences - Year: ", input$EconImpactSliderYear, "\n" | ||
) | ||
) | ||
print(gp) | ||
}) | ||
|
||
# Print Description of Economic Data | ||
# | ||
output$descriptionTab3 <- renderPrint({ | ||
printDescription( | ||
"Events most harmful with respect to property", | ||
"The events most harmful to property based on the data are Floods, Hurricanes and Tornadoes. Floods and Hurricanes, whilst occurring at lower incidence, have tremendous capacity to inflict large-scale economic damage on property." | ||
) | ||
}) | ||
|
||
# bubble chart: Greatest Population Impact by Year for Weather Events | ||
# | ||
output$PopulationImpact <- renderPlot({ | ||
eventImpactSummary <- cleanStormData %>% | ||
filter(eventBeginYear == input$EventImpactSliderYear) %>% | ||
group_by(eventType) %>% | ||
summarise( | ||
eventFatalities = sum(FATALITIES), | ||
eventInjuries = sum(INJURIES), | ||
eventCount = n() | ||
) %>% | ||
mutate(popImpacted = eventFatalities + eventInjuries) %>% | ||
mutate(eventFreq = popImpacted / eventCount) %>% | ||
arrange(desc(popImpacted)) | ||
|
||
gp <- ggplot( | ||
head(eventImpactSummary, n = 10), | ||
aes( | ||
x = eventFatalities, | ||
y = eventInjuries, | ||
color = eventType, | ||
label = eventType, | ||
xmin = -250, | ||
ymin = -250, | ||
ymax = eventInjuries + 1000 | ||
) | ||
) | ||
gp <- gp + geom_point(aes(size = eventCount)) | ||
gp <- gp + scale_size_area(max_size = 20) | ||
gp <- | ||
gp + geom_point(size = 5) + geom_text(size = 4, hjust = .7, vjust = 3) | ||
gp <- gp + theme( | ||
axis.text = element_text(size = 12), | ||
axis.title = element_text(size = 14,face = "bold"), | ||
plot.title = element_text(face = "bold") | ||
) | ||
gp <- gp + xlab(paste0("\n","Total Fatalities")) | ||
gp <- gp + ylab(paste0("Total Injuries","\n")) | ||
gp <- | ||
gp + ggtitle( | ||
paste( | ||
"Events most harmful with respect to population - Year: ", input$EventImpactSliderYear, "\n" | ||
) | ||
) | ||
print(gp) | ||
}) | ||
|
||
# Print Description of Population Data | ||
# | ||
output$descriptionTab4 <- renderPrint({ | ||
printDescription( | ||
"Events most harmful with respect to population", | ||
"The events most harmful to health based on the data and review period are Tornadoes, Heat and Wind Storms. Tornadoes and Heat are by far the most impactful on human health." | ||
) | ||
}) | ||
|
||
# Geographic chart: Population / Economic Impact by Year for Weather Events | ||
# | ||
|
||
# Heading for the geographic tab, defined at the top level of the server | ||
# function rather than inside another render call | ||
output$GeographicImpactYear <- renderText({ | ||
paste("Geographic impact of Weather Events in Year: ", input$GeographicImpactSliderYear) | ||
}) | ||
|
||
output$GeographicImpact <- renderGvis({ | ||
myYear <- input$GeographicImpactSliderYear | ||
|
||
propertyDamageSummary2 <- cleanStormData %>% | ||
filter(eventBeginYear == myYear) %>% | ||
group_by(STATE, eventBeginYear) %>% | ||
summarise(propertyDamage = sum(PROPDMG * propExponent), | ||
cropDamage = sum(CROPDMG*cropExponent), | ||
totalDamage = sum(propertyDamage + cropDamage), | ||
eventCount = n() ) | ||
|
||
if (input$economicCategoryButton == 'both') { | ||
colorvarChoice <- "totalDamage" | ||
} else if (input$economicCategoryButton == 'property') { | ||
colorvarChoice <- "propertyDamage" | ||
} else { | ||
colorvarChoice <- "cropDamage" | ||
} | ||
|
||
gvisGeoChart( propertyDamageSummary2, | ||
locationvar = "STATE", colorvar = colorvarChoice, | ||
options = list( | ||
region = "US", displayMode = "regions", | ||
resolution = "provinces", | ||
width = 500, height = 400, | ||
colorAxis = "{colors:['#FFFFFF', '#0000FF']}" | ||
) | ||
) | ||
}) | ||
|
||
# Print Description of Geographic Data | ||
# | ||
output$descriptionTab5 <- renderPrint({ | ||
printDescription( | ||
"Events most harmful with respect to population or property by state", | ||
"The events most harmful to health and property in the US principally occur in the central and mid states based on the data and review period analysed" | ||
)}) | ||
|
||
}) |
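The `if`/`else if` chain in `renderGvis` that maps `input$economicCategoryButton` to a column name could also be written as a lookup with `switch`. The sketch below is one possible variant, not the code in this commit: `chooseColorVar` is a hypothetical helper name, and unlike the original chain (which falls back to `"cropDamage"` for any other value) it stops with an error on an unknown category.

```r
# Hypothetical helper: map the radio-button value to a summary column name.
chooseColorVar <- function(category) {
  switch(category,
    both     = "totalDamage",
    property = "propertyDamage",
    crops    = "cropDamage",
    # Unnamed last argument is the default branch for unknown values
    stop("Unknown impact category: ", category)
  )
}

print(chooseColorVar("both"))
print(chooseColorVar("property"))
print(chooseColorVar("crops"))
```

Inside the server function this would be called as `chooseColorVar(input$economicCategoryButton)` and the result passed to `colorvar`.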