finished wind statistics exercise
guipsamora committed Jul 26, 2016
1 parent 73485bc commit 884406a
Showing 6 changed files with 7,037 additions and 2,152 deletions.
64 changes: 36 additions & 28 deletions Stats/Wind_Stats/Exercises.ipynb
@@ -17,7 +17,7 @@
 "Using pandas should make this exercise\n",
 "easier, in particular for the bonus question.\n",
 "\n",
-"Of course, you should be able to perform all of these operations without using\n",
+"You should be able to perform all of these operations without using\n",
 "a for loop or other looping construct.\n",
 "\n",
 "\n",
@@ -26,7 +26,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": 434,
 "metadata": {
 "collapsed": false
 },
@@ -37,7 +37,7 @@
 "'\\nYr Mo Dy RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL\\n61 1 1 15.04 14.96 13.17 9.29 NaN 9.87 13.67 10.25 10.83 12.58 18.50 15.04\\n61 1 2 14.71 NaN 10.83 6.50 12.62 7.67 11.50 10.04 9.79 9.67 17.54 13.83\\n61 1 3 18.50 16.88 12.33 10.13 11.17 6.17 11.25 NaN 8.50 7.67 12.75 12.71\\n'"
 ]
 },
-"execution_count": 1,
+"execution_count": 434,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -73,27 +73,25 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": null,
 "metadata": {
-"collapsed": true
+"collapsed": false
 },
 "outputs": [],
-"source": [
-"import pandas as pd"
-]
+"source": []
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 2. Import the dataset from this [address](https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv). "
+"### Step 2. Import the dataset from this [address](https://github.com/guipsamora/pandas_exercises/blob/master/Stats/Wind_Stats/wind.data)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 3. Assign it to a variable called chipo."
+"### Step 3. Assign it to a variable called data and replace the first 3 columns by a proper datetime index."
 ]
 },
 {
@@ -109,7 +107,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 4. See the first 10 entries"
+"### Step 4. Year 2061? Do we really have data from this year? Create a function to fix it and apply it."
 ]
 },
 {
@@ -125,7 +123,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 5. What is the number of observations in the dataset?"
+"### Step 5. Set the right dates as the index. Pay attention at the data type, it should be datetime64[ns]."
 ]
 },
 {
@@ -141,14 +139,15 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 6. What is the number of columns in the dataset?"
+"### Step 6. Compute how many values are missing for each location over the entire record. \n",
+"#### They should be ignored in all calculations below. "
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"collapsed": true
+"collapsed": false
 },
 "outputs": [],
 "source": []
@@ -157,14 +156,15 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 7. Print the name of all the columns."
+"### Step 7. Compute how many non-missing values there are in total."
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"collapsed": false
+"collapsed": false,
+"scrolled": true
 },
 "outputs": [],
 "source": []
@@ -173,7 +173,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 8. How is the dataset indexed?"
+"### Step 8. Calculate the mean windspeeds of the windspeeds over all the locations and all the times.\n",
+"#### A single number for the entire dataset."
 ]
 },
 {
@@ -189,7 +190,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 9. Which was the most ordered item?"
+"### Step 9. Create a DataFrame called loc_stats and calculate the min, max and mean windspeeds and standard deviations of the windspeeds at each location over all the days \n",
+"\n",
+"#### A different set of numbers for each location."
 ]
 },
 {
@@ -205,7 +208,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 10. What was the most ordered item?"
+"### Step 10. Create a DataFrame called day_stats and calculate the min, max and mean windspeed and standard deviations of the windspeeds across all the locations at each day.\n",
+"\n",
+"#### A different set of numbers for each day."
 ]
 },
 {
@@ -221,7 +226,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 11. How many items were orderd in total?"
+"### Step 11. Find the average windspeed in January for each location. \n",
+"#### Treat January 1961 and January 1962 both as January."
 ]
 },
 {
@@ -237,7 +243,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 12. How many orders have more than 1 item?"
+"### Step 12. Downsample the record to a yearly frequency for each location."
 ]
 },
 {
@@ -253,7 +259,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 13. How much was the revenue for the period in the dataset?"
+"### Step 13. Downsample the record to a monthly frequency for each location."
 ]
 },
 {
@@ -269,14 +275,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 14. How many orders were made in the period?"
+"### Step 14. Downsample the record to a weekly frequency for each location."
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"collapsed": true
+"collapsed": false
 },
 "outputs": [],
 "source": []
@@ -285,14 +291,16 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 15. What is the average amount per order?"
+"### Step 15. Calculate the mean windspeed for each month in the dataset. \n",
+"#### Treat January 1961 and January 1962 as *different* months.\n",
+"#### (hint: first find a way to create an identifier unique for each month.)"
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"collapsed": true
+"collapsed": false
 },
 "outputs": [],
 "source": []
@@ -301,14 +309,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Step 16. How many different itens are sold?"
+"### Step 16. Calculate the min, max and mean windspeeds and standard deviations of the windspeeds across all locations for each week (assume that the first week starts on January 2 1961) for the first 52 weeks."
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"collapsed": true
+"collapsed": false
 },
 "outputs": [],
 "source": []
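
For orientation, here is a minimal sketch of how several of the new exercise steps (3 to 15) might be approached with pandas, without explicit loops. It is not part of the commit, whose code cells are intentionally empty. It runs on the three sample rows visible in the notebook output above rather than the full wind.data file, and it assumes that every two-digit year belongs to the 1900s. All variable names other than `data`, `loc_stats` and `day_stats` are illustrative choices, not names taken from the notebook.

```python
import io

import numpy as np
import pandas as pd

# First three rows of wind.data, copied from the notebook output shown in the diff.
SAMPLE = """\
Yr Mo Dy RPT VAL ROS KIL SHA BIR DUB CLA MUL CLO BEL MAL
61 1 1 15.04 14.96 13.17 9.29 NaN 9.87 13.67 10.25 10.83 12.58 18.50 15.04
61 1 2 14.71 NaN 10.83 6.50 12.62 7.67 11.50 10.04 9.79 9.67 17.54 13.83
61 1 3 18.50 16.88 12.33 10.13 11.17 6.17 11.25 NaN 8.50 7.67 12.75 12.71
"""

raw = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

# Steps 3-5: replace Yr/Mo/Dy with a datetime index.
# Assumption: every two-digit year belongs to the 1900s (61 -> 1961).
dates = pd.to_datetime(
    pd.DataFrame({"year": 1900 + raw["Yr"], "month": raw["Mo"], "day": raw["Dy"]})
)
data = raw.drop(columns=["Yr", "Mo", "Dy"])
data.index = pd.DatetimeIndex(dates, name="date")

# Step 6: missing values per location.
missing = data.isnull().sum()

# Step 7: total count of non-missing values.
non_missing = int(data.notnull().sum().sum())

# Step 8: one mean over every location and day, ignoring NaN.
overall_mean = np.nanmean(data.to_numpy())

# Step 9: per-location statistics (one row per location).
loc_stats = pd.DataFrame({
    "min": data.min(), "max": data.max(),
    "mean": data.mean(), "std": data.std(),
})

# Step 10: per-day statistics across locations (one row per day).
day_stats = pd.DataFrame({
    "min": data.min(axis=1), "max": data.max(axis=1),
    "mean": data.mean(axis=1), "std": data.std(axis=1),
})

# Step 11: January average per location, pooling all years.
january_mean = data[data.index.month == 1].mean()

# Steps 12-14: downsample to yearly, monthly and weekly means.
yearly = data.resample("YS").mean()
monthly = data.resample("MS").mean()
weekly = data.resample("W").mean()

# Step 15: a year-month period is a unique identifier for each month.
by_month = data.groupby(data.index.to_period("M")).mean()
```

On the full file the same calls should apply unchanged once `io.StringIO(SAMPLE)` is replaced by the path or raw URL of wind.data. The statistics in steps 9 and 10 skip NaN by default, which matches the instruction in step 6 to ignore missing values.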