resolve conflicts
BartoszSambor committed Jun 15, 2022
2 parents ae55f6e + d577831 commit b7c3f3a
Showing 6 changed files with 154 additions and 18 deletions.
64 changes: 55 additions & 9 deletions 1.Data_exploration.ipynb
@@ -1,5 +1,45 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6efb0fe4-5679-4493-a35b-0495cda634f1",
"metadata": {},
"source": [
"[1.1 Load data from csv file](#1.1-Load-data-from-csv-file) \n",
"\n",
"[1.2 Use `Shapely` to visualise LineStrings from `the_geom` column](#1.2-Use-Shapely-to-visualise-LineStrings-from-the_geom-column)\n",
"\n",
"[1.3 Visualize all values from the_geom column on one plot](#1.3-Visualize-all-values-from-the_geom-column-on-one-plot)\n",
"\n",
"[1.4 Create geometry column by loading LineStrings from the_geom column as Shapely objects](#1.4-Create-geometry-column-by-loading-LineStrings-from-the_geom-column-as-Shapely-objects)\n",
"- [1.4.1 Plot the results](#1.4.1-Plot-the-results)\n",
"\n",
"[1.5 With the use of `osmnx` create visualisation of San Francisco](#1.5-With-the-use-of-osmnx-create-visualisation-of-San-Francisco)\n",
"- [1.5.1 Load graph of the city fom `osmnx`](#1.5.1-Load-graph-of-the-city-fom-osmnx)\n",
"- [1.5.2 Show city visualisation with the use of `ox` graph](#1.5.2-Show-city-visualisation-with-the-use-of--ox-graph)\n",
"\n",
"[1.6 Retrieve nodes and edges from San Francisco graph](#1.6-Retrieve-nodes-and-edges-from-San-Francisco-graph)\n",
"- [1.6.1 Visualize all nodes on the map](#1.6.1-Visualize-all-nodes-on-the-map)\n",
"- [1.6.2 Analyse edges information](#1.6.2-Analyse-edges-information)\n",
"- [1.6.3 Visualize San Francisco streets](#1.6.3-Visualize-San-Francisco-streets)\n",
"\n",
"[1.7 Plot `Speed limit compliance in San Francisco` data on San Francisco map](#1.7-Plot-Speed-limit-compliance-in-San-Francisco-data-on-San-Francisco-map)\n",
"\n",
"[1.8 Find a way to join edges with the `Speed limit compliance in SF` dataset](#1.8-Find-a-way-to-join-edges-with-the-Speed-limit-compliance-in-SF-dataset)\n",
"- [1.8.1 Quick glimpse on the geometrical data](#1.8.1-Quick-glimpse-on-the-geometrical-data)\n",
"- [1.8.2 Try to join speed limit compliance data with the edges data by performing spatial join](#1.8.2-Try-to-join-speed-limit-compliance-data-with-the-edges-data-by-performing-spatial-join.)\n",
"- [1.8.3 Show joined data on the plot](#1.8.3-Show-joined-data-on-the-plot.)\n",
"\n",
"[1.9 Join osmnx edges with Speed compliance data by the street column](#1.9-Join-osmnx-edges-with-Speed-compliance-data-by-the-street-column.)\n",
"\n",
"[1.10 Next approach to data joining](#1.10-Next-approach-to-data-joining)\n",
"- [1.10.1 Join attempt result](#1.10.1-Join-attempt-result)\n",
"\n",
"[1.11 Join data by name of the street](#1.11-Join-data-by-name-of-the-street)\n",
"- [1.11.1 Further data matching](#1.11.1-Further-data-matching)\n",
"- [1.11.2 Plot the result of joining by the streetname](#1.11.2-Plot-the-result-of-joining-by-the-streetname)"
]
},
{
"cell_type": "markdown",
"id": "2647541d-24f2-4e71-9212-e61af28f0fff",
@@ -32,7 +72,9 @@
{
"cell_type": "markdown",
"id": "9d8012a3-ffa9-4871-9cfd-f85e1a664102",
"metadata": {},
"metadata": {
"tags": []
},
"source": [
"# 1. Dataset analysis"
]
@@ -452,7 +494,7 @@
"id": "73b4ae59-cbec-4e71-9f28-c631a18343a0",
"metadata": {},
"source": [
"### Plot the results"
"### 1.4.1 Plot the results"
]
},
{
@@ -2059,7 +2101,7 @@
"id": "4b0cba0e-8be2-4be6-a880-6918d1aa7c7f",
"metadata": {},
"source": [
"### Show joined data on the plot."
"### 1.8.3 Show joined data on the plot."
]
},
{
@@ -3123,7 +3165,7 @@
"tags": []
},
"source": [
"### First attempt (failure)"
"## 1.10 Next approach to data joining"
]
},
{
@@ -3640,7 +3682,7 @@
"id": "9911b20c-632b-4fac-a0d4-8ccf84bfb3a0",
"metadata": {},
"source": [
"### Join attempt result\n",
"### 1.10.1 Join attempt result\n",
"As the following plot shows, the joined data isn't ideal. We came across many edge cases. For instance, if one street crosses another at only one point, the data is joined and treated as the same street segment, which is obviously incorrect. \n",
"<br>\n",
"\n",
@@ -3702,6 +3744,7 @@
"tags": []
},
"source": [
"## 1.11 Join data by name of the street\n",
"#### We've decided to try another approach to joining those two datasets.\n",
"\n",
"Both datasets have a `streetname` column, but the range of values used in each isn't the same. \n",
@@ -3775,7 +3818,9 @@
{
"cell_type": "markdown",
"id": "b94c446c-1b83-4838-98cd-868ac2852ace",
"metadata": {},
"metadata": {
"tags": []
},
"source": [
"#### A new column with upper-case street names is created, as all street-name values in the Speed limit compliance dataset are in upper case.\n",
"#### This makes it easier to compare strings between those columns and perform the join."
@@ -3892,7 +3937,8 @@
"id": "99e3b938-ccd6-4507-8224-32603be69484",
"metadata": {},
"source": [
"#### Retrieve only those records from join_result where upper_name (the street-name column from the osmnx edges of San Francisco) includes the first part of STREETNAME (the street-name column from the Speed limit compliance dataset).\n",
"### 1.11.1 Further data matching\n",
"#### Retrieve only those records from join_result where upper_name (the street-name column from the osmnx edges of San Francisco) includes the first part of STREETNAME (the street-name column from the Speed limit compliance dataset).\n",
"\n",
"We tried to avoid matching streets that were incorrectly joined, so we limited the streets to those whose names appear in the Speed limit compliance dataset.\n"
]
@@ -4310,7 +4356,7 @@
"id": "951876c0-08ae-45a3-85b4-c5861b51e812",
"metadata": {},
"source": [
"#### Plot the result of joining by the `streetname`"
"### 1.11.2 Plot the result of joining by the `streetname`"
]
},
{
@@ -4817,7 +4863,7 @@
"id": "5545d289-73b7-4830-b97a-e2c48072f427",
"metadata": {},
"source": [
"#### This approach didn't give satisfactory results, so further steps and ideas had to be tried on the data."
"#### 1. This approach didn't give satisfactory results, so further steps and ideas had to be tried on the data."
]
}
],
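Reviewer note: the name-based matching described in sections 1.11 and 1.11.1 can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the notebook's code: the sample rows, the `over_pct` values, and the helpers `first_word` and `match_by_street_name` are hypothetical; only the `upper_name` and `STREETNAME` column names come from the notebook text. The sketch also compares whole words rather than substrings, which is a deliberate simplification.

```python
def first_word(street_name: str) -> str:
    """Return the first whitespace-separated token of a street name, upper-cased."""
    parts = street_name.upper().split()
    return parts[0] if parts else ""


def match_by_street_name(edges, compliance):
    """Pair each edge with the compliance rows whose first STREETNAME word
    appears as a whole word in the edge's upper-cased name."""
    matches = []
    for edge in edges:
        upper_name = edge["name"].upper()
        edge_words = set(upper_name.split())
        for row in compliance:
            key = first_word(row["STREETNAME"])
            if key and key in edge_words:
                matches.append({"upper_name": upper_name, **row})
    return matches


# Hypothetical sample rows standing in for the osmnx edges and the compliance data.
edges = [
    {"name": "Market Street"},
    {"name": "Mission Street"},
    {"name": "Van Ness Avenue"},
]
compliance = [
    {"STREETNAME": "MARKET ST", "over_pct": 12.5},
    {"STREETNAME": "VAN NESS AVE", "over_pct": 30.0},
]

result = match_by_street_name(edges, compliance)
for row in result:
    print(row["upper_name"], "<->", row["STREETNAME"])
```

Matching on the first word alone is loose: any edge whose name contains the word `VAN` would pair with `VAN NESS AVE`, which is consistent with the notebook's remark that this approach did not give satisfactory results.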
16 changes: 12 additions & 4 deletions 3.Machine learning.ipynb
@@ -21908,7 +21908,7 @@
"import matplotlib.pyplot as plt\n",
"import geopandas\n",
"fig, ax = plt.subplots(figsize=(32,40))\n",
"osmnx.plot(ax=ax, linewidth=4, column='Over_pct', cmap='OrRd')\n",
"osmnx.plot(ax=ax, linewidth=4, column='over_pct', cmap='OrRd')\n",
"plt.tight_layout()"
]
},
@@ -21941,7 +21941,7 @@
],
"source": [
"fig, ax = plt.subplots(figsize=(32,40))\n",
"osmnx.plot(ax=ax, linewidth=4, column='O5mph_pct', cmap='OrRd')\n",
"osmnx.plot(ax=ax, linewidth=4, column='o5mph_pct', cmap='OrRd')\n",
"plt.tight_layout()"
]
},
@@ -21974,7 +21974,7 @@
],
"source": [
"fig, ax = plt.subplots(figsize=(32,40))\n",
"osmnx.plot(ax=ax, linewidth=4, column='Speed_avg', cmap='OrRd')\n",
"osmnx.plot(ax=ax, linewidth=4, column='speed_avg', cmap='OrRd')\n",
"plt.tight_layout()"
]
},
@@ -22007,7 +22007,7 @@
],
"source": [
"fig, ax = plt.subplots(figsize=(32,40))\n",
"osmnx.plot(ax=ax, linewidth=4, column='SpeedO_avg', cmap='OrRd')\n",
"osmnx.plot(ax=ax, linewidth=4, column='speedo_avg', cmap='OrRd')\n",
"plt.tight_layout()"
]
},
@@ -22040,7 +22040,11 @@
],
"source": [
"fig, ax = plt.subplots(figsize=(32,40))\n",
"osmnx.plot(ax=ax, linewidth=4, column='Spd5O_avg', cmap='OrRd')\n",
"osmnx.plot(ax=ax, linewidth=4, column='spd5o_avg', cmap='OrRd')\n",
"plt.tight_layout()"
]
},
@@ -22069,7 +22073,11 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
"version": "3.10.4"
}
},
"nbformat": 4,
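Reviewer note: a notebook is a JSON file, so leftover Git conflict markers make it unloadable. A small pre-commit style check could look like the sketch below. It is a hedged illustration only: `find_conflict_markers` and `is_valid_notebook_json` are hypothetical helpers, and the sample strings are made up for the example rather than taken from the repository.

```python
import json


def find_conflict_markers(text: str) -> list[int]:
    """Return 1-based line numbers of lines that look like Git conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(("<<<<<<< ", ">>>>>>> ")) or stripped == "=======":
            hits.append(number)
    return hits


def is_valid_notebook_json(text: str) -> bool:
    """Return True if the text parses as JSON, which every .ipynb file must."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Made-up inputs: one clean notebook skeleton and one with an unresolved conflict.
clean = '{"cells": [], "nbformat": 4}'
conflicted = "\n".join([
    "{",
    "<<<<<<< HEAD",
    '"version": "3.9.12"',
    "=======",
    '"version": "3.10.4"',
    ">>>>>>> dev",
    "}",
])

print(find_conflict_markers(conflicted), is_valid_notebook_json(conflicted))
print(find_conflict_markers(clean), is_valid_notebook_json(clean))
```

Running both checks is intentional: the marker scan points at the offending lines, while the JSON parse catches damage that a marker scan alone would miss.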
91 changes: 86 additions & 5 deletions README.md
@@ -6,15 +6,96 @@ We will work with the **[Speed limit compliance in San Francisco](https://data.s

Explore the repo in the following order:

[ 2.1 Obtaining road data in San Francisco (OSMNX data set)](2.Data&#32;processing.ipynb#2.1_obtaining_road_data_in_san_francisco_(osmnx_data_set))
You can install all needed libraries (and probably a few unnecessary ones) using conda and the [requirements file](requirements.txt)

[ 2.2 Data cleaning and preparing for ML model (OSMNX data set)](2.Data&#32;processing.ipynb#2.2_data_cleaning_and_preparing_for_ML_model_(osmnx_data_set))
Unfortunately, we can't link to a specific header inside a Jupyter file; this is a bug that has not been resolved for 5 years - [see the issue thread](https://gitlab.com/gitlab-org/gitlab/-/issues/18269)

[ 3.1 Example machine learning usage ](3.Machine learning.ipynb)
Explore the repo in the following order:

You can install all needed libraries (and probably a few unnecessary ones) using conda and the [requirements file](requirements.txt)
[1.1 Load data from csv file](1.Data_exploration.ipynb#1.1-Load-data-from-csv-file)

Unfortunately, we can't link to a specific header inside a Jupyter file; this is a bug that has not been resolved for 5 years - [see the issue thread](https://gitlab.com/gitlab-org/gitlab/-/issues/18269)
[1.2 Use `Shapely` to visualise LineStrings from `the_geom` column](1.Data_exploration.ipynb#1.2-Use-Shapely-to-visualise-LineStrings-from-the_geom-column)

[1.3 Visualize all values from the_geom column on one plot](1.Data_exploration.ipynb#1.3-Visualize-all-values-from-the_geom-column-on-one-plot)

[1.4 Create geometry column by loading LineStrings from the_geom column as Shapely objects](1.Data_exploration.ipynb#1.4-Create-geometry-column-by-loading-LineStrings-from-the_geom-column-as-Shapely-objects)
- [1.4.1 Plot the results](1.Data_exploration.ipynb#1.4.1-Plot-the-results)

[1.5 With the use of `osmnx` create visualisation of San Francisco](1.Data_exploration.ipynb#1.5-With-the-use-of-osmnx-create-visualisation-of-San-Francisco)
- [1.5.1 Load graph of the city fom `osmnx`](1.Data_exploration.ipynb#1.5.1-Load-graph-of-the-city-fom-osmnx)
- [1.5.2 Show city visualisation with the use of `ox` graph](1.Data_exploration.ipynb#1.5.2-Show-city-visualisation-with-the-use-of--ox-graph)

[1.6 Retrieve nodes and edges from San Francisco graph](1.Data_exploration.ipynb#1.6-Retrieve-nodes-and-edges-from-San-Francisco-graph)
- [1.6.1 Visualize all nodes on the map](1.Data_exploration.ipynb#1.6.1-Visualize-all-nodes-on-the-map)
- [1.6.2 Analyse edges information](1.Data_exploration.ipynb#1.6.2-Analyse-edges-information)
- [1.6.3 Visualize San Francisco streets](1.Data_exploration.ipynb#1.6.3-Visualize-San-Francisco-streets)

[1.7 Plot `Speed limit compliance in San Francisco` data on San Francisco map](1.Data_exploration.ipynb#1.7-Plot-Speed-limit-compliance-in-San-Francisco-data-on-San-Francisco-map)

[1.8 Find a way to join edges with the `Speed limit compliance in SF` dataset](1.Data_exploration.ipynb#1.8-Find-a-way-to-join-edges-with-the-Speed-limit-compliance-in-SF-dataset)
- [1.8.1 Quick glimpse on the geometrical data](1.Data_exploration.ipynb#1.8.1-Quick-glimpse-on-the-geometrical-data)
- [1.8.2 Try to join speed limit compliance data with the edges data by performing spatial join](1.Data_exploration.ipynb#1.8.2-Try-to-join-speed-limit-compliance-data-with-the-edges-data-by-performing-spatial-join.)
- [1.8.3 Show joined data on the plot](1.Data_exploration.ipynb#1.8.3-Show-joined-data-on-the-plot.)

[1.9 Join osmnx edges with Speed compliance data by the street column](1.Data_exploration.ipynb#1.9-Join-osmnx-edges-with-Speed-compliance-data-by-the-street-column.)

[1.10 Next approach to data joining](1.Data_exploration.ipynb#1.10-Next-approach-to-data-joining)
- [1.10.1 Join attempt result](1.Data_exploration.ipynb#1.10.1-Join-attempt-result)

[1.11 Join data by name of the street](1.Data_exploration.ipynb#1.11-Join-data-by-name-of-the-street)
- [1.11.1 Further data matching](1.Data_exploration.ipynb#1.11.1-Further-data-matching)
- [1.11.2 Plot the result of joining by the streetname](1.Data_exploration.ipynb#1.11.2-Plot-the-result-of-joining-by-the-streetname)


[2.1 Obtaining road data in San Francisco (OSMNX data set)](2.Data processing.ipynb#2_1)

[2.2 Data cleaning and preparing for ML model (OSMNX data set)](2.Data processing.ipynb#2_2)
- [2.2.1 dropping unnecessary columns](2.Data processing.ipynb#2_2_1)
- [2.2.2 improving "maxspeed" column](2.Data processing.ipynb#2_2_2)
- [2.2.3 improving "oneway" column](2.Data processing.ipynb#2_2_3)
- [2.2.4 improving "lanes" column](2.Data processing.ipynb#2_2_4)
- [2.2.5 improving "highway" column](2.Data processing.ipynb#2_2_5)
- [2.2.6 improving "name" column](2.Data processing.ipynb#2_2_6)
- [2.2.7 summary](2.Data processing.ipynb#2_2_7)

[2.3 Obtaining speed limit data in San Francisco (SanFranciscoSpeedLimitCompliance data set)](2.Data processing.ipynb#2_3)

[2.4 Data cleaning and preparing for ML model (SanFranciscoSpeedLimitCompliance data set)](2.Data processing.ipynb#2_4)
- [2.4.1 Dropping unnecessary column](2.Data processing.ipynb#2_4_1)
- [2.4.2 improving "speedlimit" column](2.Data processing.ipynb#2_4_2)
- [2.4.3 improving "the_geom" column](2.Data processing.ipynb#2_4_3)

[2.5 Merging datasets](2.Data processing.ipynb#2_5)

[2.6 Preparing merged data for Machine Learning Model](2.Data processing.ipynb#2_6)
- [2.6.1 Data set for machine learning](2.Data processing.ipynb#2_6_1)
- [2.6.2 Data set for predictions](2.Data processing.ipynb#2_6_2)


##### Example machine learning usage:

[3. Creating machine learning model](3.Machine learning.ipynb#3)
- [3.1.1 Load dataset prepared previously](3.Machine learning.ipynb#3_1_1)
- [3.1.2 Split into training and testing sets](3.Machine learning.ipynb#3_1_2)

[3.2 Scaling values](3.Machine learning.ipynb#3_2)

[3.3 Building deep learning model](3.Machine learning.ipynb#3_3)
- [3.3.1 Finding the best parameters](3.Machine learning.ipynb#3_3_1)

[3.4 Predictions for all streets in San Francisco](3.Machine learning.ipynb#3_4)
- [3.4.1 Load all streets (saved previously)](3.Machine learning.ipynb#3_4_1)
- [3.4.2 Scale - exactly like we did it before with training dataset](3.Machine learning.ipynb#3_4_2)
- [3.4.3 Make prediction](3.Machine learning.ipynb#3_4_3)

[4. Result visualisation by coloring map](3.Machine learning.ipynb#4)
- [4.1 Percentage of cars moving too fast - Over_pct](3.Machine learning.ipynb#4_1)
- [4.2 Percentage of cars exceeding speed limit by more than 5 mph - O5mph_pct](3.Machine learning.ipynb#4_2)
- [4.3 Average speed - Speed_avg](3.Machine learning.ipynb#4_3)
- [4.4 Average speed over speed limit - SpeedO_avg](3.Machine learning.ipynb#4_4)
- [4.5 Average speed of cars exceeding speed limit by more than 5 mph - Spd5O_avg](3.Machine learning.ipynb#4_5)

You can install all needed libraries (and probably a few unnecessary ones) using conda and the [requirements file](requirements.txt)

Team members:
- Bartosz Sambór
1 change: 1 addition & 0 deletions data/graph.osm

Large diffs are not rendered by default.

Binary file added osmnx_ml_data_set.pkl
Binary file not shown.
Binary file added speed_limit_ml_data_set.pkl
Binary file not shown.
