Project code link: https://winnieshi.github.io/Data-Analysis-of-Boston-Crime/Data%20Analysis%20of%20Boston%20Crime.html
The project includes four parts:
- Boston crime data collection,
- Boston crime data analysis,
- Weather data collection and
- Analysis of weather and crime data.
In the Boston crime data collection part, the steps are:
(1) converting all data to the proper data type,
(2) filling or deleting missing values, depending on the situation,
(3) deleting duplicates,
(4) deleting ambiguous data.
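The four cleaning steps above can be sketched in plain Python. This is only an illustration, not the project code: the project itself uses Spark SQL, and the field names (`incident`, `offense`, `district`, `occurred`, `shooting`) as well as the rules for what counts as missing or ambiguous are assumptions made for this example.

```python
from datetime import datetime

# Hypothetical raw rows; field names and values are illustrative only.
raw_rows = [
    {"incident": "I001", "offense": "LARCENY", "district": "B2",
     "occurred": "2019-08-03 14:05:00", "shooting": ""},
    {"incident": "I001", "offense": "LARCENY", "district": "B2",
     "occurred": "2019-08-03 14:05:00", "shooting": ""},   # exact duplicate
    {"incident": "I002", "offense": "VANDALISM", "district": "B3",
     "occurred": "2019-08-04 23:10:00", "shooting": "Y"},
    {"incident": "I003", "offense": "", "district": "C11",
     "occurred": "2019-08-05 09:00:00", "shooting": ""},   # no offense label
    {"incident": "I004", "offense": "LARCENY", "district": "",
     "occurred": "2019-08-06 10:00:00", "shooting": ""},   # no district
]


def clean(rows):
    """Apply the four cleaning steps to a list of dict rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # (4) Drop ambiguous rows (assumed here: rows without an offense label).
        if not row["offense"]:
            continue
        # (2a) Delete rows whose missing value cannot be filled (assumed: district).
        if not row["district"]:
            continue
        record = {
            "incident": row["incident"],
            "offense": row["offense"],
            "district": row["district"],
            # (1) Convert the timestamp string to a proper datetime.
            "occurred": datetime.strptime(row["occurred"], "%Y-%m-%d %H:%M:%S"),
            # (2b) Fill a blank shooting flag (assumed: blank means no shooting).
            "shooting": row["shooting"] == "Y",
        }
        # (3) Delete exact duplicates.
        key = tuple(record.values())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


cleaned_rows = clean(raw_rows)
print(len(cleaned_rows), "rows kept out of", len(raw_rows))
```

In Spark SQL the same steps would typically be expressed as casts, null handling, and de-duplication over a DataFrame; the loop form is used here only so the example runs without a Spark session.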
In the Boston crime data analysis part, we analyze:
(1) the number of crimes per category,
(2) the number of crimes per district,
(3) the number of crimes over time.
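Each of the three analyses above is a grouped count. A minimal sketch in plain Python, assuming cleaned records of the form (offense category, district, timestamp); the sample records are made up for illustration and are not project data:

```python
from collections import Counter
from datetime import datetime

# Hypothetical cleaned records: (offense category, district, timestamp).
records = [
    ("Larceny", "Roxbury", datetime(2019, 8, 3, 14, 5)),
    ("Larceny", "Dorchester", datetime(2019, 8, 9, 11, 30)),
    ("Vandalism", "Roxbury", datetime(2019, 9, 1, 23, 15)),
    ("Larceny", "Roxbury", datetime(2019, 9, 2, 10, 0)),
]

# (1) Number of crimes per category.
by_category = Counter(offense for offense, _, _ in records)
# (2) Number of crimes per district.
by_district = Counter(district for _, district, _ in records)
# (3) Number of crimes per month, as a simple time series.
by_month = Counter(ts.strftime("%Y-%m") for _, _, ts in records)

print(by_category.most_common(5))
print(by_district.most_common(5))
print(sorted(by_month.items()))
```

Grouping on `ts.hour` or `ts.weekday()` instead of the month gives the hour-of-day and day-of-week breakdowns in the same way.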
After diving into the Boston Crime Dataset, we can clearly see trends and relations between the types of offenses, their locations and their occurrences. The dataset shows:
- Top-5 non-shooting offenses: 'Motor Vehicle Accident Response', 'Larceny', 'Medical Assistance', 'Investigate Person' and 'Others'. Top-5 shooting offenses: 'Aggravated Assault', 'Investigate Property', 'Ballistics', 'Homicide' and 'Vandalism'.
- Top-5 most dangerous districts: Roxbury, Dorchester, South End, Mattapan and Downtown.
- The number of offenses peaked around August. The drop in offenses in March 2020 may have been caused by the Covid-19 shutdown policy. The sharp increase in shooting offenses in May 2020 may be related to the killing of George Floyd on May 25, 2020.
- Shooting offenses are more likely to happen between 22:00 and 24:00 and on weekends.
- Non-shooting offenses are more likely to happen on weekdays and during the daytime.
- The relevant departments should install more street cameras, because 'Motor Vehicle Accident Response' is the most frequent offense category and leaving the scene of an accident accounts for the largest share of it.
- Residents and shop owners should also install cameras if necessary.
- Drivers should obey traffic regulations and carry car insurance.
- Residents and travellers should avoid going out late at night.
I used Spark SQL to clean and parse the Boston Crime Dataset, analyzed the relations and trends between offense types and locations together with possible reasons for them, and provided advice for the police, residents and travellers.
In the weather data collection part, the steps are:
(1) collecting data for 2015-06-01 to 2021-04-30 from https://www.ncdc.noaa.gov/cdo-web/,
(2) converting all data to the proper data type,
(3) filling null values.
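Steps (2) and (3) for the weather data can be sketched as follows. The source does not say how null values are filled, so the two rules used here (carry the previous day's temperature forward, treat missing precipitation as zero) are assumptions, as are the field names `date`, `tavg` and `prcp`.

```python
from datetime import date

# Hypothetical daily weather rows as read from a CSV export: every value is a
# string, and an empty string marks a missing reading.
raw_weather = [
    {"date": "2015-06-01", "tavg": "61", "prcp": "0.25"},
    {"date": "2015-06-02", "tavg": "", "prcp": ""},
    {"date": "2015-06-03", "tavg": "66", "prcp": "0.00"},
]


def parse_weather(rows):
    """Convert string fields to proper types and fill missing values."""
    parsed = []
    last_tavg = None
    for row in rows:
        # Temperature: carry the previous day's value forward (assumed rule).
        tavg = float(row["tavg"]) if row["tavg"] else last_tavg
        # Precipitation: treat a missing reading as none (assumed rule).
        prcp = float(row["prcp"]) if row["prcp"] else 0.0
        parsed.append({
            "date": date.fromisoformat(row["date"]),
            "tavg": tavg,
            "prcp": prcp,
        })
        last_tavg = tavg
    return parsed


weather = parse_weather(raw_weather)
for day in weather:
    print(day["date"], day["tavg"], day["prcp"])
```

A forward fill leaves the first row as `None` if its temperature is missing, so a real pipeline would need a separate rule for that case.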
In the analysis of weather and crime data, the steps are:
(1) analyzing the relation between temperature and the number of motor vehicle accidents,
(2) analyzing the relation between precipitation and the number of motor vehicle accidents,
(3) analyzing the relation between wind speed and the number of motor vehicle accidents,
(4) analyzing the relation between extreme weather (fog, hail, glaze, smoke) and the number of motor vehicle accidents,
(5) analyzing the relation between snow and the number of motor vehicle accidents.
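One common way to quantify the relation between a weather variable and the daily accident count is the Pearson correlation coefficient. The source does not state which measure the project uses, so the sketch below is just one possible approach, and the sample values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up daily values, for illustration only (not project data).
temperature = [30.0, 45.0, 60.0, 75.0, 90.0]
accidents = [12, 15, 11, 14, 13]

r = pearson(temperature, accidents)
print(round(r, 3))
```

A coefficient close to 0 suggests no linear relation between the two series, while values near 1 or -1 suggest a strong one. The same function can be applied to precipitation, wind speed or snowfall in place of temperature.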
(1) The number of motor vehicle accidents does not follow the same trend as temperature, but it does follow the same trend as the total number of crimes.
(2) Precipitation, wind speed and extreme weather do not appear to affect the number of motor vehicle accidents.
(3) On snowy days, there were fewer motor vehicle accidents than on other days.
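A comparison like the one in finding (3) comes down to comparing the mean daily accident count on snowy days with that on all other days. A minimal sketch, assuming a day counts as snowy when its recorded snowfall is above zero; the values are made up for illustration and are not the project's results:

```python
from statistics import mean

# Made-up (snowfall, accidents that day) pairs, for illustration only.
days = [(0.0, 14), (0.0, 16), (2.5, 9), (0.0, 15), (4.0, 11)]

# Split the accident counts by whether any snow was recorded that day.
snowy = [accidents for snow, accidents in days if snow > 0]
other = [accidents for snow, accidents in days if snow == 0]

mean_snowy = mean(snowy)
mean_other = mean(other)
print("snowy days:", mean_snowy, "other days:", mean_other)
```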