Data visualisations corresponding to the current Covid19 outbreak in South Africa

SimonRosen173/Covid19SAData

Covid19 SA Data

Website showing data visualisations of the current coronavirus outbreak in South Africa.

Note: This repo has grown very large because of frequent commits of large files, and a lot of corruption has occurred. Its contents will most likely be moved to a new repo as part of a major rework of the site. The rework includes changing how the site is updated, moving the site off GitHub Pages and switching from Plotly.py to Plotly.js.


Note: I recreate the graphs manually and check that the data is correct, so the website may take some time to update after data is officially released.

Parts of Repo

The repo currently consists of the following parts:

  • Jekyll-related files that control the styling of the site.
    • These are the folders whose names start with '_'.
  • Jupyter notebooks to preprocess data, calculate predictions and generate curves.
    • Data_Preprocessing.ipynb
    • Predictions.ipynb
    • Visualisations.ipynb
    • Other notebooks are currently not used.
  • Graphs in HTML file form.
  • template_renderer.py
    • Custom template renderer that allows variables to be used in Markdown files.
    • Markdown files whose names end in '_template' are used by the template_renderer.
  • Markdown files
    • Used by Jekyll as templates to render Markdown to HTML. These are the files that compile to the website page/s.
  • Data folder
    • CSV files that are used to generate graphs and data for the site.
  • NICD updates
    • NICD daily updates in image form, taken from their Twitter account.
    • Note that this folder does not contain all the updates.
  • data_from_img.py
    • Python code that uses computer vision (pytesseract & OpenCV), along with the associated image preprocessing, to automatically extract data from the NICD Twitter update infographics.
    • This is an overengineered solution to a simple problem (getting the latest data), and its output is not 100% accurate. Nonetheless, it is a fun and informative introduction to computer vision in a real-world scenario.
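
The idea behind template_renderer.py (variables in Markdown files) can be illustrated with a minimal sketch. This is not the repo's actual implementation: the `$name` placeholder syntax, the `render_template` function and the sample values are all assumptions made for illustration, using only Python's standard `string.Template`.

```python
from string import Template


def render_template(text, variables):
    """Fill $name placeholders in a Markdown template.

    Hypothetical helper: the real template_renderer.py may use a
    different placeholder syntax. Raises KeyError if a placeholder
    has no matching entry in `variables`.
    """
    return Template(text).substitute(variables)


# Made-up template text and values, for illustration only.
template_text = "Confirmed cases: $confirmed (updated $date)"
rendered = render_template(template_text, {"confirmed": 1234, "date": "2020-04-01"})
print(rendered)
```

In a setup like this, the rendered text would be written to the plain Markdown file that Jekyll then compiles, while the '_template' file stays untouched.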

Upcoming Features

Backend

  • Consolidate the Jupyter notebook code for data preprocessing and visualisations into a single callable Python file.
    • This will make updating data on the site much easier and could potentially be triggered by updates to the DSFSI research group repo.
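
A rough sketch of what such a single callable file could look like is given here. The function names (`preprocess`, `build_trace`, `update_site`), the CSV column names and the sample numbers are all hypothetical and do not come from the notebooks; only the standard library is used.

```python
import csv
import io


def preprocess(csv_text):
    """Parse cumulative confirmed totals and derive daily new cases."""
    rows = []
    previous_total = 0
    for record in csv.DictReader(io.StringIO(csv_text)):
        total = int(record["confirmed"])
        rows.append({
            "date": record["date"],
            "confirmed": total,
            "new_cases": total - previous_total,
        })
        previous_total = total
    return rows


def build_trace(rows):
    """Shape the preprocessed rows into a Plotly-style trace dictionary."""
    return {
        "type": "scatter",
        "name": "Confirmed",
        "x": [row["date"] for row in rows],
        "y": [row["confirmed"] for row in rows],
    }


def update_site(csv_text):
    """Single entry point: preprocess the data, then build the graph data."""
    return build_trace(preprocess(csv_text))


# Made-up sample data, for illustration only.
sample_csv = "date,confirmed\n2020-03-05,10\n2020-03-06,25\n2020-03-07,40\n"
trace = update_site(sample_csv)
print(trace["x"], trace["y"])
```

Keeping one entry point such as `update_site` is what would let an external trigger (for example, a change in an upstream data repo) run the whole update in one call.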

Misc

  • Jupyter notebook showing the process of preprocessing an image and extracting text data from the NICD infographics.

Front End

Site Layout

  • Split the site into multiple pages for better usability and smaller download sizes, e.g. one overview page for all provinces and then a page for each province.
  • Data per district for each province. This will start with Gauteng, then Western Cape, then KwaZulu-Natal; the order after that is undecided.

Graphs/Charts

  • Make better use of Plotly's hide-trace functionality.
    • E.g. instead of separate graphs for tests and confirmed cases over time, use a single graph containing both, and encourage users to click on the legend to hide the traces they don't wish to see.
  • Add active cases to confirmed cases and tests graphs.
  • Deaths per province over time graph.
  • Recoveries per province pie chart.
  • Replace totals per province charts with choropleth maps. (Potentially)
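
The combined-graph idea above can be sketched as a Plotly figure specification built from plain dictionaries. The `visible: "legendonly"` trace attribute is a standard Plotly setting that starts a trace hidden until its legend entry is clicked. Everything else here is an assumption for illustration: the function names, the made-up numbers, and the definition of active cases as confirmed minus recoveries minus deaths.

```python
import json


def active_cases(confirmed, recovered, deaths):
    """Assumed definition: active = confirmed - recovered - deaths."""
    return [c - r - d for c, r, d in zip(confirmed, recovered, deaths)]


def combined_figure(dates, confirmed, tests, recovered, deaths):
    """Build one figure spec holding confirmed, active and tests traces."""
    def trace(name, values, visible=True):
        return {
            "type": "scatter",
            "mode": "lines",
            "name": name,
            "x": dates,
            "y": values,
            # "legendonly" hides the trace until its legend entry is clicked.
            "visible": visible,
        }

    return {
        "data": [
            trace("Confirmed", confirmed),
            trace("Active", active_cases(confirmed, recovered, deaths)),
            trace("Tests", tests, visible="legendonly"),
        ],
        "layout": {"title": {"text": "Tests, confirmed and active cases over time"}},
    }


# Made-up numbers, for illustration only.
figure = combined_figure(
    dates=["2020-03-05", "2020-03-06", "2020-03-07"],
    confirmed=[10, 25, 40],
    tests=[100, 250, 400],
    recovered=[0, 2, 5],
    deaths=[0, 1, 1],
)
print(json.dumps(figure, indent=2))
```

Because the specification is plain JSON, the same structure should carry over from Plotly.py to Plotly.js with little change.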

Acknowledgements

Libraries Used

Data

Original data taken from the following sources:

License

Data License: CC BY-NC-SA 4.0
