About this Python workshop

Goals

This 4-day workshop introduces participants to the Python language. It is designed to provide the solid foundation needed for data analysis and visualization in data science. No previous experience is required, but some basic programming or data science experience is helpful.

I will lean heavily on the book Python for Data Analysis (as well as the Python Data Science Handbook).

The first day focuses on the fundamentals of data types and control flow structures, while the ultimate goal of the course is to introduce you to statistical thinking, data literacy, and modeling.

DataCamp

DataCamp is a good resource for learning coding and data analysis skills. By completing the DataCamp courses listed below, we can significantly shorten the time we spend on basics and open up more space for data science concepts.

If you have extra time:

Data Manipulation with pandas

And much more advanced and totally optional:

Coding Environment

The most convenient environment for you to code in is probably Google Colab, for which you will probably need a Gmail account. It is worth watching the 2-minute intro video. If you prefer a real IDE, I would recommend Visual Studio or PyCharm. (I will not be able to help much with the latter, though.)

Agenda

Day 1: Basic Python programming

basic data types: lists, tuples, dictionaries, strings
control structures (for, if else, while)
functions
numpy arrays: slicing and subsetting, axis
Probabilistic Simulations
basic plots
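
As a small taste of the Day 1 topics, the sketch below combines a dictionary, a list, a `for` loop, an `if`/`else` branch, a function, and a simple probabilistic simulation. It uses only the standard library; the function name `simulate_dice_sums`, the seed, and the number of rolls are illustrative assumptions, not workshop material.

```python
import random


def simulate_dice_sums(n_rolls, seed=0):
    """Roll two fair dice n_rolls times and count how often each sum occurs."""
    rng = random.Random(seed)  # local generator so the run is reproducible
    counts = {}                # dictionary: sum of the two dice -> occurrences
    for _ in range(n_rolls):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total in counts:
            counts[total] += 1
        else:
            counts[total] = 1
    return counts


n_rolls = 10_000  # assumed sample size, chosen only for illustration
counts = simulate_dice_sums(n_rolls)

# List of (sum, relative frequency) tuples, sorted by the sum
freqs = [(s, counts[s] / n_rolls) for s in sorted(counts)]
for s, f in freqs:
    print(s, round(f, 3))
```

The relative frequency of a sum of 7 should come out close to 1/6, which is the kind of check a simulation makes easy.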

Day 2: pandas and visualization

pandas Data Frames: slicing and subsetting
Counting and Summary Statistics
Handling Files
Grouped Operations
plotting with pandas
Contingency Tables as models
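
The Day 2 topics can be previewed with a few lines of pandas. This is a minimal sketch on a tiny made-up data set; the column names `group`, `outcome`, and `value` and all the numbers are hypothetical and serve only to show subsetting, counting, grouped operations, and a contingency table.

```python
import pandas as pd

# Tiny made-up data set (hypothetical values, for illustration only)
df = pd.DataFrame({
    "group":   ["a", "a", "b", "b", "b", "a"],
    "outcome": ["yes", "no", "yes", "yes", "no", "yes"],
    "value":   [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Slicing and subsetting: rows where value > 2, keeping two columns
subset = df.loc[df["value"] > 2, ["group", "value"]]

# Counting and summary statistics
group_counts = df["group"].value_counts()
mean_value = df["value"].mean()

# Grouped operation: mean of value within each group
group_means = df.groupby("group")["value"].mean()

# Contingency table of group against outcome
table = pd.crosstab(df["group"], df["outcome"])

print(subset)
print(group_means)
print(table)
```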

Day 3: Statistical Modeling

A/B Testing and sampling distributions
Hypothesis Testing
- parametric
- permutation
- the bootstrap
regression
- simple and multiple
- logistic
- categorical variables and interactions
- regularization
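
To make the permutation and bootstrap ideas from Day 3 concrete, here is a minimal sketch using NumPy on simulated A/B data. The group means, sample sizes, seed, and number of resamples are assumptions picked for illustration; the workshop's own examples may differ.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated A/B data (hypothetical): group B has a slightly higher mean
a = rng.normal(loc=10.0, scale=2.0, size=200)
b = rng.normal(loc=10.5, scale=2.0, size=200)
observed = b.mean() - a.mean()

n_resamples = 2000  # assumed number of resamples

# Permutation test: shuffle the pooled data, split it again, recompute the difference
pooled = np.concatenate([a, b])
perm_diffs = np.empty(n_resamples)
for i in range(n_resamples):
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[len(a):].mean() - shuffled[:len(a)].mean()

# Two-sided p-value: share of shuffled differences at least as extreme as observed
p_value = np.mean(np.abs(perm_diffs) >= abs(observed))

# Bootstrap: resample each group with replacement, 95% percentile interval
boot_diffs = np.empty(n_resamples)
for i in range(n_resamples):
    boot_a = rng.choice(a, size=len(a), replace=True)
    boot_b = rng.choice(b, size=len(b), replace=True)
    boot_diffs[i] = boot_b.mean() - boot_a.mean()
ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])

print("observed difference:", round(observed, 3))
print("permutation p-value:", p_value)
print("bootstrap 95% interval:", round(ci_low, 3), round(ci_high, 3))
```

The permutation test answers "how unusual is this difference if the labels do not matter?", while the bootstrap interval describes the uncertainty of the estimated difference itself.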

Day 4: Machine Learning

Basic ML tools
- Cross Validation
- sklearn
- Data Cleaning
Classification and Regression Trees
Random Forests and Boosting
Explainable ML
- Partial dependence plots
- SHAP values
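
As a rough preview of Day 4, the sketch below uses scikit-learn to compare a single classification tree with a random forest via 5-fold cross-validation. The synthetic data from `make_classification` and all hyperparameters (`max_depth=3`, `n_estimators=100`, the seeds) are assumptions for illustration, not the workshop's actual data or settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data (hypothetical, generated for this example)
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

models = {
    "tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 5-fold cross-validation; the default score for classifiers is accuracy
cv_scores = {}
for name, model in models.items():
    cv_scores[name] = cross_val_score(model, X, y, cv=5)
    print(name, round(float(cv_scores[name].mean()), 3))
```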

Lecturer

Prof. Dr. Markus Loecher

Professor for Mathematics and Statistics

Berlin School of Economics and Law

https://www.linkedin.com/in/loecher/

https://www.researchgate.net/profile/Markus-Loecher

my blog
