Big Data with PySpark

Progress

Introduction to PySpark

Getting to know PySpark

Link to Notebooks

  • What is Spark, anyway?
  • Using Spark in Python
  • Examining The SparkContext
  • Using DataFrames
  • Creating a SparkSession
  • Viewing tables
  • Are you query-ious?
  • Pandafy a Spark DataFrame
  • Put some Spark in your data
  • Dropping the middle man
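
Taken together, these exercises amount to a first Spark session. Here is a minimal sketch of that flow, assuming a local Spark installation; the file `flights.csv` and its columns are illustrative assumptions, not the course's exact data:

```python
from pyspark.sql import SparkSession

# Create (or retrieve) a SparkSession -- the entry point to DataFrames
spark = SparkSession.builder.appName("getting-to-know-pyspark").getOrCreate()

# Load a file and register it as a temporary view ("flights.csv" is hypothetical)
flights = spark.read.csv("flights.csv", header=True, inferSchema=True)
flights.createOrReplaceTempView("flights")

# Viewing tables: list everything registered in the catalog
print(spark.catalog.listTables())

# Run a SQL query against the view, then "pandafy" the result
query_df = spark.sql("SELECT origin, dest, air_time FROM flights LIMIT 10")
pandas_df = query_df.toPandas()
print(pandas_df.head())

# "Put some Spark in your data": go the other way, pandas -> Spark
spark_df = spark.createDataFrame(pandas_df)
```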

Manipulating data

Link to Notebooks

  • Creating columns
  • SQL in a nutshell
  • SQL in a nutshell (2)
  • Filtering Data
  • Selecting
  • Selecting II
  • Aggregating
  • Aggregating II
  • Grouping and Aggregating I
  • Grouping and Aggregating II
  • Joining
  • Joining II
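
The core DataFrame verbs listed above can be sketched roughly as follows. Column and file names (`flights.csv`, `airports.csv`, `air_time`, `distance`, `faa`) are assumptions made for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
flights = spark.read.csv("flights.csv", header=True, inferSchema=True)  # hypothetical file

# Creating columns: derive duration in hours from a minutes column
flights = flights.withColumn("duration_hrs", col("air_time") / 60)

# Filtering and selecting
long_flights = flights.filter(col("distance") > 1000)
selected = long_flights.select("origin", "dest", "duration_hrs")

# Grouping and aggregating
by_origin = flights.groupBy("origin").agg({"air_time": "avg"})

# Joining against a second (hypothetical) table of airport metadata
airports = spark.read.csv("airports.csv", header=True, inferSchema=True)
joined = flights.join(airports, flights.dest == airports.faa, how="leftouter")
```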

Getting started with machine learning pipelines

Link to Notebooks

  • Machine Learning Pipelines
  • Join the DataFrames
  • Data types
  • String to integer
  • Create a new column
  • Making a Boolean
  • Strings and factors
  • Carrier
  • Destination
  • Assemble a vector
  • Create the pipeline
  • Test vs Train
  • Transform the data
  • Split the data
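
A minimal sketch of such a pipeline, under the assumption of a flights-style dataset with `carrier`, `month`, and `air_time` columns (all names hypothetical):

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import OneHotEncoder, StringIndexer, VectorAssembler

spark = SparkSession.builder.getOrCreate()
model_data = spark.read.csv("flights.csv", header=True, inferSchema=True)  # hypothetical

# String to integer: index the carrier column, then one-hot encode it
carr_indexer = StringIndexer(inputCol="carrier", outputCol="carrier_index")
carr_encoder = OneHotEncoder(inputCols=["carrier_index"], outputCols=["carrier_fact"])

# Assemble a vector of features for the model
vec_assembler = VectorAssembler(
    inputCols=["month", "air_time", "carrier_fact"], outputCol="features"
)

# Create the pipeline, transform the data, then split into train vs. test;
# splitting after the transform keeps the string indexing consistent across sets
flights_pipe = Pipeline(stages=[carr_indexer, carr_encoder, vec_assembler])
piped_data = flights_pipe.fit(model_data).transform(model_data)
training, test = piped_data.randomSplit([0.6, 0.4])
```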

Model tuning and selection

Link to Notebooks

  • What is logistic regression?
  • Create the modeler
  • Cross validation
  • Create the evaluator
  • Make a grid
  • Make the validator
  • Fit the model(s)
  • Evaluating binary classifiers
  • Evaluate the model
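
A hedged sketch of that tuning workflow; `training` and `test` stand for DataFrames with `features` and `label` columns, such as the split produced in the previous section:

```python
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

# Create the modeler
lr = LogisticRegression(featuresCol="features", labelCol="label")

# Create the evaluator: area under the ROC curve for a binary classifier
evaluator = BinaryClassificationEvaluator(metricName="areaUnderROC")

# Make a grid of hyperparameters to search over (values are illustrative)
grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.0, 0.01, 0.1])
        .addGrid(lr.elasticNetParam, [0.0, 1.0])
        .build())

# Make the validator and fit the model(s) with k-fold cross validation
cv = CrossValidator(estimator=lr, estimatorParamMaps=grid, evaluator=evaluator)
best_lr = cv.fit(training).bestModel

# Evaluate the best model on held-out data
print(evaluator.evaluate(best_lr.transform(test)))
```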

Big Data Fundamentals with PySpark

Introduction to Big Data analysis with Spark

Link to Notebooks

  • What is Big Data?
  • The 3 V's of Big Data
  • PySpark: Spark with Python
  • Understanding SparkContext
  • Interactive Use of PySpark
  • Loading data in PySpark shell
  • Review of functional programming in Python
  • Use of lambda() with map()
  • Use of lambda() with filter()
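
The chapter boils down to the SparkContext as the entry point plus Python's functional built-ins. A minimal sketch (the file path is a hypothetical example):

```python
from pyspark import SparkContext

# In the PySpark shell an entry point `sc` already exists; standalone, build one
sc = SparkContext.getOrCreate()
print(sc.version)  # understanding the SparkContext

# Loading data in the PySpark shell ("README.md" is a hypothetical path)
lines = sc.textFile("README.md")

# Review of functional programming: lambda with map() and filter()
squares = map(lambda x: x ** 2, [1, 2, 3, 4])
evens = filter(lambda x: x % 2 == 0, [1, 2, 3, 4])
print(list(squares), list(evens))
```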

Programming in PySpark RDDs

Link to Notebooks

  • Abstracting Data with RDDs
  • RDDs from Parallelized collections
  • RDDs from External Datasets
  • Partitions in your data
  • Basic RDD Transformations and Actions
  • Map and Collect
  • Filter and Count
  • Pair RDDs in PySpark
  • ReduceByKey and Collect
  • SortByKey and Collect
  • Advanced RDD Actions
  • CountingByKeys
  • Create a base RDD and transform it
  • Remove stop words and reduce the dataset
  • Print word frequencies
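
These exercises map onto the RDD API roughly as follows; the text file path and stop-word list are assumptions for illustration:

```python
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

# RDDs from parallelized collections vs. external datasets
nums = sc.parallelize([1, 2, 3, 4], numSlices=2)  # 2 partitions
text = sc.textFile("some_text_file.txt")          # hypothetical path
print(nums.getNumPartitions())

# Map and collect, filter and count
print(nums.map(lambda x: x * x).collect())
print(nums.filter(lambda x: x % 2 == 0).count())

# Pair RDDs: reduceByKey, sortByKey, countByKey
pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])
print(pairs.reduceByKey(lambda x, y: x + y).collect())
print(pairs.sortByKey(ascending=False).collect())
print(dict(pairs.countByKey()))

# Word count: transform a base RDD, remove stop words, print frequencies
stop_words = {"the", "a", "an"}  # illustrative list
counts = (text.flatMap(lambda line: line.lower().split())
              .filter(lambda w: w not in stop_words)
              .map(lambda w: (w, 1))
              .reduceByKey(lambda x, y: x + y))
for word, count in counts.take(10):
    print(word, count)
```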

PySpark SQL & DataFrames

Link to Notebooks

  • Abstracting Data with DataFrames
  • RDD to DataFrame
  • Loading CSV into DataFrame
  • Operating on DataFrames in PySpark
  • Inspecting data in PySpark DataFrame
  • PySpark DataFrame subsetting and cleaning
  • Filtering your DataFrame
  • Interacting with DataFrames using PySpark SQL
  • Running SQL Queries Programmatically
  • SQL queries for filtering Table
  • Data Visualization in PySpark using DataFrames
  • PySpark DataFrame visualization
  • Part 1: Create a DataFrame from CSV file
  • Part 2: SQL Queries on DataFrame
  • Part 3: Data visualization
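
A compact sketch of the DataFrame and SQL workflow above; `people.csv` and its columns are hypothetical, and the plotting step assumes pandas and matplotlib are installed:

```python
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# RDD to DataFrame
rdd = sc.parallelize([Row(name="Alice", age=34), Row(name="Bob", age=45)])
people_df = spark.createDataFrame(rdd)

# Loading a CSV into a DataFrame (hypothetical path)
df = spark.read.csv("people.csv", header=True, inferSchema=True)

# Inspecting, subsetting, and cleaning
df.printSchema()
clean_df = df.select("name", "age").dropDuplicates().filter(df.age > 21)

# Running SQL queries programmatically
clean_df.createOrReplaceTempView("people")
adults = spark.sql("SELECT name FROM people WHERE age > 30")

# Visualization typically routes through pandas, e.g. a histogram of ages
clean_df.toPandas()["age"].plot(kind="hist")
```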

Machine Learning with PySpark MLlib

Link to Notebooks

  • Overview of PySpark MLlib
  • PySpark ML libraries
  • PySpark MLlib algorithms
  • Collaborative filtering
  • Loading the MovieLens dataset into RDDs
  • Model training and predictions
  • Model evaluation using MSE
  • Classification
  • Loading spam and non-spam data
  • Feature hashing and LabeledPoint
  • Logistic Regression model training
  • Clustering
  • Loading and parsing the 5000 points data
  • K-means training
  • Visualizing clusters
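
The three MLlib families covered here (recommendation, classification, clustering) use the RDD-based `pyspark.mllib` API. A minimal sketch, with tiny inline datasets standing in for the course's files:

```python
from pyspark import SparkContext
from pyspark.mllib.classification import LogisticRegressionWithLBFGS
from pyspark.mllib.clustering import KMeans
from pyspark.mllib.feature import HashingTF
from pyspark.mllib.recommendation import ALS, Rating
from pyspark.mllib.regression import LabeledPoint

sc = SparkContext.getOrCreate()

# Collaborative filtering: ALS on (user, product, rating) tuples
ratings = sc.parallelize([Rating(1, 1, 5.0), Rating(1, 2, 1.0), Rating(2, 1, 4.0)])
als_model = ALS.train(ratings, rank=10, iterations=10)

# Classification: hash text into features, label it, train logistic regression
tf = HashingTF(numFeatures=200)
spam = sc.parallelize(["cheap pills now"]).map(lambda s: tf.transform(s.split()))
ham = sc.parallelize(["meeting at noon"]).map(lambda s: tf.transform(s.split()))
samples = spam.map(lambda f: LabeledPoint(1, f)).union(
    ham.map(lambda f: LabeledPoint(0, f)))
lr_model = LogisticRegressionWithLBFGS.train(samples)

# Clustering: K-means on 2-D points
points = sc.parallelize([[1.0, 1.0], [9.0, 8.0], [8.0, 9.0]])
kmeans_model = KMeans.train(points, k=2, maxIterations=10)
print(kmeans_model.clusterCenters)
```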

Cleaning Data with PySpark

DataFrame details

Link to Notebooks

  • A review of DataFrame fundamentals and the importance of data cleaning.
  • Intro to data cleaning with Apache Spark
  • Data cleaning review
  • Immutability and lazy processing
  • Immutability review
  • Using lazy processing
  • Understanding Parquet
  • Saving a DataFrame in Parquet format
  • SQL and Parquet
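
Lazy processing and Parquet in one short sketch; the file names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Transformations are lazy: nothing is read or computed until an action runs
df = spark.read.csv("departures.csv", header=True)       # hypothetical file
df = df.withColumnRenamed("Flight Number", "flight_id")  # still lazy
df.show(5)                                               # action triggers work

# Saving a DataFrame in Parquet format, then querying it with SQL
df.write.parquet("departures.parquet", mode="overwrite")
parquet_df = spark.read.parquet("departures.parquet")
parquet_df.createOrReplaceTempView("flights")
spark.sql("SELECT COUNT(*) FROM flights").show()
```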

Manipulating DataFrames in the real world

Link to Notebooks

  • DataFrame column operations
  • Filtering column content with Python
  • Filtering Question #1
  • Filtering Question #2
  • Modifying DataFrame columns
  • Conditional DataFrame column operations
  • when() example
  • When / Otherwise
  • User defined functions
  • Understanding user defined functions
  • Using user defined functions in Spark
  • Partitioning and lazy processing
  • Adding an ID Field
  • IDs with different partitions
  • More ID tricks
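
A sketch of these column operations on an assumed voter dataset (`voters.csv`, `name`, and `title` are hypothetical):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()
voter_df = spark.read.csv("voters.csv", header=True)  # hypothetical file

# Filtering and modifying column content
voter_df = voter_df.filter(~F.col("name").contains("_"))
voter_df = voter_df.withColumn("splits", F.split(F.col("name"), r"\s+"))

# Conditional column operations with when() / otherwise()
voter_df = voter_df.withColumn(
    "random_val",
    F.when(F.col("title") == "Councilmember", F.rand()).otherwise(0),
)

# User defined functions: wrap plain Python and apply it per row
def first_and_middle(names):
    return " ".join(names[:-1])

udf_first_middle = F.udf(first_and_middle, StringType())
voter_df = voter_df.withColumn("first_and_middle", udf_first_middle("splits"))

# Adding an ID field: unique but not sequential, and partition-dependent
voter_df = voter_df.withColumn("row_id", F.monotonically_increasing_id())
```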

Improving Performance

Link to Notebooks

  • Caching
  • Caching a DataFrame
  • Removing a DataFrame from cache
  • Improve import performance
  • File size optimization
  • File import performance
  • Cluster configurations
  • Reading Spark configurations
  • Writing Spark configurations
  • Performance improvements
  • Normal joins
  • Using broadcasting on Spark joins
  • Comparing broadcast vs normal joins
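
The performance levers above in one sketch; the two CSV files are hypothetical and are assumed to share a `dest` column for the join:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

# Reading and writing Spark configurations
print(spark.conf.get("spark.sql.shuffle.partitions"))
spark.conf.set("spark.sql.shuffle.partitions", "100")

# Caching a DataFrame, then removing it from cache when done
flights = spark.read.csv("flights.csv", header=True)  # hypothetical file
flights.cache()
print(flights.count(), flights.is_cached)
flights.unpersist()

# Broadcast join: ship the small table to every executor instead of shuffling
airports = spark.read.csv("airports.csv", header=True)  # hypothetical file
normal_join = flights.join(airports, on="dest")
broadcast_join = flights.join(broadcast(airports), on="dest")
broadcast_join.explain()  # compare the physical plans of the two joins
```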

Complex processing and data pipelines

Link to Notebooks

  • Introduction to data pipelines
  • Quick pipeline
  • Pipeline data issue
  • Data handling techniques
  • Removing commented lines
  • Removing invalid rows
  • Splitting into columns
  • Further parsing
  • Data validation
  • Validate rows via join
  • Examining invalid rows
  • Final analysis and delivery
  • Dog parsing
  • Per image count
  • Percentage dog pixels
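
A hedged sketch of the cleanup-and-validate pattern these exercises follow: read raw lines while skipping comments, split into columns, and keep only rows that join against a known-good reference. All file names, separators, and column positions here are assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Quick pipeline: read raw lines into one column, dropping commented lines
# (reading with a separator that never occurs keeps each line in _c0)
raw = spark.read.csv("annotations.csv.gz", sep="\t", comment="#")  # hypothetical

# Removing invalid rows: require a minimum number of comma-separated fields
df = raw.withColumn("colcount", F.size(F.split(F.col("_c0"), ",")))
valid = df.filter(F.col("colcount") >= 5)

# Splitting into columns for further parsing
split_cols = valid.withColumn("fields", F.split(F.col("_c0"), ","))

# Data validation: keep only rows whose first field appears in a reference list
valid_folders = spark.read.csv("valid_folders.txt")  # hypothetical file
joined = split_cols.join(
    valid_folders, split_cols.fields.getItem(0) == valid_folders._c0)
invalid_count = split_cols.count() - joined.count()
print("rows dropped by validation:", invalid_count)
```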

About

Notebooks and materials for the Big Data with PySpark skill track from DataCamp (primarily), plus books and cheat sheets.
