GitHub - polla-fattah/R4NTR at 34e3e235a362751346cdb040aa3cce980857757b

forked from jasongrimes/jekyll-chapterbook

R for Non-Technical PhD Students (R4PhDs) is a guide that helps researchers with limited technical backgrounds use R for data analysis and statistical computing.

polla.dev/R4PhDs/

Apache-2.0 license

Repository contents:

_chapters
_includes
_layouts
_pages
_posts
assets
.gitignore
404.html
Gemfile
Gemfile.lock
LICENSE
README.md
_config.yml
sitemap.txt

layout: home
title: R for non-technical PhD researchers
permalink: /

Book Outline:

Part I: Introduction to R Programming

Chapter 1: Introduction to R and RStudio

Installing R and RStudio
Navigating the RStudio interface
Basic R syntax: Variables, data types, operators, and functions
Writing and running R scripts
Key libraries for data science in R

Chapter 2: Data Structures and Basic Operations in R

Vectors, Matrices, Lists, and Data Frames
Indexing, subsetting, and manipulating data
Importing and exporting data (CSV, Excel, SPSS, etc.)
Basic exploratory analysis (summary statistics, structure, and head/tail functions)

Chapter 3: Data Manipulation with dplyr and tidyr

Filtering, arranging, mutating, and summarizing data
Grouped operations and pipelines using %>%
Reshaping data with pivot_longer() and pivot_wider()
Joining datasets: inner, full (outer), left, and right joins
Hands-on data cleaning exercise

Chapter 4: Data Visualization with ggplot2

Basic plots: Histograms, scatter plots, bar charts, and box plots
Advanced visualizations: Heatmaps, faceted plots, and density plots
Customizing plots with themes, annotations, and labels
Interactive visualizations with plotly
Case study: Visualizing relationships in a real dataset

Part II: Expanded Statistics

Chapter 5: Descriptive Statistics and Exploratory Data Analysis (EDA)

Measures of central tendency and variability
Identifying outliers and missing data
Visualizing distributions and relationships (e.g., correlation plots)
Preparing datasets for statistical analysis

Chapter 6: Hypothesis Testing and Statistical Inference

Introduction to hypothesis testing
One-sample and two-sample t-tests
Paired t-tests and their applications
Chi-square tests for independence
Non-parametric tests: Wilcoxon signed-rank and Mann-Whitney U (Wilcoxon rank-sum) tests

Chapter 7: Advanced Statistical Methods

Analysis of Variance (ANOVA): One-way and two-way
Post hoc tests (e.g., Tukey’s HSD)
Simple and multiple linear regression analysis
Logistic regression for binary outcomes
Case study: Predicting outcomes using regression models

Chapter 8: Multivariate Statistical Techniques

Principal Component Analysis (PCA) for dimensionality reduction
Factor analysis and interpretation of factors
Cluster analysis: k-means and hierarchical clustering
Case study: Clustering research observations

Part III: Machine Learning

Chapter 9: Introduction to Machine Learning in R

Overview of supervised and unsupervised learning
Data preprocessing: Scaling, normalization, and feature engineering
Splitting datasets into training, testing, and validation sets
Implementing cross-validation and hyperparameter tuning

Chapter 10: Supervised Learning - Classification Models

Decision trees and random forests
k-Nearest Neighbors (k-NN)
Support Vector Machines (SVM)
Performance metrics: Confusion matrix, accuracy, precision, recall, F1 score
Case study: Classifying research observations

Chapter 11: Supervised Learning - Regression Models

Advanced regression techniques: Ridge, Lasso, and Elastic Net
Regression trees and boosting methods (e.g., XGBoost, LightGBM)
Hands-on project: Regression modeling for real-world research data

Chapter 12: Unsupervised Learning - Clustering

k-means and hierarchical clustering revisited
Density-based clustering (DBSCAN) and Gaussian Mixture Models
Evaluating clustering performance
Case study: Discovering patterns in research datasets

Chapter 13: Advanced Machine Learning Techniques

Ensemble learning: Bagging and Boosting
Neural network basics in R (e.g., keras and tensorflow packages)
Time series forecasting using ARIMA and Prophet
Hands-on project: Applying advanced ML techniques

Part IV: Reproducible Research and Project Work

Chapter 14: Reproducible Research with R Markdown and Shiny

Creating dynamic reports with R Markdown
Exporting to PDF, Word, and HTML
Introduction to Shiny apps for interactive research tools
Best practices for reproducible research workflows

Chapter 15: Final Project Presentation and Advanced Applications

Students present their final projects, integrating statistical or machine learning methods
Feedback and discussion on projects
Exploring advanced R tools for domain-specific applications
Course summary and future learning resources

Languages

JavaScript 66.7%
HTML 16.1%
CSS 15.7%
Smarty 1.4%
Ruby 0.1%