
R for Non-Technical PhD Students (R4PhDs) is a comprehensive guide that helps researchers with limited technical backgrounds use R for data analysis and statistical computing.


polla-fattah/R4NTR


layout: home
title: R for non-technical PhD researchers
permalink: /

Book Outline:

Part I: Introduction to R Programming

Chapter 1: Introduction to R and RStudio

  • Installing R and RStudio
  • Navigating the RStudio interface
  • Basic R syntax: Variables, data types, operators, and functions
  • Writing and running R scripts
  • Key libraries for data science in R

Chapter 2: Data Structures and Basic Operations in R

  • Vectors, Matrices, Lists, and Data Frames
  • Indexing, subsetting, and manipulating data
  • Importing and exporting data (CSV, Excel, SPSS, etc.)
  • Basic exploratory analysis (summary statistics, structure, and head/tail functions)

Chapter 3: Data Manipulation with dplyr and tidyr

  • Filtering, arranging, mutating, and summarizing data
  • Grouped operations and pipelines using %>%
  • Reshaping data with pivot_longer() and pivot_wider()
  • Joining datasets: inner, full (outer), left, and right joins
  • Hands-on data cleaning exercise

Chapter 4: Data Visualization with ggplot2

  • Basic plots: Histograms, scatter plots, bar charts, and box plots
  • Advanced visualizations: Heatmaps, faceted plots, and density plots
  • Customizing plots with themes, annotations, and labels
  • Interactive visualizations with plotly
  • Case study: Visualizing relationships in a real dataset

Part II: Expanded Statistics

Chapter 5: Descriptive Statistics and Exploratory Data Analysis (EDA)

  • Measures of central tendency and variability
  • Identifying outliers and missing data
  • Visualizing distributions and relationships (e.g., correlation plots)
  • Preparing datasets for statistical analysis

Chapter 6: Hypothesis Testing and Statistical Inference

  • Introduction to hypothesis testing
  • One-sample and two-sample t-tests
  • Paired t-tests and their applications
  • Chi-square tests for independence
  • Non-parametric tests: Wilcoxon signed-rank and Mann-Whitney U (Wilcoxon rank-sum) tests

Chapter 7: Advanced Statistical Methods

  • Analysis of Variance (ANOVA): One-way and two-way
  • Post hoc tests (e.g., Tukey’s HSD)
  • Simple and multiple linear regression analysis
  • Logistic regression for binary outcomes
  • Case study: Predicting outcomes using regression models

Chapter 8: Multivariate Statistical Techniques

  • Principal Component Analysis (PCA) for dimensionality reduction
  • Factor analysis and interpretation of factors
  • Cluster analysis: k-means and hierarchical clustering
  • Case study: Clustering research observations

Part III: Machine Learning

Chapter 9: Introduction to Machine Learning in R

  • Overview of supervised and unsupervised learning
  • Data preprocessing: Scaling, normalization, and feature engineering
  • Splitting datasets into training, validation, and test sets
  • Implementing cross-validation and hyperparameter tuning

Chapter 10: Supervised Learning - Classification Models

  • Decision trees and random forests
  • k-Nearest Neighbors (k-NN)
  • Support Vector Machines (SVM)
  • Performance metrics: Confusion matrix, accuracy, precision, recall, F1 score
  • Case study: Classifying research observations

Chapter 11: Supervised Learning - Regression Models

  • Advanced regression techniques: Ridge, Lasso, and Elastic Net
  • Regression trees and boosting methods (e.g., XGBoost, LightGBM)
  • Hands-on project: Regression modeling for real-world research data

Chapter 12: Unsupervised Learning - Clustering

  • k-means and hierarchical clustering revisited
  • Density-based clustering (DBSCAN) and Gaussian Mixture Models
  • Evaluating clustering performance
  • Case study: Discovering patterns in research datasets

Chapter 13: Advanced Machine Learning Techniques

  • Ensemble learning: Bagging and Boosting
  • Neural network basics in R (e.g., the keras and tensorflow packages)
  • Time series forecasting using ARIMA and Prophet
  • Hands-on project: Applying advanced ML techniques

Part IV: Reproducible Research and Project Work

Chapter 14: Reproducible Research with R Markdown and Shiny

  • Creating dynamic reports with R Markdown
  • Exporting to PDF, Word, and HTML
  • Introduction to Shiny apps for interactive research tools
  • Best practices for reproducible research workflows

Chapter 15: Final Project Presentation and Advanced Applications

  • Students present their final projects, integrating statistical or machine learning methods
  • Feedback and discussion on projects
  • Exploring advanced R tools for domain-specific applications
  • Course summary and future learning resources
