Exploratory Data Analysis: Week 1

I'm very excited to start Exploratory Data Analysis and I hope you are too. Exploratory data analysis (EDA) is a key element of data science because it allows you to develop a rough idea of what your data look like and what kinds of questions might be answered by them. EDA is often the "fun part" of data analysis, where you get to play around with the data and, well, explore!

The course web site on Coursera is now open, and you are free to start watching lecture videos, taking the quizzes, and looking at the first programming assignment. As you browse the course web site, please make sure to read through the syllabus, which contains important information about the grading policy for quizzes and programming assignments as well as the course schedule.

The primary way to interact with me in this course is through the discussion forums. There, you can start new threads by asking questions or you can respond to other people's questions. If you have a question about any aspect of the course, I strongly suggest that you search through the discussion boards first to see if anyone has already asked that question. If you see something similar to what you want to ask, you should up-vote that question using the up-arrow button rather than asking your question separately. The more votes a question or comment gets, the more likely it is that I will see it and be able to respond quickly. Of course, if you don't see a question similar to the one you want to ask, then you should definitely start a new thread on the appropriate forum.

This week will cover the basics of analytic graphics and the base plotting system in R. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story. For each lecture video you can download a separate PDF document of the slides (the demonstration videos don't have slides associated with them).

Watching the videos on the Coursera web site is the best way to watch the lectures. However, there are alternative ways to view the lectures if that suits you. You can download the lecture video MP4 files and watch them locally on your computer.

I hope you enjoy the class. I anticipate a fun four weeks!

Roger Peng and the Data Science Team


Exploratory Data Analysis: Week 2

Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify the laying out of plots, making it a much less tedious process.

This week there is a Quiz but no Peer Assessment. Rather, you will spend the week grading your classmates' Peer Assessment submissions. For each Peer Assessment you evaluate, you will receive 2 points towards your final grade (up to a maximum of 20 points). Evaluating Peer Assessments is a great way to see the variety of ways in which others solve a problem.

Good luck and have a great week!

Roger Peng and the Data Science Team


Exploratory Data Analysis: Week 3

Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics.

This week also introduces a new Peer Assessment involving the analysis of data from the U.S. Environmental Protection Agency's National Emissions Inventory. Because this Assessment is rather involved, there is no Quiz for this week.

Good luck and have a great week!

Roger Peng and the Data Science Team


Exploratory Data Analysis: Week 4

Welcome to Week 4 of Exploratory Data Analysis. In this final week we will focus on peer grading of assignments. I have also posted two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but by providing these videos I thought I would give you a sense of how one might proceed with a specific type of dataset.

Thanks again for all of your efforts in the course; we are in the last stretch. Good luck and have a great week!

Roger Peng and the Data Science Team


Course wrap-up

Congratulations on finishing Exploratory Data Analysis!

We have finalized the grading and released the Statements of Accomplishment for the course. It might take a few hours or days for the statements to be disbursed to accounts.

A couple of other notes:

  • The course will begin again in a couple of days. If you are still interested in keeping in touch with your fellow learners, please enroll in the new course and keep the conversation going. You may also be an invaluable resource for new course takers!
  • Keep your eye on Hopkins offerings from Coursera. All announcements about future offerings will be posted at https://twitter.com/jhubiostat, http://simplystatistics.org/, and http://twitter.com/simplystats.
  • If you liked this course, please consider taking some of the other course offerings through the Data Science Track.
  • If you have cool projects you created through the course, please Tweet them to either of the Twitter accounts above so we can see them!

Thanks again for all of your efforts during the course of the class and best of luck in your career!

Jeff Leek and the Data Science Track Team