Answer:
No. Some students take Data 101 before Data 110, while others take both at the same time. It depends on your comfort level and previous experience with data science concepts.
Answer:
No, you don't need to know Python before starting this course. However, you will need to practice regularly and complete the assignments each week to build your skills progressively.
Answer:
The key Python packages you'll use and learn in this course include:
- Matplotlib: Essential for creating a wide range of static, animated, and interactive visualizations.
- NumPy: Crucial for numerical computing, handling arrays, and performing mathematical operations efficiently.
- Pandas: Vital for data manipulation and analysis, especially when working with structured data (e.g., tables and data frames).
- Seaborn: Built on top of Matplotlib, this package makes it easier to create visually appealing and informative statistical plots.
Additionally, we may also use:
- Plotly: Useful for creating interactive plots, dashboards, and web-based visualizations.
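As a rough illustration of how these packages fit together, here is a minimal sketch. It assumes NumPy, Pandas, and Matplotlib are installed (they are preinstalled in Google Colab), and the data, column names, and labels are made up for this example rather than taken from any course assignment.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data: hours studied and quiz scores for ten hypothetical students.
rng = np.random.default_rng(seed=0)
hours = np.arange(1, 11)
scores = 50 + 4 * hours + rng.normal(0, 3, size=hours.size)

# Pandas holds the data as a table with named columns.
df = pd.DataFrame({"hours": hours, "score": scores})

# Summary statistics (count, mean, std, min, quartiles, max) for one column.
summary = df["score"].describe()

# Matplotlib draws a scatter plot of the two columns.
fig, ax = plt.subplots()
ax.scatter(df["hours"], df["score"])
ax.set_xlabel("Hours studied")
ax.set_ylabel("Quiz score")
ax.set_title("Hypothetical quiz scores")
```

In a notebook the figure typically appears once the cell finishes running. Seaborn and Plotly offer higher-level ways to draw a similar plot from the same data frame.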
Answer:
We do not use a single source, but this class draws heavily on material from the following book:
Claus O. Wilke. Fundamentals of Data Visualization. O’Reilly Media, 2019.
Answer:
Short: No.
Longer: Claus Wilke himself teaches data visualization, and he does not follow the chapter order of the book. We largely follow his lecture order. His course can be found here: https://wilkelab.org/DSC385/.
Answer:
We use GitHub because it mirrors industry standards. In Data Science and related fields, proficiency in version control and collaboration tools like GitHub is essential. While Blackboard is useful for managing coursework, it doesn't prepare students for the kind of collaborative, code-centric work environments they’ll encounter in their careers. Mastering GitHub not only builds your coding skills but also enhances your ability to contribute to real-world projects, track changes, and collaborate with others. This makes it a much more valuable skill for your future career.
Answer:
Markdown is an essential tool for any data scientist or developer. It's a lightweight, easy-to-learn format for creating well-structured, readable documentation. You’ll use it extensively in GitHub to document projects, in Google Colab to write clean, organized notebooks, and even in MS Teams for seamless communication. By mastering Markdown, you enhance your ability to create professional reports, share code effectively, and collaborate with others in any technical environment. It’s a skill that not only improves your workflow but is also highly valued in the industry.
Question 8: To be successful in this course and Data Science, do I need to be an expert in any field?
Answer:
No, you don’t need to be an expert. What matters most is curiosity, a willingness to work hard, and asking for guidance when you feel lost.