Machine Learning 1: Making Your Data Useful for Analysis

Having complete and accurate data is a critical first step to being able to learn from it, but part of the complexity of data science is narrowing down which parts of the data are important. In this introductory Machine Learning workshop you will begin to understand how to narrow down the feature scope of your data so that your predictions are based on causation and not just correlation.

You do not need any prior experience with data science to attend this workshop. You are likely someone who is interested in data science, has 1-2 years of experience coding in Python or another programming language, and feels comfortable enough with Python to code in it during the workshop. You are interested in learning how to prepare your data for advanced machine learning models using Python and specific Python libraries.

You should:

  • Bring your own laptop (Windows or Mac) with an Internet browser.
  • Have coded in Python previously.

You will be using Azure Notebooks, a cloud-based Jupyter Notebooks instance. All you will need is a Microsoft Account, which only requires an email address and which you can sign up for at the event.

Engagement Expectations

This workshop is meant to be highly interactive. The instructor will lead you in two interactive teaching styles:

  1. Interactive Lecturing: The majority of content for this workshop is in a Notebook. Though the content will be introduced via PowerPoint, the rest of the workshop will consist of walking you through the Azure Notebooks. During this time, instructors will employ an interactive lecture style, where you will be asked to participate by asking questions and offering up ideas.

  2. Think, Pair, Share: For some of the more complex topics, the instructor will use the "Think, Pair, Share" method. You will be asked a question and given about 45 seconds to think quietly to yourself. During this time it is imperative that you do not discuss with others yet. Then, you will have an opportunity to discuss with the 1-2 people next to you. Make sure you share not just your answer, but also why you think it is the answer. Finally, the instructor will ask a few people to share what they discussed with their neighbors.

Notice: Various interactive cues are called out in the Notebooks. These are suggestions and are used at the instructor's discretion.

Content

The primary source of content will be relatively bare Azure Notebooks, in which the instructor will guide you through discovering the different features of Pandas, general data cleaning and manipulation, and a few more advanced machine learning techniques such as PCA, ROC curves, K-Means, and Naive Bayes.
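As a rough illustration of the kind of material described above, the sketch below joins two small tables with Pandas and then reduces the joined features to a single principal component. It is not taken from the workshop Notebooks: the table names, column names, and values are made up for the example, and PCA is computed here with NumPy's SVD as a stand-in for whatever approach the Notebooks use.

```python
import numpy as np
import pandas as pd

# Two small, made-up tables that share a key column ("id").
measurements = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "height": [1.0, 2.0, 3.0, 4.0],
    "width": [2.0, 4.0, 6.0, 8.0],
})
labels = pd.DataFrame({
    "id": [1, 2, 3, 5],
    "label": ["a", "b", "a", "b"],
})

# Inner join: keep only the rows whose id appears in both tables.
joined = measurements.merge(labels, on="id", how="inner")

# PCA by hand: center the numeric features, then take the SVD.
features = joined[["height", "width"]].to_numpy()
centered = features - features.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Fraction of the total variance captured by each principal component.
explained_variance_ratio = singular_values**2 / np.sum(singular_values**2)

# Project onto the first component to get a single reduced feature.
reduced = centered @ components[0]

print(joined)
print(explained_variance_ratio)
```

Because width is exactly twice height in this toy data, the first component captures essentially all of the variance, which is the situation where dropping the remaining components loses almost nothing.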

Today's Schedule

Timing        Topic
45 minutes    Introduction to Data Science Keynote
75 minutes    Joining Datasets
30 minutes    Lunch
75 minutes    Principal Component Analysis (PCA)
75 minutes    Machine Learning Accuracy
45 minutes    Wrap Up and Next Steps

Tips and Tricks

Azure Notebooks is still in Preview. This means that there are times when it may fail. Here are some tips to avoid losing your work:

  • Ensure your work is being saved. In the Jupyter Notebook there is always one of two messages to the right of the title of the notebook: (autosaved) or (unsaved changes). Keep an eye on this message to confirm that your work is being saved. Consider checking every 10 minutes or so.
  • Sometimes Notebooks get into a state where the kernel cannot be started. Restarting the kernel sometimes works, but often you will have to completely sign out of Azure Notebooks and then sign back in.

Reference Material

If you need a refresher on how to code in Python or work with NumPy or Pandas, we recommend you check out the materials from our other Reactor Workshops:

Further Microsoft Learn Pathways