title	description	services	documentationcenter	author	manager	editor	ms.assetid	ms.service	ms.workload	ms.tgt_pltfrm	ms.devlang	ms.topic	ms.date	ms.author
Team Data Science Process walkthroughs \| Microsoft Docs	Walkthroughs show how to combine cloud and on-premises tools and services into a workflow or pipeline to create an intelligent application.	machine-learning		bradsev	jhubbard	cgronlun	aa63d5a5-25ee-4c4b-9a4c-7553b98d7f6e	machine-learning	data-services	na	na	article	10/07/2016	bradsev

Team Data Science Process walkthroughs

Each end-to-end walkthrough listed here demonstrates the steps in the Team Data Science Process for a specific scenario. Together they illustrate how to combine cloud and on-premises tools and services into a workflow or pipeline to create an intelligent application.

Use SQL Data Warehouse

The Team Data Science Process in action: using SQL Data Warehouse walkthrough shows you how to build and deploy machine learning classification and regression models using SQL Data Warehouse (SQL DW) and a publicly available NYC taxi trip and fare dataset.

Use SQL Server

The Team Data Science Process in action: using SQL Server walkthrough shows you how to build and deploy machine learning classification and regression models using SQL Server and a publicly available NYC taxi trip and fare dataset.

Use HDInsight Hadoop clusters

The Team Data Science Process in action: using HDInsight Hadoop clusters walkthrough uses an Azure HDInsight Hadoop cluster to store, explore, and feature engineer data from a publicly available NYC taxi trip and fare dataset.

Use Azure HDInsight Hadoop Clusters on a 1-TB dataset

The Team Data Science Process in action: using Azure HDInsight Hadoop Clusters on a 1-TB dataset walkthrough presents an end-to-end scenario that uses an Azure HDInsight Hadoop cluster to store, explore, feature engineer, and downsample data from a publicly available Criteo dataset.
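To make the downsampling step concrete, here is a minimal sketch in plain Python. It is not taken from the walkthrough, which works on a Hadoop cluster; the function name `downsample`, the `fraction` and `seed` parameters, and the toy rows are all assumptions chosen for illustration. It shows one common approach: keep each row independently with a fixed probability.

```python
import random


def downsample(rows, fraction, seed=0):
    """Keep each row independently with probability `fraction`.

    Hypothetical helper for illustration only; a real 1-TB dataset
    would be sampled on the cluster rather than in local memory.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [row for row in rows if rng.random() < fraction]


# Toy stand-in for a large dataset (assumed data, not the Criteo dataset).
rows = list(range(1000))
sample = downsample(rows, fraction=0.1)
```

Because each row is kept independently, the sample size is only approximately `fraction * len(rows)`; sampling a fixed number of rows would need a different technique, such as `random.sample`.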

Data Science using Python with Spark on Azure

The Data Science using Spark on Azure HDInsight walkthrough applies the Team Data Science Process in an end-to-end scenario that uses an Azure HDInsight Spark cluster to store, explore, and feature engineer data from the publicly available NYC taxi trip and fare dataset.

Data Science using Scala with Spark on Azure

The Data Science using Scala with Spark on Azure walkthrough shows how to use Scala for supervised machine learning tasks with the Spark scalable machine learning library (MLlib) and SparkML packages on an Azure HDInsight Spark cluster. It walks you through the tasks that constitute the Data Science Process: data ingestion and exploration, visualization, feature engineering, modeling, and model consumption. The models built include logistic and linear regression, random forests, and gradient-boosted trees.

Use Azure Data Lake Storage and Analytics

The Scalable Data Science in Azure Data Lake: An end-to-end Walkthrough shows how to use Azure Data Lake to do data exploration and binary classification tasks on a sample of the NYC taxi dataset to predict whether a customer pays a tip.
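As a rough illustration of the binary classification target described above, the following Python sketch derives a 0/1 label from a tip amount. The `Trip` class, its field names, and the sample values are assumptions made for this example; they are not the walkthrough's schema or code.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    # Hypothetical fields for illustration; the real dataset has more columns.
    fare_amount: float
    tip_amount: float


def tipped_label(trip: Trip) -> int:
    """Return 1 if the customer paid any tip, otherwise 0."""
    return 1 if trip.tip_amount > 0 else 0


# Made-up sample trips, not real data.
trips = [Trip(12.5, 2.0), Trip(7.0, 0.0), Trip(30.0, 5.5)]
labels = [tipped_label(t) for t in trips]  # → [1, 0, 1]
```

A classifier would then be trained to predict this label from the remaining trip features.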

Use R with SQL Server R Services

The Data Science End-to-End Walkthrough using SQL Server R Services provides data scientists with a combination of R code, SQL Server data, and custom SQL functions to build and deploy an R model to SQL Server.

Use T-SQL with SQL Server R Services

The In-Database Advanced Analytics for SQL Developers walkthrough provides SQL programmers with experience building an advanced analytics solution with Transact-SQL using SQL Server R Services to operationalize an R solution.

What's next?

For an overview of topics that walk you through the tasks that make up the data science process in Azure, see Data Science Process.
