The following accelerators can be used to deploy an Azure Synapse Analytics Workspace
and/or an Azure Machine Learning Workspace
and/or an Azure Databricks Workspace
and/or an Azure Purview Account
into the same Azure Resource Group or separate Azure Resource Groups. They allow you to explore some of the AI and Machine Learning, Data Lake, Data Lakehouse, Data Warehousing, and Data Governance capabilities available on Microsoft Azure. You will also be able to use Power BI to access data from analytic data stores and to score data with deployed Azure Machine Learning custom models. With the GA release of Azure Purview, you will also be able to explore data governance of your data estate.
Jump to Deploy an Azure Synapse Analytics Workspace
Jump to Deploy an Azure Machine Learning Workspace
Jump to Deploy an Azure Databricks Workspace
Jump to Deploy an Azure Purview Account
The purpose of this Analytics Accelerator is to help you learn and grow through hands-on common use cases that show you how to use data integration pipelines, Spark notebooks, and SQL scripts in Azure Synapse Analytics, and/or ADF pipelines, and/or Azure Databricks notebooks, and/or Azure Machine Learning AutoML, and/or Azure Purview scans.
This GitHub Repository, along with an Azure Subscription (No Azure Subscription? Click here), should allow you to accelerate:
- Business Value
- Time-to-insight
- Modernization
- Skilling
- Proof of Concepts
- Architecture choice
- Infrastructure as code for PoC, Dev, Test, Prod
Event | Title | Presentation | Demo |
---|---|---|---|
DevDays2021 | Accelerate business value and time-to-insight with Azure Analytic | Deck with Recording | Demo Recording |
This accelerator provides examples of common use cases using datasets such as the Seattle Public Library public CSV datasets (and other datasets), which are captured on ADLS and composed into parquet files, Spark tables, Serverless external tables, and Dedicated SQL pool tables, then consumed using SQL.
Deployment | Use Case Name | Use Case Type | Dataset | Description | Code | Instruction Steps |
---|---|---|---|---|---|---|
Azure Synapse Analytics | Seattle Public Library | Data Lake / Data Warehouse | Seattle Public Library CSV | Public csv datasets that are captured on ADLS and composed into parquet files, Spark tables, Serverless external tables, and Dedicated SQL pool tables, then consumed using SQL. | Code | Steps |
Azure Machine Learning | Car Price | AutoML | UC Irvine Machine Learning Repository | UCI Automobile Data Set | Code | Steps |
Azure Machine Learning | Student Success | AutoML | UC Irvine Machine Learning Repository | UCI Student Performance Data Set | TBD Code | TBD Steps |
Azure Databricks | Covid-19 | Azure Databricks | JHU Covid-19 | TBD | TBD Code | TBD Steps |
Azure Databricks | Change Data Capture | Azure Databricks, ADF, Azure SQL DB | AdventureworksLT | Change Data Capture using ADF and Databricks Autoloader | Code | Steps |
Azure Purview | Data Governance | Azure Purview | AdventureworksLT | Data Governance with Azure Purview | Code | Steps |
If you are interested in Education Analytics, please check out the GitHub Repository OpenEduAnalytics.
Open Education Analytics (OEA) is a fully open-sourced (Creative Commons and MIT) data integration and analytics architecture and reference implementation for the education sector built on Synapse Analytics - with Azure Data Lake Storage as the storage backbone, and Azure Active Directory providing the role-based access control.
This accelerator should give you more time to focus on hands-on-keyboard work and on learning about these Azure Analytics Services.
Use your own data or the Common Use Cases above. If you do not have your own data, the Getting Started wizard inside the workspace is recommended for loading sample data.
- Owner role on the Azure Subscription you are deploying to. This is required to create the separate Analytics Accelerator Resource Group(s) and to delegate the roles necessary for this deployment.
***Remember to come back to this link above after the deployment has completed***
Synapse Analytics Post Deployment
This template deploys necessary resources to run an Azure Synapse Analytics Workspace.
This template deploys the following:
- An Azure Synapse Workspace
- (OPTIONAL) Allows All connections in by default (Firewall IP Addresses)
- Allows Azure Services to access the workspace by default
- Managed Virtual Network is Enabled
- An Azure Synapse SQL Pool
- (OPTIONAL) Apache Spark Pool
- Auto-paused set to 15 minutes of idling
- Azure Data Lake Storage Gen2 account
- Azure Synapse Workspace identity given Storage Blob Data Contributor to the Storage Account
- A new File System inside the Storage Account to be used by Azure Synapse
- A Logic App to Pause the SQL Pool on a defined schedule
- The Logic App checks for Active Queries. If there are active queries, it waits 5 minutes and checks again, repeating until there are none before pausing
- A Logic App to Resume the SQL Pool on a defined schedule
- Both Logic App managed identities are given Contributor rights to the Resource Group
- Grants the Workspace identity CONTROL to all SQL pools and SQL on-demand pool
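The wait-and-retry behavior of the pause Logic App listed above can be illustrated with a minimal Python sketch. This is not the Logic App definition deployed by the template; `pause_when_idle`, `count_active_queries`, and `pause_pool` are hypothetical names introduced only for illustration, and how the real Logic App detects active queries or calls the pause operation is not shown here. The only detail taken from the description above is the 5-minute wait between checks.

```python
import time
from typing import Callable


def pause_when_idle(
    count_active_queries: Callable[[], int],  # hypothetical: returns number of running queries
    pause_pool: Callable[[], None],           # hypothetical: issues the pause request
    wait_seconds: float = 300.0,              # 5 minutes, as described in the list above
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Wait until no queries are active, then pause the pool.

    Returns the number of times the loop had to wait before pausing.
    """
    waits = 0
    # Re-check after every wait; only pause once the pool is idle.
    while count_active_queries() > 0:
        sleep(wait_seconds)
        waits += 1
    pause_pool()
    return waits


if __name__ == "__main__":
    # Simulated run: two checks find active queries, the third finds none.
    # The sleep function is replaced so the demo does not actually wait.
    remaining = iter([2, 1, 0])
    n = pause_when_idle(
        count_active_queries=lambda: next(remaining),
        pause_pool=lambda: print("pause requested"),
        sleep=lambda seconds: print(f"waiting {seconds:.0f}s"),
    )
    print(f"waited {n} time(s) before pausing")
```

Passing the query check, the pause call, and the sleep function in as parameters keeps the sketch independent of any particular Azure SDK and makes the loop easy to exercise without real waiting.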
This template deploys necessary resources to run an Azure Machine Learning Workspace.
This template deploys the following:
- Azure Machine Learning Workspace
- Encrypted Storage Account
- Key Vault
- Application Insights
This template deploys an Azure Databricks Workspace together with Azure Data Lake Storage Gen2, Azure Data Factory, and Azure SQL Database.
This template deploys the following:
- Azure Databricks Workspace
- Azure Data Lake Storage Gen2
- Azure Data Factory
- Azure SQL Database
This template deploys an Azure Purview Account together with Azure Key Vault.
This template deploys the following:
- Azure Purview
- Azure Key Vault