# Lab 5: Ingest and Analyse real-time data with Event Hubs and Stream Analytics
In this lab you will use an Azure Logic App to connect to Twitter and generate a stream of messages using the hashtag #NYC. The logic app will invoke the Azure Text Analytics Cognitive service to score Tweet sentiment and send the messages to Event Hubs. You will use Stream Analytics to generate the average Tweet sentiment in the last 60 seconds and send the results to a real-time dataset in Power BI.

**IMPORTANT**: This lab requires you have a valid Twitter account. If you don’t have one, you can sign up for free following the instructions here: https://twitter.com/signup.

**IMPORTANT**: This lab requires you have a valid Power BI account. If you don’t have one, you can register for a 60-day trial here: https://powerbi.microsoft.com/en-us/power-bi-pro/

The estimated time to complete this lab is: **60 minutes**.

## Lab Architecture
![Lab Architecture](./Media/Lab5-Image01.png)

Step | Description
-------- | -----
![](./Media/Orange1.png) | Build an Azure Logic App to invoke the Twitter API and retrieve Tweets with the hashtag #NYC
![](./Media/Orange2.png) | For each Tweet, invoke the Azure Text Analytics Cognitive service to detect its sentiment score
![](./Media/Orange3.png) | Format and send the Tweet’s JSON message to Event Hubs
![](./Media/Orange4.png) | Save Tweet messages into your data lake for future analysis (cold path)
![](./Media/Orange5.png) | Send stream of Tweet messages to Stream Analytics for real-time analytics (hot path)
![](./Media/Orange6.png) | Visualize real-time data generated by Stream Analytics with Power BI

**IMPORTANT**: Some of the Azure services provisioned require globally unique names, so a “-*suffix*” has been appended to their names to ensure uniqueness. Please take note of the suffix generated, as you will need it for the following resources:

Name |Type
-----------------------------|--------------------
mdwcosmosdb-*suffix* |Cosmos DB account
MDWDataFactory-*suffix* |Data Factory (V2)
mdwdatalake*suffix* |Storage Account
MDWEventHubs-*suffix* |Event Hubs Namespace
MDWKeyVault-*suffix* |Key vault
mdwsqlvirtualserver-*suffix* |SQL server
MDWStreamAnalytics-*suffix* |Stream Analytics job

## Create NYCTweets Container in Azure Blob Storage
In this section you will create a container in your MDWDataLake storage account that will serve as the repository for the Tweet messages captured by Event Hubs.

![](./Media/Lab5-Image02.png)

**IMPORTANT**|
-------------|
**Execute these steps on your host computer**|

1. In the Azure Portal, go to the MDW-Lab resource group, and then locate and click the Azure Storage account **mdwdatalake*suffix***.

2. On the **Overview** panel, click **Blobs**.

![](./Media/Lab5-Image03.png)

3. On the **mdwdatalake*suffix* – Blobs** blade, click **+ Container**.

![](./Media/Lab5-Image04.png)

4. On the **New container** blade, enter the following details:
<br>- **Name**: nyctweets
<br>- **Public access level**: Private (no anonymous access)

5. Click **OK** to create the new container.

![](./Media/Lab5-Image05.png)

## Create and Configure Event Hubs
In this section you will prepare Event Hubs to ingest Twitter data collected by the Logic App and save incoming messages to your Data Lake storage account.

![](./Media/Lab5-Image06.png)

**IMPORTANT**|
-------------|
**Execute these steps on your host computer**|

1. In the Azure Portal, go to the lab resource group and locate the Event Hubs resource **MDWEventHubs-*suffix***.

2. On the **Event Hubs** panel, click **+ Event Hub** button to create a new event hub.

![](./Media/Lab5-Image07.png)

3. On the **Create Event Hub** blade, type “NYCTweets” in the Name field and leave the remaining fields with their default values.

4. Click **Create**.

![](./Media/Lab5-Image08.png)

5. Once the NYCTweets event hub is created, click **Capture** on the left-hand side menu.
6. Enter the following details:
<br>- **Capture**: On
<br>- **Time window (minutes)**: 1
<br>- **Do not emit empty files when no events occur during the capture time window**: Checked.
<br>- **Capture Provider**: Azure Storage
<br>- **Azure Storage Container**: [select the **nyctweets** container in your **mdwdatalake*suffix*** storage account]
7. Leave remaining fields with their default values.
8. Click **Save Changes**.

![](./Media/Lab5-Image09.png)
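
With Capture enabled, Event Hubs writes the raw event stream to your data lake as Avro files, one file per partition per time window. By default, captured files are organised using the following naming convention (the default path template, shown for reference):

```
{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}
```

This explains the folder structure you will see later inside the **nyctweets** container.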

## Create Azure Logic App to Read #NYC Tweets and post them to Event Hubs
In this section you will create a Logic App to invoke the Twitter API and retrieve tweets for the hashtag #NYC. Tweets will then be formatted into a JSON message and sent to Event Hubs for processing.

![](./Media/Lab5-Image10.png)

**IMPORTANT**|
-------------|
**Execute these steps on your host computer**|

1. In the Azure Portal, go to the lab resource group and locate the Logic App resource **MDWLogicApp**.

2. On the **MDWLogicApp** menu, click **Logic app designer** to open the design blade.

![](./Media/Lab5-Image11.png)

3. On the **Logic app designer** blade, scroll down to the section **Start with a common trigger**.
4. Click **When a new tweet is posted**.

![](./Media/Lab5-Image12.png)

5. On the design surface you will see the Twitter connector. Click **Sign in** and use your personal Twitter account to authenticate.

![](./Media/Lab5-Image13.png)

6. On the log-in screen, review the permissions that will be granted to Logic Apps. If you agree with them, enter your credentials and click **Authorise app**.

7. On the Confirmation Required page, click **Allow access** to proceed with the authentication process and to grant the right permissions to Logic Apps.

![](./Media/Lab5-Image14.jpg)

8. Once authenticated with Twitter you will notice the green tick next to your user name.

9. Click **Continue** to configure the Twitter connector.

![](./Media/Lab5-Image15.png)

10. On the **When a new tweet is posted** activity properties, enter the following details:
<br>- **Search text**: #NYC
<br>- **Interval**: 10
<br>- **Frequency**: Seconds

11. Leave remaining fields with their default values.

12. Click **+ New step** to create a new Logic App task.

![](./Media/Lab5-Image16.png)

13. On the **Choose an action** box, type “Text Analytics” in the search field. Select **Detect Sentiment (preview)** from the **Actions** tab.

![](./Media/Lab5-Image17.png)

14. On the **Text Analytics** properties, type “MDWTextAnalyticsConnection” in the **Connection Name** field.

15. Open the **Azure Portal** in a new browser tab and copy the **Endpoint** and **Key** values from the **MDWTextAnalytics** Cognitive Services resource.

16. Paste the values in the respective fields in the Text Analytics properties.

17. Click **Create**.

![](./Media/Lab5-Image18.png)

18. In the **Detect Sentiment (preview)** properties, click **Add new parameter**.

19. Select **Text** check box.

20. In the **Text** parameter box, select **Tweet Text** from the **Dynamic content** list.

21. Click **+ New step** to create a new task.

![](./Media/Lab5-Image19.png)

22. Type “Compose” in the search box and select the **Compose** Data Operation.

![](./Media/Lab5-Image20.png)

23. On the **Compose** properties, build a new JSON message using data elements returned by the previous tasks. Your JSON message should look like this.

![](./Media/Lab5-Image21.png)

Alternatively you can copy and paste the JSON definition below:
```json
{
  "TweetID": @{triggerBody()?['TweetId']},
  "CreatedAt": "@{triggerBody()?['CreatedAtIso']}",
  "TweetText": "@{triggerBody()?['TweetText']}",
  "SentimentScore": @{body('Detect_Sentiment')?['score']}
}
```
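
For reference, once the Logic App runs, the message sent to Event Hubs is plain JSON. A hypothetical example with illustrative values:

```json
{
  "TweetID": 1129472638472638464,
  "CreatedAt": "2019-05-18T15:45:01.000Z",
  "TweetText": "Beautiful morning in #NYC",
  "SentimentScore": 0.82
}
```

Note that the Detect Sentiment action returns a score between 0 (negative) and 1 (positive).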

24. Click **+ New step** to create a new task.

25. On the **Choose an action** box, type “Send event” in the search field. Select **Send Event** from the **Actions** tab.

![](./Media/Lab5-Image22.png)

26. On the **Send event** properties, type “MDWEventHubsConnection” in the **Connection Name** field.

27. Select **MDWEventHubs-*suffix*** from the list of **Event Hubs Namespaces**.

28. Select the default access policy **RootManageSharedAccessKey**.

29. Click **Create**.

![](./Media/Lab5-Image23.png)

30. On the **Send event** properties, select **nyctweets** in the **Event Hub name** field.

31. Click **Add new parameter** and select **Content**.

32. Click on the **Content** field. From the **Dynamic content** pop-up window, click the **See more** link under **Compose** to display the **Outputs** field.

33. Select **Outputs** as the value for the **Content** field.

![](./Media/Lab5-Image24.png)

34. Click the **Save** button to save your Logic App.

![](./Media/Lab5-Image25.png)

35. On the **Overview** panel, click **Enable**.

![](./Media/Lab5-Image26.png)

36. In the Azure Portal, navigate to the **mdwdatalake*suffix*** storage account.

37. Wait a couple of minutes and you should be able to see new files being created in the **nyctweets** container.

![](./Media/Lab5-Image27.png)
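
Each captured file wraps your events in the standard Event Hubs Capture Avro schema. Deserialized, a single record looks roughly like the sketch below (hypothetical values; the **Body** field carries, as bytes, the JSON message built by the Compose action):

```json
{
  "SequenceNumber": 12,
  "Offset": "8589936408",
  "EnqueuedTimeUtc": "5/18/2019 3:45:01 PM",
  "SystemProperties": {},
  "Properties": {},
  "Body": "{ \"TweetID\": 1129472638472638464, \"CreatedAt\": \"2019-05-18T15:45:01.000Z\", \"TweetText\": \"Beautiful morning in #NYC\", \"SentimentScore\": 0.82 }"
}
```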

## Create and Configure Stream Analytics
In this section you will configure Stream Analytics to perform analytic queries on streaming data sent by Event Hubs and generate outputs to Power BI.

![](./Media/Lab5-Image28.png)

**IMPORTANT**|
-------------|
**Execute these steps on your host computer**|

1. In the Azure Portal, go to the lab resource group and locate the Stream Analytics resource **MDWStreamAnalytics-*suffix***.

2. On the **Inputs** panel, click the **+ Add stream input** button and select **Event Hub** to create a new stream input.

![](./Media/Lab5-Image29.png)

3. On the **Event Hub** New input blade, enter the following details:
<br>- **Input alias**: MDWEventHubs
<br>- **Event Hub namespace**: MDWEventHubs-*suffix*
<br>- **Event hub name > Use existing**: nyctweets

4. Leave remaining fields with their default values.

![](./Media/Lab5-Image30.png)

5. On the **Outputs** panel, click **+ Add** button and select **Power BI** to create a new stream output.

![](./Media/Lab5-Image31.png)

6. On the **Power BI** New output blade, click **Authorize** to authenticate with Power BI.
7. Once authenticated, enter the following details:
<br>- **Output alias**: PBITweetDetails
<br>- **Group Workspace**: My Workspace
<br>- **Dataset name**: NYCTweetDetails
<br>- **Table name**: TweetDetails

8. Leave remaining fields with their default values.

9. Click **Save**.

![](./Media/Lab5-Image32.png)

10. Repeat the process to create another Power BI Output. This time enter the following details:
<br>- **Output alias**: PBITweetStats
<br>- **Group Workspace**: My Workspace
<br>- **Dataset name**: NYCTweetStats
<br>- **Table name**: TweetStats

11. Click **Save**.

![](./Media/Lab5-Image33.png)

12. On the **Query** panel, note the inputs and outputs you created in the previous steps.

![](./Media/Lab5-Image34.png)

13. Enter the following Stream Analytics queries in the query window. The first query emits, every 5 seconds, the total Tweet count and average sentiment over the previous 60 seconds; the second passes the individual Tweet details straight through to Power BI.

```sql
SELECT COUNT(*) AS TotalTweets
     , AVG(SentimentScore) AS AverageSentiment
INTO PBITweetStats
FROM MDWEventHubs TIMESTAMP BY CreatedAt
GROUP BY HoppingWindow(second, 60, 5)

SELECT CreatedAt
     , TweetText
     , SentimentScore
INTO PBITweetDetails
FROM MDWEventHubs TIMESTAMP BY CreatedAt
```

14. Click **Save**.

15. On the **Overview** panel, click **Start** to start the Stream Analytics job.

![](./Media/Lab5-Image35.png)
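
The HoppingWindow(second, 60, 5) in the query above produces overlapping 60-second windows that advance every 5 seconds, which keeps the Power BI card updating frequently. If a single non-overlapping result per minute were enough, a TumblingWindow variant would be a minimal alternative (a sketch, not part of the lab steps):

```sql
SELECT COUNT(*) AS TotalTweets
     , AVG(SentimentScore) AS AverageSentiment
INTO PBITweetStats
FROM MDWEventHubs TIMESTAMP BY CreatedAt
GROUP BY TumblingWindow(second, 60)
```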

## Create Power BI Dashboard to Visualise Real-Time Data
In this section you will log on to the Power BI portal and create a dashboard to visualize the real-time Twitter statistics sent by Stream Analytics.

![](./Media/Lab5-Image36.png)

**IMPORTANT**|
-------------|
**Execute these steps on your host computer**|

1. Open a new browser tab and navigate to https://www.powerbi.com
2. Enter your credentials to authenticate with the Power BI service.

![](./Media/Lab5-Image37.png)

3. Once authenticated, open the **Workspaces** menu and click **My Workspace** at the top of the Workspaces list.

![](./Media/Lab5-Image38.png)

4. Navigate to the **Datasets** tab and verify that two datasets have been created by Stream Analytics: **NYCTweetDetails** and **NYCTweetStats**.

![](./Media/Lab5-Image39.png)

5. In the top right-hand corner, click **+ Create** and then select **Dashboard** from the dropdown menu to create a new dashboard.

![](./Media/Lab5-Image40.png)

6. Type “NYC Tweet Stats” in the **Dashboard name** field and click **Create**.

7. Click **+ Add tile** button on the toolbar.

8. On the **Add tile** blade, select **Custom Streaming Data** under the **Real-Time Data** section.

9. Click **Next**.

![](./Media/Lab5-Image41.png)

10. On the **Add a custom streaming data tile** blade, select the **NYCTweetStats** dataset.

11. Click **Next**.

![](./Media/Lab5-Image42.png)

12. In the **Visualization Type** field, select **Card**.

13. In the **Fields** field, select **totaltweets**.

14. Click **Next**.

![](./Media/Lab5-Image43.png)

15. On the **Tile details** blade, enter the following details:
<br>- **Title**: Total Tweets
<br>- **Subtitle**: in the last minute.

16. Leave remaining fields with their default values.

17. Click **Apply**.

![](./Media/Lab5-Image44.png)
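
If you want to sanity-check what Stream Analytics is pushing to Power BI, each row landing in the **TweetStats** table of the **NYCTweetStats** dataset carries just the two aggregate columns produced by the first query. A hypothetical row (illustrative values):

```json
{
  "totaltweets": 42,
  "averagesentiment": 0.63
}
```

Within a minute or two of Tweets flowing through the pipeline, the **Total Tweets** card should begin updating in near real time.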
