diff --git a/articles/hdinsight-hbase-provision-vnet.md b/articles/hdinsight-hbase-provision-vnet.md
index e7be094a920c5..33741e6b1ddb3 100644
--- a/articles/hdinsight-hbase-provision-vnet.md
+++ b/articles/hdinsight-hbase-provision-vnet.md
@@ -18,13 +18,13 @@
 # Provision HBase clusters on Azure Virtual Network
-Learn how to create HDInsight Hbase clusters on [Azure Virtual Network][1].
+Learn how to create Azure HDInsight HBase clusters on [Azure Virtual Network][1].
-With the virtual network integration, HBase clusters can be deployed to the same virtual network as your applications so that applications can communicate with HBase directly. The benefits include:
+With virtual network integration, HBase clusters can be deployed to the same virtual network as your applications so that applications can communicate with HBase directly. The benefits include:
-- Direct connectivity of the web application to the nodes of the HBase cluster which enables communication using HBase Java RPC APIs.
-- Improve performance by not having your traffic go over multiple gateway and load-balancer.
-- process sensitive information in a more secure manner without exposing a public endpoint
+- Direct connectivity of the web application to the nodes of the HBase cluster, which enables communication via HBase Java remote procedure call (RPC) APIs
+- Improved performance by not having your traffic go over multiple gateways and load-balancers
+- The ability to process sensitive information in a more secure manner without exposing a public endpoint
 ##Prerequisites
@@ -32,9 +32,9 @@ Before you begin this tutorial, you must have the following:
 - **An Azure subscription**. Azure is a subscription-based platform. For more information about obtaining a subscription, see [Purchase Options][azure-purchase-options], [Member Offers][azure-member-offers], or [Free Trial][azure-free-trial].
-- **A workstation with Azure PowerShell installed and configured**. For instructions, see [Install and configure Azure PowerShell][powershell-install]. To execute PowerShell scripts, you must run Azure PowerShell as administrator and set the execution policy to *RemoteSigned*. See [Using the Set-ExecutionPolicy cmdlet][2].
+- **A workstation with Azure PowerShell installed and configured**. For instructions, see [Install and configure Azure PowerShell][powershell-install]. To execute Azure PowerShell scripts, you must run Azure PowerShell as administrator and set the execution policy to *RemoteSigned*. See [Using the Set-ExecutionPolicy cmdlet][2].
-    Before running PowerShell scripts, make sure you are connected to your Azure subscription using the following cmdlet:
+    Before running PowerShell scripts, make sure you are connected to your Azure subscription by using the following cmdlet:
         Add-AzureAccount
@@ -45,24 +45,24 @@ Before you begin this tutorial, you must have the following:
 ##Provision an HBase cluster into a virtual network.
-**To create a Virtual Network using the management portal:**
+**To create a virtual network by using the Azure portal**
-1. Sign in to the [Azure Management portal][azure-portal].
-2. Click **NEW** in the bottom left corner, click **NETWORK SERVICES**, click **VIRTUAL NETWORK**, and then click **QUICK CREATE**.
+1. Sign in to the [Azure portal][azure-portal].
+2. Click **NEW** in the bottom-left corner, click **NETWORK SERVICES**, click **VIRTUAL NETWORK**, and then click **QUICK CREATE**.
 3. Type or select the following values:
-    - **Name**: The name of your virtual network.
-    - **Address space**: Choose an address space for the virtual network that is large enough to provide addresses for all nodes in the cluster. Otherwise the provision will fail. For walking through this tutorial, you can pick any of the three choices.
-    - **Maximum VM count**: Choose one of the Maximum VM counts. This value determines the number of possible hosts (VMs) that can be created under the address space. For walking through this tutorial, **4096 [CIDR: /20]** is sufficient.
-    - **Location**: The location must be the same as the HBase cluster that you will create.
-    - **DNS server**: This article uses internal DNS server provided by Azure, therefore you can choose **None**. More advanced networking configuration with custom DNS servers are also supported. For the detailed guidance, see [Name Resolution (DNS)](http://msdn.microsoft.com/library/azure/jj156088.aspx).
+    - **Name** - The name of your virtual network.
+    - **Address space** - Choose an address space for the virtual network that is large enough to provide addresses for all nodes in the cluster. Otherwise the provision will fail. For walking through this tutorial, you can pick any of the three choices.
+    - **Maximum VM count** - Choose one of the Maximum VM counts. This value determines the number of possible hosts (virtual machines) that can be created under the address space. For walking through this tutorial, **4096 [CIDR: /20]** is sufficient.
+    - **Location** - The location must be the same as the HBase cluster that you will create.
+    - **DNS server** - This tutorial uses an internal Domain Name System (DNS) server provided by Azure, so you can choose **None**. More advanced networking configuration with custom DNS servers are also supported. For detailed guidance, see [Name Resolution (DNS)](http://msdn.microsoft.com/library/azure/jj156088.aspx).
 4. Click **CREATE A VIRTUAL NETWORK**. The new virtual network name will appear in the list. Wait until the Status column shows **Created**.
 5. In the main pane, click the virtual network you just created.
-6. Click **DASHBOARD** on the top of the page, .
-7. Under **quick glance**, make a note of **VIRTUAL NETWORK ID**. You will need it when provisioning HBase cluster.
+6. Click **DASHBOARD** on the top of the page.
+7. Under **quick glance**, make a note of **VIRTUAL NETWORK ID**. You will need it when provisioning the HBase cluster.
 8. Click **CONFIGURE** on the top of the page.
-9. On the bottom of the page, the default subnet name is **Subnet-1**. You can optionally rename the subnet or add a new subnet for the HBase cluster. Make a note of the subnet name, you will need it when provisioning the cluster
-10. Verify the **CIDR(ADDRESS COUNT)** for the subnet that will be used for the cluster. The address count must be greater than the number of worker nodes plus seven (Gateway: 2, Headnode: 2, Zookeeper: 3). For example, if you need a 10 node HBase cluster, the address count for the subnet must be greater than 17 (10+7). Otherwise the deployment will fail.
+9. On the bottom of the page, the default subnet name is **Subnet-1**. You can optionally rename the subnet or add a new subnet for the HBase cluster. Make a note of the subnet name; you will need it when provisioning the cluster.
+10. Verify the **CIDR(ADDRESS COUNT)** for the subnet that will be used for the cluster. The address count must be greater than the number of worker nodes plus seven (gateway: 2, head node: 2, Zookeeper: 3). For example, if you need a 10-node HBase cluster, the address count for the subnet must be greater than 17 (10+7). Otherwise the deployment will fail.
 > [WACOM.NOTE] It is highly recommended to designate a single subnet for one cluster.
@@ -72,56 +72,56 @@ Before you begin this tutorial, you must have the following:
 > [WACOM.NOTE] HDInsight clusters use Azure Blob storage for storing data. For more information, see [Use Azure Blob storage with Hadoop in HDInsight][hdinsight-storage]. You will need a storage account and a Blob storage container. The storage account location must match the virtual network location and the cluster location.
-**To create an Azure Storage account and a Blob storage container:**
+**To create an Azure Storage account and a Blob storage container**
-1. Sign in to the [Azure Management Portal][azure-portal].
-2. Click **NEW** on the lower left corner, point to **DATA SERVICES**, point to **STORAGE**, and then click **QUICK CREATE**.
+1. Sign in to the [Azure portal][azure-portal].
+2. Click **NEW** in the lower-left corner, point to **DATA SERVICES**, point to **STORAGE**, and then click **QUICK CREATE**.
 3. Type or select the following values:
-    - **URL**: The name of the storage account
-    - **LOCATION**: The location of the storage account. Make sure it matches the virtual network location. Affinity groups are not supported.
-    - **REPLICATION**: For testing purposes, use **Locally Redundant** to reduce the cost.
+    - **URL** - The name of the Storage account.
+    - **LOCATION** - The location of the Storage account. Make sure it matches the virtual network location. Affinity groups are not supported.
+    - **REPLICATION** - For testing purposes, use **Locally Redundant** to reduce the cost.
-4. Click **CREATE STORAGE ACCOUNT**. You will see the new storage account in the storage list.
-5. Wait until the **STATUS** of the new storage account changes to **Online**.
-6. Click the new storage account from the list to select it.
+4. Click **CREATE STORAGE ACCOUNT**. You will see the new Storage account in the storage list.
+5. Wait until the **STATUS** of the new Storage account changes to **Online**.
+6. Click the new Storage account from the list to select it.
 7. Click **MANAGE ACCESS KEYS** from the bottom of the page.
-8. Make a note of the **STORAGE ACCOUNT NAME** and the **PRIMARY ACCESS KEY** (or the **SECONDARY ACCESS KEY**. Either of the keys works). You will need them later in the tutorial.
+8. Make a note of the **STORAGE ACCOUNT NAME** and the **PRIMARY ACCESS KEY** (or the **SECONDARY ACCESS KEY**. Either of the keys works). You will need them later in the tutorial.
 9. From the top of the page, click **CONTAINER**.
 10. From the bottom of the page, click **ADD**.
-11. Enter the container name. This container will be used as the default container for the HBase cluster. By default, the default container name matches the cluster name. Keep the **ACCESS** field as **Private**.
-12. Click the check icon to create the container.
+11. Enter the container name. This container will be used as the default container for the HBase cluster. By default, the default container name matches the cluster name. Keep the **ACCESS** field as **Private**.
+12. Click the checkmark to create the container.
-**To provision an HBase cluster using the Azure Portal:**
+**To provision an HBase cluster by using the Azure portal**
-> [WACOM.NOTE] For information on provisioning a new HBase cluster using PowerShell, see [Provision an HBase cluster using PowerShell](#powershell).
+> [WACOM.NOTE] For information on provisioning a new HBase cluster by using Azure PowerShell, see [Provision an HBase cluster using Azure PowerShell](#powershell).
-1. Sign in to the [Azure Management Portal][azure-portal].
+1. Sign in to the [Azure portal][azure-portal].
-2. Click **NEW** on the lower left corner, point to **DATA SERVICES**, point to **HDINSIGHT**, and then click **CUSTOM CREATE**.
+2. Click **NEW** in the lower-left corner, point to **DATA SERVICES**, point to **HDINSIGHT**, and then click **CUSTOM CREATE**.
-3. Enter a CLUSTER NAME, select the CLUSTER TYPE as HBase, select the Windows Server 2012 operating system, select the HDINSIGHT version, and then click the right button.
+3. Enter a **CLUSTER NAME**, select the **CLUSTER TYPE** as HBase, select the Windows Server 2012 operating system, select the HDInsight version, and then click the right button.
 ![Provide details for the HBase cluster][img-provision-cluster-page1]
 > [WACOM.NOTE] For an HBase cluster, Windows Server is the only available OS option.
-4. On the Configure Cluster page, enter or select the following:
+4. On the **Configure Cluster** page, enter or select the following:
 ![Provide details for the HBase cluster](./media/hdinsight-hbase-provision-vnet/hbasewizard2.png)
Property | Value |
---|---|
Data nodes | Number of data nodes you want to deploy. For testing purposes, create a single node cluster. The cluster size limit varies for Azure subscriptions. Contact Azure billing support to increase the limit. |
-Region/Virtual network | Select a region or an Azure Virtual Network, if you have already created. For this tutorial, select the network that you created earlier, and then select a corresponding subnet. The default name is Subnet-1. |
+Region/Virtual network | Select a region or an Azure virtual network, if you have one already created. For this tutorial, select the network that you created earlier, and then select a corresponding subnet. The default name is Subnet-1. |
Head node size | Select a VM size for the head node. |
Data node size | Select a VM size for the data nodes. |
-Zookeper size | Select a VM size for the zookeper node. |
+Zookeeper size | Select a VM size for the Zookeeper node. |
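
The subnet selected on this page has to satisfy the sizing rule from the virtual network steps: the address count must be greater than the number of worker nodes plus seven (gateway: 2, head node: 2, Zookeeper: 3). A minimal Python sketch of that check follows. It assumes the portal's address count equals the total number of addresses in the subnet's CIDR block; the function name and the example CIDR ranges are hypothetical and not part of the tutorial.

```python
import ipaddress

# Fixed overhead quoted in the tutorial: gateway 2 + head node 2 + Zookeeper 3.
RESERVED_NODES = 7


def subnet_is_large_enough(subnet_cidr: str, worker_nodes: int) -> bool:
    """Return True if the subnet's address count exceeds worker nodes + 7.

    Assumption: the address count is the total size of the CIDR block,
    for example 32 for a /27 and 16 for a /28.
    """
    address_count = ipaddress.ip_network(subnet_cidr).num_addresses
    return address_count > worker_nodes + RESERVED_NODES


if __name__ == "__main__":
    # Hypothetical subnets for a 10-node cluster, which needs more than 17 addresses.
    for cidr in ("10.0.0.0/27", "10.0.0.0/28"):
        verdict = "ok" if subnet_is_large_enough(cidr, 10) else "too small"
        print(cidr, verdict)
```

Under these assumptions, a /27 subnet passes for the 10-node example in the tutorial and a /28 does not, so the check can be run before choosing the number of data nodes.
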
Property | Value | |
---|---|---|
Storage Account | -Specify the Azure storage account that will be used as the default file system for the HDInsight cluster. You can choose one of the three options: + | Specify the Azure Storage account that will be used as the default file system for the HDInsight cluster. You can choose one of the three options:
|
Account Name |
| |
Account Key | -If you chose the Use Storage From Another Subscription option, specify the account key for that storage account. + | If you chose the Use Storage From Another Subscription option, specify the account key for that Storage account. |
Default container | -Specifies the default container on the storage account that is used as the default file system for the HDInsight cluster. If you chose Use Existing Storage for the Storage Account field, and there are no existing containers in that account, the container is created by default with a the same name as the cluster name. If a container with the name of the cluster already exists, a sequence number will be appended to the container name. For example, mycontainer1, mycontainer2, and so on. However, if the existing storage account has a container with a name different from the cluster name you specified, you can use that container as well. -If you chose to create a new storage or use storage from another Azure subscription, you must specify the default container name + | Specifies the default container on the Storage account that is used as the default file system for the HDInsight cluster. If you chose Use Existing Storage for the Storage Account field, and there are no existing containers in that account, the container is created by default with the same name as the cluster name. If a container with the name of the cluster already exists, a sequence number will be appended to the container name. For example, mycontainer1, mycontainer2, and so on. However, if the existing Storage account has a container with a name different from the cluster name you specified, you can use that container as well. +If you chose to create a new storage or use storage from another Azure subscription, you must specify the default container name. |
Additional Storage Accounts | -If required, specify additonal storage accounts for the cluster. HDInsight supports multiple storage accounts. There is no limit on the additional storage account that can be used by a cluster. However, if you create a cluster using the Management Portal, you have a limit of seven due to the UI constraints. Each additional storage account you specify adds an extra Storage Account page to the wizard where you can specify the account information. For example, in the screenshot above, one additional storage account is selected, and hence page 5 is added to the dialog. | If required, specify additional Storage accounts for the cluster. HDInsight supports multiple Storage accounts. There is no limit on the additional Storage account that can be used by a cluster. However, if you create a cluster by using the Azure portal, you have a limit of seven due to the UI constraints. Each additional Storage account you specify adds an extra **Storage Account** page to the wizard where you can specify the account information. For example, in the screenshot above, one additional Storage account is selected, and hence page 5 is added to the dialog. |