diff --git a/articles/hdinsight-hbase-provision-vnet.md b/articles/hdinsight-hbase-provision-vnet.md
index e7be094a920c5..33741e6b1ddb3 100644
--- a/articles/hdinsight-hbase-provision-vnet.md
+++ b/articles/hdinsight-hbase-provision-vnet.md
@@ -18,13 +18,13 @@
 # Provision HBase clusters on Azure Virtual Network
-Learn how to create HDInsight Hbase clusters on [Azure Virtual Network][1].
+Learn how to create Azure HDInsight HBase clusters on [Azure Virtual Network][1].
-With the virtual network integration, HBase clusters can be deployed to the same virtual network as your applications so that applications can communicate with HBase directly. The benefits include:
+With virtual network integration, HBase clusters can be deployed to the same virtual network as your applications so that applications can communicate with HBase directly. The benefits include:
-- Direct connectivity of the web application to the nodes of the HBase cluster which enables communication using HBase Java RPC APIs.
-- Improve performance by not having your traffic go over multiple gateway and load-balancer.
-- process sensitive information in a more secure manner without exposing a public endpoint
+- Direct connectivity of the web application to the nodes of the HBase cluster, which enables communication via HBase Java remote procedure call (RPC) APIs
+- Improved performance by not having your traffic go over multiple gateways and load-balancers
+- The ability to process sensitive information in a more secure manner without exposing a public endpoint
 ##Prerequisites
@@ -32,9 +32,9 @@ Before you begin this tutorial, you must have the following:
 - **An Azure subscription**. Azure is a subscription-based platform. For more information about obtaining a subscription, see [Purchase Options][azure-purchase-options], [Member Offers][azure-member-offers], or [Free Trial][azure-free-trial].
-- **A workstation with Azure PowerShell installed and configured**.
For instructions, see [Install and configure Azure PowerShell][powershell-install]. To execute PowerShell scripts, you must run Azure PowerShell as administrator and set the execution policy to *RemoteSigned*. See [Using the Set-ExecutionPolicy cmdlet][2].
+- **A workstation with Azure PowerShell installed and configured**. For instructions, see [Install and configure Azure PowerShell][powershell-install]. To execute Azure PowerShell scripts, you must run Azure PowerShell as administrator and set the execution policy to *RemoteSigned*. See [Using the Set-ExecutionPolicy cmdlet][2].
-    Before running PowerShell scripts, make sure you are connected to your Azure subscription using the following cmdlet:
+    Before running PowerShell scripts, make sure you are connected to your Azure subscription by using the following cmdlet:
         Add-AzureAccount
@@ -45,24 +45,24 @@ Before you begin this tutorial, you must have the following:
 ##Provision an HBase cluster into a virtual network.
-**To create a Virtual Network using the management portal:**
+**To create a virtual network by using the Azure portal**
-1. Sign in to the [Azure Management portal][azure-portal].
-2. Click **NEW** in the bottom left corner, click **NETWORK SERVICES**, click **VIRTUAL NETWORK**, and then click **QUICK CREATE**.
+1. Sign in to the [Azure portal][azure-portal].
+2. Click **NEW** in the bottom-left corner, click **NETWORK SERVICES**, click **VIRTUAL NETWORK**, and then click **QUICK CREATE**.
 3. Type or select the following values:
-    - **Name**: The name of your virtual network.
-    - **Address space**: Choose an address space for the virtual network that is large enough to provide addresses for all nodes in the cluster. Otherwise the provision will fail. For walking through this tutorial, you can pick any of the three choices.
-    - **Maximum VM count**: Choose one of the Maximum VM counts. This value determines the number of possible hosts (VMs) that can be created under the address space.
For walking through this tutorial, **4096 [CIDR: /20]** is sufficient.
-    - **Location**: The location must be the same as the HBase cluster that you will create.
-    - **DNS server**: This article uses internal DNS server provided by Azure, therefore you can choose **None**. More advanced networking configuration with custom DNS servers are also supported. For the detailed guidance, see [Name Resolution (DNS)](http://msdn.microsoft.com/library/azure/jj156088.aspx).
+    - **Name** - The name of your virtual network.
+    - **Address space** - Choose an address space for the virtual network that is large enough to provide addresses for all nodes in the cluster. Otherwise the provision will fail. For walking through this tutorial, you can pick any of the three choices.
+    - **Maximum VM count** - Choose one of the Maximum VM counts. This value determines the number of possible hosts (virtual machines) that can be created under the address space. For walking through this tutorial, **4096 [CIDR: /20]** is sufficient.
+    - **Location** - The location must be the same as the HBase cluster that you will create.
+    - **DNS server** - This tutorial uses an internal Domain Name System (DNS) server provided by Azure, so you can choose **None**. More advanced networking configuration with custom DNS servers are also supported. For detailed guidance, see [Name Resolution (DNS)](http://msdn.microsoft.com/library/azure/jj156088.aspx).
 4. Click **CREATE A VIRTUAL NETWORK**. The new virtual network name will appear in the list. Wait until the Status column shows **Created**.
 5. In the main pane, click the virtual network you just created.
-6. Click **DASHBOARD** on the top of the page, .
-7. Under **quick glance**, make a note of **VIRTUAL NETWORK ID**. You will need it when provisioning HBase cluster.
+6. Click **DASHBOARD** on the top of the page.
+7. Under **quick glance**, make a note of **VIRTUAL NETWORK ID**. You will need it when provisioning the HBase cluster.
 8.
Click **CONFIGURE** on the top of the page.
-9. On the bottom of the page, the default subnet name is **Subnet-1**. You can optionally rename the subnet or add a new subnet for the HBase cluster. Make a note of the subnet name, you will need it when provisioning the cluster
-10. Verify the **CIDR(ADDRESS COUNT)** for the subnet that will be used for the cluster. The address count must be greater than the number of worker nodes plus seven (Gateway: 2, Headnode: 2, Zookeeper: 3). For example, if you need a 10 node HBase cluster, the address count for the subnet must be greater than 17 (10+7). Otherwise the deployment will fail.
+9. On the bottom of the page, the default subnet name is **Subnet-1**. You can optionally rename the subnet or add a new subnet for the HBase cluster. Make a note of the subnet name; you will need it when provisioning the cluster.
+10. Verify the **CIDR(ADDRESS COUNT)** for the subnet that will be used for the cluster. The address count must be greater than the number of worker nodes plus seven (gateway: 2, head node: 2, Zookeeper: 3). For example, if you need a 10-node HBase cluster, the address count for the subnet must be greater than 17 (10+7). Otherwise the deployment will fail.
 > [WACOM.NOTE] It is highly recommended to designate a single subnet for one cluster.
@@ -72,56 +72,56 @@ Before you begin this tutorial, you must have the following:
 > [WACOM.NOTE] HDInsight clusters use Azure Blob storage for storing data. For more information, see [Use Azure Blob storage with Hadoop in HDInsight][hdinsight-storage]. You will need a storage account and a Blob storage container. The storage account location must match the virtual network location and the cluster location.
-**To create an Azure Storage account and a Blob storage container:**
+**To create an Azure Storage account and a Blob storage container**
-1. Sign in to the [Azure Management Portal][azure-portal].
-2.
Click **NEW** on the lower left corner, point to **DATA SERVICES**, point to **STORAGE**, and then click **QUICK CREATE**.
+1. Sign in to the [Azure portal][azure-portal].
+2. Click **NEW** in the lower-left corner, point to **DATA SERVICES**, point to **STORAGE**, and then click **QUICK CREATE**.
 3. Type or select the following values:
-    - **URL**: The name of the storage account
-    - **LOCATION**: The location of the storage account. Make sure it matches the virtual network location. Affinity groups are not supported.
-    - **REPLICATION**: For testing purposes, use **Locally Redundant** to reduce the cost.
+    - **URL** - The name of the Storage account.
+    - **LOCATION** - The location of the Storage account. Make sure it matches the virtual network location. Affinity groups are not supported.
+    - **REPLICATION** - For testing purposes, use **Locally Redundant** to reduce the cost.
-4. Click **CREATE STORAGE ACCOUNT**. You will see the new storage account in the storage list.
-5. Wait until the **STATUS** of the new storage account changes to **Online**.
-6. Click the new storage account from the list to select it.
+4. Click **CREATE STORAGE ACCOUNT**. You will see the new Storage account in the storage list.
+5. Wait until the **STATUS** of the new Storage account changes to **Online**.
+6. Click the new Storage account from the list to select it.
 7. Click **MANAGE ACCESS KEYS** from the bottom of the page.
-8. Make a note of the **STORAGE ACCOUNT NAME** and the **PRIMARY ACCESS KEY** (or the **SECONDARY ACCESS KEY**. Either of the keys works). You will need them later in the tutorial.
+8. Make a note of the **STORAGE ACCOUNT NAME** and the **PRIMARY ACCESS KEY** (or the **SECONDARY ACCESS KEY**. Either of the keys works). You will need them later in the tutorial.
 9. From the top of the page, click **CONTAINER**.
 10. From the bottom of the page, click **ADD**.
-11. Enter the container name. This container will be used as the default container for the HBase cluster.
By default, the default container name matches the cluster name. Keep the **ACCESS** field as **Private**.
-12. Click the check icon to create the container.
+11. Enter the container name. This container will be used as the default container for the HBase cluster. By default, the default container name matches the cluster name. Keep the **ACCESS** field as **Private**.
+12. Click the checkmark to create the container.
-**To provision an HBase cluster using the Azure Portal:**
+**To provision an HBase cluster by using the Azure portal**
-> [WACOM.NOTE] For information on provisioning a new HBase cluster using PowerShell, see [Provision an HBase cluster using PowerShell](#powershell).
+> [WACOM.NOTE] For information on provisioning a new HBase cluster by using Azure PowerShell, see [Provision an HBase cluster using Azure PowerShell](#powershell).
-1. Sign in to the [Azure Management Portal][azure-portal].
+1. Sign in to the [Azure portal][azure-portal].
-2. Click **NEW** on the lower left corner, point to **DATA SERVICES**, point to **HDINSIGHT**, and then click **CUSTOM CREATE**.
+2. Click **NEW** in the lower-left corner, point to **DATA SERVICES**, point to **HDINSIGHT**, and then click **CUSTOM CREATE**.
-3. Enter a CLUSTER NAME, select the CLUSTER TYPE as HBase, select the Windows Server 2012 operating system, select the HDINSIGHT version, and then click the right button.
+3. Enter a **CLUSTER NAME**, select the **CLUSTER TYPE** as HBase, select the Windows Server 2012 operating system, select the HDInsight version, and then click the right button.
     ![Provide details for the HBase cluster][img-provision-cluster-page1]
     > [WACOM.NOTE] For an HBase cluster, Windows Server is the only available OS option.
-4. On the Configure Cluster page, enter or select the following:
+4. On the **Configure Cluster** page, enter or select the following:
     ![Provide details for the HBase cluster](./media/hdinsight-hbase-provision-vnet/hbasewizard2.png)
- +
 Property | Value
 Data nodes | Number of data nodes you want to deploy. For testing purposes, create a single node cluster. The cluster size limit varies for Azure subscriptions. Contact Azure billing support to increase the limit.
-Region/Virtual network | Select a region or an Azure Virtual Network, if you have already created. For this tutorial, select the network that you created earlier, and then select a corresponding subnet. The default name is Subnet-1.
+Region/Virtual network | Select a region or an Azure virtual network, if you have one already created. For this tutorial, select the network that you created earlier, and then select a corresponding subnet. The default name is Subnet-1.
 Head node size | Select a VM size for the head node.
 Data node size | Select a VM size for the data nodes.
-Zookeper size | Select a VM size for the zookeper node.
+Zookeeper size | Select a VM size for the Zookeeper node.
-    >[WACOM.NOTE] Based on the choice of VMs, your cost might vary. HDInsight uses all standard tier VMs for cluster nodes. For information on how VM sizes affect your prices, see HDInsight Pricing.
+    >[WACOM.NOTE] Based on the choice of VMs, your cost might vary. HDInsight uses all standard-tier VMs for cluster nodes. For information on how VM sizes affect your prices, see HDInsight Pricing.
     Click the right button.
@@ -129,12 +129,12 @@ Before you begin this tutorial, you must have the following:
 6. On the **Storage Account** page, provide the following value:
-    ![Provide storage account for Hadoop HDInsight cluster](./media/hdinsight-hbase-provision-vnet/hbasewizard4.png)
+    ![Provide Storage account for Hadoop HDInsight cluster](./media/hdinsight-hbase-provision-vnet/hbasewizard4.png)
- - + - +
 Property | Value
 Storage Account
-Specify the Azure storage account that will be used as the default file system for the HDInsight cluster. You can choose one of the three options:
+Specify the Azure Storage account that will be used as the default file system for the HDInsight cluster. You can choose one of the three options:
   • Use Existing Storage
   • Create New Storage
   •
@@ -143,22 +143,22 @@ Before you begin this tutorial, you must have the following:
 Account Name
-  • If you chose to use existing storage, for Account name, select an exising storage account. The drop-down only lists the storage accounts located in the same data center where you chose to provision the cluster.
-  • If you chose Create new storage or Use storage from another subscription option, you must provide the storage account name.
+  • If you chose to use existing storage, for Account name, select an existing storage account. The drop-down lists only the Storage accounts located in the same data center where you chose to provision the cluster.
+  • If you chose Create new storage or Use storage from another subscription option, you must provide the Storage account name.
 Account Key
-If you chose the Use Storage From Another Subscription option, specify the account key for that storage account.
+If you chose the Use Storage From Another Subscription option, specify the account key for that Storage account.
 Default container
-Specifies the default container on the storage account that is used as the default file system for the HDInsight cluster. If you chose Use Existing Storage for the Storage Account field, and there are no existing containers in that account, the container is created by default with a the same name as the cluster name. If a container with the name of the cluster already exists, a sequence number will be appended to the container name. For example, mycontainer1, mycontainer2, and so on. However, if the existing storage account has a container with a name different from the cluster name you specified, you can use that container as well.
-If you chose to create a new storage or use storage from another Azure subscription, you must specify the default container name
+Specifies the default container on the Storage account that is used as the default file system for the HDInsight cluster. If you chose Use Existing Storage for the Storage Account field, and there are no existing containers in that account, the container is created by default with the same name as the cluster name. If a container with the name of the cluster already exists, a sequence number will be appended to the container name. For example, mycontainer1, mycontainer2, and so on. However, if the existing Storage account has a container with a name different from the cluster name you specified, you can use that container as well.
+If you chose to create a new storage or use storage from another Azure subscription, you must specify the default container name.
 Additional Storage Accounts
-If required, specify additonal storage accounts for the cluster. HDInsight supports multiple storage accounts. There is no limit on the additional storage account that can be used by a cluster. However, if you create a cluster using the Management Portal, you have a limit of seven due to the UI constraints. Each additional storage account you specify adds an extra Storage Account page to the wizard where you can specify the account information. For example, in the screenshot above, one additional storage account is selected, and hence page 5 is added to the dialog.
+If required, specify additional Storage accounts for the cluster. HDInsight supports multiple Storage accounts. There is no limit on the additional Storage account that can be used by a cluster. However, if you create a cluster by using the Azure portal, you have a limit of seven due to the UI constraints. Each additional Storage account you specify adds an extra **Storage Account** page to the wizard where you can specify the account information. For example, in the screenshot above, one additional Storage account is selected, and hence page 5 is added to the dialog.
     Click the right arrow.
-7. On the **Script Actions** page, select the **Checkmark** in the lower right corner. **Do not click the add script button**, as this tutorial does not require a customized cluster setup.
+7. On the **Script Actions** page, select the checkmark in the lower-right corner. **Do not click the add script button**, as this tutorial does not require a customized cluster setup.
     ![Configure Script Action to customize an HDInsight HBase cluster][img-provision-cluster-page5]
@@ -166,17 +166,17 @@ Before you begin this tutorial, you must have the following:
 To begin working with your new HBase cluster, you can use the procedures found in [Get started using HBase with Hadoop in HDInsight][hbase-get-started].
-##Connect to the HBase cluster provisioned in virtual network using HBase Java RPC APIs
+##Connect to the HBase cluster provisioned in the virtual network by using HBase Java RPC APIs
-1. Provision an IaaS virtual machine into the same Azure virtual network and the same subnet. So both the virtual machine and the HBase cluster use the same internal DNS server to resolve host names. To do so, you must choose the From Gallery option, and select the virtual network instead of a data center. For the instructions, see [Create a Virtual Machine Running Windows Server][vm-create]. A standard Windows Server 2012 image with a small VM size is sufficient.
+1. Provision an infrastructure as a service (IaaS) virtual machine into the same Azure virtual network and the same subnet. So both the virtual machine and the HBase cluster use the same internal DNS server to resolve host names. To do so, you must choose the **From Gallery** option, and select the virtual network instead of a data center. For instructions, see [Create a Virtual Machine Running Windows Server][vm-create]. A standard Windows Server 2012 image with a small VM size is sufficient.
-2.
When using a Java application to connect to HBase remotely, you must use the fully qualified domain name (FQDN). To determine this, we must get the connection-specific DNS suffix of the HBase cluster. To do that use Curl to query Ambari, or remote desktop to connect to the cluster.
+2. When using a Java application to connect to HBase remotely, you must use the fully qualified domain name (FQDN). To determine this, we must get the connection-specific DNS suffix of the HBase cluster. To do that, use Curl to query Ambari, or use Remote Desktop to connect to the cluster.
-    * **Curl** - use the following command:
+    * **Curl** - Use the following command:
         curl -u : -k https://.azurehdinsight.net/ambari/api/v1/clusters/.azurehdinsight.net/services/hbase/components/hbrest
-    In the JSON data returned, find the "host_name" entry. This will contain the fully qualified domain name (FQDN) for the nodes in the cluster. For example:
+    In the JSON data returned, find the "host_name" entry. This will contain the FQDN for the nodes in the cluster. For example:
         ...
         "host_name": "wordkernode0..b1.cloudapp.net
@@ -184,7 +184,7 @@ To begin working with your new HBase cluster, you can use the procedures found i
     The portion of the domain name beginning with the cluster name is the DNS suffix. For example, mycluster.b1.cloudapp.net.
-    * **PowerShell** - use the following PowerShell script to register the **Get-ClusterDetail** function, which can be used to return the DNS suffix.
+    * **Azure PowerShell** - Use the following Azure PowerShell script to register the **Get-ClusterDetail** function, which can be used to return the DNS suffix:
         function Get-ClusterDetail(
             [String]
@@ -203,7 +203,7 @@ To begin working with your new HBase cluster, you can use the procedures found i
         {
         <#
             .SYNOPSIS
-             Displays information to facilitate HDInsight cluster to cluster scinario within same virtual network.
+             Displays information to facilitate HDInsight cluster to cluster scenario within same virtual network.
             .Description
              This command shows following 4 properties of an HDInsight cluster.
              1. ZookeeperQuorum (only support HBase type cluster)
@@ -276,35 +276,35 @@ To begin working with your new HBase cluster, you can use the procedures found i
             }
         }
-    After running the PowerShell script, use the following command to return the DNS suffix using the Get-ClusterDetail function. Specify your HDInsight HBase cluster name, admin name, and admin password when using this command.
+    After running the Azure PowerShell script, use the following command to return the DNS suffix by using the **Get-ClusterDetail** function. Specify your HDInsight HBase cluster name, admin name, and admin password when using this command.
         Get-ClusterDetail -ClusterDnsName -PropertyName FQDNSuffix -Username -Password
     This will return the DNS suffix. For example, **yourclustername.b4.internal.cloudapp.net**.
-    > [WACOM.NOTE] You can also use Remote Desktop to connect the HBase cluster (you will be connected to the headnode) and run **ipconfig** from a command prompt to obtain the DNS suffix. For instructions on enabling RDP and connect to the cluster using RDP, see [Manage Hadoop clusters in HDInsight using the Azure Management Portal][hdinsight-admin-portal].
+    > [WACOM.NOTE] You can also use Remote Desktop to connect the HBase cluster (you will be connected to the head node) and run **ipconfig** from a command prompt to obtain the DNS suffix. For instructions on enabling RDP and connecting to the cluster by using Remote Desktop Protocol (RDP), see [Manage Hadoop clusters in HDInsight using the Azure portal][hdinsight-admin-portal].
     >
     > ![hdinsight.hbase.dns.surffix][img-dns-surffix]
-To verify that the virtual machine can communicate with the HBase cluster, use the following command `ping headnode0.` from the virtual machine.
For example, ping headnode0.mycluster.b1.cloudapp.net
+To verify that the virtual machine can communicate with the HBase cluster, use the command `ping headnode0.` from the virtual machine. For example, ping headnode0.mycluster.b1.cloudapp.net.
-To use this information in a Java application, you can follow the steps in [Use Maven to build Java applications that use HBase with HDInsight (Hadoop)](azure.microsoft.com/documentation/articles/hdinsight-hbase-build-java-maven/) to create an application. To have the application connect to a remote HBase server, modify the **hbase-site.xml** file in this example to use the FQDN for ZooKeeper. For example:
+To use this information in a Java application, you can follow the steps in [Use Maven to build Java applications that use HBase with HDInsight (Hadoop)](azure.microsoft.com/documentation/articles/hdinsight-hbase-build-java-maven/) to create an application. To have the application connect to a remote HBase server, modify the **hbase-site.xml** file in this example to use the FQDN for Zookeeper. For example:
     hbase.zookeeper.quorum
@@ -313,12 +313,12 @@ To use this information in a Java application, you can follow the steps in [Use
 > [WACOM.NOTE] For more information on name resolution in Azure Virtual Networks, including how to use your own DNS server, see [Name Resolution (DNS)](http://msdn.microsoft.com/library/azure/jj156088.aspx).
-##Provision an HBase cluster using Azure PowerShell
+##Provision an HBase cluster by using Azure PowerShell
-**To provision an HBase cluster using Azure PowerShell**
+**To provision an HBase cluster by using Azure PowerShell**
-1. Open PowerShell ISE.
-2. Copy and paste the following copy into the script pane.
+1. Open the Azure PowerShell Integrated Scripting Environment (ISE).
+2. Copy and paste the following into the script pane:
         $hbaseClusterName = ""
         $hadoopUserName = ""
@@ -348,13 +348,13 @@ To use this information in a Java application, you can follow the steps in [Use
 3.
Click **Run Script**, or press **F5**.
-4. To validate the cluster, you can either check the cluster from the management portal, or run the following PowerShell cmdlet from the bottom pane:
+4. To validate the cluster, you can either check the cluster from the Azure portal, or run the following Azure PowerShell cmdlet from the bottom pane:
         Get-AzureHDInsightCluster
-##Next Steps
+##Next steps
-In this tutorial we have learned how to provision an HBase cluster. To learn more, see:
+In this tutorial we learned how to provision an HBase cluster. To learn more, see:
 - [Get started with HDInsight][hdinsight-get-started]
 - [Provision Hadoop clusters in HDInsight][hdinsight-provision]
@@ -404,4 +404,4 @@ In this tutorial we have learned how to provision an HBase cluster. To learn mor
 [img-dns-surffix]: ./media/hdinsight-hbase-provision-vnet/DNSSuffix.png
 [img-primary-dns-suffix]: ./media/hdinsight-hbase-provision-vnet/PrimaryDNSSuffix.png
 [img-provision-cluster-page1]: ./media/hdinsight-hbase-provision-vnet/hbasewizard1.png "Provision details for the new HBase cluster"
-[img-provision-cluster-page5]: ./media/hdinsight-hbase-provision-vnet/hbasewizard5.png "Use Script Action to customize an HBase cluster"
\ No newline at end of file
+[img-provision-cluster-page5]: ./media/hdinsight-hbase-provision-vnet/hbasewizard5.png "Use Script Action to customize an HBase cluster"