title | description | services | documentationcenter | author | manager | editor | tags | ms.assetid | ms.service | ms.custom | ms.devlang | ms.topic | ms.tgt_pltfrm | ms.workload | ms.date | ms.author |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Create Hadoop clusters using PowerShell - Azure HDInsight \| Microsoft Docs | Learn how to create Hadoop, HBase, Storm, or Spark clusters on Linux for HDInsight by using Azure PowerShell. | hdinsight | | nitinme | jhubbard | cgronlun | azure-portal | 4208deca-d64a-45e1-8948-2673d5d7678c | hdinsight | hdinsightactive | na | article | na | big-data | 08/28/2017 | nitinme |
[!INCLUDE selector]
Azure PowerShell is a powerful scripting environment that you can use to control and automate the deployment and management of your workloads in Microsoft Azure. This document provides information about how to create a Linux-based HDInsight cluster by using Azure PowerShell. It also includes an example script.
[!NOTE] Azure PowerShell is only available on Windows clients. If you are using a Linux, Unix, or Mac OS X client, see Create a Linux-based HDInsight cluster using Azure CLI for information about using the Azure CLI to create a cluster.
You must have the following before starting this procedure:
- An Azure subscription. See Get Azure free trial.
- Azure PowerShell.
  [!IMPORTANT] Azure PowerShell support for managing HDInsight resources using Azure Service Manager is deprecated and was removed on January 1, 2017. The steps in this document use the new HDInsight cmdlets that work with Azure Resource Manager.
  Follow the steps in Install Azure PowerShell to install the latest version of Azure PowerShell. If you have scripts that need to be modified to use the new cmdlets that work with Azure Resource Manager, see Migrating to Azure Resource Manager-based development tools for HDInsight clusters.
[!INCLUDE delete-cluster-warning]
To create an HDInsight cluster by using Azure PowerShell, you must complete the following procedures:
- Create an Azure resource group
- Create an Azure Storage account
- Create an Azure Blob container
- Create an HDInsight cluster
The following script demonstrates how to create a new cluster:
[!code-powershellmain]
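As a rough illustration of the four procedures listed above, the following minimal sketch assumes the AzureRM and Azure.Storage PowerShell modules are installed. The resource names, region, cluster type, and node count are hypothetical placeholders chosen for this example, not values from the article's script, and parameter names can differ between module versions.

```powershell
# Sketch only: assumes the AzureRM and Azure.Storage modules are installed.
# Every name below is a hypothetical placeholder; replace it before running.
$resourceGroupName  = "example-rg"
$location           = "East US 2"
$storageAccountName = "examplestorage123"    # lowercase letters and digits only
$clusterName        = "example-hdi-cluster"  # also reused as the container name

# Sign in to Azure (interactive prompt).
Add-AzureRmAccount

# 1. Create an Azure resource group.
New-AzureRmResourceGroup -Name $resourceGroupName -Location $location

# 2. Create an Azure Storage account and read its first access key.
#    Assumption: older AzureRM.Storage versions name this parameter -Type
#    instead of -SkuName.
New-AzureRmStorageAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName `
    -SkuName Standard_LRS `
    -Location $location
$storageAccountKey = (Get-AzureRmStorageAccountKey `
    -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName)[0].Value

# 3. Create an Azure Blob container to hold the cluster's default storage.
$storageContext = New-AzureStorageContext `
    -StorageAccountName $storageAccountName `
    -StorageAccountKey $storageAccountKey
New-AzureStorageContainer -Name $clusterName -Context $storageContext

# Prompt for the cluster login (HTTPS) and SSH user credentials.
$httpCredential = Get-Credential -Message "Enter cluster login credentials" -UserName "admin"
$sshCredential  = Get-Credential -Message "Enter SSH user credentials"

# 4. Create the HDInsight cluster on top of the storage created above.
New-AzureRmHDInsightCluster `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $clusterName `
    -Location $location `
    -ClusterType Hadoop `
    -OSType Linux `
    -ClusterSizeInNodes 4 `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential `
    -DefaultStorageAccountName "$storageAccountName.blob.core.windows.net" `
    -DefaultStorageAccountKey $storageAccountKey `
    -DefaultStorageContainer $clusterName
```

The sketch keeps the storage account and the cluster in the same location and reuses the cluster name as the container name so that the default storage is easy to identify later.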
The values you specify for the cluster login are used to create the Hadoop user account for the cluster. Use this account to connect to services hosted on the cluster such as web UIs or REST APIs.
The values you specify for the SSH user are used to create the SSH user for the cluster. Use this account to start a remote SSH session on the cluster and run jobs. For more information, see the Use SSH with HDInsight document.
[!IMPORTANT] If you plan to use more than 32 worker nodes (either at cluster creation or by scaling the cluster after creation), you must also specify a head node size with at least 8 cores and 14 GB of RAM.
For more information on node sizes and associated costs, see HDInsight pricing.
It can take up to 20 minutes to create a cluster.
You can also create an HDInsight configuration object by using the New-AzureRmHDInsightClusterConfig cmdlet. You can then modify this configuration object to enable additional configuration options for your cluster. Finally, pass the object to the -Config parameter of the New-AzureRmHDInsightCluster cmdlet to apply the configuration.
The following script creates a configuration object to configure an R Server on HDInsight cluster type. The configuration enables an edge node, RStudio, and an additional storage account.
[!code-powershellmain]
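The configuration-object approach might look roughly like the sketch below. It assumes the AzureRM module is installed, that you are already signed in, and that the resource group, both storage accounts, and the default container already exist in the same location. All names and the edge node size are hypothetical placeholders. The -EdgeNodeSize and -RServer parameters are written from memory of the AzureRM.HDInsight module, so confirm them with Get-Help for the version you have installed.

```powershell
# Sketch only: hypothetical placeholder names; the resource group, both
# storage accounts, and the default container are assumed to exist already.
$resourceGroupName            = "example-rg"
$location                     = "East US 2"
$clusterName                  = "example-rserver"
$defaultStorageAccountName    = "exampledefaultstore"
$additionalStorageAccountName = "exampleextrastore"

# Read the first access key of each storage account.
$defaultStorageAccountKey = (Get-AzureRmStorageAccountKey `
    -ResourceGroupName $resourceGroupName `
    -Name $defaultStorageAccountName)[0].Value
$additionalStorageAccountKey = (Get-AzureRmStorageAccountKey `
    -ResourceGroupName $resourceGroupName `
    -Name $additionalStorageAccountName)[0].Value

# Create the configuration object for an R Server cluster with an edge node.
# Assumption: -EdgeNodeSize is available in the installed module version.
$config = New-AzureRmHDInsightClusterConfig `
    -ClusterType "RServer" `
    -EdgeNodeSize "Standard_D12_v2"

# Enable RStudio through the R Server configuration values.
# Assumption: -RServer is available in the installed module version.
$rserverConfig = @{ "RStudio" = "true" }
$config = $config | Add-AzureRmHDInsightConfigValues -RServer $rserverConfig

# Attach the additional storage account to the configuration.
$config = $config | Add-AzureRmHDInsightStorage `
    -StorageAccountName "$additionalStorageAccountName.blob.core.windows.net" `
    -StorageAccountKey $additionalStorageAccountKey

# Prompt for the cluster login (HTTPS) and SSH user credentials.
$httpCredential = Get-Credential -Message "Enter cluster login credentials" -UserName "admin"
$sshCredential  = Get-Credential -Message "Enter SSH user credentials"

# Create the cluster, passing the configuration object through -Config.
New-AzureRmHDInsightCluster `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $clusterName `
    -Location $location `
    -Config $config `
    -OSType Linux `
    -ClusterSizeInNodes 4 `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential `
    -DefaultStorageAccountName "$defaultStorageAccountName.blob.core.windows.net" `
    -DefaultStorageAccountKey $defaultStorageAccountKey `
    -DefaultStorageContainer $clusterName
```

Because the cluster type is set on the configuration object, the final call does not repeat it; the remaining parameters mirror a cluster created without a configuration object.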
[!WARNING] Using a storage account in a different location than the HDInsight cluster is not supported. When using this example, create the additional storage account in the same location as the cluster.
- See Customize HDInsight clusters using Bootstrap.
- See Customize HDInsight clusters using Script Action.
[!INCLUDE delete-cluster-warning]
If you run into issues with creating HDInsight clusters, see access control requirements.
Now that you have successfully created an HDInsight cluster, use the following resources to learn how to work with your cluster.
- Develop Java topologies for Storm on HDInsight
- Use Python components in Storm on HDInsight
- Deploy and monitor topologies with Storm on HDInsight
- Create a standalone application using Scala
- Run jobs remotely on a Spark cluster using Livy
- Spark with BI: Perform interactive data analysis using Spark in HDInsight with BI tools
- Spark with Machine Learning: Use Spark in HDInsight to predict food inspection results
- Spark Streaming: Use Spark in HDInsight for building real-time streaming applications