title: Migrate to Azure Resource Manager development tools for HDInsight clusters | Microsoft Docs
description: How to migrate to Azure Resource Manager development tools for HDInsight clusters
services: hdinsight
editor: cgronlun
manager: jhubbard
author: nitinme
documentationcenter:
ms.assetid: 05efedb5-6456-4552-87ff-156d77fbe2e1
ms.service: hdinsight
ms.workload: big-data
ms.tgt_pltfrm: na
ms.devlang: na
ms.topic: article
ms.date: 10/05/2016
ms.author: nitinme

Migrating to Azure Resource Manager-based development tools for HDInsight clusters

HDInsight is deprecating Azure Service Management (ASM)-based tools for HDInsight. If you have been using Azure PowerShell, Azure CLI, or the HDInsight .NET SDK to work with HDInsight clusters, you are encouraged to use the Azure Resource Manager (ARM)-based versions of PowerShell, CLI, and the .NET SDK going forward. This article provides pointers on how to migrate to the ARM-based approach and, where applicable, points out the differences between the ASM and ARM approaches for HDInsight.

Important

Support for the ASM-based PowerShell, CLI, and .NET SDK will be discontinued on January 1, 2017.

Migrating Azure CLI to Azure Resource Manager

The Azure CLI now defaults to Azure Resource Manager (ARM) mode unless you are upgrading from a previous installation. In that case, you may need to run the azure config mode arm command to switch to ARM mode.
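As a minimal sketch of the mode switch described above, the following shell function runs azure config mode arm after checking that the CLI is on the PATH. The function name ensure_arm_mode and its optional argument (an alternative executable name) are illustrative assumptions and are not part of the Azure CLI.

```shell
#!/bin/sh
# Sketch: make sure the Azure CLI is in ARM mode before running HDInsight commands.
# ensure_arm_mode is a hypothetical helper name; the optional first argument
# is the CLI executable to call (defaults to "azure").
ensure_arm_mode() {
    cli=${1:-azure}
    # Fail early with a readable message if the CLI is not installed.
    if ! command -v "$cli" >/dev/null 2>&1; then
        echo "$cli not found on PATH" >&2
        return 1
    fi
    # This is the command named in the text above.
    "$cli" config mode arm
}

# Example (left commented out because it needs the Azure CLI installed):
# ensure_arm_mode && azure hdinsight cluster list
```

Wrapping the call in a function keeps scripts from continuing in the wrong mode when the CLI is missing.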

The basic commands that the Azure CLI provides for working with HDInsight are the same in Azure Service Management (ASM) and ARM modes; however, some parameters and switches may have new names, and many new parameters are available in ARM mode. For example, you can now use azure hdinsight cluster create to specify the Azure Virtual Network that a cluster should be created in, or Hive and Oozie metastore information.

Basic commands for working with HDInsight through Azure Resource Manager are:

  • azure hdinsight cluster create - creates a new HDInsight cluster
  • azure hdinsight cluster delete - deletes an existing HDInsight cluster
  • azure hdinsight cluster show - displays information about an existing cluster
  • azure hdinsight cluster list - lists HDInsight clusters for your Azure subscription

Use the -h switch to inspect the parameters and switches available for each command.

New commands

New commands available with Azure Resource Manager are:

  • azure hdinsight cluster resize - dynamically changes the number of worker nodes in the cluster
  • azure hdinsight cluster enable-http-access - enables HTTPS access to the cluster (on by default)
  • azure hdinsight cluster disable-http-access - disables HTTPS access to the cluster
  • azure hdinsight cluster enable-rdp-access - enables Remote Desktop Protocol on a Windows-based HDInsight cluster
  • azure hdinsight cluster disable-rdp-access - disables Remote Desktop Protocol on a Windows-based HDInsight cluster
  • azure hdinsight script-action - provides commands for creating/managing Script Actions on a cluster
  • azure hdinsight config - provides commands for creating a configuration file that can be used with the hdinsight cluster create command to provide configuration information.

Deprecated commands

The azure hdinsight job commands, which submit jobs to an HDInsight cluster, are not available in ARM mode. If you need to programmatically submit jobs to HDInsight from scripts, use the REST APIs provided by HDInsight instead. For more information on submitting jobs using REST APIs, see the following documents.

For information on other ways to run MapReduce, Hive, and Pig interactively, see Use MapReduce with Hadoop on HDInsight, Use Hive with Hadoop on HDInsight, and Use Pig with Hadoop on HDInsight.
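As a hedged illustration of the REST-based approach, the sketch below submits a Hive query with curl through the WebHCat (Templeton) endpoint that HDInsight clusters expose over HTTPS. The function names, the cluster name, the credentials, and the /example/curl status directory are placeholder assumptions; check the endpoint and parameters against the REST documentation for your cluster version.

```shell
#!/bin/sh
# Sketch: submit a Hive query to HDInsight through the WebHCat (Templeton) REST API.
# webhcat_hive_url and submit_hive_query are hypothetical helper names.

# Print the URL of the WebHCat Hive endpoint for the given cluster name.
webhcat_hive_url() {
    printf 'https://%s.azurehdinsight.net/templeton/v1/hive\n' "$1"
}

# Submit a Hive query using the cluster's HTTP (admin) credentials.
# The JSON response is written to stdout.
submit_hive_query() {
    cluster=$1; user=$2; pass=$3; query=$4
    curl -s -u "$user:$pass" \
        -d user.name="$user" \
        --data-urlencode execute="$query" \
        -d statusdir="/example/curl" \
        "$(webhcat_hive_url "$cluster")"
}

# Example (left commented out because it needs a running cluster):
# submit_hive_query myhdicluster admin 'mypassword' 'SHOW TABLES;'
```

The statusdir value is where WebHCat writes the job's status and output files in the cluster's default storage, so a script can poll that location after submitting.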

Examples

Creating a cluster

  • Old command (ASM) - azure hdinsight cluster create myhdicluster --location northeurope --osType linux --storageAccountName mystorage --storageAccountKey <storagekey> --storageContainer mycontainer --userName admin --password mypassword --sshUserName sshuser --sshPassword mypassword
  • New command (ARM) - azure hdinsight cluster create myhdicluster -g myresourcegroup --location northeurope --osType linux --clusterType hadoop --defaultStorageAccountName mystorage --defaultStorageAccountKey <storagekey> --defaultStorageContainer mycontainer --userName admin --password mypassword --sshUserName sshuser --sshPassword mypassword

Deleting a cluster

  • Old command (ASM) - azure hdinsight cluster delete myhdicluster
  • New command (ARM) - azure hdinsight cluster delete myhdicluster -g myresourcegroup

List clusters

  • Old command (ASM) - azure hdinsight cluster list
  • New command (ARM) - azure hdinsight cluster list

Note

For the list command, specifying the resource group using -g will return only the clusters in the specified resource group.

Show cluster information

  • Old command (ASM) - azure hdinsight cluster show myhdicluster
  • New command (ARM) - azure hdinsight cluster show myhdicluster -g myresourcegroup

Migrating Azure PowerShell to Azure Resource Manager

For general information about Azure PowerShell in Azure Resource Manager (ARM) mode, see Using Azure PowerShell with Azure Resource Manager.

The Azure PowerShell ARM cmdlets can be installed side-by-side with the ASM cmdlets. The cmdlets from the two modes can be distinguished by their names: ARM cmdlet names contain AzureRmHDInsight, whereas ASM cmdlet names contain AzureHDInsight. For example, New-AzureRmHDInsightCluster vs. New-AzureHDInsightCluster. Parameters and switches may have new names, and many new parameters are available when using ARM. For example, several cmdlets require a new parameter called -ResourceGroupName.

Before you can use the HDInsight cmdlets, you must connect to your Azure account (Login-AzureRmAccount) and create a resource group (New-AzureRmResourceGroup).

Renamed cmdlets

To list the HDInsight ARM cmdlets in the Windows PowerShell console:

help *azurermhdinsight*

The following table lists the ASM cmdlets and their names in the ARM mode:

ASM cmdlets | ARM cmdlets
Add-AzureHDInsightConfigValues | Add-AzureRmHDInsightConfigValues
Add-AzureHDInsightMetastore | Add-AzureRmHDInsightMetastore
Add-AzureHDInsightScriptAction | Add-AzureRmHDInsightScriptAction
Add-AzureHDInsightStorage | Add-AzureRmHDInsightStorage
Get-AzureHDInsightCluster | Get-AzureRmHDInsightCluster
Get-AzureHDInsightJob | Get-AzureRmHDInsightJob
Get-AzureHDInsightJobOutput | Get-AzureRmHDInsightJobOutput
Get-AzureHDInsightProperties | Get-AzureRmHDInsightProperties
Grant-AzureHDInsightHttpServicesAccess | Grant-AzureRmHDInsightHttpServicesAccess
Grant-AzureHdinsightRdpAccess | Grant-AzureRmHDInsightRdpServicesAccess
Invoke-AzureHDInsightHiveJob | Invoke-AzureRmHDInsightHiveJob
New-AzureHDInsightCluster | New-AzureRmHDInsightCluster
New-AzureHDInsightClusterConfig | New-AzureRmHDInsightClusterConfig
New-AzureHDInsightHiveJobDefinition | New-AzureRmHDInsightHiveJobDefinition
New-AzureHDInsightMapReduceJobDefinition | New-AzureRmHDInsightMapReduceJobDefinition
New-AzureHDInsightPigJobDefinition | New-AzureRmHDInsightPigJobDefinition
New-AzureHDInsightSqoopJobDefinition | New-AzureRmHDInsightSqoopJobDefinition
New-AzureHDInsightStreamingMapReduceJobDefinition | New-AzureRmHDInsightStreamingMapReduceJobDefinition
Remove-AzureHDInsightCluster | Remove-AzureRmHDInsightCluster
Revoke-AzureHDInsightHttpServicesAccess | Revoke-AzureRmHDInsightHttpServicesAccess
Revoke-AzureHdinsightRdpAccess | Revoke-AzureRmHDInsightRdpServicesAccess
Set-AzureHDInsightClusterSize | Set-AzureRmHDInsightClusterSize
Set-AzureHDInsightDefaultStorage | Set-AzureRmHDInsightDefaultStorage
Start-AzureHDInsightJob | Start-AzureRmHDInsightJob
Stop-AzureHDInsightJob | Stop-AzureRmHDInsightJob
Use-AzureHDInsightCluster | Use-AzureRmHDInsightCluster
Wait-AzureHDInsightJob | Wait-AzureRmHDInsightJob
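Because the renames in the table above follow a regular pattern, a first pass over existing PowerShell scripts can be automated. The following is a minimal sketch using sed. The function name asm_to_arm_cmdlets is a hypothetical helper, and the two substitution rules are derived only from the table: the two RDP cmdlets gain "Services" in their names, and every other cmdlet only gains the "Rm" infix.

```shell
#!/bin/sh
# Sketch: rewrite ASM HDInsight cmdlet names in PowerShell script text to ARM names.
# Reads script text on stdin and writes the converted text to stdout.
# asm_to_arm_cmdlets is a hypothetical helper name.
asm_to_arm_cmdlets() {
    # Rule 1: the RDP cmdlets are renamed to ...RdpServicesAccess
    #         (the ASM names are spelled "Hdinsight" in the table).
    # Rule 2: all remaining cmdlets only gain the "Rm" infix.
    # Names that already contain "AzureRm" match neither rule, so the
    # conversion can be run more than once safely.
    sed \
        -e 's/-AzureH[Dd][Ii]nsightRdpAccess/-AzureRmHDInsightRdpServicesAccess/g' \
        -e 's/-AzureHDInsight/-AzureRmHDInsight/g'
}

# Example (old-script.ps1 is a placeholder file name):
# asm_to_arm_cmdlets < old-script.ps1 > new-script.ps1
```

This only renames cmdlets. Parameter changes, such as the added -ResourceGroupName parameter, still need to be reviewed by hand.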

New cmdlets

The following are the new cmdlets that are only available in the ARM mode.

Script action related cmdlets:

  • Get-AzureRmHDInsightPersistedScriptAction: Gets the persisted script actions for a cluster and lists them in chronological order, or gets details for a specified persisted script action.
  • Get-AzureRmHDInsightScriptActionHistory: Gets the script action history for a cluster and lists it in reverse chronological order, or gets details of a previously executed script action.
  • Remove-AzureRmHDInsightPersistedScriptAction: Removes a persisted script action from an HDInsight cluster.
  • Set-AzureRmHDInsightPersistedScriptAction: Sets a previously executed script action to be a persisted script action.
  • Submit-AzureRmHDInsightScriptAction: Submits a new script action to an Azure HDInsight cluster.

For additional usage information, see Customize Linux-based HDInsight clusters using Script Action.

Cluster identity related cmdlets:

Examples

Create cluster

Old command (ASM):

New-AzureHDInsightCluster `
    -Name $clusterName `
    -Location $location `
    -DefaultStorageAccountName "$storageAccountName.blob.core.windows.net" `
    -DefaultStorageAccountKey $storageAccountKey `
    -DefaultStorageContainerName $containerName `
    -ClusterSizeInNodes 2 `
    -ClusterType Hadoop `
    -OSType Linux `
    -Version "3.2" `
    -Credential $httpCredential `
    -SshCredential $sshCredential

New command (ARM):

New-AzureRmHDInsightCluster `
    -ClusterName $clusterName `
    -ResourceGroupName $resourceGroupName `
    -Location $location `
    -DefaultStorageAccountName "$storageAccountName.blob.core.windows.net" `
    -DefaultStorageAccountKey $storageAccountKey `
    -DefaultStorageContainer $containerName  `
    -ClusterSizeInNodes 2 `
    -ClusterType Hadoop `
    -OSType Linux `
    -Version "3.2" `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential

Delete cluster

Old command (ASM):

Remove-AzureHDInsightCluster -name $clusterName 

New command (ARM):

Remove-AzureRmHDInsightCluster -ResourceGroupName $resourceGroupName -ClusterName $clusterName 

List cluster

Old command (ASM):

Get-AzureHDInsightCluster

New command (ARM):

Get-AzureRmHDInsightCluster 

Show cluster

Old command (ASM):

Get-AzureHDInsightCluster -Name $clusterName

New command (ARM):

Get-AzureRmHDInsightCluster -ResourceGroupName $resourceGroupName -clusterName $clusterName

Other samples

Migrating to the ARM-based HDInsight .NET SDK

The Azure Service Management (ASM)-based HDInsight .NET SDK is now deprecated. You are encouraged to use the Azure Resource Manager (ARM)-based HDInsight .NET SDK. The following ASM-based HDInsight packages are being deprecated.

  • Microsoft.WindowsAzure.Management.HDInsight
  • Microsoft.Hadoop.Client

This section provides pointers to more information on how to perform certain tasks using the ARM-based SDK.

How to... using the ARM-based HDInsight SDK | Links
Create HDInsight clusters using .NET SDK | See Create HDInsight clusters using .NET SDK
Customize a cluster using Script Action with .NET SDK | See Customize HDInsight Linux clusters using Script Action
Authenticate applications interactively using Azure Active Directory with .NET SDK | See Run Hive queries using .NET SDK. The code snippet in this article uses the interactive authentication approach.
Authenticate applications non-interactively using Azure Active Directory with .NET SDK | See Create non-interactive applications for HDInsight
Submit a Hive job using .NET SDK | See Submit Hive jobs
Submit a Pig job using .NET SDK | See Submit Pig jobs
Submit a Sqoop job using .NET SDK | See Submit Sqoop jobs
List HDInsight clusters using .NET SDK | See List HDInsight clusters
Scale HDInsight clusters using .NET SDK | See Scale HDInsight clusters
Grant/revoke access to HDInsight clusters using .NET SDK | See Grant/revoke access to HDInsight clusters
Update HTTP user credentials for HDInsight clusters using .NET SDK | See Update HTTP user credentials for HDInsight clusters
Find the default storage account for HDInsight clusters using .NET SDK | See Find the default storage account for HDInsight clusters
Delete HDInsight clusters using .NET SDK | See Delete HDInsight clusters

Examples

The following examples show how an operation is performed with the ASM-based SDK, along with the equivalent code for the ARM-based SDK.

Creating a cluster CRUD client

  • Old command (ASM)

      //Certificate auth
      //This logs the application in using a subscription administration certificate, which is not offered in Azure Resource Manager (ARM)
    
      const string subid = "454467d4-60ca-4dfd-a556-216eeeeeeee1";
      var cred = new HDInsightCertificateCredential(new Guid(subid), new X509Certificate2(@"path\to\certificate.cer"));
      var client = HDInsightClient.Connect(cred);
    
  • New command (ARM) (Service principal authorization)

      //Service principal auth
      //This will log the application in as itself, rather than on behalf of a specific user.
      //For details, including how to set up the application, see:
      //   https://azure.microsoft.com/en-us/documentation/articles/hdinsight-create-non-interactive-authentication-dotnet-applications/
    
      var authFactory = new AuthenticationFactory();
    
      var account = new AzureAccount { Type = AzureAccount.AccountType.ServicePrincipal, Id = clientId };
    
      var env = AzureEnvironment.PublicEnvironments[EnvironmentName.AzureCloud];
    
      var accessToken = authFactory.Authenticate(account, env, tenantId, secretKey, ShowDialog.Never).AccessToken;
    
      var creds = new TokenCloudCredentials(subId.ToString(), accessToken);
    
      _hdiManagementClient = new HDInsightManagementClient(creds);
    
  • New command (ARM) (User authorization)

      //User auth
      //This will log the application in on behalf of the user.
      //The end-user will see a login popup.
    
      var authFactory = new AuthenticationFactory();
    
      var account = new AzureAccount { Type = AzureAccount.AccountType.User, Id = username };
    
      var env = AzureEnvironment.PublicEnvironments[EnvironmentName.AzureCloud];
    
      var accessToken = authFactory.Authenticate(account, env, AuthenticationFactory.CommonAdTenant, password, ShowDialog.Auto).AccessToken;
    
      var creds = new TokenCloudCredentials(subId.ToString(), accessToken);
    
      _hdiManagementClient = new HDInsightManagementClient(creds);
    

Creating a cluster

  • Old command (ASM)

      var clusterInfo = new ClusterCreateParameters
      {
          Name = dnsName,
          DefaultStorageAccountKey = key,
          DefaultStorageContainer = defaultStorageContainer,
          DefaultStorageAccountName = storageAccountDnsName,
          ClusterSizeInNodes = 1,
          ClusterType = type,
          Location = "West US",
          UserName = "admin",
          Password = "*******",
          Version = version,
          HeadNodeSize = NodeVMSize.Large,
      };
      clusterInfo.CoreConfiguration.Add(new KeyValuePair<string, string>("config1", "value1"));
      client.CreateCluster(clusterInfo);
    
  • New command (ARM)

      var clusterCreateParameters = new ClusterCreateParameters
      {
          Location = "West US",
          ClusterType = "Hadoop",
          Version = "3.1",
          OSType = OSType.Windows,
          DefaultStorageAccountName = "mystorage.blob.core.windows.net",
          DefaultStorageAccountKey =
              "O9EQvp3A3AjXq/W27rst1GQfLllhp0gUeiUUn2D8zX2lU3taiXSSfqkZlcPv+nQcYUxYw==",
          UserName = "hadoopuser",
          Password = "*******",
          HeadNodeSize = "ExtraLarge",
          RdpUsername = "hdirp",
          RdpPassword = "*******",
          RdpAccessExpiry = new DateTime(2025, 3, 1),
          ClusterSizeInNodes = 5
      };
      //Core-site settings are supplied as a dictionary under ConfigurationKey.CoreSite
      var coreConfigs = new Dictionary<string, string> {{"config1", "value1"}};
      clusterCreateParameters.Configurations.Add(ConfigurationKey.CoreSite, coreConfigs);
    

Enabling HTTP access

  • Old command (ASM)

      client.EnableHttp(dnsName, "West US", "admin", "*******");
    
  • New command (ARM)

      var httpParams = new HttpSettingsParameters
      {
          HttpUserEnabled = true,
          HttpUsername = "admin",
          HttpPassword = "*******",
      };
      client.Clusters.ConfigureHttpSettings(resourceGroup, dnsName, httpParams);
    

Deleting a cluster

  • Old command (ASM)

      client.DeleteCluster(dnsName);
    
  • New command (ARM)

      client.Clusters.Delete(resourceGroup, dnsName);