title | description | services | documentationcenter | tags | author | manager | editor | ms.assetid | ms.service | ms.workload | ms.tgt_pltfrm | ms.devlang | ms.topic | ms.date | ms.author |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Debug and analyze Hadoop services with heap dumps \| Microsoft Docs | Automatically collect heap dumps for Hadoop services and place inside the Azure Blob storage account for debugging and analysis. | hdinsight | | azure-portal | mumian | jhubbard | cgronlun | e4ec4ebb-fd32-4668-8382-f956581485c4 | hdinsight | big-data | na | na | article | 02/06/2017 | jgao |
[!INCLUDE heapdump-selector]
Heap dumps contain a snapshot of the application's memory, including the values of variables at the time the dump was created, which makes them useful for diagnosing problems that occur at run time. Heap dumps can be collected automatically for Hadoop services and are placed in the user's Azure Blob storage account under HDInsightHeapDumps/.
Heap dump collection must be enabled for each service on each individual cluster; it is off by default. Heap dumps can be large, so once collection is enabled, monitor the Blob storage account where they are saved.
Important
Linux is the only operating system used on HDInsight version 3.4 or greater. For more information, see HDInsight Deprecation on Windows. The information in this article applies only to Windows-based HDInsight. For information on Linux-based HDInsight, see Enable heap dumps for Hadoop services on Linux-based HDInsight.
You can enable heap dumps for the following services:
- hcatalog - tempelton
- hive - hiveserver2, metastore, derbyserver
- mapreduce - jobhistoryserver
- yarn - resourcemanager, nodemanager, timelineserver
- hdfs - datanode, secondarynamenode, namenode
To turn on heap dumps for a service, set the following configuration elements in the section for that service, replacing <service_name> with the name of the service:
"javaargs.<service_name>.XX:+HeapDumpOnOutOfMemoryError" = "-XX:+HeapDumpOnOutOfMemoryError",
"javaargs.<service_name>.XX:HeapDumpPath" = "-XX:HeapDumpPath=c:\Dumps\<service_name>_%date:~4,2%%date:~7,2%%date:~10,2%%time:~0,2%%time:~3,2%%time:~6,2%.hprof"
The value of service_name can be any of the services listed above: tempelton, hiveserver2, metastore, derbyserver, jobhistoryserver, resourcemanager, nodemanager, timelineserver, datanode, secondarynamenode, or namenode.
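As a rough sketch of how the two elements are assembled for any of these service names, the following PowerShell function builds them as a hashtable. The function name `New-HeapDumpConfiguration` is hypothetical and is not part of any Azure module. The path uses the doubled backslashes and underscore-separated timestamp that appear in this article's jobhistoryserver examples.

```powershell
# Hypothetical helper (not part of any Azure module): returns the two
# heap-dump configuration elements for one service as a hashtable.
function New-HeapDumpConfiguration {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('tempelton', 'hiveserver2', 'metastore', 'derbyserver',
                     'jobhistoryserver', 'resourcemanager', 'nodemanager',
                     'timelineserver', 'datanode', 'secondarynamenode', 'namenode')]
        [string]$ServiceName
    )

    # Single-quoted strings keep the %date% and %time% tokens and the
    # backslashes exactly as written.
    $stamp = '%date:~4,2%_%date:~7,2%_%date:~10,2%_%time:~0,2%_%time:~3,2%_%time:~6,2%'
    $dumpPath = 'c:\\Dumps\\' + $ServiceName + '_' + $stamp + '.hprof'

    $config = @{}
    # Write a dump when the JVM throws OutOfMemoryError.
    $config["javaargs.${ServiceName}.XX:+HeapDumpOnOutOfMemoryError"] = '-XX:+HeapDumpOnOutOfMemoryError'
    # Tell the JVM where to write the dump file.
    $config["javaargs.${ServiceName}.XX:HeapDumpPath"] = "-XX:HeapDumpPath=${dumpPath}"
    return $config
}

# Example: list the generated elements for the namenode service.
$nameNodeConfig = New-HeapDumpConfiguration -ServiceName 'namenode'
$nameNodeConfig.GetEnumerator() | ForEach-Object { '{0} = {1}' -f $_.Key, $_.Value }
```

The returned hashtable has the same shape as the value assigned to the `Configuration` property in the Azure PowerShell example in this article. Whether a given service's settings belong on the MapReduce, Hive, YARN, or HDFS configuration object depends on the section for that service.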
For example, to turn on heap dumps for jobhistoryserver by using Azure PowerShell:
[!INCLUDE upgrade-powershell]
$MapRedConfigValues = new-object 'Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.DataObjects.AzureHDInsightMapReduceConfiguration'
$MapRedConfigValues.Configuration = @{ "javaargs.jobhistoryserver.XX:+HeapDumpOnOutOfMemoryError"="-XX:+HeapDumpOnOutOfMemoryError" ; "javaargs.jobhistoryserver.XX:HeapDumpPath" = "-XX:HeapDumpPath=c:\\Dumps\\jobhistoryserver_%date:~4,2%_%date:~7,2%_%date:~10,2%_%time:~0,2%_%time:~3,2%_%time:~6,2%.hprof" }
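The `%date:~4,2%` and `%time:~0,2%` tokens in the dump path use the cmd.exe substring syntax `%variable:~offset,length%`, so the file name gets a timestamp when the tokens are expanded. The sketch below imitates that expansion in PowerShell to show how the name is composed. The sample date and time strings are assumptions that mimic an en-US regional format; other regional settings arrange the fields differently and would produce different pieces.

```powershell
# Assumed sample values: %date% and %time% as cmd.exe might report them
# under an en-US regional format. These are illustrative, not real output.
$sampleDate = 'Mon 02/06/2017'
$sampleTime = '14:05:09.31'

# %variable:~offset,length% takes `length` characters starting at the
# zero-based `offset`, which is what Substring(offset, length) does.
$parts = @(
    $sampleDate.Substring(4, 2),    # %date:~4,2%
    $sampleDate.Substring(7, 2),    # %date:~7,2%
    $sampleDate.Substring(10, 2),   # %date:~10,2%
    $sampleTime.Substring(0, 2),    # %time:~0,2%
    $sampleTime.Substring(3, 2),    # %time:~3,2%
    $sampleTime.Substring(6, 2)     # %time:~6,2%
)

# Join the pieces with underscores, as in the jobhistoryserver path.
$fileName = 'jobhistoryserver_' + ($parts -join '_') + '.hprof'
$fileName   # jobhistoryserver_02_06_20_14_05_09.hprof
```

With these assumed sample strings, offset 10 of the date falls on the first two digits of the four-digit year, which is why the third piece is `20`.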
For example, to turn on heap dumps for jobhistoryserver by using the Azure HDInsight .NET SDK:
clusterInfo.MapReduceConfiguration.ConfigurationCollection.Add(new KeyValuePair<string, string>("javaargs.jobhistoryserver.XX:+HeapDumpOnOutOfMemoryError", "-XX:+HeapDumpOnOutOfMemoryError"));
clusterInfo.MapReduceConfiguration.ConfigurationCollection.Add(new KeyValuePair<string, string>("javaargs.jobhistoryserver.XX:HeapDumpPath", "-XX:HeapDumpPath=c:\\Dumps\\jobhistoryserver_%date:~4,2%_%date:~7,2%_%date:~10,2%_%time:~0,2%_%time:~3,2%_%time:~6,2%.hprof"));