title | description | services | author | ms.reviewer | ms.service | ms.topic | ms.date | ms.author | ROBOTS |
---|---|---|---|---|---|---|---|---|---|
Debug and analyze Apache Hadoop services with heap dumps - Azure | Automatically collect heap dumps for Apache Hadoop services and place inside the Azure Blob storage account for debugging and analysis. | hdinsight | hrasheed-msft | jasonh | hdinsight | conceptual | 05/25/2017 | hrasheed | NOINDEX |
[!INCLUDE heapdump-selector]
Heap dumps contain a snapshot of the application's memory, including the values of variables at the time the dump was created, so they are useful for diagnosing problems that occur at run time. Heap dumps can be collected automatically for Apache Hadoop services and placed in the user's Azure Blob storage account under HDInsightHeapDumps/.
Heap dump collection is off by default and must be enabled for each service on each cluster. Heap dumps can be large, so once collection is enabled, monitor the Blob storage account where they are saved.
Important
Linux is the only operating system used on HDInsight version 3.4 or greater. For more information, see HDInsight retirement on Windows. The information in this article applies only to Windows-based HDInsight. For information on Linux-based HDInsight, see Enable heap dumps for Apache Hadoop services on Linux-based HDInsight.
You can enable heap dumps for the following services:
- Apache HCatalog - tempelton
- Apache Hive - hiveserver2, metastore, derbyserver
- MapReduce - jobhistoryserver
- Apache YARN - resourcemanager, nodemanager, timelineserver
- Apache HDFS - datanode, secondarynamenode, namenode
To turn on heap dumps for a service, set the following configuration elements in the section for that service, replacing service_name with the name of the service:
"javaargs.<service_name>.XX:+HeapDumpOnOutOfMemoryError" = "-XX:+HeapDumpOnOutOfMemoryError",
"javaargs.<service_name>.XX:HeapDumpPath" = "-XX:HeapDumpPath=c:\Dumps\<service_name>_%date:~4,2%%date:~7,2%%date:~10,2%%time:~0,2%%time:~3,2%%time:~6,2%.hprof"
The value of service_name can be any of the services listed here: tempelton, hiveserver2, metastore, derbyserver, jobhistoryserver, resourcemanager, nodemanager, timelineserver, datanode, secondarynamenode, or namenode.
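The `%date:~4,2%` and `%time:~0,2%` tokens in the HeapDumpPath value use the cmd.exe substring form `%variable:~offset,length%`. The following sketch mimics that expansion in PowerShell to show what file name the pattern would produce. It assumes that the tokens are expanded this way when the service starts, and the sample `$date` and `$time` strings are illustrative values in the en-US format; the real `%date%` and `%time%` output depends on the machine's regional settings.

```powershell
# Sample values only (assumptions): what %date% and %time% might contain on an en-US machine.
$date = 'Thu 05/25/2017'
$time = '14:03:07.45'

# %variable:~offset,length% corresponds to Substring(offset, length).
$parts = @(
    $date.Substring(4, 2)    # %date:~4,2%  -> month
    $date.Substring(7, 2)    # %date:~7,2%  -> day
    $date.Substring(10, 2)   # %date:~10,2% -> first two digits of a four-digit year
    $time.Substring(0, 2)    # %time:~0,2%  -> hours
    $time.Substring(3, 2)    # %time:~3,2%  -> minutes
    $time.Substring(6, 2)    # %time:~6,2%  -> seconds
)

$fileName = "jobhistoryserver_$($parts -join '').hprof"
$fileName   # jobhistoryserver_052520140307.hprof
```

With a four-digit year in this format, offset 10 selects the century digits ("20") rather than the last two digits of the year, so dumps from different years can end up with the same name.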
For example, to turn on heap dumps for jobhistoryserver by using Azure PowerShell, you can use the following script:
[!INCLUDE upgrade-powershell]
$MapRedConfigValues = New-Object 'Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.DataObjects.AzureHDInsightMapReduceConfiguration'
$MapRedConfigValues.Configuration = @{
    "javaargs.jobhistoryserver.XX:+HeapDumpOnOutOfMemoryError" = "-XX:+HeapDumpOnOutOfMemoryError"
    "javaargs.jobhistoryserver.XX:HeapDumpPath" = "-XX:HeapDumpPath=c:\\Dumps\\jobhistoryserver_%date:~4,2%_%date:~7,2%_%date:~10,2%_%time:~0,2%_%time:~3,2%_%time:~6,2%.hprof"
}
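The two settings in the script above differ between services only in the service name, so they can be generated. The following is a minimal sketch of one way to do that. `New-HeapDumpSettings` and its `DumpDirectory` parameter are hypothetical names introduced for illustration; they are not part of Azure PowerShell. The function only builds the hashtable and does not apply it to a cluster.

```powershell
# Hypothetical helper (not part of any Azure module): builds the two
# heap dump settings for one of the service names listed in this article.
function New-HeapDumpSettings {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateSet('tempelton', 'hiveserver2', 'metastore', 'derbyserver',
                     'jobhistoryserver', 'resourcemanager', 'nodemanager',
                     'timelineserver', 'datanode', 'secondarynamenode', 'namenode')]
        [string]$ServiceName,

        # Assumed default, taken from the dump directory used in the article.
        [string]$DumpDirectory = 'c:\Dumps'
    )

    # Timestamp pattern copied from the script above; single quotes keep it literal.
    $stamp = '%date:~4,2%_%date:~7,2%_%date:~10,2%_%time:~0,2%_%time:~3,2%_%time:~6,2%'

    $settings = @{}
    $settings['javaargs.{0}.XX:+HeapDumpOnOutOfMemoryError' -f $ServiceName] = '-XX:+HeapDumpOnOutOfMemoryError'
    $settings['javaargs.{0}.XX:HeapDumpPath' -f $ServiceName] =
        '-XX:HeapDumpPath={0}\{1}_{2}.hprof' -f $DumpDirectory, $ServiceName, $stamp

    return $settings
}

# Example: build the settings for the HDFS name node and list them.
$nameNodeSettings = New-HeapDumpSettings -ServiceName 'namenode'
$nameNodeSettings.GetEnumerator() | ForEach-Object { '{0} = {1}' -f $_.Key, $_.Value }
```

For jobhistoryserver, the returned hashtable could be assigned to `$MapRedConfigValues.Configuration` in place of the literal one. Which configuration object takes the settings for the other services is not covered here.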
Similarly, to turn on heap dumps for jobhistoryserver by using the Azure HDInsight .NET SDK, you can use the following code:
clusterInfo.MapReduceConfiguration.ConfigurationCollection.Add(new KeyValuePair<string, string>("javaargs.jobhistoryserver.XX:+HeapDumpOnOutOfMemoryError", "-XX:+HeapDumpOnOutOfMemoryError"));
clusterInfo.MapReduceConfiguration.ConfigurationCollection.Add(new KeyValuePair<string, string>("javaargs.jobhistoryserver.XX:HeapDumpPath", "-XX:HeapDumpPath=c:\\Dumps\\jobhistoryserver_%date:~4,2%_%date:~7,2%_%date:~10,2%_%time:~0,2%_%time:~3,2%_%time:~6,2%.hprof"));