forked from apache/hadoop
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
MAPREDUCE-6495. Docs for archive-logs tool (rkanter)
- Loading branch information
Showing
6 changed files
with
127 additions
and
2 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
85 changes: 85 additions & 0 deletions
85
hadoop-tools/hadoop-archive-logs/src/site/markdown/HadoopArchiveLogs.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
<!--- | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. See accompanying LICENSE file. | ||
--> | ||
|
||
Hadoop Archive Logs Guide | ||
========================= | ||
|
||
- [Overview](#Overview) | ||
- [How to Archive Logs](#How_to_Archive_Logs) | ||
|
||
Overview | ||
-------- | ||
|
||
For clusters with a lot of Yarn aggregated logs, it can be helpful to combine | ||
them into hadoop archives in order to reduce the number of small files, and | ||
hence the stress on the NameNode. This tool provides an easy way to do this. | ||
Aggregated logs in hadoop archives can still be read by the Job History Server | ||
and by the `yarn logs` command. | ||
|
||
For more on hadoop archives, see | ||
[Hadoop Archives Guide](../hadoop-archives/HadoopArchives.html). | ||
|
||
How to Archive Logs | ||
------------------- | ||
|
||
usage: mapred archive-logs | ||
-force Force recreating the working directory if | ||
an existing one is found. This should | ||
only be used if you know that another | ||
instance is not currently running | ||
-help Prints this message | ||
-maxEligibleApps <n> The maximum number of eligible apps to | ||
process (default: -1 (all)) | ||
-maxTotalLogsSize <megabytes> The maximum total logs size (in | ||
megabytes) required to be eligible | ||
(default: 1024) | ||
-memory <megabytes> The amount of memory (in megabytes) for | ||
each container (default: 1024) | ||
-minNumberLogFiles <n> The minimum number of log files required | ||
to be eligible (default: 20) | ||
-verbose Print more details. | ||
|
||
The tool only supports running one instance on a cluster at a time in order | ||
to prevent conflicts. It does this by checking for the existance of a | ||
directory named ``archive-logs-work`` under | ||
``yarn.nodemanager.remote-app-log-dir`` in HDFS | ||
(default: ``/tmp/logs/archive-logs-work``). If for some reason that | ||
directory was not cleaned up properly, and the tool refuses to run, you can | ||
force it with the ``-force`` option. | ||
|
||
The ``-help`` option prints out the usage information. | ||
|
||
The tool works by performing the following procedure: | ||
|
||
1. Determine the list of eligible applications, based on the following | ||
criteria: | ||
- is not already archived | ||
- its aggregation status has successfully completed | ||
- has at least ``-minNumberLogFiles`` log files | ||
- the sum of its log files size is less than ``-maxTotalLogsSize`` megabytes | ||
2. If there are are more than ``-maxEligibleApps`` applications found, the | ||
newest applications are dropped. They can be processed next time. | ||
3. A shell script is generated based on the eligible applications | ||
4. The Distributed Shell program is run with the aformentioned script. It | ||
will run with ``-maxEligibleApps`` containers, one to process each | ||
application, and with ``-memory`` megabytes of memory. Each container runs | ||
the ``hadoop archives`` command for a single application and replaces | ||
its aggregated log files with the resulting archive. | ||
|
||
The ``-verbose`` option makes the tool print more details about what it's | ||
doing. | ||
|
||
The end result of running the tool is that the original aggregated log files for | ||
a processed application will be replaced by a hadoop archive containing all of | ||
those logs. |
30 changes: 30 additions & 0 deletions
30
hadoop-tools/hadoop-archive-logs/src/site/resources/css/site.css
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one or more | ||
* contributor license agreements. See the NOTICE file distributed with | ||
* this work for additional information regarding copyright ownership. | ||
* The ASF licenses this file to You under the Apache License, Version 2.0 | ||
* (the "License"); you may not use this file except in compliance with | ||
* the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
#banner { | ||
height: 93px; | ||
background: none; | ||
} | ||
|
||
#bannerLeft img { | ||
margin-left: 30px; | ||
margin-top: 10px; | ||
} | ||
|
||
#bannerRight img { | ||
margin: 17px; | ||
} | ||
|