mapped_pages navigation_title
Reference

{{esh-full}} reference

This part of the documentation explains the core functionality of elasticsearch-hadoop, starting with the configuration options and architecture and gradually covering the major features. At a higher level, the reference is broken down into the general architecture and configuration sections, Map/Reduce and the libraries built on top of it, upcoming computation libraries (like Apache Spark), and finally mapping, metrics, and troubleshooting.

We recommend going through the entire documentation, even superficially, when trying out elasticsearch-hadoop for the first time. However, those in a rush can jump directly to the desired sections:

Architecture : Overview of the elasticsearch-hadoop architecture and how it maps on top of Hadoop.

Configuration : Explore the various configuration switches in elasticsearch-hadoop.

Map/Reduce integration : Describes how to use elasticsearch-hadoop in vanilla Map/Reduce environments. Typically useful for those interested in loading and saving data to/from {{es}} with little, if any, ETL (extract-transform-load).

Apache Hive integration : Hive users should refer to this section.

Apache Spark support : Describes how to use Apache Spark with {{es}} through elasticsearch-hadoop.

Mapping and types : A deep dive into the strategies elasticsearch-hadoop uses for type conversion and mapping to and from {{es}}.

Hadoop Metrics : Describes the metrics reported by elasticsearch-hadoop.

Troubleshooting : Tips on troubleshooting and getting help.