Max Heiber edited this page Aug 10, 2017 · 6 revisions

Welcome to the amphora-search wiki!

Amphora Search is a library for working with ElasticSearch from a Clay app. For everything except rare admin tasks, you can use amphora-search+amphora instead of talking to ES directly.

See https://github.com/clay/amphora-search#integration for how to configure amphora-search.

Here's what Amphora Search does:

  • Querying: amphora-search exposes the /_search endpoint on amphora. To query ES, send a POST to /_search. The body should be JSON in the ElasticSearch Query DSL format.
  • ES mappings are defined in yml files in the mappings directory you have configured. Amphora-search converts these to JSON and uses them to update the indexes in ES. The format is as described in the ES Mapping Docs. A mapping is like a schema for the documents in an index.
  • Indexing documents (updating data): this is done by executing handlers. Every JS file in the directory with your handlers gets run whenever a component is saved. Handlers have a save function and a when function.
    • when receives an array of ops, which describe what is being sent to Redis: the key, the value, and the type, which is PUT if something is being updated in Redis. If when returns a truthy value, save runs; otherwise save is skipped.
    • save also receives an array of ops. This is a deep copy of what is actually being used to talk to Redis, so it's OK to mutate this array and the objects inside it.
    • The role of save is to update an ES index (update the data in ES). The return value of save is ignored. It's important to filter and transform data in the save function, so that you process only the component instances you care about for the index and extract only the fields you care about for the index.
  • To run code when something is unpublished (for example, to delete documents from an index), add an unpublish hook to a handler. unpublish is called with an object containing two keys: uri and url.
  • amphora-search contains a large number of helpers for writing save and when functions, including functions to help with filtering and with transforming Redis operations to ES operations and removing references to components: https://github.com/clay/amphora-search/blob/master/lib/services/elastic-helpers.test.js.
  • Amphora-search namespaces indexes based on the prefix provided when amphora-search is instantiated (see app.js). For example, if you name an index "recent_published_articles" but configure amphora-search to use the prefix "new_feature", the actual index in ES will be "new_feature_recent_published_articles", even though search results (from hitting <clay>/_search) show the index without the prefix (recent_published_articles). One use case for prefixes is to try different mappings on different branches while sharing an ES instance.
  • When changing a mapping, it is recommended to use ES aliases. For example, the recent_published_articles alias can point to the recent_published_articles_v1 index while you build recent_published_articles_v2. When the new index is ready and the data has been reindexed, point the alias to recent_published_articles_v2. Note that it's not typical to do in-place "migrations" for ES indexes. It makes more sense to query against aliases, reindex the data into a new index with an updated mapping, then re-point the alias to the new index.
    • Quick reference:
      • list ES indexes: curl 'http://<HOST>:9200/_cat/indices?v'
      • list aliases: curl 'http://<HOST>:9200/_aliases?pretty'
      • create alias: curl -X POST 'http://<HOST>:9200/_aliases' -H 'Content-Type: application/json' -d '{ "actions" : [ { "add" : { "index" : "test1", "alias" : "alias1" } } ] }'
      • delete an ES index (careful now): curl -X DELETE http://<HOST>:9200/<index>
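
The handler bullets above (when, save, and the ops array) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not amphora-search's documented API: the in-memory esClient stub, the key pattern /components/article/instances/, the headline and date fields, and the idea that op.value may arrive as a JSON string are all made up for the example.

```javascript
// Sketch of a handler module. Names marked "hypothetical" are assumptions
// for illustration and are not part of amphora-search.

// Hypothetical stand-in for an ES client: stores documents in memory.
const fakeIndex = new Map();
const esClient = {
  put: (index, id, doc) => { fakeIndex.set(`${index}/${id}`, doc); }
};

const INDEX = 'recent_published_articles';

// Assumption: article instances have keys containing this path segment,
// and the op type is "put" (compared case-insensitively).
function isArticleOp(op) {
  return String(op.type).toLowerCase() === 'put' &&
    op.key.includes('/components/article/instances/');
}

// Assumption: op.value may be a JSON string or an already-parsed object.
function parseValue(value) {
  return typeof value === 'string' ? JSON.parse(value) : value;
}

// when: return a truthy value only if some op matters to this index.
function when(ops) {
  return ops.some(isArticleOp);
}

// save: keep only the relevant ops, extract only the indexed fields,
// and write them to the index. The return value is ignored by the caller.
function save(ops) {
  ops.filter(isArticleOp).forEach((op) => {
    const data = parseValue(op.value);
    esClient.put(INDEX, op.key, { headline: data.headline, date: data.date });
  });
}

module.exports = { when, save };

// Example run with made-up ops: one article and one unrelated component.
const ops = [
  {
    type: 'put',
    key: 'example.com/components/article/instances/a1',
    value: JSON.stringify({ headline: 'Hello', date: '2017-08-10', body: '...' })
  },
  {
    type: 'put',
    key: 'example.com/components/paragraph/instances/p1',
    value: JSON.stringify({ text: 'Hi' })
  }
];
if (when(ops)) save(ops);
console.log(fakeIndex.size); // 1 (only the article was indexed)
```

In a real handler the stub client would be replaced by whatever ES client or amphora-search helper your app uses. The part that carries over is the filtering in when and the filter-then-transform step in save.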