Skip to content

Listens to a queue of documents from a stream, batches them, and sends them down stream

Notifications You must be signed in to change notification settings

turtlepa/discovery-queue-manager

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Discovery Queue Manager

Listens to a queue of documents from a stream, batches them, and sends them down-stream.

Streaming contract

Expects data like this from stream IndexDocumentQueue (using this avro schema):

<avro encoded> {"uri":"doc1","type":"test"}
<avro encoded> {"uri":"doc2","type":"test"}
<avro encoded> {"uri":"doc2","type":"test"}
<avro encoded> {"uri":"doc3","type":"test"}
<avro encoded> {"uri":"doc3","type":"test"}
<avro encoded> {"uri":"doc3","type":"test"}

Unique document URIs will be passed along down-stream to stream IndexDocument and looks like this (using this avro schema):

<avro encoded> {"uri":"doc1","type":"test"}
<avro encoded> {"uri":"doc2","type":"test"}
<avro encoded> {"uri":"doc3","type":"test"}

Installation

npm install
npm install -g node-lambda

Setup

node-lambda setup

Copy event.sample.json data into event.json. It's encoded with avro schema in base64, but will eventually resolve to something like this:

{"uri":"doc1","type":"test"}
{"uri":"doc2","type":"test"}
{"uri":"doc2","type":"test"}
{"uri":"doc3","type":"test"}
{"uri":"doc3","type":"test"}
{"uri":"doc3","type":"test"}

Fill in credentials in .env file to write to stream. At least these:

KINESIS_STREAM_NAME_OUT=IndexDocument
GROUP_BY_FIELD=uri
AWS_ACCESS_KEY_ID=xxx
AWS_SECRET_ACCESS_KEY=xxx
AWS_ROLE_ARN=xxx

Run test

node-lambda run

Should produce data like this to the output stream (using this avro schema):

<avro encoded> {"uri":"doc1","type":"test"}
<avro encoded> {"uri":"doc2","type":"test"}
<avro encoded> {"uri":"doc3","type":"test"}

Deploy

Update deploy.env with at least these:

KINESIS_STREAM_NAME_OUT=IndexDocument
GROUP_BY_FIELD=uri

Then run:

node-lambda deploy --functionName manageDocumentQueue --environment production --configFile deploy.env

Will deploy to a Lambda called manageDocumentQueue-production. Add a Kinesis stream trigger to execute function if not already added.

About

Listens to a queue of documents from a stream, batches them, and sends them down stream

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • JavaScript 100.0%