This folder contains example projects that use the Amazon Textract Response Parser for JavaScript/TypeScript in a range of build environments, to help you get started.
⚠️ Note: While all of the example projects reference local API response JSON files, some also make Amazon Textract API calls by default - so running them may incur (typically very small) charges. See Amazon Textract Pricing for details.
The projects use the local build of the library for pre-publication testing, so you'll need to run `npm run build` in the parent `src-js` folder before they'll work.
To instead switch to published TRP.js versions (if you're using an example as a skeleton for your own project):
- For NodeJS projects, replace the `package.json` relative path in `"amazon-textract-response-parser": "file:../.."` with a normal version spec like `"amazon-textract-response-parser": "^0.4.3"`, and re-run `npm install`
- For browser IIFE projects, edit the `<script>` tag in the HTML to point to your chosen CDN or downloaded `trp.min.js` location
For the example projects that demonstrate actual integration with Amazon Textract, we create a TextractClient with empty configuration. This assumes that your AWS IAM credentials and default region are pre-configured for access through e.g. environment variables.
If you're new to setting up AWS credentials for CLI and SDK access in general, refer to the credentials guidance in the AWS SDK for JavaScript (v3) Developer Guide and/or the AWS CLI user guide.
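Because the examples rely on ambient configuration, a quick pre-flight check can make a missing setting easier to spot. The sketch below is illustrative only: `checkAwsEnv` is a hypothetical helper (not part of TRP or the AWS SDK), and it only looks at a few well-known environment variables. Credentials and region can also come from shared config files, SSO, or instance/container roles, so treat its output as hints rather than errors.

```typescript
// Hypothetical pre-flight helper: inspects environment-style variables only.
// It does NOT replicate the AWS SDK's full credential/region resolution chain.
type Env = Record<string, string | undefined>;

interface EnvCheck {
  hasRegion: boolean;
  hasCredentials: boolean;
  hints: string[];
}

function checkAwsEnv(env: Env): EnvCheck {
  // The AWS SDK for JavaScript (v3) reads the region from AWS_REGION.
  const hasRegion = Boolean(env["AWS_REGION"]);

  // For this sketch, either a static key pair or a named profile counts
  // as "credentials configured via environment".
  const hasKeyPair = Boolean(
    env["AWS_ACCESS_KEY_ID"] && env["AWS_SECRET_ACCESS_KEY"]
  );
  const hasProfile = Boolean(env["AWS_PROFILE"]);
  const hasCredentials = hasKeyPair || hasProfile;

  const hints: string[] = [];
  if (!hasRegion) {
    hints.push("Set AWS_REGION, or a default region in your shared AWS config.");
  }
  if (!hasCredentials) {
    hints.push(
      "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or AWS_PROFILE."
    );
  }
  return { hasRegion, hasCredentials, hints };
}
```

In a NodeJS project you could call `checkAwsEnv(process.env)` before constructing the client and log any hints it returns.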
The 'synchronous' request/response APIs used in these examples generally only support images or single-page documents. Multi-page documents need the asynchronous Amazon Textract APIs instead. Since asynchronous APIs like StartDocumentAnalysis return a job ID rather than an immediate result, applications need to wait and then call GetDocumentAnalysis to retrieve the result once it's ready. You'll also need to upload the source document to Amazon S3 rather than passing it directly in the API request.
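Once a job has finished, its result may be split across several GetDocumentAnalysis responses linked by `NextToken`. As a minimal sketch of collecting them, the function below takes the API call as a callback so it stays independent of any particular SDK setup. `fetchAllResultPages`, `GetPage`, and `AnalysisPage` are hypothetical names for illustration, and the interface lists only the response fields the sketch reads. It assumes the job has already completed, and for simplicity it treats any status other than `SUCCEEDED` (including `PARTIAL_SUCCESS`) as an error.

```typescript
// Minimal shape of the response fields this sketch reads. The real
// GetDocumentAnalysis response carries more (DocumentMetadata, Warnings, ...).
interface AnalysisPage {
  JobStatus?: string;
  StatusMessage?: string;
  NextToken?: string;
  Blocks?: unknown[];
}

// Hypothetical callback wrapping your own GetDocumentAnalysis call.
type GetPage = (jobId: string, nextToken?: string) => Promise<AnalysisPage>;

// Follows NextToken until the final page and returns every page in order.
async function fetchAllResultPages(
  getPage: GetPage,
  jobId: string
): Promise<AnalysisPage[]> {
  const pages: AnalysisPage[] = [];
  let nextToken: string | undefined = undefined;
  do {
    const page: AnalysisPage = await getPage(jobId, nextToken);
    if (page.JobStatus !== "SUCCEEDED") {
      // Simplification: IN_PROGRESS, FAILED and PARTIAL_SUCCESS all end up here.
      throw new Error(
        `Job ${jobId} is ${page.JobStatus ?? "UNKNOWN"}: ` +
          `${page.StatusMessage ?? "no status message"}`
      );
    }
    pages.push(page);
    nextToken = page.NextToken;
  } while (nextToken);
  return pages;
}
```

With the AWS SDK for JavaScript (v3), the callback could be something like `(JobId, NextToken) => textract.send(new GetDocumentAnalysisCommand({ JobId, NextToken }))`. The collected pages can then be handed to TRP for parsing; check the library documentation for how your version accepts multi-response results.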
Furthermore, Amazon Textract applies quota limits on these APIs.
As a result, applications processing multi-page documents will generally need to orchestrate uploading the source file to S3, starting the analysis job, and resuming the processing flow once notified via Amazon SNS that the analysis is ready (which is much more quota-efficient than polling the `GetDocumentAnalysis` API). Particularly spiky workloads (where many documents are submitted at once) may also want to implement queuing to manage inbound request rates.
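To illustrate the queuing idea in its simplest form, here is a sketch of an in-memory limiter. Note that it caps how many submissions are in flight at once rather than enforcing a true requests-per-second rate, and `SubmissionQueue` is a hypothetical class, not something provided by TRP or the AWS SDK. A production system would more likely use a durable queue as part of the cloud infrastructure mentioned below.

```typescript
// Hypothetical in-memory limiter: runs at most `maxConcurrent` tasks at a
// time and parks the rest in FIFO order until a slot frees up.
class SubmissionQueue {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    if (!Number.isInteger(maxConcurrent) || maxConcurrent < 1) {
      throw new RangeError("maxConcurrent must be a positive integer");
    }
    this.maxConcurrent = maxConcurrent;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // All slots busy: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // Slot passes directly to the next waiter; `active` is unchanged.
      } else {
        this.active--;
      }
    }
  }
}
```

For example, `queue.run(() => startJobFor(doc))` could wrap each job submission, where `startJobFor` stands in for your own function that calls StartDocumentAnalysis.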
A full end-to-end solution for this involves deploying cloud infrastructure like AWS Lambda functions and Amazon SNS topics, so it is outside the scope of these TRP samples. Instead, refer to:
- Amazon Textract IDP CDK Constructs for composable, deployable solution components written in AWS CDK.
- Amazon Textract Textractor, which mainly provides Python bindings, but also a handy CLI for processing a batch of documents for a quick PoC.
- Other code samples listed in the Amazon Textract Developer Guide.