Tabulate

🚀 Extract custom information from unstructured documents with Generative AI

🔥 Overview

Tabulate is a CDK stack solution with the following features:

extract well-defined entities (e.g., name), numeric scores (e.g., sentiment) and free-form content (e.g., summary)
describe the list of attributes to be extracted from your docs without costly data annotation or model training
use the Python API or the web app UI to analyze PDF, Office, or image documents

Click here to see a 1-minute demo recording.

Refer to the demo notebook for the implementation and usage examples.

Note: do not use the name "Tabulate" when presenting the solution in external customer engagements.

Example API call

docs = ['doc1', 'doc2']

features = [
    {"name": "delay", "description": "delay of the shipment in days"},
    {"name": "shipment_id", "description": "unique shipment identifier"},
    {"name": "summary", "description": "one-sentence summary of the text"},
]

run_tabulate_api(
    documents=docs,
    features=features,
)
# [{'delay': 2, 'shipment_id': '123890', 'summary': 'summary1'},
# {'delay': 3, 'shipment_id': '678623', 'summary': 'summary2'}]
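
Because the call returns one dictionary per document, keyed by feature name, the result maps directly onto a table. The following is a minimal sketch, assuming the return value has exactly the shape shown in the comment above. `results_to_csv` is a hypothetical helper written for illustration and is not part of the solution.

```python
import csv
import io


def results_to_csv(results, feature_names):
    """Render a list of per-document dicts as CSV text, one row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=feature_names, lineterminator="\n")
    writer.writeheader()
    for row in results:
        # Features missing from a row are written as empty cells.
        writer.writerow({name: row.get(name, "") for name in feature_names})
    return buffer.getvalue()


# Sample output copied from the example above (not produced by a live call).
results = [
    {"delay": 2, "shipment_id": "123890", "summary": "summary1"},
    {"delay": 3, "shipment_id": "678623", "summary": "summary2"},
]
table = results_to_csv(results, ["delay", "shipment_id", "summary"])
print(table)
```

Passing the feature names explicitly keeps the column order the same as in the features list, regardless of the key order in each result.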

Example Web UI

🔧 Deploy the App

Prerequisites

Make sure you have the following tools and languages installed and have access to the target AWS account:

AWS CLI
AWS Account and User: we suggest configuring an AWS account with a profile, using the command: aws configure --profile [profile-name]
Node.js
IDE for your programming language
AWS CDK Toolkit
Python

Clone the Repo

Clone the repo to a location of your choice:

git clone [email protected]:genaiic-reusable-assets/demo-artifacts/tabulate.git

Activate Environment

Navigate to the project folder and run the following commands to create a virtualenv (on macOS and Linux) and install the dependencies:

python3 -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install

Configure the Stack

Open the config.yml file and edit it to specify your project name and the modules you would like to deploy (e.g., whether to deploy a web app):

stack_name: tabulate   # Name of your demo, will be used as stack name and prefix for resources

...

streamlit:
  deploy_streamlit: True

CDK Bootstrap & Deploy

Bootstrap CDK in your account, ideally with the profile name you used in the aws configure step. If you configure several profiles, you can bootstrap and deploy the framework to different accounts.

cdk bootstrap --profile [PROFILE_NAME]

Make sure the Docker daemon is running if you deploy the Streamlit frontend (on macOS, opening Docker Desktop is enough).

You can now deploy the framework stack:

cdk deploy --profile [PROFILE_NAME]
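
If you deploy to several accounts, the two commands above can be scripted. This is a rough sketch under the assumption that `cdk` is on your PATH and that the profile already exists. `cdk_commands` and `run_cdk` are hypothetical helper names, and the script only prints the commands unless `dry_run=False` is passed.

```python
import shlex
import subprocess


def cdk_commands(profile):
    """Return the bootstrap and deploy commands from this section for one profile."""
    return [
        ["cdk", "bootstrap", "--profile", profile],
        ["cdk", "deploy", "--profile", profile],
    ]


def run_cdk(profile, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in cdk_commands(profile):
        print(shlex.join(command))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(command, check=True)


# "my-profile" is a placeholder profile name.
run_cdk("my-profile")
```

Run the script from the project folder so that the CDK Toolkit picks up cdk.json.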

💻 Use the App

Option 1: Run API with Python

Follow the steps in the demo notebook to run a job via an API call. You will need to:

provide input document text(s)
provide a list of features to be extracted
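
Before sending a job, it can help to check that the feature list is well formed. The sketch below assumes only what the example API call shows, namely that each feature is a dictionary with a "name" and a "description". `validate_features` is a hypothetical helper, and the checks the API itself performs may differ.

```python
def validate_features(features):
    """Raise ValueError if a feature lacks a name/description or a name repeats."""
    seen = set()
    for index, feature in enumerate(features):
        for key in ("name", "description"):
            value = feature.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"feature {index} needs a non-empty '{key}'")
        if feature["name"] in seen:
            raise ValueError(f"duplicate feature name: {feature['name']}")
        seen.add(feature["name"])
    return features


features = validate_features([
    {"name": "delay", "description": "delay of the shipment in days"},
    {"name": "shipment_id", "description": "unique shipment identifier"},
])
```

Rejecting duplicate names up front avoids two features competing for the same key in the result dictionaries.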

Option 2: Run web app

Add Cognito Users

Open the Cognito console, choose the created user pool, and click "Create user"
Provide a user name and either a temporary password or an email address to receive an auto-generated password
Users can then log in to the frontend with their Cognito credentials
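
As an alternative to the console steps above, users could be created from Python. This is only a sketch: it assumes boto3 is installed and that the user pool accepts an email attribute, and the pool ID, user name, and helper names are placeholders rather than part of the solution. The request is built in a separate function so it can be inspected without AWS access.

```python
def build_create_user_request(user_pool_id, username, email=None, temporary_password=None):
    """Build keyword arguments for Cognito's AdminCreateUser call."""
    if email is None and temporary_password is None:
        raise ValueError("provide an email address or a temporary password")
    request = {"UserPoolId": user_pool_id, "Username": username}
    if temporary_password is not None:
        request["TemporaryPassword"] = temporary_password
    if email is not None:
        request["UserAttributes"] = [{"Name": "email", "Value": email}]
        # Ask Cognito to deliver the invitation by email.
        request["DesiredDeliveryMediums"] = ["EMAIL"]
    return request


def create_user(profile_name, **kwargs):
    """Send the request with boto3 (imported lazily; requires AWS credentials)."""
    import boto3

    session = boto3.Session(profile_name=profile_name)
    client = session.client("cognito-idp")
    return client.admin_create_user(**build_create_user_request(**kwargs))


# Placeholder values; nothing is sent to AWS here.
request = build_create_user_request("example-pool-id", "alice", email="alice@example.com")
print(request)
```

When no temporary password is supplied, Cognito generates one and includes it in the invitation message.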

Access the Frontend

The URL to access the frontend appears as an output at the end of the CDK deployment, under "CloudfrontDistributionName"

or

Open the AWS console, and go to CloudFront
Copy the Domain name of the created distribution
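
The same value can also be read from the stack outputs programmatically. A minimal sketch, assuming boto3 is installed and that the stack name matches stack_name in config.yml. The output key is matched by substring because the deployed key may carry a generated suffix. Both function names are hypothetical, and the sample response below is made up to illustrate the shape returned by CloudFormation's DescribeStacks.

```python
def find_stack_output(describe_stacks_response, key_fragment):
    """Return the first output value whose key contains key_fragment, else None."""
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if key_fragment in output["OutputKey"]:
                return output["OutputValue"]
    return None


def frontend_url(stack_name, profile_name):
    """Query CloudFormation with boto3 (imported lazily; requires AWS credentials)."""
    import boto3

    session = boto3.Session(profile_name=profile_name)
    response = session.client("cloudformation").describe_stacks(StackName=stack_name)
    return find_stack_output(response, "CloudfrontDistributionName")


# Made-up response used only to show the lookup; no AWS call is made here.
sample_response = {
    "Stacks": [
        {
            "StackName": "tabulate",
            "Outputs": [
                {"OutputKey": "CloudfrontDistributionName", "OutputValue": "d111111abcdef8.cloudfront.net"},
            ],
        }
    ]
}
print(find_stack_output(sample_response, "CloudfrontDistributionName"))
```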

👥 Team

Core team:

Nikita Kozodoi (Owner & Maintainer)
Nuno Castro (Science Manager)

Contributors:

Romain Besombes
Zainab Afolabi
Ivan Sosnovik
Huong Vu

Acknowledgements:
