Loading SOOP XBT NRT data into Parquet

This repo holds code that reads BUFR-formatted CSV files and loads them into a reasonably well-partitioned Parquet dataset.

The Dockerfile packages the required code and is used to build a Lambda function, which is triggered by an SNS notification that is in turn fired by an S3 object-creation event.
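Because the S3 event arrives wrapped in an SNS message, the handler has to unwrap two layers before it can find the uploaded object. The sketch below shows one way to do that using only the standard library. The function names, the bucket name, and the sample event are hypothetical, and the real handler in `app.py` may be structured differently.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Yield (bucket, key) pairs from an SNS-wrapped S3 event."""
    for sns_record in event.get("Records", []):
        # The SNS message body is the S3 event serialised as a JSON string.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys in S3 events are URL-encoded.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key


def handler(event, context):
    """Hypothetical Lambda entry point: list the objects to process."""
    objects = list(extract_s3_objects(event))
    # The CSV-to-Parquet conversion would be invoked here for each object.
    return {"objects": [{"bucket": b, "key": k} for b, k in objects]}


# Hypothetical event for local experimentation (bucket name is made up).
inner_s3_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/IOSS01_AMMC_20200901134700_D5LR9.csv"},
            }
        }
    ]
}
sample_event = {"Records": [{"Sns": {"Message": json.dumps(inner_s3_event)}}]}

result = handler(sample_event, None)
print(result)
```

Keeping the unwrapping in its own generator makes it easy to exercise with the `sns-event.json` and `s3-event.json` fixtures without touching S3.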

Unfortunately, the Lambda was deployed manually, so there is no infrastructure code in this repo.

There are also some notebooks for exploring Parquet creation and consumption.

Tests exist, but the important one doesn't pass. The likely cause is that moto can't mock S3 in the way aiobotocore expects.

