Marker Docker

This is a CUDA-enabled Docker wrapper around Marker for converting image-based documents to Markdown, complete with a cross-platform Python client. I originally enhanced it for my homelab to convert thousands of scanned documents and images into Markdown.

What's New(ish)?

CUDA Support: Tuned specifically for NVIDIA GPUs
Image Handling: Added support for images (JPG, PNG, TIFF) by automatically converting them to PDFs
Python Client: A handy client script that can process thousands of files—tested on Mac, Windows, and Linux
Debug Mode: Optional debugging that saves all requests and responses for troubleshooting
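
To make the image handling step concrete, here is a minimal sketch of how an image could be wrapped in a PDF before conversion. It is not the code used in this repo: it assumes Pillow is installed, the helper names `is_image` and `image_to_pdf` are hypothetical, and the `.jpeg` and `.tif` spellings are added as an assumption on top of the formats listed above.

```python
from pathlib import Path

# JPG, PNG and TIFF come from the feature list; ".jpeg" and ".tif" are
# assumed spelling variants. Matching is case-insensitive.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def is_image(path: Path) -> bool:
    """Return True if the file extension marks it as an image to wrap in a PDF."""
    return path.suffix.lower() in IMAGE_EXTENSIONS


def image_to_pdf(src: Path, dst: Path) -> Path:
    """Write a single-page PDF containing the image at `src` (hypothetical helper)."""
    # Imported lazily so the extension check works without Pillow installed.
    from PIL import Image

    with Image.open(src) as img:
        # Pillow cannot write images with an alpha channel to PDF, so convert
        # to RGB first. Only the first frame of a multi-page TIFF is written.
        img.convert("RGB").save(dst, format="PDF")
    return dst
```

For example, `image_to_pdf(Path("scan.png"), Path("scan.pdf"))` would produce a one-page PDF next to the original scan.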

⚠️ Important Security Note

This service has no authentication and is meant for use in a secured homelab environment. Do not expose it to the internet without adding proper security measures!

Quick Start

Using Docker Compose (recommended):

# Create debug logs directory (if using debug mode)
mkdir debug_logs

# Build and run with compose
docker compose up --build

Or build and run manually with CUDA:

docker build . -t markerwrapper:cuda
docker run --gpus all \
  -e MARKER_ROOT_PATH=/cornvert \
  -e MARKER_HOST=0.0.0.0 \
  -e MARKER_PORT=8001 \
  -p 8001:8001 markerwrapper:cuda
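
Before pointing the client at a freshly started container, it can help to check that the published port accepts connections. The sketch below uses only the standard library; the function name `wait_for_port` and its defaults are hypothetical. An open TCP port only shows that something is listening, not that the server has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False
```

With the port mapping shown above, `wait_for_port("localhost", 8001)` returns True once the container accepts connections.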

Use the client (from any OS):

pip install requests

python marker_client.py -o /path/to/output -u http://your.server:port/cornvert/ /path/to/scan/files
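
For readers adapting the client to their own bulk jobs, the sketch below shows one way to collect input files and map each one to a Markdown output path. It is not the logic of marker_client.py: the function names, the recursive scan, the mirrored output tree, and the skip-if-exists resume behavior are all assumptions made for illustration.

```python
from pathlib import Path

# PDF plus the image formats the README lists; ".jpeg" and ".tif" are
# assumed spelling variants.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def find_inputs(root: Path) -> list[Path]:
    """Recursively collect convertible files under `root`, sorted for stable runs."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def output_path(src: Path, input_root: Path, output_root: Path) -> Path:
    """Mirror the input tree under `output_root`, swapping the suffix for .md."""
    return (output_root / src.relative_to(input_root)).with_suffix(".md")


def pending(input_root: Path, output_root: Path) -> list[Path]:
    """Return inputs whose Markdown output does not exist yet, so reruns resume."""
    return [
        src for src in find_inputs(input_root)
        if not output_path(src, input_root, output_root).exists()
    ]
```

One caveat of this layout: two inputs that differ only by extension (say `a.pdf` and `a.png`) map to the same `a.md`, so a real client would need a rule for that case.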

Debug Mode

To enable debug mode, which saves all requests and responses:

Create a debug directory:

mkdir debug_logs

Then use the provided compose_sample.yaml file, which already includes the debug mode settings.

OR

If running manually, mount the debug directory and pass the --debug flag:

docker run --gpus all \
  -e MARKER_ROOT_PATH=/cornvert \
  -e MARKER_HOST=0.0.0.0 \
  -e MARKER_PORT=8001 \
  -v ./debug_logs:/usr/src/app/marker/debug_logs \
  -p 8001:8001 markerwrapper:cuda \
  /usr/src/app/venv/bin/python /usr/src/app/marker/marker_server.py \
  --port 8001 --host 0.0.0.0 --root-path /cornvert --debug

Debug logs will be saved in timestamped folders under ./debug_logs/.
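
When troubleshooting, the most recent debug folder is usually the one of interest. The sketch below picks it by modification time, since the exact timestamp format of the folder names is not documented here; the function name `latest_debug_dir` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_debug_dir(root: Path = Path("debug_logs")) -> Optional[Path]:
    """Return the most recently modified subfolder of `root`, or None if there is none."""
    if not root.is_dir():
        return None
    folders = [p for p in root.iterdir() if p.is_dir()]
    # Compare by modification time rather than parsing the folder names.
    return max(folders, key=lambda p: p.stat().st_mtime, default=None)
```

Calling `latest_debug_dir()` from the directory that holds debug_logs returns the newest folder, whose contents can then be listed with `sorted(folder.iterdir())`.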

Known Issues

The way the Dockerfile overwrites marker_server.py is lame; I need a better process.
Planning to add Microsoft's MarkItDown for Office documents (.doc, .xls, etc.), but that project seems unstable; currently, I can't get it to convert anything.
External LLM support for advanced image analysis isn't implemented yet, even though Marker itself does provide some support.

Background

I created this for my homelab to convert thousands of scanned documents from various sources into markdown. The client has been battle-tested across Windows, Mac, and Linux. Feel free to adapt it for your own bulk-conversion projects!
