KylinMountain/markify

Markify

Markify converts files into Markdown so that RAG pipelines and LLMs can process them more easily. It is built on markitdown and MinerU, which provides a high-quality PDF parser. It currently supports two PDF modes: a simple mode that uses pdfminer and is fast, and an advanced mode that uses MinerU with models to parse the PDF and is slow.

API

FastAPI's built-in API documentation is available at http://127.0.0.1:20926/docs

Upload a file and create a job

Request

curl -X 'POST' \
  'http://127.0.0.1:20926/api/jobs' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@CoA.pdf;type=application/pdf' \
  -F 'pdf_mode=advanced'

Response

{
  "job_id": "29bbad6b-c167-41f0-8a29-99551c499263"
}
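
The upload request above can also be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the project's own client: the helper names `encode_multipart` and `create_job` are hypothetical, the upload field name `file` is an assumption, and the base URL is the local address used in the examples here.

```python
import json
import os
import uuid
import urllib.request

BASE_URL = "http://127.0.0.1:20926"  # local address used in this README


def encode_multipart(fields, file_field, filename, content, content_type):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
    parts.append(file_head.encode("utf-8") + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def create_job(pdf_path, pdf_mode="advanced", base_url=BASE_URL):
    """POST a PDF to /api/jobs and return the job_id from the JSON reply."""
    with open(pdf_path, "rb") as fh:
        content = fh.read()
    body, header = encode_multipart(
        {"pdf_mode": pdf_mode},
        "file",  # assumed name of the upload field
        os.path.basename(pdf_path),
        content,
        "application/pdf",
    )
    request = urllib.request.Request(
        f"{base_url}/api/jobs",
        data=body,
        headers={"Content-Type": header, "accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]
```

With a server running locally, `create_job("CoA.pdf")` would return the `job_id` string from the response.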

Query the job status

Request

curl -X 'GET' \
  'http://127.0.0.1:20926/api/jobs/29bbad6b-c167-41f0-8a29-99551c499263' \
  -H 'accept: application/json'

Response

{
  "job_id": "29bbad6b-c167-41f0-8a29-99551c499263",
  "status": "completed",
  "filename": "CoA.pdf",
  "params": {
    "pdf_mode": "advanced"
  },
  "error": null
}
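
Because parsing runs as a background job, a client usually has to poll this endpoint until the job finishes. The following is a minimal polling sketch, assuming that a job is done when `status` equals `"completed"` and that a non-null `error` means it failed. Only `"completed"` appears in this README, so any other status value is treated as "still running". The function names, interval, and timeout are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://127.0.0.1:20926"  # local address used in this README


def fetch_status(job_id, base_url=BASE_URL):
    """GET /api/jobs/{job_id} and return the decoded JSON object."""
    request = urllib.request.Request(
        f"{base_url}/api/jobs/{job_id}",
        headers={"accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(job_id, fetch=fetch_status, interval=2.0, timeout=600.0):
    """Poll until the job is completed, reports an error, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        # Assumption: a non-null "error" field signals a failed job.
        if job.get("error") is not None:
            raise RuntimeError(f"job {job_id} failed: {job['error']}")
        if job.get("status") == "completed":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout}s")
        time.sleep(interval)
```

The `fetch` parameter is injectable so the loop can be exercised without a running server. Since the advanced mode is slow, a generous timeout is probably appropriate.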

Download the Markdown file

Request

curl -X 'GET' \
  'http://127.0.0.1:20926/api/jobs/29bbad6b-c167-41f0-8a29-99551c499263/result' \
  -H 'accept: application/json'

Response: the file
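
Saving the returned file from Python might look like the sketch below. It assumes the endpoint returns the file content directly in the response body and writes those bytes unchanged. The helper names `result_url` and `download_result`, the `opener` parameter, and the output path are all hypothetical.

```python
import urllib.request
from pathlib import Path

BASE_URL = "http://127.0.0.1:20926"  # local address used in this README


def result_url(job_id, base_url=BASE_URL):
    """Build the result endpoint URL for a job."""
    return f"{base_url}/api/jobs/{job_id}/result"


def download_result(job_id, output_path, base_url=BASE_URL,
                    opener=urllib.request.urlopen):
    """Fetch the result bytes and write them to output_path."""
    with opener(result_url(job_id, base_url)) as response:
        data = response.read()
    path = Path(output_path)
    path.write_bytes(data)  # written as-is, no decoding assumed
    return path
```

For example, `download_result("29bbad6b-c167-41f0-8a29-99551c499263", "CoA.md")` would write the response body to `CoA.md` once the job has completed.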

TODO

  • Rewrite the image paths in MinerU output to local addresses
  • Add a cloud parsing mode
  • Add a simple web page
