Releases: NoEdgeAI/pdfdeal

V0.2.4

13 Aug 12:11
9e29ee2

✨ Feature changes

  • New MD Document Auto Split tool
  • New MD document image upload tool
  • New built-in upload tool: AliCloud OSS
  • The CLI tool now keeps the source file name (instead of a UUID name).
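
As an illustration of what an automatic Markdown split can look like, here is a minimal sketch that starts a new chunk at every heading up to a chosen level. The helper name `split_markdown_by_heading` and its `max_level` parameter are hypothetical and are not pdfdeal's actual API; the real tool may split on different rules.

```python
import re
from typing import List


def split_markdown_by_heading(text: str, max_level: int = 2) -> List[str]:
    """Hypothetical helper: split Markdown at ATX headings of level <= max_level."""
    heading = re.compile(r"^#{1,%d}\s" % max_level)
    fence = "`" * 3  # marker that opens or closes a fenced code block
    chunks: List[str] = []
    current: List[str] = []
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(fence):
            in_fence = not in_fence
        # Lines inside a code fence are never treated as headings.
        if not in_fence and heading.match(line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]
```

Deeper headings (for example `###` when `max_level=2`) stay inside their parent chunk, which keeps each chunk readable on its own.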

🔧 BUG Fixes

  • Fixed an issue where an error message was not displayed when a status request failed

V0.2.3

31 Jul 09:13

🔧 BUG Fixes

  • Fixed an issue where pdfdeal did not work in Jupyter Notebook
  • Fixed the rate limiter not working in the pdfdeal function.

V0.2.2

16 Jul 07:31
14edfc4

✨ Feature Changes

  • The doc2x CLI now supports automatic decompression of downloaded zip files
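
For readers who want the same behaviour in their own scripts, the following is a minimal sketch of unpacking a downloaded archive into a folder next to it, using only the standard library. The function name `unzip_next_to_archive` and the `remove_archive` flag are hypothetical; this is not the doc2x CLI's implementation.

```python
import zipfile
from pathlib import Path


def unzip_next_to_archive(zip_path: str, remove_archive: bool = False) -> Path:
    """Hypothetical helper: extract 'out/result.zip' into 'out/result/'."""
    archive = Path(zip_path)
    target = archive.with_suffix("")  # drop the ".zip" suffix
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    if remove_archive:
        archive.unlink()  # optionally delete the archive after extraction
    return target
```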

🔧 BUG Fixes

  • The doc2x CLI did not save the key locally in some cases.
  • The function that replaces image links in Markdown files with links to local files saved images in the wrong format (jpg images were saved as png).
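
One way to avoid saving a jpg under a png name is to pick the extension from the file's leading bytes rather than assuming a format. The sketch below shows that idea; the names `guess_image_extension` and `save_image` are hypothetical, and this is an assumption about a possible approach, not a description of how pdfdeal fixed the bug.

```python
from pathlib import Path

# Well-known file signatures (magic bytes) for common image formats.
_SIGNATURES = (
    (b"\xff\xd8\xff", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
)


def guess_image_extension(data: bytes, default: str = ".png") -> str:
    """Hypothetical helper: choose a file extension from the image's first bytes."""
    for magic, ext in _SIGNATURES:
        if data.startswith(magic):
            return ext
    return default  # unknown signature: fall back to the caller's default


def save_image(data: bytes, folder: str, stem: str) -> Path:
    """Hypothetical helper: write the image using the detected extension."""
    path = Path(folder) / (stem + guess_image_extension(data))
    path.write_bytes(data)
    return path
```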

V0.2.1

15 Jul 12:50
b2c0028

✨ Feature Changes

  • Adapted to the new doc2x rate limiting rules, which changed from requests per minute (RPM) to a limit on simultaneous task requests.
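
The difference between the two rules is that an RPM limit spaces requests out over time, while a simultaneous-task limit caps how many tasks are in flight at once. A minimal sketch of the latter with `asyncio.Semaphore` follows; `run_with_task_limit`, `worker`, and the default limit of 3 are illustrative assumptions, not pdfdeal's internals or the real doc2x quota.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_with_task_limit(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrent: int = 3,  # assumed value, not the actual doc2x limit
) -> List[R]:
    """Hypothetical helper: run worker(item) for all items, at most max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: T) -> R:
        # A task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await worker(item)

    # gather preserves the input order of the results.
    return list(await asyncio.gather(*(guarded(item) for item in items)))
```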

🔧 BUG Fixes

  • The doc2x CLI did not save error logs and only printed them in the terminal.

V0.2.0

14 Jul 04:48
d7616cd

Caution

This version has major interface updates (impact range: all)

  • Function return values have changed; please check the update details to see how to migrate

✨ Feature Changes

  • Added the doc2x CLI program for quickly batch processing PDF or image files with doc2x; please see here for usage
  • Added adaptation of the CLI commands to graphrag; please see here for usage
  • Updated the Doc2X document translation functions; see here for usage
  • Enhanced exception handling
  • Function return values have changed and now include more detailed content
  • Decoupled the individual parts of the processing pipeline

🔧 BUG Fixes

  • [Doc2X] When using the personal API, multiple corrupted input files could cause an infinite loop
  • [FileTool] The get_files function could not accept pdf as an output format

🚀 Others

  • Documentation moved to a separate repository, pdfdeal-docs
  • Updated unit tests

V0.1.6

07 Jul 07:10
6695724

✅ No interface changes

✨ New Features

  • Added a new function, get_files, to quickly collect all the files in a folder while keeping the file structure consistent before and after processing. See example
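
To make "keeping the file structure consistent" concrete, the sketch below pairs every input file with an output path that mirrors its position in the folder tree. It uses a hypothetical helper, `map_files`, built on `pathlib`; it does not reproduce the signature or return value of pdfdeal's `get_files`, so check the project documentation for the real interface.

```python
from pathlib import Path
from typing import List, Tuple


def map_files(
    src_dir: str, dst_dir: str, in_ext: str, out_ext: str
) -> List[Tuple[Path, Path]]:
    """Hypothetical helper: pair each input file with a mirrored output path."""
    src_root, dst_root = Path(src_dir), Path(dst_dir)
    pairs: List[Tuple[Path, Path]] = []
    for src in sorted(src_root.rglob("*" + in_ext)):
        if src.is_file():
            # Keep the path relative to the source root so subfolders are preserved.
            rel = src.relative_to(src_root)
            pairs.append((src, (dst_root / rel).with_suffix(out_ext)))
    return pairs
```

For example, `map_files("in", "out", ".pdf", ".md")` would pair `in/sub/b.pdf` with `out/sub/b.md`.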

🐛 Bug Fixes

  • The Doc2X API did not return an obvious error when uploading files over 100MB (the API limit).
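
A client-side check before uploading is one way to surface this limit clearly. The sketch below is an assumption about how such a check could look, not pdfdeal's code: `check_upload_size` is a hypothetical name, and treating "100MB" as 100 * 1024 * 1024 bytes is a guess, since the notes do not say which unit the API uses.

```python
import os

# Assumption: "100MB" is interpreted as 100 MiB; the API may count differently.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload_size(path: str, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Hypothetical helper: raise a clear error if the file exceeds the upload limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, which exceeds the {limit}-byte upload limit"
        )
    return size
```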

V0.1.5

05 Jul 13:19
faa63e8

✅ No interface changes

🐛 Bug Fixes

  • Fixed an issue where files could not be moved in rare cases:
    With a custom output folder on a different disk, os.rename raised an error because the system cannot move files to other disks
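
The underlying behaviour is standard Python: `os.rename` can fail with `OSError` when the source and destination are on different filesystems or drives, whereas `shutil.move` uses a rename when possible and otherwise copies the file and removes the original. A minimal sketch of a cross-disk-safe move is shown below; `move_to_output` is a hypothetical helper, and the release notes do not state which approach pdfdeal actually adopted.

```python
import shutil
from pathlib import Path


def move_to_output(src: str, output_dir: str) -> Path:
    """Hypothetical helper: move a file into output_dir, even across disks."""
    dest_dir = Path(output_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(src).name
    # shutil.move renames when it can and falls back to copy + delete otherwise.
    return Path(shutil.move(src, str(dest)))
```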

🚀 Other

  • Modularized the PDF file/OCR recognition engine

V0.1.4

04 Jul 12:22

✅ No interface changes

🚀 Other

  • Updated docstrings to follow the Google Style Guide

V0.1.3

03 Jul 15:57
57fe3bd

✅ No interface changes

✨ New Features

  • New feature: replace all remote images in Markdown files with local ones.
  • Refactored the pdfdeal function; it now supports batch input of files.

🐛 Bug Fixes

  • Reformatted the output of the native OCR file processing functions.
  • pdfdeal could not output md files under some circumstances.
  • Removed the Doc2x code used in version 0.0.x.

🚀 Other

  • Documentation will be refactored for the next release

V0.1.2

27 Jun 11:36

✅ No interface changes, so upgrades are seamless

✨ New Features

  • Refactored the RPM limiter to enhance batch file processing stability.
  • New unit tests for handling a large number of files; all unit tests are now run automatically by GitHub Actions.
  • Backward compatible with Python 3.8.

🐛 Bug Fixes

  • Improved the stability of batch file processing
  • Removed unnecessary parameters