GOT-OCR-Inference

Research on accelerating deployment of the GOT-OCR project, with support for multiple languages on CPU.

Links 1:

Releases:

Instructions to install from the CLI:

  1. Download and extract the Base SDK package.
  2. Download the Update package and extract it over the base package, overwriting the existing files.
  3. Double-click 启动.bat ("Start") to start the application.

Run:
pip install llama-cpp-python

How to use the repo:

The tensor files are image embeddings in PyTorch format. To run inference:

python main.py

To convert to GGUF yourself (any quantization type), update the model path and other arguments, then run:

python convert_hf_to_gguf.py

Code Usage: The following prompt is a basic demonstration for testing whether the model embedding works properly:

<|im_start|>system
You should follow the instructions carefully and explain your answers in detail.<|im_end|>
<|im_start|>user
<img></img>
OCR: <|im_end|><|im_start|>assistant
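The prompt above can also be assembled programmatically. The following is a minimal sketch, not code from this repository: `build_ocr_prompt` and its parameters are hypothetical names, and the comment about `<img></img>` marking where the image embeddings are spliced in is an assumption based on the prompt layout.

```python
# Hypothetical helper (not part of this repo): rebuilds the test prompt shown above.
SYSTEM_MESSAGE = (
    "You should follow the instructions carefully and explain your answers in detail."
)


def build_ocr_prompt(system_message: str = SYSTEM_MESSAGE, task: str = "OCR: ") -> str:
    """Return the chat-formatted prompt as a single string.

    Assumption: the <img></img> pair marks the position where the
    image embeddings are inserted at inference time.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n<img></img>\n{task}<|im_end|>"
        f"<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_ocr_prompt())
```

Keeping the template in one function makes it easier to compare the exact special tokens against the model's expected format when debugging embedding issues.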

Notes: GOT-OCR2.0 deployment acceleration research was conducted using llama-cpp-python. The source code and documentation for llama-cpp-python and llama were referenced to implement possible inference solutions. No official documentation exists for embedding custom vectors, so this implementation is based on what could be learned from those sources.

Model Quantization:

The quantized version of the model is provided here, but it is not guaranteed to be completely correct since it’s based on the official model's quantization. Some layers of the GOT model may have been included in the quantization by mistake.

Quantized model weights: Download here: Release-exe (GOT weights) (Code: 3zop)

If you want to perform the quantization yourself, refer to the modified convert_hf_to_gguf.py script. Make sure to update the config.json file as follows:

"architectures": [
  "GOTQwenForCausalLM"
]

Change to:

"architectures": [
  "Qwen2ForCausalLM"
]

This change is necessary to avoid errors when the quantization script attempts to locate the model architecture type.
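The config.json edit described above can be scripted rather than done by hand. This is a minimal sketch using only the Python standard library; `patch_architecture` is a hypothetical helper name and the config path is something you supply, neither comes from this repository.

```python
import json
from pathlib import Path


def patch_architecture(config_path, old="GOTQwenForCausalLM", new="Qwen2ForCausalLM"):
    """Replace `old` with `new` in the "architectures" list of a config.json.

    Hypothetical helper: rewrites the file in place and returns the new list.
    All other keys in the config are left unchanged.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    architectures = config.get("architectures", [])
    config["architectures"] = [new if name == old else name for name in architectures]
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return config["architectures"]
```

For example, `patch_architecture("path/to/model/config.json")` would be called once before running the conversion script; keep a backup of the original file if you also need to load the model with its original architecture name.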

Mention

This fork is based on the original GOT-OCR-Inference repository. Kudos to its author for the GGUF implementation.