Support Chinese version of StructEqTable

UniModal4Reasoning · Jul 26, 2024 · 668f4ab
1 parent 4ed4b38
commit 668f4ab
Showing 2 changed files with 9 additions and 8 deletions.
diff --git a/README.md b/README.md
@@ -19,7 +19,7 @@ Table is an effective way to represent structured data in scientific publication
 ## TODO
 
 - [x] Release inference code and checkpoints of StructEqTable.
-- [ ] Support Chinese version of StructEqTable.
+- [x] Support Chinese version of StructEqTable.
 - [ ] Improve the inference speed of StructEqTable.
 
 
@@ -34,13 +34,13 @@ pip install "git+https://github.com/UniModal4Reasoning/StructEqTable-Deploy.git"
 
 ```
 
-## Demo
-- run the demo.py
+## Quick Demo
+- run the demo/demo.py
 ```shell script
 cd demo
-python demo.py \
---image_path demo/demo.png \
---ckpt_path ${CKPT_PATH}
+
+python demo.py --image_path ./demo.png \
+  --ckpt_path ${CKPT_PATH}
 ```
 
 - Visualization Results
@@ -52,9 +52,10 @@ python demo.py \
 
 
 ## Acknowledgements
-- [UniMERNet](https://github.com/opendatalab/UniMERNet). A Universal Network for Real-World Mathematical Expression Recognition.
 - [DocGenome](https://github.com/UniModal4Reasoning/DocGenome). An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models.
 - [ChartVLM](https://github.com/UniModal4Reasoning/ChartVLM). A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.
+- [Pix2Struct](https://github.com/google-research/pix2struct). Screenshot Parsing as Pretraining for Visual Language Understanding.
+- [UniMERNet](https://github.com/opendatalab/UniMERNet). A Universal Network for Real-World Mathematical Expression Recognition.
 - [Donut](https://huggingface.co/naver-clova-ix/donut-base). The UniMERNet's Transformer Encoder-Decoder are referenced from Donut.
 - [Nougat](https://github.com/facebookresearch/nougat). The tokenizer uses Nougat.
 

diff --git a/demo/demo.py b/demo/demo.py
@@ -9,7 +9,7 @@
 def parse_config():
     parser = argparse.ArgumentParser(description='arg parser')
     parser.add_argument('--image_path', type=str, default='demo.png', help='data path for table image')
-    parser.add_argument('--ckpt_path', type=str, default='', help='ckpt path for table model')
+    parser.add_argument('--ckpt_path', type=str, default='U4R/StructTable-base', help='ckpt path for table model, which can be downloaded from huggingface')
     parser.add_argument('--cpu', action='store_true', default=False, help='using cpu for inference')
     parser.add_argument('--html', action='store_true', default=False, help='output html format table code')
     args = parser.parse_args()
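
Note on the `--ckpt_path` change: the following is a minimal sketch of how the new default behaves, not the repository's actual `demo.py`. It copies the four arguments shown in the hunk above; the optional `argv` parameter and the `/tmp/ckpt` path are assumptions added here so that the parser can be exercised without a real command line or checkpoint.

```python
import argparse


def parse_config(argv=None):
    # Same four arguments as in the diff; `argv` is a hypothetical addition
    # for illustration. Passing None falls back to sys.argv as usual.
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--image_path', type=str, default='demo.png',
                        help='data path for table image')
    parser.add_argument('--ckpt_path', type=str, default='U4R/StructTable-base',
                        help='ckpt path for table model, which can be downloaded from huggingface')
    parser.add_argument('--cpu', action='store_true', default=False,
                        help='using cpu for inference')
    parser.add_argument('--html', action='store_true', default=False,
                        help='output html format table code')
    return parser.parse_args(argv)


# No flags: the checkpoint now defaults to the hub identifier instead of ''.
defaults = parse_config([])

# Explicit flags: '/tmp/ckpt' is a placeholder path, not a real checkpoint.
custom = parse_config(['--image_path', './demo.png',
                       '--ckpt_path', '/tmp/ckpt', '--cpu'])

print(defaults.ckpt_path)
print(custom.ckpt_path, custom.cpu, custom.html)
```

With the old default of `''`, running the demo without `--ckpt_path` passed an empty string on to model loading; with this change the flag can be omitted and the default identifier is used.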