Support Chinese version of StructEqTable
sky-fly97 committed Jul 26, 2024
1 parent 4ed4b38 commit 668f4ab
Showing 2 changed files with 9 additions and 8 deletions.
15 changes: 8 additions & 7 deletions README.md
@@ -19,7 +19,7 @@ Table is an effective way to represent structured data in scientific publication
## TODO

- [x] Release inference code and checkpoints of StructEqTable.
- [ ] Support Chinese version of StructEqTable.
- [x] Support Chinese version of StructEqTable.
- [ ] Improve the inference speed of StructEqTable.


@@ -34,13 +34,13 @@ pip install "git+https://github.com/UniModal4Reasoning/StructEqTable-Deploy.git"

```

## Demo
- run the demo.py
## Quick Demo
- run the demo/demo.py
```shell script
cd demo
python demo.py \
--image_path demo/demo.png \
--ckpt_path ${CKPT_PATH}

python demo.py \
--image_path ./demo.png \
--ckpt_path ${CKPT_PATH}
```

- Visualization Results
@@ -52,9 +52,10 @@ python demo.py \


## Acknowledgements
- [UniMERNet](https://github.com/opendatalab/UniMERNet). A Universal Network for Real-World Mathematical Expression Recognition.
- [DocGenome](https://github.com/UniModal4Reasoning/DocGenome). An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models.
- [ChartVLM](https://github.com/UniModal4Reasoning/ChartVLM). A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.
- [Pix2Struct](https://github.com/google-research/pix2struct). Screenshot Parsing as Pretraining for Visual Language Understanding.
- [UniMERNet](https://github.com/opendatalab/UniMERNet). A Universal Network for Real-World Mathematical Expression Recognition.
- [Donut](https://huggingface.co/naver-clova-ix/donut-base). The UniMERNet's Transformer Encoder-Decoder are referenced from Donut.
- [Nougat](https://github.com/facebookresearch/nougat). The tokenizer uses Nougat.

2 changes: 1 addition & 1 deletion demo/demo.py
@@ -9,7 +9,7 @@
def parse_config():
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--image_path', type=str, default='demo.png', help='data path for table image')
    parser.add_argument('--ckpt_path', type=str, default='', help='ckpt path for table model')
    parser.add_argument('--ckpt_path', type=str, default='U4R/StructTable-base', help='ckpt path for table model, which can be downloaded from huggingface')
    parser.add_argument('--cpu', action='store_true', default=False, help='using cpu for inference')
    parser.add_argument('--html', action='store_true', default=False, help='output html format table code')
    args = parser.parse_args()
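
The effect of the new `--ckpt_path` default can be checked with a minimal standalone sketch of the parser after this commit. The four arguments and their defaults are taken from the hunk above. The `argv` parameter and the `return` statement are assumptions added here so the function can be called without a real command line; they are not shown in the diff.

```python
import argparse


def parse_config(argv=None):
    """Sketch of the demo's argument parser after this commit.

    `argv` is a hypothetical parameter (not in the diff) that lets the
    parser be exercised without touching sys.argv.
    """
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--image_path', type=str, default='demo.png',
                        help='data path for table image')
    # New default: a checkpoint identifier instead of an empty string.
    parser.add_argument('--ckpt_path', type=str, default='U4R/StructTable-base',
                        help='ckpt path for table model, which can be downloaded from huggingface')
    parser.add_argument('--cpu', action='store_true', default=False,
                        help='using cpu for inference')
    parser.add_argument('--html', action='store_true', default=False,
                        help='output html format table code')
    # Assumed: the real function presumably returns the parsed namespace.
    return parser.parse_args(argv)


if __name__ == '__main__':
    # With no arguments, the checkpoint path falls back to the new default.
    args = parse_config([])
    print(args.ckpt_path)
```

With this default, running the demo without `--ckpt_path` no longer passes an empty string to the model loader; an explicit `--ckpt_path` still overrides it.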