# RapidLatexOCR
**Repository Path**: PolarisF/RapidLatexOCR
## Basic Information
- **Project Name**: RapidLatexOCR
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-11-12
- **Last Updated**: 2023-11-12
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
### Introduction
- `rapid_latex_ocr` is a tool that converts formula images to LaTeX.
- **The inference code in this repo is adapted from [LaTeX-OCR](https://github.com/lukas-blecher/LaTeX-OCR). The models have all been converted to ONNX format and the inference code has been simplified, so inference is faster and deployment is easier.**
- The repo only contains inference code for ONNX-format models, based on `ONNXRuntime` or `OpenVINO`; it does not contain training code. If you want to train your own model, see [LaTeX-OCR](https://github.com/lukas-blecher/LaTeX-OCR).
- If this project helps you, please give it a star ⭐ or sponsor a cup of coffee (click the link in Sponsor at the top of the page).
- Contributions are welcome and help make this tool better.
- ☆ [Model Conversion Notes](https://github.com/RapidAI/RapidLatexOCR/wiki/Model-Conversion-Notes)
### [Demo](https://swhl-rapidlatexocrdemo.hf.space)
### TODO
- [ ] Rewrite the LaTeX-OCR GUI version based on `rapid_latex_ocr`
- [x] Add a demo on Hugging Face
- [ ] Integrate other better models
- [ ] Add support for OpenVINO
### Installation
1. Install the `rapid_latex_ocr` library with pip. The model files are not bundled in the whl package because doing so would exceed the PyPI size limit (100M), so they need to be downloaded separately.
```bash
pip install rapid_latex_ocr
```
2. Download the model files ([Google Drive](https://drive.google.com/drive/folders/1e8BgLk1cPQDSZjgoLgloFYMAQWLTaroQ?usp=sharing) | [Baidu NetDisk](https://pan.baidu.com/s/1rnYmmKp2HhOkYVFehUiMNg?pwd=dh72)). When initializing, just specify the model paths; see the Usage section for details.
|model name|size|
|---:|:---:|
|`image_resizer.onnx`|37.1M|
|`encoder.onnx`|84.8M|
|`decoder.onnx`|48.5M|
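
Since the models are downloaded by hand, it can help to confirm that everything is in place before initializing. The following is a minimal sketch, not part of `rapid_latex_ocr`: the helper `find_missing_models` and the `models` directory are assumptions for illustration, and the file list combines the table above with the `tokenizer.json` used in the Usage section.

```python
from pathlib import Path

# File names from the model table, plus the tokenizer referenced in Usage.
REQUIRED_FILES = (
    "image_resizer.onnx",
    "encoder.onnx",
    "decoder.onnx",
    "tokenizer.json",
)


def find_missing_models(model_dir):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # "models" is an assumed download location; adjust it to your own layout.
    missing = find_missing_models("models")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```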
### Usage
- Use from a Python script:
```python
from rapid_latex_ocr import LatexOCR

# Paths to the downloaded model files and the tokenizer
image_resizer_path = 'models/image_resizer.onnx'
encoder_path = 'models/encoder.onnx'
decoder_path = 'models/decoder.onnx'
tokenizer_json = 'models/tokenizer.json'

model = LatexOCR(image_resizer_path=image_resizer_path,
                 encoder_path=encoder_path,
                 decoder_path=decoder_path,
                 tokenizer_json=tokenizer_json)

# Read the formula image as raw bytes
img_path = "tests/test_files/6.png"
with open(img_path, "rb") as f:
    data = f.read()

# The call returns the LaTeX string and the elapsed time
result, elapse = model(data)

print(result)
# {\frac{x^{2}}{a^{2}}}-{\frac{y^{2}}{b^{2}}}=1

print(elapse)
# 0.4131628000000003
```
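
To process several images with one loaded model, the call pattern above can be wrapped in a small loop. This is only a sketch under stated assumptions: `recognize_folder` is a hypothetical helper, not part of the library, and it relies solely on the behavior shown in the example, namely that calling the model with image bytes returns a `(result, elapse)` pair.

```python
from pathlib import Path


def recognize_folder(ocr, folder, pattern="*.png"):
    """Run `ocr` on every image in `folder` whose name matches `pattern`.

    `ocr` is any callable that takes image bytes and returns a
    (latex, elapse) pair, as `model(data)` does in the example above.
    Returns a dict mapping file name to (latex, elapse).
    """
    results = {}
    for img_path in sorted(Path(folder).glob(pattern)):
        latex, elapse = ocr(img_path.read_bytes())
        results[img_path.name] = (latex, elapse)
    return results
```

With the `model` object from the script above, this could be called as `recognize_folder(model, "tests/test_files")`. Passing the model in as a parameter keeps model loading out of the loop, so the ONNX files are read only once.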
- Use from the command line:
```bash
$ rapid_latex_ocr -h
usage: rapid_latex_ocr [-h] [-img_resizer IMAGE_RESIZER_PATH]
                       [-encdoer ENCODER_PATH] [-decoder DECODER_PATH]
                       [-tokenizer TOKENIZER_JSON]
                       img_path

positional arguments:
  img_path              Only img path of the formula.

optional arguments:
  -h, --help            show this help message and exit
  -img_resizer IMAGE_RESIZER_PATH, --image_resizer_path IMAGE_RESIZER_PATH
  -encdoer ENCODER_PATH, --encoder_path ENCODER_PATH
  -decoder DECODER_PATH, --decoder_path DECODER_PATH
  -tokenizer TOKENIZER_JSON, --tokenizer_json TOKENIZER_JSON

$ rapid_latex_ocr tests/test_files/6.png \
    --image_resizer_path models/image_resizer.onnx \
    --encoder_path models/encoder.onnx \
    --decoder_path models/decoder.onnx \
    --tokenizer_json models/tokenizer.json
# ('{\\frac{x^{2}}{a^{2}}}-{\\frac{y^{2}}{b^{2}}}=1', 0.47902780000000034)
```
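
The sample output above looks like the printed form of a Python tuple, so a calling script could turn it back into values with `ast.literal_eval`. This is a hedged sketch: `parse_cli_output` is a hypothetical helper, and it assumes the command prints exactly one `(latex, elapse)` tuple in the form shown.

```python
import ast


def parse_cli_output(line):
    """Parse a printed (latex, elapse) tuple back into Python values."""
    latex, elapse = ast.literal_eval(line.strip())
    return latex, float(elapse)


# Sample line copied from the command-line output above (without the "# ").
sample = r"('{\\frac{x^{2}}{a^{2}}}-{\\frac{y^{2}}{b^{2}}}=1', 0.47902780000000034)"
latex, elapse = parse_cli_output(sample)
print(latex)
```

`ast.literal_eval` only evaluates literals, which makes it a safer choice than `eval` for text captured from another process.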
### Changelog
- 2023-09-13 v0.0.4 update:
  - Merge [PR #5](https://github.com/RapidAI/RapidLatexOCR/pull/5)
  - Optimize code
- 2023-07-15 v0.0.1 update:
  - First release
### Contributing
- Pull requests are welcome. For major changes, please open an issue first
to discuss what you would like to change.
- Please make sure to update tests as appropriate.
### [Sponsor](https://swhl.github.io/RapidVideOCR/docs/sponsor/)
If you want to sponsor the project, click the **Buy me a coffee** image. Please include a note (e.g. your GitHub account name) so that you can be added to the sponsorship list below.
### License
This project is released under the [MIT license](./LICENSE).