# erpnext_ocr_1

**Repository Path**: webvip/erpnext_ocr_1

## Basic Information

- **Project Name**: erpnext_ocr_1
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 2
- **Created**: 2025-07-03
- **Last Updated**: 2025-07-03

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

[![License: MIT][uri_license_image]][uri_license]
[![Managed with Taiga.io](https://img.shields.io/badge/managed%20with-TAIGA.io-709f14.svg)](https://tree.taiga.io/project/monogrammbot-monogrammerpnext_ocr/ "Managed with Taiga.io")
[![Build Status](https://travis-ci.org/Monogramm/erpnext_ocr.svg)](https://travis-ci.org/Monogramm/erpnext_ocr)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/e154ec72926346d4ba4951c25d906d33)](https://www.codacy.com/gh/Monogramm/erpnext_ocr?utm_source=github.com&utm_medium=referral&utm_content=Monogramm/erpnext_ocr&utm_campaign=Badge_Grade)
[![Coverage Status](https://coveralls.io/repos/github/Monogramm/erpnext_ocr/badge.svg?branch=master)](https://coveralls.io/github/Monogramm/erpnext_ocr?branch=master)

## ERPNext OCR

> :alembic: **Experimental**

Frappe OCR application with [tesseract](https://github.com/tesseract-ocr/tesseract).

This project is a fork of [ERPNext-OCR](https://github.com/jvfiel/ERPNext-OCR) by [John Vincent Fiel](https://github.com/jvfiel). Its aim is to fix and clean up the original source code and add some new features.

Check out more on [ERPNext Discuss](https://discuss.erpnext.com/t/erpnext-ocr-app/33834/7).
## :chart_with_upwards_trend: Changes

See [CHANGELOG](./CHANGELOG.md)

## :bookmark: Roadmap

See [Taiga.io](https://tree.taiga.io/project/monogrammbot-monogrammerpnext_ocr/ "Taiga.io monogrammbot-monogrammerpnext_ocr")

## :construction: Install

### Pre-requisites: tesseract-python and imagemagick

Install tesseract-ocr, plus imagemagick and ghostscript (to work with PDF files), using this command on Debian:

```sh
sudo apt-get install tesseract-ocr imagemagick libmagickwand-dev ghostscript
```

### Install Frappe application

```sh
bench get-app --branch develop erpnext_ocr https://github.com/Monogramm/erpnext_ocr
bench install-app erpnext_ocr
```

When installing the Frappe app, the following Python requirements will be installed:

- Python binding for tesseract, [tesserocr](https://pypi.org/project/tesserocr/)
- image processing library in Python, [pillow](https://pypi.org/project/Pillow/)
- HTTP library in Python, [requests](https://pypi.org/project/requests/)
- Python binding for imagemagick, [wand](https://pypi.org/project/Wand/)

## :rocket: Usage

**File Being Read**:

![File Being Read](./erpnext_ocr/tests/test_data/Picture_010.png)

**Sample Screenshot**:

![Sample Screenshot](./erpnext_ocr/tests/test_data/Picture_010_screenshot.png)

### Tesseract trained data

In order to use OCR with different languages, you need to install the appropriate trained data files. Check the tesseract Wiki for details:

### Development

If you wish to develop or just test this application locally, you can use `docker-compose up -d` at the root of this repository. You can then access your ERPNext OCR dev env at `http://localhost:8080`.

### Known issues

- `wand.exceptions.PolicyError: not authorized '/opt/sample.pdf' @ error/constitute.c/ReadImage/412`
  - This can happen when the security configuration in imagemagick prevents it from reading PDF files.
  - Reference:
- `wand.exceptions.WandRuntimeError: MagickReadImage returns false, but did raise ImageMagick exception. This can occurs when a delegate is missing, or returns EXIT_SUCCESS without generating a raster.`
  - This might happen if you're missing a dependency needed to convert PDF files, most of the time `ghostscript`.
  - References:
- `OSError: encoder error -2 when writing image file`
  - This might happen when trying to open a TIFF image, but the real error is "_hidden_" and only displayed in the console.
  - If the original error in the console is `Fax3SetupState: Bits/sample must be 1 for Group 3/4 encoding/decoding.`, the TIFF image compression is usually not valid or not recognized.

## :white_check_mark: Run tests

```sh
bench run-tests --app erpnext_ocr
```

## :bust_in_silhouette: Authors

**Monogramm**

- Website:
- Github: [@Monogramm](https://github.com/Monogramm)

**John Vincent Fiel**

- Github: [@jvfiel](https://github.com/jvfiel)

## :handshake: Contributing

Contributions, issues and feature requests are welcome!
Feel free to check the [issues page](https://github.com/Monogramm/erpnext_ocr/issues). [Check the contributing guide](./CONTRIBUTING.md).

## :thumbsup: Show your support

Give a :star: if this project helped you!

## :page_facing_up: License

Copyright © 2019 [Monogramm](https://github.com/Monogramm).
This project is [MIT][uri_license] licensed.

* * *

_This README was generated with :heart: by [readme-md-generator](https://github.com/kefranabg/readme-md-generator)_

[uri_license]: https://opensource.org/licenses/MIT
[uri_license_image]: https://img.shields.io/badge/license-MIT-blue
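
## :mag: Appendix: checking the imagemagick policy

The `PolicyError` entry under Known issues is attributed to the imagemagick security configuration. The sketch below is one possible way to see which coders a policy file blocks from reading; it is not part of this application. The function name `blocked_coders` and the sample XML are hypothetical, and the parsing is a simplification (it treats each `policy` element independently and only handles plain names or brace lists in `pattern`).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the layout of an ImageMagick policy.xml file.
SAMPLE_POLICY = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="coder" rights="none" pattern="PDF"/>
  <policy domain="coder" rights="read|write" pattern="PNG"/>
</policymap>"""


def blocked_coders(policy_xml):
    """Return the coder names whose policy entry does not grant 'read'."""
    root = ET.fromstring(policy_xml)
    blocked = set()
    for policy in root.iter("policy"):
        # Only coder policies decide which file formats may be read.
        if policy.get("domain") != "coder":
            continue
        rights = [r.strip() for r in policy.get("rights", "").lower().split("|")]
        if "read" in rights:
            continue
        # A pattern may be a single name ("PDF") or a brace list ("{PS,PDF}").
        for name in policy.get("pattern", "").strip("{}").split(","):
            if name.strip():
                blocked.add(name.strip().upper())
    return blocked


if __name__ == "__main__":
    print(sorted(blocked_coders(SAMPLE_POLICY)))  # prints ['PDF']
```

To check a real installation, pass the contents of its policy file to `blocked_coders`. On Debian the file is often found at `/etc/ImageMagick-6/policy.xml`, but that path is an assumption and varies with the installed version. If `PDF` appears in the result, the policy is a likely cause of the `PolicyError` above.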