Text Extraction from Images using FastAPI and Tesseract

This project provides a simple API for extracting text from images using FastAPI and Tesseract OCR.

Features

Upload an image and extract text using Tesseract OCR.
FastAPI backend for handling image uploads and text extraction.
Easy to set up and run locally.

Prerequisites

Before running the project, ensure you have the following installed:

Python 3.8 or higher
Tesseract OCR
FastAPI
Uvicorn
Pillow
Pytesseract
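
The list above can be checked from Python before going further. The following is a minimal sketch using only the standard library. The function name `missing_prerequisites` is hypothetical, and the module names (`fastapi`, `uvicorn`, `PIL`, `pytesseract`) are the usual import names of the listed packages, not something taken from this repository's code.

```python
import importlib.util
import shutil
import sys

# Import names of the Python packages listed above (Pillow is imported as "PIL").
REQUIRED_MODULES = ("fastapi", "uvicorn", "PIL", "pytesseract")


def missing_prerequisites(modules=REQUIRED_MODULES, binary="tesseract", min_python=(3, 8)):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)
    # The Tesseract engine is a separate executable, not a Python package.
    if shutil.which(binary) is None:
        problems.append(f"'{binary}' executable not found on PATH")
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package '{name}' is not installed")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("MISSING:", problem)
```

Running the script prints one line per missing item and nothing when all checks pass.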

Installation

Clone the repository:

git clone https://github.com/HugoNicolau/text-extraction-py.git
cd text-extraction-py

Set up a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Install Tesseract OCR:

On Ubuntu/Debian:
```
sudo apt-get install tesseract-ocr
```
On macOS (using Homebrew):
```
brew install tesseract
```
On Windows, download the installer from Tesseract's GitHub page.
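
If the Tesseract executable is installed but not on PATH (this can happen on Windows), pytesseract can be pointed at it through `pytesseract.pytesseract.tesseract_cmd`. The sketch below looks on PATH first and then tries a fallback location. The helper names and the fallback path are assumptions for illustration, so adjust the path to wherever your installer placed the executable.

```python
import os
import shutil

# Assumed install location for illustration; change it to match your machine.
WINDOWS_FALLBACKS = (r"C:\Program Files\Tesseract-OCR\tesseract.exe",)


def find_tesseract(fallbacks=WINDOWS_FALLBACKS):
    """Return the path to the tesseract executable, or None if it cannot be found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract(path):
    """Tell pytesseract which executable to run (needed when it is not on PATH)."""
    import pytesseract  # imported here so find_tesseract() works without it installed

    pytesseract.pytesseract.tesseract_cmd = path
```

A typical use would be to call `find_tesseract()` once at startup and pass a non-`None` result to `configure_pytesseract()`.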

Running the Project

Start the FastAPI server:
```
uvicorn main:app --reload
```
The server will start at http://localhost:8000.
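
To confirm from a script that the server is reachable, something like the following can be used. It is a sketch that assumes the app keeps FastAPI's default interactive docs page at /docs enabled; the function name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy: the target is a local development server.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_up(base_url="http://localhost:8000", timeout=2.0):
    """Return True if an HTTP server answers at base_url + '/docs', else False."""
    try:
        with _OPENER.open(base_url + "/docs", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except OSError:
        # Connection refused, DNS failure, or timeout (URLError subclasses OSError).
        return False
```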

Test the API:

You can test the API using curl or a tool like Postman.

Example using curl:

curl -X POST -F "file=@/path/to/your/image.png" http://localhost:8000/extract-text/

Example response:

{
  "text": "This is the extracted text from the image."
}
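
The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above: it sends the image as a multipart form field named file and reads the text key from the JSON response. The function names `build_multipart` and `extract_text` are hypothetical, and error handling is left out for brevity.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field, filename, data):
    """Encode one file as multipart/form-data; returns (body_bytes, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def extract_text(image_path, url="http://localhost:8000/extract-text/"):
    """POST the image as the 'file' form field and return the 'text' value."""
    with open(image_path, "rb") as fh:
        body, content_type = build_multipart("file", os.path.basename(image_path), fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

With the server running, `extract_text("/path/to/your/image.png")` would return the extracted string.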

API Endpoints

POST /extract-text/

Extract text from an uploaded image.

Request:

file: The image file to process, sent as a multipart/form-data field named file (supported formats: PNG, JPEG, etc.).

Response:

text: The extracted text.
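
The repository's main.py is not reproduced here. As a rough illustration of how an endpoint with this request and response shape could be written, here is a minimal sketch. It is not the author's implementation: the factory function, the helper `is_supported`, the set of accepted content types, and the 415 error response are all assumptions. FastAPI file uploads also require the python-multipart package.

```python
import io

# Assumption: only these two types are accepted; the README says "PNG, JPEG, etc."
ALLOWED_CONTENT_TYPES = {"image/png", "image/jpeg"}


def is_supported(content_type):
    """Return True for upload content types this sketch accepts."""
    return content_type in ALLOWED_CONTENT_TYPES


def create_app():
    """Build a FastAPI app exposing POST /extract-text/ (hypothetical layout)."""
    # Third-party imports live here so the helper above stays usable without them.
    import pytesseract
    from fastapi import FastAPI, File, HTTPException, UploadFile
    from PIL import Image

    app = FastAPI()

    @app.post("/extract-text/")
    def extract_text(file: UploadFile = File(...)):
        if not is_supported(file.content_type):
            raise HTTPException(status_code=415, detail="Unsupported image type")
        # Load the uploaded bytes into a Pillow image and run OCR on it.
        image = Image.open(io.BytesIO(file.file.read()))
        return {"text": pytesseract.image_to_string(image)}

    return app
```

Saved as main.py, this sketch could be served with `uvicorn main:create_app --factory`, or by adding `app = create_app()` at module level so that `uvicorn main:app` finds it.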

Project Structure

text-extraction-py/
├── venv/                  # Virtual environment
├── main.py                # FastAPI application
├── requirements.txt       # Python dependencies
├── README.md              # Project documentation
├── .gitignore             # Files to ignore in Git
└── image/                 # Folder for test images
    └── example-image.png

Usage

Upload an Image:

Use the /extract-text/ endpoint to upload an image and extract text.

View Extracted Text:

The API will return the extracted text in JSON format.

Contributing

Contributions are welcome! If you'd like to contribute, please follow these steps:

Fork the repository.
Create a new branch (git checkout -b feature/YourFeature).
Commit your changes (git commit -m 'Add some feature').
Push to the branch (git push origin feature/YourFeature).
Open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

FastAPI for the web framework behind the text extraction service.
Tesseract OCR for text extraction.
Pillow for image processing.

Contact

For questions or feedback, please reach out to me on LinkedIn or via email.
