
chrizztus/tesseract-ios

 
 


Tesseract for iOS

About

Tesseract-ios is an Objective-C wrapper for Tesseract OCR.

This project couldn't exist without Ângelo Suzuki's blog post; much of the code came from his article.

Requirements

  • iOS SDK 6.0, targeting iOS 5.0 or later (armv6 is not supported)
  • Tesseract and Leptonica libraries from the tesseract-ios-lib repo.

Installation

  • Download tesseract-ios-lib and put it somewhere in your project.
  • Put the Classes content (from this repo) somewhere in your project.
  • Go to your project's build settings and make sure that C++ Standard Library is set to Compiler Default.

Usage

Here is the default workflow to extract text from an image:

  • Instantiate Tesseract with data path and language
  • Set the delegate (only needed for recognizing with progress update)
  • Set variables (character set, …)
  • Set the image to analyze
  • Start recognition
  • Get recognized text
  • Clear

Code Sample

#import "Tesseract.h"

Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
[tesseract setImage:[UIImage imageNamed:@"image_sample.jpg"]];
[tesseract recognize];

NSLog(@"%@", [tesseract recognizedText]);
[tesseract clear];

Code sample with progress update

#import "Tesseract.h"

- (void)viewDidLoad {
  [super viewDidLoad];
  Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
  [tesseract setTessDelegate:self];
  [tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
  [tesseract setImage:[UIImage imageNamed:@"image_sample.jpg"]];
  [tesseract recognizeWithProgressUpdate];

  NSLog(@"%@", [tesseract recognizedText]);
  [tesseract clear];
}

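/*
 * This delegate method is called every time the progress changes.
 */
- (void)progressUpdate:(NSUInteger)progress {
  // NSUInteger varies in width across architectures, so cast it for the format string.
  NSLog(@"progress: %lu%%", (unsigned long)progress);
}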

Method reference

-initWithDataPath:language:

- (id)initWithDataPath:(NSString *)dataPath language:(NSString *)language

Initialize a new Tesseract instance.

  • dataPath: a relative path from the application bundle to the .traineddata files. You can find these files in the Tesseract downloads section.
  • language: the language used for recognition, e.g. eng. Tesseract will search for an eng.traineddata file in the dataPath directory.

Returns nil if instantiation failed.
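
Because the initializer can return nil, it may be worth checking the result before configuring the instance. The following is a minimal sketch that uses only the methods documented in this README; the helper name MakeDigitRecognizer is hypothetical, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>
#import "Tesseract.h"

// Hypothetical helper (not part of this repo): returns an instance restricted
// to digits, or nil when Tesseract could not be set up.
static Tesseract *MakeDigitRecognizer(void) {
    Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
    if (tesseract == nil) {
        // A likely cause is a missing tessdata/eng.traineddata in the app bundle.
        NSLog(@"Could not create Tesseract; check the tessdata directory.");
        return nil;
    }
    [tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
    return tesseract;
}
```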

-setVariableValue:forKey:

- (void)setVariableValue:(NSString *)value forKey:(NSString *)key

Set Tesseract variable key to value. See http://www.sk-spell.sk.cx/tesseract-ocr-en-variables for a complete (but not up-to-date) list.

For instance, use tessedit_char_whitelist to restrict characters to a specific set.

-setImage:

- (void)setImage:(UIImage *)image

Set the image to recognize.

-setLanguage:

- (BOOL)setLanguage:(NSString *)language

Override the language defined with -initWithDataPath:language:.

-recognize

- (BOOL)recognize

Start text recognition. Recognition can take a while, so you might want to run it in the background, for example with NSObject's -performSelectorInBackground:withObject:.
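
As an alternative to -performSelectorInBackground:withObject:, recognition can be moved off the main thread with Grand Central Dispatch. This is only a sketch: the helper name RecognizeInBackground and its completion block are hypothetical, ARC is assumed, and it assumes that -recognize returns YES on success, which this README does not state.

```objc
#import <UIKit/UIKit.h>
#import "Tesseract.h"

// Hypothetical helper (not part of this repo): runs OCR on a background queue
// and delivers the text (or nil on failure) on the main queue.
static void RecognizeInBackground(UIImage *image, void (^completion)(NSString *text)) {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
        NSString *text = nil;
        if (tesseract != nil) {
            [tesseract setImage:image];
            // Assumption: YES means recognition succeeded.
            if ([tesseract recognize]) {
                text = [tesseract recognizedText];
            }
            [tesseract clear];
        }
        // UIKit must only be touched on the main thread, so hop back before reporting.
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(text);
        });
    });
}
```

A caller could then update a label inside the completion block without blocking the UI while recognition runs.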

-recognizeWithProgressUpdate

- (BOOL)recognizeWithProgressUpdate

Start text recognition with progress updates. Implement the - (void)progressUpdate:(NSUInteger)progress delegate method to show a progress bar or to log the progress.

-recognizedText

- (NSString *)recognizedText

Get the text extracted from the image.

-clear

- (void)clear

Clears the Tesseract object after text has been recognized from an image. Call it when you are done to prevent memory leaks.

-progressUpdate

- (void)progressUpdate:(NSUInteger)progress

Delegate method that is called on each progress update during recognition. progress is in the range [0...100].
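
One way to use this callback is to drive a UIProgressView. The sketch below makes several assumptions: OCRViewController and its progressView property are hypothetical, ARC is assumed, and because this README does not say which thread the delegate is called on, the update is dispatched to the main queue to be safe. The name of the delegate protocol is not given here either, so the class does not declare conformance to it.

```objc
#import <UIKit/UIKit.h>
#import "Tesseract.h"

// Hypothetical view controller (not part of this repo).
@interface OCRViewController : UIViewController
@property (nonatomic, strong) UIProgressView *progressView;
@end

@implementation OCRViewController

// Called by Tesseract during -recognizeWithProgressUpdate; progress is 0...100.
- (void)progressUpdate:(NSUInteger)progress {
    // UIProgressView expects a value between 0.0 and 1.0.
    float fraction = (float)progress / 100.0f;
    dispatch_async(dispatch_get_main_queue(), ^{
        [self.progressView setProgress:fraction animated:YES];
    });
}

@end
```

For the bar to repaint while recognition is running, -recognizeWithProgressUpdate itself would need to run off the main thread.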
