Tesseract and Leptonica iOS Business Card Scan

I am trying to scan a business card using Tesseract OCR. All I do is send the image, with no preprocessing, using the code below.

Tesseract* tesseract = [[Tesseract alloc] initWithLanguage:@"eng+ita"];
tesseract.delegate = self;
[tesseract setVariableValue:@"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@.-()" forKey:@"tessedit_char_whitelist"];

[tesseract setImage:[UIImage imageNamed:@"card.jpg"]]; //image to check
[tesseract recognize];

NSLog(@"Here is the text %@", [tesseract recognizedText]);

Image of the card

This is the result.

As you can see, the accuracy is not 100%. That is not what bothers me, since I believe I can fix it with some simple preprocessing. However, notice that it mixes the two text blocks at the bottom, which splits up the address, and possibly other information on other cards.

How can I use Leptonica (or something else, possibly OpenCV) to group the text somehow? Is it possible to send regions of text in the image to tesseract individually for recognition? I have been stuck on this problem for a while, so any possible solutions are welcome!

+3
4 answers

I would recommend the Run Length Smoothing Algorithm (RLSA). It is used in many document processing systems, although not every system exposes it as part of its API.
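To give an idea of what RLSA does, here is a minimal sketch in plain C++, assuming the card image has already been binarized (1 = ink, 0 = background). It follows the usual description of the algorithm: short background gaps are filled along rows, then along columns, and the two results are ANDed. The names `smoothRow`, `rlsa` and the threshold parameters are my own, not from any library, and the extra final horizontal pass that some descriptions add is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Binary image: 1 = ink (foreground), 0 = background, one Row per scanline.
using Row = std::vector<int>;
using Image = std::vector<Row>;

// Horizontal smoothing of one row: a run of background pixels that lies
// between two ink pixels and is at most `threshold` long is filled with ink.
Row smoothRow(const Row& row, std::size_t threshold) {
    Row out = row;
    bool seenInk = false;      // true once the first ink pixel was found
    std::size_t gapStart = 0;  // index where the current background run began
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (row[i] == 0) continue;
        if (seenInk && i - gapStart <= threshold) {
            for (std::size_t j = gapStart; j < i; ++j) out[j] = 1;
        }
        seenInk = true;
        gapStart = i + 1;
    }
    return out;
}

Image smoothHorizontal(const Image& img, std::size_t threshold) {
    Image out;
    out.reserve(img.size());
    for (const Row& r : img) out.push_back(smoothRow(r, threshold));
    return out;
}

// Swap rows and columns so that vertical smoothing can reuse smoothRow.
Image transpose(const Image& img) {
    if (img.empty()) return {};
    Image t(img[0].size(), Row(img.size(), 0));
    for (std::size_t y = 0; y < img.size(); ++y) {
        assert(img[y].size() == img[0].size());  // image must be rectangular
        for (std::size_t x = 0; x < img[y].size(); ++x) t[x][y] = img[y][x];
    }
    return t;
}

// Smooth along rows and along columns, then keep pixels set in both results.
Image rlsa(const Image& img, std::size_t hThreshold, std::size_t vThreshold) {
    Image h = smoothHorizontal(img, hThreshold);
    Image v = transpose(smoothHorizontal(transpose(img), vThreshold));
    Image out = h;
    for (std::size_t y = 0; y < h.size(); ++y)
        for (std::size_t x = 0; x < h[y].size(); ++x)
            out[y][x] = (h[y][x] != 0 && v[y][x] != 0) ? 1 : 0;
    return out;
}
```

With suitable thresholds (they depend on the image resolution and have to be tuned), the characters of one text block merge into a single blob. The bounding box of each blob is then a candidate region that can be sent to tesseract on its own.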

1982 . , .

: http://www.sciencedirect.com/science/article/pii/S0262885609002005

, .

, . , :

  • ,
    • ,
  • ,
    • , .

, , :

  • ( ),
  • .

, OpenCV, . , , , .

+1

, OCR - , OCR / .

, :

, , , 1-5% . OCR , , . , , -.

-, , . , , , " ", " " . , , , .

, , , . , ( ), , , "N" . , .

, , , , "A" Associate, .

OCR, , " " " " ".

, , .

0

You can try HOCRText, which returns the recognized text as XML (hOCR), including the bounding box of each word.

char *boxtext = _tesseract->GetHOCRText(0);

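Parse the XML to get each word and its frame. Alternatively, you can tell tesseract which region of the image it should scan.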

_tesseract->SetRectangle(100, 100, 200, 200);

Set this rectangle before you call recognize, so that tesseract scans only that region.
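As a rough illustration of the hOCR route, the sketch below pulls word boxes out of an hOCR string with a regular expression. It assumes the words appear as `<span class='ocrx_word' ... title='bbox x0 y0 x1 y1; ...'>text</span>` with plain text inside the span; output that wraps the text in further tags, or orders the attributes differently, would need a real XML parser. `WordBox` and `parseHocrWords` are names I made up for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// One recognized word and its bounding box in image coordinates.
struct WordBox {
    std::string text;
    int x0, y0, x1, y1;
};

// Extract every ocrx_word span together with the "bbox x0 y0 x1 y1" values
// from its title attribute. Assumes class comes before title in the tag.
std::vector<WordBox> parseHocrWords(const std::string& hocr) {
    static const std::regex wordRe(
        R"re(<span[^>]*class=['"]ocrx_word['"][^>]*title=['"]bbox (\d+) (\d+) (\d+) (\d+)[^'"]*['"][^>]*>([^<]*)</span>)re");
    std::vector<WordBox> words;
    for (std::sregex_iterator it(hocr.begin(), hocr.end(), wordRe), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        assert(m.size() == 6);  // whole match plus five capture groups
        words.push_back({m[5].str(), std::stoi(m[1].str()), std::stoi(m[2].str()),
                         std::stoi(m[3].str()), std::stoi(m[4].str())});
    }
    return words;
}
```

Once you have the boxes, you can group words whose boxes are close together into blocks, and pass the enclosing rectangle of each block to `SetRectangle` (as left, top, width, height) before recognizing again.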

0

There is an open-source iOS project on GitHub for scanning business cards:

https://github.com/danauclair/CardScan

Maybe its parser will help? (Source: https://github.com/danauclair/CardScan/blob/master/Classes/CardParser.m)

//  A class used to parse a bunch of OCR text into the fields of an ABRecordRef which can be added to the 
//  iPhone address book contacts. This class copies and opens a small SQLite database with a table of ~5500
//  common American first names which it uses to help decipher which text on the business card is the name.
//
//  The class tokenizes the text by splitting it up by newlines and also by a simple " . " regex pattern.
//  This is because many business cards put multiple "tokens" of information on a single line separated by 
//  spaces and some kind of character such as |, -, /, or a dot.
//
//  Once the OCR text is fully tokenized it tries to identify the name (via SQLite table), job title (uses 
//  a set of common job title words), email, website, phone, address (all using regex patterns). The company
//  or organization name is assumed to be the first token/line of the text unless that is the name.
//
//  This is obviously a far from perfect parsing scheme for business card text, but it seems to work decently
//  on a number of cards that were tested. I'm sure a lot of improvements can be made here.
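The tokenize-then-classify idea described in that comment can be sketched in a few lines of C++. This is not the code from CardParser.m: the separator set and the three regex patterns are simplified assumptions of mine, and the name lookup (SQLite table) and job title matching are left out entirely.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

enum class Field { Email, Website, Phone, Unknown };

// Split OCR text on newlines and on a single separator character
// (|, /, . or -) that has a space on both sides.
std::vector<std::string> tokenize(const std::string& text) {
    static const std::regex sep(R"re(\n| [|/.-] )re");
    std::vector<std::string> tokens;
    std::sregex_token_iterator it(text.begin(), text.end(), sep, -1), end;
    for (; it != end; ++it) {
        std::string tok = it->str();
        if (!tok.empty()) tokens.push_back(tok);  // skip empty pieces
    }
    return tokens;
}

// Guess the field type of one token. The patterns are deliberately loose
// and only meant to illustrate the approach.
Field classify(const std::string& token) {
    static const std::regex email(
        R"re([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})re");
    static const std::regex phone(R"re(\+?[0-9][0-9 ().-]{5,}[0-9])re");
    static const std::regex website(
        R"re((https?://)?(www\.)?[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}(/\S*)?)re");
    if (std::regex_match(token, email)) return Field::Email;
    if (std::regex_match(token, phone)) return Field::Phone;
    if (std::regex_match(token, website)) return Field::Website;
    return Field::Unknown;
}
```

Tokens that come back as `Field::Unknown` are the candidates for name, job title, company and address, which is where the name table and the keyword lists from the comment above would come in.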
0
