CN109800750A

CN109800750A - A text region annotation method for regularly typeset text based on morphological image processing

Info

Publication number: CN109800750A
Application number: CN201910072288.6A
Authority: CN
Inventors: 段强; 李锐; 于治楼; 安程治
Original assignee: Jinan Inspur Hi Tech Investment and Development Co Ltd
Current assignee: Jinan Inspur Hi Tech Investment and Development Co Ltd
Priority date: 2019-01-25
Filing date: 2019-01-25
Publication date: 2019-05-24

Abstract

The present invention provides a text region annotation method for regularly typeset text based on morphological image processing, and belongs to the technical field of OCR image recognition. The invention combines image binarization with the morphological operations of erosion, dilation, opening and closing and with connected-component analysis to extract the regions where text is located and annotate them. It simplifies text box annotation for regularly typeset documents and makes it more flexible, achieving substantially the same effect as complex methods at lower consumption and cost.

Description

A text region annotation method for regularly typeset text based on morphological image processing

Technical field

The present invention relates to OCR image recognition technology, and in particular to a text region annotation method for regularly typeset text based on morphological image processing.

Background art

At present, the mainstream text box annotation techniques are supervised learning methods based on deep learning and the Fast R-CNN family, which in practice require manual labor to annotate the training data. They are suited to text localization and text box annotation in complex scenes. In simple scenes where the text is plain and fairly regular, such as business card OCR or invoice OCR, deep learning methods are overly complex and prone to producing unstable annotations.

In the tide of artificial intelligence development, having machines take over repetitive, mechanical work from humans is an inevitable trend. Among existing artificial intelligence applications, text recognition is already highly mature and covers not only printed type but also handwriting. To make the process more intelligent and automated, however, a text localization and text box annotation step is still needed before text recognition. To recognize individual characters, the text must be accurately annotated and segmented.

Traditional methods are mostly based on Fast R-CNN and its derivatives. These methods are supervised: text boxes must be annotated manually before training, which takes considerable manpower, and they also consume hardware resources. Moreover, in text recognition scenes with fairly regular layout, learning-based methods and neural networks are overly complex and do not produce better results than simple image processing methods.

Shortcomings of the traditional methods:

1) Supervised learning makes training complex; building the data set costs manpower and material resources, and good hardware support is required.

2) Despite the complex steps and model training, no qualitative improvement in text box annotation is obtained.

Therefore, an unsupervised method based on image processing has a wide range of application scenarios.

Summary of the invention

To solve the above technical problems, the invention proposes a text region annotation method for regularly typeset text based on morphological image processing. The main techniques used are image binarization, the morphological operations of erosion, dilation, opening and closing, and connected-component analysis.

The technical solution of the invention is as follows:

A text region annotation method for regularly typeset text based on morphological image processing, usable for text box annotation in text recognition scenes with regular layout. Through the morphological operations of erosion (Erosion), dilation (Dilate), opening (Open) and closing (Close), the target region of a binarized image containing the target text is turned into a single connected component; once the boundary of this connected component has been located, the text box can be marked out.

In general, the input is a text recognition scene with regular layout, such as a photograph or scanned image of an invoice. The output is the annotated text boxes.

If a difference in viewing angle means that the photographed invoice region is not a regular rectangle as seen from directly above, a homography transformation (Homography) is first used to normalize the region to be annotated into a standard rectangle.
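
The normalization step can be illustrated with a minimal sketch that estimates a homography from the four corners of the photographed region and the four corners of the target rectangle. The patent does not say how the transformation is computed; the direct linear solve below, the fixing of the last matrix entry to 1, and the function names `solve_linear`, `homography_from_corners` and `apply_homography` are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """Return a 3x3 matrix mapping four src points onto four dst points.

    Assumes no three points are collinear and fixes h[2][2] = 1.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(h, point):
    """Map one (x, y) point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical corners of a skewed invoice and of the target rectangle.
skewed = [(1, 2), (20, 3), (22, 15), (0.5, 14)]
rectangle = [(0, 0), (20, 0), (20, 10), (0, 10)]
H = homography_from_corners(skewed, rectangle)
```

In practice a library routine would normally do this work and also resample the pixels, for example OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective`.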

After the normalized region is obtained, the image is binarized: the 3-channel RGB image is converted into a single-channel binary image in which the region other than text becomes black (0) and the text becomes white (1).
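
A minimal sketch of this conversion follows, assuming dark ink on light paper so that pixels darker than the threshold become text (1). The grayscale weights are the common luma weights and are an assumption, as is the function name `binarize`; the patent only specifies the 0/1 output convention.

```python
def binarize(rgb_image, threshold):
    """Convert rows of (r, g, b) pixels into a 0/1 image.

    Assumption: text is darker than the background, so pixels whose
    gray level is below the threshold become 1 (text) and the rest 0.
    """
    binary = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            gray = 0.299 * r + 0.587 * g + 0.114 * b  # assumed luma weights
            out_row.append(1 if gray < threshold else 0)
        binary.append(out_row)
    return binary


# Tiny hypothetical image: white paper with two dark pixels.
image = [[(255, 255, 255), (0, 0, 0)],
         [(30, 30, 30), (250, 250, 250)]]
mask = binarize(image, 128)
```

For light text on a dark background the comparison would simply be reversed.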

Histogram analysis can be used here to obtain a suitable threshold that separates text from background. After the binary image separating text from background is obtained, morphological operations are used to build both the large text areas and the noise into connected components; smaller connected components are regarded as noise and removed, while larger ones are regarded as text and retained. Obtaining the top, bottom, left and right boundaries of each text connected component then allows a standard rectangular text box to be drawn. Since the height of a text line lies within a certain range, a height range is set manually to filter out noise: connected components whose height falls below or above the range are regarded as noise and removed, and those within the range are regarded as text regions and retained. This range is usually set to 0.5 to 1.5 times the median height of all valid text boxes.
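
The height-based noise filter can be sketched as follows. The patent refers to the median height of all valid text boxes without saying how validity is decided beforehand; this sketch assumes the median is taken over all candidate boxes. The box layout `(left, top, right, bottom)` with inclusive pixel coordinates and the name `filter_by_height` are also assumptions.

```python
from statistics import median


def filter_by_height(boxes, low=0.5, high=1.5):
    """Keep boxes whose height lies within [low, high] times the median height.

    Each box is (left, top, right, bottom) in inclusive pixel coordinates.
    """
    if not boxes:
        return []
    heights = [bottom - top + 1 for (_left, top, _right, bottom) in boxes]
    reference = median(heights)  # assumption: median over all candidates
    return [box for box, h in zip(boxes, heights)
            if low * reference <= h <= high * reference]


# Hypothetical candidates: three text lines, one speck, one oversized blob.
candidates = [(0, 0, 100, 19), (0, 30, 120, 50), (5, 60, 7, 61),
              (0, 70, 90, 89), (0, 100, 200, 199)]
text_boxes = filter_by_height(candidates)
```

With these candidates the median height is 20 pixels, so the 2-pixel speck and the 100-pixel blob are discarded and the three line-sized boxes remain.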

The specific techniques used, such as homography transformation, histogram analysis and morphological image operations, are mature and in common use, and are not described in detail here.

The specific steps are as follows.

Step 1: obtain the input image and ensure that it contains only the target region and is rectangular; if it is not rectangular, normalize it to a standard quadrilateral by homography transformation;

Step 2: set the threshold for image binarization. Background and text are distinguished by histogram analysis to obtain a threshold, and the threshold is set sensibly to reduce noise as far as possible; finally, background and text are represented as 0 and 1 respectively in a binary image;

Step 3: use morphological operations to connect the text into one whole connected component;

Step 4: a rectangular connected component is obtained after Step 3; extracting the extreme coordinates (minimum and maximum) of its top, bottom, left and right sides marks out the text box region.

Further, in Step 3 the closing operation is selected with a rectangular structuring element; to keep text on different lines separate, the height of the structuring element does not exceed the number of pixels between two lines of text.

Further, to connect the text within the same line, the width of the structuring element is not less than the horizontal spacing between two characters; if the width is set large, the surplus pixels are trimmed when the text box position is finally calculated.

The beneficial effects of the invention are as follows:

By using existing techniques flexibly, avoiding their weaknesses and exploiting their strengths, the invention proposes an unsupervised text box annotation method based on morphological image processing, which has the following advantages:

1) The method is simple and clear, and is highly general across text extraction scenes with regular layout;

2) Being unsupervised, it needs no manpower or material resources for annotating a training set and no time-consuming training step;

3) The amount of computation is small, and no powerful hardware support is needed.

Brief description of the drawings

Fig. 1 is a schematic diagram of the workflow of the invention.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the invention.

The text region annotation method of the invention, for regularly typeset text and based on morphological image processing, can be used for text box annotation in text recognition scenes with regular layout. Through the morphological operations of erosion (Erosion), dilation (Dilate), opening (Open) and closing (Close), the target region of a binarized image containing the target text is turned into a single connected component; once the boundary of this connected component has been located, the text box can be marked out.

The specific steps are as follows:

Step 1: obtain the input image, by photographing or scanning, and ensure that it contains only the target region and is rectangular. If it is not rectangular, it can be normalized to a standard quadrilateral by homography transformation.

Step 2: set the threshold for image binarization. Histogram analysis can be used to distinguish background from text and to obtain a suitable threshold; setting the threshold sensibly reduces noise as far as possible. Finally, background and text are represented as 0 and 1 respectively in a binary image.
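
The patent does not name a particular histogram analysis technique. As one possible choice, the sketch below uses Otsu's method, which picks the gray level that maximizes the between-class variance of the histogram; this substitution and the function name `otsu_threshold` are assumptions, not part of the described method.

```python
def otsu_threshold(gray_pixels):
    """Return the gray level t (0-255) maximizing between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    Otsu's method is an assumed stand-in for "histogram analysis".
    """
    hist = [0] * 256
    for value in gray_pixels:
        hist[value] += 1
    total = len(gray_pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(256):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue          # no pixels in the lower class yet
        if weight1 == 0:
            break             # all pixels are now in the lower class
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Hypothetical bimodal page: dark ink near 20-25, light paper near 220-230.
pixels = [20] * 30 + [25] * 20 + [220] * 40 + [230] * 10
t = otsu_threshold(pixels)
```

On a cleanly bimodal histogram such as this one, the returned level falls between the ink values and the paper values, so thresholding at it separates the two groups.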

Step 3: use morphological operations to connect the text into one whole connected component. The closing (close) operation is selected here, with a rectangular structuring element. To keep text on different lines separate, the height of the structuring element must not exceed the number of pixels between two lines of text; it can be set to 1 or 2. At the same time, to connect the text within the same line, the width of the structuring element must not be less than the horizontal spacing between two characters. The width can be set fairly large, but the surplus pixels must then be trimmed when the text box position is finally calculated.
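
The effect of a wide, flat rectangular structuring element can be seen in a minimal sketch of binary closing (dilation followed by erosion) on a nested-list image. The function names `_window_op` and `close_rect`, the use of odd element sizes so that the window is centered, and the clipping of windows at the image border are assumptions made for this illustration.

```python
def _window_op(img, se_h, se_w, op):
    """Apply op (max or min) over a centered se_h x se_w window.

    Windows are clipped at the image border. Odd sizes are assumed.
    """
    h, w = len(img), len(img[0])
    ry, rx = se_h // 2, se_w // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = op(img[yy][xx]
                           for yy in range(max(0, y - ry), min(h, y + ry + 1))
                           for xx in range(max(0, x - rx), min(w, x + rx + 1)))
    return out


def close_rect(img, se_h, se_w):
    """Binary closing: dilation (max) followed by erosion (min)."""
    dilated = _window_op(img, se_h, se_w, max)
    return _window_op(dilated, se_h, se_w, min)


# Two hypothetical text lines with 2-pixel gaps between characters,
# separated by one empty row.
page = [
    [0] * 16,
    [0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
    [0] * 16,
    [0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0] * 16,
]
closed = close_rect(page, 1, 5)  # height 1, width 5
```

With a 1 x 5 element the gaps inside each line are filled while the empty row between the lines is untouched, so each line becomes one connected run. With OpenCV the same operation would typically be `cv2.morphologyEx` with `cv2.MORPH_CLOSE` and an element from `cv2.getStructuringElement(cv2.MORPH_RECT, (width, height))`.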

Step 4: in theory a rectangular connected component is obtained after Step 3; extracting the extreme coordinates (minimum and maximum) of its top, bottom, left and right sides marks out the text box region.
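
Locating the boundaries of each connected component can be sketched with a breadth-first flood fill that tracks the extreme coordinates it visits. The choice of 4-connectivity, the `(left, top, right, bottom)` box layout and the name `bounding_boxes` are assumptions; the patent only states that the boundary coordinates of the connected component are extracted.

```python
from collections import deque


def bounding_boxes(binary):
    """Return (left, top, right, bottom) for each 4-connected region of 1s.

    Regions are reported in the order their first pixel is met when
    scanning row by row; coordinates are inclusive.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            left = right = x0
            top = bottom = y0
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical closed image containing two text-line blobs.
closed_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
boxes = bounding_boxes(closed_image)
```

Each returned tuple is the text box for one connected component; boxes that are too short or too tall can then be discarded as noise.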

The invention makes flexible use of existing techniques together with currently common open-source frameworks and languages, exploiting their strengths and avoiding their weaknesses. It simplifies text box annotation for regularly typeset documents and makes it more flexible, achieving substantially the same effect as complex methods at lower consumption and cost.

The foregoing describes only preferred embodiments of the invention; it serves only to illustrate the technical solution and is not intended to limit the scope of protection of the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the invention falls within its scope of protection.

Claims

1. A text region annotation method for regularly typeset text based on morphological image processing, characterized in that

through the morphological operations of erosion, dilation, opening and closing, the target region of a binarized image containing the target text is turned into a single connected component, and the text box can be marked out once the boundary of the connected component has been located.

2. The method according to claim 1, characterized in that

if a difference in viewing angle means that the photographed invoice region is not a regular rectangle as seen from directly above, a homography transformation is first used to normalize the region to be annotated into a standard rectangle.

3. The method according to claim 2, characterized in that

after the normalized region is obtained, the image is binarized: the 3-channel RGB image is converted into a single-channel binary image in which the region other than text becomes black and the text becomes white.

4. The method according to claim 3, characterized in that

histogram analysis is used to obtain a threshold that separates text from background.

5. The method according to claim 4, characterized in that

after the binary image separating text from background is obtained, morphological operations are used to build the large text areas and the noise into connected components; since the height of a text line lies within a certain range, a height range is set manually to filter out noise, connected components whose height falls below or above the range are regarded as noise and removed, and connected components within the range are regarded as text regions and retained; this range is set to 0.5 to 1.5 times the median height of all valid text boxes.

6. The method according to claim 5, characterized in that

the top, bottom, left and right boundaries of each text connected component are obtained, from which a standard rectangular text box can be drawn.

7. The method according to claim 5, characterized in that

the specific operating steps are as follows:

Step 1: obtain the input image and ensure that it contains only the target region and is rectangular; if it is not rectangular, normalize it to a standard quadrilateral by homography transformation;

Step 2: set the threshold for image binarization, distinguish background from text by histogram analysis to obtain a threshold, and set the threshold so as to reduce noise as far as possible; finally, background and text are represented as 0 and 1 respectively in a binary image;

Step 3: use morphological operations to connect the text into one whole connected component;

Step 4: a rectangular connected component is obtained after Step 3; extracting the extreme coordinates (minimum and maximum) of its top, bottom, left and right sides marks out the text box region.

8. The method according to claim 7, characterized in that

in Step 3, the closing operation is selected with a rectangular structuring element; to keep text on different lines separate, the height of the structuring element does not exceed the number of pixels between two lines of text.

9. The method according to claim 8, characterized in that

at the same time, to connect the text within the same line, the width of the structuring element is not less than the horizontal spacing between two characters; if the width is set large, the surplus pixels are trimmed when the text box position is finally calculated.