CN104346609B - Method and device for recognizing characters on printed matter - Google Patents
Method and device for recognizing characters on printed matter
- Publication number
- CN104346609B CN104346609B CN201310331468.4A CN201310331468A CN104346609B CN 104346609 B CN104346609 B CN 104346609B CN 201310331468 A CN201310331468 A CN 201310331468A CN 104346609 B CN104346609 B CN 104346609B
- Authority
- CN
- China
- Prior art keywords
- image
- gray value
- character
- pixel
- duplicating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Input (AREA)
Abstract
This application relates to a method and device for recognizing characters on printed matter. The method may include: photographing the printed matter to obtain an image to be recognized; duplicating the image to obtain at least two duplicate images, and performing a different image-processing operation on each duplicate image to obtain at least two layered images; merging the obtained layered images to obtain a processed image; extracting the image of each character from the processed image; and performing character recognition on each extracted character image. With this technical solution, when image processing is applied to printed matter such as certificates, the characters on the printed matter can be recognized more effectively and more accurately.
Description
Technical field
This application relates to the field of image recognition, and in particular to a method and device for recognizing characters on printed matter.
Background technology
In conventional OCR (Optical Character Recognition), recognition of text on printed matter with a smooth or highly reflective surface — for example printed matter whose surface has been coated, or certificate photographs and various cards (especially laminated certificates such as driver's licenses and vehicle licenses) — often suffers from a low recognition rate, or from recognition errors caused by surface reflections. The essence of the problem is that the recognition process fails to filter the image effectively, so the source fonts presented to the OCR engine are blurred or have excessive contrast. In addition, printed matter often carries a variety of different fonts, so the characters may fail to match or may be matched incorrectly.
At present, as OCR recognition technology develops, the demand for license and certificate recognition keeps growing, and existing OCR techniques tend to focus on recognizing and searching complete image information. Among current certificate-recognition schemes, recognition of identity cards, passports and the like already has fairly mature, high-accuracy engines and algorithms. For certificates such as driving licenses and employee badges, however, the certificate is usually laminated before issuance, and similar certificates printed in different regions do not share the unified printing standards and fonts that identity cards have. As a result, existing certificate recognition often suffers from blurred images caused by over-exposure and from low recognition efficiency on deformed fonts; in essence, existing recognition methods do not fully address either of these two needs.
Summary of the invention
The main purpose of this application is to provide a method and device for recognizing characters on printed matter, so as to solve the image-processing and character-recognition problems in the prior art when recognizing characters on printed matter, wherein:
According to one aspect of this application, a method for recognizing characters on printed matter is provided, comprising: photographing the printed matter to obtain an image to be recognized; duplicating the image to obtain at least two duplicate images, and performing a different image-processing operation on each duplicate image to obtain at least two layered images; merging the obtained layered images to obtain a processed image; extracting the image of each character from the processed image; and performing character recognition on each extracted character image.
According to an embodiment of this application, in the method, photographing the printed matter to obtain the image to be recognized includes: applying exposure settings according to a predetermined condition when photographing.
According to an embodiment of this application, in the method, performing a different image-processing operation on each duplicate image to obtain at least two layered images includes: performing noise removal on one of the duplicate images to obtain a first layered image; and performing contrast enhancement on another of the duplicate images to obtain a second layered image.
According to an embodiment of this application, in the method, performing noise removal on one of the duplicate images to obtain the first layered image includes: identifying the noise points in the duplicate image; summing the gray value of each noise point with the gray values of its eight adjacent pixels and averaging the result to obtain a denoised gray value for that noise point; and replacing the gray value of each noise point in the duplicate image with its denoised gray value to obtain the first layered image.
According to an embodiment of this application, in the method, identifying the noise points in the duplicate image includes: summing the gray value of each pixel in the duplicate image with the gray values of its left and right neighboring pixels and averaging the result to obtain a calculated gray value for that pixel; judging whether the absolute difference between each pixel's gray value and its calculated gray value falls within a predetermined threshold range; and identifying as noise points those pixels whose absolute difference exceeds the predetermined threshold range.
According to an embodiment of this application, in the method, performing contrast enhancement on another of the duplicate images to obtain the second layered image includes: dividing the duplicate image into at least two subregions; and performing gray-scale adjustment on each subregion separately to obtain the second layered image.
According to an embodiment of this application, in the method, merging the layered images to obtain the processed image includes: taking the median of the gray values of corresponding pixels across the layered images to obtain a median gray value for each pixel; and replacing the gray value of each pixel with its median gray value to obtain the processed image.
According to an embodiment of this application, in the method, extracting the image of each character from the processed image includes: determining the position of the text image within the processed image; and performing character segmentation on the text image to extract the image of each character in the text image.
According to an embodiment of this application, in the method, obtaining the position of the text image within the processed image includes: identifying the edge texture in each row of pixels by edge detection; building a histogram of the edge texture of each row of pixels, and determining a recognition threshold for edge primitives from an analysis of the histogram; counting the number of edge primitives in each row according to the recognition threshold, and recording the start and end positions of the edge primitives in each row; identifying the non-blank rows in the processed image; judging whether the current non-blank row satisfies a preset condition and, if so, proceeding to detect the next non-blank row; and, when more than a predetermined number of consecutive non-blank rows satisfy the preset condition, determining the position of the text image from the start and end positions of the edge primitives of each non-blank row.
According to an embodiment of this application, in the method, performing character recognition on each extracted character image includes: performing character recognition on each character image using a BP neural network.
According to another aspect of this application, a device for recognizing characters on printed matter is provided, comprising: an acquisition module, configured to photograph the printed matter to obtain an image to be recognized; a layered-processing module, configured to duplicate the image to obtain at least two duplicate images and to perform a different image-processing operation on each duplicate image to obtain at least two layered images; a layer-merging module, configured to merge the obtained layered images to obtain a processed image; an extraction module, configured to extract the image of each character from the processed image; and a recognition module, configured to perform character recognition on each extracted character image.
Compared with the prior art, the technical solution of this application photographs the printed matter, applies layered image processing to the image to be recognized, and compensates the result through layer merging, which improves image quality and raises recognition accuracy.
Brief description of the drawings
The drawings described here are provided for further understanding of this application and form a part of it; the schematic embodiments of this application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a method for recognizing characters on printed matter according to an embodiment of this application;
Fig. 2 is a flowchart of the noise-removal step S1 within step S102 of Fig. 1;
Fig. 3 is a flowchart of step S201 of Fig. 2;
Fig. 4 is a flowchart of the contrast-enhancement step S2 within step S102 of Fig. 1;
Fig. 5 is a flowchart of step S103 of Fig. 1;
Fig. 6 is a flowchart of step S104 of Fig. 1;
Fig. 7 is a flowchart of step S601 of Fig. 6; and
Fig. 8 is a structural diagram of a device for recognizing characters on printed matter according to an embodiment of this application.
Detailed description of the embodiments
The main idea of this application is to photograph the printed matter carrying text, copy the resulting image into at least two images, apply a different image-processing operation to each copy to obtain layered images, merge the layered images to obtain a processed image, and then perform text extraction and text recognition on the processed image.
To make the purpose, technical solution and advantages of this application clearer, the technical solution is described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of this application.
According to an embodiment of this application, a method for recognizing characters on printed matter is provided.
This application can be applied to recognizing the characters printed on printed matter; for example, it can be used for certificate recognition, and in particular for recognizing laminated certificates.
Referring to Fig. 1, a flowchart of a method for recognizing characters on printed matter according to an embodiment of this application: as shown in Fig. 1, in step S101, the printed matter is photographed to obtain an image to be recognized.
Because image-capture devices vary in quality, shooting may be affected by factors such as exposure time and exposure compensation, which degrade the captured image and hamper subsequent processing. Therefore, exposure settings can be applied according to a predetermined condition before shooting to obtain a better image. The predetermined condition can be established by collecting statistics on how different exposure settings affect images of the same type shot in the same environment (e.g., under the same light intensity).
In step S102, the image is duplicated to obtain at least two duplicate images, and a different image-processing operation is performed on each duplicate image to obtain at least two layered images. That is, the captured image is copied into multiple copies, each copy is processed separately, and the operation applied to each copy is different; this amounts to layered processing of the original image, yielding layered images that have undergone different treatments.
The different image-processing operations can include noise removal and contrast enhancement. They can also include other operations, for example path coloring, pattern cutting, or texture-recognition preprocessing; after these operations, several layered images are obtained.
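The duplicate-and-layer idea above can be sketched as follows — a minimal illustration only, in which an image is a 2-D list of gray values and two toy operations stand in for the real denoising and contrast-enhancement passes described later:

```python
def layered_process(image, operations):
    """Copy the source image once per operation and apply each operation
    to its own copy, yielding one 'layered image' per operation."""
    layers = []
    for op in operations:
        copy = [row[:] for row in image]  # independent duplicate image
        layers.append(op(copy))
    return layers

# Toy stand-ins for the per-layer operations (not the real algorithms).
dim = lambda img: [[max(0, g - 10) for g in row] for row in img]
bright = lambda img: [[min(255, g + 10) for g in row] for row in img]

src = [[100, 120], [140, 160]]
layers = layered_process(src, [dim, bright])
```

Because every operation works on its own copy, the original image is left intact for later comparison (as step S103 requires).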
Step S102 may further include: step S1, performing noise removal on one of the duplicate images to obtain a first layered image; and step S2, performing contrast enhancement on another of the duplicate images to obtain a second layered image.
Fig. 2 is a detailed flowchart of the noise-removal step S1. As shown in Fig. 2, step S1 can include:
Step S201, identifying the noise points in the duplicate image. As shown in Fig. 3, step S201 may further include sub-steps S301-S303.
In sub-step S301, the gray value of each pixel in the duplicate image is summed with the gray values of its left and right neighboring pixels and the result is averaged to obtain a calculated gray value for that pixel.
In sub-step S302, it is judged whether the absolute difference between each pixel's gray value and its calculated gray value falls within a predetermined threshold range.
In sub-step S303, pixels whose absolute difference exceeds the predetermined threshold range are identified as noise points. The predetermined threshold range can be set according to the specific situation, or according to empirical values accumulated in previous noise identification and processing.
Step S202: after the noise points in the duplicate image are identified, the gray value of each noise point is summed with the gray values of its eight surrounding pixels and the result is averaged to obtain a denoised gray value for that noise point. Since pixels are arranged evenly in both the vertical and horizontal directions, each pixel has eight adjacent pixels; the gray value of each noise point and the gray values of its eight neighbors are therefore summed and averaged to give the denoised gray value.
Step S203: the gray value of each noise point in the duplicate image is replaced with its denoised gray value to obtain the first layered image. After the denoised gray value of each noise point is obtained, the gray value of each noise point in the duplicate image is replaced with that value while the gray values of the other (non-noise) pixels remain unchanged, yielding the first layered image after noise removal.
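Sub-steps S301-S303 and steps S202-S203 can be sketched as below — a minimal version that assumes a 2-D list of integer gray values and, as one simplifying assumption not stated in the text, leaves border pixels untouched:

```python
def is_noise(img, r, c, thresh):
    # "Calculated gray value" (S301): mean of the pixel and its left and
    # right neighbours; the pixel is a noise point (S303) when it deviates
    # from this mean by more than `thresh`.
    calc = (img[r][c] + img[r][c - 1] + img[r][c + 1]) / 3.0
    return abs(img[r][c] - calc) > thresh

def denoise(img, thresh):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):          # border pixels left unchanged
        for c in range(1, w - 1):
            if is_noise(img, r, c, thresh):
                # S202: average of the noise point and its 8 neighbours.
                s = sum(img[r + dr][c + dc]
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1))
                out[r][c] = s // 9     # S203: replace only noise points
    return out

img = [[10, 10, 10],
       [10, 200, 10],
       [10, 10, 10]]
cleaned = denoise(img, 50)   # the 200 spike is averaged away
```

The non-noise pixels keep their original gray values, as step S203 specifies.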
During acquisition, a digital image often exhibits low contrast in the target region due to illumination or the object itself, so contrast enhancement can be applied to the image.
Fig. 4 is a flowchart of step S2, contrast enhancement of the duplicate image. As shown in Fig. 4, step S2 can include:
Step S401, dividing the duplicate image into at least two subregions.
The basic idea of contrast enhancement is to divide the image into two or more segments by gray-scale interval and apply a gray-scale transform to each segment, thereby strengthening the contrast of the image.
First, the number of subregions and the boundary thresholds dividing them can be determined by analyzing the gray histogram of the duplicate image. A gray histogram counts the frequency of pixels at each gray level, so it reveals the distribution of gray values in the duplicate image; from this distribution one can decide how many subregions to divide the image into, determine the boundary threshold between two adjacent regions — the division point — and then split the duplicate image into at least two subregions at the division points. When dividing, the number of subregions can be chosen according to the number of peaks or troughs in the image's gray histogram, with the troughs used as subregion boundary thresholds. The boundary thresholds can also be determined by training the image engine, i.e., training on a large number of images similar to the image to be recognized to find suitable thresholds; the division points can then be computed from the chosen boundary thresholds, or a threshold can be set directly on the histogram to determine them.
Step S402, performing gray-scale adjustment on each subregion separately to obtain the second layered image.
Specifically, gray-scale adjustment transforms the gray value of each pixel in each subregion according to a predefined rule, so as to emphasize the gray interval containing the target of interest and suppress the intervals of relatively little interest. A linear transform can be used, i.e., the gray values are converted by a predetermined linear formula, yielding the second layered image.
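A minimal sketch of the segment-wise linear transform in steps S401-S402, assuming a single boundary threshold and one illustrative choice of per-segment mapping (compress the darker interval, stretch the lighter one) — the actual segment count and formulas would come from the histogram analysis described above:

```python
def stretch_segment(g, lo, hi, new_lo, new_hi):
    # Linear transform of one gray interval [lo, hi] onto [new_lo, new_hi].
    return new_lo + (g - lo) * (new_hi - new_lo) / (hi - lo)

def enhance_contrast(img, boundary):
    """Two-segment piecewise-linear contrast enhancement: grays below
    `boundary` are compressed toward black, grays at or above it are
    stretched toward white. The mapping is an illustrative assumption."""
    out = []
    for row in img:
        new_row = []
        for g in row:
            if g < boundary:
                new_row.append(round(stretch_segment(g, 0, boundary,
                                                     0, boundary // 2)))
            else:
                new_row.append(round(stretch_segment(g, boundary, 255,
                                                     boundary, 255)))
        out.append(new_row)
    return out
```

With `boundary = 128`, a mid-dark pixel of 64 is pushed down to 32 while the bright end of the range is preserved, widening the gap between background and ink.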
In step S103, the obtained layered images are merged to obtain the processed image.
Fig. 5 is a detailed flowchart of step S103. As shown in Fig. 5, step S103 can include:
Step S501, taking the median of the gray values of corresponding pixels across the layered images to obtain a median gray value for each pixel.
Specifically, each layered image obtained in step S102 results from applying a different image-processing operation to an identical duplicate image, so the pixels of each layered image are still the original pixels and express the same graphical information; only the gray value of each pixel may have changed after the different operations. Taking the median of the gray values of corresponding pixels across the layered images therefore yields a suitable new gray value for each pixel.
Step S502, replacing the gray value of each pixel with its median gray value to obtain the processed image.
Specifically, in the original captured image (or in another duplicate image), the median gray value obtained for each pixel can be taken as that pixel's new gray value; adjusting each pixel's gray value to its median completes the layer merging of the layered images and yields the processed image.
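Steps S501-S502 reduce to a per-pixel median across all layers, which can be sketched directly:

```python
import statistics

def merge_layers(layers):
    """Layer merging (S103): the gray value of each pixel in the merged
    image is the median of that pixel's gray values across all layers."""
    h, w = len(layers[0]), len(layers[0][0])
    return [[statistics.median(layer[r][c] for layer in layers)
             for c in range(w)] for r in range(h)]

a = [[100, 50]]
b = [[110, 40]]
c = [[90, 45]]
merged = merge_layers([a, b, c])   # [[100, 45]]
```

With an odd number of layers the median is one of the existing gray values, so no new intensities are invented; an outlier produced by any single layer's processing is discarded.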
Optionally, after the merge is complete, pixels of the processed image that satisfy a predetermined condition can be re-colored in gray scale to mark up the image more deliberately, in view of image-quality needs. For example, the gray value of pixels that approach black (pixels whose gray value exceeds a certain value) can be increased by 2 to deepen the color of darkish pixels.
The processed image can also be compared with the original image: the gray value of each pixel of the processed image is subtracted from the gray value of the corresponding pixel of the original image to obtain a gray-value difference for each pixel, and it is judged whether the absolute value of that difference exceeds a predetermined threshold; if it does, the gray value of that pixel needs to be adjusted as well.
In step S104, the image of each character is extracted from the processed image.
Referring to Fig. 6, a detailed flowchart of step S104: to extract each character, texture analysis can first determine the position of the text image within the processed image, and character segmentation is then applied to the text image to extract each character.
As shown in Fig. 6, step S104 can include steps S601 and S602.
In step S601, the position of the text image within the processed image is obtained. Referring to Fig. 7, a detailed flowchart of step S601, this can include the following steps:
Step S701, identifying the edge texture in each row of pixels by edge detection. Edge texture refers to regions of the image where the gray level changes sharply; it can be identified by setting a predetermined threshold variation range and finding the regions where the gray-level change exceeds that range.
Step S702, building a histogram of the edge texture of each row of pixels, and determining a recognition threshold for edge primitives from an analysis of the histogram. An edge primitive can be a pixel whose gray value lies within a predetermined threshold range. The recognition threshold for edge primitives can be a dynamic threshold computed by an adaptive thresholding algorithm.
Step S703, counting the number of edge primitives in each row according to the recognition threshold, and recording the start and end positions of the edge primitives in each row.
Step S704, identifying the non-blank rows in the processed image. Based on the gray histogram of the processed image, rows whose gray range (the difference between the maximum and minimum gray values) is below a predetermined threshold can be identified as blank rows, and the rest as non-blank rows. For example, rows whose gray range is less than 5% of the amplitude between the maximum and minimum gray values in the histogram can be identified as blank rows. Blank rows are treated as blank background and excluded from subsequent processing; only non-blank rows remain processing targets. The predetermined threshold is a variable obtained by training on a variety of sample pictures: for instance, for the license pictures covered by current training, it can be set to 5% of the amplitude between the maximum and minimum gray values in the gray histogram, while for other picture types it can be set according to the results of training on those types.
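The blank-row test in step S704 can be sketched as follows, using the 5% figure quoted above as the default fraction:

```python
def non_blank_rows(img, frac=0.05):
    """Return indices of non-blank rows (S704): a row whose gray range
    is below `frac` of the image's overall gray amplitude is treated as
    blank background. The 5% default is the trained value the text
    quotes for license pictures."""
    flat = [g for row in img for g in row]
    amplitude = max(flat) - min(flat)          # overall gray range
    thresh = frac * amplitude
    return [r for r, row in enumerate(img)
            if max(row) - min(row) >= thresh]

img = [[200, 200, 201],    # nearly uniform -> blank background
       [200, 40, 210],     # text-like variation -> non-blank
       [199, 200, 200]]    # nearly uniform -> blank background
rows = non_blank_rows(img)  # [1]
```

Only the returned rows would be carried forward into the per-row judgment of step S705.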
Step S705, judging whether the current non-blank row satisfies a preset condition and, if so, proceeding to detect the next non-blank row. The preset condition can be determined from the results obtained by feeding a large number of character samples into a BP neural network for training and study — for example, judging whether the number of edge primitives in each row reaches a predetermined count.
Step S706, when more than a predetermined number of consecutive non-blank rows satisfy the preset condition, determining the position of the text image from the start and end positions of the edge primitives of each non-blank row.
Steps S701-S706 for determining the position of the text image within the processed image are not limited to the order above and can be executed in other orders; for example, the non-blank rows in the processed image can be identified first, and the identification and judgment steps then applied to the identified non-blank rows.
In step S602, character segmentation is performed on the text image to extract the image of each character in the text image.
The text image is segmented by row cutting and character splitting using the projection method to extract each character's image. Row cutting separates the characters line by line, producing single-line character text images; it can be done by horizontal projection along the row direction, identifying the blank gaps between text lines. Character splitting then cuts the individual character images out of each single-line character text image, yielding a single character image for each character.
In step S105, character recognition is performed on each extracted character image.
The characters can be recognized using a BP neural network: the image of each character is fed into the BP neural network system for character recognition.
The BP neural network can be trained in advance on the image matrices of character samples: the images of the character samples are first normalized to obtain an image matrix for each sample, and the image matrices of the samples are then used for BP (error back-propagation) training and study of the network.
When each character image is to be recognized, it is fed into the BP neural network for character recognition.
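A toy illustration of the BP (error back-propagation) training the text refers to — a minimal one-hidden-layer network with sigmoid activations and squared-error loss, trained on two made-up 2x2 "glyphs"; the architecture, learning rate, and glyphs are all assumptions for illustration, not the patent's actual configuration:

```python
import math
import random

def train_bp(samples, labels, hidden=4, epochs=1000, lr=0.5, seed=0):
    """Train a one-hidden-layer back-propagation network (no biases).
    `samples` are flattened, normalised character matrices; `labels`
    are one-hot lists, one output unit per character class."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0]), len(labels[0])
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(n_out)]
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))

    def forward(x):
        h = [sig(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        o = [sig(sum(w * hi for w, hi in zip(row, h))) for row in w2]
        return h, o

    for _ in range(epochs):
        for x, t in zip(samples, labels):
            h, o = forward(x)
            # Output-layer error terms, then propagate back to the hidden layer.
            d_o = [(ti - oi) * oi * (1 - oi) for ti, oi in zip(t, o)]
            d_h = [hi * (1 - hi) * sum(d_o[k] * w2[k][j] for k in range(n_out))
                   for j, hi in enumerate(h)]
            for k in range(n_out):
                for j in range(hidden):
                    w2[k][j] += lr * d_o[k] * h[j]
            for j in range(hidden):
                for i in range(n_in):
                    w1[j][i] += lr * d_h[j] * x[i]

    def predict(x):
        return max(range(n_out), key=lambda k: forward(x)[1][k])
    return predict

# Two hypothetical 2x2 "glyphs", flattened and already normalised to [0, 1].
glyph_a = [1, 0, 0, 1]   # diagonal stroke
glyph_b = [0, 1, 1, 0]   # anti-diagonal stroke
predict = train_bp([glyph_a, glyph_b], [[1, 0], [0, 1]])
```

In the patent's pipeline, the inputs would instead be the normalized image matrices of real character samples, with one output unit per character in the recognizable set.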
This application also provides a device for recognizing characters on printed matter. Fig. 8 is a structural diagram of a device 800 for recognizing characters on printed matter according to an embodiment of this application. As shown in the figure, the device 800 can include: an acquisition module 810, a layered-processing module 820, a layer-merging module 830, an extraction module 840, and a recognition module 850.
The acquisition module 810 can be used to photograph the printed matter to obtain an image to be recognized.
The layered-processing module 820 can be used to duplicate the image to obtain at least two duplicate images, and to perform a different image-processing operation on each duplicate image to obtain at least two layered images.
The layer-merging module 830 can be used to merge the obtained layered images to obtain a processed image.
The extraction module 840 can be used to extract the image of each character from the processed image.
The recognition module 850 can be used to perform character recognition on each extracted character image.
According to one embodiment of this application, the acquisition module 810 can further be used to apply exposure settings according to a predetermined condition when photographing.
According to one embodiment of this application, the layered-processing module 820 can include a denoising module and a contrast-enhancement module.
The denoising module can be used to perform noise removal on one of the duplicate images to obtain the first layered image.
The contrast-enhancement module can be used to perform contrast enhancement on another of the duplicate images to obtain the second layered image.
According to one embodiment of this application, the denoising module can include a noise-identification module, a denoised-gray-value acquisition module, and a noise-removal module.
The noise-identification module can be used to identify the noise points in the duplicate image.
The denoised-gray-value acquisition module can be used to sum the gray value of each noise point with the gray values of its eight surrounding pixels and average the result as the denoised gray value of that noise point.
The noise-removal module can be used to replace the gray value of each noise point in the duplicate image with its denoised gray value to obtain the first layered image.
According to one embodiment of the application, the noise identification module may include a calculation submodule, a judgment submodule, and an identification submodule.
The calculation submodule may be configured to sum the gray value of each pixel in the duplicate image with the gray values of its left and right neighbor pixels and take the average as the calculated gray value of that pixel.
The judgment submodule may be configured to judge whether the absolute difference between each pixel's gray value and its calculated gray value falls within a predetermined threshold range.
The identification submodule may be configured to identify as noise any pixel whose absolute difference between gray value and calculated gray value exceeds the predetermined threshold range.
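The identification rule above can be sketched for a single pixel row. The handling of the first and last pixels (which lack one neighbor) is an assumption, as is treating the threshold range as a single upper bound.

```python
import numpy as np

def identify_noise(row, threshold):
    """For each pixel in a row, average its gray value with those of its
    left and right neighbors; flag the pixel as noise when the absolute
    difference between its gray value and that average exceeds the
    threshold. Border pixels reuse their own value for the missing
    neighbor (an assumption)."""
    row = row.astype(np.float64)
    left = np.roll(row, 1)
    left[0] = row[0]          # no left neighbor at the start of the row
    right = np.roll(row, -1)
    right[-1] = row[-1]       # no right neighbor at the end of the row
    avg = (row + left + right) / 3.0
    return np.abs(row - avg) > threshold
```

With this rule an isolated spike deviates from the local average by two thirds of its height, while its neighbors deviate by only one third, so a suitable threshold flags only the spike.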
According to one embodiment of the application, the contrast enhancement module may include an image partition module and a gray adjustment module.
The image partition module may be configured to divide the duplicate image into at least two sub-regions.
The gray adjustment module may be configured to perform gray adjustment on each sub-region separately, to obtain the second layered image.
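A minimal sketch of per-sub-region gray adjustment follows. The description does not fix the partition scheme or the adjustment formula; the horizontal bands and the min-max linear stretch below are illustrative assumptions only.

```python
import numpy as np

def stretch_regions(image, n_bands):
    """Split the image into n_bands horizontal sub-regions and apply a
    linear min-max contrast stretch to each band independently, so each
    sub-region uses the full 0..255 gray range."""
    out = np.empty_like(image, dtype=np.uint8)
    for band in np.array_split(np.arange(image.shape[0]), n_bands):
        sub = image[band].astype(np.float64)
        lo, hi = sub.min(), sub.max()
        if hi > lo:
            sub = (sub - lo) / (hi - lo) * 255.0  # stretch to [0, 255]
        else:
            sub = np.zeros_like(sub)              # flat region: arbitrary choice
        out[band] = sub.round().astype(np.uint8)
    return out
```

Adjusting each sub-region separately lets a dark band and a bright band of the same print both reach good local contrast, which a single global stretch cannot do.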
According to one embodiment of the application, the merging module 830 may include a value-taking module and a gray value replacement module.
The value-taking module may be configured to take the median of the gray values of corresponding pixels across the layered images, obtaining a gray value median for each pixel.
The gray value replacement module may be configured to replace the gray value of each pixel with that pixel's gray value median, thereby obtaining the processed image.
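The pixel-wise median merge described above can be sketched directly with NumPy. Note that with an even number of layers `np.median` averages the two middle values, which the description leaves unspecified.

```python
import numpy as np

def merge_layers(layers):
    """Merge layered images by taking, at every pixel position, the
    median of the gray values across all layers, and using that median
    as the merged gray value."""
    stack = np.stack([l.astype(np.float64) for l in layers], axis=0)
    return np.median(stack, axis=0).round().astype(np.uint8)
```

The median keeps a pixel value that at least one well-processed layer agrees on and discards outlier values that only a single layer produced.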
According to one embodiment of the application, the extraction module 840 may include:
a position acquisition module, which may be configured to obtain the position of the text image within the processed image; and
a character segmentation module, which may be configured to perform character segmentation on the text image and extract the image of each character in the text image.
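The description does not specify how character segmentation is performed; a common choice, sketched here purely as an assumption, is vertical projection: columns with no foreground pixels separate adjacent characters.

```python
import numpy as np

def segment_characters(binary_line):
    """Split a binarized text-line image (foreground = 1, background = 0)
    into per-character sub-images. Columns whose foreground pixel count
    is zero act as separators between characters."""
    col_counts = binary_line.sum(axis=0)
    chars, start = [], None
    for c, count in enumerate(col_counts):
        if count > 0 and start is None:
            start = c                          # character begins
        elif count == 0 and start is not None:
            chars.append(binary_line[:, start:c])  # character ends
            start = None
    if start is not None:                      # character touching right edge
        chars.append(binary_line[:, start:])
    return chars
```

This simple scheme fails for touching or overlapping glyphs, which would need a more elaborate segmenter.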
According to one embodiment of the application, the position acquisition module may further include an edge detection module, a threshold acquisition module, a statistics recording module, a non-blank row identification module, a condition judgment module, and a position determination module.
The edge detection module may be configured to identify edge textures in each row of pixels by edge detection, where an edge texture is a region in which the gray value changes sharply.
The threshold acquisition module may be configured to build a histogram of the edge textures of each pixel row and determine a recognition threshold for edge primitives by analyzing the histogram.
The statistics recording module may be configured to count the number of edge primitives in each row according to the recognition threshold of the edge primitives, and to record the start and end positions of the edge primitives in each row.
The non-blank row identification module may be configured to identify non-blank rows in the processed image.
The condition judgment module may be configured to judge whether the current non-blank row satisfies a preset condition and, if so, to proceed to detect the next non-blank row.
The position determination module may be configured to determine the position of the text image from the start and end positions of the edge primitives of each non-blank row, once more than a predetermined number of consecutive non-blank rows satisfying the preset condition have been detected.
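The row-wise localization flow can be sketched as follows. The horizontal-gradient edge detector, the fixed thresholds, and the "preset condition" reduced to a minimum edge count are all illustrative assumptions; the description leaves these details open.

```python
import numpy as np

def locate_text_rows(image, edge_threshold, min_edges, min_rows):
    """Count edge pixels per row (simple horizontal gradient), mark rows
    whose edge count reaches min_edges as non-blank, and return the
    (start_row, end_row) spans formed by at least min_rows consecutive
    non-blank rows: the candidate vertical positions of text lines."""
    grad = np.abs(np.diff(image.astype(np.int32), axis=1))
    edge_counts = (grad > edge_threshold).sum(axis=1)
    non_blank = edge_counts >= min_edges
    spans, start = [], None
    for r, flag in enumerate(non_blank):
        if flag and start is None:
            start = r                          # run of non-blank rows begins
        elif not flag and start is not None:
            if r - start >= min_rows:          # keep only long-enough runs
                spans.append((start, r - 1))
            start = None
    if start is not None and len(non_blank) - start >= min_rows:
        spans.append((start, len(non_blank) - 1))
    return spans
```

Requiring a run of consecutive qualifying rows, as the position determination module does, suppresses isolated noisy rows that would otherwise be mistaken for text.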
According to one embodiment of the application, the recognition module 850 may be further configured to perform character recognition on the image of each character using a BP (backpropagation) neural network.
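A toy BP network is sketched below: one sigmoid hidden layer trained by backpropagation to map flattened character images to one-hot class labels. The network size, learning rate, and four-pixel "characters" are purely illustrative; a real recognizer would use far larger inputs and a proper training set.

```python
import numpy as np

def train_bp(X, y, hidden=8, epochs=3000, lr=0.5, seed=0):
    """Train a minimal backpropagation network (one sigmoid hidden layer,
    sum-of-squares loss) on inputs X and one-hot labels y."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    W2 = rng.normal(0, 0.5, (hidden, y.shape[1]))
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        h = sig(X @ W1)                      # forward: hidden activations
        o = sig(h @ W2)                      # forward: output activations
        d_o = (o - y) * o * (1 - o)          # output-layer delta
        d_h = (d_o @ W2.T) * h * (1 - h)     # backpropagated hidden delta
        W2 -= lr * h.T @ d_o                 # gradient-descent updates
        W1 -= lr * X.T @ d_h
    return W1, W2, sig

def predict(W1, W2, sig, X):
    """Classify each row of X as the index of the strongest output unit."""
    return sig(sig(X @ W1) @ W2).argmax(axis=1)
```

In practice each extracted character image would be normalized to a fixed size, flattened into a row of X, and the argmax output mapped back to a character code.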
Since the functions implemented by the device of this embodiment largely correspond to the method embodiments shown in Figs. 1 to 7 above, details not covered in the description of this embodiment can be found in the related descriptions of the earlier embodiments and are not repeated here.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data.
Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", and any of their variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Absent further limitation, an element qualified by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical memory) containing computer-usable program code.
The foregoing is merely embodiments of the present application and is not intended to limit it. Various modifications and variations of the present application will occur to those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the scope of its claims.
Claims (18)
- 1. A method of recognizing characters on a printed product, comprising:
photographing the printed product to obtain an image to be recognized;
duplicating the image to obtain at least two duplicate images, and applying a different image process to each duplicate image to obtain at least two layered images;
merging the obtained layered images into a processed image;
extracting an image of each character from the processed image; and
performing character recognition on the extracted image of each character;
wherein applying a different image process to each duplicate image to obtain at least two layered images comprises:
analyzing the gray histogram of one of the duplicate images to obtain the gray value distribution of that duplicate image, dividing the duplicate image into a plurality of sub-regions according to the gray value distribution, and performing gray adjustment on each sub-region separately to obtain a second layered image.
- 2. The method according to claim 1, wherein photographing the printed product to obtain the image to be recognized comprises: applying exposure settings according to a predetermined condition when photographing.
- 3. The method according to claim 1, wherein applying a different image process to each duplicate image to obtain at least two layered images comprises: performing noise removal on another of the duplicate images to obtain a first layered image.
- 4. The method according to claim 3, wherein performing noise removal on another of the duplicate images to obtain the first layered image comprises:
identifying noise points in the duplicate image;
summing the gray value of each noise point with the gray values of its eight surrounding neighbor pixels and taking the average as the denoising gray value of that noise point; and
replacing the gray value of each noise point in the duplicate image with its denoising gray value to obtain the first layered image.
- 5. The method according to claim 4, wherein identifying noise points in the duplicate image comprises:
summing the gray value of each pixel in the duplicate image with the gray values of its left and right neighbor pixels and taking the average as the calculated gray value of that pixel;
judging whether the absolute difference between each pixel's gray value and its calculated gray value is within a predetermined threshold range; and
identifying as noise any pixel whose absolute difference between gray value and calculated gray value exceeds the predetermined threshold range.
- 6. The method according to claim 1, wherein merging the layered images into the processed image comprises:
taking the median of the gray values of corresponding pixels across the layered images to obtain a gray value median for each pixel; and
replacing the gray value of each pixel with that pixel's gray value median to obtain the processed image.
- 7. The method according to claim 1, wherein extracting the image of each character from the processed image comprises:
obtaining the position of the text image within the processed image; and
performing character segmentation on the text image to extract the image of each character in the text image.
- 8. The method according to claim 7, wherein obtaining the position of the text image within the processed image comprises:
identifying edge textures in each row of pixels by edge detection;
building a histogram of the edge textures of each pixel row and determining a recognition threshold for edge primitives by analyzing the histogram;
counting the number of edge primitives in each row according to the recognition threshold, and recording the start and end positions of the edge primitives in each row;
identifying non-blank rows in the processed image;
judging whether the current non-blank row satisfies a preset condition and, if so, proceeding to detect the next non-blank row; and
determining the position of the text image from the start and end positions of the edge primitives of each non-blank row when more than a predetermined number of consecutive non-blank rows satisfying the preset condition have been detected.
- 9. The method according to claim 1, wherein performing character recognition on the extracted image of each character comprises: performing character recognition on the image of each character using a BP neural network.
- 10. A device for recognizing characters on a printed product, comprising:
an acquisition module, configured to photograph the printed product to obtain an image to be recognized;
a hierarchical processing module, configured to duplicate the image to obtain at least two duplicate images, and to apply a different image process to each duplicate image to obtain at least two layered images;
a layer merging module, configured to merge the obtained layered images into a processed image;
an extraction module, configured to extract an image of each character from the processed image; and
a recognition module, configured to perform character recognition on the extracted image of each character;
wherein applying a different image process to each duplicate image to obtain at least two layered images comprises:
analyzing the gray histogram of one of the duplicate images to obtain the gray value distribution of that duplicate image, dividing the duplicate image into a plurality of sub-regions according to the gray value distribution, and performing gray adjustment on each sub-region separately to obtain a second layered image.
- 11. The device according to claim 10, wherein the acquisition module is further configured to apply exposure settings according to a predetermined condition when photographing.
- 12. The device according to claim 10, wherein the hierarchical processing module comprises:
a denoising module, configured to perform noise removal on another of the duplicate images to obtain a first layered image.
- 13. The device according to claim 12, wherein the denoising module comprises:
a noise identification module, configured to identify noise points in the duplicate image;
a denoising gray value acquisition module, configured to sum the gray value of each noise point with the gray values of its eight surrounding neighbor pixels and take the average as the denoising gray value of that noise point; and
a noise removal module, configured to replace the gray value of each noise point in the duplicate image with its denoising gray value to obtain the first layered image.
- 14. The device according to claim 13, wherein the noise identification module comprises:
a calculation submodule, configured to sum the gray value of each pixel in the duplicate image with the gray values of its left and right neighbor pixels and take the average as the calculated gray value of that pixel;
a judgment submodule, configured to judge whether the absolute difference between each pixel's gray value and its calculated gray value is within a predetermined threshold range; and
an identification submodule, configured to identify as noise any pixel whose absolute difference between gray value and calculated gray value exceeds the predetermined threshold range.
- 15. The device according to claim 10, wherein the merging module comprises:
a value-taking module, configured to take the median of the gray values of corresponding pixels across the layered images to obtain a gray value median for each pixel; and
a gray value replacement module, configured to replace the gray value of each pixel with that pixel's gray value median to obtain the processed image.
- 16. The device according to claim 10, wherein the extraction module comprises:
a position acquisition module, configured to obtain the position of the text image within the processed image; and
a character segmentation module, configured to perform character segmentation on the text image and extract the image of each character in the text image.
- 17. The device according to claim 16, wherein the position acquisition module comprises:
an edge detection module, configured to identify edge textures in each row of pixels by edge detection;
a threshold acquisition module, configured to build a histogram of the edge textures of each pixel row and determine a recognition threshold for edge primitives by analyzing the histogram;
a statistics recording module, configured to count the number of edge primitives in each row according to the recognition threshold of the edge primitives, and to record the start and end positions of the edge primitives in each row;
a non-blank row identification module, configured to identify non-blank rows in the processed image;
a condition judgment module, configured to judge whether the current non-blank row satisfies a preset condition and, if so, to proceed to detect the next non-blank row; and
a position determination module, configured to determine the position of the text image from the start and end positions of the edge primitives of each non-blank row when more than a predetermined number of consecutive non-blank rows satisfying the preset condition have been detected.
- 18. The device according to claim 10, wherein the recognition module is further configured to perform character recognition on the image of each character using a BP neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310331468.4A CN104346609B (en) | 2013-08-01 | 2013-08-01 | Method and device for recognizing characters on a printed product
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310331468.4A CN104346609B (en) | 2013-08-01 | 2013-08-01 | Method and device for recognizing characters on a printed product
Publications (2)
Publication Number | Publication Date |
---|---|
CN104346609A CN104346609A (en) | 2015-02-11 |
CN104346609B true CN104346609B (en) | 2018-05-04 |
Family
ID=52502183
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310331468.4A Active CN104346609B (en) | Method and device for recognizing characters on a printed product | 2013-08-01 | 2013-08-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104346609B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104978578B (en) * | 2015-04-21 | 2018-07-27 | 深圳市点通数据有限公司 | Mobile phone photograph text image method for evaluating quality |
CN105160304A (en) * | 2015-08-10 | 2015-12-16 | 中山大学 | Method and device for sign text identification based on machine vision |
CN105787480B (en) * | 2016-02-26 | 2020-01-03 | 广东小天才科技有限公司 | Method and device for shooting test questions |
CN107145734B (en) * | 2017-05-04 | 2020-08-28 | 深圳市联新移动医疗科技有限公司 | Automatic medical data acquisition and entry method and system |
CN107545460B (en) * | 2017-07-25 | 2020-12-18 | 广州智选网络科技有限公司 | Digital color page promotion management and analysis method, storage device and mobile terminal |
CN110135288B (en) * | 2019-04-28 | 2023-04-18 | 佛山科学技术学院 | Method and device for quickly checking electronic certificate |
CN110929738A (en) * | 2019-11-19 | 2020-03-27 | 上海眼控科技股份有限公司 | Certificate card edge detection method, device, equipment and readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006031163A (en) * | 2004-07-13 | 2006-02-02 | Ricoh Co Ltd | Character recognition result processor, character recognition result processing method, character recognition result processing program and recording medium with the same program stored |
CN1756311A (en) * | 2004-09-29 | 2006-04-05 | 乐金电子(惠州)有限公司 | Image switching method and its apparatus |
CN102289792A (en) * | 2011-05-03 | 2011-12-21 | 北京云加速信息技术有限公司 | Method and system for enhancing low-illumination video image |
CN102663382A (en) * | 2012-04-25 | 2012-09-12 | 重庆邮电大学 | Video image character recognition method based on submesh characteristic adaptive weighting |
- 2013-08-01 CN: application CN201310331468.4A filed; granted as patent CN104346609B, status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006031163A (en) * | 2004-07-13 | 2006-02-02 | Ricoh Co Ltd | Character recognition result processor, character recognition result processing method, character recognition result processing program and recording medium with the same program stored |
CN1756311A (en) * | 2004-09-29 | 2006-04-05 | 乐金电子(惠州)有限公司 | Image switching method and its apparatus |
CN102289792A (en) * | 2011-05-03 | 2011-12-21 | 北京云加速信息技术有限公司 | Method and system for enhancing low-illumination video image |
CN102663382A (en) * | 2012-04-25 | 2012-09-12 | 重庆邮电大学 | Video image character recognition method based on submesh characteristic adaptive weighting |
Non-Patent Citations (5)
Title |
---|
"Photoshop照片模糊变清晰大全";sou6;《http://www.31ian.com/edu/2012/04-03/24407.html?&from=androidqq》;20120403;第1-6页 * |
"PS几种处理模糊照片变清晰的方法";小照;《http://www.31ian.com/edu/2012/07-21/32980.html》;20120721;第1-4页 * |
"利用PS把不清晰的照片改清晰";益彩足球;《http://jingyan.baidu.com/artic1e/f3ad7d0fdc433a09c3345b0b.html》;20120508;第1-6页 * |
"同源视频检索与商标货号识别";张惠;《中国优秀硕士学位论文全文数据库 信息科技辑》;20110915;第31-44、54页 * |
"快速提高照片清晰度";zhoulpwen;《http://jingyan.baidu.com/artic1e/fec4bce20ae348f2608d8b64.html》;20110406;第1-4页 * |
Also Published As
Publication number | Publication date |
---|---|
CN104346609A (en) | 2015-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104346609B (en) | Method and device for recognizing characters on a printed product | |
CN110766736B (en) | Defect detection method, defect detection device, electronic equipment and storage medium | |
Afzal et al. | Document image binarization using lstm: A sequence learning approach | |
US9251614B1 (en) | Background removal for document images | |
US20170220836A1 (en) | Fingerprint classification system and method using regular expression machines | |
US11836969B2 (en) | Preprocessing images for OCR using character pixel height estimation and cycle generative adversarial networks for better character recognition | |
CN108197644A (en) | An image recognition method and device | |
CN106651774B (en) | License plate super-resolution model reconstruction method and device | |
CN109165538A (en) | Bar code detection method and device based on deep neural network | |
CN111192241B (en) | Quality evaluation method and device for face image and computer storage medium | |
Ntogas et al. | A binarization algorithm for historical manuscripts | |
CN113191358B (en) | Metal part surface text detection method and system | |
CN106599891A (en) | Remote sensing image region-of-interest rapid extraction method based on scale phase spectrum saliency | |
Ding et al. | Smoothing identification for digital image forensics | |
Siddiqui et al. | Block-based feature-level multi-focus image fusion | |
CN112488137A (en) | Sample acquisition method and device, electronic equipment and machine-readable storage medium | |
CN111144425A (en) | Method and device for detecting screen shot picture, electronic equipment and storage medium | |
CN113920434A (en) | Image reproduction detection method, device and medium based on target | |
Shobha Rani et al. | Restoration of deteriorated text sections in ancient document images using a tri-level semi-adaptive thresholding technique | |
CN117373136A (en) | Face counterfeiting detection method based on frequency mask and attention consistency | |
Jin et al. | A color image segmentation method based on improved K-means clustering algorithm | |
CN110276260B (en) | Commodity detection method based on depth camera | |
CN109643451B (en) | Line detection method | |
Wang | An algorithm for ATM recognition of spliced money based on image features | |
Roe et al. | Thresholding color images of historical documents with preservation of the visual quality of graphical elements |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20191203 Address after: P.O. Box 31119, Grand Pavilion, Hibiscus Way, 802 West Bay Road, Grand Cayman, Cayman Islands Patentee after: Innovative Advanced Technology Co., Ltd. Address before: P.O. Box 847, 4th Floor, Capital Building, Grand Cayman, Cayman Islands Patentee before: Alibaba Group Holding Ltd.
TR01 | Transfer of patent right |