CN109598271A - Character segmentation method and device - Google Patents

Publication number
CN109598271A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811504027.9A
Other languages
Chinese (zh)
Other versions
CN109598271B (en
Inventor
罗熹之
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201811504027.9A
Publication of CN109598271A
Application granted
Publication of CN109598271B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Abstract

This application discloses a character segmentation method, comprising: obtaining a character region image and generating a grayscale image corresponding to the character region image, the character region image being the image corresponding to a character region on a printed card; processing the grayscale image based on the Canny edge detection algorithm to obtain a binarized image; projecting the binarized image along the direction perpendicular to a preset segmentation direction to obtain a statistics array; applying a linear transformation to the statistics array so that its elements are normalized; and matching the normalized statistics array against preset multi-character template arrays and segmenting characters according to the matching result to obtain a single-character rectangle string. Because the method processes the grayscale image of the character region with the Canny edge detection algorithm, it can overcome interference from background color blocks, and because it matches against elastic templates, it improves character segmentation accuracy. This application also discloses a character segmentation device.

Description

Character segmentation method and device
Technical field
This application relates to the field of image processing, and in particular to a character segmentation method and device.
Background art
With the development of technology, printed bank cards have become an increasingly widely used type of bank card. In some cases, automatically recognizing the characters on a bank card spares the user from manually entering the card number, verification code, and so on. On the one hand, this avoids input errors caused by human mistakes and improves the accuracy of the entered information; on the other hand, it simplifies user operation and improves the user experience.
To recognize the characters on a bank card, the characters must first be segmented. The industry currently offers a method that segments characters based on the morphological gradient. The background of a printed bank card usually consists of assorted color blocks, and character edge features obtained with the morphological gradient are easily disturbed by these color blocks. This causes large errors during template matching, which in turn makes the segmentation inaccurate.
A character segmentation method for printed bank cards is therefore urgently needed to solve the technical problem of inaccurate segmentation caused by segmenting characters based on the morphological gradient.
Summary of the invention
In view of this, this application provides a character segmentation method. The method processes the grayscale image of the character region with the Canny edge detection algorithm, which can overcome interference from background color blocks, and matches against elastic templates, which improves character segmentation accuracy. Correspondingly, this application also provides a character segmentation device.
A first aspect of this application provides a character segmentation method, the method comprising:
obtaining a character region image and generating a grayscale image corresponding to the character region image, the character region image being the image corresponding to a character region on a printed card;
processing the grayscale image based on the Canny edge detection algorithm to obtain a binarized image;
projecting the binarized image along the direction perpendicular to a preset segmentation direction to obtain a statistics array;
applying a linear transformation to the statistics array so that the elements of the statistics array are normalized;
matching the normalized statistics array against preset multi-character template arrays, and segmenting the character region image according to the matching result to obtain a single-character rectangle string, the single-character rectangle string comprising the rectangular region corresponding to each character in the character region.
Optionally, the multi-character template arrays include a template array corresponding to at least one character combination, and the length of a multi-character template array is determined by the number of character groups in the character combination, the number of characters, the character width, and the ratio of the group spacing to the character width.
Optionally, the printed card includes a printed bank card, and the character region includes any one or more of a bank card number region, a verification code region, or a validity period region.
Optionally, before the grayscale image is processed based on the Canny edge detection algorithm, the method further comprises:
applying Gaussian noise reduction to the grayscale image.
Optionally, processing the grayscale image based on the Canny edge detection algorithm to obtain a binarized image comprises:
computing the gradient magnitude of the grayscale image using finite differences of the first-order partial derivatives;
applying non-maximum suppression according to the gradient magnitude to obtain the binarized image.
Optionally, before the binarized image is projected along the direction perpendicular to the preset segmentation direction, the method further comprises:
applying a closing operation to the binarized image according to a double-threshold algorithm, so that the image edges in the binarized image are closed.
A second aspect of this application provides a character segmentation device, the device comprising:
a generation module, configured to obtain a character region image and generate a grayscale image corresponding to the character region image, the character region image being the image corresponding to a character region on a printed card;
an edge detection module, configured to process the grayscale image based on the Canny edge detection algorithm to obtain a binarized image;
a projection module, configured to project the binarized image along the direction perpendicular to a preset segmentation direction to obtain a statistics array;
a transformation module, configured to apply a linear transformation to the statistics array so that the elements of the statistics array are normalized;
a segmentation module, configured to match the normalized statistics array against preset multi-character template arrays and segment the character region image according to the matching result to obtain a single-character rectangle string, the single-character rectangle string comprising the rectangular region corresponding to each character in the character region.
Optionally, the multi-character template arrays include a template array corresponding to at least one character combination, and the length of a multi-character template array is determined by the number of character groups in the character combination, the number of characters, the character width, and the ratio of the group spacing to the character width.
Optionally, the printed card includes a printed bank card, and the character region includes any one or more of a bank card number region, a verification code region, or a validity period region.
Optionally, the device further comprises:
a noise reduction module, configured to apply Gaussian noise reduction to the grayscale image before the grayscale image is processed based on the Canny edge detection algorithm.
Optionally, the edge detection module is specifically configured to:
compute the gradient magnitude of the grayscale image using finite differences of the first-order partial derivatives;
apply non-maximum suppression according to the gradient magnitude to obtain the binarized image.
Optionally, the device further comprises:
a computing module, configured to apply a closing operation to the binarized image according to a double-threshold algorithm before the binarized image is projected along the direction perpendicular to the preset segmentation direction, so that the image edges in the binarized image are closed.
As can be seen from the above technical solutions, the embodiments of this application have the following advantages:
The embodiments of this application provide a character segmentation method. The method first obtains a character region image, that is, the image corresponding to a character region on a printed card, and generates the corresponding grayscale image. It then processes the grayscale image based on the Canny edge detection algorithm to obtain a binarized image, projects the binarized image along the direction perpendicular to a preset segmentation direction to obtain a statistics array, and applies a linear transformation to the statistics array so that its elements are normalized. Finally, it matches the normalized statistics array against preset multi-character template arrays and segments the character region image according to the matching result to obtain a single-character rectangle string, which comprises the rectangular region corresponding to each character in the character region. Because the method processes the grayscale image of the character region with the Canny edge detection algorithm, it can overcome interference from background color blocks, and because it matches against elastic templates, it improves character segmentation accuracy.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a scenario architecture diagram of a character segmentation method in an embodiment of this application;
Fig. 2 is a flowchart of a character segmentation method in an embodiment of this application;
Fig. 3A is a schematic diagram of the template matching result when edge features are obtained using the morphological gradient in an embodiment of this application;
Fig. 3B is a schematic diagram of the template matching result when edge features are obtained using Canny edge detection in an embodiment of this application;
Fig. 4 is a schematic structural diagram of a character segmentation device in an embodiment of this application.
Detailed description of the embodiments
To help those skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are merely some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The terms "first", "second", "third", "fourth", and the like (if present) in the description, the claims, and the above drawings of this application are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can, for example, be implemented in an order other than the one illustrated or described herein. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
In the prior art, character edge features obtained with the morphological gradient are disturbed by background color blocks, which causes large errors during template matching and in turn makes the segmentation inaccurate. To address this technical problem, this application provides a character segmentation method. The method first obtains a character region image, that is, the image corresponding to a character region on a printed card, and generates the corresponding grayscale image. It then processes the grayscale image based on the Canny edge detection algorithm to obtain a binarized image, projects the binarized image along the direction perpendicular to a preset segmentation direction to obtain a statistics array, and applies a linear transformation to the statistics array so that its elements are normalized. Finally, it matches the normalized statistics array against preset multi-character template arrays and segments the character region image according to the matching result to obtain a single-character rectangle string, which comprises the rectangular region corresponding to each character in the character region.
Because the method processes the grayscale image of the character region with the Canny edge detection algorithm, it can overcome interference from background color blocks. It also matches against elastic templates, which include templates formed from character combinations of different styles and categories. By comparing the matching results for templates of different styles and categories and segmenting the character region image according to the result with the higher matching degree, the method can improve character segmentation accuracy.
It can be understood that the method can be applied to any data processing device with image processing capability. The data processing device may be a terminal or a server. The terminal may be any user equipment, existing, under development, or developed in the future, that can interact over any form of wired and/or wireless connection (for example, Wi-Fi, LAN, cellular, or coaxial cable), including but not limited to smartphones, tablet computers, laptop personal computers, and desktop personal computers. The server is a device that provides computing services. In this embodiment, the data processing device may be a standalone device or a cluster formed by multiple data processing devices.
In a specific implementation, the character segmentation method provided by this application is stored on the data processing device in the form of an application program, and the data processing device carries out the method by executing the application program. The application program may be a standalone application, or a functional module, plug-in, mini program, or the like integrated into another application.
For ease of description, the character segmentation method provided by this application is introduced below using a server as an example, in combination with a concrete scenario.
Referring to the scenario architecture diagram of the character segmentation method shown in Fig. 1, the application scenario includes a server 10 and a terminal 20. The terminal 20 scans a printed card, generates an image of the printed card, and uploads that image to the server 10. The server 10 can obtain a character region image from the image of the printed card and convert it into a grayscale image. The server then processes the grayscale image based on the Canny edge detection algorithm to obtain a binarized image, projects the binarized image along the direction perpendicular to the preset segmentation direction to obtain a statistics array, and applies a linear transformation to the statistics array so that its elements are normalized. Finally, it matches the normalized statistics array against preset multi-character template arrays and segments the character region image according to the matching result to obtain a single-character rectangle string, which comprises the rectangular region corresponding to each character in the character region. This achieves character segmentation of the character region with high segmentation accuracy.
To make the technical solutions of this application clearer and easier to understand, the character segmentation method provided by the embodiments of this application is introduced below with reference to the drawings.
Referring to the flowchart of the character segmentation method shown in Fig. 2, the method is applied to a server and includes:
S201: Obtain a character region image and generate the grayscale image corresponding to the character region image.
The character region image is the image corresponding to a character region on a printed card. A printed card is a card produced by printing. In some possible implementations, the printed card may be a bank card produced by printing, that is, a printed bank card, or a certificate such as an identity card produced by printing. For ease of understanding, the character region image is introduced below using a bank card as an example. For a printed bank card, the character region includes the bank card number region, the verification code region, or the validity period region. The embodiments of this application can perform character segmentation on any one or more of these character regions in order to enable character recognition.
On a printed card such as a printed bank card, the color of the character region generally differs from that of the other regions. The server can therefore convert the image into a grayscale image or a binary image, determine the character region of the printed bank card image from the differences between color blocks in the grayscale or binary image, and thereby obtain the character region image. After obtaining the character region image, the server obtains the pixel value of each channel of every pixel in the character region image and processes the character region image based on these pixel values to generate the corresponding grayscale image.
The server can generate the grayscale image in several ways. For example, the server may determine the gray value of each pixel using any one of the floating-point method, the integer method, the shift method, the averaging method, or the green-only method, and then replace the pixel values of the three RGB channels of the pixel with that gray value to obtain the grayscale image.
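The five conversion methods named above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the channel weights (0.299, 0.587, 0.114 and their scaled integer forms) are the commonly used luma weights, which the text itself does not specify.

```python
# Illustrative sketch only. Function names are hypothetical, and the channel
# weights are the commonly used luma weights, not values given in the text.

def gray_float(r, g, b):
    """Floating-point method: weighted sum of the three channels."""
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))

def gray_integer(r, g, b):
    """Integer method: the same weights scaled by 1000 to avoid floats."""
    return (299 * r + 587 * g + 114 * b) // 1000

def gray_shift(r, g, b):
    """Shift method: weights scaled by 2**8 so the division becomes a shift."""
    return (77 * r + 150 * g + 29 * b) >> 8

def gray_mean(r, g, b):
    """Averaging method: plain mean of the three channels."""
    return (r + g + b) // 3

def gray_green(r, g, b):
    """Green-only method: use the green channel as the gray value."""
    return g

def to_grayscale(rgb_image, method=gray_integer):
    """rgb_image is a list of rows, each row a list of (r, g, b) tuples."""
    return [[method(r, g, b) for (r, g, b) in row] for row in rgb_image]
```

The methods trade accuracy for speed: the integer and shift variants approximate the floating-point weights without floating-point arithmetic, while the averaging and green-only variants ignore the perceptual weighting altogether.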
S202: Process the grayscale image based on the Canny edge detection algorithm to obtain a binarized image.
Image edge information is concentrated mainly in the high-frequency band, so edge detection is essentially high-pass filtering. Differentiating a signal enhances its high-frequency components. Because a digital image is a discrete signal, differentiating it amounts to computing differences, or gradients. The Canny edge detection algorithm is implemented by computing gradients.
Processing the grayscale image based on the Canny edge detection algorithm may specifically include the following steps:
Step 1: compute the gradient magnitude using finite differences of the first-order partial derivatives.
Step 2: apply non-maximum suppression according to the gradient magnitude to obtain the binarized image.
For step 1, the gradient of the image gray values can generally be approximated with first-order differences. In a specific implementation, convolution operators can be used to compute the first-order partial derivative matrices of the grayscale image in the x and y directions, the gradient magnitude matrix, and the gradient direction matrix.
For step 2, the larger an element value in the gradient magnitude matrix, the larger the gradient at the corresponding point in the grayscale image. Non-maximum points are suppressed: the local maxima among the pixels are located and the gray value of every non-maximum point is set to 0, which removes most non-edge points. Because the gray value of every non-maximum point is 0, the image after non-maximum suppression is a binarized image.
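The two steps can be sketched in plain Python as below. Several details are assumptions made for illustration and are not specified in the text: central differences stand in for the convolution operators, the gradient direction is quantized into four sectors, border pixels are left at zero, and surviving pixels are set to 1.

```python
import math

def gradients(gray):
    """Step 1: central first-order differences (an assumed choice of operator).

    Returns the gradient magnitude matrix and the gradient direction matrix
    (degrees in [0, 180)). Border pixels are left at zero.
    """
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (gray[y][x + 1] - gray[y][x - 1]) / 2.0
            gy = (gray[y + 1][x] - gray[y - 1][x]) / 2.0
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

def non_max_suppression(mag, ang):
    """Step 2: keep a pixel only if it is a local maximum along its gradient."""
    h, w = len(mag), len(mag[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = ang[y][x]
            if a < 22.5 or a >= 157.5:   # gradient roughly horizontal
                dx, dy = 1, 0
            elif a < 67.5:               # roughly 45 degrees
                dx, dy = 1, 1
            elif a < 112.5:              # roughly vertical
                dx, dy = 0, 1
            else:                        # roughly 135 degrees
                dx, dy = -1, 1
            m = mag[y][x]
            if m > 0 and m >= mag[y + dy][x + dx] and m >= mag[y - dy][x - dx]:
                out[y][x] = 1            # assumption: surviving pixels become 1
    return out
```

For a horizontal ramp such as `[0, 0, 5, 10, 10]` repeated over several rows, only the middle column of the interior rows survives suppression, which is the thinning effect the step is meant to achieve.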
It should be noted that images often contain noise. Noise is also concentrated in the high-frequency band and is easily mistaken for false edges. Therefore, before performing edge detection based on the Canny edge detection algorithm, the server first applies Gaussian noise reduction to the grayscale image with a filter. Filtering out the noise in the grayscale image reduces its influence on the gradient computation and lowers the probability that noise is identified as a false edge. When performing Gaussian noise reduction, the server selects a suitable radius as required, because an excessively large radius makes weak edges hard to detect.
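A minimal sketch of Gaussian noise reduction with a configurable radius is shown below. The separable row-then-column filtering, the border clamping, and the default `radius` and `sigma` values are assumptions chosen for illustration; the text only states that a suitable radius is selected.

```python
import math

def gaussian_kernel(radius, sigma):
    """1-D Gaussian weights over [-radius, radius], normalized to sum to 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [wgt / total for wgt in weights]

def blur_1d(values, kernel):
    """Convolve one row or column with the kernel, clamping at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, wgt in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += wgt * values[j]
        out.append(acc)
    return out

def gaussian_blur(gray, radius=1, sigma=1.0):
    """Separable blur: filter every row, then every column."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [blur_1d(row, kernel) for row in gray]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

A larger `radius` averages over more pixels, which removes more noise but also flattens weak edges, matching the caution in the paragraph above.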
Non-maximum suppression yields the edges of the image. To close the edges, the server can also apply a closing operation to the binarized image. In a specific implementation, the server can detect and connect edges with a double-threshold algorithm so that the image edges are closed. The double-threshold algorithm sets a high threshold and a low threshold: the high threshold reduces false edges in the image, and the low threshold allows the edges in the image to be closed. Specifically, the server links edges into contours at the high threshold. When it reaches a breakpoint of a contour, the algorithm looks for points that satisfy the low threshold among the 8 neighboring points of the breakpoint and collects new edge points from there, until the edges of the whole image are closed.
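The double-threshold linking described above is commonly implemented as hysteresis tracking, sketched here as a breadth-first search over the 8-neighborhood. The input is assumed to be a matrix of gradient magnitudes that has already been thinned by non-maximum suppression; the function name and the threshold values in the usage are hypothetical.

```python
from collections import deque

def hysteresis(mag, low, high):
    """Double-threshold edge linking over the 8-neighbourhood.

    Pixels at or above `high` seed the contours. Pixels at or above `low`
    are kept only if they connect to a seed through neighbouring kept pixels.
    """
    h, w = len(mag), len(mag[0])
    edges = [[0] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                edges[y][x] = 1
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = 1
                    queue.append((ny, nx))
    return edges
```

With `low=3` and `high=8`, a weak pixel of magnitude 4 next to a strong pixel of magnitude 9 is kept, while an isolated pixel of magnitude 4 is discarded as a likely false edge.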
S203: Project the binarized image along the direction perpendicular to the preset segmentation direction to obtain a statistics array.
The preset segmentation direction is the image segmentation direction set in advance. Specifically, it can be determined from the direction in which the characters are laid out in the character region image, and may be the same as that direction. For example, if the characters are laid out horizontally, the preset segmentation direction is horizontal; if the characters are laid out vertically, the preset segmentation direction may be vertical.
After Canny edge detection, the server projects the binarized image along the direction perpendicular to the preset segmentation direction to obtain the statistics array. The binarized image is two-dimensional; to simplify the subsequent character segmentation, this projection converts it into a one-dimensional statistics array.
The projection is illustrated with a concrete example. Suppose the binarized image is 428*27, that is, it has 428 pixels in the horizontal direction and 27 pixels in the vertical direction, and the gray value of each pixel is 0 or 1. The characters in the binarized image are laid out horizontally, so the preset segmentation direction is horizontal. The image is projected along the direction perpendicular to the preset segmentation direction, that is, vertically, by summing the gray values of the pixels in each column. This yields a statistics array of length 428, with indices ranging over [0, 427]. The value of each element of the array is the accumulated gray value of the corresponding pixel column. For example, if a column of the binarized image contains 15 pixels with gray value 1 and 12 pixels with gray value 0, the accumulated gray value is 15, so the value of the corresponding array element is 15.
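The projection in this example reduces to summing each column of the 0/1 image. A minimal sketch, with a hypothetical `project` helper and a `split_direction` parameter introduced only for illustration:

```python
def project(binary, split_direction="horizontal"):
    """Project a 0/1 image perpendicular to the segmentation direction.

    binary is a list of rows. For a horizontal segmentation direction the
    pixels of each column are summed; otherwise each row is summed.
    """
    if split_direction == "horizontal":
        return [sum(column) for column in zip(*binary)]
    return [sum(row) for row in binary]

# The 428*27 example from the text: one column holding 15 ones and 12 zeros.
binary = [[0] * 428 for _ in range(27)]
for y in range(15):
    binary[y][3] = 1
stats = project(binary)
```

Here `stats` has 428 elements indexed 0 to 427, and the element for the column that holds 15 ones is 15.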
S204: Apply a linear transformation to the statistics array so that the elements of the statistics array are normalized.
The server may apply the linear transformation as follows: it sums the element values of the statistics array to obtain the element total, and then divides each element by that total and uses the quotient as the new element value. The element values of the statistics array then sum to one, which normalizes the elements of the statistics array.
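The normalization step can be sketched as a division by the element total. Returning zeros for an all-zero array is an assumption added to avoid division by zero; the text does not discuss that case.

```python
def normalize(stats):
    """Divide every element by the element total so the values sum to one."""
    total = sum(stats)
    if total == 0:
        # Assumption: an image with no edge pixels yields an all-zero array.
        return [0.0] * len(stats)
    return [value / total for value in stats]
```

For example, `normalize([1, 1, 2])` returns `[0.25, 0.25, 0.5]`.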
S205: Match the normalized statistics array against preset multi-character template arrays, and segment the character region image according to the matching result to obtain a single-character rectangle string.
The single-character rectangle string comprises the rectangular region corresponding to each character in the character region. The character region image is divided into a string of rectangular regions, each of which corresponds to one character, which achieves character segmentation.
In this embodiment, the server matches the statistics array against the preset multi-character template arrays, marks the split positions according to the matching result, and then segments the character region image at the marked split positions to obtain the single-character rectangle string. The multi-character template arrays include a template array corresponding to at least one character combination, and the length of a multi-character template array is determined by the number of character groups in the character combination, the number of characters, the character width, and the ratio of the group spacing to the character width. Because the character combinations in a character region are diverse, presetting template arrays for several character combinations and matching against them improves the matching accuracy and, in turn, the accuracy of character segmentation.
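Once a template has been matched at some horizontal offset, the split positions follow from the character width and the group spacing. The sketch below shows one way to turn them into a single-character rectangle string. The function name, the `(x, y, width, height)` tuple layout, and the rounding of the gap to whole pixels are all assumptions made for illustration.

```python
def character_rectangles(offset, group_sizes, ch_width, coef, height):
    """Return one (x, y, width, height) rectangle per character.

    offset      : column where the matched template starts (hypothetical input)
    group_sizes : characters per group, e.g. [4, 4, 4, 4]
    ch_width    : character width in pixels
    coef        : ratio of the group spacing to the character width
    height      : height of the character region image in pixels
    """
    gap = int(round(coef * ch_width))
    rects = []
    x = offset
    for i, size in enumerate(group_sizes):
        if i > 0:
            x += gap                     # skip the spacing between groups
        for _ in range(size):
            rects.append((x, 0, ch_width, height))
            x += ch_width
    return rects
```

Each rectangle spans the full height of the character region image, so the string of rectangles tiles the matched part of the image from left to right.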
Template matching is described in detail below using a printed bank card as an example.
For a printed bank card, the character region can take several formats. Specifically, the character region may include the bank card number region, and the card numbers issued by different banks may have 16 digits or 19 digits. A 16-digit bank card number is generally divided into 4 groups of 4 characters each, with a gap between adjacent groups. The corresponding template array can be written as group=[1,1,1,1,0,1,1,1,1,0,1,1,1,1,0,1,1,1,1], so the length of the multi-character template array for a 16-digit bank card number is len=ch_width*16+3*coef*ch_width, where ch_width is the character width and coef is the ratio of the group spacing to the character width. A 19-digit bank card number is divided into 2 groups, one of 6 characters and one of 13 characters. The corresponding template array can be written as group=[1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1], so the length of the multi-character template array for a 19-digit bank card number is len=ch_width*19+1*coef*ch_width.
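The group patterns and length formulas above can be reproduced with a few helper functions. `group_pattern` and `template_length` follow the text directly; `expand_template`, which stretches the pattern to pixel columns, is a hypothetical extension whose rounding of the gap to whole pixels is an assumption.

```python
def group_pattern(group_sizes):
    """[4, 4, 4, 4] -> [1,1,1,1,0,1,1,1,1,0,...]; 0 marks a gap between groups."""
    pattern = []
    for i, size in enumerate(group_sizes):
        if i > 0:
            pattern.append(0)
        pattern.extend([1] * size)
    return pattern

def template_length(group_sizes, ch_width, coef):
    """len = ch_width * characters + gaps * coef * ch_width, as in the text."""
    characters = sum(group_sizes)
    gaps = len(group_sizes) - 1
    return ch_width * characters + gaps * coef * ch_width

def expand_template(group_sizes, ch_width, coef):
    """Pixel-level template: 1 over character columns, 0 over group gaps."""
    gap = int(round(coef * ch_width))    # assumption: gap rounded to pixels
    template = []
    for value in group_pattern(group_sizes):
        template.extend([value] * (ch_width if value else gap))
    return template
```

With `ch_width=20` and `coef=0.5`, the 16-digit layout `[4, 4, 4, 4]` gives a length of 20*16 + 3*0.5*20 = 350 and the 19-digit layout `[6, 13]` gives 20*19 + 1*0.5*20 = 390. Varying `ch_width` and `coef` is one way to realize the elastic behavior of the templates.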
The server can match the statistics array in turn against the multi-character template array for 16-digit bank card numbers and the multi-character template array for 19-digit bank card numbers. When matching the statistics array against any multi-character template array, the match can proceed group by group according to the grouping of that template array. Taking the multi-character template array for 16-digit bank card numbers as an example, the statistics array is first matched against the array elements corresponding to the first group of characters in the template array. If they match, matching continues with the array elements corresponding to the second group of characters; if they do not match, the statistics array can be matched directly against another multi-character template array.
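The text does not specify how a match is scored, so the sketch below substitutes a simple sliding-window signed correlation: edge mass under template 1s is rewarded and edge mass under template 0s (the group gaps) is penalized. It scores each whole template rather than matching group by group with early exit, so it is a simplification of the procedure described above. All names are hypothetical.

```python
def match_score(stats, template, offset):
    """Signed correlation: reward mass under 1s, penalize mass under 0s."""
    score = 0.0
    for i, t in enumerate(template):
        value = stats[offset + i]
        score += value if t else -value
    return score

def best_match(stats, templates):
    """Try every template at every offset.

    templates maps a label to a pixel-level 0/1 list.
    Returns (score, label, offset) for the best candidate, or None if no
    template fits inside the statistics array.
    """
    best = None
    for label, template in templates.items():
        for offset in range(len(stats) - len(template) + 1):
            candidate = (match_score(stats, template, offset), label, offset)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best

# Toy example: two groups of two characters, each 2 columns wide, 1-column gap.
raw = [0, 0, 3, 3, 3, 3, 0, 3, 3, 3, 3, 0]
stats = [value / sum(raw) for value in raw]
templates = {
    "2+2": [1, 1, 1, 1, 0, 1, 1, 1, 1],
    "1+3": [1, 1, 0, 1, 1, 1, 1, 1, 1],
}
result = best_match(stats, templates)
```

In this toy example the "2+2" template wins at offset 2, because its gap lines up with the empty column in the statistics array while the "1+3" template places its gap over edge mass.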
Fig. 3A and Fig. 3B respectively show a schematic diagram of the template matching result when edge features are obtained with the morphological gradient, and a schematic diagram of the template matching result when edge features are obtained with canny edge detection. As shown in Fig. 3A, because of interference from background color blocks, the edge features obtained are inaccurate, which causes errors in template array matching and in turn lowers the accuracy of character segmentation. In Fig. 3B, by contrast, the canny edge detection algorithm overcomes the interference from background color blocks and yields accurate edge features, so template array matching is more accurate and the characters are segmented with higher accuracy.
As can be seen from the above, the embodiments of the present application provide a character segmentation method. A character region image is first obtained, the character region image being the image corresponding to the character region in a printed card. A grayscale image corresponding to the character region image is then generated and processed based on the canny edge detection algorithm to obtain a binarized image. The binarized image is projected along the direction perpendicular to the preset segmentation direction to obtain a statistics array, and a linear transformation is applied to the statistics array so that its elements are normalized. The normalized statistics array is matched against preset multi-character template arrays, and the character region image is segmented according to the matching result to obtain a single-character rectangle string. Because the method processes the grayscale image of the character region with the canny edge detection algorithm, it can overcome interference from background color blocks, and because it matches with elastic templates, it improves character segmentation precision.
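The projection and normalization steps summarized above can be sketched as follows. This sketch assumes a horizontal segmentation direction, so the projection sums each column, and it assumes that the linear transformation is min-max scaling to [0, 1]; this passage does not state the exact transformation. The function names project_columns and normalize are hypothetical.

```python
def project_columns(binary):
    """Project a 0/1 image (list of equal-length rows) onto the horizontal axis by summing each column."""
    return [sum(col) for col in zip(*binary)]


def normalize(stats):
    """Linearly rescale the statistics array to [0, 1] (min-max scaling, an assumption)."""
    lo, hi = min(stats), max(stats)
    if hi == lo:
        return [0.0 for _ in stats]  # flat projection: nothing to rescale
    return [(v - lo) / (hi - lo) for v in stats]
```

Normalizing the projection makes the statistics array comparable with a 0/1 template regardless of the image height or stroke density.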
The above is a specific implementation of the character segmentation method provided by the embodiments of the present application. On this basis, the embodiments of the present application also provide a corresponding character segmentation device, which is introduced below from the perspective of functional modularization.
Referring to the schematic structural diagram of the character segmentation device shown in Fig. 4, the device includes:
A generation module 410, configured to obtain a character region image and generate a grayscale image corresponding to the character region image, the character region image being the image corresponding to the character region in a printed card;
An edge detection module 420, configured to process the grayscale image based on the canny edge detection algorithm to obtain a binarized image;
A projection module 430, configured to project the binarized image along the direction perpendicular to the preset segmentation direction to obtain a statistics array;
A transformation module 440, configured to apply a linear transformation to the statistics array so that the elements in the statistics array are normalized;
A segmentation module 450, configured to match the normalized statistics array against preset multi-character template arrays and segment the character region image according to the matching result to obtain a single-character rectangle string, the single-character rectangle string including the rectangular region corresponding to each character in the character region.
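As a hedged illustration of what a single-character rectangle string might look like, the sketch below lays out one rectangle per character once a template has been matched. The function name char_rectangles, the (x, y, width, height) tuple layout, and the parameters x0, y0, and height are assumptions introduced here; the text does not prescribe a data format.

```python
def char_rectangles(group_sizes, ch_width, gap_width, x0, y0, height):
    """Return one (x, y, width, height) rectangle per character, left to right."""
    rects, x = [], x0
    for i, size in enumerate(group_sizes):
        if i > 0:
            x += gap_width  # skip the gap between character groups
        for _ in range(size):
            rects.append((x, y0, ch_width, height))
            x += ch_width
    return rects
```

For a 16-digit card number, calling char_rectangles([4, 4, 4, 4], ch_width, gap_width, x0, y0, height) yields 16 rectangles, each of which can then be cropped from the character region image.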
Optionally, the multi-character template array includes a template array corresponding to at least one character combination, and the length of the multi-character template array is determined according to the number of character groups in the character combination, the number of characters, the character width, and the ratio of the group gap to the character width.
Optionally, the printed card includes a printed bank card, and the character region includes any one or more of a bank card number region, a verification code region, or an expiry date region.
Optionally, the device further includes:
A noise reduction module, configured to perform Gaussian noise reduction on the grayscale image before the grayscale image is processed based on the canny edge detection algorithm.
Optionally, processing the grayscale image based on the canny edge detection algorithm to obtain the binarized image includes:
For the grayscale image, calculating the gradient magnitude using finite differences of first-order partial derivatives;
Performing non-maximum suppression according to the gradient magnitude to obtain the binarized image.
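The two steps above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the embodiment's implementation: the gradient uses 2x2 first-order finite differences, the gradient direction is quantized into four bins, and a single hypothetical threshold parameter replaces any further thresholding. The function names gradient and non_max_suppression are introduced here for illustration.

```python
import math


def gradient(gray):
    """Gradient magnitude and direction (degrees in [0, 180)) from 2x2 first-order finite differences."""
    h, w = len(gray) - 1, len(gray[0]) - 1
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            gx = (gray[i][j + 1] - gray[i][j] + gray[i + 1][j + 1] - gray[i + 1][j]) / 2
            gy = (gray[i + 1][j] - gray[i][j] + gray[i + 1][j + 1] - gray[i][j + 1]) / 2
            mag[i][j] = math.hypot(gx, gy)
            ang[i][j] = math.degrees(math.atan2(gy, gx)) % 180
    return mag, ang


def non_max_suppression(mag, ang, threshold):
    """Keep pixels that exceed the threshold and are local maxima along the gradient direction; output is 0/1."""
    h, w = len(mag), len(mag[0])

    def at(i, j):
        return mag[i][j] if 0 <= i < h and 0 <= j < w else 0.0

    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            a = ang[i][j]
            if a < 22.5 or a >= 157.5:
                di, dj = 0, 1    # gradient roughly horizontal
            elif a < 67.5:
                di, dj = 1, 1    # gradient along the main diagonal
            elif a < 112.5:
                di, dj = 1, 0    # gradient roughly vertical
            else:
                di, dj = 1, -1   # gradient along the anti-diagonal
            m = mag[i][j]
            if m > threshold and m >= at(i + di, j + dj) and m >= at(i - di, j - dj):
                out[i][j] = 1
    return out
```

Comparing each pixel only with its two neighbors along the gradient direction thins a blurred edge to a narrow ridge, which keeps the later column projection from being inflated by thick edges.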
Optionally, the device further includes:
An operation module, configured to perform a closing operation on the binarized image before the binarized image is projected along the direction perpendicular to the preset segmentation direction, so that the image edges in the binarized image are closed.
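A closing operation is a dilation followed by an erosion. A minimal sketch on a 0/1 image is given below, assuming a square structuring element of half-width k (the text does not specify the structuring element). The function names dilate, erode, and closing are introduced here for illustration.

```python
def dilate(img, k=1):
    """Set a pixel to 1 if any pixel in its (2k+1)x(2k+1) neighborhood is 1."""
    h, w = len(img), len(img[0])
    return [[1 if any(img[y][x]
                      for y in range(max(0, i - k), min(h, i + k + 1))
                      for x in range(max(0, j - k), min(w, j + k + 1))) else 0
             for j in range(w)] for i in range(h)]


def erode(img, k=1):
    """Keep a pixel at 1 only if every pixel in its (clipped) neighborhood is 1."""
    h, w = len(img), len(img[0])
    return [[1 if all(img[y][x]
                      for y in range(max(0, i - k), min(h, i + k + 1))
                      for x in range(max(0, j - k), min(w, j + k + 1))) else 0
             for j in range(w)] for i in range(h)]


def closing(img, k=1):
    """Closing: dilation followed by erosion, which fills small gaps in edges."""
    return erode(dilate(img, k), k)
```

Two edge pixels separated by a one-pixel gap are joined by closing with k=1, while the outer extent of the edge is left unchanged.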
As can be seen from the above, the embodiments of the present application provide a character segmentation device. The device first obtains a character region image, then generates a grayscale image corresponding to the character region image and processes the grayscale image based on the canny edge detection algorithm to obtain a binarized image. It projects the binarized image along the direction perpendicular to the preset segmentation direction to obtain a statistics array, applies a linear transformation to the statistics array so that its elements are normalized, matches the normalized statistics array against preset multi-character template arrays, and segments the character region image according to the matching result to obtain a single-character rectangle string. Because the device processes the grayscale image of the character region with the canny edge detection algorithm, it can overcome interference from background color blocks, and because it matches with elastic templates, it improves character segmentation precision.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the system, device, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
It should be understood that, in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that only A exists, that only B exists, or that both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of single items or plural items. For example, at least one (item) of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be single or multiple.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A character segmentation method, characterized in that the method includes:
Obtaining a character region image and generating a grayscale image corresponding to the character region image, the character region image being the image corresponding to the character region in a printed card;
Processing the grayscale image based on the canny edge detection algorithm to obtain a binarized image;
Projecting the binarized image along the direction perpendicular to the preset segmentation direction to obtain a statistics array;
Applying a linear transformation to the statistics array so that the elements in the statistics array are normalized;
Matching the normalized statistics array against preset multi-character template arrays, and segmenting the character region image according to the matching result to obtain a single-character rectangle string, the single-character rectangle string including the rectangular region corresponding to each character in the character region.
2. The method according to claim 1, characterized in that the multi-character template array includes a template array corresponding to at least one character combination, and the length of the multi-character template array is determined according to the number of character groups in the character combination, the number of characters, the character width, and the ratio of the group gap to the character width.
3. The method according to claim 1, characterized in that the printed card includes a printed bank card, and the character region includes any one or more of a bank card number region, a verification code region, or an expiry date region.
4. The method according to any one of claims 1 to 3, characterized in that processing the grayscale image based on the canny edge detection algorithm to obtain the binarized image includes:
For the grayscale image, calculating the gradient magnitude using finite differences of first-order partial derivatives;
Performing non-maximum suppression according to the gradient magnitude to obtain the binarized image.
5. The method according to any one of claims 1 to 3, characterized in that, before the binarized image is projected along the direction perpendicular to the preset segmentation direction, the method further includes:
Performing a closing operation on the binarized image according to a double-threshold algorithm, so that the image edges in the binarized image are closed.
6. A character segmentation device, characterized in that the device includes:
A generation module, configured to obtain a character region image and generate a grayscale image corresponding to the character region image, the character region image being the image corresponding to the character region in a printed card;
An edge detection module, configured to process the grayscale image based on the canny edge detection algorithm to obtain a binarized image;
A projection module, configured to project the binarized image along the direction perpendicular to the preset segmentation direction to obtain a statistics array;
A transformation module, configured to apply a linear transformation to the statistics array so that the elements in the statistics array are normalized;
A segmentation module, configured to match the normalized statistics array against preset multi-character template arrays and segment the character region image according to the matching result to obtain a single-character rectangle string, the single-character rectangle string including the rectangular region corresponding to each character in the character region.
7. The device according to claim 6, characterized in that the multi-character template array includes a template array corresponding to at least one character combination, and the length of the multi-character template array is determined according to the number of character groups in the character combination, the number of characters, the character width, and the ratio of the group gap to the character width.
8. The device according to claim 6, characterized in that the printed card includes a printed bank card, and the character region includes any one or more of a bank card number region, a verification code region, or an expiry date region.
9. The device according to any one of claims 6 to 8, characterized in that the edge detection module is specifically configured to:
For the grayscale image, calculate the gradient magnitude using finite differences of first-order partial derivatives;
Perform non-maximum suppression according to the gradient magnitude to obtain the binarized image.
10. The device according to any one of claims 6 to 8, characterized in that the device further includes:
An operation module, configured to perform, before the binarized image is projected along the direction perpendicular to the preset segmentation direction, a closing operation on the binarized image according to a double-threshold algorithm, so that the image edges in the binarized image are closed.
CN201811504027.9A 2018-12-10 2018-12-10 Character segmentation method and device Active CN109598271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811504027.9A CN109598271B (en) 2018-12-10 2018-12-10 Character segmentation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811504027.9A CN109598271B (en) 2018-12-10 2018-12-10 Character segmentation method and device

Publications (2)

Publication Number Publication Date
CN109598271A true CN109598271A (en) 2019-04-09
CN109598271B CN109598271B (en) 2021-02-09

Family

ID=65962208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811504027.9A Active CN109598271B (en) 2018-12-10 2018-12-10 Character segmentation method and device

Country Status (1)

Country Link
CN (1) CN109598271B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060120602A1 (en) * 2004-12-03 2006-06-08 Bei Tang Character segmentation method and apparatus
CN101567041A (en) * 2009-05-25 2009-10-28 公安部交通管理科学研究所 Method for recognizing characters of number plate images of motor vehicles based on trimetric projection
CN102043959A (en) * 2010-12-28 2011-05-04 青岛海信网络科技股份有限公司 License plate character segmentation method
CN104408454A (en) * 2014-06-30 2015-03-11 电子科技大学 License plate character segmentation method based on elastic template matching algorithm
CN105095860A (en) * 2015-06-30 2015-11-25 小米科技有限责任公司 Method and device for character segmentation
CN106650553A (en) * 2015-10-30 2017-05-10 比亚迪股份有限公司 License plate recognition method and system
CN108615034A (en) * 2017-12-14 2018-10-02 燕山大学 A kind of licence plate recognition method that template matches are combined with neural network algorithm

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705989A (en) * 2019-09-17 2020-01-17 阿里巴巴集团控股有限公司 Identity authentication method, method for realizing login-free authorization component and respective devices
CN111027546A (en) * 2019-12-05 2020-04-17 北京嘉楠捷思信息技术有限公司 Character segmentation method and device and computer readable storage medium
CN111046862A (en) * 2019-12-05 2020-04-21 北京嘉楠捷思信息技术有限公司 Character segmentation method and device and computer readable storage medium
CN111046862B (en) * 2019-12-05 2023-10-27 嘉楠明芯(北京)科技有限公司 Character segmentation method, device and computer readable storage medium
EP4071665A4 (en) * 2019-12-05 2023-11-22 Canaan Bright Sight Co., Ltd. Character segmentation method and apparatus, and computer-readable storage medium
CN111027546B (en) * 2019-12-05 2024-03-26 嘉楠明芯(北京)科技有限公司 Character segmentation method, device and computer readable storage medium
CN111060527A (en) * 2019-12-30 2020-04-24 歌尔股份有限公司 Character defect detection method and device
CN111866254A (en) * 2020-07-31 2020-10-30 广东佳米科技有限公司 Speed dialing system for intelligently identifying word stroke and implementation method thereof

Also Published As

Publication number Publication date
CN109598271B (en) 2021-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant