CN110263616A - Character recognition method, apparatus, electronic device and storage medium
- Publication number
- CN110263616A, CN201910356344.9A
- Authority
- CN
- China
- Prior art keywords
- picture
- identified
- text
- template
- reference point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/235—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Abstract
This application discloses a character recognition method, apparatus, electronic device and storage medium. A picture having the same features as the picture to be identified is chosen as the template picture, and multiple reference points are box-selected on the template picture according to selection rules. The picture to be identified is cut into multiple picture blocks according to the positions of the identification regions on the template picture, and text recognition is performed on the blocks. When verification shows that the text information in a picture block is identical to the text field information in the corresponding identification region, structured text extraction is performed on the picture to be identified, which improves the accuracy and efficiency of text recognition. The method, apparatus, electronic device and storage medium provided by the invention can thus recognize the text of a specific type of image with higher recognition efficiency.
Description
Technical field
This application relates to the technical field of image recognition, and in particular to a character recognition method, apparatus, electronic device and storage medium.
Background
Company employees exchange business cards when socializing, and companies generate large numbers of bills in day-to-day operations. Business cards and bills are paper documents; as they accumulate they take up storage space, are easily lost, and make it hard to find a particular card or bill. To make them easier to view and keep, text information on paper business cards and bills, such as names, companies, ID numbers and phone numbers, can be recognized and stored in a terminal so that the relevant information can be retrieved promptly when needed.
In the prior art, text information on a business card or bill is usually recognized by sample comparison. Specifically, a sample is prepared in advance and the text at designated positions on the sample is box-selected to form a recognition template; the paper business card or bill is then scanned into a picture, and the picture to be identified is compared with the recognition template. When a piece of information on the picture to be identified matches the text at a box-selected position, that information is taken as the text recognition result for the picture to be identified.
However, when text is recognized by sample comparison, the box-selected positions on the recognition template must be matched against the entire area of the picture to be identified. The matching involves a large amount of data processing, so recognizing the text takes a long time.
Summary of the invention
This application provides a character recognition method, apparatus, electronic device and storage medium to solve the problem of low recognition efficiency in existing recognition methods.
In a first aspect, this application provides a character recognition method comprising the following steps:
selecting a template picture corresponding to the features of the picture to be identified;
box-selecting several reference points on the template picture;
cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region on the template picture that has the same features as the picture to be identified;
recognizing the text information in the picture blocks;
comparing the text information with the text field information in the identification region corresponding to the reference point;
extracting the text on the picture to be identified when the text information and the text field information correspond to the same text.
Further, each reference point is located at a text field that is common to the template picture and the picture to be identified and whose position does not change; the reference points are located at the edges and four corners of the template picture; each reference point is located at a text field that appears only once on the template picture; the number of reference points is greater than or equal to 4; and the text corresponding to one reference point is on the same line and adjacent.
Further, the process of cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks comprises:
establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, the identification region containing text field information that the template picture and the picture to be identified have in common;
cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, wherein each picture block contains the text field information corresponding to one identification region.
Further, the process of extracting the text on the picture to be identified when the text information and the text field information correspond to the same text comprises:
determining that the current picture block is a recognizable picture block when the text information and the text field information correspond to the same text;
determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at the position to be identified on the picture to be identified.
Further, the method also comprises:
rectifying the picture to be identified according to the reference points, so that the picture to be identified has the same shape and size as the template picture.
Further, the method also comprises:
performing image enhancement on the picture to be identified to adjust its lighting, contrast and exposure.
In a second aspect, an embodiment of the invention provides a character recognition apparatus, comprising:
a template picture selection module, configured to select a template picture corresponding to the features of the picture to be identified;
a reference point box-selection module, configured to box-select several reference points on the template picture;
a cutting module, configured to cut the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region on the template picture that has the same features as the picture to be identified;
a text recognition module, configured to recognize the text information in the picture blocks;
an information comparison module, configured to compare the text information with the text field information in the identification region corresponding to the reference point;
a text extraction module, configured to extract the text on the picture to be identified when the text information and the text field information correspond to the same text.
Further, the reference point box-selection module is characterized in that:
each reference point is located at a text field that is common to the template picture and the picture to be identified and whose position does not change;
the reference points are located at the edges and four corners of the template picture;
each reference point is located at a text field that appears only once on the template picture;
the number of reference points is greater than or equal to 4;
and the text corresponding to one reference point is on the same line and adjacent.
Further, the cutting module comprises:
a coordinate system establishing unit, configured to establish a two-dimensional coordinate system on the template picture and determine the coordinates of each identification region, the identification region containing text field information that the template picture and the picture to be identified have in common;
a cutting unit, configured to cut the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, wherein each picture block contains the text field information corresponding to one identification region.
Further, the text extraction module comprises:
a recognizable picture block determination unit, configured to determine that the current picture block is a recognizable picture block when the text information and the text field information correspond to the same text;
a text extraction unit, configured to determine the corresponding position to be identified according to the position of the recognizable picture block and to extract the text at the position to be identified on the picture to be identified.
Further, the apparatus also comprises:
a rectification module, configured to rectify the picture to be identified according to the reference points, so that the picture to be identified has the same shape and size as the template picture.
Further, the apparatus also comprises:
an image processing module, configured to perform image enhancement on the picture to be identified to adjust its lighting, contrast and exposure.
In a third aspect, an embodiment of the invention provides an electronic device, comprising:
a memory, configured to store program instructions;
a processor, configured to call and execute the program instructions in the memory to implement the character recognition method of the first aspect.
In a fourth aspect, an embodiment of the invention provides a storage medium. A computer program is stored in the readable storage medium, and when at least one processor of the character recognition apparatus executes the computer program, the character recognition apparatus performs the character recognition method of the first aspect.
As can be seen from the above technical solution, in the character recognition method, apparatus, electronic device and storage medium provided by the embodiments of the invention, a picture having the same features as the picture to be identified is chosen as the template picture, and multiple reference points are box-selected on the template picture according to selection rules. The picture to be identified is cut into multiple picture blocks according to the positions of the identification regions on the template picture, and text recognition is performed. When verification shows that the text information in a picture block is identical to the text field information in the identification region, structured text extraction is performed on the picture to be identified, which improves the accuracy and efficiency of text recognition. The method, apparatus, electronic device and storage medium provided by the invention can thus recognize the text of a specific type of image with higher recognition efficiency.
Brief description of the drawings
To explain the technical solution of this application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flow chart of the character recognition method provided by an embodiment of the invention;
Fig. 2 is a schematic view of the reference point positions provided by an embodiment of the invention;
Fig. 3 is a flow chart of the method for cutting the picture to be identified provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of a template picture with an established coordinate system provided by an embodiment of the invention;
Fig. 5 is a flow chart of the method for extracting the text on the picture to be identified provided by an embodiment of the invention;
Fig. 6 is a schematic comparison of the template picture and the picture to be identified before rectification provided by an embodiment of the invention;
Fig. 7 is a schematic comparison of the template picture and the picture to be identified after rectification provided by an embodiment of the invention;
Fig. 8 is a structural block diagram of the character recognition apparatus provided by an embodiment of the invention;
Fig. 9 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the invention.
Detailed description of the embodiments
Fig. 1 is a flow chart of the character recognition method provided by an embodiment of the invention.
The character recognition method provided by an embodiment of the invention is used to recognize a specific class of documents with a fixed format, such as bills, cards and certificates, and identity cards, and extracts the field names and field values of such documents in correspondence. The method can be executed by a test device on which OCR (Optical Character Recognition) software is installed, such as a desktop computer or a laptop. Referring to Fig. 1, the method includes the following steps:
S1, selecting a template picture corresponding to the features of the picture to be identified;
The picture to be identified is the picture of a document with a fixed format that is to be recognized, such as a bill, a card or certificate, or an identity card. The features on the picture to be identified include field names, such as name and ID number.
A template picture is chosen according to the features shown on the picture to be identified. The template picture should show, clearly, flatly and completely, the same features as the picture to be identified. For example, if the picture to be identified is the picture of a bill, the template picture is a picture containing the field names shown on that bill; if the picture to be identified is the picture of an identity card, the template picture is a picture containing the field names shown on the identity card. In other words, the picture to be identified and the template picture must correspond so that text recognition can be efficient and accurate.
S2, box-selecting several reference points on the template picture;
To make it easier to match the picture to be identified with the template picture and thus recognize text efficiently, in this embodiment several reference points need to be box-selected on the template picture. A reference point provides a datum that puts the picture to be identified in correspondence with the template picture.
As shown in Fig. 2, to improve the recognition effect, reference points on the template picture need to be chosen according to the following rules:
each reference point is located at a text field that is common to the template picture and the picture to be identified and whose position does not change; the reference points are located at the edges and four corners of the template picture; each reference point is located at a text field that appears only once on the template picture; the number of reference points is greater than or equal to 4; and the text corresponding to one reference point is on the same line and adjacent.
Because the template picture and the picture to be identified have the same features, the relative positions of the text fields corresponding to those features are the same and do not change. Therefore, to guarantee recognition accuracy, a reference point should be placed at a position where the template picture and the picture to be identified have the same text field and where the position of that text field does not change. For example, on a bill, "Name" is a text field whose position is fixed, whereas the corresponding "Zhang San" is a text field whose position can change: "Name" is preprinted on the bill, while "Zhang San" is added later by the user. A text field added afterwards has no uniform position, so the position of a text field whose location is not fixed should not be used as a reference point.
For a document with a fixed format, the text fields around the periphery can differ from one another, while the text fields in the middle may be identical. On the answer sheet used in student examinations, for example, the text fields at the periphery and the four corners differ, but the text fields in the middle are all the same: the four options "A, B, C, D". Therefore, to distinguish the text fields corresponding to the reference points and improve recognition accuracy, the reference points should be placed at the edges and four corners of the template picture.
In addition, the four options "A, B, C, D" on an answer sheet repeat. If a reference point were selected there, it would be impossible to determine which part the content to be identified belongs to: the "A, B, C, D" options of part four on the picture to be identified might be matched to the "A, B, C, D" options of part one on the template picture, producing a wrong recognition result. To avoid such misrecognition, reference points should not be placed at repeated text fields and should, as far as possible, be placed at text fields that appear only once.
The text corresponding to one reference point must be on the same line and as close together as possible, to ensure accuracy in the subsequent text recognition. The number of reference points should be as large as possible: at least 4 must be marked, and 8 or more should be marked to guarantee the recognition effect. The reference point positions chosen in this embodiment may be as shown by the black dots in Fig. 2; the more reference points there are on the template picture and the more dispersed they are, the better the recognition effect, and the better the accuracy and efficiency of text recognition can be guaranteed.
It should be noted that Fig. 2 only illustrates the positions of several reference points by way of example and does not limit them; for other types of documents the reference points may be at other positions.
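The selection rules above can be sketched in code. The following is a minimal illustration only, not the patent's implementation: the `Field` record, the `margin` fraction used to decide what counts as "near the edge", and the function name are all assumptions introduced for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical description of one text field on the template picture."""
    text: str     # printed text of the field
    x: float      # centre x in template coordinates
    y: float      # centre y in template coordinates
    fixed: bool   # True if preprinted, i.e. its position never changes


def pick_reference_points(fields, width, height, margin=0.25, minimum=4):
    """Keep fields that are fixed, appear only once, and lie near an edge.

    `margin` (an assumed parameter) is the fraction of the width/height
    that is treated as the edge band of the template picture.
    """
    counts = Counter(f.text for f in fields)

    def near_edge(f):
        return (f.x <= width * margin or f.x >= width * (1 - margin)
                or f.y <= height * margin or f.y >= height * (1 - margin))

    chosen = [f for f in fields
              if f.fixed and counts[f.text] == 1 and near_edge(f)]
    if len(chosen) < minimum:
        raise ValueError(f"need at least {minimum} reference points, "
                         f"found {len(chosen)}")
    return chosen
```

A repeated field such as "A, B, C, D" or a user-filled value such as "Zhang San" is filtered out by this sketch, which mirrors the answer-sheet and bill examples in the text.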
S3, cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region on the template picture that has the same features as the picture to be identified.
Because the picture to be identified contains multiple text fields to be recognized, in this embodiment the picture to be identified is cut into multiple parts, and the text field in each part is recognized separately, to improve the accuracy and efficiency of recognition.
The picture to be identified is cut according to the position of each identification region on the template picture. An identification region is a region on the template picture that has the same text field as the picture to be identified, for example the regions where "Name" and "ID number" are located.
Specifically, as shown in Fig. 3, in this embodiment the process of cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks includes:
S31, establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region; the identification region contains text field information that the template picture and the picture to be identified have in common.
As shown in Fig. 4, for the two-dimensional coordinate system established on the template picture, the origin may be set at the lower-left corner of the template picture, with the X axis along the bottom edge of the template picture and the Y axis along its side edge.
The coordinates of each identification region in the two-dimensional coordinate system are determined from its position on the template picture. Because an identification region contains text field information, it corresponds to an area rather than a point; therefore, to determine its coordinates accurately, the coordinates of an identification region can be represented by the coordinates of its centre.
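As a small illustration of this coordinate convention, the sketch below converts a bounding box given in the usual image-pixel convention (origin at the top-left, y pointing down) into the centre coordinates of the region in a coordinate system whose origin is the lower-left corner. The pixel-box input format and the function name are assumptions made for this example; the patent does not specify how regions are stored.

```python
def region_center(left, top, width, height, image_height):
    """Centre of a region in a lower-left-origin coordinate system.

    (left, top, width, height) describe the region in pixel coordinates
    with the origin at the top-left corner and y increasing downwards.
    """
    cx = left + width / 2
    # Flip the y axis so that it is measured upwards from the bottom edge.
    cy = image_height - (top + height / 2)
    return cx, cy
```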
S32, cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, wherein each picture block contains the text field information corresponding to one identification region.
After the coordinate position of each identification region has been determined, the picture to be identified is cut into multiple picture blocks using the coordinates of the identification regions as a reference. A picture block may have the same form as its identification region, and it must contain text field information that is identical to the text field information in the corresponding identification region on the template picture.
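The cutting step can be sketched as a simple crop per identification region. This is an illustrative sketch under assumptions: the image is modelled as a plain list of pixel rows (top row first), regions are given as pixel boxes `(left, top, width, height)`, and the names are hypothetical. A real system would more likely use an imaging library for cropping.

```python
def cut_blocks(image, regions):
    """Cut one picture block per identification region.

    image   : list of rows, each row a list of pixel values (top row first)
    regions : dict mapping a region name to (left, top, width, height)
    returns : dict mapping the same names to the cropped sub-images
    """
    blocks = {}
    for name, (left, top, width, height) in regions.items():
        blocks[name] = [row[left:left + width]
                        for row in image[top:top + height]]
    return blocks
```

Because the picture to be identified has been brought to the same shape and size as the template picture, the boxes measured on the template can be applied to it directly.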
S4, recognizing the text information on the picture blocks;
Each picture block is recognized in turn to obtain the text information in it. OCR (Optical Character Recognition) technology can be used for this recognition; since this technology is relatively mature, the recognition process is not described in further detail here.
S5, comparing the text information with the text field information in the identification region corresponding to the reference point;
To guarantee that the content recognized on the current picture to be identified is consistent with the corresponding content on the template picture, in this embodiment the text information recognized in the picture blocks also needs to be verified. That is, the text information is compared with the text field information in the identification region corresponding to the reference point, to avoid the situation where the text information recognized on the picture to be identified does not correspond to the text field information on the template picture, which would reduce recognition accuracy.
S6, extracting the text on the picture to be identified when the text information and the text field information correspond to the same text.
After the text information and the text field information have been compared, if the text content of the two is not the same, an error has occurred in the current recognition process, i.e. the picture to be identified does not correspond to the template picture. The current recognition process is then terminated, and a new round of text recognition is carried out after the template picture is replaced or other adjustments are made.
If the text content of the text information and the text field information is the same, the current recognition process is normal, and the text value corresponding to the text field can be recognized, i.e. the text on the picture to be identified is recognized.
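The verification in steps S5 and S6 can be sketched as a comparison between the text recognized in each picture block and the text expected from the template. The sketch below is a minimal illustration; the function names are hypothetical, and ignoring whitespace and letter case in the comparison is an assumption of this example rather than something the text specifies.

```python
def normalize(text):
    """Strip all whitespace and lowercase (an assumed tolerance for OCR noise)."""
    return "".join(text.split()).lower()


def mismatched_blocks(recognized, expected):
    """Return the names of blocks whose recognized text differs from the template.

    recognized : dict mapping block name to the text recognized in that block
    expected   : dict mapping block name to the template's text field
    An empty result means recognition can proceed; otherwise the current
    round should be stopped and the template replaced or adjusted.
    """
    return [name for name, want in expected.items()
            if normalize(recognized.get(name, "")) != normalize(want)]
```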
Specifically, as shown in Fig. 5, in this embodiment the process of extracting the text on the picture to be identified when the text information and the text field information correspond to the same text includes:
S61, determining that the current picture block is a recognizable picture block when the text information and the text field information correspond to the same text;
S62, determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at the position to be identified on the picture to be identified.
When the text content of the text information and the text field information is the same, the current picture block is taken as a recognizable picture block, i.e. a picture block on which text recognition of the picture to be identified can continue, and the position to be identified is determined from this recognizable picture block. The position to be identified is the position corresponding to the picture block. For example, if the text information in the recognizable picture block is "Name", the text information at the corresponding position to be identified is "Zhang San"; if the text information in the recognizable picture block is "ID number", the text information at the corresponding position to be identified is "110XXXXXXXXXXXXXXX".
OCR is then used again to recognize the text information at the positions to be identified. After the multiple picture blocks of the picture to be identified have all been recognized and verified, the text information at the corresponding positions to be identified is extracted, so that structured data is extracted from the picture to be identified immediately.
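The structured extraction can be sketched as building a mapping from each verified field name to the value recognized at its position to be identified. The sketch below is only an illustration: the OCR engine is passed in as a callable `ocr` because the text does not name a specific engine, and the dictionary layout and names are assumptions of this example.

```python
def extract_values(recognizable, value_images, ocr):
    """Build {field name: field value} from the verified picture blocks.

    recognizable : dict mapping block name to its verified label text,
                   e.g. {"b1": "Name"}
    value_images : dict mapping block name to the image cropped at the
                   position to be identified that belongs to that block
    ocr          : callable taking an image and returning recognized text
                   (a stand-in for whatever OCR engine is actually used)
    """
    return {label: ocr(value_images[name]).strip()
            for name, label in recognizable.items()}
```

The result pairs each field name with its field value, which is the structured form described in the text, for example "Name" with "Zhang San".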
On the basis of the character recognition method provided by the above embodiment, the character recognition method provided by this embodiment further includes, after the picture to be identified is read: rectifying the picture to be identified according to the reference points, so that the picture to be identified has the same shape and size as the template picture;
When the picture to be identified is read, it may be rotated, deformed and so on. Therefore, before the text on the picture to be identified is recognized, the picture needs to be rectified with the reference points as the datum, i.e. brought into agreement with the template picture, so that the shape, size and orientation of the picture to be identified are the same as those of the template picture.
For example, as shown in (a) and (b) of Fig. 6, suppose the picture to be identified appears as a parallelogram when read and is rotated by some angle from the upright orientation of the document, while the template picture is a rectangle placed in the upright orientation of the document. To guarantee recognition efficiency, the parallelogram of the picture to be identified is rectified into a rectangle of the same size as the template picture, and the angle of the picture to be identified is also adjusted so that it too is placed in the upright orientation of the document. The comparison after rectification is shown in (a) and (b) of Fig. 7; that is, the picture to be identified and the template picture can be made to coincide completely.
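One way to realize such a rectification is to fit an affine transform that maps the reference points found on the picture to be identified onto the corresponding reference points of the template picture. The text does not say which transform is used, so the sketch below is an assumption: it fits an affine map by least squares in pure Python, which covers rotation, scaling and the parallelogram (shear) case in the example; perspective distortion would need a projective transform instead. All function names are hypothetical.

```python
def solve3(m, v):
    """Solve the 3x3 system m * s = v by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("reference points are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    s = [0.0] * n
    for r in range(n - 1, -1, -1):
        s[r] = (a[r][n] - sum(a[r][c] * s[c] for c in range(r + 1, n))) / a[r][r]
    return s


def fit_affine(src, dst):
    """Least-squares affine map taking points `src` onto points `dst`.

    Returns ((a, b, c), (d, e, f)) with x' = a*x + b*y + c, y' = d*x + e*y + f.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least 3 matching point pairs")
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations of the least-squares problem.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    bx = [sum(r[i] * d[0] for r, d in zip(rows, dst)) for i in range(3)]
    by = [sum(r[i] * d[1] for r, d in zip(rows, dst)) for i in range(3)]
    return solve3(m, bx), solve3(m, by)


def apply_affine(coeffs, point):
    """Map one point of the picture to be identified into template coordinates."""
    (a, b, c), (d, e, f) = coeffs
    x, y = point
    return a * x + b * y + c, d * x + e * y + f
```

Using more than three reference points makes the fit less sensitive to an error in any single point, which is consistent with the advice to mark eight or more.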
Because the picture to be identified may be rather dark, unclear or too bright when read, which easily affects text recognition, in this embodiment the character recognition method further includes, before the picture to be identified is rectified according to the reference points: performing image enhancement on the picture to be identified to adjust its lighting, contrast and exposure.
Image enhancement is performed on the picture to be identified so that its lighting, contrast, exposure and so on reach the best state, making it easier to recognize the text on the picture. For example, if the picture to be identified is too bright, the brightness can be turned down; if it is too dark, the brightness is turned up.
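As a simple illustration of such an enhancement, the sketch below applies a linear contrast stretch to a grayscale image so that its darkest pixel becomes 0 and its brightest becomes 255. The text does not prescribe a particular enhancement algorithm, so this choice, the list-of-rows image representation and the function name are assumptions of this example.

```python
def stretch_contrast(gray):
    """Linearly rescale a grayscale image to the full 0..255 range.

    gray : list of rows, each row a list of integer pixel values (0..255)
    """
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # A completely flat image cannot be stretched; return a copy.
        return [row[:] for row in gray]
    # Integer arithmetic keeps the result deterministic.
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in gray]
```

A picture that is uniformly too dark or too bright uses only part of the value range, and the stretch spreads it out again before recognition.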
It can be seen that, when text recognition is performed on the picture to be identified, the character recognition method provided by this embodiment first applies rectification and image enhancement to the picture that has been read, so that the picture to be identified is the same as the template picture and the image is clearer. When the text is recognized, if the picture to be identified and the template picture are the same in size, shape and orientation, and the lighting, contrast, exposure and so on of the picture to be identified are in the best state, each piece of text on the picture to be identified can be recognized more accurately, which improves recognition efficiency.
As can be seen from the above technical solution, in the character recognition method provided by an embodiment of the invention, a picture having the same features as the picture to be identified is chosen as the template picture, and multiple reference points are box-selected on the template picture according to selection rules. The picture to be identified is cut into multiple picture blocks according to the positions of the identification regions on the template picture, and text recognition is performed. When verification shows that the text information in a picture block is identical to the text field information in the identification region, structured text extraction is performed on the picture to be identified, which improves the accuracy and efficiency of text recognition. The method provided by the invention can thus recognize the text of a specific type of image with higher recognition efficiency.
Referring to Fig. 8, an embodiment of the invention provides a character recognition apparatus for executing the character recognition method shown in Fig. 1. The apparatus includes:
a template picture selection module 10, configured to select a template picture corresponding to the features of the picture to be identified; a reference point box-selection module 20, configured to box-select several reference points on the template picture; a cutting module 30, configured to cut the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region on the template picture that has the same features as the picture to be identified; a text recognition module 40, configured to recognize the text information in the picture blocks; an information comparison module 50, configured to compare the text information with the text field information in the identification region corresponding to the reference point; and a text extraction module 60, configured to extract the text on the picture to be identified when the text information and the text field information correspond to the same text.
Further, the reference points selected by the reference point frame-selection module 20 satisfy the following: each
reference point is located at a text segment that is common to the template picture and the picture to be identified and
whose position does not change; the reference points are located at the edges and four corners of the template picture;
each reference point is located at a text segment that appears only once on the template picture; the number of reference
points is greater than or equal to 4; and the text corresponding to the same reference point is on the same line and adjacent.
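Two of the conditions above (at least four reference points, and each reference point sitting on a text segment that appears only once on the template) lend themselves to a mechanical check. The following Python sketch is illustrative only: the function name, the plain-string representation of text segments, and the message format are assumptions and are not part of the embodiment.

```python
from collections import Counter

MIN_REFERENCE_POINTS = 4  # the description requires at least four reference points


def check_reference_points(template_segments, reference_texts):
    """Return a list of problems; an empty list means the selection passes.

    template_segments: all text segments found on the template picture
    reference_texts:   the text segments chosen as reference points
    (both input formats are hypothetical, chosen for illustration)
    """
    problems = []
    if len(reference_texts) < MIN_REFERENCE_POINTS:
        problems.append(
            f"need at least {MIN_REFERENCE_POINTS} reference points, "
            f"got {len(reference_texts)}"
        )
    counts = Counter(template_segments)
    for text in reference_texts:
        # A reference text must occur exactly once on the template.
        if counts[text] != 1:
            problems.append(
                f"{text!r} appears {counts[text]} times on the template, expected once"
            )
    return problems
```

The positional conditions (edges, four corners, same line and adjacent) would need coordinates for each segment and are left out of this sketch.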
Further, the cutting module 30 includes: a coordinate system establishing unit, for establishing a two-dimensional
coordinate system on the template picture and determining the coordinates of each identification region, where an
identification region contains text segment information common to the template picture and the picture to be identified;
and a cutting unit, for cutting the picture to be identified according to the coordinate position of each identification
region to obtain multiple picture blocks, where each picture block contains the text segment information corresponding to
one identification region.
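The coordinate-based cutting can be sketched as follows. This is a minimal illustration under stated assumptions: the picture is modelled as a list of pixel rows, each identification region is an axis-aligned box (left, top, right, bottom) in a coordinate system whose origin is the top-left corner of the template, and the picture to be identified already has the same size as the template. The name `cut_picture` and the box convention are hypothetical.

```python
def cut_picture(picture, regions):
    """Cut `picture` (a list of pixel rows) into one block per identification region.

    regions maps a region name to (left, top, right, bottom) in template
    coordinates; right and bottom are exclusive (an assumed convention).
    """
    blocks = {}
    for name, (left, top, right, bottom) in regions.items():
        # Keep rows top..bottom-1 and, within each row, columns left..right-1.
        blocks[name] = [row[left:right] for row in picture[top:bottom]]
    return blocks
```

Because each region yields exactly one block, every block carries the text segment of a single identification region, which matches the one-region-per-block property described above.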
Further, the text extraction module 60 includes: a recognizable picture block determination unit, for determining that
the current picture block is a recognizable picture block when the text information is identical to the text corresponding
to the text segment information; and a text extraction unit, for determining the corresponding position to be identified
according to the position of the recognizable picture block and extracting the text at that position on the picture to be
identified.
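The verify-then-extract step can be illustrated with a small Python sketch. It assumes that recognition has already produced, for each region, the text found in the cut picture block and the text found at the associated position to be identified; the three dictionaries and the function name `extract_fields` are hypothetical, and exact string equality stands in for the "identical" comparison.

```python
def extract_fields(block_texts, template_labels, value_texts):
    """Keep a field only when its block text equals the template's text segment.

    block_texts:     region name -> text recognized in the cut picture block
    template_labels: region name -> text segment expected on the template
    value_texts:     region name -> text recognized at the position to be identified
    """
    extracted = {}
    for name, expected in template_labels.items():
        if block_texts.get(name) == expected:
            # The block is "recognizable": its label matches the template,
            # so the text at the associated position is extracted.
            extracted[expected] = value_texts.get(name, "")
    return extracted
```

For example, if the block for a region named "date" is recognized as "Dat3" while the template expects "Date", that field is skipped rather than extracted from a possibly misaligned position.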
Further, the device also includes a rectification module, for rectifying the picture to be identified according to the
reference points so that the picture to be identified has the same shape and size as the template picture.
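The description does not state how the rectification is computed. One common way to align a picture with a template from four point correspondences is a projective transform (homography); the sketch below uses that technique as an assumption and is not presented as the method of the embodiment. It is written in plain Python, and the names `solve_linear`, `homography`, and `warp_point` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Estimate the 3x3 transform mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]  # last entry fixed to 1
    return [h[0:3], h[3:6], h[6:9]]


def warp_point(h, point):
    """Map one (x, y) point of the picture into template coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

Here `src` would hold four reference points located on the picture to be identified and `dst` the same reference points on the template. A full rectification would additionally resample every pixel through the transform, which is omitted from this sketch.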
Further, the device also includes an image processing module, for performing image enhancement on the picture to be
identified to adjust its lighting, contrast, and exposure.
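The description does not name a specific enhancement algorithm. As one simple possibility, the sketch below applies min-max contrast stretching to a greyscale picture; the choice of technique, the 0 to 255 grey-level range, and the function name `stretch_contrast` are assumptions made for illustration.

```python
def stretch_contrast(gray):
    """Linearly rescale grey levels so the darkest pixel maps to 0 and the brightest to 255.

    gray is a list of rows of integer grey levels (assumed to lie in 0..255).
    """
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # Flat picture: there is no contrast to stretch, return a copy.
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]
```

A picture whose grey levels span only 50 to 200 would be spread over the full 0 to 255 range, which tends to make dark text stand out more clearly from a light background before recognition.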
Fig. 9 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present
invention. As shown in Fig. 9, an embodiment of the present invention provides an electronic device, including:
a memory 601, for storing program instructions;
a processor 602, for calling and executing the program instructions in the memory 601 to implement the character
recognition method described in the foregoing embodiment.
In this embodiment, the processor 602 and the memory 601 may be connected by a bus or in other ways. The processor may be
a general-purpose processor, such as a central processing unit, a digital signal processor, or an application-specific
integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory may include volatile memory, such as random access memory; the memory may also include non-volatile memory,
such as read-only memory, flash memory, a hard disk, or a solid-state drive.
An embodiment of the present invention provides a storage medium. A computer program is stored in the readable storage
medium, and when at least one processor of the character recognition device executes the computer program, the character
recognition device executes the character recognition method described in the foregoing embodiment.
The readable storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory
(RAM), or the like.
Those skilled in the art can clearly understand that the techniques in the embodiments of the present invention can be
implemented by software plus the necessary general-purpose hardware platform. Based on this understanding, the technical
solutions in the embodiments of the present invention, in essence or in the part that contributes to the prior art, can be
embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a
ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be
a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the
present invention or in certain parts of the embodiments.
For the same or similar parts among the embodiments in this specification, reference may be made to one another. In
particular, the character recognition device embodiment is described relatively briefly because it is substantially
similar to the method embodiment; for related details, refer to the description in the method embodiment.
The embodiments of the present invention described above do not limit the scope of protection of the present invention.
Claims (14)
1. A character recognition method, characterized by comprising the following steps:
selecting a template picture corresponding to the features of a picture to be identified;
frame-selecting several reference points on the template picture;
cutting the picture to be identified according to the positions of identification regions on the template picture to obtain
multiple picture blocks, an identification region being a region of the template picture that has the same features as the
picture to be identified;
recognizing the text information in the picture blocks;
comparing the text information with the text segment information in the identification region corresponding to a reference point;
when the text information is identical to the text corresponding to the text segment information, extracting the text on
the picture to be identified.
2. The method according to claim 1, characterized in that:
each reference point is located at a text segment that is common to the template picture and the picture to be identified
and whose position does not change;
and the reference points are located at the edges and four corners of the template picture;
and each reference point is located at a text segment that appears only once on the template picture;
and the number of reference points is greater than or equal to 4;
and the text corresponding to the same reference point is on the same line and adjacent.
3. The method according to claim 1, characterized in that the process of cutting the picture to be identified according to
the positions of the identification regions on the template picture to obtain multiple picture blocks comprises:
establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each
identification region, an identification region containing text segment information common to the template picture and
the picture to be identified;
cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple
picture blocks, wherein each picture block contains the text segment information corresponding to one identification region.
4. The method according to claim 1, characterized in that the process of extracting the text on the picture to be
identified when the text information is identical to the text corresponding to the text segment information comprises:
when the text information is identical to the text corresponding to the text segment information, determining that the
current picture block is a recognizable picture block;
determining the corresponding position to be identified according to the position of the recognizable picture block, and
extracting the text at the position to be identified on the picture to be identified.
5. The method according to claim 1, characterized by further comprising:
rectifying the picture to be identified according to the reference points, so that the picture to be identified has the
same shape and size as the template picture.
6. The method according to claim 1, characterized by further comprising:
performing image enhancement on the picture to be identified to adjust the lighting, contrast, and exposure of the picture
to be identified.
7. A character recognition device, characterized by comprising:
a template picture selection module, for selecting a template picture corresponding to the features of a picture to be identified;
a reference point frame-selection module, for frame-selecting several reference points on the template picture;
a cutting module, for cutting the picture to be identified according to the positions of identification regions on the
template picture to obtain multiple picture blocks, an identification region being a region of the template picture that
has the same features as the picture to be identified;
a text recognition module, for recognizing the text information in the picture blocks;
an information comparison module, for comparing the text information with the text segment information in the
identification region corresponding to a reference point;
a text extraction module, for extracting the text on the picture to be identified when the text information is identical
to the text corresponding to the text segment information.
8. The device according to claim 7, characterized in that the reference points selected by the reference point
frame-selection module satisfy the following:
each reference point is located at a text segment that is common to the template picture and the picture to be identified
and whose position does not change;
and the reference points are located at the edges and four corners of the template picture;
and each reference point is located at a text segment that appears only once on the template picture;
and the number of reference points is greater than or equal to 4;
and the text corresponding to the same reference point is on the same line and adjacent.
9. The device according to claim 7, characterized in that the cutting module comprises:
a coordinate system establishing unit, for establishing a two-dimensional coordinate system on the template picture and
determining the coordinates of each identification region, an identification region containing text segment information
common to the template picture and the picture to be identified;
a cutting unit, for cutting the picture to be identified according to the coordinate position of each identification
region to obtain multiple picture blocks, wherein each picture block contains the text segment information corresponding
to one identification region.
10. The device according to claim 7, characterized in that the text extraction module comprises:
a recognizable picture block determination unit, for determining that the current picture block is a recognizable picture
block when the text information is identical to the text corresponding to the text segment information;
a text extraction unit, for determining the corresponding position to be identified according to the position of the
recognizable picture block and extracting the text at the position to be identified on the picture to be identified.
11. The device according to claim 7, characterized by further comprising:
a rectification module, for rectifying the picture to be identified according to the reference points, so that the picture
to be identified has the same shape and size as the template picture.
12. The device according to claim 7, characterized by further comprising:
an image processing module, for performing image enhancement on the picture to be identified to adjust the lighting,
contrast, and exposure of the picture to be identified.
13. An electronic device, characterized by comprising:
a memory, for storing program instructions;
a processor, for calling and executing the program instructions in the memory to implement the character recognition
method according to any one of claims 1 to 6.
14. A storage medium, characterized in that a computer program is stored in the readable storage medium, and when at least
one processor of a character recognition device executes the computer program, the character recognition device executes
the character recognition method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910356344.9A CN110263616A (en) | 2019-04-29 | 2019-04-29 | A kind of character recognition method, device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110263616A (en) | 2019-09-20 |
Family
ID=67914084
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910356344.9A Pending CN110263616A (en) | 2019-04-29 | 2019-04-29 | A kind of character recognition method, device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263616A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106845469A (en) * | 2017-01-24 | 2017-06-13 | 深圳怡化电脑股份有限公司 | A kind of Paper Currency Identification and device |
CN108009546A (en) * | 2016-10-28 | 2018-05-08 | 北京京东尚科信息技术有限公司 | information identifying method and device |
CN108229463A (en) * | 2018-02-07 | 2018-06-29 | 众安信息技术服务有限公司 | Character recognition method based on image |
CN108874283A (en) * | 2018-05-29 | 2018-11-23 | 努比亚技术有限公司 | Image identification method, mobile terminal and computer readable storage medium |
CN109145904A (en) * | 2018-08-24 | 2019-01-04 | 讯飞智元信息科技有限公司 | A kind of character identifying method and device |
CN109658584A (en) * | 2018-12-14 | 2019-04-19 | 泰康保险集团股份有限公司 | A kind of bill bank slip recognition method and device |
-
2019
- 2019-04-29 CN CN201910356344.9A patent/CN110263616A/en active Pending
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111178365A (en) * | 2019-12-31 | 2020-05-19 | 五八有限公司 | Picture character recognition method and device, electronic equipment and storage medium |
CN111444792A (en) * | 2020-03-13 | 2020-07-24 | 安诚迈科(北京)信息技术有限公司 | Bill recognition method, electronic device, storage medium and device |
CN111444792B (en) * | 2020-03-13 | 2023-05-09 | 安诚迈科(北京)信息技术有限公司 | Bill identification method, electronic equipment, storage medium and device |
WO2021184578A1 (en) * | 2020-03-17 | 2021-09-23 | 平安科技(深圳)有限公司 | Ocr-based target field recognition method and apparatus, electronic device, and storage medium |
CN111931784A (en) * | 2020-09-17 | 2020-11-13 | 深圳壹账通智能科技有限公司 | Bill recognition method, system, computer device and computer-readable storage medium |
WO2022057471A1 (en) * | 2020-09-17 | 2022-03-24 | 深圳壹账通智能科技有限公司 | Bill identification method, system, computer device, and computer-readable storage medium |
CN112200185A (en) * | 2020-10-10 | 2021-01-08 | 航天科工智慧产业发展有限公司 | Method and device for reversely positioning picture by characters and computer storage medium |
CN112580499A (en) * | 2020-12-17 | 2021-03-30 | 上海眼控科技股份有限公司 | Text recognition method, device, equipment and storage medium |
CN114187604A (en) * | 2022-02-14 | 2022-03-15 | 山东信通电子股份有限公司 | Integrity verification method, equipment and medium for WebP picture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||