CN110263616A - Character recognition method, device, electronic equipment and storage medium - Google Patents

Publication number
CN110263616A
Authority
CN
China
Prior art keywords
picture
identified
text
template
reference point
Prior art date
Legal status
Pending
Application number
CN201910356344.9A
Other languages
Chinese (zh)
Inventor
张学军
史忠伟
Current Assignee
Wuba Co Ltd
Original Assignee
Wuba Co Ltd
Priority date
Filing date
Publication date
Application filed by Wuba Co Ltd
Priority to CN201910356344.9A
Publication of CN110263616A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/235 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

This application discloses a character recognition method, device, electronic equipment and storage medium. A picture that has the same features as the picture to be identified is chosen as the template picture, and multiple reference points are box-selected on the template picture according to selection rules. According to the positions of the identification regions on the template picture, the picture to be identified is cut into multiple picture blocks, and text recognition is performed on them. When the text information in a picture block is verified to be identical to the text field information in the corresponding identification region, structured text extraction from the picture to be identified is carried out, which improves the accuracy and efficiency of text recognition. The method, device, electronic equipment and storage medium provided by the invention can thus recognize the text of images of a specific type with higher recognition efficiency.

Description

Character recognition method, device, electronic equipment and storage medium
Technical field
This application relates to the technical field of image recognition, and in particular to a character recognition method, device, electronic equipment and storage medium.
Background art
Company employees exchange business cards when socializing, and companies generate large numbers of bills in day-to-day operations. Business cards and bills are both paper documents; an accumulation of them takes up storage space, is easily lost, and makes it inconvenient to look up a particular business card or bill. For ease of viewing and storage, text information on paper business cards and bills, such as names, companies, ID card numbers and telephone numbers, can be recognized and stored in a terminal so that the relevant information can be retrieved promptly when needed.
In the prior art, text information on business cards or bills is usually recognized by sample comparison. Specifically, a sample is prepared in advance and the text at designated positions on the sample is box-selected to form a recognition template; the paper business card or bill is then scanned into a picture, and the picture to be identified is compared with the recognition template. When some information on the picture to be identified matches the text at a box-selected position, that information is the text recognition result for the picture to be identified.
However, when text recognition is performed by sample comparison, the box-selected positions on the recognition template have to be matched against the whole area of the picture to be identified. Because the matching process involves a large amount of data processing, recognizing the text takes a long time.
Summary of the invention
This application provides a character recognition method, device, electronic equipment and storage medium, to solve the problem that existing recognition methods have low recognition efficiency.
In a first aspect, this application provides a character recognition method comprising the following steps:
Selecting a template picture corresponding to the features of the picture to be identified;
Box-selecting several reference points on the template picture;
Cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region of the template picture that has the same features as the picture to be identified;
Recognizing the text information in the picture blocks;
Comparing the text information with the text field information in the identification region corresponding to a reference point;
When the text information and the text field information correspond to the same text, extracting the text on the picture to be identified.
Further, the reference points are located at text fields that are common to the template picture and the picture to be identified and whose positions do not change; the reference points are located at the edges and four corners of the template picture; the reference points are located at text fields that appear only once on the template picture; the number of reference points is greater than or equal to 4; and the text corresponding to a single reference point lies on the same line and is adjacent.
Further, the process of cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks comprises:
Establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, each identification region containing text field information common to the template picture and the picture to be identified;
Cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, each picture block containing the text field information corresponding to one identification region.
Further, the process of extracting the text on the picture to be identified when the text information and the text field information correspond to the same text comprises:
When the text information and the text field information correspond to the same text, determining that the current picture block is a recognizable picture block;
Determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at that position of the picture to be identified.
Further, the method also comprises:
Rectifying the picture to be identified according to the reference points, so that its shape and size are the same as those of the template picture.
Further, the method also comprises:
Performing image enhancement on the picture to be identified to adjust its lighting, contrast and exposure.
In a second aspect, an embodiment of the invention provides a character recognition device, comprising:
A template picture selection module, for selecting a template picture corresponding to the features of the picture to be identified;
A reference point box-selection module, for box-selecting several reference points on the template picture;
A cutting module, for cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region of the template picture that has the same features as the picture to be identified;
A text recognition module, for recognizing the text information in the picture blocks;
An information comparison module, for comparing the text information with the text field information in the identification region corresponding to a reference point;
A text extraction module, for extracting the text on the picture to be identified when the text information and the text field information correspond to the same text.
Further, the reference points box-selected by the reference point box-selection module satisfy the following:
The reference points are located at text fields that are common to the template picture and the picture to be identified and whose positions do not change;
The reference points are located at the edges and four corners of the template picture;
The reference points are located at text fields that appear only once on the template picture;
The number of reference points is greater than or equal to 4;
The text corresponding to a single reference point lies on the same line and is adjacent.
Further, the cutting module comprises:
A coordinate system establishing unit, for establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, each identification region containing text field information common to the template picture and the picture to be identified;
A cutting unit, for cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, each picture block containing the text field information corresponding to one identification region.
Further, the text extraction module comprises:
A recognizable picture block determination unit, for determining that the current picture block is a recognizable picture block when the text information and the text field information correspond to the same text;
A text extraction unit, for determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at that position of the picture to be identified.
Further, the device also comprises:
A rectification module, for rectifying the picture to be identified according to the reference points, so that its shape and size are the same as those of the template picture.
Further, the device also comprises:
An image processing module, for performing image enhancement on the picture to be identified to adjust its lighting, contrast and exposure.
In a third aspect, an embodiment of the invention provides electronic equipment, comprising:
A memory, for storing program instructions;
A processor, for calling and executing the program instructions in the memory to implement the character recognition method of the first aspect.
In a fourth aspect, an embodiment of the invention provides a storage medium. A computer program is stored in the readable storage medium, and when at least one processor of a character recognition device executes the computer program, the character recognition device performs the character recognition method of the first aspect.
As can be seen from the above technical solution, in the character recognition method, device, electronic equipment and storage medium provided by the embodiments of the invention, a picture that has the same features as the picture to be identified is chosen as the template picture, and multiple reference points are box-selected on the template picture according to selection rules. According to the positions of the identification regions on the template picture, the picture to be identified is cut into multiple picture blocks, and text recognition is performed on them. When the text information in a picture block is verified to be identical to the text field information in the corresponding identification region, structured text extraction from the picture to be identified is carried out, which improves the accuracy and efficiency of text recognition. The method, device, electronic equipment and storage medium provided by the invention can thus recognize the text of images of a specific type with higher recognition efficiency.
Brief description of the drawings
To explain the technical solution of the application more clearly, the drawings needed in the embodiments are briefly introduced below. Evidently, those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flow chart of the character recognition method provided by an embodiment of the invention;
Fig. 2 is a schematic diagram of the positions of the reference points provided by an embodiment of the invention;
Fig. 3 is a flow chart of the method for cutting the picture to be identified provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of a template picture with a coordinate system established on it, provided by an embodiment of the invention;
Fig. 5 is a flow chart of the method for extracting the text on the picture to be identified provided by an embodiment of the invention;
Fig. 6 is a schematic comparison of the template picture and the picture to be identified before rectification, provided by an embodiment of the invention;
Fig. 7 is a schematic comparison of the template picture and the picture to be identified after rectification, provided by an embodiment of the invention;
Fig. 8 is a structural block diagram of the character recognition device provided by an embodiment of the invention;
Fig. 9 is a schematic diagram of the hardware structure of the electronic equipment provided by an embodiment of the invention.
Detailed description of the embodiments
Fig. 1 is a flow chart of the character recognition method provided by an embodiment of the invention.
The character recognition method provided by an embodiment of the invention is used to recognize a specific class of documents with a fixed format, such as bills, cards and certificates, and ID cards, and extracts the field names and the corresponding field values of such documents. The method may be executed by a device on which OCR (Optical Character Recognition) software is installed, such as a desktop or laptop computer. Referring to Fig. 1, the method comprises the following steps:
S1, selecting a template picture corresponding to the features of the picture to be identified;
The picture to be identified is a picture of a document with a fixed format that is to be recognized, for example a picture of a bill, a card or certificate, or an ID card. The features on the picture to be identified include field names, such as name and ID card number.
A template picture is chosen according to the features shown on the picture to be identified. The template picture must be clear and flat and must show in full the features it shares with the picture to be identified. For example, if the picture to be identified is a picture of a bill, the template picture is a picture containing the field names shown on that bill; if the picture to be identified is a picture of an ID card, the template picture is a picture containing the field names shown on an ID card. That is, the picture to be identified and the template picture must correspond so that text recognition is efficient and accurate.
S2, box-selecting several reference points on the template picture;
To make it easier to match the picture to be identified with the template picture and so recognize text efficiently, in this embodiment several reference points are box-selected on the template picture. A reference point provides a datum for bringing the picture to be identified into correspondence with the template picture.
As shown in Fig. 2, to improve the recognition result, the reference points box-selected on the template picture are chosen according to the following rules:
The reference points are located at text fields that are common to the template picture and the picture to be identified and whose positions do not change; the reference points are located at the edges and four corners of the template picture; the reference points are located at text fields that appear only once on the template picture; the number of reference points is greater than or equal to 4; and the text corresponding to a single reference point lies on the same line and is adjacent.
Because the template picture and the picture to be identified have the same features, the relative positions of the text fields corresponding to those features are the same and do not change. Therefore, to guarantee recognition accuracy, a reference point should be placed at a position where the template picture and the picture to be identified have the same text field and where that text field does not move. For example, on a bill, "Name" is a text field whose position is fixed, whereas the corresponding "Zhang San" is a text field whose position can change: "Name" is printed on the bill from the start, while "Zhang San" is added later by a user. Text fields added afterwards have no uniform position, so the position of such a text field should not be used as a reference point.
In a document with a fixed format, the text fields around the periphery may differ from one another while the text fields in the middle may be identical. For example, on an answer sheet used in student examinations, the text fields at the periphery and four corners differ, but the text fields in the middle are all the same, namely the four options "A, B, C, D". Therefore, to distinguish the text fields corresponding to the reference points and improve recognition accuracy, the reference points should be placed at the edges and four corners of the template picture.
In addition, the four options "A, B, C, D" on an answer sheet are repeated. If a reference point were selected there, it would be impossible to determine which part the content to be identified belongs to: the "A, B, C, D" options of the fourth part of the picture to be identified might be matched to the "A, B, C, D" options of the first part of the template picture, producing a wrong recognition result. To avoid such misrecognition, reference points should not be placed at text fields that repeat, and should as far as possible be placed at text fields that appear only once.
The text corresponding to a single reference point should lie on the same line and be as close together as possible, to ensure the accuracy of subsequent text recognition. The number of reference points should be as large as possible: at least 4 must be marked, and 8 or more should be marked to ensure a good recognition result. The reference points chosen in this embodiment may be located as shown by the black dots in Fig. 2. The more numerous and the more dispersed the reference points on the template picture, the better the recognition result, which ensures the accuracy and efficiency of text recognition.
It should be noted that Fig. 2 shows the positions of several reference points only by way of example and does not limit the positions of reference points; for other types of documents, the reference points may be located elsewhere.
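The selection rules for reference points lend themselves to a mechanical check. The following Python sketch is illustrative only and is not part of the disclosed method: the `ReferencePoint` type, the `check_reference_points` function and the `margin` fraction that decides what counts as "near an edge" are all assumptions. It checks the minimum count, the appears-only-once rule and the edge-or-corner rule; the rules on fixed position and same-line adjacency are not checked because they depend on information outside this sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePoint:
    text: str   # fixed field name printed on the template, e.g. "Name"
    x: float    # position in template coordinates
    y: float


def check_reference_points(points, width, height, all_texts,
                           margin=0.25, min_count=4):
    """Return a list of rule violations; an empty list means the selection passes.

    all_texts lists every text field found on the template (with repeats).
    margin is a hypothetical threshold: a point is "near an edge" when it lies
    within margin * width (or height) of a border.
    """
    problems = []
    if len(points) < min_count:
        problems.append(f"need at least {min_count} reference points, got {len(points)}")
    counts = Counter(all_texts)
    for p in points:
        if counts[p.text] != 1:
            problems.append(f"'{p.text}' appears {counts[p.text]} times on the template")
        near_edge = (min(p.x, width - p.x) <= margin * width
                     or min(p.y, height - p.y) <= margin * height)
        if not near_edge:
            problems.append(f"'{p.text}' is not near an edge or corner")
    return problems
```

A repeated option such as "A" in the middle of an answer sheet would fail both the uniqueness and the edge checks, which matches the reasoning given for the answer-sheet example.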
S3, cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, an identification region being a region of the template picture that has the same features as the picture to be identified.
Because the picture to be identified contains multiple text fields to be recognized, in this embodiment the picture to be identified is cut into multiple parts so that the text field in each part can be recognized separately, improving recognition accuracy and efficiency.
The picture to be identified is cut according to the position of each identification region on the template picture. An identification region is a region of the template picture that has the same text field as the picture to be identified, for example the regions where "Name" and "ID card number" are located.
Specifically, as shown in Fig. 3, in this embodiment the process of cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks comprises:
S31, establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, each identification region containing text field information common to the template picture and the picture to be identified.
As shown in Fig. 4, for the two-dimensional coordinate system established on the template picture, the origin may be placed at the lower-left corner of the template picture, with the X axis along the bottom edge of the template picture and the Y axis along its side edge.
The coordinates of each identification region in the two-dimensional coordinate system are determined from its position on the template picture. Because an identification region contains text field information, it corresponds to an area rather than a point; therefore, to determine its coordinates precisely, the coordinates of an identification region may be represented by the coordinates of its centre.
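As a small illustration of this coordinate convention, the sketch below computes the centre of an identification region and converts a point from the lower-left-origin system into the row and column indices of a pixel grid stored top row first. The function names and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for the example, not something the source specifies.

```python
def region_center(box):
    """Centre (x, y) of a region given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def to_row_col(x, y, height):
    """Map a lower-left-origin pixel coordinate to (row, col) of a grid whose
    row 0 is the top of the picture; height is the picture height in pixels."""
    return (height - 1 - int(y), int(x))
```

The conversion matters in practice because most image libraries index pixels from the top-left corner, whereas the coordinate system described here starts at the lower-left corner.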
S32, cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, each picture block containing the text field information corresponding to one identification region.
After the coordinate position of each identification region has been determined, the picture to be identified is cut into multiple picture blocks with the coordinates of the identification regions as reference. A picture block may have the same form as its identification region and must contain text field information, which must be identical to the text field information in the corresponding identification region of the template picture.
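The cutting step can be sketched as follows, assuming the picture to be identified already has the same size and orientation as the template picture and is held as a list of pixel rows, top row first. The `cut_blocks` name and the region format (lower-left-origin `(x_min, y_min, x_max, y_max)` with exclusive maxima) are hypothetical choices for this example.

```python
def cut_blocks(image, regions):
    """Crop one picture block per identification region.

    image:   list of pixel rows, top row first.
    regions: mapping of region name -> (x_min, y_min, x_max, y_max) in template
             coordinates with the origin at the lower-left corner; the maxima
             are exclusive.
    """
    height = len(image)
    blocks = {}
    for name, (x_min, y_min, x_max, y_max) in regions.items():
        top = height - y_max       # first row of the block, counted from the top
        bottom = height - y_min    # one past the last row of the block
        blocks[name] = [row[x_min:x_max] for row in image[top:bottom]]
    return blocks
```

Each block can then be passed to the recognition step on its own, so that only the small regions of interest are processed rather than the whole picture.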
S4, recognizing the text information in the picture blocks;
Each picture block is recognized in turn to obtain the text information in it. The recognition may use OCR (Optical Character Recognition) technology; because this technology is already mature, the recognition process is not described in further detail here.
S5, comparing the text information with the text field information in the identification region corresponding to a reference point;
To ensure that the content recognized on the current picture to be identified is consistent with the corresponding content on the template picture, in this embodiment the text information recognized in each picture block also has to be verified: it is compared with the text field information in the identification region corresponding to the reference point. This prevents a mismatch between the text information recognized on the picture to be identified and the text field information on the template picture, which would reduce recognition accuracy.
S6, when the text information and the text field information correspond to the same text, extracting the text on the picture to be identified.
After the text information and the text field information have been compared, if their text content is not identical, an error has occurred in the current recognition process, that is, the picture to be identified does not correspond to the template picture. The current recognition process is then terminated, and a new round of text recognition is carried out after the template picture has been replaced or other adjustments have been made.
If the text content of the text information and of the text field information is identical, the current recognition process is working normally and the text value corresponding to the text field can be recognized, that is, the text on the picture to be identified is recognized.
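The verification in steps S5 and S6 can be sketched as a comparison between the text recognized in each picture block and the text field the template expects there. The `normalize` and `verify_blocks` helpers are hypothetical; in particular, ignoring whitespace and letter case is an assumption added to tolerate OCR noise, whereas the source only requires the two texts to be identical.

```python
def normalize(text):
    """Drop all whitespace and ignore case (an assumption, see the lead-in)."""
    return "".join(text.split()).casefold()


def verify_blocks(recognized, expected):
    """Return the names of regions whose recognized text differs from the template.

    recognized: region name -> text read from the picture block.
    expected:   region name -> text field stored for the template region.
    An empty result means every block passed and extraction may proceed.
    """
    return [name for name, want in expected.items()
            if normalize(recognized.get(name, "")) != normalize(want)]
```

A non-empty result corresponds to the case in which the recognition process is terminated and restarted with a different template picture or other adjustments.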
Specifically, as shown in Fig. 5, in this embodiment the process of extracting the text on the picture to be identified when the text information and the text field information correspond to the same text comprises:
S61, when the text information and the text field information correspond to the same text, determining that the current picture block is a recognizable picture block;
S62, determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at that position of the picture to be identified.
When the text content of the text information and of the text field information is identical, the current picture block is taken as a recognizable picture block, that is, a picture block from which text recognition of the picture to be identified can continue. The position to be identified is determined from the recognizable picture block and is the position corresponding to that picture block. For example, if the text information in the recognizable picture block is "Name", the text information at the position to be identified is "Zhang San"; if the text information in the recognizable picture block is "ID card number", the text information at the position to be identified is "110XXXXXXXXXXXXXXX".
The text information at the position to be identified is then recognized using OCR technology. After all the picture blocks of the picture to be identified have been recognized and verified, the text information at the corresponding positions to be identified is extracted, which completes the structured data extraction from the picture to be identified.
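Putting the verification and extraction together, a minimal sketch of the structured result might look like the following. The `extract_fields` function, its arguments and the idea of passing the OCR engine in as a callable are all assumptions; no real OCR library is called, and any function that maps a picture block to a string can be supplied as `ocr`.

```python
def extract_fields(label_blocks, value_blocks, expected_labels, ocr):
    """Verify every label block against the template, then read the values.

    label_blocks:    region name -> picture block holding the field name.
    value_blocks:    region name -> picture block at the position to be identified.
    expected_labels: region name -> field name stored for the template.
    ocr:             callable turning a picture block into text (supplied by caller).
    """
    # First pass: every picture block must match its template text field.
    for name, expected in expected_labels.items():
        if ocr(label_blocks[name]).strip() != expected:
            raise ValueError(f"block '{name}' does not match the template")
    # Second pass: all blocks verified, so read the field values.
    return {expected: ocr(value_blocks[name]).strip()
            for name, expected in expected_labels.items()}
```

With the example from the text, the result would pair "Name" with "Zhang San" and "ID card number" with "110XXXXXXXXXXXXXXX", giving field names and field values in a structured form.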
On the basis of the character recognition method of the above embodiment, the character recognition method of this embodiment further comprises, after the picture to be identified is read: rectifying the picture to be identified according to the reference points, so that its shape and size are the same as those of the template picture;
When the picture to be identified is read, it may be rotated or deformed. Therefore, before the text on it is recognized, the picture to be identified has to be rectified with the reference points as datum, that is, brought into agreement with the template picture so that its shape, size and orientation are the same as those of the template picture.
For example, as shown in (a) and (b) of Fig. 6, the picture to be identified appears as a parallelogram when read and is rotated by some angle from the upright orientation of the document, whereas the template picture is a rectangle placed upright. Therefore, to ensure recognition efficiency, the parallelogram of the picture to be identified is rectified into a rectangle of the same size as the template picture, and the angle of the picture to be identified is adjusted so that it too is upright. The comparison after rectification is shown in (a) and (b) of Fig. 7; that is, the picture to be identified and the template picture can be made to coincide completely.
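The source does not say how the rectifying transform is computed. One simple possibility, shown here purely as an assumption, is an affine map fitted to three reference points: an affine map is enough to turn a rotated parallelogram into an upright rectangle, although perspective distortion would need a projective transform fitted to four or more points. The function names are hypothetical.

```python
def solve_affine(src, dst):
    """Affine map taking three non-collinear src points onto three dst points.

    Returns (a, b, c, d, e, f) such that x' = a*x + b*y + c and y' = d*x + e*y + f.
    Solved with Cramer's rule on the 3x3 system built from the src points.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)
    if det == 0:
        raise ValueError("reference points are collinear")

    def coeffs(v1, v2, v3):
        a = (v1 * (y2 - y3) + v2 * (y3 - y1) + v3 * (y1 - y2)) / det
        b = (v1 * (x3 - x2) + v2 * (x1 - x3) + v3 * (x2 - x1)) / det
        c = (v1 * (x2 * y3 - x3 * y2) + v2 * (x3 * y1 - x1 * y3)
             + v3 * (x1 * y2 - x2 * y1)) / det
        return a, b, c

    (u1, w1), (u2, w2), (u3, w3) = dst
    return coeffs(u1, u2, u3) + coeffs(w1, w2, w3)


def apply_affine(m, point):
    """Apply the map returned by solve_affine to a single (x, y) point."""
    a, b, c, d, e, f = m
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)
```

Here `src` would hold reference points located on the picture to be identified and `dst` the same reference points on the template picture; applying the map to every pixel position then brings the two pictures into alignment.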
Because a picture to be identified may be too dark, blurred or too bright when read, which easily affects text recognition, in this embodiment the character recognition method further comprises, before the picture to be identified is rectified according to the reference points: performing image enhancement on the picture to be identified to adjust its lighting, contrast and exposure.
Image enhancement is applied to the picture to be identified so that its lighting, contrast and exposure reach an optimum state, making the text on the picture to be identified easier to recognize. For example, if the picture to be identified is too bright, its brightness is turned down; if it is too dark, its brightness is turned up.
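The source does not name a particular enhancement algorithm. As one possible example, the sketch below applies a linear contrast stretch to a greyscale picture stored as a list of rows of integer grey levels; the `stretch_contrast` name and the choice of technique are assumptions made for illustration.

```python
def stretch_contrast(gray, low=0, high=255):
    """Linearly rescale grey levels so the darkest pixel maps to low and the
    brightest to high. A uniform picture is returned unchanged (as a copy)."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [row[:] for row in gray]
    return [[round(low + (p - lo) * (high - low) / (hi - lo)) for p in row]
            for row in gray]
```

A picture that is too dark or too bright occupies only part of the grey-level range, and stretching it over the full range is one simple way of adjusting brightness and contrast together.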
As it can be seen that character recognition method provided in this embodiment, when carrying out Text region to picture to be identified, in advance to reading Processing and image enhancement processing is positively twisted in the picture to be identified taken, so that picture to be identified is identical as template picture, and image It is more clear.When being identified to text, if picture to be identified and template picture are equal on size, shape and placement direction It is identical, and the light of picture to be identified, contrast and exposure etc. reach optimum state, can more accurately recognize wait know Each text on other picture improves recognition efficiency.
From the above technical scheme, a kind of character recognition method provided in an embodiment of the present invention, choose one with wait know Other picture has the picture of same characteristic features as template picture, and according to selection rule, selects multiple references in template picture upper ledge Point.According to the position of the identification region in template picture, picture to be identified is cut into multiple picture blocks, row text of going forward side by side is known Not;In the text information situation identical with the text segment information in identification region in verification picture block, figure to be identified is realized The structuring Word Input of piece improves the accuracy rate and efficiency of Text region.As it can be seen that method provided by the invention may be implemented pair The identification of certain types of image text, recognition efficiency are higher.
Referring to Fig. 8, an embodiment of the present invention provides a character recognition device for executing the character recognition method shown in Fig. 1. The device includes:
a template picture selection module 10, for selecting a template picture corresponding to the features of the picture to be identified;
a reference point frame-selection module 20, for frame-selecting several reference points on the template picture;
a cutting module 30, for cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks, where an identification region is a region on the template picture that has the same features as the picture to be identified;
a text recognition module 40, for recognizing the text information in the picture blocks;
an information comparison module 50, for comparing the text information with the text segment information in the identification region corresponding to a reference point;
a text extraction module 60, for extracting the text on the picture to be identified when the text information is identical to the text corresponding to the text segment information.
Further, the reference points frame-selected by the reference point frame-selection module 20 have the following features: the reference point is located at a text segment that is common to the template picture and the picture to be identified and whose position is constant; the reference point is located at the edges and four corners of the template picture; the reference point is located at a text segment that occurs only once in the template picture; the number of reference points is greater than or equal to 4; and the text corresponding to a single reference point is on the same line and adjacent.
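Some of these selection rules can be checked mechanically. The sketch below is a hypothetical helper, not part of the described device: `check_reference_points`, its parameters and the reading of "edges and four corners" as "at least one reference point in each quadrant of the template" are assumptions made only for illustration.

```python
from collections import Counter


def check_reference_points(points, template_texts, width, height, min_points=4):
    """Return a list of rule violations for the chosen reference points.

    points: list of (text, x, y) tuples, one per reference point.
    template_texts: every text segment that appears on the template picture.
    width, height: size of the template picture in pixels.
    """
    problems = []
    # Rule: the number of reference points is greater than or equal to 4.
    if len(points) < min_points:
        problems.append(f"need at least {min_points} reference points, got {len(points)}")
    # Rule: the text segment of a reference point occurs only once in the template.
    counts = Counter(template_texts)
    for text, _x, _y in points:
        if counts[text] != 1:
            problems.append(f"{text!r} occurs {counts[text]} times in the template")
    # Assumed reading of "edges and four corners": every quadrant holds a point.
    quadrants = {(x >= width / 2, y >= height / 2) for _text, x, y in points}
    if len(quadrants) < 4:
        problems.append("reference points do not reach all four corners")
    return problems
```

An empty result means the chosen points satisfy the rules that this sketch covers; the "same line and adjacent" rule is not checked here.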
Further, the cutting module 30 includes: a coordinate system establishing unit, for establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, where the identification region contains text segment information that the template picture and the picture to be identified have in common; and a cutting unit, for cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, where each picture block contains the text segment information corresponding to one identification region.
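A minimal sketch of the cutting step follows, assuming the picture to be identified has already been brought to the same size as the template picture so that template coordinates apply directly. The function name `cut_blocks`, the `(left, top, right, bottom)` box convention with exclusive right and bottom edges, and the picture-as-rows representation are assumptions for illustration.

```python
def cut_blocks(picture, regions):
    """Cut a picture into named blocks using identification-region coordinates.

    picture: list of pixel rows (all rows the same length).
    regions: dict mapping a region name to (left, top, right, bottom) in the
             template's coordinate system; right and bottom are exclusive.
    """
    height = len(picture)
    width = len(picture[0]) if picture else 0
    blocks = {}
    for name, (left, top, right, bottom) in regions.items():
        # Clamp the box to the picture so a region touching the border still works.
        l, t = max(0, left), max(0, top)
        r, b = min(width, right), min(height, bottom)
        if l >= r or t >= b:
            raise ValueError(f"region {name!r} lies outside the picture")
        blocks[name] = [row[l:r] for row in picture[t:b]]
    return blocks
```

Each returned block would then be passed to the text recognition step.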
Further, the text extraction module 60 includes: a recognizable picture block determination unit, for determining that the current picture block is a recognizable picture block when the text information is identical to the text corresponding to the text segment information; and a text extraction unit, for determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at that position on the picture to be identified.
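The comparison and extraction logic can be sketched as below. This is an illustrative sketch under stated assumptions: `extract_text`, the dictionary layouts and the `read_value` callable (standing in for whatever recognizer reads the text at a given position) are hypothetical names that do not come from the source.

```python
def extract_text(block_texts, template_fields, read_value):
    """Extract text only for picture blocks whose text matches the template.

    block_texts: dict mapping region name to the text recognized in that block.
    template_fields: dict mapping region name to (expected_text, value_position),
                     where value_position is the position to be identified.
    read_value: callable taking a value_position and returning the text there.
    """
    extracted = {}
    for name, (expected_text, value_position) in template_fields.items():
        # A block is "recognizable" only when its text equals the template text.
        if block_texts.get(name) != expected_text:
            continue
        extracted[name] = read_value(value_position)
    return extracted
```

Blocks whose recognized text differs from the template are skipped rather than guessed, so the result is a structured mapping containing only verified fields.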
Further, the device also includes: a rectification module, for rectifying (straightening) the picture to be identified according to the reference points, so that the picture to be identified has the same shape and size as the template picture.
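The source does not state which geometric transformation the rectification uses. As one possible illustration, the sketch below fits a least-squares affine map from reference point positions found on the picture to be identified to their positions on the template picture. The affine model and the names `solve3`, `fit_affine` and `apply_affine` are assumptions; a photographed document with perspective distortion would need a perspective transform instead.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    a = [list(row) + [r] for row, r in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("reference points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine map sending src points (picture) to dst points (template)."""
    m = [[0.0] * 3 for _ in range(3)]
    rhs_u = [0.0] * 3
    rhs_v = [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += row[i] * row[j]  # accumulate the normal equations
            rhs_u[i] += row[i] * u
            rhs_v[i] += row[i] * v
    return solve3(m, rhs_u), solve3(m, rhs_v)


def apply_affine(params, point):
    """Map one picture coordinate into the template's coordinate system."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)
```

With four or more reference points the fit is over-determined, so small localization errors at individual points are averaged out rather than propagated exactly.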
Further, the device also includes: an image processing module, for performing image enhancement processing on the picture to be identified to adjust its lighting, contrast and exposure.
Fig. 9 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present invention. As shown in Fig. 9, an embodiment of the present invention provides an electronic device, including:
a memory 601, for storing program instructions;
a processor 602, for calling and executing the program instructions in the memory 601 to implement the character recognition method described in the foregoing embodiments.
In this embodiment, the processor 602 and the memory 601 may be connected by a bus or in other ways. The processor may be a general-purpose processor, such as a central processing unit, a digital signal processor or an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention. The memory may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as read-only memory, flash memory, a hard disk or a solid-state drive.
An embodiment of the present invention provides a storage medium. A computer program is stored in the readable storage medium, and when at least one processor of the character recognition device executes the computer program, the character recognition device performs the character recognition method described in the foregoing embodiments.
The readable storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Those skilled in the art can clearly understand that the technology in the embodiments of the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solutions in the embodiments of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention or in certain parts of the embodiments.
For identical or similar parts among the embodiments in this specification, the embodiments may be referred to one another. In particular, because the character recognition device embodiment is substantially similar to the method embodiment, it is described relatively simply; for relevant details, refer to the description in the method embodiment.
The embodiments of the present invention described above do not limit the scope of protection of the present invention.

Claims (14)

1. A character recognition method, characterized by comprising the following steps:
selecting a template picture corresponding to the features of a picture to be identified;
frame-selecting several reference points on the template picture;
cutting the picture to be identified according to the positions of identification regions on the template picture to obtain multiple picture blocks, where an identification region is a region on the template picture that has the same features as the picture to be identified;
recognizing the text information in the picture blocks;
comparing the text information with the text segment information in the identification region corresponding to a reference point;
extracting the text on the picture to be identified when the text information is identical to the text corresponding to the text segment information.
2. The method according to claim 1, characterized in that:
the reference point is located at a text segment that is common to the template picture and the picture to be identified and whose position is constant;
and the reference point is located at the edges and four corners of the template picture;
and the reference point is located at a text segment that occurs only once in the template picture;
and the number of reference points is greater than or equal to 4;
and the text corresponding to a single reference point is on the same line and adjacent.
3. The method according to claim 1, characterized in that the process of cutting the picture to be identified according to the positions of the identification regions on the template picture to obtain multiple picture blocks comprises:
establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, where the identification region contains text segment information that the template picture and the picture to be identified have in common;
cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, where each picture block contains the text segment information corresponding to one identification region.
4. The method according to claim 1, characterized in that the process of extracting the text on the picture to be identified when the text information is identical to the text corresponding to the text segment information comprises:
determining that the current picture block is a recognizable picture block when the text information is identical to the text corresponding to the text segment information;
determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at the position to be identified on the picture to be identified.
5. The method according to claim 1, characterized by further comprising:
rectifying the picture to be identified according to the reference points, so that the picture to be identified has the same shape and size as the template picture.
6. The method according to claim 1, characterized by further comprising:
performing image enhancement processing on the picture to be identified to adjust the lighting, contrast and exposure of the picture to be identified.
7. A character recognition device, characterized by comprising:
a template picture selection module, for selecting a template picture corresponding to the features of a picture to be identified;
a reference point frame-selection module, for frame-selecting several reference points on the template picture;
a cutting module, for cutting the picture to be identified according to the positions of identification regions on the template picture to obtain multiple picture blocks, where an identification region is a region on the template picture that has the same features as the picture to be identified;
a text recognition module, for recognizing the text information in the picture blocks;
an information comparison module, for comparing the text information with the text segment information in the identification region corresponding to a reference point;
a text extraction module, for extracting the text on the picture to be identified when the text information is identical to the text corresponding to the text segment information.
8. The device according to claim 7, characterized in that the reference points frame-selected by the reference point frame-selection module have the following features:
the reference point is located at a text segment that is common to the template picture and the picture to be identified and whose position is constant;
and the reference point is located at the edges and four corners of the template picture;
and the reference point is located at a text segment that occurs only once in the template picture;
and the number of reference points is greater than or equal to 4;
and the text corresponding to a single reference point is on the same line and adjacent.
9. The device according to claim 7, characterized in that the cutting module comprises:
a coordinate system establishing unit, for establishing a two-dimensional coordinate system on the template picture and determining the coordinates of each identification region, where the identification region contains text segment information that the template picture and the picture to be identified have in common;
a cutting unit, for cutting the picture to be identified according to the coordinate position of each identification region to obtain multiple picture blocks, where each picture block contains the text segment information corresponding to one identification region.
10. The device according to claim 7, characterized in that the text extraction module comprises:
a recognizable picture block determination unit, for determining that the current picture block is a recognizable picture block when the text information is identical to the text corresponding to the text segment information;
a text extraction unit, for determining the corresponding position to be identified according to the position of the recognizable picture block, and extracting the text at the position to be identified on the picture to be identified.
11. The device according to claim 7, characterized by further comprising:
a rectification module, for rectifying the picture to be identified according to the reference points, so that the picture to be identified has the same shape and size as the template picture.
12. The device according to claim 7, characterized by further comprising:
an image processing module, for performing image enhancement processing on the picture to be identified to adjust the lighting, contrast and exposure of the picture to be identified.
13. An electronic device, characterized by comprising:
a memory, for storing program instructions;
a processor, for calling and executing the program instructions in the memory to implement the character recognition method according to any one of claims 1 to 6.
14. A storage medium, characterized in that a computer program is stored in the readable storage medium, and when at least one processor of a character recognition device executes the computer program, the character recognition device performs the character recognition method according to any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910356344.9A CN110263616A (en) 2019-04-29 2019-04-29 A kind of character recognition method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910356344.9A CN110263616A (en) 2019-04-29 2019-04-29 A kind of character recognition method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110263616A true CN110263616A (en) 2019-09-20

Family

ID=67914084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910356344.9A Pending CN110263616A (en) 2019-04-29 2019-04-29 A kind of character recognition method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110263616A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009546A (en) * 2016-10-28 2018-05-08 北京京东尚科信息技术有限公司 information identifying method and device
CN106845469A (en) * 2017-01-24 2017-06-13 深圳怡化电脑股份有限公司 A kind of Paper Currency Identification and device
CN108229463A (en) * 2018-02-07 2018-06-29 众安信息技术服务有限公司 Character recognition method based on image
CN108874283A (en) * 2018-05-29 2018-11-23 努比亚技术有限公司 Image identification method, mobile terminal and computer readable storage medium
CN109145904A (en) * 2018-08-24 2019-01-04 讯飞智元信息科技有限公司 A kind of character identifying method and device
CN109658584A (en) * 2018-12-14 2019-04-19 泰康保险集团股份有限公司 A kind of bill bank slip recognition method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178365A (en) * 2019-12-31 2020-05-19 五八有限公司 Picture character recognition method and device, electronic equipment and storage medium
CN111444792A (en) * 2020-03-13 2020-07-24 安诚迈科(北京)信息技术有限公司 Bill recognition method, electronic device, storage medium and device
CN111444792B (en) * 2020-03-13 2023-05-09 安诚迈科(北京)信息技术有限公司 Bill identification method, electronic equipment, storage medium and device
WO2021184578A1 (en) * 2020-03-17 2021-09-23 平安科技(深圳)有限公司 Ocr-based target field recognition method and apparatus, electronic device, and storage medium
CN111931784A (en) * 2020-09-17 2020-11-13 深圳壹账通智能科技有限公司 Bill recognition method, system, computer device and computer-readable storage medium
WO2022057471A1 (en) * 2020-09-17 2022-03-24 深圳壹账通智能科技有限公司 Bill identification method, system, computer device, and computer-readable storage medium
CN112200185A (en) * 2020-10-10 2021-01-08 航天科工智慧产业发展有限公司 Method and device for reversely positioning picture by characters and computer storage medium
CN112580499A (en) * 2020-12-17 2021-03-30 上海眼控科技股份有限公司 Text recognition method, device, equipment and storage medium
CN114187604A (en) * 2022-02-14 2022-03-15 山东信通电子股份有限公司 Integrity verification method, equipment and medium for WebP picture

Similar Documents

Publication Publication Date Title
CN110263616A (en) A kind of character recognition method, device, electronic equipment and storage medium
CN109657665A (en) A kind of invoice batch automatic recognition system based on deep learning
CN110008944A (en) OCR recognition methods and device, storage medium based on template matching
CN107944452A (en) A kind of circular stamp character recognition method
WO2021017272A1 (en) Pathology image annotation method and device, computer apparatus, and storage medium
CN110298353B (en) Character recognition method and system
CN112528998B (en) Certificate image processing method and device, electronic equipment and readable storage medium
CN111259891B (en) Method, device, equipment and medium for identifying identity card in natural scene
CN108255555A (en) A kind of system language switching method and terminal device
CN108154132A (en) A kind of identity card text extraction method, system and equipment and storage medium
CN113111880B (en) Certificate image correction method, device, electronic equipment and storage medium
CN108734849B (en) Automatic invoice true-checking method and system
CN108648189B (en) Image blur detection method and device, computing equipment and readable storage medium
CN108304562B (en) Question searching method and device and intelligent terminal
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN110135288B (en) Method and device for quickly checking electronic certificate
CN108197624A (en) The recognition methods of certificate image rectification and device, computer storage media
US11176363B2 (en) System and method of training a classifier for determining the category of a document
RU2672395C1 (en) Method for training a classifier designed for determining the category of a document
CN106557733A (en) Information processor and information processing method
CN111178365A (en) Picture character recognition method and device, electronic equipment and storage medium
CN111462388A (en) Bill inspection method and device, terminal equipment and storage medium
CN113033562A (en) Image processing method, device, equipment and storage medium
CN115410191A (en) Text image recognition method, device, equipment and storage medium
CN115083024A (en) Signature identification method, device, medium and equipment based on region division

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination