CN110147785A - Image recognition method, related apparatus and device - Google Patents
- Publication number: CN110147785A
- Authority
- CN
- China
- Prior art keywords
- stroke
- information
- pixel
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration by the use of local operators
- G06T5/30—Erosion or dilatation, e.g. thinning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/293—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of characters other than Kanji, Hiragana or Katakana
Abstract
The invention discloses an image recognition method, comprising: performing binarization on an image to obtain a binary image, the image including multiple characters; performing skeleton extraction on the binary image to extract the skeleton information of the multiple characters; extracting stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points; and analyzing the stroke information with a sequence recognition engine based on a deep learning network to recognize the multiple characters and the positional relationship information between them. The invention also discloses an image recognition apparatus and device. No hand-crafted features are required, and no character segmentation is needed, which solves the technical problem in the prior art that segmentation algorithms cannot handle touching characters well, resulting in low recognition accuracy.
Description
Technical field
The present invention relates to the field of computers, and more particularly to an image recognition method, related apparatus and device.
Background technique
Optical Character Recognition (OCR) refers to the process by which an electronic device (such as a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and bright, and then translates those shapes into computer text using a character recognition method. The error rate, or recognition accuracy, is an important indicator for measuring OCR performance.
At present, OCR recognition of mathematical characters is applied very widely; in many situations it can replace keyboard input and complete high-speed text entry. For example, using OCR to recognize and enter printed manuscripts is one of the common practices in many offices; complex mixed layouts containing figures, images and text can be automatically segmented and the printed text recognized; mail sorting systems are realized through the recognition of handwritten digits; and automatic entry of handwritten form data can be widely applied to the input and processing of form data such as insurance policies and application forms in industries such as government, taxation, insurance, commerce, medical care, finance, and mining.
In the prior art, to recognize the characters in an image, and especially to recognize a mathematical formula, the image is usually first binarized, then character segmentation is performed to cut out individual mathematical characters, the features of the mathematical characters are extracted, and finally a mathematical expression is derived from the positional relationships between characters using a stochastic context-free grammar to generate the mathematical formula. As a result, for touching characters the segmentation algorithms of the prior art cannot work well, leading to low recognition accuracy.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide an image recognition method, an image recognition apparatus, an image recognition device and a computer-readable storage medium, so as to solve the technical problem in the prior art that segmentation algorithms cannot handle touching characters well, resulting in low recognition accuracy.
In order to solve the above technical problem, one aspect of the embodiments of the present invention discloses an image recognition method, comprising:
performing binarization on an image to obtain a binary image, the image including multiple characters;
performing skeleton extraction on the binary image to extract the skeleton information of the multiple characters;
extracting stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points;
analyzing the stroke information with a sequence recognition engine based on a deep learning network to recognize the multiple characters and the positional relationship information between them.
With reference to the above image recognition method, performing skeleton extraction on the binary image comprises:
iteratively performing erosion on the binary image until no new pixel is eroded compared with the binary image after the previous erosion, wherein each erosion iteration comprises traversing the pixels in the binary image in turn and eroding the pixels that satisfy a specified condition.
With reference to the above image recognition method, the pixels satisfying the specified condition include target pixels satisfying any of the following conditions:
the number of pixels whose binary value is 1 among the 8 neighboring pixels around the target pixel is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold;
checking the 8 neighboring pixels around the target pixel in clockwise order, the number of adjacent pixel pairs whose binary sequence is 01 is equal to a third threshold;
among the 4 nearest neighboring pixels, there is at least one pixel whose binary value is 0, where the distance is the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel.
With reference to the above image recognition method, analyzing the stroke information with the sequence recognition engine based on the deep learning network to recognize the multiple characters and the positional relationship information between them comprises:
performing feature extraction on the stroke information through a Convolutional Neural Network (CNN);
inputting the extracted features into a Long Short-Term Memory network (LSTM) for character recognition, recognizing the multiple characters and the positional relationship information between them.
With reference to the above image recognition method, the Long Short-Term Memory network LSTM is a bidirectional LSTM.
With reference to the above image recognition method, performing binarization on the image comprises:
performing binarization on the image using the Maximally Stable Extremal Regions (MSER) algorithm.
With reference to the above image recognition method, the multiple characters form a mathematical expression; and after recognizing the multiple characters and the positional relationship information between them, the method further comprises: outputting a LaTeX expression according to the recognized characters.
With reference to the above image recognition method, extracting the stroke information from the skeleton information comprises:
traversing the skeleton information by connected domain and extracting the stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is preferentially extracted.
Another aspect of the embodiments of the present invention discloses an image recognition apparatus, comprising:
a processing unit, configured to perform binarization on an image to obtain a binary image, the image including multiple characters;
an extraction unit, configured to perform skeleton extraction on the binary image to extract the skeleton information of the multiple characters;
an information extraction unit, configured to extract stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points;
a recognition unit, configured to analyze the stroke information with a sequence recognition engine based on a deep learning network to recognize the multiple characters and the positional relationship information between them.
With reference to the above image recognition apparatus, the extraction unit is specifically configured to iteratively perform erosion on the binary image until no new pixel is eroded compared with the binary image after the previous erosion, wherein each erosion iteration comprises traversing the pixels in the binary image in turn and eroding the pixels that satisfy a specified condition.
With reference to the above image recognition apparatus, the pixels satisfying the specified condition include target pixels satisfying any of the following conditions:
the number of pixels whose binary value is 1 among the 8 neighboring pixels around the target pixel is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold;
checking the 8 neighboring pixels around the target pixel in clockwise order, the number of adjacent pixel pairs whose binary sequence is 01 is equal to a third threshold;
among the nearest neighboring pixels, there is at least one pixel whose binary value is 0, where the distance is the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel.
With reference to the above image recognition apparatus, the recognition unit comprises:
a feature extraction unit, configured to perform feature extraction on the stroke information through the convolutional neural network CNN;
a character recognition unit, configured to input the extracted features into the Long Short-Term Memory network LSTM for character recognition, recognizing the multiple characters and the positional relationship information between them.
With reference to the above image recognition apparatus, the multiple characters form a mathematical expression; and the recognition unit outputting the recognized characters comprises: outputting a LaTeX expression according to the recognized characters.
With reference to the above image recognition apparatus, the information extraction unit is specifically configured to traverse the skeleton information by connected domain and extract the stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is preferentially extracted.
Another aspect of the embodiments of the present invention discloses an image recognition device, comprising a processor and a memory connected to each other, wherein the memory is configured to store application program code, and the processor is configured to call the program code to execute the above image recognition method.
Another aspect of the embodiments of the present invention discloses a computer-readable storage medium storing a computer program, the computer program comprising program instructions which, when executed by a processor, cause the processor to execute the above image recognition method.
By implementing the embodiments of the present invention, skeleton extraction is performed on the binary image to extract the skeleton information of the multiple characters, stroke information is then extracted from the skeleton information, and the stroke information is passed through the sequence recognition engine based on the deep learning network to recognize the multiple characters and the positional relationship information between them. No hand-crafted features are required, and no character segmentation is needed, which solves the technical problem in the prior art that segmentation algorithms cannot handle touching characters well, resulting in low recognition accuracy. In particular, the embodiments of the present invention recognize numerical characters through a sequence-based deep learning recognition model: the features extracted by the CNN are input into the bidirectional LSTM network, which directly outputs a LaTeX expression. There is no need to segment the characters in the image, nor to analyze the spatial positional relationships between characters; all of this information is learned by the deep learning recognition model, realizing end-to-end recognition. The embodiments of the present invention are therefore adaptable to a variety of complex scenes, and recognition accuracy is greatly improved.
Detailed description of the invention
In order to illustrate the technical solutions in the embodiments of the present invention or in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below.
Fig. 1 is a schematic flowchart of an image recognition method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of an input image provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a binary image provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of image skeleton extraction provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of pixels provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of pixels provided by another embodiment of the present invention;
Fig. 7 is a schematic diagram of example pixel structures provided by another embodiment of the present invention;
Fig. 8 is a schematic diagram of image skeleton extraction provided by another embodiment of the present invention;
Fig. 9a is a schematic diagram of stroke information provided by an embodiment of the present invention;
Fig. 9b is a schematic diagram of stroke information provided by another embodiment of the present invention;
Fig. 10 is a schematic diagram of the principle of a sequence recognition engine provided by an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of an LSTM network provided by an embodiment of the present invention;
Fig. 12 is a schematic diagram of the principle of a sequence recognition engine provided by another embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a bidirectional LSTM network provided by an embodiment of the present invention;
Fig. 14 is a schematic structural diagram of an image recognition apparatus provided by an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of a recognition unit provided by an embodiment of the present invention;
Fig. 16 is a schematic structural diagram of an image recognition device provided by an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
It should also be understood that the terms used in this description of the invention are for the purpose of describing specific embodiments only and are not intended to limit the present invention.
It should be further understood that the term "and/or" used in the description of the invention and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
In a specific implementation, the terminals or devices described in the embodiments of the present invention include, but are not limited to, desktop computers, portable mobile terminals such as laptop computers and tablet computers, and intelligent terminals such as smartphones, smartwatches and smart glasses.
In order to better understand the image recognition method, image recognition apparatus and image recognition device provided by the embodiments of the present invention, the image recognition scenario of the embodiments of the present invention is first described below. Image recognition in the embodiments of the present invention is the process in which, after the image recognition apparatus or device acquires an image to be recognized, where the image contains multiple characters such as a mathematical formula, the characters in the image are recognized and output. The output characters facilitate information entry by relevant personnel, sorting by a mail system, or matching of relevant information in subsequent searches.
The image recognition method, image recognition apparatus and image recognition device provided by the embodiments of the present invention are described in detail below with reference to the drawings. The schematic flowchart of the image recognition method provided by an embodiment of the present invention shown in Fig. 1 may include the following steps:
Step S100: performing binarization on an image to obtain a binary image.
Specifically, the image in the embodiments of the present invention may include multiple characters. Image binarization is the process of setting the gray value of each pixel in the image to 0 or 255 to obtain a binary image, that is, making the whole image present an obvious black-and-white effect. In the embodiments of the present invention, a pixel whose gray value after binarization is 0 may be represented by the binary value 0, and a pixel whose gray value is 255 by the binary value 1.
In one embodiment of the present invention, the binarization method may use the Maximally Stable Extremal Regions (MSER) algorithm, an affine-invariant region detector with good performance, to extract connected regions, filter out regions that are too small, too large or of abnormal aspect ratio, and output the binary image. With specific reference to the schematic diagram of the input image provided by an embodiment of the present invention shown in Fig. 2, the image in Fig. 2 contains multiple characters which together form a mathematical expression; after binarization is performed on the image through step S100, the schematic diagram of the binary image provided by an embodiment of the present invention shown in Fig. 3 is obtained, and the output image presents an obvious black-and-white effect.
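As a minimal sketch of the binarization step, a fixed gray threshold can be used in place of the MSER algorithm named above (which in practice would come from a library such as OpenCV); the threshold value of 128 and the 4*6 sample patch are illustrative assumptions, not values from the text:

```python
import numpy as np

def binarize(gray, threshold=128):
    # Pixels darker than the threshold are treated as character (foreground)
    # pixels and mapped to the binary value 1 (displayed as gray value 255);
    # all other pixels are mapped to the binary value 0 (gray value 0),
    # following the convention described in this embodiment.
    return (gray < threshold).astype(np.uint8)

# Hypothetical 4x6 grayscale patch: a dark stroke on a bright background.
patch = np.full((4, 6), 230, dtype=np.uint8)
patch[1:3, 1:5] = 20  # the "stroke"

binary = binarize(patch)
print(binary)
```

The resulting array contains 1 exactly where the dark stroke was, giving the black-and-white effect described above.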
Step S102: performing skeleton extraction on the binary image to extract the skeleton information of the multiple characters.
Specifically, as shown in the schematic diagram of image skeleton extraction provided by an embodiment of the present invention in Fig. 4, image skeleton extraction is to extract the central pixel contour of the target in the image, that is, to thin the target with respect to its center. Skeleton extraction algorithms can be divided into two major classes, iterative and non-iterative; iterative algorithms are further divided into parallel iteration and sequential iteration, and so on.
In one embodiment of the present invention, erosion may be performed iteratively on the binary image until no new pixel is eroded compared with the binary image after the previous erosion, wherein each erosion iteration comprises traversing the pixels in the binary image in turn and eroding the pixels that satisfy a specified condition.
It should be noted that erosion in the embodiments of the present invention may refer, in morphology, to removing certain parts of the image, and specifically to deleting certain pixels on the object boundary. Eroding the binary image may thus refer to deleting pixels whose binary value is 1, that is, turning pixels whose binary value is 1 into pixels whose binary value is 0.
Specifically, the specified condition may be set in the embodiments of the present invention according to the desired degree of thinning. For example, the pixels satisfying the specified condition may include target pixels satisfying any of the following conditions:
Condition a: the number of pixels whose binary value is 1 among the 8 neighboring pixels around the target pixel is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold. Specifically, reference may be made to the following formula 1:
first threshold ≤ B(P1) ≤ second threshold (formula 1)
With reference to the schematic structural diagram of pixels provided by an embodiment of the present invention shown in Fig. 5, P1 is the target pixel to be judged for erosion (deletion), and the 8 neighboring pixels around P1 are labeled P2, P3, P4, P5, P6, P7, P8 and P9. Taking binary pixel values of 0 or 1 as an example, B(P1) is the number of pixels whose binary value is 1 among the 8 neighbors of the central pixel P1 (the target pixel), that is, B(P1) = P2 + P3 + P4 + P5 + P6 + P7 + P8 + P9. In one embodiment, the first threshold may be 2 and the second threshold may be 6.
Condition b: checking the 8 neighboring pixels around the target pixel in clockwise order, the number of adjacent pixel pairs whose binary sequence is 01 is equal to a third threshold. Specifically, reference may be made to the following formula 2:
A(P1) = third threshold (formula 2)
With reference to the schematic structural diagram of pixels provided by another embodiment of the present invention shown in Fig. 6, the clockwise direction is from P3 to P4 to P5 to P6, and so on until returning to P3 from P2. A(P1) is the number of adjacent pixel pairs whose binary sequence is 01 when the 8 neighboring pixels around the target pixel are checked in clockwise order.
In one embodiment, the third threshold may be 1. Taking the schematic diagram of example pixel structures provided by another embodiment of the present invention shown in Fig. 7 as an example: in the example on the left, the number of adjacent pixel pairs whose binary sequence is 01 is 2, since the sequence 01 appears from P2 to P3 and again from P6 to P7, so condition b is not satisfied; in the example on the right, the number of adjacent pixel pairs whose binary sequence is 01 is 1, since the sequence 01 appears only from P9 to P2, so condition b is satisfied and the point P1 is eroded.
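The count A(P1) described above can be sketched as a short function. The two neighborhoods below are hypothetical values constructed to match the left and right cases discussed for Fig. 7 (the figure itself is not reproduced here):

```python
def crossing_number(neighbors):
    # neighbors lists the binary values of P2..P9 in clockwise order;
    # count the adjacent pairs (including the wrap-around pair P9 -> P2)
    # whose sequence is 0 followed by 1.
    return sum(
        1
        for a, b in zip(neighbors, neighbors[1:] + neighbors[:1])
        if (a, b) == (0, 1)
    )

# Left-hand case: 01 occurs from P2 to P3 and from P6 to P7, so A(P1) = 2.
left = [0, 1, 1, 0, 0, 1, 1, 1]   # P2..P9 (hypothetical values)
# Right-hand case: 01 occurs only in the wrap-around P9 to P2, so A(P1) = 1.
right = [1, 1, 1, 1, 1, 1, 1, 0]  # P2..P9 (hypothetical values)

print(crossing_number(left), crossing_number(right))
```

With the third threshold set to 1, only the right-hand neighborhood satisfies condition b.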
Condition c: among the 4 nearest neighboring pixels, there is at least one pixel whose binary value is 0, where the distance is the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel. Specifically, reference may be made to the following formula 3:
P2 * P4 * P6 * P8 = 0 (formula 3)
With reference to the schematic structural diagram of pixels provided by an embodiment of the present invention shown in Fig. 5 above, taking P1 as the target pixel, the nearest neighboring pixels of P1 are P2, P4, P6 and P8; that is, the distances from the centers of P2, P4, P6 and P8 to the center of P1 are smaller than the distances from the centers of P3, P5, P7 and P9 to the center of P1. In the ideal case, the distances from the centers of P2, P4, P6 and P8 to the center of P1 are all equal, and all four are nearest neighbors; condition c of the embodiments of the present invention can thus also be stated as: among the nearest neighboring pixels there is at least one pixel whose binary value is 0. For example, if the binary value of P2 is 0, condition c is satisfied and the point P1 is eroded; if none of the binary values of P2, P4, P6 and P8 is 0, condition c is not satisfied.
Further, when the current iteration is an odd-numbered iteration, it may be judged whether P2 * P4 * P6 = 0 or P4 * P6 * P8 = 0 holds; if so, condition c is satisfied and the point P1 is eroded. When the current iteration is an even-numbered iteration, it may be judged whether P2 * P4 * P8 = 0 or P2 * P6 * P8 = 0 holds; if so, condition c is satisfied and the point P1 is eroded.
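The iteration scheme above closely resembles the classic Zhang-Suen thinning algorithm, in which a pixel is deleted only when all the neighborhood conditions hold together (2 ≤ B(P1) ≤ 6, A(P1) = 1, and the odd/even subiteration products being zero; note the classic formulation requires both products of each subiteration to vanish, which is the reading used here). A minimal sketch under those assumptions:

```python
import numpy as np

def neighbors(img, r, c):
    # P2..P9 in clockwise order, starting from the pixel directly above P1.
    return [img[r-1, c], img[r-1, c+1], img[r, c+1], img[r+1, c+1],
            img[r+1, c], img[r+1, c-1], img[r, c-1], img[r-1, c-1]]

def crossing_number(n):
    # A(P1): number of 0 -> 1 transitions in the clockwise cycle P2..P9, P2.
    return sum((a, b) == (0, 1) for a, b in zip(n, n[1:] + n[:1]))

def thin(binary):
    """Zhang-Suen-style thinning of a 0/1 image; character pixels are 1."""
    img = np.pad(np.asarray(binary, dtype=np.uint8), 1)
    while True:
        eroded_this_pass = 0
        for odd in (True, False):  # odd- and even-numbered subiterations
            to_erode = []
            for r in range(1, img.shape[0] - 1):
                for c in range(1, img.shape[1] - 1):
                    if img[r, c] != 1:
                        continue
                    n = neighbors(img, r, c)
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if not (2 <= sum(n) <= 6 and crossing_number(n) == 1):
                        continue
                    if odd:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if ok:
                        to_erode.append((r, c))
            for r, c in to_erode:
                img[r, c] = 0  # erode: binary value 1 becomes 0
            eroded_this_pass += len(to_erode)
        # Stop when no new pixel is eroded compared with the previous pass.
        if eroded_this_pass == 0:
            return img[1:-1, 1:-1]
```

Applied to a thick bar of 1-pixels, this reduces it to a roughly one-pixel-wide skeleton while leaving endpoints (B(P1) < 2) and interior points (B(P1) > 6) untouched within each pass.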
Taking the binary image shown in Fig. 3 as an example, skeleton extraction is performed through step S102 to extract the skeleton information of the multiple characters; the resulting effect can be seen in the schematic diagram of image skeleton extraction provided by another embodiment of the present invention shown in Fig. 8. Thinning of the character image is achieved through repeated iterations of dilation and erosion, so that the targets in the image become thinner and thinner.
Step S104: extracting stroke information from the skeleton information.
Specifically, the embodiments of the present invention extract stroke information from the skeleton information through a stroke extraction algorithm, as shown in the schematic diagram of stroke information provided by an embodiment of the present invention in Fig. 9a. The stroke information in the embodiments of the present invention may include the number of stroke feature points and the positional information between adjacent stroke feature points. In Fig. 9a, each point is a stroke feature point, and positional relationships exist between adjacent stroke feature points; for example, the positional relationship from stroke feature point a to the adjacent stroke feature point b in Fig. 9a may be expressed by vector information indicating the direction angle from a to b.
In one embodiment, extracting the stroke information from the skeleton information may comprise traversing the skeleton information by connected domain and extracting the stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is preferentially extracted. The connected domain in the embodiments of the present invention may refer to the connected region of stroke feature points. A stroke bifurcation in the embodiments of the present invention may occur when, while traversing the stroke feature points along some direction starting from some stroke feature point, there are multiple next connected stroke feature points. The direction angle in the embodiments of the present invention refers to the angle between the current stroke feature point and the previously connected stroke feature point, specifically the angle between the traversal direction of the previously connected stroke feature point and the direction of the current stroke feature point. Specifically, as shown in the schematic diagram of stroke information provided by another embodiment of the present invention in Fig. 9b, which is an enlarged view of the stroke information at x in Fig. 9a: starting from stroke feature point c, the next stroke feature point d is traversed along the connected domain; at stroke feature point e a bifurcation appears, branching into stroke feature points f, g and h. The stroke feature point f, whose direction angle is 0 degrees, is traversed first; the stroke feature point g, whose direction angle is 90 degrees, is traversed next; and the stroke feature point h, whose direction angle is 270 degrees, is traversed last.
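The branch-ordering rule described for Fig. 9b can be sketched as follows. The coordinates, the point names, and the convention that the direction angle is measured counterclockwise from the incoming traversal direction and normalized to [0, 360) are assumptions made for illustration; the patent's figures are not reproduced here:

```python
import math

def direction_angle(prev_pt, cur_pt, next_pt):
    # Angle of the outgoing segment, measured from the incoming traversal
    # direction and normalized to [0, 360) degrees (assumed convention).
    incoming = math.atan2(cur_pt[1] - prev_pt[1], cur_pt[0] - prev_pt[0])
    outgoing = math.atan2(next_pt[1] - cur_pt[1], next_pt[0] - cur_pt[0])
    return math.degrees(outgoing - incoming) % 360.0

def order_branches(prev_pt, cur_pt, branches):
    # At a bifurcation, traverse the branch with the smaller direction
    # angle first, as in the Fig. 9b example (f at 0, g at 90, h at 270).
    return sorted(branches,
                  key=lambda kv: direction_angle(prev_pt, cur_pt, kv[1]))

# Hypothetical skeleton: traversal arrives at e from d, then forks to f, g, h.
d, e = (-1, 0), (0, 0)
branches = {"f": (1, 0), "g": (0, 1), "h": (0, -1)}
order = [name for name, _ in order_branches(d, e, branches.items())]
print(order)
```

With these coordinates, f lies at 0 degrees, g at 90 degrees and h at 270 degrees relative to the incoming direction, reproducing the traversal order f, g, h described above.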
Step S106: analyzing the stroke information with the sequence recognition engine based on the deep learning network to recognize the multiple characters and the positional relationship information between them.
The sequence recognition engine of the embodiments of the present invention may use a deep learning network based on a Long Short-Term Memory network (LSTM). Specifically, after the stroke information obtained in step S104 is input, the network may extract features through a Convolutional Neural Network (CNN), then input the extracted features into the LSTM network to complete the recognition of the multiple characters and the positional relationship information between them, and finally output the recognized characters.
Reference may be made to Fig. 10, a schematic diagram of the principle of the sequence recognition engine provided in an embodiment of the present invention. The input stroke information includes the number of stroke feature points and the positional information between adjacent stroke feature points, and features are extracted by the CNN network 10: two 3*3 convolutional layers with 64 channels followed by a pooling layer, then two 3*3 convolutional layers with 128 channels followed by a pooling layer, then two 3*3 convolutional layers with 256 channels followed by a pooling layer, and finally two 3*3 convolutional layers with 512 channels followed by a pooling layer, which outputs the extracted features. This embodiment of the present invention is not limited to the 3*3 convolutions in Fig. 10; 5*5 convolutions and the like may also be used. The extracted features can be divided into stroke information of multiple time-step units, which are then input in sequence into the LSTM network to complete the identification of the multiple characters and the positional relationship information between the characters, and the multiple characters identified are finally output. For the structure of the LSTM network, reference may be made to Fig. 11, a structural schematic diagram of the LSTM network provided in an embodiment of the present invention. Taking the image in Fig. 2 as an example, stroke information of 11 time-step units can be extracted from the CNN network; the stroke information of each time-step unit passes, in temporal order, through carefully designed structures called "gates" that remove information from or add information to the cell state, and the multiple characters identified can finally be output.
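As a rough check on the downsampling arithmetic just described, the sketch below tracks the feature-map shape through the four conv/pool stages. The 'same' padding, the 2*2 stride-2 pooling, and the 32*176 input size are all assumptions (the passage does not state them); the width was chosen so that it halves down to the 11 time-step units mentioned for the Fig. 2 example.

```python
def cnn_output_shape(h, w, stages=((64, 2), (128, 2), (256, 2), (512, 2))):
    """Spatial size and channel count after the Fig. 10 conv/pool stack:
    each stage applies two 3*3 convolutions (assumed 'same'-padded) that set
    the channel count, then one pooling layer (assumed 2*2, stride 2)."""
    channels = 1
    for ch, _num_convs in stages:
        channels = ch          # the convolutions set the channel count
        h, w = h // 2, w // 2  # the pooling layer halves each spatial dim
    return h, w, channels

# Hypothetical 32*176 input: the width halves four times, 176 -> 88 -> 44
# -> 22 -> 11, matching the 11 time-step units of the Fig. 2 example.
print(cnn_output_shape(32, 176))  # -> (2, 11, 512)
```

Each of the 11 resulting columns then serves as the stroke information of one time-step unit fed into the LSTM.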
By implementing this embodiment of the present invention, skeleton extraction is performed on the binary image to extract the skeleton information of multiple characters, stroke information is then extracted from the skeleton information, and the stroke information is passed through the sequence recognition engine based on a deep learning network to identify the multiple characters and the positional relationship information between the characters. No manually designed features are required, and no character segmentation is needed, which solves the prior-art problem that segmentation algorithms cannot properly handle characters that are stuck together, resulting in low recognition accuracy.
Still further, as shown in Fig. 12, a schematic diagram of the principle of the sequence recognition engine according to another embodiment of the present invention, the LSTM in step S106 of this embodiment of the present invention may be a bidirectional LSTM; for its structure, reference may be made to Fig. 13, a structural schematic diagram of the bidirectional LSTM network provided in an embodiment of the present invention. Again taking the image in Fig. 2 as an example, stroke information of 11 time-step units can be extracted from the CNN network; the stroke information of each time-step unit passes, in temporal order, through the carefully designed "gate" structures that remove information from or add information to the cell state, and the multiple characters identified can finally be output.
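The "gate" structures referred to above can be illustrated with a single scalar LSTM step. The weights and inputs below are arbitrary illustrative values, not anything from the patent; a real implementation would use learned weight matrices over vector states.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    """One LSTM time step with scalar state. The forget gate removes
    information from the cell state, the input gate adds information, and
    the output gate exposes part of it -- the gate structures of Fig. 11."""
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])    # input gate
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate values
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])    # output gate
    c = f * c + i * g    # remove, then add, information in the cell state
    h = o * math.tanh(c) # new hidden state
    return h, c

# Toy run over 11 time-step units of stroke features (arbitrary values).
w = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                      "wg", "ug", "bg", "wo", "uo", "bo")}
h = c = 0.0
for x in [0.1] * 11:
    h, c = lstm_step(x, h, c, w)
```

A bidirectional LSTM as in Fig. 13 would run a second such recurrence over the same 11 time-step units in reverse order and combine the forward and backward hidden states before decoding.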
In one of the embodiments, the multiple characters in this embodiment of the present invention may include a mathematical expression, and outputting the multiple characters identified may include: outputting a LaTeX expression according to the multiple characters identified. This embodiment of the present invention performs numerical character recognition with a timing-based deep learning recognition model: the features extracted by the CNN are input into the bidirectional LSTM network, which can directly output a LaTeX expression. There is no need to segment the characters in the image, nor to analyze the spatial positional relationship between characters; all of this information is learned by the deep learning recognition model, i.e., end-to-end recognition is achieved. Therefore, this embodiment of the present invention adapts to various complex scenes, and the recognition accuracy is greatly improved.
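The end-to-end flow of the method can be summarized as a sketch in which every stage is a trivial hypothetical stub standing in for the corresponding step; none of the stub bodies (including the fixed LaTeX output) reflect the patented implementation.

```python
def binarize(image):        # binarization, e.g. via MSER (claim 5); stub threshold
    return [[1 if px > 127 else 0 for px in row] for row in image]

def skeletonize(binary):    # skeleton extraction by iterative erosion (stub)
    return binary

def extract_strokes(skel):  # stroke feature points and their positions (stub)
    return [(r, c) for r, row in enumerate(skel) for c, v in enumerate(row) if v]

def cnn_features(strokes):  # CNN feature extraction, Fig. 10 (stub)
    return [len(strokes)]

def bilstm_decode(feats):   # bidirectional LSTM decoding (stub output)
    return r"\frac{1}{2}"

def recognize_image(image):
    """Hypothetical end-to-end pipeline mirroring steps S101-S106."""
    return bilstm_decode(cnn_features(extract_strokes(skeletonize(binarize(image)))))

print(recognize_image([[0, 200], [180, 0]]))  # -> \frac{1}{2}
```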
To facilitate better implementation of the above solutions of the embodiments of the present invention, the present invention correspondingly provides an image recognition apparatus, which is described in detail below with reference to the accompanying drawings:
As shown in Fig. 14, a structural schematic diagram of the image recognition apparatus provided in an embodiment of the present invention, the image recognition apparatus 14 may include: a processing unit 140, an extraction unit 142, an information extraction unit 144, and a recognition unit 146, wherein
the processing unit 140 is configured to perform binarization on an image to obtain a binary image, the image including multiple characters;
the extraction unit 142 is configured to perform skeleton extraction on the binary image to extract the skeleton information of the multiple characters;
the information extraction unit 144 is configured to extract stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points; and
the recognition unit 146 is configured to analyze the stroke information with the sequence recognition engine based on a deep learning network, identify the multiple characters and the positional relationship information between the characters, and output the multiple characters identified.
The extraction unit 142 is specifically configured to perform iterative erosion on the binary image until no new pixel of the binary image is eroded compared with the previous erosion; each erosion iteration includes traversing the pixels in the binary image in turn and eroding the pixels that meet a specified condition.
In this embodiment of the present invention, a pixel that meets the specified condition may include a target pixel that meets any one of the following conditions:
Condition a: among the 8 neighboring pixels around the target pixel, the number of pixels whose binary value is 1 is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold;
Condition b: checking the 8 neighboring pixels around the target pixel in a clockwise direction, the number of adjacent pixel pairs whose binary values form the sequence 01 is equal to a third threshold;
Condition c: among the 4 neighboring pixels at the smallest distance, there is at least one pixel whose binary value is 0; the distance includes the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel.
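The three conditions above read like the classic Zhang-Suen thinning criteria, with the thresholds left symbolic. The sketch below is therefore an assumption-laden illustration, not the patented implementation: it fixes the first, second, and third thresholds at 2, 6, and 1, interprets the "nearest 4 neighbors" as the 4-connected neighbors, and requires all three conditions together (as thinning algorithms typically do), even though the passage reads "any one".

```python
def neighbors8(img, r, c):
    """The 8 neighbors of (r, c), clockwise, starting from the pixel above."""
    return [img[r-1][c], img[r-1][c+1], img[r][c+1], img[r+1][c+1],
            img[r+1][c], img[r+1][c-1], img[r][c-1], img[r-1][c-1]]

def meets_condition(img, r, c, t1=2, t2=6, t3=1):
    """Erosion test for a foreground pixel. t1/t2/t3 stand in for the
    first/second/third thresholds (values assumed, not from the patent)."""
    n = neighbors8(img, r, c)
    b = sum(n)                                    # condition a: 1-valued neighbors
    a = sum(1 for i in range(8)                   # condition b: clockwise 01 pairs
            if n[i] == 0 and n[(i + 1) % 8] == 1)
    four = [img[r-1][c], img[r][c+1], img[r+1][c], img[r][c-1]]  # condition c
    return t1 <= b <= t2 and a == t3 and 0 in four

def skeletonize(img):
    """Iterate until an erosion pass removes no new pixel."""
    changed = True
    while changed:
        changed = False
        to_erase = [(r, c)
                    for r in range(1, len(img) - 1)
                    for c in range(1, len(img[0]) - 1)
                    if img[r][c] == 1 and meets_condition(img, r, c)]
        for r, c in to_erase:
            img[r][c] = 0
            changed = True
    return img

# Demo: a 3-pixel-thick horizontal bar thins down to a 1-pixel line.
bar = [[0] * 7 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 6):
        bar[r][c] = 1
skeletonize(bar)
print(sum(map(sum, bar)))  # -> 3
```

Collecting `to_erase` before erasing makes each pass act on the unmodified image, which keeps the thinning symmetric within a pass; conditions a and b together preserve stroke endpoints and connectivity.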
In one embodiment of the present invention, the information extraction unit 144 may be specifically configured to traverse the skeleton information along connected domains and extract stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is extracted preferentially.
Specifically, the extraction unit 142 in this embodiment of the present invention may extract the stroke information from the skeleton information by a stroke extraction algorithm. As shown in Fig. 9a, a schematic diagram of the stroke information provided in an embodiment of the present invention, the stroke information in this embodiment of the present invention may include the number of stroke feature points and the positional information between adjacent stroke feature points. In Fig. 9a, each dot serves as a stroke feature point, and a positional relationship exists between adjacent stroke feature points; for example, in Fig. 9a a positional relationship exists from stroke feature point a to the adjacent stroke feature point b, and the direction angle from stroke feature point a to the adjacent stroke feature point b can be represented by vector information.
In one of the embodiments, the extraction unit 142 extracting stroke information from the skeleton information may include: traversing the skeleton information along connected domains and extracting stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is extracted preferentially. Specifically, as shown in Fig. 9b, a schematic diagram of the stroke information according to another embodiment of the present invention, the stroke information in Fig. 9b is an enlarged view of the stroke information at x in Fig. 9a. Starting from stroke feature point c, the next stroke feature point d is traversed along the connected domain. A bifurcation appears at stroke feature point e, branching into stroke feature points f, g, and h; stroke feature point f with a direction angle of 0 degrees is traversed first, then stroke feature point g with a direction angle of 90 degrees, and finally stroke feature point h with a direction angle of 270 degrees.
In one embodiment of the present invention, as shown in Fig. 15, a structural schematic diagram of the recognition unit provided in an embodiment of the present invention, the recognition unit 146 may include a feature extraction unit 1460 and a character recognition unit 1462, wherein
the feature extraction unit 1460 is configured to perform feature extraction on the stroke information through a convolutional neural network CNN; and
the character recognition unit 1462 is configured to input the extracted features into a long short-term memory network LSTM for character recognition, identifying the multiple characters and the positional relationship information between the characters.
In one embodiment of the present invention, the long short-term memory network LSTM may be a bidirectional LSTM.
In one embodiment of the present invention, the multiple characters may include a mathematical expression.
The sequence recognition engine in this embodiment of the present invention may employ a deep learning network based on a Long Short-Term Memory (LSTM) network. Specifically, after the stroke information obtained by the information extraction unit 144 is input, the network may extract features through a Convolutional Neural Network (CNN), then feed the extracted features into the LSTM network to complete the identification of the multiple characters and the positional relationship information between the characters, and finally output the multiple characters identified.
Reference may be made to Fig. 10, a schematic diagram of the principle of the sequence recognition engine provided in an embodiment of the present invention. The input stroke information includes the number of stroke feature points and the positional information between adjacent stroke feature points, and features are extracted by the CNN network 10; this embodiment of the present invention is not limited to the 3*3 convolutions in Fig. 10, and 5*5 convolutions and the like may also be used. The feature extraction unit 1460 may divide the extracted features into stroke information of multiple time-step units, which are then input in sequence into the LSTM network to complete the identification of the multiple characters and the positional relationship information between the characters, and the multiple characters identified are finally output. For the structure of the LSTM network, reference may be made to Fig. 11, a structural schematic diagram of the LSTM network provided in an embodiment of the present invention. Taking the image in Fig. 2 as an example, stroke information of 11 time-step units can be extracted from the CNN network; the character recognition unit 1462 passes the stroke information of each time-step unit, in temporal order, through the carefully designed "gate" structures that remove information from or add information to the cell state, and the multiple characters identified can finally be output.
By implementing this embodiment of the present invention, skeleton extraction is performed on the binary image to extract the skeleton information of multiple characters, stroke information is then extracted from the skeleton information, and the stroke information is passed through the sequence recognition engine based on a deep learning network to identify the multiple characters and the positional relationship information between the characters. No manually designed features are required, and no character segmentation is needed, which solves the prior-art problem that segmentation algorithms cannot properly handle characters that are stuck together, resulting in low recognition accuracy.
Still further, as shown in Fig. 12, a schematic diagram of the principle of the sequence recognition engine according to another embodiment of the present invention, the LSTM of this embodiment of the present invention may be a bidirectional LSTM; for its structure, reference may be made to Fig. 13, a structural schematic diagram of the bidirectional LSTM network provided in an embodiment of the present invention. Again taking the image in Fig. 2 as an example, stroke information of 11 time-step units can be extracted from the CNN network; the character recognition unit 1462 passes the stroke information of each time-step unit, in temporal order, through the carefully designed "gate" structures that remove information from or add information to the cell state, and the multiple characters identified can finally be output.
In one of the embodiments, the multiple characters in this embodiment of the present invention may include a mathematical expression, and the recognition unit 146 outputting the multiple characters identified may include: outputting a LaTeX expression according to the multiple characters identified. This embodiment of the present invention performs numerical character recognition with a timing-based deep learning recognition model: the features extracted by the CNN are input into the bidirectional LSTM network, which can directly output a LaTeX expression. There is no need to segment the characters in the image, nor to analyze the spatial positional relationship between characters; all of this information is learned by the deep learning recognition model, i.e., end-to-end recognition is achieved. Therefore, this embodiment of the present invention adapts to various complex scenes, and the recognition accuracy is greatly improved.
To facilitate better implementation of the above solutions of the embodiments of the present invention, the present invention correspondingly provides an image recognition device, which is described in detail below with reference to the accompanying drawings:
As shown in Fig. 16, a structural schematic diagram of the image recognition device provided in an embodiment of the present invention, the image recognition device 16 may include a processor 161, an input unit 162, a recognition unit 163, a memory 164, and a communication unit 165, which may be connected to one another through a bus 166. The memory 164 may be a high-speed RAM memory or a non-volatile memory, for example, at least one disk memory; in this embodiment of the present invention, the memory includes flash. The memory 164 may optionally also be at least one storage system located remotely from the aforementioned processor 161. The memory 164 is configured to store application program code and may include an operating system, a network communication module, a user interface module, and an image recognition program; the communication unit 165 is configured to exchange information with an external unit; and the processor 161 is configured to call the program code to perform the following steps:
performing binarization on an input image to obtain a binary image, the image including multiple characters;
performing skeleton extraction on the binary image to extract the skeleton information of the multiple characters;
extracting stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points; and
passing the stroke information through the sequence recognition engine based on a deep learning network, identifying the multiple characters and the positional relationship information between the characters, and outputting the multiple characters identified.
In one of the embodiments, the processor 161 performing skeleton extraction on the binary image may include:
performing iterative erosion on the binary image until no new pixel of the binary image is eroded compared with the previous erosion, wherein each erosion iteration includes traversing the pixels in the binary image in turn and eroding the pixels that meet a specified condition.
In one of the embodiments, the pixels that meet the specified condition include target pixels that meet any one of the following conditions:
among the 8 neighboring pixels around the target pixel, the number of pixels whose binary value is 1 is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold;
checking the 8 neighboring pixels around the target pixel in a clockwise direction, the number of adjacent pixel pairs whose binary values form the sequence 01 is equal to a third threshold;
among the 4 neighboring pixels at the smallest distance, there is at least one pixel whose binary value is 0, the distance including the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel.
In one of the embodiments, the processor 161 passing the stroke information through the sequence recognition engine based on a deep learning network and identifying the multiple characters and the positional relationship information between the characters may include:
performing feature extraction on the stroke information through a convolutional neural network CNN; and
inputting the extracted features into a long short-term memory network LSTM for character recognition, identifying the multiple characters and the positional relationship information between the characters.
In one of the embodiments, the long short-term memory network LSTM is a bidirectional LSTM.
In one of the embodiments, the multiple characters may include a mathematical expression, and the processor 161 outputting the multiple characters identified may include: outputting a LaTeX expression according to the multiple characters identified.
In one of the embodiments, the processor 161 extracting stroke information from the skeleton information may include:
traversing the skeleton information along connected domains and extracting stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is extracted preferentially.
By implementing this embodiment of the present invention, skeleton extraction is performed on the binary image to extract the skeleton information of multiple characters, stroke information is then extracted from the skeleton information, and the stroke information is passed through the sequence recognition engine based on a deep learning network to identify the multiple characters and the positional relationship information between the characters. No manually designed features are required, and no character segmentation is needed, which solves the prior-art problem that segmentation algorithms cannot properly handle characters that are stuck together, resulting in low recognition accuracy. In particular, this embodiment of the present invention performs numerical character recognition with a timing-based deep learning recognition model: the features extracted by the CNN are input into the bidirectional LSTM network, which can directly output a LaTeX expression. There is no need to segment the characters in the image, nor to analyze the spatial positional relationship between characters; all of this information is learned by the deep learning recognition model, i.e., end-to-end recognition is achieved. Therefore, this embodiment of the present invention adapts to various complex scenes, and the recognition accuracy is greatly improved.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The above disclosure describes merely the preferred embodiments of the present invention, which certainly cannot be used to limit the scope of the rights of the present invention; therefore, equivalent changes made in accordance with the claims of the present invention still fall within the scope of the present invention.
Claims (15)
1. An image recognition method, characterized by comprising:
performing binarization on an image to obtain a binary image, the image including multiple characters;
performing skeleton extraction on the binary image to extract skeleton information of the multiple characters;
extracting stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points; and
analyzing the stroke information with a sequence recognition engine based on a deep learning network to identify the multiple characters and the positional relationship information between the characters.
2. The method according to claim 1, characterized in that performing skeleton extraction on the binary image comprises:
performing iterative erosion on the binary image until no new pixel of the binary image is eroded compared with the previous erosion, wherein each erosion iteration includes traversing the pixels in the binary image in turn and eroding the pixels that meet a specified condition.
3. The method according to claim 2, characterized in that the pixels that meet the specified condition include target pixels that meet any one of the following conditions:
among the 8 neighboring pixels around the target pixel, the number of pixels whose binary value is 1 is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold;
checking the 8 neighboring pixels around the target pixel in a clockwise direction, the number of adjacent pixel pairs whose binary values form the sequence 01 is equal to a third threshold;
among the 4 neighboring pixels at the smallest distance, there is at least one pixel whose binary value is 0, the distance including the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel.
4. The method according to claim 1, characterized in that analyzing the stroke information with the sequence recognition engine based on a deep learning network to identify the multiple characters and the positional relationship information between the characters comprises:
performing feature extraction on the stroke information through a convolutional neural network CNN; and
inputting the extracted features into a long short-term memory network LSTM for character recognition, identifying the multiple characters and the positional relationship information between the characters.
5. The method according to claim 1, characterized in that performing binarization on the image comprises:
performing binarization on the image using a maximally stable extremal regions MSER algorithm.
6. The method according to claim 4, characterized in that the multiple characters include a mathematical expression; and after identifying the multiple characters and the positional relationship information between the characters, the method further comprises: outputting a LaTeX expression according to the multiple characters identified.
7. The method according to claim 1, characterized in that extracting stroke information from the skeleton information comprises:
traversing the skeleton information along connected domains and extracting stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is extracted preferentially.
8. An image recognition apparatus, characterized by comprising:
a processing unit, configured to perform binarization on an image to obtain a binary image, the image including multiple characters;
an extraction unit, configured to perform skeleton extraction on the binary image to extract skeleton information of the multiple characters;
an information extraction unit, configured to extract stroke information from the skeleton information, the stroke information including the number of stroke feature points and the positional information between adjacent stroke feature points; and
a recognition unit, configured to analyze the stroke information with a sequence recognition engine based on a deep learning network to identify the multiple characters and the positional relationship information between the characters.
9. The apparatus according to claim 8, characterized in that the extraction unit is specifically configured to perform iterative erosion on the binary image until no new pixel of the binary image is eroded compared with the previous erosion, wherein each erosion iteration includes traversing the pixels in the binary image in turn and eroding the pixels that meet a specified condition.
10. The apparatus according to claim 9, characterized in that the pixels that meet the specified condition include target pixels that meet any one of the following conditions:
among the 8 neighboring pixels around the target pixel, the number of pixels whose binary value is 1 is greater than or equal to a first threshold and less than or equal to a second threshold, the first threshold being less than the second threshold;
checking the 8 neighboring pixels around the target pixel in a clockwise direction, the number of adjacent pixel pairs whose binary values form the sequence 01 is equal to a third threshold;
among the 4 neighboring pixels at the smallest distance, there is at least one pixel whose binary value is 0, the distance including the distance from the center of a pixel adjacent to the target pixel to the center of the target pixel.
11. The apparatus according to claim 8, characterized in that the recognition unit comprises:
a feature extraction unit, configured to perform feature extraction on the stroke information through a convolutional neural network CNN; and
a character recognition unit, configured to input the extracted features into a long short-term memory network LSTM for character recognition, identifying the multiple characters and the positional relationship information between the characters.
12. The apparatus according to claim 11, characterized in that the multiple characters include a mathematical expression; and the recognition unit is further configured to output a LaTeX expression according to the multiple characters identified.
13. The apparatus according to claim 8, characterized in that the information extraction unit is specifically configured to traverse the skeleton information along connected domains and extract stroke feature points, wherein in the case of a stroke bifurcation, the stroke feature point with the smaller direction angle relative to the previous stroke feature point is extracted preferentially.
14. An image recognition device, characterized by comprising a processor and a memory connected to each other, wherein the memory is configured to store application program code, and the processor is configured to call the program code to execute the method according to any one of claims 1-7.
15. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to execute the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810274802.XA CN110147785B (en) | 2018-03-29 | 2018-03-29 | Image recognition method, related device and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110147785A true CN110147785A (en) | 2019-08-20 |
CN110147785B CN110147785B (en) | 2023-01-10 |
Family
ID=67588309
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810274802.XA Active CN110147785B (en) | 2018-03-29 | 2018-03-29 | Image recognition method, related device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110147785B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111104945A (en) * | 2019-12-17 | 2020-05-05 | 上海博泰悦臻电子设备制造有限公司 | Object identification method and related product |
CN111428593A (en) * | 2020-03-12 | 2020-07-17 | 北京三快在线科技有限公司 | Character recognition method and device, electronic equipment and storage medium |
CN112800987A (en) * | 2021-02-02 | 2021-05-14 | 中国联合网络通信集团有限公司 | Chinese character processing method and device |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1996347A (en) * | 2006-09-14 | 2007-07-11 | Zhejiang University | Visualized reproduction method based on handwriting image |
US20140079297A1 (en) * | 2012-09-17 | 2014-03-20 | Saied Tadayon | Application of Z-Webs and Z-factors to Analytics, Search Engine, Learning, Recognition, Natural Language, and Other Utilities |
CN104408455A (en) * | 2014-11-27 | 2015-03-11 | University of Shanghai for Science and Technology | Segmentation method for touching characters |
CN105512692A (en) * | 2015-11-30 | 2016-04-20 | South China University of Technology | BLSTM-based online handwritten mathematical expression symbol recognition method |
CN105654127A (en) * | 2015-12-30 | 2016-06-08 | Chengdu Shulian Mingpin Technology Co., Ltd. | End-to-end continuous recognition method for character sequences in images |
CN106407971A (en) * | 2016-09-14 | 2017-02-15 | Beijing Xiaomi Mobile Software Co., Ltd. | Text recognition method and device |
CN107273897A (en) * | 2017-07-04 | 2017-10-20 | Huazhong University of Science and Technology | Character recognition method based on deep learning |
CN107403180A (en) * | 2017-06-30 | 2017-11-28 | Guangzhou Guangdian Property Management Co., Ltd. | Numeric equipment detection and recognition method and system |
- 2018-03-29: CN application CN201810274802.XA filed; granted as CN110147785B (Active)
Non-Patent Citations (6)
Title |
---|
ADNAN UL-HASAN: "Generic Text Recognition using Long Short-Term Memory Networks", 《RESEARCHGATE》 * |
JUN LIU et al.: "Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 * |
RONALDO MESSINA et al.: "Segmentation-free Handwritten Chinese Text Recognition with LSTM-RNN", 《2015 13TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR)》 * |
SUN YAN et al.: "A Character Recognition Algorithm for Unhealthy-text Embedded in Web Images", 《PROCEEDINGS OF 14TH YOUTH CONFERENCE ON COMMUNICATION》 * |
张九龙 et al.: "An Optimization Method for Skeleton Extraction of Calligraphy Characters", 《Journal of Xi'an University of Technology》 * |
陈睿 et al.: "CAPTCHA Recognition Based on Two-Dimensional RNN", 《Journal of Chinese Computer Systems》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111104945A (en) * | 2019-12-17 | 2020-05-05 | Shanghai Pateo Yuezhen Electronic Equipment Manufacturing Co., Ltd. | Object identification method and related product |
CN111428593A (en) * | 2020-03-12 | 2020-07-17 | Beijing Sankuai Online Technology Co., Ltd. | Character recognition method and device, electronic equipment and storage medium |
CN112800987A (en) * | 2021-02-02 | 2021-05-14 | China United Network Communications Group Co., Ltd. | Chinese character processing method and device |
CN112800987B (en) * | 2021-02-02 | 2023-07-21 | China United Network Communications Group Co., Ltd. | Chinese character processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN110147785B (en) | 2023-01-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110738207B (en) | Character detection method for fusing character area edge information in character image | |
CN111985464B (en) | Court judgment document-oriented multi-scale learning text recognition method and system | |
US20200065601A1 (en) | Method and system for transforming handwritten text to digital ink | |
CN106980856B (en) | Formula identification method and system and symbolic reasoning calculation method and system | |
WO2022142611A1 (en) | Character recognition method and apparatus, storage medium and computer device | |
WO2020164278A1 (en) | Image processing method and device, electronic equipment and readable storage medium | |
JP2014132453A (en) | Word detection for optical character recognition constant to local scaling, rotation and display position of character in document | |
CN110852311A (en) | Three-dimensional human hand key point positioning method and device | |
CN110147785A (en) | Image recognition method, related device and equipment | |
CN109685065A (en) | Layout analysis method and system for automatic content classification of examination papers | |
JP2019102061A5 (en) | ||
CN111242109A (en) | Method and device for manually fetching words | |
CN110414622B (en) | Classifier training method and device based on semi-supervised learning | |
CN109190615B (en) | Shape-near word recognition determination method, device, computer device and storage medium | |
CN112597940B (en) | Certificate image recognition method and device and storage medium | |
CN111401360B (en) | Method and system for optimizing license plate detection model, license plate detection method and system | |
CN111291712B (en) | Forest fire recognition method and device based on interpolation CN and capsule network | |
CN104598289A (en) | Recognition method and electronic device | |
CN108345943B (en) | Machine learning identification method based on embedded coding and contrast learning | |
Jia et al. | Grayscale-projection based optimal character segmentation for camera-captured faint text recognition | |
Bose et al. | Light Weight Structure Texture Feature Analysis for Character Recognition Using Progressive Stochastic Learning Algorithm | |
CN112200216A (en) | Chinese character recognition method, device, computer equipment and storage medium | |
CN113065480B (en) | Handwriting style identification method and device, electronic device and storage medium | |
CN113343983B (en) | License plate number recognition method and electronic equipment | |
Guruprasad | Handwritten Devanagari word recognition using robust invariant feature transforms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||