CN107273897A - A kind of character recognition method based on deep learning - Google Patents
A kind of character recognition method based on deep learning
- Publication number
- CN107273897A CN107273897A CN201710538785.1A CN201710538785A CN107273897A CN 107273897 A CN107273897 A CN 107273897A CN 201710538785 A CN201710538785 A CN 201710538785A CN 107273897 A CN107273897 A CN 107273897A
- Authority
- CN
- China
- Prior art keywords
- input
- layer
- feature map
- feature vector
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/287—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biophysics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a character recognition method based on deep learning. The method comprises two stages: constructing a spatial transformer layer, and building and training a deep convolutional neural network. The spatial transformer layer consists of three parts: a localization network receives a feature map as input, passes it through a series of hidden layers, and outputs the parameters of a spatial transformation to be applied to the feature map; a grid generator uses the parameters produced by the first part to generate a sampling grid; and a sampler takes the feature map and the sampling grid as input, samples the feature map at the grid points, and produces the output feature map. The spatial transformer layer is differentiable and performs spatial manipulation of the image data inside the network, so the network can learn invariance to spatial warping, avoiding the need to generate large numbers of deformed samples by hand when training a conventional convolutional network. In addition, by building a deeper convolutional neural network, better recognition performance is achieved across the many classes of Chinese characters.
Description
Technical field
The invention belongs to the field of character recognition within pattern recognition, and more particularly relates to a character recognition method based on deep learning.
Background technology
With the continuing development of modern science and technology and the wide availability of the Internet, we encounter massive information resources presented in many forms every day. In daily life, study and work in particular, it is often unavoidable to process large amounts of text and enter it into a computer. How to enter this text information quickly and accurately into computers and other electronic devices has therefore become a pressing problem. Optical character recognition (OCR) is a technology by which a machine automatically extracts the text in a picture and converts it into machine-editable text.
In general, a traditional Chinese character recognition method consists of three parts: data preprocessing, feature extraction, and classification.
(1) Preprocessing. The purpose of preprocessing is to enhance the useful image information and remove noise, which benefits feature extraction. It is performed by means such as binarization, smoothing and denoising, and normalization. Binarization converts a grayscale text image into a binary text image; denoising removes the isolated points (speckles) remaining in the image after binarization; normalization regularizes the size, position and shape of the characters so as to reduce variation between instances of the same character.
(2) Feature extraction. Feature extraction falls into two broad classes: structural features and statistical features. Structural feature extraction derives character pixel information from the character outline or skeleton, such as stroke features, contours, peripheral features and local features. Such methods adapt well to font variation and are strong at distinguishing similar characters, but images of text suffer many kinds of interference, such as tilt, distortion, breakage and adhesion, against which these methods are weak. Features extracted after applying a mathematical transform to a sample are called statistical features; commonly used transforms include the wavelet transform, Fourier transform, frequency-domain transforms, moments and the discrete cosine transform. The extracted features are usually fed to a statistical classifier. In general, statistical features have weaker fine-discrimination ability than structural features and are poorer at separating similar characters.
(3) Classification. During classification, the features extracted from a sample are matched against established classification rules to identify the character. The classifier is the key component: its role is to speed up matching, improve recognition efficiency, and achieve the recognition result.
Traditional Chinese character recognition methods nevertheless have shortcomings. Because of the complexity of Chinese characters, such feature extraction methods cannot handle highly variable character outlines; key-point extraction requires human experts to define the positions of the important feature points, and no unified standard exists for weighting the importance of those points, so recognition accuracy remains relatively low.
Summary of the invention
In view of the above shortcomings of and improvement needs in the prior art, the object of the present invention is to provide a character recognition method based on deep learning, thereby solving the technical problem that current character recognition methods have relatively low recognition accuracy.
To achieve the above object, according to one aspect of the present invention, there is provided a character recognition method based on deep learning, comprising a spatial transformer layer construction stage and a deep convolutional neural network construction and training stage.

The spatial transformer layer construction stage comprises:

a localization network receives an input feature map and, through a series of hidden layers, outputs spatial transformation parameters, i.e. the parameters with which a transformation function acts on the feature map;

a grid generator produces a sampling grid from the spatial transformation parameters output by the localization network;

a sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and produces the output feature map.

The deep convolutional neural network construction and training stage comprises:

constructing the structure of a deep convolutional neural network and placing the constructed spatial transformer layer at the very beginning of the network to obtain the target deep convolutional neural network;

training the target deep convolutional neural network with stochastic gradient descent to obtain a character recognition model, which is used to perform character recognition on an input character image to be recognized.
Preferably, the localization network comprises two convolutional layers, each with M convolution kernels of size N and stride s; a max-pooling layer of size L and stride t follows each convolutional layer, and a ReLU layer follows each pooling layer. A fully connected layer follows the second ReLU layer, another ReLU layer follows it, and the final layer is again a fully connected layer that outputs the spatial transformation parameters, of dimension d.
Preferably, the grid generator produces the sampling grid from the spatial transformation parameters output by the localization network by:

obtaining from $\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}$ the output pixel corresponding to each pixel of the input feature map, all output pixels together forming the sampling grid of the output feature map, where $(x_i^s, y_i^s)$ are the source coordinates of the i-th pixel in the input feature map, $(x_i^t, y_i^t)$ are the target coordinates of the i-th pixel of the sampling grid in the output feature map, $A_\theta$ is the affine transformation matrix, i.e. the spatial transformation parameters output by the localization network, and $G_i$ denotes the i-th point of the sampling grid.
Preferably, the sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and obtains the output feature map by:

obtaining from $V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c \, k(x_i^s - m; \Phi_x)\, k(y_i^s - n; \Phi_y)$ the pixel value at each coordinate $(x_i^t, y_i^t)$ of the output feature map, where $\Phi_x$ and $\Phi_y$ are the parameters of the sampling kernel $k(\cdot)$, $U_{nm}^c$ is the pixel value at coordinates $(n, m)$ in channel c of the input feature map, $V_i^c$ is the output pixel value at the coordinates $(x_i^t, y_i^t)$ of the i-th pixel in channel c of the output feature map, W is the width of the input feature map, H is its height, and C is its number of channels.
Preferably, the sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and obtains the output feature map by:

obtaining from $V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \delta(\lfloor x_i^s + 0.5\rfloor - m)\, \delta(\lfloor y_i^s + 0.5\rfloor - n)$ the pixel value at each coordinate $(x_i^t, y_i^t)$ of the output feature map, where $\lfloor\cdot\rfloor$ denotes rounding down, $\delta(\cdot)$ is the Kronecker delta, $U_{nm}^c$ is the pixel value at coordinates $(n, m)$ in channel c of the input feature map, $V_i^c$ is the output pixel value at the coordinates $(x_i^t, y_i^t)$ of the i-th pixel in channel c of the output feature map, W is the width of the input feature map, H is its height, and C is its number of channels.
Preferably, the sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and obtains the output feature map by:

obtaining from $V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \max(0, 1-|x_i^s - m|)\, \max(0, 1-|y_i^s - n|)$ the pixel value at each coordinate $(x_i^t, y_i^t)$ of the output feature map, where $U_{nm}^c$ is the pixel value at coordinates $(n, m)$ in channel c of the input feature map, $V_i^c$ is the output pixel value at the coordinates $(x_i^t, y_i^t)$ of the i-th pixel in channel c of the output feature map, W is the width of the input feature map, H is its height, and C is its number of channels.
In general, compared with the prior art, the above technical scheme of the present invention achieves the following beneficial effects: the proposed character recognition method based on deep learning incorporates a spatial transformer layer into a convolutional neural network, so the network can actively apply various spatial transformations to the input character image, without any extra training supervision or modification of the optimization process. Results show that with a spatial transformer layer the model learns invariance to translation, scaling, rotation and more general spatial warping, and better recognizes characters exhibiting substantial deformation.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the character recognition method based on deep learning disclosed in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the spatial transformer layer disclosed in an embodiment of the present invention.
Detailed description of the embodiments
To make the purpose, technical scheme and advantages of the present invention clearer, the invention is further elaborated below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it. Moreover, the technical features involved in the embodiments described below may be combined with each other as long as they do not conflict.
The character recognition method based on deep learning disclosed by the invention designs a deep spatial transformer convolutional neural network that can actively apply various spatial transformations to the input character picture, thereby achieving data augmentation while improving the spatial invariance of the network, and thus attaining relatively higher recognition accuracy for Chinese characters.

Fig. 1 is a schematic flow chart of the character recognition method based on deep learning disclosed in an embodiment of the present invention. The method comprises two stages, namely the spatial transformer layer construction stage and the deep convolutional neural network construction and training stage; the two stages are described in detail below.
(A) The spatial transformer layer construction stage comprises:

The localization network receives the input feature map and, through a series of hidden layers, outputs spatial transformation parameters, i.e. the parameters with which a transformation function acts on the feature map.

Specifically, the localization network takes a feature map $U \in \mathbb{R}^{H \times W \times C}$ as input, with width W, height H and C channels, and outputs θ, the parameters of the transformation $\mathcal{T}_\theta$ applied to the feature map: $\theta = f_{loc}(U)$. The form of θ depends on the transformation type being parameterized; for an affine transformation, θ is a 6-dimensional output.

The localization network function $f_{loc}(\cdot)$ can take any form, such as a fully connected network or a convolutional neural network, but the final layer must include a regression layer that generates the transformation parameters θ.
In the present invention, the localization network comprises two convolutional layers, each with M convolution kernels of size N and stride s. A max-pooling layer of size L and stride t follows each convolutional layer, and a rectified linear unit (ReLU) layer follows each pooling layer. A fully connected layer follows the second ReLU layer, another ReLU layer follows it, and the final layer is again a fully connected layer that outputs the spatial transformation parameters, of dimension d. Preferably, M = 20, N = 5, s = 1, L = 2, t = 2, and the output size of the fully connected layer is preferably 20.
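As a rough sanity check on this architecture, the spatial size reaching the first fully connected layer can be worked out with standard convolution/pooling arithmetic. The sketch below assumes zero padding and the preferred values M = 20, N = 5, s = 1, L = 2, t = 2; the helper names are ours, not the patent's:

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial size after a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

def loc_net_flat_size(h, w, M=20, N=5, s=1, L=2, t=2):
    """Flattened size entering the first fully connected layer of the
    localization network: two blocks of conv(N, stride s) followed by
    max-pool(L, stride t), with M feature maps after the second conv."""
    for _ in range(2):
        h, w = conv_out(h, N, s), conv_out(w, N, s)   # convolution
        h, w = conv_out(h, L, t), conv_out(w, L, t)   # max pooling
    return M * h * w
```

For a hypothetical 28 × 28 input this gives 20 × 4 × 4 = 320 inputs to the fully connected layer.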
The grid generator produces a sampling grid from the spatial transformation parameters output by the localization network.

To apply an arbitrary deformation to the input feature map, each output pixel is computed with a sampling kernel centred at a particular location in the input feature map. Here an input pixel means a pixel of a generic feature map, not necessarily of the original image. In general the output pixels are defined on a regular grid of pixels $G = \{G_i\}$, producing the output feature map $V \in \mathbb{R}^{H' \times W' \times C}$, where H' and W' are the height and width of the sampling grid and C is the number of channels.
Suppose $\mathcal{T}_\theta$ is a 2D affine transformation $A_\theta$; the pixel-wise transformation is then given by formula (1):

$\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = \mathcal{T}_\theta(G_i) = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{bmatrix} \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix} \quad (1)$

where $(x_i^t, y_i^t)$ are the target coordinates of the i-th pixel of the sampling grid in the output feature map, $(x_i^s, y_i^s)$ are the source coordinates in the input feature map defining the i-th sample point, and $A_\theta$ is the affine transformation matrix. We use height- and width-normalized coordinates, so that $-1 \le x_i^t, y_i^t \le 1$ within the spatial bounds of the output and $-1 \le x_i^s, y_i^s \le 1$ within the spatial bounds of the input. The source/target transformation and sampling are equivalent to standard texture mapping and coordinate handling in graphics.
More restrictive classes of transformation $\mathcal{T}_\theta$ can also be used. For example, with a transformation matrix of the form $A_\theta = \begin{bmatrix} s & 0 & t_x \\ 0 & s & t_y \end{bmatrix}$, operations such as cropping, translation and scaling can be realized by adjusting $s$, $t_x$ and $t_y$. In fact the transformation may take any parameterized form, subject to one condition: it must be differentiable with respect to its parameters. This point is crucial, because it allows gradients to be back-propagated from the sample points $\mathcal{T}_\theta(G_i)$ to the localization network, and thus to the parameters θ.
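Under the same assumptions (a 2 × 3 affine matrix acting on normalized homogeneous target coordinates, as in formula (1)), the grid generator can be sketched in numpy; the function name and argument layout are illustrative, not from the patent:

```python
import numpy as np

def affine_grid(theta, H_out, W_out):
    """Map each normalized target coordinate (x_t, y_t) in [-1, 1]
    through the 2x3 affine matrix theta to a source coordinate.
    Returns an (H_out, W_out, 2) array of (x_s, y_s) coordinates."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, H_out),
                         np.linspace(-1, 1, W_out), indexing="ij")
    ones = np.ones_like(xs)
    # Homogeneous target coordinates, shape (3, H_out * W_out)
    tgt = np.stack([xs.ravel(), ys.ravel(), ones.ravel()])
    src = theta @ tgt                      # (2, H_out * W_out)
    return src.T.reshape(H_out, W_out, 2)  # (x_s, y_s) per grid point
```

With the identity matrix the source grid coincides with the target grid; a matrix of the constrained form above would crop, translate or scale it.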
The sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and produces the output feature map.

To perform the spatial transformation on the input feature map, the sampler samples the feature map U at the set of sampling points $\mathcal{T}_\theta(G)$ to obtain the sampled output feature map V. Each coordinate $(x_i^s, y_i^s)$ in $\mathcal{T}_\theta(G)$ defines a spatial location in the input feature map at which a sampling kernel is applied to obtain the value of a particular pixel of the output feature map, as in formula (2):

$V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c \, k(x_i^s - m; \Phi_x)\, k(y_i^s - n; \Phi_y) \quad (2)$

where $\Phi_x$ and $\Phi_y$ are the parameters of a generic sampling kernel $k(\cdot)$ defining the image interpolation (e.g. linear interpolation), $U_{nm}^c$ is the value at coordinates $(n, m)$ in channel c of the input feature map, and $V_i^c$ is the output value of the pixel at coordinates $(x_i^t, y_i^t)$ in channel c. Note that the same sampling is applied to every channel of the input feature map, so every channel is transformed in an identical way (which preserves spatial consistency between channels).
In theory any sampling kernel may be used, as long as (sub-)gradients can be defined with respect to $x_i^s$ and $y_i^s$. For example, using an integer sampling kernel, formula (2) reduces to formula (3):

$V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \delta(\lfloor x_i^s + 0.5\rfloor - m)\, \delta(\lfloor y_i^s + 0.5\rfloor - n) \quad (3)$

where $\lfloor x + 0.5\rfloor$ rounds x to the nearest integer and $\delta(\cdot)$ is the Kronecker delta. This kernel simply copies the value of the pixel nearest to $(x_i^s, y_i^s)$ to the output position $(x_i^t, y_i^t)$. Alternatively, a bilinear sampling kernel can be used, as in formula (4):

$V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \max(0, 1-|x_i^s - m|)\, \max(0, 1-|y_i^s - n|) \quad (4)$

To allow back-propagation of the loss through this sampling mechanism, we define the gradients with respect to U and G. For the bilinear kernel (4), the partial derivatives are given by formulas (5) and (6):

$\frac{\partial V_i^c}{\partial U_{nm}^c} = \max(0, 1-|x_i^s - m|)\, \max(0, 1-|y_i^s - n|) \quad (5)$

$\frac{\partial V_i^c}{\partial x_i^s} = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \max(0, 1-|y_i^s - n|) \begin{cases} 0 & |m - x_i^s| \ge 1 \\ 1 & m \ge x_i^s \\ -1 & m < x_i^s \end{cases} \quad (6)$

and $\partial V_i^c / \partial y_i^s$ is computed analogously to (6).
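A direct, unoptimized single-channel transcription of the bilinear kernel of formula (4) might look as follows; the normalized-coordinate convention and function name are our assumptions:

```python
import numpy as np

def bilinear_sample(U, grid):
    """Sample a single-channel feature map U (H x W) at a grid of
    normalized (x_s, y_s) source coordinates in [-1, 1], using the
    bilinear kernel max(0, 1-|x_s - m|) * max(0, 1-|y_s - n|)."""
    H, W = U.shape
    xs = (grid[..., 0] + 1) * (W - 1) / 2  # to pixel coordinates
    ys = (grid[..., 1] + 1) * (H - 1) / 2
    V = np.zeros(grid.shape[:2])
    for i in np.ndindex(V.shape):
        x, y = xs[i], ys[i]
        for n in range(H):          # the double sum of formula (4)
            for m in range(W):
                w_x = max(0.0, 1 - abs(x - m))
                w_y = max(0.0, 1 - abs(y - n))
                if w_x and w_y:
                    V[i] += U[n, m] * w_x * w_y
    return V
```

An identity grid reproduces the input, and a grid point midway between pixels returns their bilinear average; in a real network the same kernel also yields the gradients of formulas (5) and (6).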
The localization network, grid generator and sampler together form the spatial transformer layer, whose structure is shown in Fig. 2. It is a fully self-contained module that can be placed in any quantity at any position in a convolutional neural network, yielding a spatial transformer network. The module is fast to compute, does not affect training speed, and incurs little time overhead.
(B) The deep convolutional neural network construction and training stage comprises:

Constructing the structure of the deep convolutional neural network and placing the constructed spatial transformer layer at the very beginning of the network to obtain the target deep convolutional neural network.

Building the deep convolutional neural network includes defining the number of layers, the convolution window sizes and the numbers of nodes. As an optional embodiment, in the embodiment of the present invention the finished network contains 14 parameterized layers (19 layers if the input layer, pooling layers and softmax output are also counted). The network includes 4 inception modules, which increase the number of network nodes while keeping the network effectively under control, so that the computational complexity of training does not grow explosively. Each inception module consists of convolutional layers of sizes 1 × 1, 3 × 3 and 5 × 5 and a 3 × 3 max-pooling layer.
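The patent does not give per-branch kernel counts, so the following only sketches the channel bookkeeping of such a module: with "same" padding in the 1 × 1, 3 × 3 and 5 × 5 branches and a stride-1 3 × 3 pooling branch, the spatial size is preserved and the branch outputs concatenate along the channel axis. The function and the example counts are assumptions:

```python
def inception_output_shape(h, w, c1, c3, c5, cp):
    """Output shape of an inception module whose four branches
    (1x1 conv, 3x3 conv, 5x5 conv, pooling projection) preserve the
    h x w spatial size and are concatenated channel-wise."""
    return (h, w, c1 + c3 + c5 + cp)
```

For example, hypothetical branch widths of 64, 128, 32 and 32 on a 28 × 28 map give a 28 × 28 × 256 output.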
The target deep convolutional neural network is trained with stochastic gradient descent to obtain the character recognition model, which is used to perform character recognition on the input character image to be recognized.

Because the spatial transformer layer is fully self-contained, it can be placed in any quantity at any position in the network. In the present invention the spatial transformer designed in the previous stage is placed at the beginning of the network, i.e. immediately after the data input layer.

Combining the spatial transformer layer with the deep convolutional neural network yields a spatial transformer network that can actively apply spatial transformations to feature maps, enhancing the network's invariance to translation, scaling, rotation and more general spatial warping.
The network model is trained with stochastic gradient descent to obtain the target network, with the following parameter settings: batch size 256, base learning rate 0.01, no weight decay, and the learning rate reduced by a factor of 10 every 50k iterations. The network weights are randomly initialized, except for the final regression layer of the localization network, which is initialized to regress the identity transformation. The target network is used to perform character recognition on the input character image.
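The stated schedule (base learning rate 0.01, reduced tenfold every 50k iterations) amounts to a standard step schedule; a minimal sketch, with a function name of our choosing:

```python
def step_learning_rate(iteration, base_lr=0.01, step=50_000, gamma=0.1):
    """Learning rate at a given SGD iteration: start at base_lr and
    multiply by gamma after every `step` iterations."""
    return base_lr * gamma ** (iteration // step)
```

So iterations 0 through 49,999 train at 0.01, the next 50k at 0.001, and so on.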
Those skilled in the art will readily understand that the above is merely a description of preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (6)
1. A character recognition method based on deep learning, characterized by comprising a spatial transformer layer construction stage and a deep convolutional neural network construction and training stage;
the spatial transformer layer construction stage comprises:
a localization network receives an input feature map and, through a series of hidden layers, outputs spatial transformation parameters, i.e. the parameters with which a transformation function acts on the feature map;
a grid generator produces a sampling grid from the spatial transformation parameters output by the localization network;
a sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and produces the output feature map;
the deep convolutional neural network construction and training stage comprises:
constructing the structure of a deep convolutional neural network and placing the constructed spatial transformer layer at the very beginning of the network to obtain a target deep convolutional neural network;
training the target deep convolutional neural network with stochastic gradient descent to obtain a character recognition model, the character recognition model being used to perform character recognition on an input character image to be recognized.
2. The method according to claim 1, characterized in that the localization network comprises two convolutional layers, each with M convolution kernels of size N and stride s; a max-pooling layer of size L and stride t follows each convolutional layer, and a ReLU layer follows each pooling layer; a fully connected layer follows the second ReLU layer, another ReLU layer follows it, and the final layer is again a fully connected layer that outputs the spatial transformation parameters, of dimension d.
3. The method according to claim 1, characterized in that the grid generator produces the sampling grid from the spatial transformation parameters output by the localization network by:
obtaining from $\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = \mathcal{T}_\theta(G_i) = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}$ the output pixel corresponding to each pixel of the input feature map, all output pixels together forming the sampling grid of the output feature map, where $(x_i^s, y_i^s)$ are the source coordinates of the i-th pixel in the input feature map, $(x_i^t, y_i^t)$ are the target coordinates of the i-th pixel of the sampling grid in the output feature map, $A_\theta$ is the affine transformation matrix, i.e. the spatial transformation parameters output by the localization network, $\mathcal{T}_\theta$ is the transformation function, and $G_i$ denotes the i-th point of the sampling grid.
4. The method according to claim 3, characterized in that the sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and obtains the output feature map by:
obtaining from $V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, k(x_i^s - m; \Phi_x)\, k(y_i^s - n; \Phi_y)$ the pixel value at each coordinate $(x_i^t, y_i^t)$ of the output feature map, where $\Phi_x$ and $\Phi_y$ are the parameters of the sampling kernel $k(\cdot)$, $U_{nm}^c$ is the pixel value at coordinates $(n, m)$ in channel c of the input feature map, $V_i^c$ is the output pixel value at the coordinates $(x_i^t, y_i^t)$ of the i-th pixel in channel c of the output feature map, W is the width of the input feature map, H is its height, and C is its number of channels.
5. The method according to claim 3, characterized in that the sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and obtains the output feature map by:
obtaining from $V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \delta(\lfloor x_i^s + 0.5\rfloor - m)\, \delta(\lfloor y_i^s + 0.5\rfloor - n)$ the pixel value at each coordinate $(x_i^t, y_i^t)$ of the output feature map, where $\lfloor\cdot\rfloor$ denotes rounding down, $\delta(\cdot)$ is the Kronecker delta, $U_{nm}^c$ is the pixel value at coordinates $(n, m)$ in channel c of the input feature map, $V_i^c$ is the output pixel value at the coordinates $(x_i^t, y_i^t)$ of the i-th pixel in channel c of the output feature map, W is the width of the input feature map, H is its height, and C is its number of channels.
6. The method according to claim 3, characterized in that the sampler takes the input feature map and the sampling grid as input, samples the input feature map at the grid points, and obtains the output feature map by:
obtaining from $V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c\, \max(0, 1-|x_i^s - m|)\, \max(0, 1-|y_i^s - n|)$ the pixel value at each coordinate $(x_i^t, y_i^t)$ of the output feature map, where $U_{nm}^c$ is the pixel value at coordinates $(n, m)$ in channel c of the input feature map, $V_i^c$ is the output pixel value at the coordinates $(x_i^t, y_i^t)$ of the i-th pixel in channel c of the output feature map, W is the width of the input feature map, H is its height, and C is its number of channels.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710538785.1A CN107273897A (en) | 2017-07-04 | 2017-07-04 | A kind of character recognition method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710538785.1A CN107273897A (en) | 2017-07-04 | 2017-07-04 | A kind of character recognition method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107273897A true CN107273897A (en) | 2017-10-20 |
Family
ID=60071378
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710538785.1A Pending CN107273897A (en) | 2017-07-04 | 2017-07-04 | A kind of character recognition method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107273897A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105205448A (en) * | 2015-08-11 | 2015-12-30 | 中国科学院自动化研究所 | Character recognition model training method based on deep learning and recognition method thereof |
CN105184312A (en) * | 2015-08-24 | 2015-12-23 | 中国科学院自动化研究所 | Character detection method and device based on deep learning |
CN105335754A (en) * | 2015-10-29 | 2016-02-17 | 小米科技有限责任公司 | Character recognition method and device |
CN105809164A (en) * | 2016-03-11 | 2016-07-27 | 北京旷视科技有限公司 | Character identification method and device |
Non-Patent Citations (3)
Title |
---|
CHRISTIAN SZEGEDY et al.: "Going Deeper with Convolutions", 2015 IEEE * |
MAX JADERBERG et al.: "Spatial Transformer Networks", arXiv * |
樊重俊 et al.: "Big Data Analysis and Applications" (《大数据分析与应用》), 31 January 2016, Lixin Accounting Publishing House * |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107704859A (en) * | 2017-11-01 | 2018-02-16 | 哈尔滨工业大学深圳研究生院 | A kind of character recognition method based on deep learning training framework |
CN108229474A (en) * | 2017-12-29 | 2018-06-29 | 北京旷视科技有限公司 | Licence plate recognition method, device and electronic equipment |
CN108229474B (en) * | 2017-12-29 | 2019-10-01 | 北京旷视科技有限公司 | Licence plate recognition method, device and electronic equipment |
CN108509881A (en) * | 2018-03-22 | 2018-09-07 | 五邑大学 | A kind of the Off-line Handwritten Chinese text recognition method of no cutting |
CN108681735A (en) * | 2018-03-28 | 2018-10-19 | 中科博宏(北京)科技有限公司 | Optical character recognition method based on convolutional neural networks deep learning model |
CN110147785A (en) * | 2018-03-29 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Image-recognizing method, relevant apparatus and equipment |
CN110147785B (en) * | 2018-03-29 | 2023-01-10 | 腾讯科技(深圳)有限公司 | Image recognition method, related device and equipment |
CN110619325B (en) * | 2018-06-20 | 2024-03-08 | 北京搜狗科技发展有限公司 | Text recognition method and device |
CN110619325A (en) * | 2018-06-20 | 2019-12-27 | 北京搜狗科技发展有限公司 | Text recognition method and device |
CN108932494A (en) * | 2018-06-29 | 2018-12-04 | 北京字节跳动网络技术有限公司 | Number identification method, system, equipment and computer readable storage medium |
US11423634B2 (en) | 2018-08-03 | 2022-08-23 | Huawei Cloud Computing Technologies Co., Ltd. | Object detection model training method, apparatus, and device |
US11605211B2 (en) | 2018-08-03 | 2023-03-14 | Huawei Cloud Computing Technologies Co., Ltd. | Object detection model training method and apparatus, and device |
CN109886077A (en) * | 2018-12-28 | 2019-06-14 | 北京旷视科技有限公司 | Image-recognizing method, device, computer equipment and storage medium |
CN109801234B (en) * | 2018-12-28 | 2023-09-22 | 南京美乐威电子科技有限公司 | Image geometry correction method and device |
CN109801234A (en) * | 2018-12-28 | 2019-05-24 | 南京美乐威电子科技有限公司 | Geometric image correction method and device |
TWI734085B (en) * | 2019-03-13 | 2021-07-21 | 中華電信股份有限公司 | Dialogue system using intention detection ensemble learning and method thereof |
CN110738188A (en) * | 2019-10-24 | 2020-01-31 | 程少轩 | Ancient character recognition system based on presorting |
CN110766020A (en) * | 2019-10-30 | 2020-02-07 | 哈尔滨工业大学 | System and method for detecting and identifying multi-language natural scene text |
CN111429580A (en) * | 2020-02-17 | 2020-07-17 | 浙江工业大学 | Space omnibearing simulation system and method based on virtual reality technology |
CN111783761A (en) * | 2020-06-30 | 2020-10-16 | 苏州科达科技股份有限公司 | Certificate text detection method and device and electronic equipment |
WO2022047662A1 (en) * | 2020-09-02 | 2022-03-10 | Intel Corporation | Method and system of neural network object recognition for warpable jerseys with multiple attributes |
CN112801088A (en) * | 2020-12-31 | 2021-05-14 | 科大讯飞股份有限公司 | Method and related device for correcting distorted text line image |
CN112801088B (en) * | 2020-12-31 | 2024-05-31 | 科大讯飞股份有限公司 | Method and related device for correcting distorted text line image |
CN113657364A (en) * | 2021-08-13 | 2021-11-16 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for recognizing character mark |
CN113657364B (en) * | 2021-08-13 | 2023-07-25 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for identifying text mark |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107273897A (en) | A kind of character recognition method based on deep learning | |
CN109948165B (en) | Fine granularity emotion polarity prediction method based on mixed attention network | |
CN111753828B (en) | Natural scene horizontal character detection method based on deep convolutional neural network | |
CN107464210A (en) | A kind of image Style Transfer method based on production confrontation network | |
CN113221639B (en) | Micro-expression recognition method for representative AU (AU) region extraction based on multi-task learning | |
CN104217214B (en) | RGB D personage's Activity recognition methods based on configurable convolutional neural networks | |
CN105469047B (en) | Chinese detection method and system based on unsupervised learning deep learning network | |
CN106778835A (en) | The airport target by using remote sensing image recognition methods of fusion scene information and depth characteristic | |
CN110110599B (en) | Remote sensing image target detection method based on multi-scale feature fusion | |
CN108304357A (en) | A kind of Chinese word library automatic generation method based on font manifold | |
CN106022355B (en) | High spectrum image sky based on 3DCNN composes joint classification method | |
CN110334584B (en) | Gesture recognition method based on regional full convolution network | |
CN111242241A (en) | Method for amplifying etched character recognition network training sample | |
CN112069900A (en) | Bill character recognition method and system based on convolutional neural network | |
CN109255339B (en) | Classification method based on self-adaptive deep forest human gait energy map | |
CN112347970A (en) | Remote sensing image ground object identification method based on graph convolution neural network | |
CN111127360A (en) | Gray level image transfer learning method based on automatic encoder | |
CN113392244A (en) | Three-dimensional model retrieval method and system based on depth measurement learning | |
Manandhar et al. | Magic layouts: Structural prior for component detection in user interface designs | |
CN114299578A (en) | Dynamic human face generation method based on facial emotion analysis | |
CN116933141B (en) | Multispectral laser radar point cloud classification method based on multicore graph learning | |
CN110866552B (en) | Hyperspectral image classification method based on full convolution space propagation network | |
Du et al. | CAPTCHA recognition based on faster R-CNN | |
CN105844299B (en) | A kind of image classification method based on bag of words | |
CN114627312B (en) | Zero sample image classification method, system, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171020 |