CN112861840A - Complex scene character recognition method and system based on multi-feature fusion convolutional network - Google Patents

Complex scene character recognition method and system based on multi-feature fusion convolutional network

Info

Publication number
CN112861840A
CN112861840A
Authority
CN
China
Prior art keywords
sequence
network
character recognition
lstm
convolutional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110260333.8A
Other languages
Chinese (zh)
Inventor
孙锬锋
蒋兴浩
许可
舒常思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202110260333.8A priority Critical patent/CN112861840A/en
Publication of CN112861840A publication Critical patent/CN112861840A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63 Scene text, e.g. street names
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625 License plates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a complex scene character recognition method and system based on a multi-feature fusion convolutional network. The method comprises: a feature extraction step: constructing a convolutional neural network based on a multi-feature fusion method, and extracting features of image characters to obtain a feature map containing relative position information and time sequence information; a confidence estimation step: constructing a bidirectional LSTM network, and inputting all the feature maps into the bidirectional LSTM network to obtain an image character confidence estimation sequence; a mapping step: constructing a transcription layer, and mapping the image character confidence estimation sequence to obtain an output sequence as the character recognition result. The invention addresses the low license plate character recognition accuracy of existing methods in complex scenes such as blurred images, excessive plate tilt, rain, snow and fog, and overexposed or underexposed images, and improves the generality of license plate character recognition in practical applications.

Description

Complex scene character recognition method and system based on multi-feature fusion convolutional network
Technical Field
The invention relates to the field of computer vision, and in particular to a method and system for recognizing characters in complex scenes based on a multi-feature fusion convolutional network.
Background
With the rapid economic development of China in recent years, the demand for character recognition has grown steadily. Automatic recognition of characters in complex scenes improves management efficiency and reduces labor cost, so character recognition has become a research hotspot. Existing character recognition techniques can be divided into two-stage and one-stage approaches.
In two-stage character recognition, the first stage segments the characters and the second stage recognizes each segmented single-character image. Character segmentation methods include edge extraction, horizontal and vertical projection, and feature projection; character recognition methods include template matching, hidden Markov models, support vector machines and artificial neural networks. Because errors tend to accumulate at the junction between the two stages and continuous semantic information is broken, overall recognition robustness is poor. The approach is also difficult to parallelize, which leads to high average processing latency.
In one-stage character recognition, the recognition system takes a complete character sequence image as input and produces the recognized character sequence in a single step using its character recognition model. The common current approach uses a convolutional neural network model. This keeps the complete semantic information of the character sequence and offers better robustness and higher recognition accuracy. It also allows a degree of parallel computation, improving processing efficiency.
Among existing character recognition techniques, the paper "A Light CNN for End-to-End Car License Plates Detection and Recognition", published by W. Wang, J. Yang, M. Chen and P. Wang in IEEE Access, vol. 7, on 28 November 2019, proposes an end-to-end character recognition network model that performs feature extraction with a CNN and then trains an RNN on the extracted features. The method recognizes character sequences without segmentation, but compared with the multi-feature fusion convolutional network designed in the present invention, its plain CNN cannot extract effective, high-quality features from license plates in complex scenes, so its license plate character recognition accuracy in such scenes is lower. A patent document of Hunan Tumo Communication Technology Co., Ltd., published on 27 December 2019, "Real-time license plate recognition method based on deep learning in complex scenes" (CN110619327A), provides an end-to-end license plate character recognition model that uses a lightweight MobileNet as the feature extraction network within the SSD object detection algorithm and obtains character categories via fully-connected mapping. However, compared with the present invention, which uses a recurrent neural network and connectionist temporal classification to recognize character sequences of indefinite length, that method predicts the seven license plate characters with seven parallel fully-connected layers, so it cannot recognize new energy license plates with 8 characters, and its accuracy on license plate characters in complex scenes is lower.
Complex scenes here include scenes where character recognition accuracy is low due to image blurring, excessive tilt of the character sequence, weather conditions such as rain, snow and fog, and overexposure or underexposure.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a complex scene character recognition method and system based on a multi-feature fusion convolutional network.
The invention provides a complex scene character recognition method based on a multi-feature fusion convolutional network, which comprises the following steps:
a feature extraction step: constructing a convolutional neural network based on a multi-feature fusion method, and extracting features of image characters to obtain a feature map containing relative position information and time sequence information;
a confidence estimation step: constructing a bidirectional LSTM network, and inputting all the feature maps into the bidirectional LSTM network to obtain an image character confidence estimation sequence;
a mapping step: constructing a transcription layer, and mapping the image character confidence estimation sequence to obtain an output sequence as the character recognition result.
Preferably, the method further comprises the following steps:
a model training step: training the convolutional neural network through sample pictures;
a model testing step: fixing the parameters of the trained convolutional neural network and testing the accuracy of the convolutional neural network.
Preferably, the feature extraction step includes:
and constructing a convolutional neural network, and adding a multilayer feature fusion structure into a second layer of the convolutional neural network, wherein the multilayer feature fusion structure comprises two branches added on convolutional layers of the convolutional neural network, one branch is connected with one 1 × 1 convolutional layer, and the other branch is connected with one 5 × 5 convolutional layer.
Preferably, the bidirectional LSTM network comprises: forward LSTM and backward LSTM;
and the forward LSTM and the backward LSTM are formed by connecting a plurality of LSTM units in a chain manner, each LSTM unit comprises an input gate and an output gate, the characteristic diagram is correspondingly input into the input gate of the corresponding LSTM unit, and the output value of the output gate is converted by applying an activation function to obtain the image character confidence coefficient estimation sequence.
Preferably, the image character confidence estimation sequence is set to $y = (y_1, y_2, \ldots, y_T)$; then the conditional probability of a target sequence $\pi$ is

$$p(\pi \mid y) = \prod_{t=1}^{T} y_{\pi_t}^{t}$$

where $T$ is the number of LSTM units and $y_{\pi_t}^{t}$ is the confidence of character $\pi_t$ at step $t$. A shorter sequence obtained through a many-to-one mapping is used as the final prediction result; since different target sequences $\pi$ can map to the same result, the probability of the final output result is the sum of the conditional probabilities of all target sequences $\pi$ that map to it:

$$p(l \mid y) = \sum_{\pi \in \beta^{-1}(l)} p(\pi \mid y)$$

where $\beta$ is the sequence-to-sequence mapping function and $l$ is the mapped sequence.
The invention also provides a complex scene character recognition system based on a multi-feature fusion convolutional network, which comprises:
a feature extraction module: constructing a convolutional neural network based on a multi-feature fusion method, and extracting features of image characters to obtain a feature map containing relative position information and time sequence information;
a confidence estimation module: constructing a bidirectional LSTM network, and inputting all the feature maps into the bidirectional LSTM network to obtain an image character confidence estimation sequence;
a mapping module: constructing a transcription layer, and mapping the image character confidence estimation sequence to obtain an output sequence as the character recognition result.
Preferably, the system further comprises:
a model training module: training the convolutional neural network through sample pictures;
a model testing module: fixing the parameters of the trained convolutional neural network and testing the accuracy of the convolutional neural network.
Preferably, the feature extraction module includes:
and constructing a convolutional neural network, and adding a multilayer feature fusion structure into a second layer of the convolutional neural network, wherein the multilayer feature fusion structure comprises two branches added on convolutional layers of the convolutional neural network, one branch is connected with one 1 × 1 convolutional layer, and the other branch is connected with one 5 × 5 convolutional layer.
Preferably, the bidirectional LSTM network comprises: forward LSTM and backward LSTM;
and the forward LSTM and the backward LSTM are formed by connecting a plurality of LSTM units in a chain manner, each LSTM unit comprises an input gate and an output gate, the characteristic diagram is correspondingly input into the input gate of the corresponding LSTM unit, and the output value of the output gate is converted by applying an activation function to obtain the image character confidence coefficient estimation sequence.
Preferably, the image character confidence estimation sequence is set to $y = (y_1, y_2, \ldots, y_T)$; then the conditional probability of a target sequence $\pi$ is

$$p(\pi \mid y) = \prod_{t=1}^{T} y_{\pi_t}^{t}$$

where $T$ is the number of LSTM units and $y_{\pi_t}^{t}$ is the confidence of character $\pi_t$ at step $t$. A shorter sequence obtained through a many-to-one mapping is used as the final prediction result; since different target sequences $\pi$ can map to the same result, the probability of the final output result is the sum of the conditional probabilities of all target sequences $\pi$ that map to it:

$$p(l \mid y) = \sum_{\pi \in \beta^{-1}(l)} p(\pi \mid y)$$

where $\beta$ is the sequence-to-sequence mapping function and $l$ is the mapped sequence.
Compared with the prior art, the invention has the following beneficial effects:
1. The license plate character recognition method of the invention is suitable for license plate recognition in different scenes and supports plates of different types and character lengths. It addresses the low recognition accuracy of existing methods under complex conditions such as blurred images, excessive plate tilt, rain, snow and fog, and overexposure or underexposure, improving the generality of license plate character recognition in practical applications.
2. Compared with traditional character recognition methods, the deep learning network model of the invention needs no character segmentation step, retains the complete semantic information of the license plate, and achieves better robustness and higher recognition accuracy.
3. During feature extraction, a multi-feature fusion method is adopted. Compared with conventional convolution, it learns both low-level and high-level features of the license plate better and effectively prevents the drop in recognition accuracy caused by feature loss, improving the recognition accuracy of the method in complex scenes.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is an overall framework diagram of the license plate character recognition method in complex scenes based on deep learning according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a multi-feature fused feature extraction network according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.
This embodiment takes license plate recognition as an example, covering common license plates as well as special types such as new energy, police and military plates. Those skilled in the art will appreciate that the invention also applies to character recognition in other fields, such as characters on displays, paper and other media.
As shown in fig. 1, the complex scene character recognition method based on the multi-feature fusion convolutional network provided by the present invention includes:
A feature extraction step: constructing a convolutional neural network based on a multi-feature fusion method, and extracting features of image characters to obtain a feature map containing relative position information and time sequence information. The convolutional neural network is built mainly from convolutional layers, max pooling layers and ReLU activations.
A confidence estimation step: constructing a bidirectional LSTM network, and inputting all the feature maps into the bidirectional LSTM network to obtain an image character confidence estimation sequence.
A mapping step: constructing a transcription layer, and mapping the image character confidence estimation sequence to obtain an output sequence as the character recognition result.
A model training step: training the convolutional neural network through sample pictures.
A model testing step: fixing the parameters of the trained convolutional neural network and testing the accuracy of the convolutional neural network.
The feature extraction step comprises: as shown in Fig. 2, a convolutional neural network is constructed, and a multilayer feature fusion structure is added to its second layer. The structure comprises two branches added to the convolutional layers of the network: one branch connects a 1 × 1 convolutional layer and the other branch connects a 5 × 5 convolutional layer.
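For illustration, a minimal Keras sketch of such a fusion structure follows. The input size, filter counts and the layers around the fusion block are assumptions for the sketch, not the patent's exact architecture:

from tensorflow.keras import layers, Input, Model

def fusion_block(x, filters):
    """Multi-feature fusion: a 3x3 trunk plus a 1x1 branch and a 5x5 branch,
    concatenated so that low-level and high-level features are both kept.
    Filter counts here are illustrative assumptions."""
    trunk = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    b1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)  # 1x1 branch
    b5 = layers.Conv2D(filters, 5, padding="same", activation="relu")(x)  # 5x5 branch
    return layers.Concatenate()([trunk, b1, b5])

inp = Input(shape=(32, 128, 3))  # assumed plate image size (H, W, channels)
x = layers.Conv2D(64, 3, padding="same", activation="relu")(inp)  # first layer
x = layers.MaxPooling2D(2)(x)
x = fusion_block(x, 64)          # fusion structure added at the second layer
x = layers.MaxPooling2D(2)(x)
features = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
cnn = Model(inp, features)

Concatenating the 1 × 1 and 5 × 5 branches with the trunk is what lets the network keep fine local detail alongside wider-context features in a single feature map.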
The bidirectional LSTM network includes a forward LSTM and a backward LSTM, each formed by connecting a plurality of LSTM units in a chain. Each LSTM unit comprises an input gate and an output gate; each feature map is input to the input gate of the corresponding LSTM unit, and an activation function is applied to the output value of the output gate to obtain the image character confidence estimation sequence.
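A hedged sketch of this confidence estimation stage in the same style, where the feature-map shape, LSTM width and character-set size are assumptions:

from tensorflow.keras import layers, Input, Model

H, W, C = 4, 32, 256   # assumed shape of the CNN output feature map
num_classes = 68       # assumed character set size (digits, letters, province codes)

feat = Input(shape=(H, W, C))
# Treat each of the W feature-map columns as one time step of the sequence.
seq = layers.Permute((2, 1, 3))(feat)        # (H, W, C) -> (W, H, C)
seq = layers.Reshape((W, H * C))(seq)        # one feature vector per time step
seq = layers.Bidirectional(layers.LSTM(256, return_sequences=True))(seq)
# Softmax over the characters plus one CTC blank gives the confidence sequence y.
probs = layers.Dense(num_classes + 1, activation="softmax")(seq)
rnn = Model(feat, probs)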
The image character confidence estimation sequence is set as $y = (y_1, y_2, \ldots, y_T)$; then the conditional probability of a target sequence $\pi$ is

$$p(\pi \mid y) = \prod_{t=1}^{T} y_{\pi_t}^{t}$$

where $T$ is the number of LSTM units and $y_{\pi_t}^{t}$ is the confidence of character $\pi_t$ at step $t$. A shorter sequence obtained through a many-to-one mapping is used as the final prediction result; since different target sequences $\pi$ can map to the same result, the probability of the final output result is the sum of the conditional probabilities of all target sequences $\pi$ that map to it:

$$p(l \mid y) = \sum_{\pi \in \beta^{-1}(l)} p(\pi \mid y)$$

where $\beta$ is the sequence-to-sequence mapping function and $l$ is the mapped sequence.
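The many-to-one mapping β (merge consecutive repeats, then drop blanks) and a simple greedy transcription can be sketched in plain Python; greedy argmax decoding is an assumption here, since the patent only specifies a transcription layer:

import numpy as np

BLANK = 0  # assumed index of the CTC blank class

def beta(pi):
    """Many-to-one mapping: merge consecutive repeated labels, then drop blanks,
    e.g. [a, a, blank, b, b] and [a, blank, b] both map to [a, b]."""
    out, prev = [], None
    for s in pi:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return out

def greedy_transcribe(y):
    """Greedy transcription: argmax over each of the T steps of the confidence
    sequence y (shape (T, num_classes + 1)), then apply the mapping beta."""
    return beta(np.argmax(y, axis=1).tolist())

Because β collapses many target sequences π onto the same label sequence l, summing p(π | y) over β⁻¹(l) is exactly what makes variable-length recognition possible without character segmentation.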
Data set
The training and testing data sets comprise a real data set collected from real environments and a synthetic data set generated by computer.
The license plate images in the real data set come from real-world photography and the open-source Chinese license plate data set CCPD, together containing 7561 license plate images. The CCPD data set was introduced by the paper "Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline" (Xu Z, Yang W, Meng A, et al., Computer Vision - ECCV 2018, Springer, Cham, 2018); this publicly downloadable Chinese license plate data set (https://github.com/detectRecog/CCPD) contains over 300,000 plates in total. In the invention, 3400 small license plates, 2700 large license plates, 825 new energy license plates and 390 other special license plates were selected from the CCPD data set. Real-world photography consisted of photographing the license plate regions of different vehicles with handheld mobile phones, mainly capturing small and large license plates under real conditions and specially collecting rare new energy and other special license plates; during shooting, complex scenes with different angles, different backgrounds and different illumination conditions were captured in certain proportions. From this collection, 116 small license plates, 89 large license plates, 32 new energy license plates and 9 other special license plates were selected.
The synthetic license plate data set was produced by generating 100,000 simulated license plates with an OpenCV-based method and then applying style transfer to them with a generative adversarial network, yielding synthetic plates that match the real style of complex scenes. Of these, 40,000 are small license plates, 30,000 large license plates, 20,000 new energy license plates and 10,000 other license plates.
For the normal-condition test set, 1000 license plate images with no overlap with the training data set were randomly selected; for the complex-scene test set, 400 complex license plate images were selected.
Description of the test
In the testing process of this embodiment, the deep learning-based complex scene license plate character recognition model is built and trained using Keras.
First, the recognition model is pre-trained with the generated data set so that it learns some prior knowledge and obtains suitable initial weights. The model is then fine-tuned on the real data set to obtain better network weights. During training, the EarlyStopping function of Keras is used to prevent overfitting.
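As a usage sketch of this training setup: model, train_ds and val_ds stand for the network and data pipelines built earlier, and the patience and epoch values are assumptions, not values stated in the patent:

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Stop when validation loss stops improving, keeping the best weights.
    EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True),
    # Save intermediate results so the best weights can be picked afterwards.
    ModelCheckpoint("weights_{epoch:02d}.h5"),
]
model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=callbacks)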
After training, the weights with the lowest loss on the test set among all saved intermediate results are selected for use. Testing is performed on both the normal and the complex test sets. Test accuracy is defined as the number of license plates with all characters recognized correctly divided by the total number of license plates tested.
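The metric reduces to exact string match at the plate level; a minimal sketch:

def plate_accuracy(predictions, labels):
    """Accuracy = plates whose whole character string is recognized correctly,
    divided by the total number of plates tested."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# e.g. plate_accuracy(["沪A12345", "京B67890"], ["沪A12345", "京B67891"]) == 0.5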
Test results
On the normal-condition test set, the license plate character recognition accuracy of the invention is 92.7%; on the complex-scene test set it is 87.2%. The commonly used CRNN-CTC license plate recognition network achieves 91.2% on the normal-condition test set and 80.0% on the complex-scene test set. The proposed method therefore has higher recognition accuracy on both normal-condition and complex-scene test sets, with a clear advantage on complex license plates, demonstrating its effectiveness.
Those skilled in the art will appreciate that, besides implementing the system and its various devices, modules and units as pure computer-readable program code, the method steps can be logically programmed so that the system and its devices, modules and units are realized as logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its devices, modules and units may be regarded as a hardware component; the devices, modules and units it includes for realizing various functions may be regarded as structures within the hardware component, and means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the embodiments described above, and those skilled in the art may make various changes or modifications within the scope of the claims without departing from the spirit of the invention. The embodiments of the application and the features of the embodiments may be combined with one another arbitrarily in the absence of conflict.

Claims (10)

1. A complex scene character recognition method based on a multi-feature fusion convolutional network is characterized by comprising the following steps:
a feature extraction step: constructing a convolutional neural network based on a multi-feature fusion method, and extracting features of image characters to obtain a feature map containing relative position information and time sequence information;
a confidence estimation step: constructing a bidirectional LSTM network, and inputting all the feature maps into the bidirectional LSTM network to obtain an image character confidence estimation sequence;
a mapping step: constructing a transcription layer, and mapping the image character confidence estimation sequence to obtain an output sequence as the character recognition result.
2. The complex scene character recognition method based on the multi-feature fusion convolutional network of claim 1, further comprising:
a model training step: training the convolutional neural network through sample pictures;
a model testing step: fixing the parameters of the trained convolutional neural network and testing the accuracy of the convolutional neural network.
3. The complex scene character recognition method based on the multi-feature fusion convolutional network of claim 1, wherein the feature extraction step comprises:
and constructing a convolutional neural network, and adding a multilayer feature fusion structure into a second layer of the convolutional neural network, wherein the multilayer feature fusion structure comprises two branches added on convolutional layers of the convolutional neural network, one branch is connected with one 1 × 1 convolutional layer, and the other branch is connected with one 5 × 5 convolutional layer.
4. The complex scene character recognition method based on the multi-feature fusion convolutional network of claim 1, wherein the bidirectional LSTM network comprises: forward LSTM and backward LSTM;
and the forward LSTM and the backward LSTM are formed by connecting a plurality of LSTM units in a chain manner, each LSTM unit comprises an input gate and an output gate, the characteristic diagram is correspondingly input into the input gate of the corresponding LSTM unit, and the output value of the output gate is converted by applying an activation function to obtain the image character confidence coefficient estimation sequence.
5. The method of claim 1, wherein the image character confidence estimation sequence is set to $y = (y_1, y_2, \ldots, y_T)$ and the conditional probability of a target sequence $\pi$ is

$$p(\pi \mid y) = \prod_{t=1}^{T} y_{\pi_t}^{t}$$

where $T$ is the number of LSTM units; a shorter sequence obtained through a many-to-one mapping is used as the final prediction result, and since different target sequences $\pi$ can map to the same result, the probability of the final output result is the sum of the conditional probabilities of all target sequences $\pi$ that map to it:

$$p(l \mid y) = \sum_{\pi \in \beta^{-1}(l)} p(\pi \mid y)$$

where $\beta$ is the sequence-to-sequence mapping function and $l$ is the mapped sequence.
6. A complex scene character recognition system based on a multi-feature fusion convolutional network is characterized by comprising:
a feature extraction module: constructing a convolutional neural network based on a multi-feature fusion method, and extracting features of image characters to obtain a feature map containing relative position information and time sequence information;
a confidence estimation module: constructing a bidirectional LSTM network, and inputting all the feature maps into the bidirectional LSTM network to obtain an image character confidence estimation sequence;
a mapping module: constructing a transcription layer, and mapping the image character confidence estimation sequence to obtain an output sequence as the character recognition result.
7. The complex scene character recognition system based on the multi-feature fusion convolutional network of claim 6, further comprising:
a model training module: training the convolutional neural network through sample pictures;
a model testing module: fixing the parameters of the trained convolutional neural network and testing the accuracy of the convolutional neural network.
8. The complex scene character recognition system based on the multi-feature fusion convolutional network of claim 6, wherein the feature extraction module comprises:
and constructing a convolutional neural network, and adding a multilayer feature fusion structure into a second layer of the convolutional neural network, wherein the multilayer feature fusion structure comprises two branches added on convolutional layers of the convolutional neural network, one branch is connected with one 1 × 1 convolutional layer, and the other branch is connected with one 5 × 5 convolutional layer.
9. The complex scene character recognition system based on multi-feature fusion convolutional network of claim 6, wherein the bidirectional LSTM network comprises: forward LSTM and backward LSTM;
and the forward LSTM and the backward LSTM are formed by connecting a plurality of LSTM units in a chain manner, each LSTM unit comprises an input gate and an output gate, the characteristic diagram is correspondingly input into the input gate of the corresponding LSTM unit, and the output value of the output gate is converted by applying an activation function to obtain the image character confidence coefficient estimation sequence.
10. The complex scene character recognition system based on the multi-feature fusion convolutional network of claim 6, wherein the image character confidence estimation sequence is set to $y = (y_1, y_2, \ldots, y_T)$ and the conditional probability of a target sequence $\pi$ is

$$p(\pi \mid y) = \prod_{t=1}^{T} y_{\pi_t}^{t}$$

where $T$ is the number of LSTM units; a shorter sequence obtained through a many-to-one mapping is used as the final prediction result, and since different target sequences $\pi$ can map to the same result, the probability of the final output result is the sum of the conditional probabilities of all target sequences $\pi$ that map to it:

$$p(l \mid y) = \sum_{\pi \in \beta^{-1}(l)} p(\pi \mid y)$$

where $\beta$ is the sequence-to-sequence mapping function and $l$ is the mapped sequence.
CN202110260333.8A 2021-03-10 2021-03-10 Complex scene character recognition method and system based on multi-feature fusion convolutional network Pending CN112861840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110260333.8A CN112861840A (en) 2021-03-10 2021-03-10 Complex scene character recognition method and system based on multi-feature fusion convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110260333.8A CN112861840A (en) 2021-03-10 2021-03-10 Complex scene character recognition method and system based on multi-feature fusion convolutional network

Publications (1)

Publication Number Publication Date
CN112861840A true CN112861840A (en) 2021-05-28

Family

ID=75993879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110260333.8A Pending CN112861840A (en) 2021-03-10 2021-03-10 Complex scene character recognition method and system based on multi-feature fusion convolutional network

Country Status (1)

Country Link
CN (1) CN112861840A (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107203606A (en) * 2017-05-17 2017-09-26 西北工业大学 Text detection and recognition methods under natural scene based on convolutional neural networks
CN110659648A (en) * 2019-09-27 2020-01-07 北京猎户星空科技有限公司 Character recognition method and device
CN111461112A (en) * 2020-03-03 2020-07-28 华南理工大学 License plate character recognition method based on double-cycle transcription network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
商俊蓓: "Online handwritten digit and formula character recognition based on bidirectional long short-term memory recurrent neural networks", China Master's Theses Full-text Database (Information Science and Technology) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486885A (en) * 2021-06-17 2021-10-08 杭州鸿泉物联网技术股份有限公司 License plate recognition method and device, electronic equipment and storage medium
CN115457561A (en) * 2022-08-30 2022-12-09 东南大学 Tire embossed character recognition general algorithm based on integrated deep learning
CN115457561B (en) * 2022-08-30 2023-09-22 东南大学 Tire embossing character recognition universal method based on integrated deep learning

Similar Documents

Publication Publication Date Title
CN110348445B (en) Instance segmentation method fusing void convolution and edge information
CN107563372B (en) License plate positioning method based on deep learning SSD frame
CN108921875B (en) Real-time traffic flow detection and tracking method based on aerial photography data
CN109241982B (en) Target detection method based on deep and shallow layer convolutional neural network
CN108197326B (en) Vehicle retrieval method and device, electronic equipment and storage medium
CN109508663B (en) Pedestrian re-identification method based on multi-level supervision network
CN111259786A (en) Pedestrian re-identification method based on synchronous enhancement of appearance and motion information of video
CN110969166A (en) Small target identification method and system in inspection scene
CN111767927A (en) Lightweight license plate recognition method and system based on full convolution network
CN113420607A (en) Multi-scale target detection and identification method for unmanned aerial vehicle
Cao et al. A survey on image semantic segmentation methods with convolutional neural network
CN110717863B (en) Single image snow removing method based on generation countermeasure network
CN108491828B (en) Parking space detection system and method based on level pairwise similarity PVAnet
CN113723377A (en) Traffic sign detection method based on LD-SSD network
CN112861840A (en) Complex scene character recognition method and system based on multi-feature fusion convolutional network
CN114049572A (en) Detection method for identifying small target
CN115620393A (en) Fine-grained pedestrian behavior recognition method and system oriented to automatic driving
CN110598540B (en) Method and system for extracting gait contour map in monitoring video
CN111723852A (en) Robust training method for target detection network
CN112785610B (en) Lane line semantic segmentation method integrating low-level features
CN111612803B (en) Vehicle image semantic segmentation method based on image definition
CN111160282B (en) Traffic light detection method based on binary Yolov3 network
CN116994164A (en) Multi-mode aerial image fusion and target detection combined learning method
CN113128461B (en) Pedestrian re-recognition performance improving method based on human body key point mining full-scale features
CN115359487A (en) Rapid railcar number identification method, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210528