CN109685100B - Character recognition method, server and computer readable storage medium - Google Patents

Character recognition method, server and computer readable storage medium

Info

Publication number
CN109685100B
CN109685100B
Authority
CN
China
Prior art keywords
character
character recognition
layers
data
recognition model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811341729.XA
Other languages
Chinese (zh)
Other versions
CN109685100A (en)
Inventor
许洋
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201811341729.XA priority Critical patent/CN109685100B/en
Publication of CN109685100A publication Critical patent/CN109685100A/en
Priority to PCT/CN2019/088638 priority patent/WO2020098250A1/en
Application granted granted Critical
Publication of CN109685100B publication Critical patent/CN109685100B/en

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/24 Character recognition characterised by the processing or recognition method
    • G06V 30/248 Character recognition characterised by the processing or recognition method involving plural approaches, e.g. verification by template match; Resolving confusion among similar patterns, e.g. "O" versus "Q"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to artificial intelligence and discloses a character recognition method comprising the following steps: acquiring character data, and synthesizing each piece of acquired character data with a preset background picture to obtain a character image corresponding to each piece of character data; performing random disturbance processing on the synthesized character images to obtain character images of different types; inputting the character images of different types into a deep learning network for training to generate a character recognition model; and inputting a character image to be recognized into the character recognition model and outputting a recognition result for it. The invention also provides a server and a computer readable storage medium. The character recognition method, server and computer readable storage medium provided by the invention implement the OCR function on the basis of a deep learning algorithm, thereby broadening the character recognition range and improving the character recognition accuracy.

Description

Character recognition method, server and computer readable storage medium
Technical Field
The present invention relates to the field of character recognition, and in particular, to a character recognition method, a server, and a computer readable storage medium.
Background
In OCR (Optical Character Recognition) services, certain fields in a particular scene are usually recognized according to the needs of a business party. This generally requires the business party to provide real picture data for the scene and to annotate it manually; the annotated pictures are then used for the detection and recognition training of deep learning models. Recognition accuracy is generally high when the content of these fields falls within a small finite set (such as the sex on an identity card, or the vehicle type and nature of use on a driving license). When the content of a field falls within a very large finite set, or one that can effectively be regarded as infinite (such as the name on an identity card or the owner of a driving license), recognition is easily limited by the quantity of annotated data, and accuracy suffers to a certain extent.
Disclosure of Invention
In view of the above, the present invention provides a character recognition method that can increase the character recognition range and the character recognition accuracy.
First, to achieve the above object, the present invention provides a server comprising a memory and a processor, wherein the memory stores a character recognition system executable on the processor, and the character recognition system, when executed by the processor, implements the following steps:
Acquiring character data, and performing image synthesis on each acquired character data and a preset background picture to obtain a character image corresponding to each character data;
Carrying out random disturbance processing on the synthesized character images to obtain character images of different types;
Inputting the character images of different types into a deep learning network for training to generate a character recognition model; and
Inputting the character image to be recognized into the character recognition model, and outputting a recognition result of the character image to be recognized.
Optionally, the deep learning network is a CRNN model, where the CRNN model includes a VGG16 layer, two long-short-term memory network LSTM layers, and two fully-connected FC layers, where the VGG16 layer is used to extract spatial features of a character image, the two long-short-term memory network LSTM layers are used to extract time sequence features of the character image, and the two fully-connected FC layers are used to classify the extracted spatial features and the time sequence features.
Optionally, when the character recognition system is executed by the processor, the following steps are further implemented:
Testing the recognition accuracy of the character recognition model on characters; and
If the recognition accuracy is lower than a preset threshold, adjusting the character recognition model.
Optionally, the step of adjusting the character recognition model includes:
Freezing the parameters of the VGG16 layer;
Adjusting the parameters of the two long-short-term memory network LSTM layers and the two fully-connected FC layers; and
Training the adjusted character recognition model with real character image data.
In addition, to achieve the above object, the present invention further provides a character recognition method, applied to a server, the method comprising:
Acquiring character data, and performing image synthesis on each acquired character data and a preset background picture to obtain a character image corresponding to each character data;
Carrying out random disturbance processing on the synthesized character images to obtain character images of different types;
Inputting the character images of different types into a deep learning network for training to generate a character recognition model; and
Inputting the character image to be recognized into the character recognition model, and outputting a recognition result of the character image to be recognized.
Optionally, the deep learning network is a CRNN model, where the CRNN model includes a VGG16 layer, two long-short-term memory network LSTM layers, and two fully-connected FC layers, where the VGG16 layer is used to extract spatial features of a character image, the two long-short-term memory network LSTM layers are used to extract time sequence features of the character image, and the two fully-connected FC layers are used to classify the extracted spatial features and the time sequence features.
Optionally, after the step of inputting the different types of character images into the deep learning network for training to generate the character recognition model, the method further includes:
Testing the recognition accuracy of the character recognition model on characters; and
If the recognition accuracy is lower than a preset threshold, adjusting the character recognition model.
Optionally, the step of adjusting the character recognition model includes:
Freezing the parameters of the VGG16 layer;
Adjusting the parameters of the two long-short-term memory network LSTM layers and the two fully-connected FC layers; and
Training the adjusted character recognition model with real character image data.
Optionally, the random disturbance processing includes at least one of: Gaussian blur processing, Gaussian noise processing, small-amplitude rotation of a picture, contrast change of a picture, and color change of a picture.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a character recognition system executable by at least one processor to cause the at least one processor to perform the steps of the character recognition method as set forth in any one of the above.
Compared with the prior art, the character recognition method, server and computer readable storage medium provided by the invention acquire character data and synthesize each piece of acquired character data with a preset background picture to obtain a character image corresponding to each piece of character data; perform random disturbance processing on the synthesized character images to obtain character images of different types; input the character images of different types into a deep learning network for training to generate a character recognition model; and input a character image to be recognized into the character recognition model, outputting its recognition result. Diversified training sample data can thus be generated as needed, overcoming the narrow recognition range and low accuracy caused in the prior art by the uneven distribution of real training data, thereby broadening the character recognition range and improving the character recognition accuracy.
Drawings
FIG. 1 is a schematic diagram of an alternative hardware architecture of a server according to the present invention;
FIG. 2 is a schematic diagram of a program module of a first embodiment of the character recognition system of the present invention;
FIG. 3 is a schematic diagram of a program module of a second embodiment of the character recognition system of the present invention;
FIG. 4 is a flowchart illustrating a character recognition method according to a first embodiment of the present invention;
fig. 5 is a schematic flow chart of a character recognition method according to a second embodiment of the present invention.
Reference numerals:
Server 2
Network 3
Memory 11
Processor 12
Network interface 13
Character recognition system 100
Acquisition module 101
Processing module 102
Generating module 103
Output module 104
Test module 105
Adjustment module 106
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the descriptions of "first", "second", etc. in this disclosure are for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided the combination can be realized by those skilled in the art; when technical solutions are contradictory or cannot be realized, their combination should be considered not to exist and not within the scope of protection claimed in the present invention.
Referring to fig. 1, an optional hardware architecture of an application server 2 according to the present invention is shown.
In this embodiment, the application server 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus. It is noted that fig. 1 only shows the application server 2 with components 11-13, but it should be understood that not all of the shown components are required; more or fewer components may be implemented instead.
The application server 2 may be a rack server, a blade server, a tower server, or a cabinet server, and the application server 2 may be an independent server or a server cluster formed by a plurality of servers.
The memory 11 includes at least one type of readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), magnetic memory, magnetic disk, or optical disk. In some embodiments, the memory 11 may be an internal storage unit of the application server 2, for example a hard disk or memory of the application server 2. In other embodiments, the memory 11 may also be an external storage device of the application server 2, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or Flash Card provided on the application server 2. Of course, the memory 11 may also comprise both an internal storage unit of the application server 2 and an external storage device. In this embodiment, the memory 11 is typically used to store the operating system installed on the application server 2 and various types of application software, such as the program code of the character recognition system 100. Further, the memory 11 may be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the application server 2. In this embodiment, the processor 12 is configured to execute the program code or process data stored in the memory 11, for example, to execute the character recognition system 100.
The network interface 13 may comprise a wireless network interface or a wired network interface, which network interface 13 is typically used for establishing a communication connection between the application server 2 and other electronic devices.
So far, the hardware structure and function of the related device of the present invention have been described in detail. In the following, various embodiments of the present invention will be presented based on the above description.
First, the present invention proposes a character recognition system 100.
Referring to FIG. 2, a block diagram of a first embodiment of a character recognition system 100 according to the present invention is shown.
In this embodiment, the character recognition system 100 includes a series of computer program instructions stored on the memory 11 that, when executed by the processor 12, perform the character recognition operations of the various embodiments of the invention. In some embodiments, the character recognition system 100 may be divided into one or more modules based on the particular operations implemented by portions of the computer program instructions. For example, in fig. 2, the character recognition system 100 may be divided into an acquisition module 101, a processing module 102, a generation module 103, and an output module 104. Wherein:
The acquiring module 101 is configured to acquire character data, and perform image synthesis on each acquired character data and a preset background picture to obtain a character image corresponding to each character data.
Specifically, the character data may be English letters, symbols, numerals, Chinese characters, and the like; in this embodiment, each piece of character data includes at least one character. The character data may be crawled from the Internet and stored in a preset file, from which it is read directly whenever needed; it may also be character data provided by a business party and stored in a preset file, likewise read directly when needed. Preferably, the preset file is a file in TXT format. Those skilled in the art may obtain the character data in any manner, which will not be described in detail herein.
The preset background picture is a picture determined by the user according to actual needs. In this embodiment, the preset background picture is preferably at least one picture retrieved from the Internet with the keyword "paper"; of course, it may also be a picture obtained after the user photographs various kinds of paper with a camera. It will be appreciated that in other embodiments of the present invention, the preset background picture may be another type of picture, such as a license plate picture or an identity card picture.
For example, when 5 pieces of character data are acquired and 4 preset background pictures are available, each piece of character data is preferably synthesized with each background picture, so that 4 character images can be synthesized for each piece of character data, and 20 character images for the 5 pieces. Of course, it is not necessary to synthesize every piece of character data with every background picture to obtain a character image; this embodiment is not limited in this respect. In this embodiment, synthesizing character data with multiple background pictures increases the diversity of the character images.
In this embodiment, any existing image synthesis technique may be used. For example, the length and width of the pixel area occupied by the character data may first be determined from the length, font style, and word size of the character data; a corresponding pixel area is then selected within the background picture, the pixels corresponding to the character data are inserted into that area, and the pixels originally located there are replaced. It will be understood that in other embodiments, instead of such pixel replacement, pixel superposition may be used directly: each pixel corresponding to the character data is superposed on the corresponding pixel of the selected area, and the superposed value is taken as the pixel value of each pixel in that area.
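As an illustrative sketch only (not the implementation fixed by this disclosure), the replacement-style synthesis might look as follows in Python with Pillow; the font file, word size, and margin are assumed values:

```python
from PIL import Image, ImageDraw, ImageFont

def synthesize(text: str, background: Image.Image,
               font_path: str = "simhei.ttf",        # assumed font style
               font_size: int = 32) -> Image.Image:  # assumed word size
    """Synthesize one piece of character data with a background picture."""
    font = ImageFont.truetype(font_path, font_size)
    # Length and width of the pixel area occupied by the character data
    left, top, right, bottom = font.getbbox(text)
    w, h = right - left, bottom - top
    bg = background.convert("RGB").resize((w + 16, h + 16))  # 8 px margin, assumed
    draw = ImageDraw.Draw(bg)
    # Pixel replacement: drawing the text overwrites the pixels in the selected area
    draw.text((8, 8), text, fill=(0, 0, 0), font=font)
    return bg
```

For the superposition variant, the text pixel values would instead be blended with the underlying background values rather than replacing them.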
The processing module 102 is configured to perform random perturbation processing on the synthesized character image to obtain different types of character images.
Specifically, the random disturbance processing includes Gaussian blur processing, Gaussian noise processing, small-amplitude rotation of a picture, contrast change of a picture, color change of a picture, and the like. Gaussian blur processing means filtering the picture with a Gaussian kernel of a certain mean and variance. Gaussian noise processing means adding Gaussian noise points to the three color channels of the picture; unlike Gaussian blur, which filters the picture, Gaussian noise is a direct superposition of values. Small-amplitude rotation means determining a rotation center point according to the field bounding box (the center of the picture may be taken directly as the rotation center, or it may be adjusted according to actual service requirements) and then rotating the picture by an angle around that point. Contrast change means randomly changing the S (Saturation) and V (Value, brightness) of the picture in HSV color space. Color change means randomly changing the H (Hue) of the picture in HSV color space.
In this embodiment, by applying at least one of the disturbance processing methods described above to the synthesized images, different types of character images can be obtained, for example rotated character images, noisy character images, and slanted character images. Disturbance processing of the synthesized images increases the diversity of the character images, making the training sample data richer, so that the character recognition model trained on these samples achieves higher recognition accuracy.
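A minimal sketch of the disturbance methods listed above, using OpenCV and NumPy; the kernel size, noise scale, rotation range, and jitter ranges are illustrative assumptions rather than values fixed by this disclosure:

```python
import cv2
import numpy as np

def perturb(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply one randomly chosen disturbance to a BGR character image."""
    choice = rng.integers(5)
    if choice == 0:                       # Gaussian blur: filter the picture
        return cv2.GaussianBlur(img, (5, 5), 1.0)
    if choice == 1:                       # Gaussian noise: superpose values on all three channels
        noise = rng.normal(0.0, 8.0, img.shape)
        return np.clip(img + noise, 0, 255).astype(np.uint8)
    if choice == 2:                       # small-amplitude rotation about the picture centre
        h, w = img.shape[:2]
        m = cv2.getRotationMatrix2D((w / 2, h / 2), rng.uniform(-5, 5), 1.0)
        return cv2.warpAffine(img, m, (w, h), borderMode=cv2.BORDER_REPLICATE)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
    if choice == 3:                       # contrast change: jitter S and V
        hsv[..., 1:] *= rng.uniform(0.8, 1.2, size=2)
    else:                                 # colour change: jitter H
        hsv[..., 0] = (hsv[..., 0] + rng.uniform(-10, 10)) % 180
    return cv2.cvtColor(np.clip(hsv, 0, 255).astype(np.uint8), cv2.COLOR_HSV2BGR)
```

Applying such a routine repeatedly to each synthesized image yields the different types of character images used as training samples.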
The generating module 103 is configured to input the different types of character images into a deep learning network for training to generate a character recognition model.
Specifically, before inputting different types of character images into the deep learning network, the character images need to be preprocessed to be converted into required feature vectors, and then the required feature vectors are input into the deep learning network for training.
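A small sketch of this preprocessing, assuming a fixed input height of 32 pixels, aspect-ratio-preserving resizing, and simple [0, 1] scaling; the disclosure does not specify the feature-vector format, so these are illustrative choices:

```python
import cv2
import numpy as np
import torch

def to_input_tensor(img: np.ndarray, height: int = 32) -> torch.Tensor:
    """Convert a colour character image (H, W, 3) into a normalized NCHW tensor."""
    h, w = img.shape[:2]
    img = cv2.resize(img, (max(1, round(w * height / h)), height))  # fixed height, free width
    x = torch.from_numpy(img).float().permute(2, 0, 1) / 255.0      # HWC -> CHW, scale to [0, 1]
    return x.unsqueeze(0)                                           # add a batch dimension
```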
In this embodiment, the deep learning network is preferably a CRNN model, a joint model of a convolutional neural network and a recurrent neural network that can be trained end to end. It has the following advantages: 1) the input data may be of any length (arbitrary image width, arbitrary word length); 2) the training set does not require per-character annotation; 3) it can be used both with a lexicon (dictionary) and without one; 4) it performs well with a small model (few parameters).
In a specific embodiment, the CRNN model includes a VGG16 layer, two long-short-term memory network LSTM layers, and two fully-connected FC layers. The VGG16 layer consists of 13 convolution layers and 3 fully-connected layers and is used to extract the spatial features of a character image; the two LSTM layers are used to extract the time sequence features of the character image so as to capture the context of the text being trained and recognized; the two fully-connected FC layers are used to classify the extracted spatial and time sequence features. Compared with the existing CRNN model, the CRNN model in this embodiment adds one fully-connected FC layer to accelerate training convergence.
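A compressed PyTorch sketch of the architecture just described; torchvision's VGG16 feature extractor stands in for the 13 convolution layers, the hidden size and class count are assumed, and collapsing the feature-map height by averaging is one common choice that the disclosure does not fix:

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class CRNN(nn.Module):
    """VGG16 convolution stack -> two LSTM layers -> two FC layers."""
    def __init__(self, num_classes: int, hidden: int = 256):
        super().__init__()
        self.backbone = vgg16().features            # the 13 convolution layers of VGG16
        self.lstm1 = nn.LSTM(512, hidden, bidirectional=True, batch_first=True)
        self.lstm2 = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.fc1 = nn.Linear(2 * hidden, hidden)    # the additional FC layer of this embodiment
        self.fc2 = nn.Linear(hidden, num_classes)   # per-timestep character classification

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.backbone(x)                  # (N, 512, H', W'): spatial features
        f = f.mean(dim=2).permute(0, 2, 1)    # collapse height into a (N, W', 512) sequence
        f, _ = self.lstm1(f)                  # time sequence features
        f, _ = self.lstm2(f)
        return self.fc2(torch.relu(self.fc1(f)))  # (N, W', num_classes)
```

Bidirectional LSTMs are assumed here because they give each timestep both left and right context; the disclosure does not state the directionality.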
The output module 104 is configured to input a character image to be recognized into the character recognition model, and output a recognition result of the character image to be recognized.
In this embodiment, when a user needs to recognize characters, it suffices to collect a character image of the characters to be recognized and input it into the character recognition model, which then recognizes the characters corresponding to the image. The character recognition model may be stored on a local character recognition terminal or on a server, selected according to the actual needs of the user; this embodiment is not limited in this respect.
Through the program modules 101-104, the character recognition system 100 provided by the invention acquires character data and synthesizes each piece of acquired character data with a preset background picture to obtain a character image corresponding to each piece of character data; performs random disturbance processing on the synthesized character images to obtain character images of different types; inputs the character images of different types into a deep learning network for training to generate a character recognition model; and inputs a character image to be recognized into the character recognition model, outputting its recognition result. Diversified training sample data can thus be generated as needed, overcoming the narrow recognition range and low accuracy caused in the prior art by the uneven distribution of real training data, thereby broadening the character recognition range and improving the character recognition accuracy.
Referring to FIG. 3, a block diagram of a character recognition system 100 according to a second embodiment of the present invention is shown. In this embodiment, the character recognition system 100 includes a series of computer program instructions stored on the memory 11 that, when executed by the processor 12, perform the character recognition operations of the various embodiments of the invention. In some embodiments, the character recognition system 100 may be divided into one or more modules based on the particular operations implemented by portions of the computer program instructions. For example, in fig. 3, the character recognition system 100 may be divided into an acquisition module 101, a processing module 102, a generation module 103, an output module 104, a test module 105, and an adjustment module 106. The program modules 101-104 are identical to the first embodiment of the character recognition system 100 of the present invention, with the addition of a test module 105 and an adjustment module 106. Wherein:
The acquiring module 101 is configured to acquire character data, and perform image synthesis on each acquired character data and a preset background picture to obtain a character image corresponding to each character data.
Specifically, the character data may be English letters, symbols, numerals, Chinese characters, and the like; in this embodiment, each piece of character data includes at least one character. The character data may be crawled from the Internet and stored in a preset file, from which it is read directly whenever needed; it may also be character data provided by a business party and stored in a preset file, likewise read directly when needed. Preferably, the preset file is a file in TXT format. Those skilled in the art may obtain the character data in any manner, which will not be described in detail herein.
The preset background picture is a picture determined by the user according to actual needs. In this embodiment, the preset background picture is preferably at least one picture retrieved from the Internet with the keyword "paper"; of course, it may also be a picture obtained after the user photographs various kinds of paper with a camera. It will be appreciated that in other embodiments of the present invention, the preset background picture may be another type of picture, such as a license plate picture or an identity card picture.
For example, when 5 pieces of character data are acquired and 4 preset background pictures are available, each piece of character data is preferably synthesized with each background picture, so that 4 character images can be synthesized for each piece of character data, and 20 character images for the 5 pieces. Of course, it is not necessary to synthesize every piece of character data with every background picture to obtain a character image; this embodiment is not limited in this respect. In this embodiment, synthesizing character data with multiple background pictures increases the diversity of the character images.
In this embodiment, any existing image synthesis technique may be used. For example, the length and width of the pixel area occupied by the character data may first be determined from the length, font style, and word size of the character data; a corresponding pixel area is then selected within the background picture, the pixels corresponding to the character data are inserted into that area, and the pixels originally located there are replaced. It will be understood that in other embodiments, instead of such pixel replacement, pixel superposition may be used directly: each pixel corresponding to the character data is superposed on the corresponding pixel of the selected area, and the superposed value is taken as the pixel value of each pixel in that area.
The processing module 102 is configured to perform random perturbation processing on the synthesized character image to obtain different types of character images.
Specifically, the random disturbance processing includes Gaussian blur processing, Gaussian noise processing, small-amplitude rotation of a picture, contrast change of a picture, color change of a picture, and the like. Gaussian blur processing means filtering the picture with a Gaussian kernel of a certain mean and variance. Gaussian noise processing means adding Gaussian noise points to the three color channels of the picture; unlike Gaussian blur, which filters the picture, Gaussian noise is a direct superposition of values. Small-amplitude rotation means determining a rotation center point according to the field bounding box (the center of the picture may be taken directly as the rotation center, or it may be adjusted according to actual service requirements) and then rotating the picture by an angle around that point. Contrast change means randomly changing the S (Saturation) and V (Value, brightness) of the picture in HSV color space. Color change means randomly changing the H (Hue) of the picture in HSV color space.
In this embodiment, by applying at least one of the disturbance processing methods described above to the synthesized images, different types of character images can be obtained, for example rotated character images, noisy character images, and slanted character images. Disturbance processing of the synthesized images increases the diversity of the character images, making the training sample data richer, so that the character recognition model trained on these samples achieves higher recognition accuracy.
The generating module 103 is configured to input the different types of character images into a deep learning network for training to generate a character recognition model.
Specifically, before inputting different types of character images into the deep learning network, the character images need to be preprocessed to be converted into required feature vectors, and then the required feature vectors are input into the deep learning network for training.
In this embodiment, the deep learning network is preferably a CRNN model, a joint model of a convolutional neural network and a recurrent neural network that can be trained end to end. It has the following advantages: 1) the input data may be of any length (arbitrary image width, arbitrary word length); 2) the training set does not require per-character annotation; 3) it can be used both with a lexicon (dictionary) and without one; 4) it performs well with a small model (few parameters).
In a specific embodiment, the CRNN model includes a VGG16 layer, two long-short-term memory network LSTM layers, and two fully-connected FC layers. The VGG16 layer consists of 13 convolution layers and 3 fully-connected layers and is used to extract the spatial features of a character image; the two LSTM layers are used to extract the time sequence features of the character image so as to capture the context of the text being trained and recognized; the two fully-connected FC layers are used to classify the extracted spatial and time sequence features. Compared with the existing CRNN model, the CRNN model in this embodiment adds one fully-connected FC layer to accelerate training convergence.
The output module 104 is configured to input a character image to be recognized into the character recognition model, and output a recognition result of the character image to be recognized.
In this embodiment, when a user needs to recognize characters, it suffices to collect a character image of the characters to be recognized and input it into the character recognition model, which then recognizes the characters corresponding to the image. The character recognition model may be stored on a local character recognition terminal or on a server, selected according to the actual needs of the user.
The test module 105 is configured to test the recognition accuracy of the character recognition model on the character.
Specifically, after the character recognition model is generated, the recognition accuracy of the character recognition model to the real character image data needs to be tested.
In one embodiment, a user inputs character images of a number of real characters into the character recognition model, which outputs the recognition results corresponding to those real characters; the accuracy of character recognition is then calculated from the output results. It will be appreciated that the number of real character images input into the character recognition model should be as large as possible in order to obtain an accurate estimate of the recognition rate.
When calculating the recognition accuracy, the recognition result output by the character recognition model is compared with the pre-stored character data to determine whether each character was recognized correctly. Each correct recognition is counted as 1 and accumulated until all characters have been recognized; the accumulated count divided by the number of characters input into the character recognition model gives the recognition accuracy of the model on real character image data.
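That calculation amounts to the following sketch, in which `model.recognize` is a hypothetical inference helper and `samples` pairs each real character image with its pre-stored character data:

```python
def recognition_accuracy(model, samples) -> float:
    """samples: iterable of (character_image, expected_text) pairs."""
    correct = total = 0
    for image, expected in samples:
        # Compare the model output with the pre-stored character data;
        # accumulate a count of 1 for every correct recognition.
        if model.recognize(image) == expected:  # hypothetical helper
            correct += 1
        total += 1
    return correct / total if total else 0.0

# e.g. trigger adjustment when recognition_accuracy(model, test_set) < 0.90
```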
The adjustment module 106 is configured to adjust the character recognition model if the recognition accuracy is lower than a preset threshold.
Specifically, after the recognition accuracy of the character recognition model is obtained, it is compared with a preset threshold, and if it is lower than the preset threshold, the character recognition model is adjusted. In this embodiment, the preset threshold is a preset minimum acceptable recognition accuracy, for example 90%. The preset threshold may be set according to the actual needs of the user and modified later as requirements change.
It should be noted that, in this embodiment, adjusting the character recognition model means only fine-tuning it; no large-scale adjustment is required.
Specifically, the step of adjusting the character recognition model includes:
Step A: freezing the parameters of the VGG16 layer.
In this embodiment, when the character recognition model is adjusted, the parameters of the VGG16 layer are kept unchanged, that is, the VGG16 layer is frozen, so that its parameters are not updated by the training sample data during the adjustment.
Step B: adjusting the parameters of the two LSTM layers and the two FC layers.
In this embodiment, when the character recognition model is adjusted, the parameters of the two long-short-term memory network LSTM layers and the two fully-connected FC layers are adjusted. Specifically, the parameters of these layers are released for training, and the learning rate is set to decay every few epochs until it reaches a boundary value.
Step C: training the adjusted character recognition model with real character image data.
In this embodiment, while the parameters of the two LSTM layers and the two FC layers are adjusted, real character images are input into the model and training continues, yielding the adjusted character recognition model. After the adjusted model is obtained, the test module 105 tests its recognition accuracy; if the test result meets the requirement, training of the character recognition model is complete. If the test result obtained by the test module 105 still does not meet the requirement, steps A-C are repeated until the recognition accuracy of the resulting model meets the requirement. A sketch of these three steps follows.
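A hedged PyTorch sketch of steps A-C, reusing the attribute names from the CRNN sketch above; the learning rate, decay interval, decay factor, and boundary value are assumed, since the disclosure only specifies decay "every several epochs" down to "a boundary value", and `criterion`, `real_data_loader`, and `num_epochs` are stand-ins:

```python
import itertools
import torch

# Step A: freeze the parameters of the VGG16 layer
for p in model.backbone.parameters():
    p.requires_grad = False

# Step B: release only the two LSTM layers and the two FC layers for training
tunable = itertools.chain(model.lstm1.parameters(), model.lstm2.parameters(),
                          model.fc1.parameters(), model.fc2.parameters())
optimizer = torch.optim.Adam(tunable, lr=1e-4)  # assumed learning rate
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.5)

# Step C: continue training on real character image data
for epoch in range(num_epochs):                  # num_epochs is a stand-in
    for images, targets in real_data_loader:     # stand-in DataLoader of real images
        optimizer.zero_grad()
        loss = criterion(model(images), targets)  # loss function chosen by the implementer
        loss.backward()
        optimizer.step()
    if scheduler.get_last_lr()[0] > 1e-6:        # stop decaying at the boundary value
        scheduler.step()
```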
Through the program modules 101-106, the character recognition system 100 provided by the invention acquires character data and synthesizes each piece of acquired character data with a preset background picture to obtain a character image corresponding to each piece of character data; performs random disturbance processing on the synthesized character images to obtain character images of different types; inputs the character images of different types into a deep learning network for training to generate a character recognition model; inputs a character image to be recognized into the character recognition model and outputs its recognition result; tests the recognition accuracy of the character recognition model on characters; and, if the recognition accuracy is lower than a preset threshold, adjusts the character recognition model. In this way, when the character recognition model does not reach the preset recognition accuracy, it is fine-tuned, thereby improving the accuracy of character recognition.
In addition, the invention also provides a character recognition method.
Referring to fig. 4, a flowchart of a character recognition method according to a first embodiment of the present invention is shown. In this embodiment, the execution sequence of the steps in the flowchart shown in fig. 4 may be changed, and some steps may be omitted according to different requirements.
Step S500, acquiring character data, and carrying out image synthesis on the acquired character data and a preset background picture to obtain a character image corresponding to each character data.
Specifically, the character data may be English letters, symbols, numerals, Chinese characters, and the like; in this embodiment, each piece of character data includes at least one character. The character data may be crawled from the Internet and stored in a preset file, from which it is read directly whenever needed; it may also be character data provided by a business party and stored in a preset file, likewise read directly when needed. Preferably, the preset file is a file in TXT format. Those skilled in the art may obtain the character data in any manner, which will not be described in detail herein.
The preset background picture is a picture determined by the user according to actual needs. In this embodiment, the preset background picture is preferably at least one picture retrieved from the Internet with the keyword "paper"; of course, it may also be a picture obtained after the user photographs various kinds of paper with a camera. It will be appreciated that in other embodiments of the present invention, the preset background picture may be another type of picture, such as a license plate picture or an identity card picture.
For example, when 5 pieces of character data are acquired and 4 preset background pictures are available, each piece of character data is preferably synthesized with each background picture, so that 4 character images can be synthesized for each piece of character data, and 20 character images for the 5 pieces. Of course, it is not necessary to synthesize every piece of character data with every background picture to obtain a character image; this embodiment is not limited in this respect. In this embodiment, synthesizing character data with multiple background pictures increases the diversity of the character images.
In this embodiment, any existing image synthesis technique may be used. For example, the length and width of the pixel area occupied by the character data may first be determined from the length, font style, and word size of the character data; a corresponding pixel area is then selected within the background picture, the pixels corresponding to the character data are inserted into that area, and the pixels originally located there are replaced. It will be understood that in other embodiments, instead of such pixel replacement, pixel superposition may be used directly: each pixel corresponding to the character data is superposed on the corresponding pixel of the selected area, and the superposed value is taken as the pixel value of each pixel in that area.
Step S502, carrying out random disturbance processing on the synthesized character images to obtain different types of character images.
Specifically, the random disturbance processing includes Gaussian blur processing, Gaussian noise processing, small-amplitude rotation of a picture, contrast change of a picture, color change of a picture, and the like. Gaussian blur processing means filtering the picture with a Gaussian kernel of a certain mean and variance. Gaussian noise processing means adding Gaussian noise points to the three color channels of the picture; unlike Gaussian blur, which filters the picture, Gaussian noise is a direct superposition of values. Small-amplitude rotation means determining a rotation center point according to the field bounding box (the center of the picture may be taken directly as the rotation center, or it may be adjusted according to actual service requirements) and then rotating the picture by an angle around that point. Contrast change means randomly changing the S (Saturation) and V (Value, brightness) of the picture in HSV color space. Color change means randomly changing the H (Hue) of the picture in HSV color space.
In this embodiment, by applying at least one of the disturbance processing methods described above to the synthesized images, different types of character images can be obtained, for example rotated character images, noisy character images, and slanted character images. Disturbance processing of the synthesized images increases the diversity of the character images, making the training sample data richer, so that the character recognition model trained on these samples achieves higher recognition accuracy.
Step S504, inputting the character images of different types into a deep learning network for training to generate a character recognition model.
Specifically, before inputting different types of character images into the deep learning network, the character images need to be preprocessed to be converted into required feature vectors, and then the required feature vectors are input into the deep learning network for training.
In this embodiment, the deep learning network is preferably a CRNN model, a joint model of a convolutional neural network and a recurrent neural network that can be trained end to end. It has the following advantages: 1) the input data may be of any length (arbitrary image width, arbitrary word length); 2) the training set does not require per-character annotation; 3) it can be used both with a lexicon (dictionary) and without one; 4) it performs well with a small model (few parameters).
In a specific embodiment, the CRNN model includes a VGG16 layer, two long-short-term memory network LSTM layers, and two fully-connected FC layers. The VGG16 layer consists of 13 convolution layers and 3 fully-connected layers and is used to extract the spatial features of a character image; the two LSTM layers are used to extract the time sequence features of the character image so as to capture the context of the text being trained and recognized; the two fully-connected FC layers are used to classify the extracted spatial and time sequence features. Compared with the existing CRNN model, the CRNN model in this embodiment adds one fully-connected FC layer to accelerate training convergence.
Step S506, inputting the character image to be recognized into the character recognition model, and outputting the recognition result of the character image to be recognized.
In this embodiment, when a user needs to recognize characters, it suffices to collect a character image of the characters to be recognized and input it into the character recognition model, which then recognizes the characters corresponding to the image. The character recognition model may be stored on a local character recognition terminal or on a server, selected according to the actual needs of the user; this embodiment is not limited in this respect.
Through steps S500-S506, the character recognition method provided by the invention acquires character data and synthesizes each piece of acquired character data with a preset background picture to obtain a character image corresponding to each piece of character data; performs random disturbance processing on the synthesized character images to obtain character images of different types; inputs the character images of different types into a deep learning network for training to generate a character recognition model; and inputs a character image to be recognized into the character recognition model, outputting its recognition result. Diversified training sample data can thus be generated as needed, overcoming the narrow recognition range and low accuracy caused in the prior art by the uneven distribution of real training data, thereby broadening the character recognition range and improving the character recognition accuracy.
Referring to fig. 5, a flowchart of a character recognition method according to a second embodiment of the present invention is shown. In this embodiment, the execution sequence of the steps in the flowchart shown in fig. 5 may be changed, and some steps may be omitted according to different requirements.
Step S600, acquiring character data, and performing image synthesis on the acquired character data and a preset background picture to obtain a character image corresponding to each character data.
Step S602, performing random disturbance processing on the synthesized character images to obtain different types of character images.
In step S604, the different types of character images are input into a deep learning network for training to generate a character recognition model.
Step S606, inputting the character image to be recognized into the character recognition model, and outputting the recognition result of the character image to be recognized.
Steps S600 to S606 are similar to steps S500 to S506, and are not described in detail in this embodiment.
Step S608, testing the recognition accuracy of the character recognition model to the character.
Specifically, after the character recognition model is generated, the recognition accuracy of the character recognition model to the real character image data needs to be tested.
In one embodiment, a user inputs character images of a number of real characters into the character recognition model, which outputs the recognition results corresponding to those real characters; the accuracy of character recognition is then calculated from the output results. It will be appreciated that the number of real character images input into the character recognition model should be as large as possible in order to obtain an accurate estimate of the recognition rate.
When calculating the recognition accuracy, the recognition result output by the character recognition model is compared with the pre-stored character data to determine whether each character was recognized correctly. Each correct recognition is counted as 1 and accumulated until all characters have been recognized; the accumulated count divided by the number of characters input into the character recognition model gives the recognition accuracy of the model on real character image data.
Step S610, if the recognition accuracy is lower than a preset threshold, adjusting the character recognition model.
Specifically, after the recognition accuracy of the character recognition model is obtained, it is compared with a preset threshold, and if it is lower than the preset threshold, the character recognition model is adjusted. In this embodiment, the preset threshold is a preset minimum acceptable recognition accuracy, for example 90%. The preset threshold may be set according to the actual needs of the user and modified later as requirements change.
It should be noted that, in this embodiment, adjusting the character recognition model means only fine-tuning it; no large-scale adjustment is required.
Specifically, the step of adjusting the character recognition model includes:
Step A: freezing the parameters of the VGG16 layer.
In this embodiment, when the character recognition model is adjusted, the parameters of the VGG16 layer are kept unchanged, that is, the VGG16 layer is frozen, so that its parameters are not updated by the training sample data during the adjustment.
Step B: adjusting the parameters of the two LSTM layers and the two FC layers.
In this embodiment, when the character recognition model is adjusted, the parameters of the two long-short-term memory network LSTM layers and the two fully-connected FC layers are adjusted. Specifically, the parameters of these layers are released for training, and the learning rate is set to decay every few epochs until it reaches a boundary value.
Step C: training the adjusted character recognition model with real character image data.
In this embodiment, while the parameters of the two LSTM layers and the two FC layers are adjusted, real character images are input into the model and training continues, yielding the adjusted character recognition model. After the adjusted model is obtained, the test module 105 tests its recognition accuracy; if the test result meets the requirement, training of the character recognition model is complete. If the test result obtained by the test module 105 still does not meet the requirement, steps A-C are repeated until the recognition accuracy of the resulting model meets the requirement.
Through steps S600 to S610, the character recognition method provided by the invention acquires character data and synthesizes each item of character data with a preset background picture to obtain a corresponding character image; performs random disturbance processing on the synthesized images to obtain character images of different types; inputs these character images into a deep learning network for training to generate a character recognition model; inputs a character image to be recognized into the character recognition model and outputs its recognition result; tests the recognition accuracy of the model; and, if the accuracy is lower than a preset threshold, adjusts the model. In this way, when the character recognition model does not reach the preset recognition accuracy, it is fine-tuned, thereby improving the accuracy of character recognition.
The foregoing embodiment numbers are for description only and do not indicate the relative merits of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the method of the above embodiments may be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is preferred. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied as a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, or optical disk) that includes several instructions for causing a server (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the method of the embodiments of the present invention.
The foregoing description covers only preferred embodiments of the present invention and is not intended to limit its scope; any equivalent structures or equivalent processes derived from the present disclosure, whether used directly or indirectly in other related technical fields, are likewise included within the scope of protection of the present invention.

Claims (4)

1. A character recognition method applied to a server, the method comprising:
Acquiring character data, and performing image synthesis on each item of acquired character data with a preset background picture to obtain a character image corresponding to each item of character data;
Carrying out random disturbance processing on the synthesized character images to obtain character images of different types;
Converting the character images of different types into spatial features and time-sequence features, and inputting the spatial features and time-sequence features into a deep learning network for training to generate a character recognition model, wherein the deep learning network is a CRNN model comprising a VGG16 layer, two long short-term memory (LSTM) layers, and two fully connected (FC) layers; the VGG16 layer consists of 13 convolutional layers and 3 fully connected layers and is used for extracting the spatial features of the character images, the two LSTM layers are used for extracting the time-sequence features of the character images, and the two FC layers are used for classifying the extracted spatial features and time-sequence features;
Testing the recognition accuracy of the character recognition model on characters, and, if the recognition accuracy is lower than a preset threshold, adjusting the character recognition model by: freezing the parameters of the VGG16 layer, adjusting the parameters of the two LSTM layers and the two FC layers, and training the adjusted character recognition model with real character image data;
Inputting the character image to be recognized into the character recognition model, and outputting a recognition result of the character image to be recognized.
2. The character recognition method according to claim 1, wherein the random disturbance processing includes at least one of: Gaussian blur processing, Gaussian noise processing, small-angle rotation of the picture, contrast change of the picture, and color change of the picture.
3. A server comprising a memory and a processor, the memory storing a character recognition system operable on the processor, wherein the character recognition system, when executed by the processor, performs the steps of:
Acquiring character data, and performing image synthesis on each item of acquired character data with a preset background picture to obtain a character image corresponding to each item of character data;
Carrying out random disturbance processing on the synthesized character images to obtain character images of different types;
Converting the character images of different types into spatial features and time-sequence features, and inputting the spatial features and time-sequence features into a deep learning network for training to generate a character recognition model, wherein the deep learning network is a CRNN model comprising a VGG16 layer, two long short-term memory (LSTM) layers, and two fully connected (FC) layers; the VGG16 layer consists of 13 convolutional layers and 3 fully connected layers and is used for extracting the spatial features of the character images, the two LSTM layers are used for extracting the time-sequence features of the character images, and the two FC layers are used for classifying the extracted spatial features and time-sequence features;
Testing the recognition accuracy of the character recognition model on characters, and, if the recognition accuracy is lower than a preset threshold, adjusting the character recognition model by: freezing the parameters of the VGG16 layer, adjusting the parameters of the two LSTM layers and the two FC layers, and training the adjusted character recognition model with real character image data;
Inputting the character image to be recognized into the character recognition model, and outputting a recognition result of the character image to be recognized.
4. A computer readable storage medium storing a character recognition system executable by at least one processor to cause the at least one processor to perform the steps of the character recognition method of any one of claims 1-2.
CN201811341729.XA 2018-11-12 2018-11-12 Character recognition method, server and computer readable storage medium Active CN109685100B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811341729.XA CN109685100B (en) 2018-11-12 2018-11-12 Character recognition method, server and computer readable storage medium
PCT/CN2019/088638 WO2020098250A1 (en) 2018-11-12 2019-05-27 Character recognition method, server, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811341729.XA CN109685100B (en) 2018-11-12 2018-11-12 Character recognition method, server and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109685100A CN109685100A (en) 2019-04-26
CN109685100B true CN109685100B (en) 2024-05-10

Family

ID=66185317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811341729.XA Active CN109685100B (en) 2018-11-12 2018-11-12 Character recognition method, server and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN109685100B (en)
WO (1) WO2020098250A1 (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685100B (en) * 2018-11-12 2024-05-10 平安科技(深圳)有限公司 Character recognition method, server and computer readable storage medium
CN110135413B (en) * 2019-05-08 2021-08-17 达闼机器人有限公司 Method for generating character recognition image, electronic equipment and readable storage medium
CN110222693B (en) * 2019-06-03 2022-03-08 第四范式(北京)技术有限公司 Method and device for constructing character recognition model and recognizing characters
CN110348436A * 2019-06-19 2019-10-18 平安普惠企业管理有限公司 Method for recognizing text information in images and related device
CN110458184B (en) * 2019-06-26 2023-06-30 平安科技(深圳)有限公司 Optical character recognition assistance method, device, computer equipment and storage medium
WO2020258491A1 (en) * 2019-06-28 2020-12-30 平安科技(深圳)有限公司 Universal character recognition method, apparatus, computer device, and storage medium
CN110363290B (en) * 2019-07-19 2023-07-25 广东工业大学 Image recognition method, device and equipment based on hybrid neural network model
CN112287932B (en) * 2019-07-23 2024-05-10 上海高德威智能交通系统有限公司 Method, device, equipment and storage medium for determining image quality
CN110765442A (en) * 2019-09-30 2020-02-07 奇安信科技集团股份有限公司 Method and device for identifying verification code in verification picture and electronic equipment
US10990876B1 (en) 2019-10-08 2021-04-27 UiPath, Inc. Detecting user interface elements in robotic process automation using convolutional neural networks
US11157783B2 (en) 2019-12-02 2021-10-26 UiPath, Inc. Training optical character detection and recognition models for robotic process automation
CN113221601A (en) 2020-01-21 2021-08-06 深圳富泰宏精密工业有限公司 Character recognition method, device and computer readable storage medium
CN111414844B (en) * 2020-03-17 2023-08-29 北京航天自动控制研究所 Container number identification method based on convolutional neural network
CN112287934A (en) * 2020-08-12 2021-01-29 北京京东尚科信息技术有限公司 Method and device for recognizing characters and obtaining character image feature extraction model
CN112052852B (en) * 2020-09-09 2023-12-29 国家气象信息中心 Character recognition method of handwriting meteorological archive data based on deep learning
CN112215221A (en) * 2020-09-22 2021-01-12 国交空间信息技术(北京)有限公司 Automatic vehicle frame number identification method
CN112287936A (en) * 2020-09-24 2021-01-29 深圳市智影医疗科技有限公司 Optical character recognition test method and device, readable storage medium and terminal equipment
CN114627459A (en) * 2020-12-14 2022-06-14 菜鸟智能物流控股有限公司 OCR recognition method, recognition device and recognition system
CN112613572B (en) * 2020-12-30 2024-01-23 北京奇艺世纪科技有限公司 Sample data obtaining method and device, electronic equipment and storage medium
CN112906693B * 2021-03-05 2022-06-24 杭州费尔斯通科技有限公司 Method for identifying superscript and subscript characters
CN113012265B * 2021-04-22 2024-04-30 中国平安人寿保险股份有限公司 Method, apparatus, computer device and medium for generating dot-matrix printed character images
CN113239854B (en) * 2021-05-27 2023-12-19 北京环境特性研究所 Ship identity recognition method and system based on deep learning
CN113971806B (en) * 2021-10-26 2023-05-05 北京百度网讯科技有限公司 Model training and character recognition method, device, equipment and storage medium
CN114155361A * 2021-12-11 2022-03-08 浙江正泰中自控制工程有限公司 Meter reading method and system for camera direct-reading meters
CN114495106A (en) * 2022-04-18 2022-05-13 电子科技大学 MOCR (metal-oxide-semiconductor resistor) deep learning method applied to DFB (distributed feedback) laser chip
CN114758339B (en) * 2022-06-15 2022-09-20 深圳思谋信息科技有限公司 Method and device for acquiring character recognition model, computer equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001358925A (en) * 2000-06-09 2001-12-26 Minolta Co Ltd Unit and method for image processing and recording medium
WO2013121648A1 (en) * 2012-02-17 2013-08-22 オムロン株式会社 Character-recognition method and character-recognition device and program using said method
CN107273896A * 2017-06-15 2017-10-20 浙江南自智能科技股份有限公司 License plate detection and recognition method based on image recognition
WO2018090013A1 (en) * 2016-11-14 2018-05-17 Kodak Alaris Inc. System and method of character recognition using fully convolutional neural networks with attention
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN108564103A (en) * 2018-01-09 2018-09-21 众安信息技术服务有限公司 Data processing method and device
CN108573496A * 2018-03-29 2018-09-25 淮阴工学院 Multi-object tracking method based on LSTM networks and deep reinforcement learning
CN108596180A * 2018-04-09 2018-09-28 深圳市腾讯网络信息技术有限公司 Parameter recognition in images, and training method and device for a parameter recognition model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4661921B2 (en) * 2008-08-26 2011-03-30 富士ゼロックス株式会社 Document processing apparatus and program
CN107392221B (en) * 2017-06-05 2020-09-22 天方创新(北京)信息技术有限公司 Training method of classification model, and method and device for classifying OCR (optical character recognition) results
CN109685100B (en) * 2018-11-12 2024-05-10 平安科技(深圳)有限公司 Character recognition method, server and computer readable storage medium

Also Published As

Publication number Publication date
CN109685100A (en) 2019-04-26
WO2020098250A1 (en) 2020-05-22

Similar Documents

Publication Publication Date Title
CN109685100B (en) Character recognition method, server and computer readable storage medium
CN110135411B (en) Business card recognition method and device
CN111046879B (en) Certificate image classification method, device, computer equipment and readable storage medium
CN109377494B (en) Semantic segmentation method and device for image
CN111507324B (en) Card frame recognition method, device, equipment and computer storage medium
CN111104841A (en) Violent behavior detection method and system
CN110555439A Identifier recognition method, model training method and device, and electronic system
CN112381092B (en) Tracking method, tracking device and computer readable storage medium
CN113239910B (en) Certificate identification method, device, equipment and storage medium
CN111178290A (en) Signature verification method and device
CN111985465A (en) Text recognition method, device, equipment and storage medium
CN110942067A (en) Text recognition method and device, computer equipment and storage medium
CN108648189A Image blur detection method, apparatus, computing device and readable storage medium
CN112149756A (en) Model training method, image recognition method, device, equipment and storage medium
CN113221897B (en) Image correction method, image text recognition method, identity verification method and device
CN115731422A (en) Training method, classification method and device of multi-label classification model
CN112183542A (en) Text image-based recognition method, device, equipment and medium
CN111553431A (en) Picture definition detection method and device, computer equipment and storage medium
CN114419739A (en) Training method of behavior recognition model, behavior recognition method and equipment
CN113516597B (en) Image correction method, device and server
CN112861836B (en) Text image processing method, text and card image quality evaluation method and device
CN112287932B (en) Method, device, equipment and storage medium for determining image quality
CN113920565A (en) Authenticity identification method, authenticity identification device, electronic device and storage medium
CN112381055A (en) First-person perspective image recognition method and device and computer readable storage medium
JPWO2006008992A1 (en) Web site connection method using portable information communication terminal with camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant