CN111105549A - Optical character recognition method, device and computer storage medium - Google Patents

Optical character recognition method, device and computer storage medium Download PDF

Info

Publication number
CN111105549A
CN111105549A (application CN201911318760.6A)
Authority
CN
China
Prior art keywords
training
neural network
model
detection
character recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911318760.6A
Other languages
Chinese (zh)
Inventor
乐识非
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unicloud Nanjing Digital Technology Co Ltd
Original Assignee
Unicloud Nanjing Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unicloud Nanjing Digital Technology Co Ltd filed Critical Unicloud Nanjing Digital Technology Co Ltd
Priority to CN201911318760.6A priority Critical patent/CN111105549A/en
Publication of CN111105549A publication Critical patent/CN111105549A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07DHANDLING OF COINS OR VALUABLE PAPERS, e.g. TESTING, SORTING BY DENOMINATIONS, COUNTING, DISPENSING, CHANGING OR DEPOSITING
    • G07D7/00Testing specially adapted to determine the identity or genuineness of valuable papers or for segregating those which are unacceptable, e.g. banknotes that are alien to a currency
    • G07D7/20Testing patterns thereon
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07DHANDLING OF COINS OR VALUABLE PAPERS, e.g. TESTING, SORTING BY DENOMINATIONS, COUNTING, DISPENSING, CHANGING OR DEPOSITING
    • G07D7/00Testing specially adapted to determine the identity or genuineness of valuable papers or for segregating those which are unacceptable, e.g. banknotes that are alien to a currency
    • G07D7/20Testing patterns thereon
    • G07D7/2008Testing patterns thereon using pre-processing, e.g. de-blurring, averaging, normalisation or rotation

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

An optical character recognition method and apparatus, wherein the method includes: preprocessing an image acquired by an optical device using an SSD network model; detecting text in the preprocessed image using a trained EAST neural network detection model; and recognizing the detection results using a trained CRNN neural network recognition model. The scheme of the invention adopts SSD network segmentation, EAST network detection and CRNN network recognition, and balances invoice type coverage, detection speed and recognition accuracy in the office scene, thereby providing an effective technical solution for deploying intelligent optical character recognition (OCR) in office scenes.

Description

Optical character recognition method, device and computer storage medium
Technical Field
The invention belongs to the field of character recognition, and particularly relates to an optical character recognition method, an optical character recognition device and a computer readable storage medium.
Background
With the rapid development of artificial intelligence technology, the application of character recognition has gradually shifted from simple, research-oriented scenes to complex application scenes closely tied to social activities. Accordingly, the design and use of optical character recognition is gradually shifting from standalone, single-function tools to cloud services. However, existing common optical character recognition (OCR) technology can only complete detection and recognition within a single invoice type; once the invoice background is highly noisy or the invoice types differ greatly, existing OCR technology struggles to separate the boundaries of the various invoices from the background, making it unsuitable for office-scene optical character recognition.
Current OCR technology is mainly applied in office scenes and in natural scenes. For natural scenes, one-stage detection techniques represented by the YOLO series dominate existing detection practice, but they suffer from low recall for text at different scales. For common office-scene text detection, the prior art typically works for only one type of invoice; when clustering methods are used to recognize multiple invoice types, invoices of different types cannot be distinguished with high accuracy.
Disclosure of Invention
In view of the above-mentioned deficiencies of the prior art, one objective of the present invention is to solve the problem that the prior art cannot distinguish invoices of different types with high accuracy, and to avoid the defect of low recall for text at different scales.
The embodiment of the invention discloses an optical character recognition method, which comprises the steps of preprocessing an image acquired by optical equipment through an SSD network model; detecting the preprocessed image by adopting an EAST neural network detection model obtained by training; and identifying the detection result by adopting a CRNN neural network identification model obtained by training.
In one possible embodiment, the pre-processing includes data cleansing and dataset preparation of the image; and carrying out image segmentation processing by using the trained SSD network model.
In one possible embodiment, the EAST neural network detection model is obtained as follows: a detection model is pre-trained by changing the data set path, adjusting the training parameters according to available resources, clearing previous pre-training state, and starting the training process under the multi-window terminal manager tmux; the pre-trained parameters are then saved and restored for retraining to obtain the final detection model.
In one possible embodiment, the CRNN neural network recognition model is obtained as follows: the images to be recognized are placed under the same path and cropped according to the detection results to obtain the data to be recognized; then the data set path is changed, training parameters are adjusted, previous pre-training state is cleared, the training process is started under the multi-window terminal manager tmux, and the recognition model is obtained by training.
In a possible embodiment, the method further comprises verifying the detection result and the identification result.
In one possible embodiment, the method further comprises optimizing the EAST neural network detection model and the CRNN neural network identification model according to the verification result.
The embodiment of the invention also discloses an optical character recognition device, which comprises a preprocessing module, a data processing module and a data processing module, wherein the preprocessing module is used for preprocessing the image acquired by the optical equipment through the SSD network model; the detection module is used for detecting the preprocessed image by adopting an EAST neural network detection model obtained by training; and the recognition module is used for recognizing the detection result by adopting a CRNN neural network recognition model obtained by training.
In one possible embodiment, the preprocessing module is further configured to: performing data cleaning and data set production on the image; and carrying out image segmentation processing by using the trained SSD network model.
In one possible embodiment, the apparatus further comprises a verification module for optimizing the EAST neural network detection model and the CRNN neural network recognition model according to the verification results.
The invention also discloses a computer readable storage medium having a computer program stored thereon, which when executed by a processor implements any of the methods described above.
The invention has the following beneficial effects: the scheme of the invention adopts SSD network segmentation, EAST network detection and CRNN network recognition, and balances invoice type coverage, detection speed and recognition accuracy in the office scene, thereby providing an effective technical solution for deploying intelligent optical character recognition (OCR) in office scenes.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a specific method according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a cloud service environment deployment architecture according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
In order to facilitate understanding of those skilled in the art, the present invention will be further described with reference to the following examples and drawings, which are not intended to limit the present invention.
The embodiment of the invention discloses an optical character recognition method, which comprises the following steps:
s101, preprocessing an image acquired by the optical equipment through an SSD network model.
The preprocessing comprises performing data cleaning and data set production on the image, and performing image segmentation using the trained SSD network model. SSD stands for Single Shot MultiBox Detector: "Single Shot" indicates that the SSD algorithm is a one-stage method, and "MultiBox" indicates that SSD performs multi-box prediction.
Specifically, referring to fig. 2, the acquired image is preprocessed, that is, data samples are prepared. This usually involves two sub-processes: a data processing sub-process and an original-sample segmentation sub-process.
The data processing sub-process mainly comprises data acquisition, data cleaning and data set production processes.
The data acquisition process mainly comprises applying to the relevant departments for invoice data, sampling on-site data after sampling permission is obtained, performing simple normalization and organization of the acquired data, and grading the samples by quality to complete a coarse-grained data analysis.
The data cleaning process comprises fine-grained cleaning of the coarsely cleaned data along the following dimensions: pictures that do not meet minimum size, resolution and area-occupancy requirements are filtered out. The cleaning must achieve the following goals: the key fields of the invoice are clear, the illumination is uniform, and the invoice has no distortion, ink stains, obvious fold marks or ghosting, with a smooth, wrinkle-free surface.
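The filtering dimensions above can be sketched as a simple predicate. The threshold values below (minimum side, minimum pixel count, minimum occupied proportion) are illustrative assumptions for the sketch, not values from the patent.

```python
# Illustrative coarse filter for invoice images; all thresholds are
# assumed values for this sketch, not taken from the patent.
def passes_cleaning(width, height, occupied_ratio,
                    min_side=300, min_pixels=640 * 480, min_ratio=0.2):
    """Return True if an image meets the minimum dimension,
    resolution and occupied-proportion requirements."""
    if min(width, height) < min_side:      # minimum dimension
        return False
    if width * height < min_pixels:        # minimum resolution
        return False
    if occupied_ratio < min_ratio:         # invoice occupies too little of the frame
        return False
    return True

samples = [(1200, 800, 0.6), (200, 150, 0.9), (1600, 1200, 0.05)]
kept = [s for s in samples if passes_cleaning(*s)]
```

Only the first sample survives: the second fails the size check and the third fails the occupancy check.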
The data set production requires a standard data set format; the data can be produced in a VOC-like format comprising the following four parts: Annotations holds the labeled data, JPEGImages contains the images in jpg format, Scores contains the data samples for each quality grade, and Layout contains the sample lists for training, training-validation and validation.
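Producing the Layout sample lists (train, validation and the combined train-validation list) can be sketched as follows; the 80/20 split ratio and the zero-padded sample ids are assumptions for illustration.

```python
# Sketch of producing the Layout sample lists for a VOC-like data set;
# the split ratio and id format are illustrative assumptions.
import random

def make_layout(sample_ids, train_frac=0.8, seed=0):
    """Shuffle the sample ids deterministically and split them into
    train, val and trainval lists."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_frac)
    train, val = ids[:cut], ids[cut:]
    return {"train": train, "val": val, "trainval": train + val}

layout = make_layout([f"{i:06d}" for i in range(100)])
```

Each list could then be written as one sample id per line into the Layout directory.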
Second, the original-sample segmentation sub-process mainly comprises SSD interface design and SSD image segmentation training. In the SSD interface design, successful invoice segmentation relies on sorting both the OCR results and the labels by the output order of the same SSD boxes, which guarantees the relative ordering of the large boxes. After the SSD interface design is finished, SSD image segmentation training is performed, and the model obtained by training is used to distinguish the coarse-grained invoice types.
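The idea of sorting both OCR results and labels by the same SSD box order can be sketched as deriving one canonical ordering from the boxes (here top-to-bottom, then left-to-right) and applying it to every parallel list. The (x, y, w, h) box format and the field names are assumptions for the sketch.

```python
# Sketch: derive one canonical ordering from the SSD boxes and apply it
# to both the crops and the labels, so the two stay aligned.
def box_order(boxes):
    """Indices that sort boxes top-to-bottom, then left-to-right.
    Each box is assumed to be (x, y, w, h)."""
    return sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))

boxes  = [(50, 200, 80, 30), (10, 10, 80, 30), (200, 10, 80, 30)]
labels = ["amount", "code", "number"]   # hypothetical field labels

order = box_order(boxes)
sorted_labels = [labels[i] for i in order]
```

Because the same index order would also be applied to the cropped sub-images, labels and crops cannot drift out of alignment.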
And S102, detecting the preprocessed image by adopting an EAST neural network detection model obtained by training.
The detection model needs to be trained before detection. The EAST neural network detection model is obtained as follows: a detection model is pre-trained by changing the data set path, adjusting the training parameters according to available resources, clearing previous pre-training state, and starting the training process under the multi-window terminal manager tmux; the pre-trained parameters are then saved and restored for retraining to obtain the final detection model.
Specifically, referring to fig. 2, the EAST detection model training is divided into pre-training and retraining. Pre-training comprises, in the EAST pre-training stage, changing the data set path, adjusting the training parameters on the multi-core V100 according to available resources, clearing previous pre-training state, starting the training process under tmux, and training to obtain a detection model. Retraining comprises, in the EAST retraining stage, keeping the pre-training checkpoint, inputting the relevant images and corresponding json files, and completing retraining by restoring the pre-trained parameters.
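The restore-then-retrain step can be sketched framework-agnostically: keep only the checkpoint parameters whose names and shapes match the current model, a common pattern when retraining from a checkpoint. The variable names and the list-based weight representation are hypothetical simplifications, not the patent's actual implementation.

```python
# Sketch of restoring pre-trained parameters before retraining: reuse
# checkpoint entries matching the current model by name and shape;
# everything else keeps its fresh initialization.
def restore_matching(model_vars, checkpoint):
    restored = {}
    for name, value in model_vars.items():
        ckpt_val = checkpoint.get(name)
        if ckpt_val is not None and len(ckpt_val) == len(value):
            restored[name] = ckpt_val      # reuse pre-trained weights
        else:
            restored[name] = value         # keep fresh initialization
    return restored

model = {"conv1/w": [0.0] * 4, "head/w": [0.0] * 2}   # hypothetical variables
ckpt  = {"conv1/w": [1.0] * 4, "head/w": [9.0] * 3}   # head shape differs
params = restore_matching(model, ckpt)
```

Here the backbone weights are restored from the checkpoint, while the mismatched head keeps its fresh initialization.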
And S103, identifying the detection result by adopting a CRNN neural network identification model obtained by training.
The CRNN neural network recognition model is obtained as follows: the images to be recognized are placed under the same path and cropped according to the detection results to obtain the data to be recognized; then the data set path is changed, training parameters are adjusted, previous pre-training state is cleared, the training process is started under the multi-window terminal manager tmux, and the recognition model is obtained by training. CRNN is a convolutional recurrent neural network architecture designed for image-based sequence recognition, in particular scene text recognition.
Specifically, referring to fig. 2, the recognition model training is divided into label generation and training. Label generation comprises placing the invoices to be recognized under the same folder path, cropping sub-images from the QUAD eight-point coordinates in the detection results, packaging each sub-image with its corresponding label to form label and path files, and then modifying the CRNN training label set to avoid automatic escaping of predicted characters. Training comprises, in the CRNN pre-training stage, changing the data set path, adjusting the training parameters on the multi-core V100 according to available resources, clearing previous pre-training state, starting the training process under tmux, and training to obtain the recognition model.
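Cropping a sub-image from QUAD eight-point coordinates can be sketched with the axis-aligned bounding box of the four corner points; a full implementation would typically use a perspective warp for tilted quads. The nested-list image representation is purely illustrative.

```python
# Sketch: take the axis-aligned bounding box of a QUAD's four corners
# (x1, y1, ..., x4, y4) and crop it from an image stored as rows of pixels.
def crop_quad(image, quad):
    xs, ys = quad[0::2], quad[1::2]        # separate x and y coordinates
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]

# Toy 5x6 "image" whose pixel at (row r, col c) is 10*r + c.
image = [[10 * r + c for c in range(6)] for r in range(5)]
quad = (1, 1, 4, 1, 4, 3, 1, 3)            # roughly rectangular QUAD
sub = crop_quad(image, quad)
```

The cropped sub-image here spans rows 1 to 3 and columns 1 to 4 of the toy image.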
After the corresponding models, namely the EAST neural network detection model and the CRNN neural network recognition model, are obtained, they can be verified to finally produce a detection and recognition analysis report. Specifically, referring to fig. 2, the verification comprises detection model verification: checking specific detection results and checking macroscopic detection metrics, where the former includes the bounding boxes of the invoice's code, number, date, time, mileage and amount, and the latter includes field-level precision, recall and F1 score; and recognition model verification: checking specific recognition results and checking macroscopic recognition metrics, where the former includes the specific field values of the invoice's code, number, date, time, mileage and amount, and the latter includes field-level precision, recall and F1 score.
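The field-level metrics used in both verification steps, precision (the "correct rate"), recall and F1, can be computed from the true-positive, false-positive and false-negative counts; the counts below are illustrative.

```python
# Field-level precision, recall and F1 score for the verification step.
def prf1(true_positives, false_positives, false_negatives):
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts for one field, e.g. the invoice number.
p, r, f1 = prf1(true_positives=80, false_positives=20, false_negatives=20)
```

With these counts all three metrics come out to 0.8, since precision and recall are equal.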
The method also comprises an improvement process, mainly divided into data quality improvement and algorithm improvement. Referring to fig. 2, data quality improvement may, for under-sampled categories, mainly adopt a supplementary invoice resampling strategy; for data samples of specific applications, data quality is mainly improved by image-processing means such as data augmentation. Algorithm improvement can be divided into API-level image-processing improvements and core-algorithm improvements: the core algorithms for object detection, clustering, text detection and text recognition are selected at the macroscopic level, while image operations are performed at the API level.
The method may be released to a cloud server. Referring to fig. 3, the corresponding cloud server design includes a basic deployment environment and a cluster deployment environment, where the basic deployment environment may include: 1) deploying the Docker environment: install standard docker and configure permissions, create a docker group, add the current user to the group, and install nvidia-docker; 2) building a Docker image and uploading it to a repository: first register an account on Docker Hub, create a repository after registration, then build the Docker image locally and push it to the repository; 3) installing a deep learning image in the cluster using Docker: first download the deep learning image, then create the deep learning runtime container, and finally start the container to complete the deployment of the basic container.
Deploying the cluster environment may include: 1) installing the deep learning cluster framework components: after the deep learning image is installed, install the cluster framework components and deploy the K8S client and server on the relevant servers respectively; 2) creating the K8S deployment and service: the deployment consists of 3 server replicas controlled by a Kubernetes Deployment; if the deployment and pod states are checked and shown as Running, the K8S deployment and service have been created successfully; 3) invoking the K8S deployment and service: package the office-scene optical character recognition as a cloud service and release it on a public cloud to complete the office-scene OCR deployment.
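The 3-replica K8S deployment described above can be sketched as a manifest built as a plain dictionary; the service name, labels and container image are hypothetical placeholders, not values from the patent.

```python
# Sketch of a Kubernetes Deployment manifest with 3 replicas for the OCR
# service; the name, labels and image are illustrative assumptions.
ocr_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "ocr-service"},
    "spec": {
        "replicas": 3,                                # 3 server replicas
        "selector": {"matchLabels": {"app": "ocr"}},
        "template": {
            "metadata": {"labels": {"app": "ocr"}},   # must match the selector
            "spec": {
                "containers": [{
                    "name": "ocr",
                    "image": "example/ocr-service:latest",  # hypothetical image
                    "ports": [{"containerPort": 8080}],
                }],
            },
        },
    },
}
```

Serialized to YAML and applied with kubectl, a manifest of this shape would produce the 3 Running pods the text describes checking for.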
By this method, better office-scene character recognition results can be obtained while balancing invoice type coverage, detection speed and recognition accuracy in the office scene. By packaging a simple character recognition service as a cloud service, the invention aims to provide broader services to more users, and constructs an office-scene-oriented optical character recognition cloud service.
The embodiment of the present invention further discloses an optical character recognition apparatus 10, as shown in fig. 4, including: a preprocessing module 101, configured to preprocess an image acquired by an optical device through an SSD network model; the detection module 102 is configured to detect the preprocessed image by using an EAST neural network detection model obtained through training; and the recognition module 103 is configured to recognize the detection result by using the trained CRNN neural network recognition model.
In one embodiment, the preprocessing module 101 is further configured to: performing data cleaning and data set production on the image; and carrying out image segmentation processing by using the trained SSD network model.
In one embodiment, the apparatus further comprises a verification module for optimizing the EAST neural network detection model and the CRNN neural network recognition model according to the verification result.
For the specific implementation of the apparatus 10, reference may be made to the method embodiment, which is not described in detail.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described embodiments are merely illustrative, and for example, a division of a unit is merely a division of a logic function, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
While the invention has been described in terms of its preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims (10)

1. An optical character recognition method is characterized in that an image acquired by an optical device is preprocessed through an SSD network model; detecting the preprocessed image by adopting an EAST neural network detection model obtained by training; and identifying the detection result by adopting a CRNN neural network identification model obtained by training.
2. The method of claim 1, wherein the pre-processing comprises data cleansing and dataset production of the image; and carrying out image segmentation processing by using the trained SSD network model.
3. The method of claim 1 or 2, wherein the EAST neural network detection model is obtained by: pre-training to obtain a detection model by changing a data set path, adjusting training parameters according to resources, cleaning pre-training, and starting a training process under a multi-window manager tmux; the pre-trained parameters are stored for retraining to obtain a detection model.
4. The method of claim 1 or 2, wherein the CRNN neural network recognition model is obtained by: placing the images to be recognized under the same path, and then cropping the images according to the detection results to obtain the data to be recognized; changing the data set path, adjusting training parameters, clearing previous pre-training state, starting the training process under the multi-window manager tmux, and training to obtain the recognition model.
5. The method of claim 1, further comprising verifying the detection result and the identification result.
6. The method of claim 1 or 5, further comprising optimizing the EAST neural network detection model and the CRNN neural network identification model based on the validation results.
7. An optical character recognition device is characterized by comprising a preprocessing module, a character recognition module and a character recognition module, wherein the preprocessing module is used for preprocessing an image acquired by an optical device through an SSD network model; the detection module is used for detecting the preprocessed image by adopting an EAST neural network detection model obtained by training; and the recognition module is used for recognizing the detection result by adopting a CRNN neural network recognition model obtained by training.
8. The apparatus of claim 7, wherein the pre-processing module is further to: performing data cleaning and data set production on the image; and carrying out image segmentation processing by using the trained SSD network model.
9. The apparatus of claim 7, further comprising a verification module to optimize the EAST neural network detection model and the CRNN neural network identification model based on a verification result.
10. A computer storage medium having stored thereon a computer program which, when executed by a processor, implements the optical character recognition method of any of the preceding claims 1-6.
CN201911318760.6A 2019-12-19 2019-12-19 Optical character recognition method, device and computer storage medium Pending CN111105549A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911318760.6A CN111105549A (en) 2019-12-19 2019-12-19 Optical character recognition method, device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911318760.6A CN111105549A (en) 2019-12-19 2019-12-19 Optical character recognition method, device and computer storage medium

Publications (1)

Publication Number Publication Date
CN111105549A true CN111105549A (en) 2020-05-05

Family

ID=70422173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911318760.6A Pending CN111105549A (en) 2019-12-19 2019-12-19 Optical character recognition method, device and computer storage medium

Country Status (1)

Country Link
CN (1) CN111105549A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107358608A (en) * 2017-08-23 2017-11-17 西安邮电大学 Bone tissue geometric state parameter auto-testing device and method based on image processing techniques
CN108921166A (en) * 2018-06-22 2018-11-30 深源恒际科技有限公司 Medical bill class text detection recognition method and system based on deep neural network
CN109919317A (en) * 2018-01-11 2019-06-21 华为技术有限公司 A kind of machine learning model training method and device
CN110210542A (en) * 2019-05-24 2019-09-06 厦门美柚信息科技有限公司 Picture character identification model training method, device and character identification system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
许庆志: "Traffic sign recognition and implementation based on deep learning", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254158A (en) * 2021-06-11 2021-08-13 苏州浪潮智能科技有限公司 Deployment method and device of deep learning system
WO2022257303A1 (en) * 2021-06-11 2022-12-15 苏州浪潮智能科技有限公司 Method and apparatus for deploying deep learning system
CN113435437A (en) * 2021-06-24 2021-09-24 随锐科技集团股份有限公司 Method and device for identifying state of switch on/off indicator and storage medium
CN115588207A (en) * 2022-10-13 2023-01-10 成都卓视智通科技有限公司 Monitoring video date recognition method based on OCR

Similar Documents

Publication Publication Date Title
US11631234B2 (en) Automatically detecting user-requested objects in images
CN110348441B (en) Value-added tax invoice identification method and device, computer equipment and storage medium
CN109635110A (en) Data processing method, device, equipment and computer readable storage medium
CN109086756A (en) A kind of text detection analysis method, device and equipment based on deep neural network
CN111652232B (en) Bill identification method and device, electronic equipment and computer readable storage medium
CN111105549A (en) Optical character recognition method, device and computer storage medium
CN109460769A (en) A kind of mobile end system and method based on table character machining and identification
WO2017088537A1 (en) Component classification method and apparatus
CN111767927A (en) Lightweight license plate recognition method and system based on full convolution network
CN109858414A (en) A kind of invoice piecemeal detection method
CN110490238A (en) A kind of image processing method, device and storage medium
CN110348511A (en) A kind of picture reproduction detection method, system and electronic equipment
CN109886147A (en) A kind of more attribute detection methods of vehicle based on the study of single network multiple-task
CN109934255A (en) A kind of Model Fusion method for delivering object Classification and Identification suitable for beverage bottle recycling machine
CN114387499A (en) Island coastal wetland waterfowl identification method, distribution query system and medium
CN111522951A (en) Sensitive data identification and classification technical method based on image identification
CN111144215A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112132776A (en) Visual inspection method and system based on federal learning, storage medium and equipment
US11600088B2 (en) Utilizing machine learning and image filtering techniques to detect and analyze handwritten text
CN108460277A (en) A kind of automation malicious code mutation detection method
CN112766255A (en) Optical character recognition method, device, equipment and storage medium
CN113221947A (en) Industrial quality inspection method and system based on image recognition technology
CN109376868A (en) Information management system
CN109145723A (en) A kind of seal recognition methods, system, terminal installation and storage medium
CN113360737A (en) Page content acquisition method and device, electronic equipment and readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200505