CN110674815A - Invoice image distortion correction method based on deep learning key point detection - Google Patents
Invoice image distortion correction method based on deep learning key point detection
- Publication number: CN110674815A
- Application number: CN201910932792.9A
- Authority: CN (China)
- Prior art keywords: data, invoice, key point, training, image
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/243—Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention provides an invoice image distortion correction method based on deep learning key point detection, and belongs to the technical field of image processing. The invention solves the problem of correcting distortion in bill images. The key points of the technical scheme are as follows: first, training data is labeled and enhanced; second, a network structure and training parameters are set; then, a key point detection model is trained with that network structure and those training parameters, and the trained model is saved; then, the key points of a bill are detected with the trained model; finally, the bill is aligned using the detected key points. The method is fast, accurate, and suitable for natural scenes. Recognizing the corrected picture greatly improves OCR accuracy, which reduces the manpower and material investment required by downstream OCR applications and saves resources.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to an invoice image distortion correction method based on deep learning key point detection.
Background
In recent years, AI technology has advanced rapidly, and its fields of application have grown steadily wider, including robotics, speech recognition, image recognition, computer vision, and autonomous driving. In image recognition, deep learning-based OCR is widely adopted in industry because of its high recognition accuracy and speed. OCR technology is generally divided into two branches, text detection and text recognition. Although end-to-end OCR based on neural networks has recently been introduced, its results in specific scenarios are not ideal, so mainstream OCR still follows the two-stage split of text detection and text recognition. OCR accuracy is limited not only by the quality of the recognition algorithm; the text detection result also plays a decisive role, and image quality in turn has a marked influence on text detection. Especially in the era of the mobile internet, the rise of mobile devices has led to growing demand from ordinary users for OCR applications. However, because user behavior is uncontrollable, images captured by mobile devices in various scenes differ greatly. Image distortion (shooting at an angle rather than flat) has a particularly marked effect on locating and recognizing the text regions of an image. If the captured image can be corrected before recognition, the accuracy of character recognition can be improved at the source.
Disclosure of Invention
The invention aims to provide an invoice image distortion correction method based on deep learning key point detection, which solves the problem of correcting the invoice image distortion.
To solve this technical problem, the invention adopts the following technical scheme: the invoice image distortion correction method based on deep learning key point detection comprises the following steps:
step 1, labeling and enhancing training data;
step 2, setting a network structure and training parameters;
step 3, training a key point detection model by using the network structure and the training parameters, and saving the trained model;
step 4, detecting key points of the bill by using the trained model;
step 5, aligning the bill by using the detected key points.
Specifically, in step 1, part of the data is labeled manually, and a large amount of training data is then generated by a data enhancement strategy, which specifically comprises the following steps:
step 101, preparing annotation data;
step 102, data annotation;
step 103, data enhancement;
step 104, data format conversion;
step 105, dividing the data set.
Further, in step 101, when preparing annotation data, different types of pictures to be annotated are collected, 1000 pictures of each type, and the positions and names of the key points of each type of bill are defined, the number of key points being more than 4;
the key points are defined according to the following criteria:
if the invoice picture has a table and the table style is fixed, the corner points of the table are used as the reference in the actual definition, and the selected key points are distributed over all areas of the invoice face;
if the invoice picture has no table, the key points are defined according to the positions of the fixed text areas of the invoice;
if the actual invoice picture has irregularities that make some of the defined key points invisible, only the visible key points are labeled when the actual labeling task is executed.
Specifically, in step 102, during data annotation, invoice pictures with tables are annotated using the points task type of the VIA tool, and invoice pictures without tables are annotated using the rect task type of the VIA tool.
Further, in step 103, during data enhancement, the labeled part of the data is taken, image enhancement is applied to it according to the image conditions of actual business bills, and the imgaug image enhancement library for Python is used as the enhancement strategy for the training data, in the following ways:
random affine transformation of the image, wherein the scaling range is (0.5, 2), the rotation angle ranges are [(-15, 15), (75, 105), (165, 195), (255, 285)] in degrees, and the translation range is (-200, 200) in pixels;
random perspective transformation of the image, wherein the scale parameter is drawn at random from the range (0.025, 0.15);
randomly adding noise to the image;
image contrast stretching;
adding shadow noise to the image;
the labeled training data is expanded and enhanced by combining the above data enhancement methods.
Specifically, in step 104, during data format conversion, the annotation data of different types of bills is fused and converted into a common format and arranged into a data set: for a bill of a specific type, the annotation data of that type's own key points is left unchanged, and the key point slots belonging to other bill types are filled with (0, 0, 0).
Further, in step 105, when the data set is divided, the arranged data set is randomly divided into a training set and a test set in a 9:1 ratio.
Specifically, step 2 comprises the following steps:
step 201, network structure for training, wherein resnet101 is used as the basic feature extraction module, followed by GlobalNet and RefineNet, wherein GlobalNet is used to detect all key points, and RefineNet is used to predict corrections to the key points;
step 202, model initialization parameters, wherein the basic feature extraction module uses the weights of resnet101 pre-trained on ImageNet; the total number of training epochs (EPOCHS) is 30, the batch size on a single GPU is 4, the initial input picture size is 512 x 512, an Adam optimizer is used, the learning rate is set to 0.0001, and the loss function is the sum of the distances between the coordinates of each predicted point and the coordinates of the corresponding real point.
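The step 202 settings can be gathered into a single configuration object. The following is a minimal sketch: the class name `TrainConfig`, its field names, and the helper `steps_per_epoch` are hypothetical, and only the values are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters listed in step 202 (field names are illustrative)."""
    backbone: str = "resnet101"       # basic feature extraction module
    pretrained_on: str = "ImageNet"   # source of the pre-trained weights
    epochs: int = 30                  # total training EPOCHS
    batch_size_per_gpu: int = 4       # batch size on a single GPU
    input_size: tuple = (512, 512)    # initial input picture size
    optimizer: str = "Adam"
    learning_rate: float = 1e-4


def steps_per_epoch(num_train_images: int, cfg: TrainConfig) -> int:
    """Number of batches needed to see every training picture once."""
    return -(-num_train_images // cfg.batch_size_per_gpu)  # ceiling division


cfg = TrainConfig()
print(cfg)
print(steps_per_epoch(900, cfg))
```

Freezing the dataclass keeps the settings from being changed by accident once training has started.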
Further, step 3 comprises the following steps:
step 301, setting algorithm parameters, the model initialization parameters being set according to step 202;
step 302, building the network model;
step 303, judging whether the current EPOCH is smaller than EPOCHS; if so, going to step 304, otherwise going to step 4;
step 304, taking one batch of invoice pictures (batch-size pictures) from the training set, training the model, and updating the model parameters using Adam;
step 305, judging whether all pictures in the training set have been used for training; if so, going to step 306, otherwise going to step 304;
step 306, verifying the accuracy of the model on the test set, saving the trained model, and going to step 303.
Specifically, in step 4, when detecting the bill key points, the trained model is used to perform key point detection on the image to be corrected; the detected key points are recorded as predicted_points, and the corresponding key point set in the template picture is recorded as template_points; the perspective transformation matrix between the detected key points and the key points of the standard bill picture is then calculated and recorded as homography, the calculation using the findHomography method of opencv;
in step 5, when the bill is aligned, the perspective transformation matrix homography is applied to the image to be corrected, src, to obtain the corrected image dst, the correction using the warpPerspective method of opencv.
The beneficial effects of the invention are as follows: the invoice image distortion correction method based on deep learning key point detection accurately detects the key points in the invoice image, calculates the corresponding perspective transformation matrix from the detected key points and the key points of the standard bill image, and applies the transformation matrix to the image to be processed to obtain the corrected image. The method is fast and accurate, is suitable for natural scenes, and greatly improves OCR accuracy because recognition is performed on the corrected picture. This reduces the manpower and material investment required by downstream OCR applications and saves resources.
Drawings
FIG. 1 is a flowchart of an invoice image distortion correction method based on deep learning keypoint detection according to the present invention.
Detailed Description
The technical scheme of the invention is described in detail in the following with reference to the accompanying drawings.
The invention discloses an invoice image distortion correction method based on deep learning key point detection, a flowchart of which is shown in FIG. 1. The method comprises the following steps:
Step 1, labeling and enhancing training data. Effective training of a neural network requires a large amount of labeled data, and labeling data manually takes a great deal of time. To ensure the quality of the labeled data while keeping the labeling cost as low as possible, the invention labels part of the data manually and then generates a large amount of training data with a data enhancement strategy, as follows:
Step 101, preparing annotation data. Different types of pictures to be annotated are collected, about 1000 pictures of each type, and the positions and names of the key points of each type of bill are defined, with more than 4 key points per type. The key points are defined according to the following criteria: 1. If the invoice picture has a table and the table style is fixed, the corner points of the table are used as the reference in the actual definition, and the selected key points are distributed over all areas of the invoice face. 2. If the invoice picture has no table, the key points are defined according to the positions of the fixed text areas of the invoice. In addition, actual invoice pictures may have irregularities such as damage, occlusion, distortion, or severe exposure that make some of the defined key points invisible. In such cases, only the visible key points are labeled when the actual labeling task is executed.
Step 102, data annotation. When the annotation task is actually executed, invoice pictures with tables are annotated using the points task type of the VIA tool. Invoice pictures without tables can be annotated using the rect task type of VIA, for example a quota invoice, in which the key points are taken as the four vertices of the rectangle or the midpoints of its four edges.
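For the rect case, the conversion from one annotated rectangle to four key points can be sketched as below. The function name `rect_to_keypoints` and the sample rectangle are hypothetical; the input is assumed to follow the `x`, `y`, `width`, `height` layout of a VIA rect region.

```python
def rect_to_keypoints(shape, use_midpoints=False):
    """Turn a VIA-style rect into four key points.

    `shape` is assumed to look like VIA's rect shape attributes:
    {"name": "rect", "x": ..., "y": ..., "width": ..., "height": ...}.
    Returns the four vertices starting at the top-left and going clockwise
    on screen, or the four edge midpoints starting at the top edge when
    use_midpoints=True.
    """
    x, y = shape["x"], shape["y"]
    w, h = shape["width"], shape["height"]
    if use_midpoints:
        return [
            (x + w / 2, y),        # midpoint of the top edge
            (x + w, y + h / 2),    # midpoint of the right edge
            (x + w / 2, y + h),    # midpoint of the bottom edge
            (x, y + h / 2),        # midpoint of the left edge
        ]
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


rect = {"name": "rect", "x": 40, "y": 20, "width": 200, "height": 100}
vertices = rect_to_keypoints(rect)
midpoints = rect_to_keypoints(rect, use_midpoints=True)
print(vertices)
print(midpoints)
```

Whichever variant is chosen, the point order has to be the same for every picture of that bill type, so that each key point keeps a fixed meaning.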
Step 103, data enhancement. Because the labeling budget is limited, in practice only part of the data is labeled, and image enhancement is then applied to the labeled data according to the image conditions of actual business bills; this saves labeling cost and can also improve the robustness and precision of the trained model. The enhancement strategy for the training data in the invention uses the imgaug image enhancement library for Python, in the following ways: 1. random affine transformation of the image (scaling, rotation, translation), wherein the scaling range is (0.5, 2), the rotation angle ranges are [(-15, 15), (75, 105), (165, 195), (255, 285)] in degrees, and the translation range is (-200, 200) in pixels; 2. random perspective transformation of the image, wherein the scale parameter is drawn at random from the range (0.025, 0.15); 3. randomly adding noise to the image; 4. image contrast stretching; 5. adding shadow noise to the image. The labeled training data is expanded and enhanced by combining the above data enhancement methods.
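Any geometric transformation applied to a picture must also be applied to its key point labels. The sketch below illustrates this for the affine case using only the Python standard library; it is not the imgaug API. The function names are hypothetical, and choosing uniformly among the four rotation bands is an assumption, since the text lists the bands but not how one is selected.

```python
import math
import random

# Ranges taken from step 103.
SCALE_RANGE = (0.5, 2.0)
ROTATION_RANGES = [(-15, 15), (75, 105), (165, 195), (255, 285)]  # degrees
TRANSLATE_RANGE = (-200, 200)  # pixels


def sample_affine_params(rng):
    """Draw one random scale, rotation angle and translation."""
    low, high = rng.choice(ROTATION_RANGES)  # assumption: bands equally likely
    return {
        "scale": rng.uniform(*SCALE_RANGE),
        "angle_deg": rng.uniform(low, high),
        "tx": rng.uniform(*TRANSLATE_RANGE),
        "ty": rng.uniform(*TRANSLATE_RANGE),
    }


def transform_keypoints(points, params, center):
    """Scale and rotate (x, y) points about `center`, then translate them."""
    theta = math.radians(params["angle_deg"])
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    out = []
    for x, y in points:
        dx = (x - cx) * params["scale"]
        dy = (y - cy) * params["scale"]
        out.append((cx + cos_t * dx - sin_t * dy + params["tx"],
                    cy + sin_t * dx + cos_t * dy + params["ty"]))
    return out


rng = random.Random(0)
params = sample_affine_params(rng)
corners = [(0, 0), (512, 0), (512, 512), (0, 512)]
print(params)
print(transform_keypoints(corners, params, center=(256, 256)))
```

The four rotation bands sit around 0, 90, 180 and 270 degrees, so the sampled pictures cover all four orientations of a bill with a small additional tilt.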
Step 104, data format conversion. To support key point detection for multiple bill types, the invention fuses and converts the annotation data of different bill types into a common format, using a data layout similar to one-hot encoding: for a bill of a specific type, the annotation data of that type's own key points is left unchanged, and the key point slots belonging to other bill types are filled with (0, 0, 0).
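One way to read this layout is as a fixed-length record with one slot per key point of every bill type. The sketch below follows that reading. The bill types, key point names, and function name are hypothetical, and treating the third value of each triple as a visibility flag (and zero-filling invisible points of the bill's own type) is an assumption not stated in the text.

```python
# Hypothetical key point names for two bill types (illustrative only).
KEYPOINT_SCHEMA = {
    "vat_invoice": ["table_tl", "table_tr", "table_br", "table_bl", "title_center"],
    "quota_invoice": ["rect_tl", "rect_tr", "rect_br", "rect_bl", "code_center"],
}

# Fixed global slot order: all key points of all types, one after another.
SLOTS = [(t, name) for t, names in KEYPOINT_SCHEMA.items() for name in names]


def fuse_annotation(bill_type, labeled_points):
    """Build one fixed-length record of (x, y, v) triples.

    `labeled_points` maps key point name -> (x, y) for the visible points of
    this bill. Slots of other bill types are filled with (0, 0, 0). Filling
    unlabeled points of the same type with (0, 0, 0) as well, and using v as
    a visibility flag, are assumptions of this sketch.
    """
    record = []
    for slot_type, name in SLOTS:
        if slot_type == bill_type and name in labeled_points:
            x, y = labeled_points[name]
            record.append((x, y, 1))
        else:
            record.append((0, 0, 0))
    return record


record = fuse_annotation("quota_invoice", {"rect_tl": (12, 30), "rect_br": (400, 210)})
print(len(record))
print(record)
```

Because every record has the same length, pictures of different bill types can share one network output layer and be mixed freely within a batch.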
Step 105, dividing the data set. The data set arranged in step 104 is randomly divided into a training set and a test set in a 9:1 ratio.
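A random 9:1 split can be sketched as follows. The function name, the fixed seed, and the file names are illustrative assumptions.

```python
import random


def split_dataset(samples, train_ratio=0.9, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


samples = [f"invoice_{i:04d}.jpg" for i in range(1000)]  # hypothetical file names
train_set, test_set = split_dataset(samples)
print(len(train_set), len(test_set))
```

If enhanced copies of one original picture are present in the data set, keeping all copies on the same side of the split avoids near-duplicates leaking into the test set; the text does not say whether this is done.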
Step 2, setting a network structure and training parameters.
Step 201, network structure for training. Training uses a model structure similar to CPN (Cascaded Pyramid Network): resnet101 is used as the basic feature extraction module, followed by GlobalNet and RefineNet. GlobalNet is used to detect all key points, and RefineNet is used to predict corrections to the key points.
Step 202, model initialization parameters. The basic feature extraction module uses the weights of resnet101 pre-trained on ImageNet; the total number of training epochs (EPOCHS) is 30, the batch size on a single GPU is 4, the initial input picture size is 512 x 512, an Adam optimizer is used, the learning rate is set to 0.0001, and the loss function is the sum of the distances between the coordinates of each predicted point and the coordinates of the corresponding real point.
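The loss described in step 202 can be written out directly. The sketch below assumes the distance is Euclidean and that slots filled with (0, 0, 0) are skipped; the text specifies only a sum of distances, so both points are assumptions, as is the function name.

```python
import math


def keypoint_distance_loss(predicted, target):
    """Sum of Euclidean distances between predicted and real key points.

    `predicted` is a list of (x, y); `target` is a list of (x, y, v), where
    v == 0 marks a slot with no label. Skipping those slots is an assumption
    of this sketch.
    """
    total = 0.0
    for (px, py), (tx, ty, v) in zip(predicted, target):
        if v == 0:
            continue  # unlabeled slot contributes nothing to the loss
        total += math.hypot(px - tx, py - ty)
    return total


pred = [(3.0, 4.0), (10.0, 10.0), (50.0, 50.0)]
real = [(0.0, 0.0, 1), (10.0, 13.0, 1), (0, 0, 0)]
print(keypoint_distance_loss(pred, real))
```

Without some form of masking, the network would be pushed to predict the origin for every key point that belongs to a different bill type.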
Step 3, training a key point detection model by using the network structure and the training parameters, and saving the trained model. The key point detection model is trained with the network structure and parameter settings from step 2, and the trained model parameters are saved for key point detection. The specific training steps are as follows:
step 301, setting algorithm parameters, the model initialization parameters being set according to step 202;
step 302, building the network model;
step 303, judging whether the current EPOCH is smaller than EPOCHS; if so, going to step 304, otherwise going to step 4;
step 304, taking one batch of invoice pictures (batch-size pictures) from the training set, training the model, and updating the model parameters using Adam;
step 305, judging whether all pictures in the training set have been used for training; if so, going to step 306, otherwise going to step 304;
step 306, verifying the accuracy of the model on the test set, saving the trained model, and going to step 303.
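The control flow of steps 303 to 306 amounts to an outer loop over epochs and an inner loop over batches. The skeleton below is a sketch of that flow only: `train_step`, `evaluate`, and `save` are hypothetical caller-supplied callables standing in for the Adam update, the test-set check, and checkpointing, and the epoch counter is incremented after step 306, which the steps imply but do not state.

```python
def train(train_set, test_set, epochs, batch_size, train_step, evaluate, save):
    """Run the loop of steps 303 to 306 and return the per-epoch accuracy."""
    history = []
    epoch = 0
    while epoch < epochs:                                   # step 303
        for start in range(0, len(train_set), batch_size):  # steps 304 and 305
            train_step(train_set[start:start + batch_size])
        accuracy = evaluate(test_set)                       # step 306: verify
        save(epoch)                                         # step 306: save model
        history.append(accuracy)
        epoch += 1                                          # back to step 303
    return history


# Dry run with stub callables so the control flow can be checked.
batches_seen = []
saved = []
history = train(
    train_set=list(range(10)),
    test_set=list(range(2)),
    epochs=3,
    batch_size=4,
    train_step=batches_seen.append,
    evaluate=lambda data: 0.0,
    save=saved.append,
)
print(len(batches_seen), saved)
```

In a real run the stubs would be replaced by the framework's forward pass, loss computation, and optimizer step.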
Step 4, detecting key points of the bill by using the trained model. The model trained in step 3 is used to perform key point detection on the image to be corrected; the detected key points are recorded as predicted_points, and the corresponding key point set in the template picture is recorded as template_points. The perspective transformation matrix between the detected key points and the key points of the standard bill picture is then calculated and recorded as homography. The calculation uses the findHomography method of opencv.
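A perspective transformation matrix is a 3 x 3 matrix that maps each point through homogeneous coordinates. The sketch below shows that mapping and a simple check of how well a matrix carries predicted_points onto template_points. The function names and the sample matrix are made up for illustration, and the reprojection check is an addition that the text does not describe.

```python
import math


def apply_homography(h, points):
    """Map (x, y) points through a 3x3 matrix given as nested lists."""
    mapped = []
    for x, y in points:
        u = h[0][0] * x + h[0][1] * y + h[0][2]
        v = h[1][0] * x + h[1][1] * y + h[1][2]
        w = h[2][0] * x + h[2][1] * y + h[2][2]
        mapped.append((u / w, v / w))  # divide out the homogeneous scale
    return mapped


def mean_reprojection_error(h, predicted_points, template_points):
    """Average distance between mapped predicted points and template points."""
    mapped = apply_homography(h, predicted_points)
    distances = [math.hypot(mx - tx, my - ty)
                 for (mx, my), (tx, ty) in zip(mapped, template_points)]
    return sum(distances) / len(distances)


homography = [[2.0, 0.0, 10.0], [0.0, 2.0, 20.0], [0.0, 0.0, 1.0]]  # made-up matrix
predicted_points = [(0, 0), (100, 0), (100, 50), (0, 50), (50, 25)]
template_points = apply_homography(homography, predicted_points)
print(template_points)
print(mean_reprojection_error(homography, predicted_points, template_points))
```

A large reprojection error suggests that one or more key points were detected in the wrong place, in which case the corrected image is unlikely to line up with the template.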
Step 5, aligning the bill by using the detected key points. The perspective transformation matrix homography obtained in step 4 is applied to the image to be corrected, src, to obtain the corrected image dst. The correction uses the warpPerspective method of opencv.
Claims (10)
1. An invoice image distortion correction method based on deep learning key point detection, characterized by comprising the following steps:
step 1, labeling and enhancing training data;
step 2, setting a network structure and training parameters;
step 3, training a key point detection model by using the network structure and the training parameters, and saving the trained model;
step 4, detecting key points of the bill by using the trained model;
step 5, aligning the bill by using the detected key points.
2. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1, characterized in that in step 1, part of the data is labeled manually, and a large amount of training data is then generated by a data enhancement strategy, which specifically comprises the following steps:
step 101, preparing annotation data;
step 102, data annotation;
step 103, data enhancement;
step 104, data format conversion;
step 105, dividing the data set.
3. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, characterized in that in step 101, when annotation data is prepared, different types of pictures to be annotated are collected, 1000 pictures of each type, and the positions and names of the key points of each type of bill are defined, the number of key points being more than 4;
the key points are defined according to the following criteria:
if the invoice picture has a table and the table style is fixed, the corner points of the table are used as the reference in the actual definition, and the selected key points are distributed over all areas of the invoice face;
if the invoice picture has no table, the key points are defined according to the positions of the fixed text areas of the invoice;
if the actual invoice picture has irregularities that make some of the defined key points invisible, only the visible key points are labeled when the actual labeling task is executed.
4. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, characterized in that in step 102, during data annotation, invoice pictures with tables are annotated using the points task type of the VIA tool, and invoice pictures without tables are annotated using the rect task type of VIA.
5. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, characterized in that in step 103, during data enhancement, the labeled part of the data is taken, image enhancement is applied to it according to the image conditions of actual business bills, and the imgaug image enhancement library for Python is used as the enhancement strategy for the training data, in the following ways:
random affine transformation of the image, wherein the scaling range is (0.5, 2), the rotation angle ranges are [(-15, 15), (75, 105), (165, 195), (255, 285)] in degrees, and the translation range is (-200, 200) in pixels;
random perspective transformation of the image, wherein the scale parameter is drawn at random from the range (0.025, 0.15);
randomly adding noise to the image;
image contrast stretching;
adding shadow noise to the image;
the labeled training data is expanded and enhanced by combining the above data enhancement methods.
6. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, characterized in that in step 104, during data format conversion, the annotation data of different types of bills is fused and converted into a common format and arranged into a data set: for a bill of a specific type, the annotation data of that type's own key points is left unchanged, and the key point slots belonging to other bill types are filled with (0, 0, 0).
7. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2 or 6, characterized in that in step 105, when the data set is divided, the arranged data set is randomly divided into a training set and a test set in a 9:1 ratio.
8. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1, characterized in that step 2 comprises the following steps:
step 201, network structure for training, wherein resnet101 is used as the basic feature extraction module, followed by GlobalNet and RefineNet, wherein GlobalNet is used to detect all key points, and RefineNet is used to predict corrections to the key points;
step 202, model initialization parameters, wherein the basic feature extraction module uses the weights of resnet101 pre-trained on ImageNet; the total number of training epochs (EPOCHS) is 30, the batch size on a single GPU is 4, the initial input picture size is 512 x 512, an Adam optimizer is used, the learning rate is set to 0.0001, and the loss function is the sum of the distances between the coordinates of each predicted point and the coordinates of the corresponding real point.
9. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1 or 8, characterized in that step 3 comprises the following steps:
step 301, setting algorithm parameters, the model initialization parameters being set according to step 202;
step 302, building the network model;
step 303, judging whether the current EPOCH is smaller than EPOCHS; if so, going to step 304, otherwise going to step 4;
step 304, taking one batch of invoice pictures (batch-size pictures) from the training set, training the model, and updating the model parameters using Adam;
step 305, judging whether all pictures in the training set have been used for training; if so, going to step 306, otherwise going to step 304;
step 306, verifying the accuracy of the model on the test set, saving the trained model, and going to step 303.
10. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1 or 8, characterized in that in step 4, when detecting the bill key points, the trained model is used to perform key point detection on the image to be corrected; the detected key points are recorded as predicted_points, and the corresponding key point set in the template picture is recorded as template_points; the perspective transformation matrix between the detected key points and the key points of the standard bill picture is then calculated and recorded as homography, the calculation using the findHomography method of opencv;
in step 5, when the bill is aligned, the perspective transformation matrix homography is applied to the image to be corrected, src, to obtain the corrected image dst, the correction using the warpPerspective method of opencv.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910932792.9A CN110674815A (en) | 2019-09-29 | 2019-09-29 | Invoice image distortion correction method based on deep learning key point detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910932792.9A CN110674815A (en) | 2019-09-29 | 2019-09-29 | Invoice image distortion correction method based on deep learning key point detection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110674815A (en) | 2020-01-10 |
Family
ID=69080020
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910932792.9A Pending CN110674815A (en) | 2019-09-29 | 2019-09-29 | Invoice image distortion correction method based on deep learning key point detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110674815A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111382742A (en) * | 2020-03-15 | 2020-07-07 | 策拉人工智能科技(云南)有限公司 | Method for integrating OCR recognition software on cloud financial platform |
CN111507265A (en) * | 2020-04-17 | 2020-08-07 | 北京百度网讯科技有限公司 | Form key point detection model training method, device, equipment and storage medium |
CN111507349A (en) * | 2020-04-15 | 2020-08-07 | 深源恒际科技有限公司 | Dynamic data enhancement method in OCR (optical character recognition) model training |
CN111582153A (en) * | 2020-05-07 | 2020-08-25 | 北京百度网讯科技有限公司 | Method and device for determining document orientation |
CN111881882A (en) * | 2020-08-10 | 2020-11-03 | 晶璞(上海)人工智能科技有限公司 | Medical bill rotation correction method and system based on deep learning |
CN112115943A (en) * | 2020-09-16 | 2020-12-22 | 四川长虹电器股份有限公司 | Bill rotation angle detection method based on deep learning |
CN112347865A (en) * | 2020-10-21 | 2021-02-09 | 四川长虹电器股份有限公司 | Bill correction method based on key point detection |
CN112347994A (en) * | 2020-11-30 | 2021-02-09 | 四川长虹电器股份有限公司 | Invoice image target detection and angle detection method based on deep learning |
CN112507973A (en) * | 2020-12-29 | 2021-03-16 | 中国电子科技集团公司第二十八研究所 | Text and picture recognition system based on OCR technology |
CN112633275A (en) * | 2020-12-22 | 2021-04-09 | 航天信息股份有限公司 | Multi-bill mixed-shooting image correction method and system based on deep learning |
CN113974828A (en) * | 2021-09-30 | 2022-01-28 | 西安交通大学第二附属医院 | Operation reference scheme generation method and device |
TWI775038B (en) * | 2020-01-21 | 2022-08-21 | 群邁通訊股份有限公司 | Method and device for recognizing character and storage medium |
TWI775039B (en) * | 2020-01-21 | 2022-08-21 | 群邁通訊股份有限公司 | Method and device for removing document shadow |
US11605210B2 (en) | 2020-01-21 | 2023-03-14 | Mobile Drive Netherlands B.V. | Method for optical character recognition in document subject to shadows, and device employing method |
US11876945B2 (en) | 2020-01-21 | 2024-01-16 | Mobile Drive Netherlands B.V. | Device and method for acquiring shadow-free images of documents for scanning purposes |
CN114494038B (en) * | 2021-12-29 | 2024-03-29 | 扬州大学 | Target surface perspective distortion correction method based on improved YOLOX-S |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102074001A (en) * | 2010-11-25 | 2011-05-25 | 上海合合信息科技发展有限公司 | Method and system for stitching text images |
US20180114084A1 (en) * | 2015-07-13 | 2018-04-26 | Baidu Online Network Technology (Beijing) Co., Ltd | Method for recognizing picture, method and apparatus for labelling picture, and storage medium |
CN108171669A (en) * | 2017-12-29 | 2018-06-15 | 星阵(广州)基因科技有限公司 | A kind of image correction method based on OpenCV algorithms |
CN108596867A (en) * | 2018-05-09 | 2018-09-28 | 五邑大学 | A kind of picture bearing calibration and system based on ORB algorithms |
CN108647681A (en) * | 2018-05-08 | 2018-10-12 | 重庆邮电大学 | A kind of English text detection method with text orientation correction |
CN108876858A (en) * | 2018-07-06 | 2018-11-23 | 北京字节跳动网络技术有限公司 | Method and apparatus for handling image |
CN109583408A (en) * | 2018-12-07 | 2019-04-05 | 高新兴科技集团股份有限公司 | A kind of vehicle key point alignment schemes based on deep learning |
CN109784350A (en) * | 2018-12-29 | 2019-05-21 | 天津大学 | In conjunction with the dress ornament key independent positioning method of empty convolution and cascade pyramid network |
CN109871744A (en) * | 2018-12-29 | 2019-06-11 | 新浪网技术(中国)有限公司 | A kind of VAT invoice method for registering images and system |
CN110008956A (en) * | 2019-04-01 | 2019-07-12 | 深圳市华付信息技术有限公司 | Invoice key message localization method, device, computer equipment and storage medium |
CN110032990A (en) * | 2019-04-23 | 2019-07-19 | 杭州智趣智能信息技术有限公司 | A kind of invoice text recognition method, system and associated component |
CN110263694A (en) * | 2019-06-13 | 2019-09-20 | 泰康保险集团股份有限公司 | A kind of bank slip recognition method and device |
-
2019
- 2019-09-29 CN CN201910932792.9A patent/CN110674815A/en active Pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102074001A (en) * | 2010-11-25 | 2011-05-25 | 上海合合信息科技发展有限公司 | Method and system for stitching text images |
US20180114084A1 (en) * | 2015-07-13 | 2018-04-26 | Baidu Online Network Technology (Beijing) Co., Ltd | Method for recognizing picture, method and apparatus for labelling picture, and storage medium |
CN108171669A (en) * | 2017-12-29 | 2018-06-15 | 星阵(广州)基因科技有限公司 | Image correction method based on OpenCV algorithms |
CN108647681A (en) * | 2018-05-08 | 2018-10-12 | 重庆邮电大学 | English text detection method with text orientation correction |
CN108596867A (en) * | 2018-05-09 | 2018-09-28 | 五邑大学 | Picture correction method and system based on the ORB algorithm |
CN108876858A (en) * | 2018-07-06 | 2018-11-23 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing images |
CN109583408A (en) * | 2018-12-07 | 2019-04-05 | 高新兴科技集团股份有限公司 | Vehicle key point alignment method based on deep learning |
CN109784350A (en) * | 2018-12-29 | 2019-05-21 | 天津大学 | Clothing key point positioning method combining dilated convolution and a cascaded pyramid network |
CN109871744A (en) * | 2018-12-29 | 2019-06-11 | 新浪网技术(中国)有限公司 | VAT invoice image registration method and system |
CN110008956A (en) * | 2019-04-01 | 2019-07-12 | 深圳市华付信息技术有限公司 | Invoice key information positioning method, device, computer equipment and storage medium |
CN110032990A (en) * | 2019-04-23 | 2019-07-19 | 杭州智趣智能信息技术有限公司 | Invoice text recognition method, system and related components |
CN110263694A (en) * | 2019-06-13 | 2019-09-20 | 泰康保险集团股份有限公司 | Bank slip recognition method and device |
Non-Patent Citations (1)
Title |
---|
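Yu Xiaobao (余小宝): "Image skew correction method for character recognition on the deduction copy of VAT invoices", 《电脑编程技巧与维护》 (Computer Programming Skills & Maintenance) * |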
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI775038B (en) * | 2020-01-21 | 2022-08-21 | 群邁通訊股份有限公司 | Method and device for recognizing character and storage medium |
US11876945B2 (en) | 2020-01-21 | 2024-01-16 | Mobile Drive Netherlands B.V. | Device and method for acquiring shadow-free images of documents for scanning purposes |
US11605210B2 (en) | 2020-01-21 | 2023-03-14 | Mobile Drive Netherlands B.V. | Method for optical character recognition in document subject to shadows, and device employing method |
TWI775039B (en) * | 2020-01-21 | 2022-08-21 | 群邁通訊股份有限公司 | Method and device for removing document shadow |
CN111382742A (en) * | 2020-03-15 | 2020-07-07 | 策拉人工智能科技(云南)有限公司 | Method for integrating OCR recognition software on cloud financial platform |
CN111507349A (en) * | 2020-04-15 | 2020-08-07 | 深源恒际科技有限公司 | Dynamic data enhancement method in OCR (optical character recognition) model training |
CN111507349B (en) * | 2020-04-15 | 2023-05-23 | 北京深智恒际科技有限公司 | Dynamic data enhancement method in OCR recognition model training |
CN111507265A (en) * | 2020-04-17 | 2020-08-07 | 北京百度网讯科技有限公司 | Form key point detection model training method, device, equipment and storage medium |
CN111582153A (en) * | 2020-05-07 | 2020-08-25 | 北京百度网讯科技有限公司 | Method and device for determining document orientation |
CN111582153B (en) * | 2020-05-07 | 2023-06-30 | 北京百度网讯科技有限公司 | Method and device for determining orientation of document |
CN111881882A (en) * | 2020-08-10 | 2020-11-03 | 晶璞(上海)人工智能科技有限公司 | Medical bill rotation correction method and system based on deep learning |
CN112115943A (en) * | 2020-09-16 | 2020-12-22 | 四川长虹电器股份有限公司 | Bill rotation angle detection method based on deep learning |
CN112347865A (en) * | 2020-10-21 | 2021-02-09 | 四川长虹电器股份有限公司 | Bill correction method based on key point detection |
CN112347994A (en) * | 2020-11-30 | 2021-02-09 | 四川长虹电器股份有限公司 | Invoice image target detection and angle detection method based on deep learning |
CN112633275A (en) * | 2020-12-22 | 2021-04-09 | 航天信息股份有限公司 | Multi-bill mixed-shooting image correction method and system based on deep learning |
CN112633275B (en) * | 2020-12-22 | 2023-07-18 | 航天信息股份有限公司 | Multi-bill mixed shooting image correction method and system based on deep learning |
CN112507973B (en) * | 2020-12-29 | 2022-09-06 | 中国电子科技集团公司第二十八研究所 | Text and picture recognition system based on OCR technology |
CN112507973A (en) * | 2020-12-29 | 2021-03-16 | 中国电子科技集团公司第二十八研究所 | Text and picture recognition system based on OCR technology |
CN113974828A (en) * | 2021-09-30 | 2022-01-28 | 西安交通大学第二附属医院 | Operation reference scheme generation method and device |
CN113974828B (en) * | 2021-09-30 | 2024-02-09 | 西安交通大学第二附属医院 | Surgical reference scheme generation method and device |
CN114494038B (en) * | 2021-12-29 | 2024-03-29 | 扬州大学 | Target surface perspective distortion correction method based on improved YOLOX-S |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110674815A (en) | Invoice image distortion correction method based on deep learning key point detection | |
CN110689037B (en) | Method and system for automatic object annotation using deep networks | |
CN110738602B (en) | Image processing method and device, electronic equipment and readable storage medium | |
CN106156761B (en) | Image table detection and identification method for mobile terminal shooting | |
CN110008956B (en) | Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium | |
CN111160352B (en) | Workpiece metal surface character recognition method and system based on image segmentation | |
CN110827247A (en) | Method and equipment for identifying label | |
US10984284B1 (en) | Synthetic augmentation of document images | |
CN109034155A (en) | A kind of text detection and the method and system of identification | |
CN110516554A (en) | A kind of more scene multi-font Chinese text detection recognition methods | |
CN111191649A (en) | Method and equipment for identifying bent multi-line text image | |
CN109886257B (en) | Method for correcting invoice image segmentation result by adopting deep learning in OCR system | |
CN112883926B (en) | Identification method and device for form medical images | |
CN113903024A (en) | Handwritten bill numerical value information identification method, system, medium and device | |
CN113592735A (en) | Text page image restoration method and system, electronic equipment and computer readable medium | |
CN111414905B (en) | Text detection method, text detection device, electronic equipment and storage medium | |
CN110956147B (en) | Method and device for generating training data | |
CN113902402A (en) | Document auxiliary filling method, system, storage medium and device based on AR technology | |
CN111784587A (en) | Invoice photo position correction method based on deep learning network | |
CN104077557B (en) | A kind of method and apparatus obtaining card image | |
CN111783763A (en) | Text positioning box correction method and system based on convolutional neural network | |
CN112508000B (en) | Method and equipment for generating OCR image recognition model training data | |
CN111325106B (en) | Method and device for generating training data | |
Zhang et al. | Key point localization and recurrent neural network based water meter reading recognition | |
CN111666882A (en) | Method for extracting answers of handwritten test questions | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200110 |