CN110674815A - Invoice image distortion correction method based on deep learning key point detection - Google Patents

Invoice image distortion correction method based on deep learning key point detection

Info

Publication number
CN110674815A
Authority
CN
China
Prior art keywords
data
invoice
key point
training
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910932792.9A
Other languages
Chinese (zh)
Inventor
池明辉
肖欣庭
梁欢
罗珊珊
赵冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
Priority to CN201910932792.9A priority Critical patent/CN110674815A/en
Publication of CN110674815A publication Critical patent/CN110674815A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/243Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an invoice image distortion correction method based on deep learning key point detection, and belongs to the technical field of image processing. The invention solves the problem of correcting distortion in invoice images, and the key points of the technical scheme are as follows: first, the training data are annotated and augmented; second, the network structure and training parameters are set; next, a key point detection model is trained with this network structure and these training parameters, and the trained model is saved; then, the trained model is used to detect the invoice key points; finally, the invoice is aligned with the detected key points. The method is fast, accurate and suitable for natural scenes; recognizing the corrected picture greatly improves OCR accuracy, reduces the labor and material investment of downstream OCR applications, and saves resources.

Description

Invoice image distortion correction method based on deep learning key point detection
Technical Field
The invention relates to the technical field of image processing, in particular to an invoice image distortion correction method based on deep learning key point detection.
Background
In recent years, AI technology has developed rapidly and its fields of application have grown accordingly, including robotics, speech recognition, image recognition, computer vision, automatic driving and so on. In image recognition, deep-learning-based OCR is widely adopted in industry because of its high recognition accuracy and speed. As is well known, OCR technology generally divides into two technical branches, text detection and text recognition; although end-to-end OCR based on neural networks has recently been proposed, its effect in specific scenarios is not yet ideal, so mainstream OCR technology is still divided into the two directions of text detection and text recognition. OCR accuracy is limited not only by the quality of the recognition algorithm; the text detection result also plays a decisive role, and image quality has an even more obvious influence on text detection. In particular, in the era of the mobile internet, the rise of mobile devices has created an ever-increasing demand for OCR applications from ordinary users, yet because user behavior is uncontrollable, the images captured by mobile devices in various scenes differ greatly. Image distortion (non-frontal shooting) has an especially obvious influence on locating and recognizing text regions in the image; if the acquired image can be corrected before recognition, the accuracy of character recognition can be effectively improved at the source.
Disclosure of Invention
The invention aims to provide an invoice image distortion correction method based on deep learning key point detection, which solves the problem of correcting distortion in invoice images.
The invention solves this technical problem with the following technical scheme: the invoice image distortion correction method based on deep learning key point detection comprises the following steps:
step 1, annotating and augmenting the training data;
step 2, setting the network structure and training parameters;
step 3, training a key point detection model with the network structure and training parameters, and saving the trained model;
step 4, detecting invoice key points with the trained model;
and step 5, aligning the invoice with the detected key points.
Specifically, in step 1, part of the data is annotated manually and a large amount of training data is then generated with a data augmentation strategy, comprising the following steps:
step 101, preparing the annotation data;
step 102, annotating the data;
step 103, augmenting the data;
step 104, converting the data format;
and step 105, splitting the data set.
Further, in step 101, when preparing the annotation data, the different types of pictures to be annotated are collected, about 1000 pictures per type; the key point positions and names are defined for each invoice type, and the number of key points is greater than 4;
the key points are defined according to the following criteria:
if the invoice picture contains a table with a fixed layout, the corner points of the table are used as the reference, and the selected key points are distributed across the whole invoice face;
if the invoice picture contains no table, the key points are defined according to the positions of fixed text regions of the invoice;
if an actual invoice picture is irregular so that some of the defined key points are invisible, only the corresponding visible key points are annotated when the annotation task is carried out.
Specifically, in step 102, during data annotation, invoice pictures containing a table are annotated with the points task type of the VIA tool, and invoice pictures without a table are annotated with the rect task type of the VIA tool.
Further, in step 103, during data augmentation, the manually annotated subset is taken and image augmentation is applied to it according to the conditions of real business invoice images; the augmentation strategy for the training data uses the Python imgaug image augmentation library in the following ways:
random affine transformation of the image, where the scaling range is (0.5, 2), the rotation angle ranges are [(-15, 15), (75, 105), (165, 195), (255, 285)] in degrees, and the translation range is (-200, 200) in pixels;
random perspective transformation of the image, where the scale parameter is drawn from the range (0.025, 0.15);
randomly adding noise to the image;
stretching the image contrast;
adding shadow noise to the image;
the annotated training data are expanded and augmented by combining the above augmentation modes.
Specifically, in step 104, during format conversion, the annotation data of the different invoice types are fused and converted into a single data set; for the annotation data of a given invoice type, the corresponding annotations are left unchanged, while for the other invoice types the annotations of the corresponding points are filled with (0, 0, 0).
Further, in step 105, when splitting the data set, the assembled data set is randomly divided into a training set and a test set at a 9:1 ratio.
Specifically, step 2 comprises the following steps:
step 201, setting the training network structure: basic feature extraction uses resnet101 as the feature extraction module, followed by GlobalNet and RefineNet, where GlobalNet detects all key points and RefineNet predicts corrections to the key points;
step 202, initializing the model parameters: the basic feature extraction module uses the resnet101 weights pre-trained on ImageNet; the total number of training epochs (EPOCHS) is 30, the batch size on a single GPU is 4, the input picture size is initialized to 512 x 512, the Adam optimizer is used with a learning rate of 0.0001, and the loss function is the sum of the distances between each predicted point coordinate and the corresponding ground-truth point coordinate.
Further, step 3 comprises the following steps:
step 301, setting the algorithm parameters, i.e. the model initialization parameters of step 202;
step 302, building the network model;
step 303, judging whether the current epoch (EPOCH) is smaller than EPOCHS; if so, going to step 304, otherwise going to step 4;
step 304, taking a batch of batch-size invoice pictures from the training set, training the model, and updating the model parameters with Adam;
step 305, judging whether all pictures in the training set have been used in the current epoch; if so, going to step 306, otherwise going to step 304;
step 306, verifying the accuracy of the model on the test set, saving the trained model, and going to step 303.
Specifically, in step 4, when detecting invoice key points, the trained model performs key point detection on the image to be corrected; the detected key points are recorded as predicted_points and the corresponding key point set in the template picture is recorded as template_points; the perspective transformation matrix between the detected key points and the key points of the standard invoice picture, recorded as homography, is then computed, specifically with the findHomography method of opencv;
in step 5, when aligning the invoice, the obtained perspective transformation matrix homography is applied to the image to be corrected, src, to obtain the corrected image dst; the correction uses the warpPerspective method of opencv.
The beneficial effects of the method are that, with the invoice image distortion correction method based on deep learning key point detection, key points in an invoice image can be detected accurately; the detected key points and the key points of the standard invoice picture are then used to compute the corresponding perspective transformation matrix, and this matrix is applied to the image to be processed to obtain the corrected image. The method is fast, accurate and suitable for natural scenes, and recognizing the corrected picture greatly improves OCR accuracy, reducing the labor and material investment of downstream OCR applications and saving resources.
Drawings
FIG. 1 is a flowchart of an invoice image distortion correction method based on deep learning keypoint detection according to the present invention.
Detailed Description
The technical scheme of the invention is described in detail below with reference to the accompanying drawings.
The invention discloses an invoice image distortion correction method based on deep learning key point detection, the flow chart of which is shown in FIG. 1; the method comprises the following steps:
Step 1, annotating and augmenting the training data. Effective training of a neural network requires a large amount of annotated data, and manual annotation is very time-consuming. To ensure annotation quality while keeping the annotation cost as low as possible, the invention annotates part of the data manually and then generates a large amount of training data with a data augmentation strategy, as follows:
Step 101, preparing the annotation data. The different types of pictures to be annotated are collected, about 1000 pictures per type; the key point positions and names are defined for each invoice type, and the number of key points is greater than 4. The key points are defined according to the following criteria: 1. if the invoice picture contains a table with a fixed layout, the corner points of the table are used as the reference, and the selected key points are distributed across the whole invoice face; 2. if the invoice picture contains no table, the key points are defined according to the positions of fixed text regions of the invoice. It should also be noted that actual invoice pictures may be irregular, for example damaged, occluded, distorted or severely over-exposed, so that some of the defined key points are invisible; in such cases only the corresponding visible key points are annotated when the annotation task is carried out.
Step 102, annotating the data. When the annotation task is carried out, invoice pictures containing a table are annotated with the points task type of the VIA tool; pictures without a table can be annotated with the rect task type of VIA, for example a fixed-amount (quota) invoice, where the corresponding key points are taken as the four vertices or the midpoints of the four edges of the rectangle.
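As a hedged illustration of this annotation step, the sketch below reads key points back from a VIA JSON export. The field layout follows the VIA 2.x region export (shape_attributes of type "point" or "rect"); the exact keys and the rule of expanding a rect into its four corners are assumptions for illustration, not something specified in the patent.

import json

def load_via_keypoints(json_path):
    """Return {filename: [(x, y), ...]} from a VIA JSON export (assumed VIA 2.x layout)."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    keypoints = {}
    for entry in data.values():
        points = []
        for region in entry.get("regions", []):
            shape = region["shape_attributes"]
            if shape["name"] == "point":
                # Table invoices: each annotated point is a key point.
                points.append((shape["cx"], shape["cy"]))
            elif shape["name"] == "rect":
                # Table-less invoices: take the four corners of the rectangle as key points.
                x, y, w, h = shape["x"], shape["y"], shape["width"], shape["height"]
                points.extend([(x, y), (x + w, y), (x + w, y + h), (x, y + h)])
        keypoints[entry["filename"]] = points
    return keypoints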
Step 103, augmenting the data. Because the annotation budget is limited, only the manually annotated subset is used, and image augmentation is then applied to it according to the conditions of real business invoice images; this saves annotation cost while improving the robustness and accuracy of the trained model. The augmentation strategy for the training data uses the Python imgaug image augmentation library in the following ways: 1. random affine transformation of the image (scaling, rotation, translation), where the scaling range is (0.5, 2), the rotation angle ranges are [(-15, 15), (75, 105), (165, 195), (255, 285)] in degrees, and the translation range is (-200, 200) in pixels; 2. random perspective transformation of the image, where the scale parameter is drawn from the range (0.025, 0.15); 3. randomly adding noise to the image; 4. stretching the image contrast; 5. adding shadow noise to the image. The annotated training data are expanded and augmented by combining these augmentation modes.
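The following sketch expresses this augmentation strategy with the imgaug library named above. The composition of the augmenters and the shadow-noise approximation (a coarse dropout) are assumptions; the patent specifies only the transform types and their parameter ranges.

import imgaug.augmenters as iaa
from imgaug.augmentables.kps import Keypoint, KeypointsOnImage

# The four admissible rotation bands from step 103, in degrees.
ROTATION_RANGES = [(-15, 15), (75, 105), (165, 195), (255, 285)]

augmenter = iaa.Sequential([
    iaa.OneOf([  # random affine transform, restricted to one rotation band
        iaa.Affine(scale=(0.5, 2.0),
                   rotate=r,
                   translate_px={"x": (-200, 200), "y": (-200, 200)})
        for r in ROTATION_RANGES
    ]),
    iaa.PerspectiveTransform(scale=(0.025, 0.15)),     # random perspective transform
    iaa.AdditiveGaussianNoise(scale=(0, 0.05 * 255)),  # random noise
    iaa.LinearContrast((0.75, 1.5)),                   # contrast stretching
    iaa.CoarseDropout(0.02, size_percent=0.3),         # rough stand-in for shadow noise
])

def augment_sample(image, points):
    """Apply one random augmentation to an image and its annotated key points."""
    kps = KeypointsOnImage([Keypoint(x=x, y=y) for x, y in points], shape=image.shape)
    image_aug, kps_aug = augmenter(image=image, keypoints=kps)
    return image_aug, [(kp.x, kp.y) for kp in kps_aug.keypoints]

Because the key points are transformed together with the image, the augmented samples remain valid training labels without any additional manual annotation.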
Step 104, converting the data format. To support key point detection for multiple invoice types, the invention fuses and converts the annotation data of the different invoice types into a common format, using a layout similar to one-hot encoding: for the annotation data of a given invoice type, the corresponding annotations are left unchanged, while for the other invoice types the annotations of the corresponding points are filled with (0, 0, 0).
Step 105, splitting the data set. The data set assembled in step 104 is randomly divided into a training set and a test set at a 9:1 ratio.
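A minimal sketch of steps 104 and 105 follows: annotations of the different invoice types are fused into a one-hot-like layout in which the point slots of the other types are filled with (0, 0, 0), and the result is split 9:1 into training and test sets. The per-type key point counts and the (x, y, visibility) reading of the filler triple are illustrative assumptions; the patent only states that absent points are filled with (0, 0, 0).

import random

# Assumed key point counts per invoice type, for illustration only.
KEYPOINTS_PER_TYPE = {"vat_invoice": 14, "quota_invoice": 4, "taxi_receipt": 6}

def fuse_annotation(invoice_type, points):
    """Concatenate the key point slots of all types; only the sample's own type is filled."""
    fused = []
    for t, count in KEYPOINTS_PER_TYPE.items():
        if t == invoice_type:
            fused.extend([(x, y, 1) for x, y in points])
        else:
            fused.extend([(0, 0, 0)] * count)
    return fused

def split_dataset(samples, ratio=0.9, seed=0):
    """Randomly split the fused samples into a 9:1 training/test partition."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * ratio)
    return samples[:cut], samples[cut:]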
Step 2, setting the network structure and training parameters.
Step 201, setting the training network structure. Training uses a model structure similar to CPN (Cascaded Pyramid Network); basic feature extraction uses resnet101 as the feature extraction module, followed by GlobalNet and RefineNet, where GlobalNet detects all key points and RefineNet predicts corrections to the key points.
Step 202, initializing the model parameters. The basic feature extraction module uses the resnet101 weights pre-trained on ImageNet; the total number of training epochs (EPOCHS) is 30, the batch size on a single GPU is 4, the input picture size is initialized to 512 x 512, the Adam optimizer is used with a learning rate of 0.0001, and the loss function is the sum of the distances between each predicted point coordinate and the corresponding ground-truth point coordinate.
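To make the configuration of steps 201 and 202 concrete, the sketch below sets up a simplified PyTorch version: a resnet101 backbone pre-trained on ImageNet followed by a "global" head and a "refine" head that regress key point coordinates, trained with Adam at learning rate 0.0001 under a loss equal to the sum of distances between predicted and ground-truth points. The real GlobalNet and RefineNet of CPN are heatmap-based pyramid modules; the coordinate-regression heads and the key point count used here are simplifying assumptions.

import torch
import torch.nn as nn
import torchvision

NUM_KEYPOINTS = 8          # assumed; the patent only requires more than 4 key points per type
INPUT_SIZE = 512
EPOCHS = 30
BATCH_SIZE = 4
LEARNING_RATE = 1e-4

class KeypointNet(nn.Module):
    def __init__(self, num_keypoints=NUM_KEYPOINTS):
        super().__init__()
        # resnet101 with ImageNet pre-trained weights (torchvision >= 0.13 weights API).
        backbone = torchvision.models.resnet101(weights="IMAGENET1K_V1")
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool and fc
        self.pool = nn.AdaptiveAvgPool2d(1)
        # "GlobalNet": coarse prediction of all key points.
        self.global_head = nn.Linear(2048, num_keypoints * 2)
        # "RefineNet": predicts a correction added to the coarse prediction.
        self.refine_head = nn.Linear(2048, num_keypoints * 2)

    def forward(self, x):
        f = self.pool(self.features(x)).flatten(1)
        coarse = self.global_head(f)
        refined = coarse + self.refine_head(f)
        return refined.view(-1, NUM_KEYPOINTS, 2)

def keypoint_loss(pred, target):
    """Sum of Euclidean distances between predicted and ground-truth key points."""
    return torch.linalg.norm(pred - target, dim=-1).sum(dim=-1).mean()

model = KeypointNet()
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)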
and 3, setting a training key point detection model by using the network structure and the training parameters, and storing the trained model. And (3) training a key point detection model by using the network structure and parameter setting in the step (2), and storing the trained model parameters for key point detection. The specific training steps are as follows:
step 301, setting algorithm parameters, and setting model initialization parameters according to the step 202;
step 302, establishing a network model;
step 303, judging whether the current EPOCH is smaller than EPOCHS, if so, turning to step 304, otherwise, turning to step 4;
304, taking a batch-size invoice picture from the training set, training the model, and updating the model algorithm by using Adam;
step 305, judging whether all pictures in the training set are trained, if so, turning to step 306, otherwise, turning to step 304;
step 306, verifying the accuracy of the model by using the test set, storing the trained model, and going to step 303.
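A minimal sketch of this training loop (steps 301 to 306) is given below, reusing KeypointNet, keypoint_loss and the constants from the previous sketch. train_loader and test_loader are assumed to yield (image batch, key point batch) pairs; the test-set metric shown is only a placeholder, since the patent does not specify how accuracy is measured.

def train(model, optimizer, train_loader, test_loader, epochs=EPOCHS):
    for epoch in range(epochs):                        # step 303: loop over epochs
        model.train()
        for images, keypoints in train_loader:         # steps 304-305: one pass over the training set
            optimizer.zero_grad()
            loss = keypoint_loss(model(images), keypoints)
            loss.backward()
            optimizer.step()                           # Adam update
        # Step 306: verify on the test set and save the trained model.
        model.eval()
        with torch.no_grad():
            test_loss = sum(keypoint_loss(model(x), y).item()
                            for x, y in test_loader) / max(len(test_loader), 1)
        print(f"epoch {epoch + 1}/{epochs}, test loss {test_loss:.4f}")
        torch.save(model.state_dict(), "keypoint_model.pth")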
Step 4, detecting invoice key points with the trained model. The model trained in step 3 performs key point detection on the image to be corrected; the detected key points are recorded as predicted_points, and the corresponding key point set in the template picture is recorded as template_points. The perspective transformation matrix between the detected key points and the key points of the standard invoice picture, recorded as homography, is then computed, specifically with the findHomography method of opencv.
Step 5, aligning the invoice with the detected key points. The perspective transformation matrix homography obtained in step 4 is applied to the image to be corrected, src, to obtain the corrected image dst. The correction uses the warpPerspective method of opencv.
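Steps 4 and 5 can be sketched directly with the OpenCV functions named in the text: findHomography estimates the perspective transformation from the detected key points to the template key points, and warpPerspective applies it to the distorted image. The RANSAC flag and the template_size argument are additions assumed here for robustness and completeness; they are not specified in the patent.

import cv2
import numpy as np

def correct_invoice(src, predicted_points, template_points, template_size):
    """src: distorted invoice image; points: matching Nx2 key point arrays; template_size: (width, height)."""
    predicted_points = np.asarray(predicted_points, dtype=np.float32)
    template_points = np.asarray(template_points, dtype=np.float32)
    # RANSAC makes the estimate robust to occasional mis-detected key points.
    homography, _ = cv2.findHomography(predicted_points, template_points, cv2.RANSAC)
    # Warp the distorted image onto the template geometry to obtain the corrected image dst.
    dst = cv2.warpPerspective(src, homography, template_size)
    return dst

The corrected image dst can then be passed to the downstream text detection and recognition stages described in the background section.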

Claims (10)

1. The invoice image distortion correction method based on deep learning key point detection is characterized by comprising the following steps:
step 1, annotating and augmenting the training data;
step 2, setting the network structure and training parameters;
step 3, training a key point detection model with the network structure and training parameters, and saving the trained model;
step 4, detecting invoice key points with the trained model;
and step 5, aligning the invoice with the detected key points.
2. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1, wherein in step 1, part of the data is annotated manually and a large amount of training data is then generated with a data augmentation strategy, comprising the following steps:
step 101, preparing the annotation data;
step 102, annotating the data;
step 103, augmenting the data;
step 104, converting the data format;
and step 105, splitting the data set.
3. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, wherein in step 101, when preparing the annotation data, the different types of pictures to be annotated are collected, about 1000 pictures per type; the key point positions and names are defined for each invoice type, and the number of key points is greater than 4;
the key points are defined according to the following criteria:
if the invoice picture contains a table with a fixed layout, the corner points of the table are used as the reference, and the selected key points are distributed across the whole invoice face;
if the invoice picture contains no table, the key points are defined according to the positions of fixed text regions of the invoice;
if an actual invoice picture is irregular so that some of the defined key points are invisible, only the corresponding visible key points are annotated when the annotation task is carried out.
4. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, wherein in step 102, during data annotation, invoice pictures containing a table are annotated with the points task type of the VIA tool, and invoice pictures without a table are annotated with the rect task type of the VIA tool.
5. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, wherein in step 103, during data augmentation, the manually annotated subset is taken and image augmentation is applied to it according to the conditions of real business invoice images; the augmentation strategy for the training data uses the Python imgaug image augmentation library in the following ways:
random affine transformation of the image, where the scaling range is (0.5, 2), the rotation angle ranges are [(-15, 15), (75, 105), (165, 195), (255, 285)] in degrees, and the translation range is (-200, 200) in pixels;
random perspective transformation of the image, where the scale parameter is drawn from the range (0.025, 0.15);
randomly adding noise to the image;
stretching the image contrast;
adding shadow noise to the image;
the annotated training data are expanded and augmented by combining the above augmentation modes.
6. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2, wherein in step 104, during data format conversion, the annotation data of the different invoice types are fused and converted into a single data set; for the annotation data of a given invoice type, the corresponding annotations are left unchanged, while for the other invoice types the annotations of the corresponding points are filled with (0, 0, 0).
7. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 2 or 6, wherein in step 105, when splitting the data set, the assembled data set is randomly divided into a training set and a test set at a 9:1 ratio.
8. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1, wherein step 2 specifically comprises the following steps:
step 201, setting the training network structure: basic feature extraction uses resnet101 as the feature extraction module, followed by GlobalNet and RefineNet, where GlobalNet detects all key points and RefineNet predicts corrections to the key points;
step 202, initializing the model parameters: the basic feature extraction module uses the resnet101 weights pre-trained on ImageNet; the total number of training epochs (EPOCHS) is 30, the batch size on a single GPU is 4, the input picture size is initialized to 512 x 512, the Adam optimizer is used with a learning rate of 0.0001, and the loss function is the sum of the distances between each predicted point coordinate and the corresponding ground-truth point coordinate.
9. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1 or 8, wherein step 3 specifically comprises the following steps:
step 301, setting the algorithm parameters, i.e. the model initialization parameters of step 202;
step 302, building the network model;
step 303, judging whether the current epoch (EPOCH) is smaller than EPOCHS; if so, going to step 304, otherwise going to step 4;
step 304, taking a batch of batch-size invoice pictures from the training set, training the model, and updating the model parameters with Adam;
step 305, judging whether all pictures in the training set have been used in the current epoch; if so, going to step 306, otherwise going to step 304;
step 306, verifying the accuracy of the model on the test set, saving the trained model, and going to step 303.
10. The invoice image distortion correction method based on deep learning key point detection as claimed in claim 1 or 8, wherein in step 4, when detecting invoice key points, the trained model performs key point detection on the image to be corrected; the detected key points are recorded as predicted_points and the corresponding key point set in the template picture is recorded as template_points; the perspective transformation matrix between the detected key points and the key points of the standard invoice picture, recorded as homography, is then computed, specifically with the findHomography method of opencv;
in step 5, when aligning the invoice, the obtained perspective transformation matrix homography is applied to the image to be corrected, src, to obtain the corrected image dst; the correction uses the warpPerspective method of opencv.
CN201910932792.9A 2019-09-29 2019-09-29 Invoice image distortion correction method based on deep learning key point detection Pending CN110674815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910932792.9A CN110674815A (en) 2019-09-29 2019-09-29 Invoice image distortion correction method based on deep learning key point detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910932792.9A CN110674815A (en) 2019-09-29 2019-09-29 Invoice image distortion correction method based on deep learning key point detection

Publications (1)

Publication Number Publication Date
CN110674815A true CN110674815A (en) 2020-01-10

Family

ID=69080020

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910932792.9A Pending CN110674815A (en) 2019-09-29 2019-09-29 Invoice image distortion correction method based on deep learning key point detection

Country Status (1)

Country Link
CN (1) CN110674815A (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102074001A (en) * 2010-11-25 2011-05-25 上海合合信息科技发展有限公司 Method and system for stitching text images
US20180114084A1 (en) * 2015-07-13 2018-04-26 Baidu Online Network Technology (Beijing) Co., Ltd Method for recognizing picture, method and apparatus for labelling picture, and storage medium
CN108171669A (en) * 2017-12-29 2018-06-15 星阵(广州)基因科技有限公司 A kind of image correction method based on OpenCV algorithms
CN108647681A (en) * 2018-05-08 2018-10-12 重庆邮电大学 A kind of English text detection method with text orientation correction
CN108596867A (en) * 2018-05-09 2018-09-28 五邑大学 A kind of picture bearing calibration and system based on ORB algorithms
CN108876858A (en) * 2018-07-06 2018-11-23 北京字节跳动网络技术有限公司 Method and apparatus for handling image
CN109583408A (en) * 2018-12-07 2019-04-05 高新兴科技集团股份有限公司 A kind of vehicle key point alignment schemes based on deep learning
CN109784350A (en) * 2018-12-29 2019-05-21 天津大学 In conjunction with the dress ornament key independent positioning method of empty convolution and cascade pyramid network
CN109871744A (en) * 2018-12-29 2019-06-11 新浪网技术(中国)有限公司 A kind of VAT invoice method for registering images and system
CN110008956A (en) * 2019-04-01 2019-07-12 深圳市华付信息技术有限公司 Invoice key message localization method, device, computer equipment and storage medium
CN110032990A (en) * 2019-04-23 2019-07-19 杭州智趣智能信息技术有限公司 A kind of invoice text recognition method, system and associated component
CN110263694A (en) * 2019-06-13 2019-09-20 泰康保险集团股份有限公司 A kind of bank slip recognition method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余小宝 (Yu Xiaobao): "Image skew correction method for character recognition of VAT invoice deduction copies", 《电脑编程技巧与维护》 (Computer Programming Skills and Maintenance) *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI775038B (en) * 2020-01-21 2022-08-21 群邁通訊股份有限公司 Method and device for recognizing character and storage medium
US11876945B2 (en) 2020-01-21 2024-01-16 Mobile Drive Netherlands B.V. Device and method for acquiring shadow-free images of documents for scanning purposes
US11605210B2 (en) 2020-01-21 2023-03-14 Mobile Drive Netherlands B.V. Method for optical character recognition in document subject to shadows, and device employing method
TWI775039B (en) * 2020-01-21 2022-08-21 群邁通訊股份有限公司 Method and device for removing document shadow
CN111382742A (en) * 2020-03-15 2020-07-07 策拉人工智能科技(云南)有限公司 Method for integrating OCR recognition software on cloud financial platform
CN111507349A (en) * 2020-04-15 2020-08-07 深源恒际科技有限公司 Dynamic data enhancement method in OCR (optical character recognition) model training
CN111507349B (en) * 2020-04-15 2023-05-23 北京深智恒际科技有限公司 Dynamic data enhancement method in OCR recognition model training
CN111507265A (en) * 2020-04-17 2020-08-07 北京百度网讯科技有限公司 Form key point detection model training method, device, equipment and storage medium
CN111582153A (en) * 2020-05-07 2020-08-25 北京百度网讯科技有限公司 Method and device for determining document orientation
CN111582153B (en) * 2020-05-07 2023-06-30 北京百度网讯科技有限公司 Method and device for determining orientation of document
CN111881882A (en) * 2020-08-10 2020-11-03 晶璞(上海)人工智能科技有限公司 Medical bill rotation correction method and system based on deep learning
CN112115943A (en) * 2020-09-16 2020-12-22 四川长虹电器股份有限公司 Bill rotation angle detection method based on deep learning
CN112347865A (en) * 2020-10-21 2021-02-09 四川长虹电器股份有限公司 Bill correction method based on key point detection
CN112347994A (en) * 2020-11-30 2021-02-09 四川长虹电器股份有限公司 Invoice image target detection and angle detection method based on deep learning
CN112633275A (en) * 2020-12-22 2021-04-09 航天信息股份有限公司 Multi-bill mixed-shooting image correction method and system based on deep learning
CN112633275B (en) * 2020-12-22 2023-07-18 航天信息股份有限公司 Multi-bill mixed shooting image correction method and system based on deep learning
CN112507973B (en) * 2020-12-29 2022-09-06 中国电子科技集团公司第二十八研究所 Text and picture recognition system based on OCR technology
CN112507973A (en) * 2020-12-29 2021-03-16 中国电子科技集团公司第二十八研究所 Text and picture recognition system based on OCR technology
CN113974828A (en) * 2021-09-30 2022-01-28 西安交通大学第二附属医院 Operation reference scheme generation method and device
CN113974828B (en) * 2021-09-30 2024-02-09 西安交通大学第二附属医院 Surgical reference scheme generation method and device
CN114494038B (en) * 2021-12-29 2024-03-29 扬州大学 Target surface perspective distortion correction method based on improved YOLOX-S

Similar Documents

Publication Publication Date Title
CN110674815A (en) Invoice image distortion correction method based on deep learning key point detection
CN110689037B (en) Method and system for automatic object annotation using deep networks
CN110738602B (en) Image processing method and device, electronic equipment and readable storage medium
CN106156761B (en) Image table detection and identification method for mobile terminal shooting
CN110008956B (en) Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium
CN111160352B (en) Workpiece metal surface character recognition method and system based on image segmentation
CN110827247A (en) Method and equipment for identifying label
US10984284B1 (en) Synthetic augmentation of document images
CN109034155A (en) A kind of text detection and the method and system of identification
CN110516554A (en) A kind of more scene multi-font Chinese text detection recognition methods
CN111191649A (en) Method and equipment for identifying bent multi-line text image
CN109886257B (en) Method for correcting invoice image segmentation result by adopting deep learning in OCR system
CN112883926B (en) Identification method and device for form medical images
CN113903024A (en) Handwritten bill numerical value information identification method, system, medium and device
CN113592735A (en) Text page image restoration method and system, electronic equipment and computer readable medium
CN111414905B (en) Text detection method, text detection device, electronic equipment and storage medium
CN110956147B (en) Method and device for generating training data
CN113902402A (en) Document auxiliary filling method, system, storage medium and device based on AR technology
CN111784587A (en) Invoice photo position correction method based on deep learning network
CN104077557B (en) A kind of method and apparatus obtaining card image
CN111783763A (en) Text positioning box correction method and system based on convolutional neural network
CN112508000B (en) Method and equipment for generating OCR image recognition model training data
CN111325106B (en) Method and device for generating training data
Zhang et al. Key point localization and recurrent neural network based water meter reading recognition
CN111666882A (en) Method for extracting answers of handwritten test questions

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110